With this change, `write_path_to_store()` contains all the logic for
reading a file from disk and writing it to a `TreeBuilder`, making the
code for added and modified files more similar.
It's faster to add only files matched by the Watchman matcher to the
set of deleted files than to add all files and then removed files not
matched. This speeds up `jj diff` with Watchman in the Linux repo from
~530 ms to ~460 ms.
The intention in this and some future commits is to have `update_file_state` accept `&self` instead of `&mut self` to make clear what data is updated by `update_file_state` and to ensure transactional safety of the `TreeState` contents.
There's a comment saying that mutating the file's current state
simplifies later logic, but I don't think that's true. It might have
been true in the past, when we had `FileType::Conflict`.
When snapshotting a conflict, we read the contents and parse conflict
markers. If we determine that it's no longer a conflict, then we write
a normal file entry to the tree instead. When we do that, we re-read
the file from the working copy. Let's instead write the file contents
we already read.
When a file's mtime has changed on disk, we update our record of that
mtime, but we did so in a separate place for conflicts compared to
non-conflicts. Let's reuse it.
The working copy's current tree tracks whether a file is a
conflict. We also track that in the `TreeState` object. That allows us
to not read the trees object to decide if we should try to parse a
file as a conflict.
One disadvantage is that it's redundant information that needs to be
kept in sync with the tree object. Also, for Watchman, we would like
to completely ignore the persisted `FileState`.
This commit removes the `FileType::Conflict` variant and instead
checks in the tree object whether a given path was a conflict. This is
the change I mentioned in dc8a20773760. We still skip the check
completely if the file's mtime etc. is unchanged, so it shouldn't have
much effect in the common case of a mostly unchanged working copy. I
measured a slowdown on `jj diff` by ~3% in the Linux repo with a clean
working copy with all mtimes bumped. I think the simpler code and
reduced risk of subtle bugs is worth the performance hit.
Before we had `conflicts::Conflict`, most of these functions took a
`backend::Conflict`. I think I didn't want to pollute the `backend`
module with this kind of logic, trying to keep it focused on
storage. Now that we have the type in `conflicts`, however, I think it
makes sense to move these functions onto it.
For tree-level conflicts (#1624), I plan to remove `ConflictId`
completely. This commit removes `ConflictId` from
`update_conflict_from_content()` by instead making it take a
`Conflict<Option<TreeValue>>` and return a possibly different such
value.
I made the call site in `working_copy` avoid writing the conflict to
the store if it's unchanged, but I didn't make the same optimization
in `merge_tools` becuase it's much more likely to have changed there.
This prepares for allowing the base tree to be a conflict at the
root-tree level (#1624).
We could remove the `Conflict` variant completely. I tried doing that
and it slowed down `jj diff` by ~3% in the Linux repo with a clean
working copy with only mtime bumped on all files.
I don't think there's a good reason not to write the
`.jj/working_copy/tree_state` file on init. Being able to assume that
the file exists means that we won't need the store object to to lazily
load the `TreeState` object. Well, except that `TreeState` keeps an
`Arc<Store>`, but I'm trying to change that.
I've preferred "working-copy commit" over "checkout" for a while
because I think it's clearer, but there were lots of places still
using "checkout". I've left "checkout" in places where it refers to
the action of updating the working copy or the working-copy commit.
I needed this in the course of debugging an error. Before this commit, the error looked like this:
```
Error: Unexpected error from backend: Object not found
```
After this commit, it looks like this:
```
Error: Unexpected error from backend: Object with CommitId 8f59646bc9bb6bb44b5624f1248f4a708f37003c not found: object not found - no match for id (8f59646bc9bb6bb44b5624f1248f4a708f37003c); class=Odb (9); code=NotFound (-3)
```
I ran an upgraded Clippy on the codebase. All the changes seem to be
about using variables directly in format strings instead of passing
them as separate arguments.
A new FileType, GitSubmodule is added which is ignored. Files or
directories having this type are not added to the work queue and
are ignored in snapshot. Submodules are not created by jujutsu
when resetting or checking out a tree, they should be currently
managed using git.
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).
Google employees can read about Google's policy at
go/releasing/contributions#copyright.
There are no "non-normal" files, so "normal" is not needed. We have
symlinks and conflicts, but they are not files, so I think just "file"
is unambiguous.
I left `testutils::write_normal_file()` because there it's used to
mean "not executable file" (there's also a `write_executable_file()`).
I left `working_copy::FileType::Normal` since renaming `Normal` there
to `File` would also suggest we should rename `FileType`, and I don't
know what would be a better name for that type.
Let WorkspaceCommandHelper clone it. WorkspaceCommandHelper could return
workspace_id by reference, but doing that would introduce noisy .clone()
calls and lifetime mess.
For consistency with the tree_state handling. This isn't that simple and
concise compared to the tree_state one, so I'm fine to drop the series.
I've extracted {operation_id, workspace_id} pair so these values can be
safely initialized by OnceCell. The extracted struct is named after the
"checkout" file.
Here OnceCell<T> serves as RefCell<Option<T>>, but it doesn't require runtime
Ref/RefMut wrapper. This allows us to get rid of some .clone() calls needed to
hide Ref<_> from public interface.