1868 Commits

Author SHA1 Message Date
Martin von Zweigbergk
beb997e85a watchman: don't even add non-watchman files to set of deleted files
It's faster to add only files matched by the Watchman matcher to the
set of deleted files than to add all files and then removed files not
matched. This speeds up `jj diff` with Watchman in the Linux repo from
~530 ms to ~460 ms.
2023-07-28 12:12:09 -07:00
Waleed Khan
4b635e9713 working_copy: use tree rather than file states to detect if directory is tracked
This moves us closer towards treating the `file_states` map as purely a cache rather than authoritative state.
2023-07-28 10:15:33 -07:00
Waleed Khan
9d8702b537 refactor(working_copy): return new file state from update_file_state 2023-07-28 09:52:37 -07:00
Waleed Khan
c409f4ed53 refactor(working_copy): hoist current and new file states to top of update_file_state
Reduces some indentation and simplifies some control flow.
2023-07-28 09:52:37 -07:00
Waleed Khan
3264135489 refactor(working_copy): return updated new tree value from update_file_state
The intention in this and some future commits is to have `update_file_state` accept `&self` instead of `&mut self` to make clear what data is updated by `update_file_state` and to ensure transactional safety of the `TreeState` contents.
2023-07-28 09:52:37 -07:00
Waleed Khan
1e28b312c6 perf: instrument some steps in snapshot 2023-07-28 09:28:01 -07:00
Waleed Khan
018bb88ec6 perf: add several #[instrument]s 2023-07-28 09:28:01 -07:00
Waleed Khan
7875656354 perf: add #[instrument] to all cmd_* functions 2023-07-28 09:28:01 -07:00
Yuya Nishihara
4834d12c37 simple_op_store: serialize RefTarget in new format (breaks downgrades)
This is breaking change. Old jj binary will panic if it sees a view saved by
new jj. Alternatively, we can store both new and legacy data for backward
compatibility.
2023-07-27 15:32:48 +09:00
Yuya Nishihara
8351a743f6 simple_op_store: add deserialization support of new Conflict-based RefTarget
CommitId is wrapped with a message since we need a representation for None.
This is different from TreeConflict in which an empty tree has an id.
2023-07-27 15:32:48 +09:00
Martin von Zweigbergk
55520a0e9c simple_op_store: add object type and id to protobuf decode errors 2023-07-26 14:17:21 -07:00
Martin von Zweigbergk
56750bb360 op_store: add read/write error variants, matching commit backend
This will hopefully give enough information to tell which path was
unexpectedly a directory in #1907.
2023-07-26 14:17:21 -07:00
Martin von Zweigbergk
96e75e6ad1 simple_op_store: return NotFound for missing views
I guess we don't depend on `read_view()` ever returning `NotFound` the
way `read_operation()` does, but it seems like it still should return
`NotFound` when the view doesn't exist.
2023-07-26 14:17:21 -07:00
Martin von Zweigbergk
84a60d15bc op_store: make ViewId and OperationId implement ObjectId 2023-07-26 14:17:21 -07:00
Martin von Zweigbergk
769426f99a op_store: make OpStoreError::Other preserve source error object
This is the OpStore version of e1e75daa8e4a.
2023-07-26 14:17:21 -07:00
Yuya Nishihara
60d48c27f6 revset_graph: ignore duplicated entries in emittable stack
Since parent->child edge is populated lazily, emittable stack may have
duplicated entries.

Fixes #1909
2023-07-26 04:04:34 +09:00
Waleed Khan
70d3c64b1e operation: propagate OpStoreError
This is a rote propagation of the error value to hopefully improve the panic in https://github.com/martinvonz/jj/issues/1907.
2023-07-25 12:46:59 -05:00
Martin von Zweigbergk
7cc916e4f1 working_copy: in mtime race case, don't mutate current state
There's a comment saying that mutating the file's current state
simplifies later logic, but I don't think that's true. It might have
been true in the past, when we had `FileType::Conflict`.
2023-07-24 16:41:44 -07:00
Martin von Zweigbergk
861788ae09 working_copy: propagate errors from failing to read conflict file 2023-07-24 15:14:42 -07:00
Martin von Zweigbergk
99fec7ed06 working_copy: don't re-read resolved conflict from disk on snapshot
When snapshotting a conflict, we read the contents and parse conflict
markers. If we determine that it's no longer a conflict, then we write
a normal file entry to the tree instead. When we do that, we re-read
the file from the working copy. Let's instead write the file contents
we already read.
2023-07-24 15:14:42 -07:00
Martin von Zweigbergk
72d389cefb working_copy: extract a function for writing a conflict 2023-07-24 15:14:42 -07:00
Martin von Zweigbergk
c7e736f73c working_copy: update file state for conflict and non-conflict the same
When a file's mtime has changed on disk, we update our record of that
mtime, but we did so in a separate place for conflicts compared to
non-conflicts. Let's reuse it.
2023-07-24 15:14:42 -07:00
Martin von Zweigbergk
10a2a15993 working_copy: don't track conflict-ness in state file, use tree object
The working copy's current tree tracks whether a file is a
conflict. We also track that in the `TreeState` object. That allows us
to not read the trees object to decide if we should try to parse a
file as a conflict.

One disadvantage is that it's redundant information that needs to be
kept in sync with the tree object. Also, for Watchman, we would like
to completely ignore the persisted `FileState`.

This commit removes the `FileType::Conflict` variant and instead
checks in the tree object whether a given path was a conflict. This is
the change I mentioned in dc8a20773760. We still skip the check
completely if the file's mtime etc. is unchanged, so it shouldn't have
much effect in the common case of a mostly unchanged working copy. I
measured a slowdown on `jj diff` by ~3% in the Linux repo with a clean
working copy with all mtimes bumped. I think the simpler code and
reduced risk of subtle bugs is worth the performance hit.
2023-07-24 15:02:33 -07:00
Yuya Nishihara
a80758ee1d revset_graph: remove redundant boxing from reverse iterator constructor 2023-07-25 01:45:37 +09:00
Yuya Nishihara
a382e96168 revset_graph: place new heads as close to fork point as possible
The idea is simple. New heads are ignored until the node dependency resolution
stuck. Then, only the first head that will unblock the visit will be queued.

Closes #242
2023-07-25 01:45:37 +09:00
Yuya Nishihara
fb33620f9e revset_graph: group commits topologically
The original idea was similar to Mercurial's "topo" sorting, but it was bad
at handling merge-heavy history. In order to render merges of topic branches
nicely, we need to prioritize branches at merge point, not at fork point.
OTOH, we do also want to place unmerged branches as close to the fork point
as possible. This commit implements the former requirement, and the latter
will be addressed by the next commit.

I think this is similar to Git's sorting logic described in the following blog
post. In our case, the in-degree walk can be dumb since topological order is
guaranteed by the index. We keep HashSet<CommitId> instead of an in-degree
integer value, which will be used in the next commit to resolve new heads as
late as possible.

https://github.blog/2022-08-30-gits-database-internals-ii-commit-history-queries/#topological-sorting

Compared to Sapling's beautify_graph(), this is lazy, and can roughly preserve
the index (or chronological) order. I tried beautify_graph() with prioritizing
the @ commit, but the result seemed too aggressively reordered. Perhaps, for
more complex history, beautify_graph() would produce a better result. For my
wip branches (~30 branches, a couple of commits per branch), this works pretty
well.

#242
2023-07-25 01:45:37 +09:00
Yuya Nishihara
6fcb98c0c4 revset_graph: add helper to test graph sorting 2023-07-25 01:45:37 +09:00
Yuya Nishihara
e2f9ed439e revset: extract graph-related types to separate module
I'm going to add a topo-grouping iterator adapter, and the revset module is
already big enough to split.
2023-07-25 01:45:37 +09:00
Yuya Nishihara
817713c921 graphlog: use IndexPosition until transitive edges get eliminated
This partially reverts 4c8f484278de "graphlog: key by commit id (not index
position)." As Martin pointed out, it made "log -r 'tags()' -T.." in git
repo super slow. Apparently, both clone() and hash map insertion/lookup costs
increased by that change. Since we don't need CommitId inside the graph
iterator, we can simply replace it with IndexPosition, and resolve it to
CommitId later.
2023-07-24 05:07:07 +09:00
Yuya Nishihara
2bbaa4352a refs: rename RefTarget::is_conflict() to has_conflict()
Copied from MergedTree::has_conflict(). I feel it's slightly better since
RefTarget "is" always a Conflict-based type. It could be inverted to
RefTarget::is_resolved(), but refs are usually resolved, and all callers
have special case for !is_resolved() state.
2023-07-23 22:25:57 +09:00
Yuya Nishihara
961bb49ff0 refs: leverage Conflict::is_resolved() 2023-07-23 22:25:57 +09:00
Yuya Nishihara
465b8d9cf0 working_copy: remove unused SnapshotError::FileOpenError variant
Perhaps, this would have been generalized as IoError.
2023-07-23 14:58:17 +09:00
Martin von Zweigbergk
066f31a15d working_copy: remove an unneccessary Box 2023-07-21 12:16:02 -07:00
Martin von Zweigbergk
07dca6ef6a tree_builder: leverage BTreeMap::pop_last() now that we're on Rust 1.66+ 2023-07-20 06:14:28 -07:00
Martin von Zweigbergk
1c3fe9a651 cli: use MergedTree for finding conflicts
`MergedTree` is now ready to be used when checking if a commit has
conflicts, and when listing conflicts. We don't yet a way for the user
to say they want to use tree-level conflicts even for these
cases. However, since the backend can decide, we should be able to
have our backend return tree-level conflicts. All writes will still
use path-level conflicts, so the experimentation we can do at Google
is limited.

Beacause `MergedTree` doesn't yet have a way of walking conflicts
while restricting it by a matcher, this will make `jj resolve` a
little slower. I suspect no one will notice.
2023-07-19 22:04:16 -07:00
Martin von Zweigbergk
006c764694 backend: learn to store tree-level conflicts
Tree-level conflicts (#1624) will be stored as multiple trees
associated with a single commit. This patch adds support for that in
`backend::Commit` and in the backends.

When the Git backend writes a tree conflict, it creates a special root
tree for the commit. That tree has only the individual trees from the
conflict as subtrees. That way we prevent the trees from getting
GC'd. We also write the tree ids to the extra metadata table
(i.e. outside of the Git repo) so we don't need to load the tree
object to determine if there are conflicts.

I also added new flag to `backend::Commit` indicating whether the
commit is a new-style commit (with support for tree-level
conflicts). That will help with the migration. We will remove it once
we no longer care about old repos. When the flag is set, we know that
a commit with a single tree cannot have conflicts. When the flag is
not set, it's an old-style commit where we have to walk the whole tree
to find conflicts.
2023-07-19 22:04:16 -07:00
Martin von Zweigbergk
deb4ae476d merged_tree: add an iterator over conflicts
With `MergedTree`, we can iterate over conflicts by descending into
only the subdirectories that cannot be trivially resolved. We assume
that the trees have previously been resolved as much as possible, so
we don't attempt to resolve conflicts again.
2023-07-19 22:04:16 -07:00
Martin von Zweigbergk
828d528361 merged_tree: add a function for resolving conflicts
This adds a function for resolving conflicts that can be automatically
resolved, i.e. like our current `merge_trees()` function. However, the
new function is written to merge an arbitrary number of trees and, in
case of unresolvable conflicts, to produce a `Conflict<TreeId>` as
result instead of writing path-level conflicts to the backend. Like
`merge_trees()`, it still leaves conflicts unresolved at the file
level if any hunks conflict, and it resolves paths that can be
trivially resolved even if there are other paths that do conflict.
2023-07-19 22:04:16 -07:00
Martin von Zweigbergk
4f30417ffd merged_tree: introduce a type for a set of trees to merge on the fly
In order to store conflicts in the commit, as conflicts between a set
of trees, we want to be able merge those trees on the fly. This
introduces a type for that. It has a `Merge(Conflict(Tree))` variant,
where the individual trees cannot have path-level conflicts. It also
has a `Legacy(Tree)` variant, which does allow path-level conflicts. I
think that should help us with the migration.
2023-07-19 22:04:16 -07:00
Yuya Nishihara
92ee5121f6 view: replace .git_refs().get(name) with .get_git_ref(name) 2023-07-19 08:27:42 +09:00
Yuya Nishihara
84f0c96c8f view: replace .tags().get(name) with .get_tag(name) 2023-07-19 08:27:42 +09:00
Yuya Nishihara
bf46eb5ad2 view: replace .branches().get(name) with .get_branch(name) 2023-07-19 08:27:42 +09:00
Yuya Nishihara
ecb0850f1a view: return RefTarget by reference, clone() by caller 2023-07-19 08:27:42 +09:00
Yuya Nishihara
4da8483228 refs: reimplement RefTarget as Conflict<Option<CommitId>> wrapper
ContentHash is preserved at this point. I'll update it together with the
serialization format changes.
2023-07-18 18:12:09 +09:00
Yuya Nishihara
b6967a1dc9 refs: pass &Option<RefTarget> into merge_ref_targets() 2023-07-18 18:12:09 +09:00
Yuya Nishihara
443391bf8f view: store Option<RefTarget> in maps, add extension trait to flatten Option
Alternatively, we can wrap BTreeMap<String, Option<RefTarget>> to flatten
Option<&Option<..>> internally, but doing that would be tedious. It would
also be unclear if map.remove(name) should construct an absent RefTarget if
the ref doesn't exist.
2023-07-18 18:12:09 +09:00
Yuya Nishihara
7461370f6e view: add wrapper that will exclude absent RefTarget entries from ContentHash
The next commit will change these maps to store Option<RefTarget> entries, but
None entries will still be omitted from the serialized data. Since ContentHash
should describe the serialized data, relying on the generic ContentHash would
cause future hash conflict where absent RefTarget entries will be preserved.

For example, ([remove], [None, add]) will be serialized as ([remove], [add]),
and deserialized to ([remove], [add, None]). If we add support for lossless
serialization, hash(([remove], [None, add])) should differ from the lossy one.
2023-07-18 18:12:09 +09:00
Martin von Zweigbergk
e033f9139f cleanup: use let-else now that we're on Rust 1.65+ 2023-07-18 09:50:22 +01:00
Austin Seipp
017ce851b4 refactor(jj-lib): remove nightly_shims gunk
Summary: Now that we have Rust 1.71.0 at our fingertips, the `map_first_last`
feature has been stabilized. That means we can get rid of the `jj-lib` build
script and also the `nightly_shims` module.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Ibb5ce3258818a2de670763fbbaf3c2e7
2023-07-17 18:38:26 -05:00
Yuya Nishihara
0461a8575a refs: add stub constructors for absent RefTarget, replace None with it
Now we're mostly ready to reimplement RefTarget on top of
Conflict<Option<CommitId>>.
2023-07-17 08:24:24 +09:00