71 Commits

Author SHA1 Message Date
Yuya Nishihara
817713c921 graphlog: use IndexPosition until transitive edges get eliminated
This partially reverts 4c8f484278de "graphlog: key by commit id (not index
position)." As Martin pointed out, it made "log -r 'tags()' -T.." in git
repo super slow. Apparently, both clone() and hash map insertion/lookup costs
increased by that change. Since we don't need CommitId inside the graph
iterator, we can simply replace it with IndexPosition, and resolve it to
CommitId later.
2023-07-24 05:07:07 +09:00
Martin von Zweigbergk
1c3fe9a651 cli: use MergedTree for finding conflicts
`MergedTree` is now ready to be used when checking if a commit has
conflicts, and when listing conflicts. We don't yet a way for the user
to say they want to use tree-level conflicts even for these
cases. However, since the backend can decide, we should be able to
have our backend return tree-level conflicts. All writes will still
use path-level conflicts, so the experimentation we can do at Google
is limited.

Beacause `MergedTree` doesn't yet have a way of walking conflicts
while restricting it by a matcher, this will make `jj resolve` a
little slower. I suspect no one will notice.
2023-07-19 22:04:16 -07:00
Waleed Khan
54dba51a08 docs: warn about missing docs for jj-lib crate 2023-07-10 18:28:59 +03:00
Martin von Zweigbergk
b297c0c0d8 rewrite: propagate errors from merge_trees() 2023-06-30 14:12:36 +02:00
Yuya Nishihara
ca6b9828d1 id_prefix: only store first few bytes of keys in IdIndex
This eliminates indirect access through Vec<u8> and improves cache locality
while sorting the index entries. We can achieve a similar result by using
SmallVec<[u8; 24]> in place of Commit/ChangeId(Vec<u8>), but we would have
to determine a reasonable id length across backends. Indexing [u8; 4] performs
better, at the cost of the API and implementation complexity.

For temporary Commit/ChangeId allocation in general, I think a borrowed type
like Path/PathBuf will help.

Testing with my "linux" repo, this saves ~670ms needed to initialize both
change id index and disambiguation indexes.
2023-06-25 12:54:18 +09:00
Yuya Nishihara
580d8bd92e id_prefix: introduce builder interface to IdIndex
It allows us to build multiple IdIndex instances within a single loop. As the
final sorting is heavy operation, I don't want to implement Default + Extend
for IdIndex to be compatible with Iterator::unzip().
2023-06-25 12:54:18 +09:00
Yuya Nishihara
a67d8b5a65 index: turn CompositeIndex::walk_revs() into position-based API
This gets rid of round-trip conversion from queries like "(main..)-". I have
such expression in my default log/disambiguation revset, and the query could
take ~150ms to convert head positions back and forth if the repository had
tons of unmerged commits.
2023-06-19 13:41:43 +09:00
Yuya Nishihara
b7b9b8c88e index: pass only CompositeIndex to default_revset_engine::evaluate() 2023-05-29 08:15:40 +09:00
Yuya Nishihara
92c1b7091b index: make CompositeIndex copyable to clarify it is a cheap reference type
Well, I might change it to an owned wrapper later, but if I made such change,
the current CompositeIndex<'_> would be replaced with &CompositeIndex.
2023-05-29 08:15:40 +09:00
Yuya Nishihara
0b2f0eca05 revset: add Revset::count() API
The default-engine implementation is pretty much the same as iter().count(),
but custom engine may have an optimal path.
2023-05-24 01:02:37 +09:00
Yuya Nishihara
5b568cabcc revset: add iterator of (CommitId, ChangeId) pairs, use it in id_index
There are a few more places where we need these pairs.
2023-05-24 01:02:37 +09:00
Yuya Nishihara
44927be7c9 id_prefix: add IdIndex method that looks up unambiguous key
resolve_prefix_with() is changed to return both key and values.
2023-05-24 01:02:37 +09:00
Martin von Zweigbergk
f657bcb6ae prefixes: move IdIndex to id_prefix module
I'll reuse it there next.
2023-05-11 23:41:24 -07:00
Yuya Nishihara
d948acd5bf revset: do not scan ancestors more than once to evaluate nested children set 2023-04-28 08:36:58 +09:00
Yuya Nishihara
e6740d9c3b index: migrate walk_ancestors_until_roots() from revset engine
I'm going to add a RevWalk method to walk descendants with generation filter,
which will use this helper method. RevWalk::take_until_roots() uses .min()
instead of .last() since RevWalk shouldn't know the order of the input set.
2023-04-28 08:36:58 +09:00
Yuya Nishihara
837e8aa81a revset: add substitution rule for nested descendants/children
The substitution rule and tests are copied from ancestors/parents. The backend
logic will be reimplemented later. For now, it naively repeats children().
2023-04-24 20:45:13 +09:00
Yuya Nishihara
32253fed5e revset: replace children node with descendants of generation 1..2 2023-04-24 20:45:13 +09:00
Yuya Nishihara
a99b82c634 revset: add generation parameter to descendants node
This is a minimal change to replace Children with Descendants. A generation
parameter could be added to RevsetExpression::DagRange, but it's not needed
as of now.
2023-04-24 20:45:13 +09:00
Yuya Nishihara
eadf8faded revset: extract children() evaluation to function
I'm going to add generation parameter to Children/DagRange nodes, and
'Children { .. }' will be substituted to 'DagRange { .., gen: 1 }'. This
commit helps future code move.

Lifetime bounds of the arguments are unnecessarily restricted. It appears
walk_ancestors_until_roots() captures arguments lifetime on rustc 1.64.0.
I think the problem will go away if walk_*() functions are extracted to
RevWalk methods where input arguments will become less generic.
2023-04-24 20:45:13 +09:00
Yuya Nishihara
d9d2b405e1 revset: remove redundant boxing from evaluated children node
Just spotted while moving codes around. This wouldn't matter in practice.
2023-04-24 20:45:13 +09:00
Yuya Nishihara
36e7afe0db revset: exclude unreachable roots from collect_dag_range() result
It doesn't matter, but can simplify the function interface. I'll probably
extract this function to RevWalk so the descendants with/without generation
filter can be tested without using revset API.
2023-04-24 20:45:13 +09:00
Martin von Zweigbergk
c60f14899a index: remove entry_by_id() from trait
It no longer needs to be on the `Index` trait, thereby removing the
last direct use of `IndexEntry` in the trait (it's still used
indirectly in `walk_revs()`).
2023-04-18 18:32:23 -07:00
Martin von Zweigbergk
e492548772 revset: bump generation numbers in API to 64 bits
A chain of 4 billion commits is a lot, but it's not out of the
question, so let's support it. The current default index will not be
able to handle that many commits, so I let that still use 32-bit
integers.
2023-04-12 21:18:49 -07:00
Yuya Nishihara
5351371d51 revset: resolve visible heads prior to evaluation 2023-04-10 00:39:58 +09:00
Yuya Nishihara
7e1e9efa38 revset: resolve "all()" prior to evaluation 2023-04-10 00:39:58 +09:00
Yuya Nishihara
f43f0d24b8 revset: resolve candidates of children set prior to evaluation 2023-04-10 00:39:58 +09:00
Yuya Nishihara
7974269bab revset: remove None variant from resolved enum, use Commits([]) instead
We'll remove All, so it makes sense to not have None either.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
0fcc13a6f4 revset: make resolve() return different type describing evaluation plan
New ResolvedExpression enum ensures that the evaluation engine doesn't have
to know the symbol resolution details. In this commit, I've moved Filter
and NotIn resolution to resolve_visibility(). Implicit All/VisibleHeads
resolution will be migrated later.

It's tempting to combine resolve_symbols() and resolve_visibility() to get
rid of panic!()s, but the resolution might have to be two passes to first
resolve&collect explicit commit ids, and then substitute "all()" with
"(:visible_heads())|commit_id|..". It's also possible to apply some tree
transformation after symbol resolution.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
6d9b836d10 revset: extract unresolved commit references to separate enum
This makes it clear what should be resolved at resolve_symbols(). Symbol
is a bit special while parsing function arguments, but it's no different
than the other unresolved references at expression level.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
adfd52445b revset: reimplement children to not scan visible ancestors twice
It's slightly faster, and removes the use of RevsetExpression::descendants()
API.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
5dd99db250 revset: make evaluation helper not create trait object eagerly
We wouldn't care for the cost of virtual dispatch at this level, but I
think a concrete struct type is easier to deal with than trait object.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
85fb1f74c3 revset: for roots:heads, terminate ancestor lookup at min(roots) 2023-04-08 12:13:30 +09:00
Yuya Nishihara
ddff089286 revset: do not evaluate roots() candidates three times 2023-04-08 12:13:30 +09:00
Yuya Nishihara
eef6a77aa4 revset: reuse reachable dag-range set to calculate roots
This also removes the use of RevsetExpression::connected() API from the
evaluation engine.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
20aa31336e revset: extract dag-range calculation to function
The returned reachable set can be reused to calculate roots() expression.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
7dc35b82b0 revset: evaluate ancestors without using RevsetExpression builder API
I'm thinking of transforming RevsetExpression to a enum dedicated for
the evaluation stage. To help the migration, I want to remove the use of
the RevsetExpression builder API from the evaluation engine.

Fewer virtual dispatch is also better.
2023-04-08 12:13:30 +09:00
Martin von Zweigbergk
24a512683b revset: add a revset function for finding commits with conflicts
This adds `conflict()` revset that selects commits with conflicts. We
may want to extend it later to consider only conflicts at certain
paths.
2023-04-06 16:46:21 -07:00
Yuya Nishihara
308a5b9eae revset: make empty()/file(".") not load root tree for liner history
TreeDiffIterator wouldn't load identical subtrees, but it's up to caller to
optimize out the root tree loading.
2023-04-05 21:53:24 +09:00
Martin von Zweigbergk
e1c57338a1 revset: split out no-args head() to visible_heads()
The `heads()` revset function with one argument is the counterpart to
`roots()`. Without arguments, it returns the visible heads in the
repo, i.e. `heads(all())`. The two use cases are quite different, and
I think it would be good to clarify that the no-arg form returns the
visible heads, so let's split that out to a new `visible_heads()`
function.
2023-04-03 23:46:34 -07:00
Yuya Nishihara
982062bd75 revset: do not always evaluate filter node to InternalRevset
This basically removes hidden 'all() &' from union/negation of filters. To
achieve that, I have two options: 1. add separate evaluation path (like the
one this commit introduced), or 2. wrap "all()" revset to override predicate
as Box::new(|_| true) function. I took the former since it's less ad-hoc.

We can add an explicit RevsetExpression node to branch between evaluate()
and evaluate_predicate(), but I don't think it would simplify the
implementation at this point. We might need such node if we want to resolve
"all()" at resolve_symbols(). It might be even better to extract a subset of
RevsetExpression enum, which only contains evaluatable nodes.

The cost of 'all() &' isn't significant for most filters. '~merges()' is
the exception. For jj repo,

    revsets/:v0.3.0 & (author(martinvonz) | committer(martinvonz))
    --------------------------------------------------------------
    base     1.06      11.2±0.04m
    new      1.00      10.5±0.05m

    revsets/~merges()
    -----------------
    base     1.69     750.0±8.47µ
    new      1.00     444.1±3.50µ
2023-04-04 15:21:21 +09:00
Yuya Nishihara
69794f2585 revset: add method to upcast InternalRevset to ToPredicateFn 2023-04-04 15:21:21 +09:00
Yuya Nishihara
426f3e4e0a revset: simplify evaluation of "all()"
I think this is more readable, and apparently it produces slightly better code
maybe because the compiler can determine that there are no unwanted markers.
2023-04-04 15:21:21 +09:00
Yuya Nishihara
f1e2d19d57 revset: fully consume Present(_) node by resolve_symbols()
Since resolve_symbols() now removes Present(_) node, it make sense to
handle symbol resolution error there. That's why I added a "pre" callback
to try_transform_expression().

Perhaps, "operation" scope (#1283) can be implemented in a similar way,
(but somehow need to resolve operation id and call repo.reload_at(op).)
2023-04-03 10:55:03 +09:00
Yuya Nishihara
c28d2d7784 revset: split RevsetError into RevsetResolution/EvaluationError
This makes it clear that RevsetExpression::Present node is noop at the
evaluation stage.

RevsetEvaluationError::StoreError is unused right now, but I'm not sure if
it should be removed. It makes some sense that evaluate() can propagate
StoreError as it has access to the store.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
429562ca2f revset: implement Debug for RevsetImpl and add trait bound accordingly 2023-04-02 22:54:46 +09:00
Yuya Nishihara
2aab6c7825 revset: implement Debug for InternalRevset objects
Even though predicate function and RevWalk internals can't be debug printed,
it's useful to see an overview of InternalRevset tree.
2023-04-02 22:54:46 +09:00
Yuya Nishihara
b297b7c965 revset: turn PurePredicateFn into newtype struct
This will avoid extra boxing when converting PurePredicateFn to dyn
ToPredicateFn object.
2023-04-02 22:54:46 +09:00
Yuya Nishihara
fbb292f7c9 revset: relax lifetime bound of ToPredicateFn
We don't have to require that the input IndexEntry<'_> has 'index lifetime.
2023-04-02 22:54:46 +09:00
Yuya Nishihara
2404dc8cd3 revset: remove redundant type bound from RevWalkRevset
The "'index: 'a" bound can be removed by bypassing the Box<dyn> indirection
of self.iter().
2023-04-02 22:54:46 +09:00
Martin von Zweigbergk
3546cc1bf6 revset: pass in store, index, and heads instead of whole Repo
The `Repo` is a higher-level type that the index shouldn't have to
know about. With this change, a custom revset implementation should be
able evaluate the revset on a server without knowing which repo it
refers to.
2023-03-30 20:15:45 -07:00