298 Commits

Author SHA1 Message Date
Martin von Zweigbergk
5e7c57c527 revset: introduce a trait for resolving symbols
I'd like to make the symbol resolution more flexible, both so we can
support customizing it (in custom `jj` binaries) and so we can use it
for resolving short prefixes within a small revset.
2023-05-11 23:41:24 -07:00
Martin von Zweigbergk
b11d38fdf8 revset: delete obsolete comment about ambiguous change/commit ids
Commit ids and change ids or prefixes of either are never ambiguous
since we started using k-z for change ids.
2023-05-03 16:23:14 -07:00
Martin von Zweigbergk
9cf0440aef revset: elide some lifetimes (reported by IntelliJ) 2023-05-03 16:23:14 -07:00
Yuya Nishihara
837e8aa81a revset: add substitution rule for nested descendants/children
The substitution rule and tests are copied from ancestors/parents. The backend
logic will be reimplemented later. For now, it naively repeats children().
2023-04-24 20:45:13 +09:00
Yuya Nishihara
32253fed5e revset: replace children node with descendants of generation 1..2 2023-04-24 20:45:13 +09:00
Yuya Nishihara
a99b82c634 revset: add generation parameter to descendants node
This is a minimal change to replace Children with Descendants. A generation
parameter could be added to RevsetExpression::DagRange, but it's not needed
as of now.
2023-04-24 20:45:13 +09:00
Yuya Nishihara
f6570486e0 revset: add descendants node to expression, resolve it later
I'll add a substitution rule that folds (x+)+, and 'Descendants { roots }'
is easier to process than 'DagRange { roots, heads }'.
2023-04-24 20:45:13 +09:00
Martin von Zweigbergk
e492548772 revset: bump generation numbers in API to 64 bits
A chain of 4 billion commits is a lot, but it's not out of the
question, so let's support it. The current default index will not be
able to handle that many commits, so I let that still use 32-bit
integers.
2023-04-12 21:18:49 -07:00
Yuya Nishihara
5351371d51 revset: resolve visible heads prior to evaluation 2023-04-10 00:39:58 +09:00
Yuya Nishihara
7e1e9efa38 revset: resolve "all()" prior to evaluation 2023-04-10 00:39:58 +09:00
Yuya Nishihara
f43f0d24b8 revset: resolve candidates of children set prior to evaluation 2023-04-10 00:39:58 +09:00
Yuya Nishihara
7974269bab revset: remove None variant from resolved enum, use Commits([]) instead
We'll remove All, so it makes sense to not have None either.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
0fcc13a6f4 revset: make resolve() return different type describing evaluation plan
New ResolvedExpression enum ensures that the evaluation engine doesn't have
to know the symbol resolution details. In this commit, I've moved Filter
and NotIn resolution to resolve_visibility(). Implicit All/VisibleHeads
resolution will be migrated later.

It's tempting to combine resolve_symbols() and resolve_visibility() to get
rid of panic!()s, but the resolution might have to be two passes to first
resolve&collect explicit commit ids, and then substitute "all()" with
"(:visible_heads())|commit_id|..". It's also possible to apply some tree
transformation after symbol resolution.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
6c2525cb93 revset: add "resolve" method to RevsetExpression, always call it
I'll make the resolution stage mandatory, and have it return a "resolved"
type. RevsetExpression::evaluate() will be moved to the "resolved" type.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
6d9b836d10 revset: extract unresolved commit references to separate enum
This makes it clear what should be resolved at resolve_symbols(). Symbol
is a bit special while parsing function arguments, but it's no different
than the other unresolved references at expression level.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
fc65b00020 revset: extract CommitId resolution to function
I'm going to merge unresolved variants as RevsetExpression::CommitRef(_).
This prepares for the change.
2023-04-10 00:39:58 +09:00
Martin von Zweigbergk
24a512683b revset: add a revset function for finding commits with conflicts
This adds `conflict()` revset that selects commits with conflicts. We
may want to extend it later to consider only conflicts at certain
paths.
2023-04-06 16:46:21 -07:00
Martin von Zweigbergk
e1c57338a1 revset: split out no-args head() to visible_heads()
The `heads()` revset function with one argument is the counterpart to
`roots()`. Without arguments, it returns the visible heads in the
repo, i.e. `heads(all())`. The two use cases are quite different, and
I think it would be good to clarify that the no-arg form returns the
visible heads, so let's split that out to a new `visible_heads()`
function.
2023-04-03 23:46:34 -07:00
Yuya Nishihara
0bfdbcaa1e revset: don't rewrite '~set & filter' as difference
Since filter is slow in general, its input set should be minimized. This has
measurable impact on artificial query like '~(v0.4.0..) & author(_)'. If it
were evaluated as a difference of sets, all commits would have to be loaded.
2023-04-04 15:21:21 +09:00
Yuya Nishihara
3927c01d08 revset: make error type opaque to try_transform_expression()
It no longer handles RevsetResolutionError.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
f1e2d19d57 revset: fully consume Present(_) node by resolve_symbols()
Since resolve_symbols() now removes Present(_) node, it make sense to
handle symbol resolution error there. That's why I added a "pre" callback
to try_transform_expression().

Perhaps, "operation" scope (#1283) can be implemented in a similar way,
(but somehow need to resolve operation id and call repo.reload_at(op).)
2023-04-03 10:55:03 +09:00
Yuya Nishihara
aeb93c7591 revset: insert pre-order callback that can terminate transformation early
This will be a hook for resolve_symbols() to transform Present(_) subtree.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
feaad6b5fa revset: add type alias for Option<Rc<RevsetExpression>>
I'm going to parameterize error type of TransformResult, and the result type
will be replaced with Result<TransformedExpression, E>.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
c28d2d7784 revset: split RevsetError into RevsetResolution/EvaluationError
This makes it clear that RevsetExpression::Present node is noop at the
evaluation stage.

RevsetEvaluationError::StoreError is unused right now, but I'm not sure if
it should be removed. It makes some sense that evaluate() can propagate
StoreError as it has access to the store.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
429562ca2f revset: implement Debug for RevsetImpl and add trait bound accordingly 2023-04-02 22:54:46 +09:00
Martin von Zweigbergk
3546cc1bf6 revset: pass in store, index, and heads instead of whole Repo
The `Repo` is a higher-level type that the index shouldn't have to
know about. With this change, a custom revset implementation should be
able evaluate the revset on a server without knowing which repo it
refers to.
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
3ff1ab520b revset: remove public_heads()
The `public_heads()` revset only contains the root commit in
practice. I'm not sure what we want to do about phases, but since we
don't have any real support for them yet, let's just remove this
revset. I didn't update the changelog because we don't seem to have
documented the revset function (and it seems unlikely that users who
found out about it found it useful enough to use it when they could
just use `root`).
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
2a3d402d0c revset: also resolve branches(), tags(), etc. when resolving symbols
This is another step towards removing the `Repo` argument from
`Index::evaluate_revset()`.
2023-03-30 20:15:45 -07:00
Yuya Nishihara
0532301e03 revset: add latest(candidates, count) predicate
This serves the role of limit() in Mercurial. Since revsets in JJ is
(conceptually) an unordered set, a "limit" predicate should define its
ordering criteria. That's why the added predicate is named as "latest".

Closes #1110
2023-03-25 23:48:50 +09:00
Yuya Nishihara
185549f031 revset: extract helper to parse literal to e.g. usize 2023-03-25 23:48:50 +09:00
Martin von Zweigbergk
ce5c90b4e5 revset: use Index::has_id() for checking if a commit has been indexed
This avoids another use of `IndexEntry`.
2023-03-24 10:09:40 -07:00
Martin von Zweigbergk
75605e36af revset: iterate over commit ids instead of index entries
There are no remaining places where we iterate over a revset and need
the `IndexEntry`s, so we can now make `Revset::iter()` yield
`CommitId`s instead.
2023-03-23 21:58:15 -07:00
Martin von Zweigbergk
68cff2fa22 repo: get change id index from revset instead of building it in repo
This replaces the direct use of `IdIndex` in `ReadonlyRepo` by use of
`Revset::change_id_index()`.

I made the `Index` trait require `Send` and `Sync` in order to be able
to store an instance of it in `ReadonlyRepo` (via `ChangeIdIndex`) and
still have that be `Send` and `Sync`. We could alternatively store the
`ChangeIdIndex` in a `Mutex`. Now that will be up to the
`ChangeIdIndex` instead.
2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
27a7fccefa revset: add a method returning a change id index
One of the remaining places we depend on index positions is when
creating a `ChangeIdIndex`. This moves that into the revset engine
(which is coupled to the commit index implementation) by adding a
`Revset::change_id_index()` method. We will also use this function
later when add support for resolving change id prefixes within a small
revset.

The current implementation simply creates an in-memory index using the
existing `IdIndex` we have in `repo.rs`.

The custom implementation at Google might do the same for small
revsets that are available on the client, but for revsets involving
many commits on the server, it might use a suboptimmal implementation
that uses longer-than-necessary prefixes for performance reasons. That
can be done by querying a server-side index including changes not in
the revset, and then verifying that the resulting commits are actually
in the revset.
2023-03-23 20:49:15 -07:00
Yuya Nishihara
9d661d6f69 cli: render other kind of revset error suggestion as hint 2023-03-23 23:08:17 +09:00
Yuya Nishihara
ddeb645d7f cli: provide hint for typo of revset function name
This is similar to what Mercurial does. The similarity threshold is copied
from clap, but we might want to adjust it later.
2023-03-23 23:08:17 +09:00
Yuya Nishihara
d3d8afc77b revset: rewrite match table of builtin functions as HashMap
So that we can build a list of function names.
2023-03-23 23:08:17 +09:00
Martin von Zweigbergk
ea498df539 repo: when resolving change id prefix, directly return commit ids
The functions resolving a change id to commits currently return a
`Vec<IndexEntry>`. We want to avoid depending on `IndexEntry` and we
only need the commit ids here.
2023-03-21 21:43:44 -07:00
Martin von Zweigbergk
01d7239732 revset: make graph iterator yield commit ids (not index entries)
We only need `CommitId`s, and `IndexEntry` is specific to the default
index implementation.
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
2f876861ae graphlog: key by commit id (not index position)
The index position is specific to the default index implementation and
we don't want to use it in outside of there. This commit removes the
use of it as a key for nodes in the graphlog.

I timed it on the git.git repo using `jj log -r 'all()' -T commit_id`
(the worst case I can think of) and it slowed down from ~2.02 s to
~2.20 s (~9%).
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
e721a81780 revset: update documentation of Revset::iter()
Since we hid the graph iterator implementation behind
`Revset::iter_graph()`, I don't think we have any callers of
`Revset::iter()` require the iteration to be in index position order,
so let's not promise that. We do want to promise that the iteration is
in topological order with children before parents, however.
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
2495c8f27e cargo: update MSRV to 1.64
We need 1.64 to bump `clap` to `4.1`. We don't really need to upgrade
to that, but being on an older version causes minor confusions like
#1393. Rust 1.64 is very close to 6 months old at this point.
2023-03-17 22:44:29 -07:00
Martin von Zweigbergk
70d4a0f42e revset: remove context parameter from evaluate()
The `RevsetWorkspaceContext` argument is now instead used by the new
`resolve_symbol()` function.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
d971148e4e revset: move resolve_symbol() back to revset module
The only caller is now in `revset.rs`.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
94aec90bee revset: resolve symbols earlier, before passing to revset engine
For large repos, it's useful to be able to use shorter change id and
commit id prefixes by resolving the prefix in a limited subset of the
repo (typically the same subset that you'd want to see in your default
log output). For very large repos, like Google's internal one, the
shortest unique prefix evaluated within the whole repo is practically
useless because it's long enough that the user would want to copy and
paste it anyway.

Mercurial supports this with its `revisions.disambiguatewithin` config
(added in https://www.mercurial-scm.org/repo/hg/rev/503f936489dd). I'd
like to add the same feature to jj. Mercurial's implementation works
by attempting to resolve the prefix in the whole repo and then, if the
prefix was ambiguous, it resolves it in the configured subset
instead. The advantage of doing it that way is that there's no extra
cost of resolving the revset defining the subset if the prefix was not
ambiguous within the whole repo. However, there are two important
reasons to do it differently in jj:

* We support very large repos using custom backends, and it's probably
  cheaper to resolve a prefix within the subset because it can all be
  cached on the client. Resolving the prefix within the whole repo
  requires a roundtrip to the server.

* We want to be able to resolve change id prefixes, which is always
  done in *some* revset. That revset is currently `all()`, i.e. all
  visible commits. Even on local disk, it's probably cheaper to
  resolve a small revset first and then resolve the prefix within that
  than it is to build up the index of all visible change ids.

We could achieve the goal by letting each revset engine respect the
configured subset, but since the solution proposed above makes sense
also for local-disk repos, I think it's better to do it outside of the
revset engine, so all revset engines can share the code.

This commit prepares for the new functionality by moving the symbol
resolution out of `Index::evaluate_revset()`.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
1711cb61fb revset: add a Result version of transform_expression_bottom_up()
I'd like to add a stage after optimization for resolving most symbols
to commit IDs, and that needs to be able to fail.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
3871efd2f9 revset: move ReverseRevsetGraphIterator into revset module
The iterator is not specific to the implementation in
`revset_graph_iterator`, so it belongs in the standard `revset`
module.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
f62fac24ac revset: move graph iteration onto Revset trait
We want to allow custom revset engines define their own graph
iterator. This commit helps with that by adding a
`Revset::iter_graph()` function that returns an abstract iterator.

The current `RevsetGraphIterator` can be configured to skip or include
transitive edges. It skips them by default and we don't expose option
in the CLI. I didn't bother including that functionality in the new
`iter_graph()` either. At least for now, it will be up to the
implementation whether it includes such edges (it would of course be
free to ignore the caller's request even if we added an option for it
in the API).
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
28cbd7b1c5 revset: move evaluation into index
This commit adds an `evaluate_revset()` function to the `Index`
trait. It will require some further cleanup, but it already achieves
the goal of letting the index implementation decide which revset
engine to use.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
eed0b23009 revset: move current implementation to new module
We want to allow customization of the revset engine, so it can query
server indexes, for example. The current revset implementation will be
our default implementation for now. What's left in the `revset` module
after this commit is mostly parsing code.
2023-03-14 05:32:02 -07:00