482 Commits

Author SHA1 Message Date
Martin von Zweigbergk
24a512683b revset: add a revset function for finding commits with conflicts
This adds `conflict()` revset that selects commits with conflicts. We
may want to extend it later to consider only conflicts at certain
paths.
2023-04-06 16:46:21 -07:00
Martin von Zweigbergk
e1c57338a1 revset: split out no-args head() to visible_heads()
The `heads()` revset function with one argument is the counterpart to
`roots()`. Without arguments, it returns the visible heads in the
repo, i.e. `heads(all())`. The two use cases are quite different, and
I think it would be good to clarify that the no-arg form returns the
visible heads, so let's split that out to a new `visible_heads()`
function.
2023-04-03 23:46:34 -07:00
Yuya Nishihara
0bfdbcaa1e revset: don't rewrite '~set & filter' as difference
Since filter is slow in general, its input set should be minimized. This has
measurable impact on artificial query like '~(v0.4.0..) & author(_)'. If it
were evaluated as a difference of sets, all commits would have to be loaded.
2023-04-04 15:21:21 +09:00
Yuya Nishihara
3927c01d08 revset: make error type opaque to try_transform_expression()
It no longer handles RevsetResolutionError.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
f1e2d19d57 revset: fully consume Present(_) node by resolve_symbols()
Since resolve_symbols() now removes Present(_) node, it make sense to
handle symbol resolution error there. That's why I added a "pre" callback
to try_transform_expression().

Perhaps, "operation" scope (#1283) can be implemented in a similar way,
(but somehow need to resolve operation id and call repo.reload_at(op).)
2023-04-03 10:55:03 +09:00
Yuya Nishihara
aeb93c7591 revset: insert pre-order callback that can terminate transformation early
This will be a hook for resolve_symbols() to transform Present(_) subtree.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
feaad6b5fa revset: add type alias for Option<Rc<RevsetExpression>>
I'm going to parameterize error type of TransformResult, and the result type
will be replaced with Result<TransformedExpression, E>.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
c28d2d7784 revset: split RevsetError into RevsetResolution/EvaluationError
This makes it clear that RevsetExpression::Present node is noop at the
evaluation stage.

RevsetEvaluationError::StoreError is unused right now, but I'm not sure if
it should be removed. It makes some sense that evaluate() can propagate
StoreError as it has access to the store.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
429562ca2f revset: implement Debug for RevsetImpl and add trait bound accordingly 2023-04-02 22:54:46 +09:00
Martin von Zweigbergk
3546cc1bf6 revset: pass in store, index, and heads instead of whole Repo
The `Repo` is a higher-level type that the index shouldn't have to
know about. With this change, a custom revset implementation should be
able evaluate the revset on a server without knowing which repo it
refers to.
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
3ff1ab520b revset: remove public_heads()
The `public_heads()` revset only contains the root commit in
practice. I'm not sure what we want to do about phases, but since we
don't have any real support for them yet, let's just remove this
revset. I didn't update the changelog because we don't seem to have
documented the revset function (and it seems unlikely that users who
found out about it found it useful enough to use it when they could
just use `root`).
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
2a3d402d0c revset: also resolve branches(), tags(), etc. when resolving symbols
This is another step towards removing the `Repo` argument from
`Index::evaluate_revset()`.
2023-03-30 20:15:45 -07:00
Yuya Nishihara
0532301e03 revset: add latest(candidates, count) predicate
This serves the role of limit() in Mercurial. Since revsets in JJ is
(conceptually) an unordered set, a "limit" predicate should define its
ordering criteria. That's why the added predicate is named as "latest".

Closes #1110
2023-03-25 23:48:50 +09:00
Yuya Nishihara
185549f031 revset: extract helper to parse literal to e.g. usize 2023-03-25 23:48:50 +09:00
Martin von Zweigbergk
ce5c90b4e5 revset: use Index::has_id() for checking if a commit has been indexed
This avoids another use of `IndexEntry`.
2023-03-24 10:09:40 -07:00
Martin von Zweigbergk
75605e36af revset: iterate over commit ids instead of index entries
There are no remaining places where we iterate over a revset and need
the `IndexEntry`s, so we can now make `Revset::iter()` yield
`CommitId`s instead.
2023-03-23 21:58:15 -07:00
Martin von Zweigbergk
68cff2fa22 repo: get change id index from revset instead of building it in repo
This replaces the direct use of `IdIndex` in `ReadonlyRepo` by use of
`Revset::change_id_index()`.

I made the `Index` trait require `Send` and `Sync` in order to be able
to store an instance of it in `ReadonlyRepo` (via `ChangeIdIndex`) and
still have that be `Send` and `Sync`. We could alternatively store the
`ChangeIdIndex` in a `Mutex`. Now that will be up to the
`ChangeIdIndex` instead.
2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
27a7fccefa revset: add a method returning a change id index
One of the remaining places we depend on index positions is when
creating a `ChangeIdIndex`. This moves that into the revset engine
(which is coupled to the commit index implementation) by adding a
`Revset::change_id_index()` method. We will also use this function
later when add support for resolving change id prefixes within a small
revset.

The current implementation simply creates an in-memory index using the
existing `IdIndex` we have in `repo.rs`.

The custom implementation at Google might do the same for small
revsets that are available on the client, but for revsets involving
many commits on the server, it might use a suboptimmal implementation
that uses longer-than-necessary prefixes for performance reasons. That
can be done by querying a server-side index including changes not in
the revset, and then verifying that the resulting commits are actually
in the revset.
2023-03-23 20:49:15 -07:00
Yuya Nishihara
9d661d6f69 cli: render other kind of revset error suggestion as hint 2023-03-23 23:08:17 +09:00
Yuya Nishihara
ddeb645d7f cli: provide hint for typo of revset function name
This is similar to what Mercurial does. The similarity threshold is copied
from clap, but we might want to adjust it later.
2023-03-23 23:08:17 +09:00
Yuya Nishihara
d3d8afc77b revset: rewrite match table of builtin functions as HashMap
So that we can build a list of function names.
2023-03-23 23:08:17 +09:00
Martin von Zweigbergk
ea498df539 repo: when resolving change id prefix, directly return commit ids
The functions resolving a change id to commits currently return a
`Vec<IndexEntry>`. We want to avoid depending on `IndexEntry` and we
only need the commit ids here.
2023-03-21 21:43:44 -07:00
Martin von Zweigbergk
01d7239732 revset: make graph iterator yield commit ids (not index entries)
We only need `CommitId`s, and `IndexEntry` is specific to the default
index implementation.
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
2f876861ae graphlog: key by commit id (not index position)
The index position is specific to the default index implementation and
we don't want to use it in outside of there. This commit removes the
use of it as a key for nodes in the graphlog.

I timed it on the git.git repo using `jj log -r 'all()' -T commit_id`
(the worst case I can think of) and it slowed down from ~2.02 s to
~2.20 s (~9%).
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
e721a81780 revset: update documentation of Revset::iter()
Since we hid the graph iterator implementation behind
`Revset::iter_graph()`, I don't think we have any callers of
`Revset::iter()` require the iteration to be in index position order,
so let's not promise that. We do want to promise that the iteration is
in topological order with children before parents, however.
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
2495c8f27e cargo: update MSRV to 1.64
We need 1.64 to bump `clap` to `4.1`. We don't really need to upgrade
to that, but being on an older version causes minor confusions like
#1393. Rust 1.64 is very close to 6 months old at this point.
2023-03-17 22:44:29 -07:00
Martin von Zweigbergk
70d4a0f42e revset: remove context parameter from evaluate()
The `RevsetWorkspaceContext` argument is now instead used by the new
`resolve_symbol()` function.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
d971148e4e revset: move resolve_symbol() back to revset module
The only caller is now in `revset.rs`.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
94aec90bee revset: resolve symbols earlier, before passing to revset engine
For large repos, it's useful to be able to use shorter change id and
commit id prefixes by resolving the prefix in a limited subset of the
repo (typically the same subset that you'd want to see in your default
log output). For very large repos, like Google's internal one, the
shortest unique prefix evaluated within the whole repo is practically
useless because it's long enough that the user would want to copy and
paste it anyway.

Mercurial supports this with its `revisions.disambiguatewithin` config
(added in https://www.mercurial-scm.org/repo/hg/rev/503f936489dd). I'd
like to add the same feature to jj. Mercurial's implementation works
by attempting to resolve the prefix in the whole repo and then, if the
prefix was ambiguous, it resolves it in the configured subset
instead. The advantage of doing it that way is that there's no extra
cost of resolving the revset defining the subset if the prefix was not
ambiguous within the whole repo. However, there are two important
reasons to do it differently in jj:

* We support very large repos using custom backends, and it's probably
  cheaper to resolve a prefix within the subset because it can all be
  cached on the client. Resolving the prefix within the whole repo
  requires a roundtrip to the server.

* We want to be able to resolve change id prefixes, which is always
  done in *some* revset. That revset is currently `all()`, i.e. all
  visible commits. Even on local disk, it's probably cheaper to
  resolve a small revset first and then resolve the prefix within that
  than it is to build up the index of all visible change ids.

We could achieve the goal by letting each revset engine respect the
configured subset, but since the solution proposed above makes sense
also for local-disk repos, I think it's better to do it outside of the
revset engine, so all revset engines can share the code.

This commit prepares for the new functionality by moving the symbol
resolution out of `Index::evaluate_revset()`.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
1711cb61fb revset: add a Result version of transform_expression_bottom_up()
I'd like to add a stage after optimization for resolving most symbols
to commit IDs, and that needs to be able to fail.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
3871efd2f9 revset: move ReverseRevsetGraphIterator into revset module
The iterator is not specific to the implementation in
`revset_graph_iterator`, so it belongs in the standard `revset`
module.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
f62fac24ac revset: move graph iteration onto Revset trait
We want to allow custom revset engines define their own graph
iterator. This commit helps with that by adding a
`Revset::iter_graph()` function that returns an abstract iterator.

The current `RevsetGraphIterator` can be configured to skip or include
transitive edges. It skips them by default and we don't expose option
in the CLI. I didn't bother including that functionality in the new
`iter_graph()` either. At least for now, it will be up to the
implementation whether it includes such edges (it would of course be
free to ignore the caller's request even if we added an option for it
in the API).
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
28cbd7b1c5 revset: move evaluation into index
This commit adds an `evaluate_revset()` function to the `Index`
trait. It will require some further cleanup, but it already achieves
the goal of letting the index implementation decide which revset
engine to use.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
eed0b23009 revset: move current implementation to new module
We want to allow customization of the revset engine, so it can query
server indexes, for example. The current revset implementation will be
our default implementation for now. What's left in the `revset` module
after this commit is mostly parsing code.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
6a57456067 revset: remove is_empty() implementation from trait
Now that there's a single implementation of `Revset`, I think it makes
more sense for `is_empty()` to be defined there. Maybe different
revset engines have different ways of implementing it. Even if they
don't, this is trivial to re-implement in each revset engine.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
2d1b13b338 revset: make ToPredicateFn private, decouple public type from private
As the comment above `ToPredicateFn` says, it could be a private
type. This commit makes that happen by making the private `Revset`
implementations (`DifferenceRevset` etc.) instead implement an
internal revset type called `InternalRevset`. That type is what
extends `ToPredicateFn`, so the public type doesn't have to. The new
type will not need to implement the new functions I'm about to add to
the `Revset` trait.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
9e6c139fa0 revset: create wrapper for current revset implementation
We don't want the public `Revset` interface to know about
`ToPredicateFn`. In order to hide it, I'm wrapping the internal type
in another type, so only the internal type can keep implementing
`ToPredicateFn`.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
6c9cefb8a0 revset: make evaluate_revset() private and use it internally
I'd like to be able to change the return type of `evaluate_revset()`
to be an internal type. Since all external callers currently call the
function via `RevsetExpression::evaluate()`, it turns out it's easy to
make it private. To benefit from an internal type, we also need to
make the recursive calls be directly to the internal function.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
ff1b6ce3d1 index: extract a MutableIndex trait
This is yet another step towards making the index pluggable. The
`IndexStore` trait seems reasonable after this commit. There's still a
lot of work to remove `IndexPosition` from the `Index` trait.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
e5ba9d9e42 index: move current implementation to default_index_store.rs
The idea is that `index.rs` contains the interface, similar to
`backend.rs` for commits, and `op_store.rs` for operations.
2023-03-11 22:22:46 -08:00
Yuya Nishihara
8f9bc4e7a6 revset: ignore all ascii whitespace characters 2023-03-04 00:01:54 +09:00
Martin von Zweigbergk
bbd6ef0c7b revset: remove filter_by_diff(), have caller intersect expression
To be able to make e.g. `jj log some/path` perform well on cloud-based
repos, a custom revset engine needs to be able to see the paths to
filter by. That way it is able pass those to a server-side index. This
commit helps with that by effectively converting `jj log -r foo
some/path` into `jj log -r 'foo & file(some/path)'`.
2023-02-28 17:45:34 -08:00
Martin von Zweigbergk
bc9f66dad3 revset: replace RevsetIterator wrapper by extension
The type doesn't seem to provide any benefit. I don't think I had a
good reason for creating it in the first place; it was probably just
unfamiliarity with Rust.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
91e56c7f2f revset: add type parameters to remaining iterator adapters
This is another step towards removing `RevsetIterator`. These types
are private, so someone using the library can't accidentally create a
`UnionRevsetIterator` with inputs in different order, for example.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
44e6ef9bae revset: make into_predicate_fn() a free function
I'm about to convert `RevsetIterator` to a trait and I don't want
`into_predicate_fn()` on it.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
36aa6e0be6 revset: add type parameter in commit iterator adapters
The standard way of creating iterator adapters seems to have the input
iterator type by a parameter to the function, so let's follow that
pattern.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
30160f4d20 revset: pass revset, not iterator, into RevsetGraphIterator
I was thinking of replacing `RevsetIterator` by a regular
`Iterator<Item=IndexEntry>`. However, that would make it easier to
pass in an iterator that produces revisions in a non-topological order
into `RevsetGraphIterator`, which would produce unexpected results (it
would result in nodes that are not connected to their parents, if
their parents had already been emitted). I think it makes sense to
instead pass in a revset into `RevsetGraphIterator`.

Incidentally, it will also be useful to have the full revset available
in `RevsetGraphIterator` if we rewrite the algorithm to be more
similar to Mercurial's and Sapling's algorithm, which involves asking
the revset if it contains parent revisions.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
8966580ba4 revset: clean up 'revset lifetimes on dyn Revset
If I understand correctly, the 'revset lifetimes on `Box<dyn
Revset<'index> + 'revset>` are not constrained by the lifetime of a
revset; we don't have any revsets that borrow data from other
revsets. Instead, they're all about constraining a boxed revset to the
index's lifetime. Without the lifetime annotation, it would default to
'static, and the borrow-checker doesn't like `dyn Revset<'index> +
'static`, since the revset could then live longer than the index it
borrows.
2023-02-17 07:52:17 -08:00
Martin von Zweigbergk
0159092c31 revset: elide lifetime parameter on iter()
The self-lifetime is the default when elided.
2023-02-17 07:52:17 -08:00
Martin von Zweigbergk
53565a816c revset: rename 'repo lifetime to the more accurate 'index 2023-02-17 07:52:17 -08:00