MutableRepo and CommitBuilder both define public (now crate-public) functions
which should only be called by each other. This commit adds documentation and
restricts visibility of these functions to the jj_lib crate. It might be even
better to move CommitBuilder to the same module as MutableRepo so that these
codependent functions can be private to the module to avoid misuse.
These comments are intended to make it easier for new developers to get up to
speed with the project. This is just a starting point... there are other types
and functions that could benefit from documentation.
There's no need to have a block of code at the beginning of the function to
cache the rewrite source id. We can simply check the necessary condition before
calling record_rewritten_commit.
This tweak makes the function a little easier to read since we don't check the
condition until we're ready to do the work.
This reverts dc074363d180 "no-op: Move external git repo canonicalization into
Workspace::init_git_external." As I said in the PR comment, appending ".git"
is normalization of the user input, which is IMHO more appropriate to be done
in the CLI layer.
This allows us to define documentation comments for types implemented using the
id_type! macro. Comments defined above the type inside the macro will be
captured and visible in generated docs.
Example:
```
id_type!(
/// Stable identifier for a [`Commit`]. Unlike the `CommitId`, the `ChangeId`
/// follows the commit and is not updated when the commit is rewritten.
pub ChangeId
);
```
This commit also adds documentation for the `CommitId` and `ChangeId` types
defined using the `id_type!` macro.
This removes the special handling of the working-copy commit. By
recording when an empty/emptied commit was abanoned, we rebase
descendants correctly and create a new empty working-copy commit on
top.
This partially reverts changes in a9f489ccdf17 "Switch to ignore crate for
gitignore handling." Since child ignore object no longer needs to access the
root to resolve the prefix path, it's simpler to store a matcher per node.
With the current implementation, the file3 pattern is set to the prefix
"foo/foo/bar". I don't know if (unrooted) "baz" prefixed with "foo/foo/bar"
should match "foo/bar/baz", but apparently it is. Anyway, that wouldn't be
the case in practice because adjacent .gitignore files shouldn't be loaded.
We do clone Operation object in several places, and I'm going to add one more
.clone() in the templater. Since the underlying metadata has many fields, I
think it's better to wrap it with Arc just like a Commit object.
The default immutable_heads() includes tags(), which makes sense, but computing
heads(tags()) can be expensive because the tags() set is usually sparse. For
example, "jj bench revset 'heads(tags())'" took 157ms in my linux stable
mirror. We can of course optimize the heads evaluation by using bit set or
segmented index, but the query includes many historical heads if the repository
has per-release branches, which are uninteresting anyway. So, this patch
replaces heads(immutable_heads()) with trunk().
The reason we include heads(immutable_heads()) is to mitigate the following
problem. Suppose trunk() is the branch to be based off, I think using trunk()
here is pretty good.
```
A B
*---*----* trunk() ⊆ immutable_heads()
\
* C
```
https://github.com/martinvonz/jj/pull/2247#discussion_r1335078879
In my linux stable mirror, this makes the default log revset evaluation super
fast. immutable_heads(), if configured properly, includes many historical
branch heads which are also the visible heads.
revsets/immutable_heads()..
---------------------------
0 12.27 117.1±0.77m
3 1.00 9.5±0.08m
I'm going to add pre-filtering to the 'roots..heads' evaluation path, and
difference_by() will be used there to calculate 'heads ~ roots'.
Union and intersection iterators are slightly changed so that all iterators
prioritize iter1's item.
When doing things like testing snapshot performance differences,
this allows you to turn off the monitor, no matter what the enabled
user or repository configuration has, e.g.
jj st --config-toml='core.fsmonitor="none"'
Signed-off-by: Austin Seipp <aseipp@pobox.com>
This is a no-op in terms of function, but provides a nicer way to derive the
ContentHash trait for structs using the `#[derive(ContentHash)]` syntax used
for other traits such as `Debug`.
This commit only adds the macro. A subsequent commit will replace uses of
`content_hash!{}` with `#[derive(ContentHash)]`.
The new macro generates nice error messages, just like the old macro:
```
error[E0277]: the trait bound `NotImplemented: content_hash::ContentHash` is not satisfied
--> lib/src/content_hash.rs:265:16
|
265 | z: NotImplemented,
| ^^^^^^^^^^^^^^ the trait `content_hash::ContentHash` is not implemented for `NotImplemented`
|
= help: the following other types implement trait `content_hash::ContentHash`:
bool
i32
i64
u8
u32
u64
std::collections::HashMap<K, V>
BTreeMap<K, V>
and 38 others
```
This commit does two things to make proc macros re-exported by jj_lib useable
by deps:
1. jj_lib needs to be able refer to itself as `jj_lib` which it does
by adding an `extern crate self as jj_lib` declaration.
2. jj_lib::content_hash needs to re-export the `digest::Update` type so that
users of jj_lib can use the `#[derive(ContentHash)]` proc macro without
directly depending on the digest crate. This is done by re-exporting it
as `DigestUpdate`.
#3054
It should be useful at least in the presentation layer to know which
operations correspond to working-copy snapshots. They might be
rendered differently in the graph, for example. Or maybe an undo
command wants to warn if you just undid a snapshot operation. This
patch just introduces a field in the metadata to store the
information.
I think the conclusion from #2600 is that at least auto-rebasing
should not simplify merge commits that merge a commit with its
ancestor. Let's start by adding an option for that in the library.
The shortest change id prefix will become a few digits longer, but I think
that's acceptable. Entries included in the "revsets.short-prefixes" set are
unaffected.
The reachable set is calculated eagerly, but this is still faster as we no
longer need to sort the reachable entries by change id. The lazy version will
save another ~100ms in mid-size repos.
"jj log" without working copy snapshot:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1,jj-2 \
-s "target/release-with-debug/{bin} -R ~/mirrors/linux debug reindex" \
"target/release-with-debug/{bin} -R ~/mirrors/linux \
--ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
Time (mean ± σ): 353.6 ms ± 11.9 ms [User: 266.7 ms, System: 87.0 ms]
Range (min … max): 329.0 ms … 365.6 ms 20 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
Time (mean ± σ): 271.3 ms ± 9.9 ms [User: 183.8 ms, System: 87.7 ms]
Range (min … max): 250.5 ms … 282.7 ms 20 runs
Relative speed comparison
1.99 ± 0.16 target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
1.53 ± 0.12 target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
```
"jj status" with working copy snapshot (watchman enabled):
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1,jj-2 \
-s "target/release-with-debug/{bin} -R ~/mirrors/linux debug reindex" \
"target/release-with-debug/{bin} -R ~/mirrors/linux \
status --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
Time (mean ± σ): 396.6 ms ± 10.1 ms [User: 300.7 ms, System: 94.0 ms]
Range (min … max): 373.6 ms … 408.0 ms 20 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
Time (mean ± σ): 318.6 ms ± 12.6 ms [User: 219.1 ms, System: 94.1 ms]
Range (min … max): 294.2 ms … 333.0 ms 20 runs
Relative speed comparison
1.85 ± 0.14 target/release-with-debug/jj-0 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
1.48 ± 0.12 target/release-with-debug/jj-1 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
```
These methods are basically the same as the commit_id versions, but
resolve_change_id_prefix() is a bit more involved as we need to gather matches
from multiple segments.
In resolve_change_id_prefix(), I've implemented two different ways of
collecting the overflow items. I don't think they impact the performance,
but we can switch to the alternative method as needed.
This basically means that the change ids are interned. We'll implement binary
search over the sorted change ids table. The table could be sorted differently
for better cache locality, but it is in lexicographical order for simplicity.
With my testing, the cost of the id lookup isn't dominant.
Unlike the parent entries, the size of the per-id overflow items isn't saved.
That's s because the number of the same-change-id commits is either 1 or many.
It doesn't make sense to allocate 8 bytes for each change id. Instead, we'll
pay extra indirection cost to determine the size.
I'm going to add change id overflow table whose elements are of LocalPosition
type. Let's make sure that the serialization code would break if we changed
the underlying data type.