86 Commits

Author SHA1 Message Date
Matt Kulukundis
34b0f87584 copy-tracking: plumb CopyRecordMap through diff method 2024-08-11 17:01:45 -04:00
Matt Kulukundis
e123eb21b9 copy-tracking: add source field to TreeDiffEntry
- add the field and make it compile, but don't use it yet
2024-08-11 17:01:45 -04:00
Matt Kulukundis
8e84c60157 copy-tracking: create an explicit TreeDiffEntry struct 2024-08-11 17:01:45 -04:00
Yuya Nishihara
ed1c07e73e tree: fill in valid id to null tree, rename function to empty()
If a null tree were written to the store, GitBackend would crash because of
invalid hash length.
2024-08-11 18:23:21 +09:00
Tim Janik
219a63540f local_working_copy: fix warnings ending up on stdout
As suggested by @crackcomm on discord, use eprintln!() to print warnings
to avoid messing up template output, e.g.:

jj --no-pager --ignore-working-copy show --tool true -T change_id -r rv...
rv...ignoring git submodule at "some/submodule"

Signed-off-by: Tim Janik <timj@gnu.org>
2024-07-12 10:32:13 +09:00
Ilya Grigoriev
c563ea8b29 clippy: remove warning when not using watchman feature
I keep seeing the one in lib/ when running tests in VS Code, but I also
fixed the warnings in watchman.rs for good measure.
2024-06-29 11:22:01 -07:00
Matt Kulukundis
c9b3d64ce5 Add background snapshotting info to debug watchman status. 2024-06-20 16:09:06 -04:00
Matt Kulukundis
8aa71f58f3 feat: add an option to monitor the filesystem asynchronously
- make an internal set of watchman extensions until the client api gets
  updates with triggers
- add a config option to enable using triggers in watchman

Co-authored-by: Waleed Khan <me@waleedkhan.name>
2024-06-16 23:24:22 -04:00
mlcui
458580cee2 working_copy: Add is_file_states_sorted to tree state proto
See #2651 and a935a4f70c9c4c4a76009f9aee3bdaa1f7b9084e for more
background.

This speeds up `jj log` in a large repo with watchman enabled by around
9%:

```
$ hyperfine --sort command --warmup 3 --runs 20 -L bin \
jj-before,jj-after "target/release/{bin} -R ~/chromiumjj/src log"
Benchmark 1: target/release/jj-before -R ~/chromiumjj/src log
  Time (mean ± σ):     788.3 ms ±   3.4 ms    [User: 618.6 ms, System: 168.8 ms]
  Range (min … max):   783.1 ms … 793.3 ms    20 runs

Benchmark 2: target/release/jj-after -R ~/chromiumjj/src log
  Time (mean ± σ):     713.4 ms ±   5.2 ms    [User: 536.1 ms, System: 176.2 ms]
  Range (min … max):   706.6 ms … 724.7 ms    20 runs

Relative speed comparison
        1.11 ±  0.01  target/release/jj-before -R ~/chromiumjj/src log
        1.00          target/release/jj-after -R ~/chromiumjj/src log
```
2024-05-31 22:28:35 +10:00
Martin von Zweigbergk
404f31cbc1 backend: add error variant for access denied, handle when diffing
Some backends, like the one we have at Google, can restrict access to
certain files. For such files, if they return a regular
`BackendError::ReadObject`, then that will terminate iteration in many
cases (e.g. when diffing or listing files). This patch adds a new
error variant for them to return instead, plus handling of such errors
in diff output and in the working copy.

In order to test the feature, I added a new commit backend that
returns the new `ReadAccessDenied` error when the caller tries to read
certain objects.
2024-05-30 18:27:38 -07:00
Martin von Zweigbergk
b227dde787 conflicts: indicate executable conflict in git-format diff 2024-05-22 06:46:58 -07:00
Martin von Zweigbergk
1970ddef15 tree: propagate errors from sub_tree()/path_value() 2024-05-22 06:46:38 -07:00
Thomas Castiglione
13c8f32ceb local_working_copy: fix some clippy lints that only show up on Windows 2024-05-21 14:37:17 +08:00
Thomas Castiglione
59d3a2c866 local_working_copy: when all sides of a conflict are executable, materialise the conflicted file as executable
Fixes #3579 and adds a testcase for an executable conflict treevalue.
2024-05-21 14:37:17 +08:00
Martin von Zweigbergk
428e209304 cleanup: consistently use BackendResult
We have the type alias so we should use it consistently.
2024-05-07 19:35:03 -07:00
Martin von Zweigbergk
8bb92fa6fa working_copy: allow load_working_copy() to return error
It's reasonable for a `WorkingCopy` implementation to want to return
an error. `LocalWorkingCopyFactory` doesn't because it loads all data
lazily. The VFS-based one at Google wants to be able to return an
error, however.
2024-04-19 15:22:37 -07:00
Austin Seipp
4b45dde8c6 clippy: disable bogus lints for nightly clippy
The nightly compiler has several clippy fix-its that, if applied, break the
build. There are various bugs about this, but there isn't enough space in the
margins to detail it all.

Just ignore these on a per-function basis; about 70% of them are just multiple
instances happening inside a single function.

This makes `cargo clippy --workspace --all-targets` run clean, even with the
nightly compiler.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Ic26a025d3c62b12fbf096171308b56e38f7d1bb9
2024-04-05 11:39:29 -05:00
Martin von Zweigbergk
c55e08023e workspace: don't lose sparsed-away paths when recovering workspace
When an operation is missing and we recover the workspace, we create a
new working-copy commit on top of the desired working-copy commit (per
the available head operation). We then reset the working copy to an
empty tree because it shouldn't really matter much which commit we
reset to. However, when the workspace is sparse, it does matter, as
the test case from the previous patch shows. This patch fixes it by
replacing the `reset_to_empty()` method by a new `recover(&Commit)`,
which effectively resets to the empty tree and then resets to the
commit. That way, any subsequent snapshotting will result keep the
paths from that tree for paths outside the sparse patterns.
2024-03-16 07:30:36 -07:00
Yuya Nishihara
a224d0f172 repo_path: show more detailed error if filesystem path failed to parse
This should address both use cases:
 1. If from_relative_path() is directly called, the error says ".." shouldn't
    be included in the (normalized) relative path.
 2. If parse_fs_path() is used, the error message contains paths relative to
    cwd. #3216
2024-03-09 11:01:43 +09:00
Thomas Castiglione
d661f59f9d working_copy: implement symlinks on windows with a helper function
enables symlink tests on windows, ignoring failures due to disabled developer mode,
and updates windows.md
2024-03-05 15:16:38 +08:00
Austin Seipp
6c31bab0d3 fsmonitor: allow core.fsmonitor = "none" to disable
When doing things like testing snapshot performance differences,
this allows you to turn off the monitor, no matter what the enabled
user or repository configuration has, e.g.

    jj st --config-toml='core.fsmonitor="none"'

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2024-02-20 20:19:47 -06:00
Daehyeok Mun
a9f489ccdf Switch to ignore crate for gitignore handling.
Co-authored-by: Waleed Khan <me@waleedkhan.name>
2024-02-20 09:12:46 -08:00
Martin von Zweigbergk
6c1aeff7a9 working copy: materialize symlinks on Windows as regular files
I was a bit surprised to learn (or be reminded?) that checking out
symlinks on Windows leads to a panic. This patch fixes the crash by
materializing symlinks from the repo as regular files. It also updates
the snapshotting code so we preserve the symlink-ness of a path. The
user can update the symlink in the repo by updating the regular file
in the working copy. This seems to match Git's behavior on Windows
when symlinks are disabled.
2024-02-09 09:20:24 -08:00
Martin von Zweigbergk
5a898b16a8 working_copy: handle symlink outside write_path_to_store()
The `write_path_to_store()` has almost no overlapping code between the
handling of symlinks and regular files, which suggests that we should
move out the handling of symlinks to the caller (there's only one).
2024-02-09 09:20:24 -08:00
Jonathan Tan
33f3a420a1 workspace: recover from missing operation
If the operation corresponding to a workspace is missing for some reason
(the specific situation in the test in this commit is that an operation
was abandoned and garbage-collected from another workspace), currently,
jj fails with a 255 error code. Teach jj a way to recover from this
situation.

When jj detects such a situation, it prints a message and stops
operation, similar to when a workspace is stale. The message tells the
user what command to run.

When that command is run, jj loads the repo at the @ operation (instead
of the operation of the workspace), creates a new commit on the @
commit with an empty tree, and then proceeds as usual - in particular,
including the auto-snapshotting of the working tree, which creates
another commit that obsoletes the newly created commit.

There are several design points I considered.

1) Whether the recovery should be automatic, or (as in this commit)
manual in that the user should be prompted to run a command. The user
might prefer to recover in another way (e.g. by simply deleting the
workspace) and this situation is (hopefully) rare enough that I think
it's better to prompt the user.

2) Which command the user should be prompted to run (and thus, which
command should be taught to perform the recovery). I chose "workspace
update-stale" because the circumstances are very similar to it: it's
symptom is that the regular jj operation is blocked somewhere at the
beginning, and "workspace update-stale" already does some special work
before the blockage (this commit adds more of such special work). But it
might be better for something more explicitly named, or even a sequence
of commands (e.g. "create a new operation that becomes @ that no
workspace points to", "low-level command that makes a workspace point to
the operation @") but I can see how this can be unnecessarily confusing
for the user.

3) How we recover. I can think of several ways:
a) Always create a commit, and allow the automatic snapshotting to
create another commit that obsoletes this commit.
b) Create a commit but somehow teach the automatic snapshotting to
replace the created commit in-place (so it has no predecessor, as viewed
in "obslog").
c) Do either a) or b), with the added improvement that if there is no
diff between the newly created commit and the former @, to behave as if
no new commit was created (@ remains as the former @).
I chose a) since it was the simplest and most easily reasoned about,
which I think is the best way to go when recovering from a rare
situation.
2024-02-09 00:38:47 -08:00
Martin von Zweigbergk
b343289238 working_copy: make reset() take a commit instead of a tree
Our virtual file system at Google (CitC) would like to know the commit
so it can scan backwards and find the closest mainline tree based on
it. Since we always record an operation id (which resolves to a
working-copy commit) when we write the working-copy state, it doesn't
seem like a restriction to require a commit.
2024-02-06 12:41:09 -08:00
Yuya Nishihara
77ceadbfd0 cleanup: remove remaining ": {source}" from error message templates 2024-02-04 09:13:21 +09:00
Daniel Ploch
cb889f0b45 workspace: combine working copy functions into a trait 2024-01-25 11:46:07 -08:00
Yuya Nishihara
5a7d8ac596 working_copy: don't follow symlinks when visiting files in gitignored directory
Fixes #2878
2024-01-24 16:38:48 +09:00
Martin von Zweigbergk
a66e2a0a6d working_copy: mark commit_id field in proto reserved
By marking it reserved, we prevent accidental use. We can still read
working copy protos that have the field.
2024-01-12 17:38:23 -08:00
Yuya Nishihara
fa5e40719c object_id: extract ObjectId trait and macros to separate module
I'm going to add a prefix resolution method to OpStore, but OpStore is
unrelated to the index. I think ObjectId, HexPrefix, and PrefixResolution can
be extracted to this module.
2024-01-05 10:20:57 +09:00
Martin von Zweigbergk
90744fb770 working copy: read files ahead when updating
If the commit backend has high latency, it can make a big difference
to read files concurrently. This patch updates the working copy code
to do that in the update code (when reading files from the backend to
write to the working copy). Because our backend at Google reads files
from a local daemon process that already does a lot of prefetching,
this patch doesn't actually help us. I think it's still the right
thing to do for backends that don't do the same kind of
prefetching. It speeds up `jj sparse set --add` by >10x when I disable
the prefetching in our daemon (our `Backend::concurrency()` is 100).
2023-12-29 13:37:13 -08:00
Yuya Nishihara
ac99145a28 working_copy: drop open file instance from PersistError
For the same reason as the file_util change.
2023-12-17 08:20:07 +09:00
Yuya Nishihara
601be0d480 working_copy: narrow file_states recursively while visiting directories
This saves another ~10ms.

Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     327.7 ms ±  24.9 ms    [User: 1059.1 ms, System: 654.3 ms]
  Range (min … max):   296.0 ms … 385.4 ms    20 runs

Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     311.0 ms ±  24.8 ms    [User: 960.0 ms, System: 643.1 ms]
  Range (min … max):   274.9 ms … 358.5 ms    20 runs
```
2023-11-30 12:09:31 +09:00
Yuya Nishihara
a935a4f70c working_copy: use proto file states without rebuilding BTreeMap
In snapshot(), changed_file_states are received in arbitrary order. For the
other callers, entries are in diff_stream order, so we don't have to sort
them.

With watchman enabled, we can see the cost of sorting the sorted proto entries.
I don't think this is significant, but we can mitigate it by adding
is_file_states_sorted flag to the proto message if needed:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     164.8 ms ±  16.6 ms    [User: 50.2 ms, System: 111.7 ms]
  Range (min … max):   148.1 ms … 195.0 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     171.8 ms ±  13.6 ms    [User: 61.7 ms, System: 109.0 ms]
  Range (min … max):   159.5 ms … 192.1 ms    20 runs
```

Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     367.3 ms ±  30.3 ms    [User: 1415.2 ms, System: 633.8 ms]
  Range (min … max):   325.4 ms … 421.7 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     327.7 ms ±  24.9 ms    [User: 1059.1 ms, System: 654.3 ms]
  Range (min … max):   296.0 ms … 385.4 ms    20 runs
```

I haven't measured snapshotting against dirty working copy, but I don't think
it would be slower than the original implementation.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
fca3690dda working_copy: add file states wrapper that provides map-like API
I'll replace the current lazy loading mechanism with this. Read-only methods
are implemented on the borrowed type so that we can narrow lookup scope
recursively.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
9292af5e52 working_copy: update file states in bulk
This helps migrate BTreeMap<RepoPath, _> to sorted Vec.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
c9150d02fc working_copy: don't look up file state twice while visiting directories 2023-11-30 12:09:31 +09:00
Yuya Nishihara
6ce7bd5338 repo_path: replace .contains() with .starts_with(), flipping the arguments
self.contains(other) means that the self tree contains the other tree (i.e.
the self path is prefix of the other), but it could be confused the other way
around if we were thinking about the path literal, not the tree. Let's add
.starts_with() instead by copying the std::path::Path definition.
2023-11-29 08:41:23 +09:00
Yuya Nishihara
bc9725c73c working_copy: use RepoPath::parent() which no longer allocates temporary object 2023-11-29 08:41:23 +09:00
Yuya Nishihara
28ab9593c3 repo_path: split RepoPath into owned and borrowed types
This enables cheap str-to-RepoPath cast, which is useful when sorting and
filtering a large Vec<(String, _)> list by using matcher for example. It
will also eliminate temporary allocation by repo_path.parent().
2023-11-28 07:33:28 +09:00
Yuya Nishihara
0a1bc2ba42 repo_path: add stub RepoPathBuf type, update callers
Most RepoPath::from_internal_string() callers will be migrated to the function
that returns &RepoPath, and cloning &RepoPath won't work.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
d322df0c8d matchers: make Files/PrefixMatcher constructors accept slice of borrowed paths
RepoPath will become slice type (like str), and it doesn't make sense to
require &[RepoPathBuf] here.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
55f75278bc repo_path: make to_internal_file_string() return &str, rename accordingly 2023-11-27 08:42:09 +09:00
Yuya Nishihara
974a6870b3 repo_path: make RepoPath::components() return iterator
This allows us to change the backing type from Vec<String> to String.
2023-11-27 08:42:09 +09:00
Yuya Nishihara
59ef3f0023 repo_path: split RepoPathComponent into owned and borrowed types
This is a step towards introducing a borrowed RepoPath type. The current
RepoPath type is inefficient as each component String is usually short. We
could apply short-string optimization, but still each inlined component would
consume 24 bytes just for e.g. "src", and increase the chance of random memory
access. If the owned RepoPath type is backed by String, we can implement cheap
cast from &str to borrowed &RepoPath type.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
6344cd56b3 repo_path: remove RepoPathJoin trait, just implement join() on the type
I don't think we'll add join() that takes different types.
2023-11-26 07:14:47 +09:00
Yuya Nishihara
042d26049c working_copy: lazily construct file_states BTreeMap
While it got faster to build a large BTreeMap<RepoPath, _>, there's still
a measurable cost. Let's eliminate it if watchman is enabled and the working
copy is clean. Perhaps, we should introduce new serialization format that
supports instant loading and lookup, but this hack works for the moment.
I'm not sure if the new tree_state format should be flat (RepoPath, _) list,
or tree like the backend storage btw.

In my "linux" repo (watchman enabled):
    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
      "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):     768.9 ms ±  14.2 ms    [User: 630.7 ms, System: 131.2 ms]
      Range (min … max):   742.3 ms … 783.1 ms    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     713.0 ms ±  16.8 ms    [User: 587.9 ms, System: 116.2 ms]
      Range (min … max):   681.5 ms … 731.1 ms    10 runs

    Relative speed comparison
            1.08 ±  0.03  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux status
2023-11-23 18:48:14 +09:00
Yuya Nishihara
12cd657837 working_copy: extract file_states_to_proto() helper
Just minimizing the changes in the next commit. As we already have
file_states_from_proto(), it makes sense to extract the "to" function.
2023-11-23 18:48:14 +09:00
Yuya Nishihara
1ddcaa43b3 fsmonitor: don't apply prefix matching to paths obtained from watchman
If I understand it, watchman returns changed files and directories, and a
directory change doesn't mean we need to scan all files under the directory.
2023-11-23 10:06:00 +09:00