cybercyst/jj - jj - Gitea: Git with a cup of tea

mirror of https://github.com/martinvonz/jj.git synced 2025-05-30 19:32:39 +00:00

Author	SHA1	Message	Date
Austin Seipp	4b45dde8c6	clippy: disable bogus lints for nightly clippy The nightly compiler has several clippy fix-its that, if applied, break the build. There are various bugs about this, but there isn't enough space in the margins to detail it all. Just ignore these on a per-function basis; about 70% of them are just multiple instances happening inside a single function. This makes `cargo clippy --workspace --all-targets` run clean, even with the nightly compiler. Signed-off-by: Austin Seipp <aseipp@pobox.com> Change-Id: Ic26a025d3c62b12fbf096171308b56e38f7d1bb9	2024-04-05 11:39:29 -05:00
Martin von Zweigbergk	c55e08023e	workspace: don't lose sparsed-away paths when recovering workspace When an operation is missing and we recover the workspace, we create a new working-copy commit on top of the desired working-copy commit (per the available head operation). We then reset the working copy to an empty tree because it shouldn't really matter much which commit we reset to. However, when the workspace is sparse, it does matter, as the test case from the previous patch shows. This patch fixes it by replacing the `reset_to_empty()` method with a new `recover(&Commit)`, which effectively resets to the empty tree and then resets to the commit. That way, any subsequent snapshotting will keep the paths from that tree for paths outside the sparse patterns.	2024-03-16 07:30:36 -07:00
Yuya Nishihara	a224d0f172	repo_path: show more detailed error if filesystem path failed to parse This should address both use cases: 1. If from_relative_path() is directly called, the error says ".." shouldn't be included in the (normalized) relative path. 2. If parse_fs_path() is used, the error message contains paths relative to cwd. #3216	2024-03-09 11:01:43 +09:00
Thomas Castiglione	d661f59f9d	working_copy: implement symlinks on windows with a helper function enables symlink tests on windows, ignoring failures due to disabled developer mode, and updates windows.md	2024-03-05 15:16:38 +08:00
Austin Seipp	6c31bab0d3	fsmonitor: allow `core.fsmonitor = "none"` to disable When doing things like testing snapshot performance differences, this allows you to turn off the monitor, no matter what the enabled user or repository configuration has, e.g. jj st --config-toml='core.fsmonitor="none"' Signed-off-by: Austin Seipp <aseipp@pobox.com>	2024-02-20 20:19:47 -06:00
Daehyeok Mun	a9f489ccdf	Switch to ignore crate for gitignore handling. Co-authored-by: Waleed Khan <me@waleedkhan.name>	2024-02-20 09:12:46 -08:00
Martin von Zweigbergk	6c1aeff7a9	working copy: materialize symlinks on Windows as regular files I was a bit surprised to learn (or be reminded?) that checking out symlinks on Windows leads to a panic. This patch fixes the crash by materializing symlinks from the repo as regular files. It also updates the snapshotting code so we preserve the symlink-ness of a path. The user can update the symlink in the repo by updating the regular file in the working copy. This seems to match Git's behavior on Windows when symlinks are disabled.	2024-02-09 09:20:24 -08:00
Martin von Zweigbergk	5a898b16a8	working_copy: handle symlink outside write_path_to_store() The `write_path_to_store()` has almost no overlapping code between the handling of symlinks and regular files, which suggests that we should move out the handling of symlinks to the caller (there's only one).	2024-02-09 09:20:24 -08:00
Jonathan Tan	33f3a420a1	workspace: recover from missing operation If the operation corresponding to a workspace is missing for some reason (the specific situation in the test in this commit is that an operation was abandoned and garbage-collected from another workspace), currently, jj fails with a 255 error code. Teach jj a way to recover from this situation. When jj detects such a situation, it prints a message and stops operation, similar to when a workspace is stale. The message tells the user what command to run. When that command is run, jj loads the repo at the @ operation (instead of the operation of the workspace), creates a new commit on the @ commit with an empty tree, and then proceeds as usual - in particular, including the auto-snapshotting of the working tree, which creates another commit that obsoletes the newly created commit. There are several design points I considered. 1) Whether the recovery should be automatic, or (as in this commit) manual in that the user should be prompted to run a command. The user might prefer to recover in another way (e.g. by simply deleting the workspace) and this situation is (hopefully) rare enough that I think it's better to prompt the user. 2) Which command the user should be prompted to run (and thus, which command should be taught to perform the recovery). I chose "workspace update-stale" because the circumstances are very similar to it: its symptom is that the regular jj operation is blocked somewhere at the beginning, and "workspace update-stale" already does some special work before the blockage (this commit adds more of such special work). But it might be better for something more explicitly named, or even a sequence of commands (e.g. "create a new operation that becomes @ that no workspace points to", "low-level command that makes a workspace point to the operation @") but I can see how this can be unnecessarily confusing for the user. 3) How we recover. I can think of several ways: a) Always create a commit, and allow the automatic snapshotting to create another commit that obsoletes this commit. b) Create a commit but somehow teach the automatic snapshotting to replace the created commit in-place (so it has no predecessor, as viewed in "obslog"). c) Do either a) or b), with the added improvement that if there is no diff between the newly created commit and the former @, to behave as if no new commit was created (@ remains as the former @). I chose a) since it was the simplest and most easily reasoned about, which I think is the best way to go when recovering from a rare situation.	2024-02-09 00:38:47 -08:00
Martin von Zweigbergk	b343289238	working_copy: make reset() take a commit instead of a tree Our virtual file system at Google (CitC) would like to know the commit so it can scan backwards and find the closest mainline tree based on it. Since we always record an operation id (which resolves to a working-copy commit) when we write the working-copy state, it doesn't seem like a restriction to require a commit.	2024-02-06 12:41:09 -08:00
Yuya Nishihara	77ceadbfd0	cleanup: remove remaining ": {source}" from error message templates	2024-02-04 09:13:21 +09:00
Daniel Ploch	cb889f0b45	workspace: combine working copy functions into a trait	2024-01-25 11:46:07 -08:00
Yuya Nishihara	5a7d8ac596	working_copy: don't follow symlinks when visiting files in gitignored directory Fixes #2878	2024-01-24 16:38:48 +09:00
Martin von Zweigbergk	a66e2a0a6d	working_copy: mark commit_id field in proto reserved By marking it reserved, we prevent accidental use. We can still read working copy protos that have the field.	2024-01-12 17:38:23 -08:00
Yuya Nishihara	fa5e40719c	object_id: extract ObjectId trait and macros to separate module I'm going to add a prefix resolution method to OpStore, but OpStore is unrelated to the index. I think ObjectId, HexPrefix, and PrefixResolution can be extracted to this module.	2024-01-05 10:20:57 +09:00
Martin von Zweigbergk	90744fb770	working copy: read files ahead when updating If the commit backend has high latency, it can make a big difference to read files concurrently. This patch updates the working copy code to do that in the update code (when reading files from the backend to write to the working copy). Because our backend at Google reads files from a local daemon process that already does a lot of prefetching, this patch doesn't actually help us. I think it's still the right thing to do for backends that don't do the same kind of prefetching. It speeds up `jj sparse set --add` by >10x when I disable the prefetching in our daemon (our `Backend::concurrency()` is 100).	2023-12-29 13:37:13 -08:00
Yuya Nishihara	ac99145a28	working_copy: drop open file instance from PersistError For the same reason as the file_util change.	2023-12-17 08:20:07 +09:00
Yuya Nishihara	601be0d480	working_copy: narrow file_states recursively while visiting directories This saves another ~10ms. Without watchman: ``` % hyperfine --sort command --warmup 3 --runs 20 -L bin jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match" Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 327.7 ms ± 24.9 ms [User: 1059.1 ms, System: 654.3 ms] Range (min … max): 296.0 ms … 385.4 ms 20 runs Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 311.0 ms ± 24.8 ms [User: 960.0 ms, System: 643.1 ms] Range (min … max): 274.9 ms … 358.5 ms 20 runs ```	2023-11-30 12:09:31 +09:00
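The narrowing idea in the commit above can be illustrated with a small sketch. It assumes the file states are kept as a slice sorted by path in plain byte order, with paths stored as '/'-separated strings; the function name `narrow_to_dir` and the `(String, T)` entry layout are hypothetical and not taken from jj, which may order paths differently (for example, component by component).

```rust
/// Returns the sub-slice of `entries` whose paths lie under directory `dir`.
/// `entries` must be sorted by path in plain byte order; `dir` is a
/// '/'-separated path without a trailing slash, and "" denotes the root.
/// (Hypothetical helper for illustration only.)
fn narrow_to_dir<'a, T>(entries: &'a [(String, T)], dir: &str) -> &'a [(String, T)] {
    if dir.is_empty() {
        return entries;
    }
    // Every path under `dir` starts with "dir/", and strings sharing a
    // prefix form one contiguous run in a byte-sorted slice.
    let prefix = format!("{dir}/");
    // First entry that is not ordered before the prefix.
    let start = entries.partition_point(|(path, _)| path.as_str() < prefix.as_str());
    // Length of the run of entries that actually carry the prefix.
    let len = entries[start..].partition_point(|(path, _)| path.starts_with(&prefix));
    &entries[start..start + len]
}
```

Because the result is again a sorted slice, a directory visitor can call the function on the already narrowed slice when it descends into a subdirectory, so each level searches a smaller range.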
Yuya Nishihara	a935a4f70c	working_copy: use proto file states without rebuilding BTreeMap In snapshot(), changed_file_states are received in arbitrary order. For the other callers, entries are in diff_stream order, so we don't have to sort them. With watchman enabled, we can see the cost of sorting the sorted proto entries. I don't think this is significant, but we can mitigate it by adding is_file_states_sorted flag to the proto message if needed: ``` % hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 164.8 ms ± 16.6 ms [User: 50.2 ms, System: 111.7 ms] Range (min … max): 148.1 ms … 195.0 ms 20 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 171.8 ms ± 13.6 ms [User: 61.7 ms, System: 109.0 ms] Range (min … max): 159.5 ms … 192.1 ms 20 runs ``` Without watchman: ``` % hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 367.3 ms ± 30.3 ms [User: 1415.2 ms, System: 633.8 ms] Range (min … max): 325.4 ms … 421.7 ms 20 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 327.7 ms ± 24.9 ms [User: 1059.1 ms, System: 654.3 ms] Range (min … max): 296.0 ms … 385.4 ms 20 runs ``` I haven't measured snapshotting against dirty working copy, but I don't think it would be slower than the original implementation.	2023-11-30 12:09:31 +09:00
Yuya Nishihara	fca3690dda	working_copy: add file states wrapper that provides map-like API I'll replace the current lazy loading mechanism with this. Read-only methods are implemented on the borrowed type so that we can narrow lookup scope recursively.	2023-11-30 12:09:31 +09:00
Yuya Nishihara	9292af5e52	working_copy: update file states in bulk This helps migrate BTreeMap<RepoPath, _> to sorted Vec.	2023-11-30 12:09:31 +09:00
Yuya Nishihara	c9150d02fc	working_copy: don't look up file state twice while visiting directories	2023-11-30 12:09:31 +09:00
Yuya Nishihara	6ce7bd5338	repo_path: replace .contains() with .starts_with(), flipping the arguments self.contains(other) means that the self tree contains the other tree (i.e. the self path is prefix of the other), but it could be confused the other way around if we were thinking about the path literal, not the tree. Let's add .starts_with() instead by copying the std::path::Path definition.	2023-11-29 08:41:23 +09:00
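A minimal sketch of the `starts_with()` semantics described in the commit above, assuming paths are plain '/'-separated strings. The free function below is hypothetical and is not jj's `RepoPath` API; it only shows that the comparison works on whole components, as `std::path::Path::starts_with` does.

```rust
/// Returns true if `path` is `base` itself or lies somewhere under `base`.
/// Both arguments are '/'-separated relative paths; "" denotes the root.
/// (Hypothetical helper for illustration only.)
fn starts_with(path: &str, base: &str) -> bool {
    let mut path_parts = path.split('/').filter(|c| !c.is_empty());
    // Every component of `base` must equal the next component of `path`.
    base.split('/')
        .filter(|c| !c.is_empty())
        .all(|b| path_parts.next() == Some(b))
}
```

With this reading, `starts_with("src/lib.rs", "src")` holds, while `starts_with("src2/lib.rs", "src")` does not, because `src2` and `src` are different components even though one string is a prefix of the other.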
Yuya Nishihara	bc9725c73c	working_copy: use RepoPath::parent() which no longer allocates temporary object	2023-11-29 08:41:23 +09:00
Yuya Nishihara	28ab9593c3	repo_path: split RepoPath into owned and borrowed types This enables cheap str-to-RepoPath cast, which is useful when sorting and filtering a large Vec<(String, _)> list by using matcher for example. It will also eliminate temporary allocation by repo_path.parent().	2023-11-28 07:33:28 +09:00
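The owned/borrowed split described above follows the same pattern as `String`/`str` and `PathBuf`/`Path` in the standard library. The sketch below is an illustration under that assumption: the type names `PathSlice` and `PathOwned` and their methods are hypothetical stand-ins, not jj's actual `RepoPath` and `RepoPathBuf` definitions.

```rust
use std::ops::Deref;

/// Borrowed path: an unsized wrapper around `str`, analogous to `std::path::Path`.
/// (Hypothetical type for illustration only.)
#[repr(transparent)]
pub struct PathSlice(str);

/// Owned path: a wrapper around `String`, analogous to `std::path::PathBuf`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PathOwned(String);

impl PathSlice {
    /// Casts a string slice to a borrowed path without allocating.
    pub fn new(s: &str) -> &PathSlice {
        // SAFETY: `PathSlice` is `#[repr(transparent)]` over `str`, so the
        // two reference types have the same layout.
        unsafe { &*(s as *const str as *const PathSlice) }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Returns the parent directory as a borrowed sub-slice (no allocation),
    /// or `None` for the root path "".
    pub fn parent(&self) -> Option<&PathSlice> {
        if self.0.is_empty() {
            return None;
        }
        Some(match self.0.rfind('/') {
            Some(i) => PathSlice::new(&self.0[..i]),
            None => PathSlice::new(""),
        })
    }

    pub fn to_owned_path(&self) -> PathOwned {
        PathOwned(self.0.to_owned())
    }
}

impl Deref for PathOwned {
    type Target = PathSlice;

    fn deref(&self) -> &PathSlice {
        PathSlice::new(&self.0)
    }
}
```

Since the owned type dereferences to the borrowed one, methods such as `parent()` are written once on the borrowed type and are available on both.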
Yuya Nishihara	0a1bc2ba42	repo_path: add stub RepoPathBuf type, update callers Most RepoPath::from_internal_string() callers will be migrated to the function that returns &RepoPath, and cloning &RepoPath won't work.	2023-11-28 07:33:28 +09:00
Yuya Nishihara	d322df0c8d	matchers: make Files/PrefixMatcher constructors accept slice of borrowed paths RepoPath will become slice type (like str), and it doesn't make sense to require &[RepoPathBuf] here.	2023-11-28 07:33:28 +09:00
Yuya Nishihara	55f75278bc	repo_path: make to_internal_file_string() return &str, rename accordingly	2023-11-27 08:42:09 +09:00
Yuya Nishihara	974a6870b3	repo_path: make RepoPath::components() return iterator This allows us to change the backing type from Vec<String> to String.	2023-11-27 08:42:09 +09:00
Yuya Nishihara	59ef3f0023	repo_path: split RepoPathComponent into owned and borrowed types This is a step towards introducing a borrowed RepoPath type. The current RepoPath type is inefficient as each component String is usually short. We could apply short-string optimization, but still each inlined component would consume 24 bytes just for e.g. "src", and increase the chance of random memory access. If the owned RepoPath type is backed by String, we can implement cheap cast from &str to borrowed &RepoPath type.	2023-11-26 18:21:40 +09:00
Yuya Nishihara	6344cd56b3	repo_path: remove RepoPathJoin trait, just implement join() on the type I don't think we'll add join() that takes different types.	2023-11-26 07:14:47 +09:00
Yuya Nishihara	042d26049c	working_copy: lazily construct file_states BTreeMap While it got faster to build a large BTreeMap<RepoPath, _>, there's still a measurable cost. Let's eliminate it if watchman is enabled and the working copy is clean. Perhaps, we should introduce new serialization format that supports instant loading and lookup, but this hack works for the moment. I'm not sure if the new tree_state format should be flat (RepoPath, _) list, or tree like the backend storage btw. In my "linux" repo (watchman enabled): % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 768.9 ms ± 14.2 ms [User: 630.7 ms, System: 131.2 ms] Range (min … max): 742.3 ms … 783.1 ms 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 713.0 ms ± 16.8 ms [User: 587.9 ms, System: 116.2 ms] Range (min … max): 681.5 ms … 731.1 ms 10 runs Relative speed comparison 1.08 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status	2023-11-23 18:48:14 +09:00
Yuya Nishihara	12cd657837	working_copy: extract file_states_to_proto() helper Just minimizing the changes in the next commit. As we already have file_states_from_proto(), it makes sense to extract the "to" function.	2023-11-23 18:48:14 +09:00
Yuya Nishihara	1ddcaa43b3	fsmonitor: don't apply prefix matching to paths obtained from watchman If I understand it, watchman returns changed files and directories, and a directory change doesn't mean we need to scan all files under the directory.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	767e94f5af	fsmonitor: drop unneeded mut from make_fsmonitor_matcher() We only need &self.working_copy_path here.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	c16c89bc27	fsmonitor: keep paths relative to the workspace root Since the caller wants repo-relative paths, it doesn't make sense to convert them back and forth.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	5186066cf5	working_copy: simply collect() proto file states into BTreeMap Suppose the input list is presorted, sorting a sorted vec would be cheaper than .insert()-ing sorted items one by one. In my "linux" repo (watchman enabled): - jj-0: baseline - jj-1: previous (don't randomize by HashMap) - jj-2: this % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s] Range (min … max): 1.011 s … 1.068 s 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms] Range (min … max): 821.7 ms … 870.2 ms 10 runs Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status Time (mean ± σ): 786.2 ms ± 16.7 ms [User: 650.7 ms, System: 204.1 ms] Range (min … max): 760.8 ms … 805.2 ms 10 runs Relative speed comparison 1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status 1.00 target/release-with-debug/jj-2 -R ~/mirrors/linux status	2023-11-20 08:29:33 +09:00
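The two ways of building the map that the commit above compares can be sketched as follows. The `(String, u64)` entry type and both function names are hypothetical simplifications, not jj's proto types. Both functions produce the same map; the commit's claim is only that `collect()` is cheaper when the input is already sorted.

```rust
use std::collections::BTreeMap;

/// Builds the map in one step; the standard library is free to sort the
/// collected entries and construct the tree in bulk.
/// (Hypothetical helper for illustration only.)
fn file_states_by_collect(entries: Vec<(String, u64)>) -> BTreeMap<String, u64> {
    entries.into_iter().collect()
}

/// Builds the same map by inserting entries one at a time, which performs
/// a separate tree lookup for every entry.
fn file_states_by_insert(entries: Vec<(String, u64)>) -> BTreeMap<String, u64> {
    let mut map = BTreeMap::new();
    for (path, state) in entries {
        map.insert(path, state);
    }
    map
}
```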
Yuya Nishihara	ee6a1e2c0a	working_copy: don't build intermediate HashMap from proto file states According to the doc, this is compatible with the map syntax. https://protobuf.dev/programming-guides/proto3/#maps This change means that the serialized file states are sorted by RepoPath, so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses. In my "linux" repo (watchman enabled): - jj-0: baseline - jj-1: this % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s] Range (min … max): 1.011 s … 1.068 s 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms] Range (min … max): 821.7 ms … 870.2 ms 10 runs Relative speed comparison 1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status Cache-misses got reduced: % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \ -- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status 1,091.68 msec task-clock # 1.032 CPUs utilized 4,179,596,978 cycles # 3.829 GHz 6,166,231,489 instructions # 1.48 insn per cycle 134,032,047 cache-references # 122.776 M/sec 29,322,707 cache-misses # 21.88% of all cache refs 1.057474164 seconds time elapsed 0.897042000 seconds user 0.194819000 seconds sys % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \ -- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status 927.05 msec task-clock # 1.083 CPUs utilized 3,451,299,198 cycles # 3.723 GHz 6,222,418,272 instructions # 1.80 insn per cycle 98,499,363 cache-references # 106.251 M/sec 11,998,523 cache-misses # 12.18% of all cache refs 0.855938336 seconds time elapsed 0.720568000 seconds user 0.207924000 seconds sys	2023-11-20 08:29:33 +09:00
Yuya Nishihara	56047cb7ec	working_copy: don't pass all proto data to from_proto() functions Just a code cleanup. This allows us to consume proto fields if needed. I also removed redundant .clone() and .as_str().	2023-11-20 08:29:33 +09:00
Martin von Zweigbergk	9b24d24612	conflicts: add another helper for materializing a tree value We have a few places where we have a `MergedTreeValue` and need to read the data associated with it so we can write to the working copy or include it in a diff. Let's extract some of that shared logic to a function so we can reuse it. I plan to use it for reading file contents in advance while streaming a diff in `local_working_copy` soon (and probably in `jj diff` thereafter), but I think it seems like an improvement on its own.	2023-11-08 21:21:38 -08:00
Martin von Zweigbergk	65bd5cacba	working copy: on checkout, move read from store out of `write_()` functions I'd like to read N files ahead from the backend, to avoid serializing too many server calls on backends that are backed by a server. Moving the reads a little earlier is a little step towards that. The `TreeState::write_()` functions can now be made into free/static functions if we prefer.	2023-11-08 21:21:38 -08:00
Martin von Zweigbergk	904c37d36d	working copy: use `MergedTree::diff_stream()` This will make it a little faster to update the working copy at Google once we've made `MergedTree::diff_stream()` fetch trees concurrently. (It only makes it a little faster because we still fetch files serially.)	2023-11-03 08:15:10 -07:00
Martin von Zweigbergk	24b706641f	async: switch to `pollster`'s `block_on()` During the transition to using more async code, I keep running into https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want to convert `MergedTree::diff()` into a `Stream`. I don't want to update all call sites at once, so instead I'm adding a `MergedTree::diff_stream()` method, which just wraps `MergedTree::diff()` in a `Stream`. However, since the iterator is synchronous, it needs to block on the async `Backend::read_tree()` calls. If we then also block on the `Stream` in the CLI, we run into the panic.	2023-11-03 08:15:10 -07:00
Martin von Zweigbergk	a1ef9dc845	merged_tree: propagate backend errors in diff iterator I want to fix error propagation before I start using async in this code. This makes the diff iterator propagate errors from reading tree objects. Errors include the path and don't stop the iteration. The idea is that we should be able to show the user an error inline in diff output if we failed to read a tree. That's going to be especially useful for backends that can return `BackendError::AccessDenied`. That error variant doesn't yet exist, but I plan to add it, and use it in Google's internal backend.	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	309f1200d6	merge: introduce a type alias for `Merge<Option<TreeValue>>` Reasons to introduce this alias: * Reduces complexity of a type, to silence Clippy warnings in the future if we use this type as a type parameter * The type is used quite frequently, so it makes sense to have a name for it * It's easier to visually scan for the end of the type when you don't have to match opening and closing angle brackets	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	8764ad9826	conflicts: make materialization async We need to let async-ness propagate up from the backend because `block_on()` doesn't like to be called recursively. The conflict materialization code is a good place to make async because it doesn't depend on anything that isn't already async-ready.	2023-10-20 07:38:34 -07:00
Martin von Zweigbergk	6bfd618275	workspace: load working copy implementation dynamically This makes `Workspace::load()` look for a new `.jj/working_copy/type` file in order to load the right working copy implementation, just like `Repo::load()` picks the right backends based on `.jj/store/type`, `.jj/op_store/type`, etc. We don't write the file yet, and we don't have a way of adding alternative working copy implementations, so it will always be `LocalWorkingCopy` for now.	2023-10-16 22:33:44 -07:00
Martin von Zweigbergk	e1f00d9426	working copy: pass commit instead of tree into `check_out()` Our internal working copy implementations at Google will need the commit so they can walk history backwards until they get to a "public" commit. They'll then use that to tell build tools and virtual file systems to present that as a base. I'm not sure if we'll need to update `reset()` too. It's currently only used by `jj untrack`, which doesn't change the commit's parent, so it wouldn't affect any history walks.	2023-10-16 22:33:44 -07:00
Martin von Zweigbergk	0582893144	working copy: return `Box<dyn LockedWorkingCopy>` from `start_mutation()`	2023-10-15 16:13:19 -07:00
Martin von Zweigbergk	580586d008	working copy: return `Box<dyn WorkingCopy>` from `finish()`	2023-10-15 16:13:19 -07:00


70 Commits