This introduces a `MergedTreeBuilder` type, which takes a set of base
trees and overrides. The idea is that it will be able to write
multiple trees or a legacy tree. For now, it's only able to write
legacy trees. To show that it works, the working copy's snapshotting
code has been updated to use it.
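Roughly, the shape is something like this (a sketch with simplified stand-in types, not the actual jj-lib definitions): the builder starts from a base tree and collects per-path overrides, and for now it flattens everything into a single legacy tree when writing.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for jj-lib types; the real ones are richer.
type RepoPath = String;

#[derive(Debug)]
enum TreeValue {
    File { id: String, executable: bool },
}

/// Illustrative `MergedTreeBuilder`: a base tree plus per-path overrides,
/// where `None` means "delete this path".
struct MergedTreeBuilder {
    base: BTreeMap<RepoPath, TreeValue>,
    overrides: BTreeMap<RepoPath, Option<TreeValue>>,
}

impl MergedTreeBuilder {
    fn new(base: BTreeMap<RepoPath, TreeValue>) -> Self {
        MergedTreeBuilder { base, overrides: BTreeMap::new() }
    }

    fn set_or_remove(&mut self, path: RepoPath, value: Option<TreeValue>) {
        self.overrides.insert(path, value);
    }

    /// For now, only a single legacy tree can be written: apply the
    /// overrides to the base and return the result.
    fn write_legacy_tree(self) -> BTreeMap<RepoPath, TreeValue> {
        let mut tree = self.base;
        for (path, value) in self.overrides {
            match value {
                Some(value) => {
                    tree.insert(path, value);
                }
                None => {
                    tree.remove(&path);
                }
            }
        }
        tree
    }
}

fn main() {
    let mut base = BTreeMap::new();
    base.insert(
        "README.md".to_string(),
        TreeValue::File { id: "id1".to_string(), executable: false },
    );
    let mut builder = MergedTreeBuilder::new(base);
    builder.set_or_remove(
        "src/lib.rs".to_string(),
        Some(TreeValue::File { id: "id2".to_string(), executable: false }),
    );
    builder.set_or_remove("README.md".to_string(), None);
    println!("{:?}", builder.write_legacy_tree());
}
```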
We were using `current_tree()` only for an assertion where we were
walking its entries. Now that `MergedTree` supports that, we can
replace `current_tree()` by `current_merged_tree()`.
There's more work needed before the working copy can fully work with
tree-level conflicts. We still need to be able to store multiple tree
ids in the `tree_state` file, and we need to be able to create
multiple trees instead of writing conflict objects to the backend.
To support tree-level conflicts, we're going to need to update the
working copy from one `MergedTree` to another. We're going to need to
store multiple tree ids in the `tree_state` file. This patch gets us
closer to that by getting the diff from `MergedTree`s, even though we
assume that they are legacy trees for now, so we can write to the
single-tree `tree_state` file.
When we do an update between two `MergedTree` instances, we'll get
diffs between two `Merge<Option<TreeValue>>` values. This commit prepares for
that by changing the type of the `before` and `after` arguments we
pass into the closure in `update()`.
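In sketch form (stand-in types, not the actual signatures), the closure goes from taking plain optional tree values to taking possibly-conflicted merges on both sides:

```rust
// Simplified stand-ins for jj-lib types.
type RepoPath = String;

#[derive(Debug)]
enum TreeValue {
    File { id: String, executable: bool },
}

/// Minimal `Merge<T>`: a resolved value has a single term; a conflict has more.
#[derive(Debug)]
struct Merge<T> {
    terms: Vec<T>,
}

impl<T> Merge<T> {
    fn resolved(value: T) -> Self {
        Merge { terms: vec![value] }
    }
}

/// The closure now receives `Merge<Option<TreeValue>>` for both sides of the
/// diff instead of plain `Option<TreeValue>`s.
fn update(
    diff: Vec<(RepoPath, Merge<Option<TreeValue>>, Merge<Option<TreeValue>>)>,
    mut apply: impl FnMut(&RepoPath, &Merge<Option<TreeValue>>, &Merge<Option<TreeValue>>),
) {
    for (path, before, after) in &diff {
        apply(path, before, after);
    }
}

fn main() {
    let before = Merge::resolved(Some(TreeValue::File { id: "old".to_string(), executable: false }));
    let after = Merge::resolved(Some(TreeValue::File { id: "new".to_string(), executable: true }));
    update(vec![("foo.txt".to_string(), before, after)], |path, before, after| {
        println!("{path}: {before:?} -> {after:?}");
    });
}
```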
I think it's a little easier to follow if we don't update the stats in
the large callback. It also reduces the risk of forgetting to update
the stats in some case (like in the exec-bit-optimization case I just
removed).
When updating the working copy from one tree to another, if only the
executable bit has changed between the two trees, we set the
executable bit on the file without touching its contents. The
optimization probably gets used quite rarely. Maybe it's used so
rarely that it's even a pessimization overall. Perhaps its value lies more
in that we avoid updating the file's mtime unnecessarily. Either way,
I'm about to change this code to use `Merge<Option<TreeValue>>` and
that will make this block more complex. I don't think it's worth the
complexity even if it provides some small benefit sometimes.
When the main `TreeState::snapshot()` thread doesn't receive any
updated tree entries over the channel, it correctly doesn't write a
new tree. However, it also doesn't write the working copy state file
(`.jj/working_copy/tree_state`). This resulted in a performance
regression in 3f97a6da783a7. From that commit, repeated snapshotting
would have to re-read all files from disk because it didn't remember
the updated mtime from the previous time.
This patch fixes the bug by also writing the file if there were any
new file states.
This doesn't seem to make any difference right now, but it will if we
write the state file when there are mtime-only changes, which we
currently don't do.
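The shape of the fix, with hypothetical names (the real check lives in `TreeState::snapshot()`):

```rust
/// Illustrative sketch: whether the `tree_state` file should be rewritten
/// after a snapshot.
struct SnapshotOutcome {
    tree_changed: bool,
    file_states_changed: bool,
}

fn should_save_tree_state(outcome: &SnapshotOutcome) -> bool {
    // Before the fix this was effectively just `outcome.tree_changed`, so
    // refreshed mtimes were dropped and every later snapshot re-read all
    // file contents from disk.
    outcome.tree_changed || outcome.file_states_changed
}
```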
The code for getting the current tree object was repeated a few times
over. I'm going to soon make it return a `MergedTree` and I don't want
to repeat that code (it's more complicated than the current code).
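Something along these lines (hypothetical helper and simplified types): one place that resolves the recorded tree id, so the return type can later become a `MergedTree` built from multiple tree ids.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins: a "tree" is just a map from path to value id here.
type TreeId = String;
type Tree = BTreeMap<String, String>;

struct Store {
    trees: BTreeMap<TreeId, Tree>,
}

struct TreeState {
    store: Store,
    tree_id: TreeId,
}

impl TreeState {
    /// The single place that looks up the current tree object, instead of
    /// repeating the lookup at every call site.
    fn current_tree(&self) -> Option<&Tree> {
        self.store.trees.get(&self.tree_id)
    }
}
```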
We now have all the pieces in place to read the current tree as a
`MergedTree` when snapshotting the working copy. For now, it's still
always a legacy tree. We'll need to update the working copy state file
to support storing multiple trees before we can create a `MergedTree`
with multiple sides here.
For tree-level conflicts, we're going to be getting
`Merge<Option<TreeValue>>` from the current tree and producing a new
such value if the contents change on disk. This commit gets us a little
closer to that by passing in a value of that type into
`write_path_to_store()`.
This seems to have a small but measurable performance
impact. Snapshotting the working copy in the git repo with all files
`touch`ed went from 2.36 s to 2.43 s (3%). I think that's okay,
especially since most files' mtimes rarely change, and we only pay the
price for files whose mtime has changed.
If the value at a path hasn't changed, there's no need to send it over
the channel and have the receiver add it to `TreeBuilder`. I couldn't
measure any performance impact.
Now we should no longer send `TreeValue::Conflict` variants over the
tree entry channel.
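A minimal sketch of the check with `std::sync::mpsc` (simplified item types; the real channel carries more):

```rust
use std::sync::mpsc;

type RepoPath = String;

#[derive(Clone, PartialEq)]
enum TreeValue {
    File { id: String, executable: bool },
}

/// Only forward an entry if it actually changed; unchanged values never
/// reach the receiver, so they never get added to the tree builder.
fn maybe_send_update(
    tx: &mpsc::Sender<(RepoPath, TreeValue)>,
    path: &RepoPath,
    current: Option<&TreeValue>,
    new: &TreeValue,
) {
    if current != Some(new) {
        tx.send((path.clone(), new.clone())).unwrap();
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let path: RepoPath = "foo.txt".to_string();
    let value = TreeValue::File { id: "same".to_string(), executable: false };
    // Unchanged value: nothing is sent over the channel.
    maybe_send_update(&tx, &path, Some(&value), &value);
    drop(tx);
    assert_eq!(rx.into_iter().count(), 0);
}
```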
When writing tree-level conflicts, we're going to be writing multiple
trees (maybe using some new `MergedTreeBuilder`), so we'll need the
full `Merge<Option<TreeValue>>` object. This gets us closer to that by
sending such objects over the channel and having the receiver write
the conflict object.
Note that we still sometimes send `TreeValue::Conflict` variants over
the channel. That only happens if they're unchanged.
When writing tree-level conflicts, we won't pass `TreeValue::Conflict`
over the `tree_entries` channel. Instead, we're going to pass possibly
unresolved `Merge<Option<TreeValue>>` instances. This commit prepares
for that by changing the type even though we'll only pass
`Merge::normal()` over the channel at this point.
I did this partly to see what the performance impact is. I tested that
by touching all files in the git.git repo to force the trees (and
files) to be rewritten. There was no measurable impact at all
(best-of-10 time was 2.44 s before and 2.40 s after, but I assume that
was a fluke).
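Roughly, the item type on the channel becomes a merge per path, even though only trivially-resolved values (`Merge::normal(...)`) flow through it at this point (stand-in types below):

```rust
use std::sync::mpsc;

// Simplified stand-ins for jj-lib types.
type RepoPath = String;

#[derive(Debug)]
enum TreeValue {
    File { id: String, executable: bool },
}

#[derive(Debug)]
struct Merge<T> {
    terms: Vec<T>,
}

impl<T> Merge<T> {
    /// A trivially-resolved merge with a single term.
    fn normal(value: T) -> Self {
        Merge { terms: vec![value] }
    }
}

fn main() {
    // The channel now carries possibly-unresolved merges per path, even
    // though for now every item sent is a resolved `Merge::normal(...)`.
    let (tx, rx) = mpsc::channel::<(RepoPath, Merge<Option<TreeValue>>)>();
    tx.send((
        "src/lib.rs".to_string(),
        Merge::normal(Some(TreeValue::File { id: "abc123".to_string(), executable: false })),
    ))
    .unwrap();
    drop(tx);
    for (path, merge) in rx {
        println!("{path}: {merge:?}");
    }
}
```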
Almost the entire method deals with `FileType::Normal`, so we can
reduce indentation and repeated matching on the file type by doing it
early and returning in the non-normal-file cases.
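In sketch form (hypothetical enum and return values), the match happens once at the top and the non-normal cases return early:

```rust
enum FileType {
    Normal { executable: bool },
    Symlink,
    GitSubmodule,
}

fn write_path(file_type: FileType) -> String {
    // Handle the non-normal cases up front and return early...
    let executable = match file_type {
        FileType::Symlink => return "wrote symlink value".to_string(),
        FileType::GitSubmodule => return "kept submodule value".to_string(),
        FileType::Normal { executable } => executable,
    };
    // ...so the rest of the body deals only with normal files, without
    // another level of nesting or a second match on the file type.
    format!("wrote normal file value (executable: {executable})")
}
```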
For tree-level conflicts, we're eventually not going to have
`ConflictId`. We'd want to make `write_conflict_to_store()` take a
`Merge<Option<TreeValue>>` and return an updated such value. That
would leave very little logic in the function, so let's just inline it
instead.
`update_from_content()` already writes file content for each term of
an unresolved merge, so it seems consistent for it to also write the
file content for resolved merges. I think this should simplify further
refactoring for tree-level conflicts and for preserving the executable
bit.
Since `update_from_contents()` only works with file contents and not
the executable bit or other kinds of paths, I think it makes more sense
for it to deal with `FileId`s instead of `TreeValue`s.
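Approximately this kind of signature (simplified stand-ins, not the exact jj-lib types): the function takes and returns merges of `FileId`s, since it only reads and writes file contents; the executable bit and non-file paths are handled by the caller.

```rust
// Simplified stand-ins.
type FileId = String;

#[derive(Clone)]
struct Merge<T> {
    terms: Vec<T>,
}

impl<T> Merge<T> {
    fn resolved(value: T) -> Self {
        Merge { terms: vec![value] }
    }
    fn is_resolved(&self) -> bool {
        self.terms.len() == 1
    }
}

/// Given the old file ids and the bytes now on disk, write new file content
/// and return the new ids.
fn update_from_content(
    old_ids: &Merge<Option<FileId>>,
    disk_content: &[u8],
) -> Merge<Option<FileId>> {
    if old_ids.is_resolved() {
        // Resolved merge: write one new file object and return its id.
        Merge::resolved(Some(write_file(disk_content)))
    } else {
        // Unresolved merge: the real code tries to parse conflict markers in
        // `disk_content` and writes one file object per term; elided here.
        old_ids.clone()
    }
}

fn write_file(content: &[u8]) -> FileId {
    // Stand-in for writing the content to the backend and getting an id back.
    format!("file-{}", content.len())
}
```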
I think I moved way too many functions onto `Merge<Option<TreeValue>>`
in 82883e648da4. This effectively reverts almost all of that
commit. The `Merge<T>` type is a simple container and it seems like it
should be at a fairly low level in the dependency graph. By moving
functions off of it, we can get rid of the back-dependencies from the
`merge` module to the `conflict` module that I introduced when I moved
`Merge` to the `merge` module. I'm thinking the `conflict` module can
focus on materialized conflicts.
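The resulting dependency direction, sketched (module and function names here are illustrative): `merge` stays a plain generic container, and tree-specific helpers live as free functions in the module that knows about `TreeValue`.

```rust
// The `merge` module only knows about the generic container.
mod merge {
    pub struct Merge<T> {
        pub terms: Vec<T>,
    }
}

// Tree-specific helpers live elsewhere, so the dependency points from here
// to `merge` and never the other way around.
mod conflict {
    use super::merge::Merge;

    pub enum TreeValue {
        File { id: String, executable: bool },
    }

    /// Example of a helper that used to be a method on
    /// `Merge<Option<TreeValue>>` and is now a free function.
    pub fn is_resolved_file(merge: &Merge<Option<TreeValue>>) -> bool {
        merge.terms.len() == 1
            && matches!(merge.terms.first(), Some(Some(TreeValue::File { .. })))
    }
}

fn main() {
    let merge = merge::Merge {
        terms: vec![Some(conflict::TreeValue::File { id: "x".to_string(), executable: false })],
    };
    println!("{}", conflict::is_resolved_file(&merge));
}
```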
Perhaps the most important invariant in `.jj/working_copy/tree_state`
is that the set of files in it matches the files in its tree. In
particular, if a file that exists in the tree doesn't exist in the
file state and doesn't exist on disk either, we won't notice that it's
gone, and we will therefore not delete it from the tree on future
rounds of snapshotting either.
Now that we process the outputs from the file system traversal by
reading from channels, we can separate the processing from the file
system traversal. When the working copy is unchanged, processing tree
entries and deleted files takes practically no time, but processing
file states and present files takes significant time.
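A minimal sketch of the structure using std threads and channels (the real code uses different item types and Rayon for the walk itself): the traversal produces its outputs onto channels, and dedicated threads drain them while the walk is still running.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // One channel per kind of output from the traversal (illustrative types).
    let (tree_tx, tree_rx) = mpsc::channel::<(String, String)>(); // (path, value id)
    let (state_tx, state_rx) = mpsc::channel::<(String, u64)>(); // (path, mtime)

    // Consumers run concurrently with the traversal; each drains one channel.
    let tree_thread = thread::spawn(move || tree_rx.into_iter().count());
    let state_thread = thread::spawn(move || state_rx.into_iter().count());

    // Stand-in for the file system walk producing both kinds of output.
    for i in 0..1000u64 {
        let path = format!("file{i}");
        state_tx.send((path.clone(), 1_700_000_000 + i)).unwrap();
        if i % 10 == 0 {
            tree_tx.send((path, format!("blob{i}"))).unwrap();
        }
    }
    drop(tree_tx);
    drop(state_tx);

    println!(
        "updated {} tree entries, {} file states",
        tree_thread.join().unwrap(),
        state_thread.join().unwrap()
    );
}
```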
This improves `jj status` time by a factor of ~2x on my machine (M1 MacBook Pro 2021 16-inch, with an SSD):
```sh
$ hyperfine --parameter-list hash before,after --parameter-list repo nixpkgs,gecko-dev --setup 'git checkout {hash} && cargo build --profile release-with-debug' --warmup 3 './target/release-with-debug/jj -R ../{repo} st'
Benchmark 1: ./target/release-with-debug/jj -R ../nixpkgs st (hash = before)
Time (mean ± σ): 1.640 s ± 0.019 s [User: 0.580 s, System: 1.044 s]
Range (min … max): 1.621 s … 1.673 s 10 runs
Benchmark 2: ./target/release-with-debug/jj -R ../nixpkgs st (hash = after)
Time (mean ± σ): 760.0 ms ± 5.4 ms [User: 812.9 ms, System: 2214.6 ms]
Range (min … max): 751.4 ms … 768.7 ms 10 runs
Benchmark 3: ./target/release-with-debug/jj -R ../gecko-dev st (hash = before)
Time (mean ± σ): 11.403 s ± 0.648 s [User: 4.546 s, System: 5.932 s]
Range (min … max): 10.553 s … 12.718 s 10 runs
Benchmark 4: ./target/release-with-debug/jj -R ../gecko-dev st (hash = after)
Time (mean ± σ): 5.974 s ± 0.028 s [User: 5.387 s, System: 11.959 s]
Range (min … max): 5.937 s … 6.024 s 10 runs
$ hyperfine --parameter-list repo nixpkgs,gecko-dev --warmup 3 'git -C ../{repo} status'
Benchmark 1: git -C ../nixpkgs status
Time (mean ± σ): 865.4 ms ± 8.4 ms [User: 119.4 ms, System: 1401.2 ms]
Range (min … max): 852.8 ms … 879.1 ms 10 runs
Benchmark 2: git -C ../gecko-dev status
Time (mean ± σ): 2.892 s ± 0.029 s [User: 0.458 s, System: 14.244 s]
Range (min … max): 2.837 s … 2.934 s 10 runs
```
Conclusions:
- ~2x improvement over the previous `jj status` time.
- Slightly faster than Git on nixpkgs.
- Still 2x slower than Git on gecko-dev, not sure why.
For reference, Git's default number of threads is defined in the `online_cpus` function: ee48e70a82/thread-utils.c (L21-L66). We are using whatever the Rayon default is.
In preparation for traversing the filesystem in parallel, send updates via a channel.
An alternative is to modify shared mutable state, e.g. put `self.file_states` behind a mutex or use a concurrent hash-map. This risks leaving the `TreeState` in an invalid state if an error occurs, and makes invariants harder to reason about.
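A minimal sketch of the channel-based shape (simplified types): the traversal only sends updates, and a single owner applies them to `file_states`, so a failed traversal can never leave the map half-mutated.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;

// Simplified stand-in for a file state entry.
#[derive(Debug)]
struct FileState {
    mtime: u64,
    size: u64,
}

fn main() {
    let (tx, rx) = mpsc::channel::<(String, FileState)>();

    // Producer side: in the real code, the file system walk sends updates
    // instead of locking a shared map.
    tx.send(("README.md".to_string(), FileState { mtime: 1_700_000_000, size: 1024 })).unwrap();
    drop(tx);

    // Consumer side: the only place that mutates the map.
    let mut file_states = BTreeMap::new();
    for (path, state) in rx {
        file_states.insert(path, state);
    }
    println!("{file_states:?}");
}
```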
Using a channel introduces a small performance regression. (I didn't try out the concurrent hash-map approach.)
```sh
$ hyperfine --parameter-list hash before,after --setup 'git checkout {hash} && cargo build --profile release-with-debug' --warmup 3 './target/release-with-debug/jj -R ../nixpkgs st'
Benchmark 1: ./target/release-with-debug/jj -R ../nixpkgs st (hash = before)
Time (mean ± σ): 1.533 s ± 0.013 s [User: 0.587 s, System: 0.926 s]
Range (min … max): 1.510 s … 1.559 s 10 runs
Benchmark 2: ./target/release-with-debug/jj -R ../nixpkgs st (hash = after)
Time (mean ± σ): 1.563 s ± 0.021 s [User: 0.607 s, System: 0.936 s]
Range (min … max): 1.518 s … 1.595 s 10 runs
Summary
./target/release-with-debug/jj -R ../nixpkgs st (hash = before) ran
1.02 ± 0.02 times faster than ./target/release-with-debug/jj -R ../nixpkgs st (hash = after)
```
`.gitignore` files in ignored directories should be ignored. Before this
commit, we would visit ignored directories like any others if there
were any ignored paths in them.
I've done a lot of preparation for this commit, but there's still a
bit of duplication between the new code and the existing code. I don't
mind improving it if anyone has suggestions. Otherwise I might end up
doing that when I get back to working on snapshotting tree-level
conflicts soon.
This fixes #1785.
The `sub_path` is created by joining `dir` to a basename. I think
calling it just `path` is clear, especially since it's the main path
involved in each iteration of the loop.