cybercyst/jj

mirror of https://github.com/martinvonz/jj.git synced 2025-05-17 05:04:27 +00:00

Author	SHA1	Message	Date
Yuya Nishihara	7ddced7f3f	git: scan new commits all at once from multiple heads The visiting order is DFS from heads sorted in lexicographical order, but I plan to change it to chronological order.	2023-08-14 07:48:55 +09:00
Yuya Nishihara	73a4b7f5bf	repo: extract add_heads() that can import commits from multiple heads This allows us to reorder commits to be indexed in bulk. The incremental update optimization is applied only for a single head. This could be tried for multiple heads, but it's unlikely that every head has a single new commit for each.	2023-08-14 07:48:55 +09:00
Yuya Nishihara	157a0e748b	git: add separate step to apply HEAD@git change I'm going to extract a step to import new commits all at once.	2023-08-14 07:48:55 +09:00
Yuya Nishihara	359c871545	git: remove redundant id.clone() from diff_refs_to_import()	2023-08-14 07:48:55 +09:00
Martin von Zweigbergk	e414f3b73c	cleanup: use `fs::read()` instead of `File::open().read_to_end()`	2023-08-13 14:04:59 +00:00
Martin von Zweigbergk	0b3b62a777	conflicts: remove redundant `num_removes` argument from `parse_conflict()` Merges always have exactly one more "adds" than "removes" these days.	2023-08-13 09:54:16 +00:00
Yuya Nishihara	72271c0d1f	repo: micro-optimize add_head() to not instantiate indexed commit object	2023-08-13 18:52:17 +09:00
Yuya Nishihara	15fb8b95b0	index: rewrite topological sort by leveraging dag_walk function This is similar to what mut_repo.add_head() does. I'm going to adjust the visiting order so the bulk-imported history preserves chronological order. It might be a small adjustment on the current DFS approach, or new function based on Kahn's algorithm. Either way, it's important that both "jj git import" and "jj debug reindex" use the same underlying function.	2023-08-13 18:52:17 +09:00
Yuya Nishihara	8652bae925	index: add tracing output to "jj debug reindex" path	2023-08-13 18:52:17 +09:00
Martin von Zweigbergk	f9e0feaaf8	working_copy: return early from `write_path_to_store()` for non-files Almost the entire method deals with `FileType::Normal`, so we can reduce indentation and repeated matching on the file type by doing it early and returning in the non-normal-file cases.	2023-08-13 01:00:31 +00:00
Martin von Zweigbergk	23f54b8151	working_copy: propagate errors when reading conflicted file	2023-08-13 01:00:31 +00:00
Martin von Zweigbergk	33a93b6d2d	working_copy: reduce scope of a `content` variable This also avoids reading non-file conflict from disk.	2023-08-13 01:00:31 +00:00
Martin von Zweigbergk	585c212617	working_copy: reduce scope of an `executable` variable	2023-08-13 01:00:31 +00:00
Martin von Zweigbergk	2102de94b0	working_copy: inline `write_conflict_to_store()` For tree-level conflicts, we're eventually not going to have `ConflictId`. We'd want to make `write_conflict_to_store()` take a `Merge<Option<TreeValue>>` and return an updated such value. That would leave very little logic in the function, so let's just inline it instead.	2023-08-13 01:00:31 +00:00
Martin von Zweigbergk	4c46398b1c	conflicts: make `update_from_content()` write resolved content to store `update_from_content()` already writes file content for each term of an unresolved merge, so it seems consistent for it to also write the file content for resolved merges. I think this should simplify further refactoring for tree-level conflicts and for preserving the executable bit.	2023-08-11 23:59:44 +00:00
Martin von Zweigbergk	0b85f06e3d	conflicts: make `update_from_content()` work with only `FileId`s Since `update_from_content()` only works with file contents and not the executable bit or other kinds of paths, I think it makes more sense for it to deal with `FileId`s instead of `TreeValue`s.	2023-08-11 23:59:44 +00:00
Martin von Zweigbergk	adf9679d4c	tree: inline `simplify_conflict()` The function is just a few lines now. I don't think we need the long documentation in it either since that's now in docs/technical/conflicts.md.	2023-08-11 21:11:25 +00:00
Martin von Zweigbergk	d4e755b4e4	merged_tree: rename some symbols away from "conflict" There were still many instances of `conflict` left from before we renamed `Conflict<T>` to `Merge<T>`. I decided to rename many of them based on the type parameter instead of the container. I think that made it more readable in many cases.	2023-08-11 21:11:25 +00:00
Martin von Zweigbergk	a995c66635	merge: move some methods back to `conflicts` as free functions I think I moved way too many functions onto `Merge<Option<TreeValue>>` in 82883e648da4. This effectively reverts almost all of that commit. The `Merge<T>` type is a simple container and it seems like it should be at a fairly low level in the dependency graph. By moving functions off of it, we can get rid of the back-dependencies from the `merge` module to the `conflict` module that I introduced when I moved `Merge` to the `merge` module. I'm thinking the `conflict` module can focus on materialized conflicts.	2023-08-11 21:11:25 +00:00
Yuya Nishihara	925d54614d	revset: remove round-trip conversion from heads() evaluation This wouldn't matter much in practice, but I think it's better to stick to low-level index primitives during revset evaluation.	2023-08-12 02:16:29 +09:00
Martin von Zweigbergk	d1dbe6de98	git: propagate errors for missing commits when importing refs	2023-08-11 05:06:36 +00:00
Martin von Zweigbergk	abc7312dbc	working_copy: avoid an unused variable on Windows	2023-08-11 01:14:52 +00:00
Martin von Zweigbergk	0570963fe3	merge: add a `Merge::into_resolved()` to avoid cloning I don't know if this has any measurable impact. It just seems like we should be able to take a resolved value out of a `Merge` without cloning.	2023-08-09 21:58:15 +00:00
Martin von Zweigbergk	f7160cf936	merge: add `absent()` and `normal()` to `Merge<Option<T>>` These mimic the `RefTarget` functions. They're very useful in `MergedTree`. I might copy over other helpers from `RefTarget` later.	2023-08-09 21:58:15 +00:00
Yuya Nishihara	e7e49527ef	git: ensure that remote branches never diverge I was considering how refs would be imported if we had a per-remote view of named branches (and tags): Each remote has a view, and jj remembers the last known view state to compute diffs. That's the same for the pseudo "git" remote. Under the current storage, these view states are represented as follows: git_refs["refs/heads/{name}"] # pseudo "git" remote branches git_refs["refs/tags/{name}"] # pseudo "git" remote tags git_refs["refs/remotes/{remote}/{name}"] # real remote branches and the diffs are merged in to branches[name].local_target and tags[name]. We also have branches[name].remote_targets[remote], but I think it's redundant because a tracking branch should also be the last known state, not something that can diverge from the actual state. To make that clear, this commit replaces the use of the "merge" API.	2023-08-09 15:22:45 +09:00
Martin von Zweigbergk	1d2324ae5c	git: refactor SSH key callbacks to allow multiple keys This is to prepare for adding support for checking other keys than just id_rsa.	2023-08-09 03:44:03 +00:00
Benjamin Saunders	75636d626f	local_backend: don't reference uninitialized memory	2023-08-08 13:08:26 -07:00
Martin von Zweigbergk	c752b43db1	git: only try to use ssh-agent once per connection As reported in #1970, SSH authentication would sometimes run into a loop where it repeatedly tries to use ssh-agent for authentication without making progress. The problem can be reproduced by simply removing `$SSH_AUTH_KEY` from your environment (and not having a Git credentials helper configured, I think). This seems to be a bug introduced by b104f8e154c21. That commit meant to make it so we attempt to use ssh-agent and fall back to using (password-less) keys after that. The problem is that `git2::Cred::ssh_key_from_agent()` just returns an object that will be used later for looking up the credentials from ssh-agent, so the call will not fail because ssh-agent is not reachable. This commit attempts to fix the problem by having the credentials callback attempt to use ssh-agent only once.	2023-08-08 07:41:13 +00:00
Ilya Grigoriev	74d9970908	config: Rename `push.branch-prefix` option to `git.push-branch-prefix` This is for consistency with other `git.` options. See also https://github.com/martinvonz/jj/pull/1962#discussion_r1282605185	2023-08-07 19:10:10 -07:00
Yuya Nishihara	2619200657	refs: rename RefTarget::as_conflict() to as_merge() Follows up ecc030848dff. It's also nice that we have more distinction between has_conflict() and as_merge().	2023-08-07 08:05:57 +09:00
Martin von Zweigbergk	b9b285c985	conflicts: move `Merge` tests to `merge` module I missed the tests when I moved the type.	2023-08-06 23:05:21 +00:00
Martin von Zweigbergk	af2dba1c8f	merge: move `tests` module to end of file I used IntelliJ to move the `Merge` type from the `conflict` module and didn't notice until now that it put the moved items after the tests.	2023-08-06 23:05:21 +00:00
Martin von Zweigbergk	14ddd17673	working_copy: add debug assertion that tree and file states match Perhaps the most important invariant in `.jj/working_copy/tree_state` is that the set of files in it matches the files in its tree. In particular, if a file that exists in the tree doesn't exist in the file state and doesn't exist on disk either, we won't notice that it's gone, and we will therefore not delete it from the tree on future rounds of snapshotting either.	2023-08-06 22:17:18 +00:00
Martin von Zweigbergk	6cce5e758b	working_copy: reduce scope of some variables With the recent refactorings, we don't need the `tree_builder` and `deleted_files` until a bit later.	2023-08-06 22:17:18 +00:00
Martin von Zweigbergk	16d00581f6	working_copy: add trace scope to tree-writing call Writing the tree can probably take a bit of time when the working copy has changed.	2023-08-06 22:17:18 +00:00
Martin von Zweigbergk	d06f51a88c	working_copy: split up tracing scope a bit Now that we process the outputs from the file system traversal by reading from channels, we can separate the processing from the file system traversal. When the working copy is unchanged, processing tree entries and deleted files takes practically no time, but processing file states and present files takes significant time.	2023-08-06 22:17:18 +00:00
Martin von Zweigbergk	b27b686b4e	working_copy: rename `deleted_files_tx` to `present_files_tx` We use the channel to report the files that exist, so `deleted_files_tx` seems confusing.	2023-08-06 22:17:18 +00:00
Martin von Zweigbergk	ef5f97f8d7	conflicts: move `Merge<T>` to `merge` module The `merge` module now seems like the obvious place for this type.	2023-08-06 22:08:09 +00:00
Martin von Zweigbergk	ecc030848d	conflicts: rename `Conflict<T>` to `Merge<T>` Since `Conflict<T>` can also represent a non-conflict state (a single term), `Merge<T>` seems like a better name. Thanks to @ilyagr for the suggestion in https://github.com/martinvonz/jj/pull/1774#discussion_r1257547709 Sorry about the churn. It would have been better if I had thought of this name before I introduced `Conflict<T>`.	2023-08-06 22:08:09 +00:00
Yuya Nishihara	c8f7a5f73f	git: on import_refs(), filter out uninteresting refs earlier	2023-08-06 14:47:20 +09:00
Yuya Nishihara	b3ee8a0b3e	git: extract immutable part of import_refs() to separate function	2023-08-06 14:47:20 +09:00
Yuya Nishihara	a2b8d1cc3a	git: calculate refs to be imported first, then apply in later pass This allows us to use mut_repo.view() reference during diff computation.	2023-08-06 14:47:20 +09:00
Yuya Nishihara	feaddf6e51	git: on import_refs(), use post-mutation view to collect heads to be pinned This is simpler than carefully tracking mutation through old/new git refs and merged local branches. There are two subtle behavior changes: a. unimported git refs excluded by git_ref_filter() are not pinned. b. unexported branches are pinned (so fetched deletion doesn't abandon the branch if it's referenced by another branch.) I think (a) is okay (and even more correct) since such refs aren't known to jj yet. (b) is desired.	2023-08-06 14:47:20 +09:00
Kevin Liao	e00cb0fe08	Update init_with_factories to initialize a workspace with a workspace_id other than "default" This change allows a custom jj binary to initialize a workspace with a workspace_id other than "default".	2023-08-04 01:26:26 -07:00
Yuya Nishihara	dd5cc843da	revset_graph: remove unneeded Vec<IndexGraphEdge> cloning	2023-08-04 06:19:22 +09:00
Yuya Nishihara	8dc59a3d69	revset_graph: discard cache of edges that won't be accessed anymore This appears to be a bit slower (1.170s -> 1.211s with "log -R git -r 'tags()' -Tcommit_id --ignore-working-copy"), but seemed better than keeping growing cache.	2023-08-04 06:19:22 +09:00
Waleed Khan	e1c194ce67	working_copy: rename `WorkItem` -> `DirectoryToVisit`	2023-08-03 19:09:59 +00:00
Waleed Khan	84f807d222	working_copy: traverse filesystem in parallel This improves `jj status` time by a factor of ~2x on my machine (M1 Macbook Pro 2021 16-inch, uses an SSD): ```sh $ hyperfine --parameter-list hash before,after --parameter-list repo nixpkgs,gecko-dev --setup 'git checkout {hash} && cargo build --profile release-with-debug' --warmup 3 './target/release-with-debug/jj -R ../{repo} st' Benchmark 1: ./target/release-with-debug/jj -R ../nixpkgs st (hash = before) Time (mean ± σ): 1.640 s ± 0.019 s [User: 0.580 s, System: 1.044 s] Range (min … max): 1.621 s … 1.673 s 10 runs Benchmark 2: ./target/release-with-debug/jj -R ../nixpkgs st (hash = after) Time (mean ± σ): 760.0 ms ± 5.4 ms [User: 812.9 ms, System: 2214.6 ms] Range (min … max): 751.4 ms … 768.7 ms 10 runs Benchmark 3: ./target/release-with-debug/jj -R ../gecko-dev st (hash = before) Time (mean ± σ): 11.403 s ± 0.648 s [User: 4.546 s, System: 5.932 s] Range (min … max): 10.553 s … 12.718 s 10 runs Benchmark 4: ./target/release-with-debug/jj -R ../gecko-dev st (hash = after) Time (mean ± σ): 5.974 s ± 0.028 s [User: 5.387 s, System: 11.959 s] Range (min … max): 5.937 s … 6.024 s 10 runs $ hyperfine --parameter-list repo nixpkgs,gecko-dev --warmup 3 'git -C ../{repo} status' Benchmark 1: git -C ../nixpkgs status Time (mean ± σ): 865.4 ms ± 8.4 ms [User: 119.4 ms, System: 1401.2 ms] Range (min … max): 852.8 ms … 879.1 ms 10 runs Benchmark 2: git -C ../gecko-dev status Time (mean ± σ): 2.892 s ± 0.029 s [User: 0.458 s, System: 14.244 s] Range (min … max): 2.837 s … 2.934 s 10 runs ``` Conclusions: - ~2x improvement from previous `jj status` time. - Slightly faster than Git on nixpkgs. - Still 2x slower than Git on gecko-dev, not sure why. For reference, Git's default number of threads is defined in the `online_cpus` function: `ee48e70a82/thread-utils.c (L21-L66)`. We are using whatever the Rayon default is.	2023-08-03 18:20:49 +00:00
Waleed Khan	326be7c91e	working_copy: send updates via `channel` In preparation of traversing the filesystem in parallel, send updates via `channel`. An alternative is to modify shared mutable state, e.g. put `self.file_states` behind a mutex or use a concurrent hash-map. This risks leaving the `TreeState` in an invalid state if an error occurs, and makes invariants harder to reason about. Using a channel introduces a small performance regression. (I didn't try out the concurrent hash-map approach.) ```sh $ hyperfine --parameter-list hash before,after --setup 'git checkout {hash} && cargo build --profile release-with-debug' --warmup 3 './target/release-with-debug/jj -R ../nixpkgs st' Benchmark 1: ./target/release-with-debug/jj -R ../nixpkgs st (hash = before) Time (mean ± σ): 1.533 s ± 0.013 s [User: 0.587 s, System: 0.926 s] Range (min … max): 1.510 s … 1.559 s 10 runs Benchmark 2: ./target/release-with-debug/jj -R ../nixpkgs st (hash = after) Time (mean ± σ): 1.563 s ± 0.021 s [User: 0.607 s, System: 0.936 s] Range (min … max): 1.518 s … 1.595 s 10 runs Summary ./target/release-with-debug/jj -R ../nixpkgs st (hash = before) ran 1.02 ± 0.02 times faster than ./target/release-with-debug/jj -R ../nixpkgs st (hash = after) ```	2023-08-03 17:56:05 +00:00
Waleed Khan	174704d752	working_copy: extract `visit_directory` function for snapshotting	2023-08-03 17:40:18 +00:00
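
Several of the commits above refer to a `Merge<T>` container that always holds exactly one more "add" than "removes", can represent a non-conflict state as a single term, and gained `into_resolved()`, `absent()`, and `normal()` helpers. The following is a minimal sketch of that idea, not the actual jj implementation: the field names, the `new()` and `resolved()` constructors, the `is_resolved()` method, and the `Result` return type of `into_resolved()` are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for the `Merge<T>` container mentioned in the log.
/// Field layout and most method signatures are assumptions, not jj's API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Merge<T> {
    removes: Vec<T>,
    adds: Vec<T>,
}

impl<T> Merge<T> {
    /// Builds a merge and enforces the invariant from the log:
    /// there is always exactly one more "add" than "removes".
    pub fn new(removes: Vec<T>, adds: Vec<T>) -> Self {
        assert_eq!(
            adds.len(),
            removes.len() + 1,
            "a merge needs exactly one more add than removes"
        );
        Merge { removes, adds }
    }

    /// A non-conflict state: a single term and nothing removed.
    pub fn resolved(value: T) -> Self {
        Merge { removes: Vec::new(), adds: vec![value] }
    }

    /// True when the merge holds a single term (no conflict).
    pub fn is_resolved(&self) -> bool {
        self.removes.is_empty()
    }

    /// Moves the resolved value out without cloning.
    /// Hands the merge back unchanged if it is still conflicted.
    pub fn into_resolved(mut self) -> Result<T, Merge<T>> {
        if self.removes.is_empty() {
            // The invariant guarantees exactly one add here.
            Ok(self.adds.pop().expect("resolved merge has one add"))
        } else {
            Err(self)
        }
    }
}

impl<T> Merge<Option<T>> {
    /// Resolved merge representing "no value at this path".
    pub fn absent() -> Self {
        Self::resolved(None)
    }

    /// Resolved merge representing a present value.
    pub fn normal(value: T) -> Self {
        Self::resolved(Some(value))
    }
}
```

Under these assumptions, `Merge::normal(5).into_resolved()` yields the wrapped value by move, while a three-term merge such as `Merge::new(vec!["base"], vec!["left", "right"])` reports `is_resolved() == false` and is returned intact from `into_resolved()`.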


2836 Commits