89 Commits

Author SHA1 Message Date
Ilya Grigoriev
12ee2b18cd Git backend: Allow simultaneous rebasing of duplicate commits
Fixes https://github.com/martinvonz/jj/issues/27
Fixes https://github.com/martinvonz/jj/issues/694
2023-01-17 21:17:27 -08:00
Yuya Nishihara
ca2e9fe6d1 git: simply use rand::random() to generate ref preventing gc
We don't care about the ref content as long as it is unique, so using
the threaded RNG should be fine.

This change means refs/jj/keep will now contain refs of the following
forms:

 - new create_no_gc_ref(): 0f8d6cd9721823906cfb55dac99d7bf5
 - old create_no_gc_ref(): 0f6d93fe-0507-4db8-ad0a-6317f02e27b9
 - prevent_gc(commit_id):  0f9c15100b6f1373f38186357e274a829fb6c4e2
2023-01-14 23:48:02 +09:00
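The three name shapes listed in ca2e9fe6d1 differ only in how the unique part is produced. Below is a minimal sketch of the new form, assuming the ref name is simply a 128-bit random value rendered as 32 hex digits. The helper name `no_gc_ref_name` is hypothetical, and the random value is passed in as a parameter rather than drawn from `rand::random()` so that the sketch needs no external crate.

```rust
/// Hypothetical helper: renders a 128-bit value as a zero-padded,
/// 32-character lowercase hex name under refs/jj/keep/.
/// In the commit the value would come from `rand::random()`; here it is
/// an argument so the example stays dependency-free.
fn no_gc_ref_name(random_bits: u128) -> String {
    format!("refs/jj/keep/{random_bits:032x}")
}
```

Passing `rand::random::<u128>()` as the argument would presumably give names shaped like the first example in the list above.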
Waleed Khan
af55d17a25 git_backend: propagate various errors
I needed this in the course of debugging an error. Before this commit, the error looked like this:

```
Error: Unexpected error from backend: Object not found
```

After this commit, it looks like this:

```
Error: Unexpected error from backend: Object with CommitId 8f59646bc9bb6bb44b5624f1248f4a708f37003c not found: object not found - no match for id (8f59646bc9bb6bb44b5624f1248f4a708f37003c); class=Odb (9); code=NotFound (-3)
```
2023-01-02 12:28:51 -06:00
Waleed Khan
456be4cc73 backend: create BackendError::InvalidHashLength
Strictly speaking, we could rely on e.g. `git2::Oid::from_str` to produce an error, but I figure that having an explicit error for a mismatching hash length might demystify some error condition in the future, since commit IDs and change IDs and potentially other backends' IDs may have different lengths, so this could flag a mismatch earlier/more obviously.
2023-01-02 12:28:51 -06:00
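To illustrate the idea behind 456be4cc73, here is a hedged sketch of an explicit length check that fails early with a dedicated error. The variant's fields, the message wording, and the `check_hash_length` helper are assumptions made for illustration and are not taken from the actual code.

```rust
use std::fmt;

/// Hypothetical, trimmed-down error type; the real `BackendError`
/// has other variants and may carry different fields.
#[derive(Debug, PartialEq)]
enum BackendError {
    InvalidHashLength {
        expected: usize,
        actual: usize,
        hash: String,
    },
}

impl fmt::Display for BackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BackendError::InvalidHashLength { expected, actual, hash } => write!(
                f,
                "Invalid hash length (expected {expected} bytes, got {actual} bytes): {hash}"
            ),
        }
    }
}

/// Checks the ID length up front instead of relying on a later lookup
/// or parse step to fail with a less specific error.
fn check_hash_length(id: &[u8], expected: usize) -> Result<(), BackendError> {
    if id.len() == expected {
        return Ok(());
    }
    // Render the offending ID as hex so the message shows what was passed.
    let hash: String = id.iter().map(|b| format!("{b:02x}")).collect();
    Err(BackendError::InvalidHashLength {
        expected,
        actual: id.len(),
        hash,
    })
}
```

A caller expecting 20-byte commit IDs would call `check_hash_length(id, 20)` before handing the ID to the underlying store.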
Waleed Khan
7f8a196ab2 backend: create ObjectId trait
This lets us operate over various kinds of objects polymorphically (e.g. call `.hex()` on any kind of object hash).
2023-01-02 12:28:51 -06:00
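The polymorphism described in 7f8a196ab2 can be sketched roughly as follows. Only the trait name `ObjectId` and the `.hex()` method come from the commit message; the `as_bytes()` method, the default implementation, and the two stand-in ID types are assumptions.

```rust
/// Sketch of a trait shared by all ID types.
trait ObjectId {
    /// Raw bytes of the ID (assumed accessor).
    fn as_bytes(&self) -> &[u8];

    /// Lowercase hex rendering, available on every implementor.
    fn hex(&self) -> String {
        self.as_bytes().iter().map(|b| format!("{b:02x}")).collect()
    }
}

// Stand-in ID types; the real ones live in the `backend` module.
struct CommitId(Vec<u8>);
struct ChangeId(Vec<u8>);

impl ObjectId for CommitId {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

impl ObjectId for ChangeId {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

/// Works for any kind of ID without knowing the concrete type.
fn describe(kind: &str, id: &dyn ObjectId) -> String {
    format!("{kind} {}", id.hex())
}
```

With this shape, error-reporting code can accept `&dyn ObjectId` and print any hash the same way.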
Benjamin Saunders
aaa175eca7 lib: replace protobuf crate with prost
2022-12-22 07:04:35 -08:00
Luke Granger-Brown
90ba55bd7b git: cache the extra metadata table
Performance on repositories with many commits is limited somewhat by repeatedly
calling stat() on the tablestore directory to work out what the head is. By
caching the table rather than looking it up from disk on every request, we can
satisfy requests much more rapidly.

This avoids the pathological case in #845 where jj operations take several
minutes to complete.

This patch doesn't change the normal flow of the write path: that will still
always call get_head() on the underlying TableStore, which will stat the
directory before writing out changes. It will however empty the cache when the
metadata has been written.

Fixes #845.
2022-12-17 08:19:14 +00:00
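The caching behaviour described in 90ba55bd7b (reads reuse a cached table, writes empty the cache) can be sketched like this. All names here (`Table`, `CachedTable`, `load_head`, `invalidate`) are hypothetical stand-ins, and the closure replaces the real disk lookup.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for the extra-metadata table; the real type is more involved.
struct Table {
    generation: u64,
}

/// Hypothetical wrapper that remembers the last table it loaded.
struct CachedTable<F: Fn() -> Table> {
    load_head: F, // the expensive lookup that would touch the disk
    cached: Mutex<Option<Arc<Table>>>,
}

impl<F: Fn() -> Table> CachedTable<F> {
    fn new(load_head: F) -> Self {
        CachedTable {
            load_head,
            cached: Mutex::new(None),
        }
    }

    /// Read path: reuse the cached table if there is one, else load it once.
    fn read(&self) -> Arc<Table> {
        let mut guard = self.cached.lock().unwrap();
        if let Some(table) = guard.as_ref() {
            return Arc::clone(table);
        }
        let table = Arc::new((self.load_head)());
        *guard = Some(Arc::clone(&table));
        table
    }

    /// Called after metadata has been written, so the next read reloads.
    fn invalidate(&self) {
        *self.cached.lock().unwrap() = None;
    }
}
```

The design choice mirrored here is that only the read path is short-circuited; a writer would still consult the underlying store and then call `invalidate()`.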
Martin von Zweigbergk
7f9a0a2820 cleanup: let new Clippy move variables into format strings
I ran an upgraded Clippy on the codebase. All the changes seem to be
about using variables directly in format strings instead of passing
them as separate arguments.
2022-12-14 21:30:58 -08:00
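The kind of rewrite described in 7f9a0a2820 looks like the following. The function names and the message text are made up for illustration; both forms produce the same string.

```rust
/// Older style: variables passed as separate positional arguments.
fn describe_positional(name: &str, count: usize) -> String {
    format!("{} has {} commits", name, count)
}

/// Newer style: variables captured directly inside the format string.
fn describe_inline(name: &str, count: usize) -> String {
    format!("{name} has {count} commits")
}
```

Inline capture only works for plain identifiers, so expressions such as `commit.id()` still need to be passed as arguments or bound to a local first.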
Martin von Zweigbergk
be383cebc7 git: on import, add GC-preventing refs to all seen refs
To prevent git's GC from breaking a repo, we already add a git ref to
commits we create in the git backend. However, we don't add refs to
commits we import from git. This fixes that.

Closes #815.
2022-12-03 22:50:26 -08:00
Martin von Zweigbergk
d8feed9be4 copyright: change from "Google LLC" to "The Jujutsu Authors"
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).

Google employees can read about Google's policy at
go/releasing/contributions#copyright.
2022-11-28 06:05:45 -10:00
Martin von Zweigbergk
780d7fb59c backend: rename NormalFile to just File
There are no "non-normal" files, so "normal" is not needed. We have
symlinks and conflicts, but they are not files, so I think just "file"
is unambiguous.

I left `testutils::write_normal_file()` because there it's used to
mean "not executable file" (there's also a `write_executable_file()`).

I left `working_copy::FileType::Normal` since renaming `Normal` there
to `File` would also suggest we should rename `FileType`, and I don't
know what would be a better name for that type.
2022-11-14 23:36:43 -08:00
Martin von Zweigbergk
3c7c4e9f5c tests: move testutils module into separate crate
The `testutils` module should ideally not be part of the library
dependencies. Since it's used by the integration tests (and the CLI
tests), we need to move it to a separate crate to achieve that.
2022-11-08 07:29:35 -08:00
Martin von Zweigbergk
6703810c6e backend: remove Commit::is_open field from data model
2022-11-05 06:14:37 -07:00
Ilya Grigoriev
2b8dabaae4 Fixes suggested by new version of Clippy
2022-11-03 21:38:16 -07:00
Martin von Zweigbergk
3b3f6129e6 backend: allow negative timestamps in commits and operations
I was reading a draft of "Git Rev News: Edition 91" [1] where Peff
mentions some unfinished patches to allow negative timestamps in
Git. So I figured I should add support for that before I forget. I
haven't checked if libgit2 supports it, so it might be that our Git
backend still doesn't support it after this patch.

 [1] https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-91.md
2022-09-30 00:50:17 -07:00
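As a rough illustration of what 3b3f6129e6 enables, the sketch below stores a timestamp as signed milliseconds relative to the Unix epoch and converts it to a `SystemTime`. The type name `MillisSinceEpoch` and the conversion helper are assumptions, not the actual data model.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Signed milliseconds relative to the Unix epoch (a stand-in type).
/// Negative values denote instants before 1970-01-01T00:00:00Z.
struct MillisSinceEpoch(i64);

/// Converts to `SystemTime`, returning `None` if the platform cannot
/// represent the resulting instant.
fn to_system_time(ts: &MillisSinceEpoch) -> Option<SystemTime> {
    let magnitude = Duration::from_millis(ts.0.unsigned_abs());
    if ts.0 >= 0 {
        UNIX_EPOCH.checked_add(magnitude)
    } else {
        UNIX_EPOCH.checked_sub(magnitude)
    }
}
```

Using `unsigned_abs()` avoids the overflow that negating `i64::MIN` would cause.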
Martin von Zweigbergk
de7b5cf8b0 repo: write format ("git" or "local") to disk on init
We currently determine if the repo uses the Git backend or the local
backend by checking for presence of a `.jj/repo/store/git_target`
file. To make it easier to add out-of-tree backends, let's instead add
a file that indicates which backend to use.
2022-09-25 09:40:42 -07:00
Martin von Zweigbergk
ea5aa0a96d cleanup: replace some PathBuf args by &Path
In many of these places, we don't need an owned value, so using a
reference means we don't force the caller to clone the value. I really
doubt it will have any noticeable impact on performance (I think these
are all once-per-repo paths); it's just a little simpler this way.
2022-09-25 09:40:42 -07:00
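The signature change described in ea5aa0a96d follows a common Rust pattern, sketched here with a hypothetical `store_path` function: taking `&Path` lets callers pass a borrowed path without cloning their `PathBuf`.

```rust
use std::path::{Path, PathBuf};

/// Borrows the repo path; the caller keeps ownership of its `PathBuf`.
/// (`store_path` and the "store" directory name are illustrative only.)
fn store_path(repo_path: &Path) -> PathBuf {
    repo_path.join("store")
}
```

A `&PathBuf` coerces to `&Path` automatically, so existing callers only need to add a `&`.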
Martin von Zweigbergk
0108673087 backend: let each backend handle root commit on write
This moves the logic for handling the root commit when writing commits
from `CommitBuilder` into the individual backends. It always bothered
me a bit that the `commit::Commit` wrapper had a different idea of the
number of parents than the wrapped `backend::Commit` had.

With this change, the `LocalBackend` will now write the root commit in
the list of parents if it's there in the argument to
`write_commit()`. Note that the root commit itself won't be written. The
main argument for not writing it is that we can then keep the fake
all-zeros hash for it. One argument for writing it, if we were to do
so, is that it would make the set of written objects consistent, so
any future processing of them (such as GC) doesn't have to know to
ignore the root commit in the list of parents.

We still treat the two backends the same, so the user won't be allowed
to create merges including the root commit even when using the
`LocalBackend`.
2022-09-20 21:20:57 -07:00
Martin von Zweigbergk
fb8d087882 backend: make backend aware of root commit
I had made the backends unaware of the virtual root commit because
they don't need to know about it, and we could avoid some duplicated
code by putting that in `Store` instead. However, as we saw in
b21a123bc894, the root commit being virtual has some user-visible
effects (they can't create a merge with the root and some other
commit). So I'm thinking that we may want to make the root commit an
actual commit, depending on which backend is used. Specifically, when
using the Git backend, we cannot record the root commit as an actual
parent since Git would fail when trying to look it up. Backends that
don't need compatibility can make the root commit an actual commit,
however.

This commit therefore makes the backends aware of the root commit. It
makes it remain a virtual commit in the Git backend, and makes it an
actual commit in the `LocalBackend`.

This commit breaks any existing repos using the `LocalBackend`, but
there shouldn't be any such repos other than for testing.
2022-09-20 21:20:57 -07:00
Martin von Zweigbergk
1d9f1720c5 backend: add a Tree::from_hex() helper
2022-09-20 21:20:57 -07:00
Yuya Nishihara
872081c867 tests: use testutils::new_temp_dir() thoroughly
2022-09-07 23:49:46 +09:00
Martin von Zweigbergk
540f2eb583 errors: avoid using Debug formatting on error types
The regular `Display` format is (not surprisingly) more user-friendly,
as pointed out by @yuja.

I also switched to using format strings for these cases, and some
nearby strings for consistency.
2022-05-25 19:33:59 -07:00
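The difference that 540f2eb583 refers to can be seen with a small, made-up error type: `{:?}` prints the derived `Debug` form, while `{}` uses the hand-written `Display` message. The `StoreError` type and its message are assumptions for illustration.

```rust
use std::fmt;

/// Hypothetical error type used only for this comparison.
#[derive(Debug)]
enum StoreError {
    NotFound(String),
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::NotFound(id) => write!(f, "Object {id} not found"),
        }
    }
}

/// Debug formatting: exposes the variant name and quoting.
fn debug_message(err: &StoreError) -> String {
    format!("Error: {err:?}")
}

/// Display formatting: the message intended for users.
fn user_message(err: &StoreError) -> String {
    format!("Error: {err}")
}
```

For `StoreError::NotFound("abc".to_string())`, the first helper yields `Error: NotFound("abc")` and the second yields `Error: Object abc not found`.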
Martin von Zweigbergk
90c8cb0cba errors: add a custom error type for StackedTable
2022-05-01 23:35:09 -07:00
Martin von Zweigbergk
89476261c0 cleanup: move {read,write}_conflict() methods earlier in Backend trait
The methods working on conflicts are more closely related to those
working on files and trees, so it makes sense for them to be closer.
2022-05-01 23:35:09 -07:00
Martin von Zweigbergk
23d37f9060 git_backend: delete obsolete (?) comment about not avoiding use of index
It seems to me that we have never created a Git index in order to
create a commit, not even in the earliest versions of the code (before
it was moved to Git).
2022-04-29 13:26:27 -07:00
Martin von Zweigbergk
f5e9444456 cargo: upgrade uuid to 1.0.0
2022-04-20 14:18:59 -07:00
Martin von Zweigbergk
16994308fa git: remove code for upgrading from Git notes
2022-03-31 13:32:43 -07:00
Martin von Zweigbergk
f16d2a237b backend: pass in path when reading/writing conflicts as well
We do it for all the other kinds of objects already. It's useful to
have the path for backends that store objects by path (we don't have
any such backends yet). I think the reason I didn't do it from the
beginning was that we had separate `RepoPath` types for files and
directories back then.
2022-03-31 10:23:33 -07:00
Martin von Zweigbergk
6cd4e03c25 cleanup: use canonicalize() method instead of free function
I had somehow not noticed that `Path` and `PathBuf` have
`canonicalize()` methods. Using them saves a few characters of code.
2022-03-30 22:09:55 -07:00
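The two spellings mentioned in 6cd4e03c25 are equivalent, as this small sketch shows. The wrapper function names are made up; `std::fs::canonicalize` and `Path::canonicalize` are the standard-library calls being compared.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Free-function form.
fn resolve_with_free_function(path: &Path) -> io::Result<PathBuf> {
    fs::canonicalize(path)
}

/// Method form; slightly shorter and reads left to right.
fn resolve_with_method(path: &Path) -> io::Result<PathBuf> {
    path.canonicalize()
}
```

Both return an error if the path does not exist, so callers handle the `io::Result` the same way in either case.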
Martin von Zweigbergk
42252a2f00 cli: on jj init --git-repo=., use relative path to .git/
When the backing Git repo is inside the workspace (typically directly
in `.git/`), let's point to it by a relative path so the whole
workspace can be moved without breaking the link.

Closes #72.
2022-03-05 09:37:48 -08:00
Martin von Zweigbergk
2d6b66a274 stacked_table: rename start_modification() to start_mutation()
`start_mutation()` better matches the return type's name.
2022-01-05 15:17:24 -08:00
Martin von Zweigbergk
8cf5dd286a backend: make Vec inside CommitId non-public
The recent e5dd93cbf712, whose description says "cleanup: make Vec
inside CommitId etc. non-public", made all ID types in the `backend`
module *except* for `CommitId` non-public :P This patch makes
2021-11-19 23:19:00 -08:00
Martin von Zweigbergk
f846112f80 cleanup: remove an unnecessary reference-taking noticed by Clippy
2021-11-14 12:47:14 -08:00
Martin von Zweigbergk
ced252f766 cleanup: replace some as_slice() by &
2021-11-10 10:55:58 -08:00
Martin von Zweigbergk
e5dd93cbf7 cleanup: make Vec inside CommitId etc. non-public
2021-11-10 10:46:10 -08:00
Martin von Zweigbergk
c260fea811 GitBackend: move extra metadata from Git notes to stacked-table storage
Git notes (at least as implemented by libgit2) quickly get really
slow, as noted in issue #7. This patch replaces it by a custom storage
format.

I tested the performance in the git.git repo with just a few hundred
annotated commits (~450, I think) and no sharding. I listed the first
~2900 commits there using `jj log --no-graph -r ,,v1.0.0 -T 'author
"\n"' | wc -l`. That took about 882ms. After this patch, it dropped to
108ms.

I did a similar test in this repo with 12700 annotated commits and
sharding, listing all visible commits. That took 142ms before this
patch (the sharding helps a lot!) and 55ms after.

Closes #3.
Closes #7.
2021-10-20 13:22:59 -07:00
Martin von Zweigbergk
d8795b9ae7 store: move logic for initialization of GitBackend to that type
2021-10-18 08:49:22 -07:00
Martin von Zweigbergk
fdb861b957 backend: remove unused Commit::is_pruned (#32)
2021-10-06 23:53:15 -07:00
Martin von Zweigbergk
ce5e95fa80 store: rename Store to Backend and StoreWrapper to Store
For what's currently called `Store` in the code, I have been using
"backend" in plain text. That probably means that `Backend` is a good
name for it.
2021-09-12 12:02:10 -07:00