52 Commits

Author SHA1 Message Date
Yuya Nishihara
8a0fcfb032 git_backend: leverage read_extra_metadata_table() in write_commit()
And use the readonly table for lookup, which allows us to extract a helper
method to add/save entry.
2023-05-20 15:53:23 +09:00
Yuya Nishihara
14243a85a0 git_backend: extract helper to read extra metadata table and maintain cache 2023-05-20 15:53:23 +09:00
Martin von Zweigbergk
87a925d736 git_backend: return timestamps for what was actually written
Now that we return the written commit from `write_commit()`, let's
make the timestamps match what was actually written, accounting for
the whole-second precision and the adjustment we do to avoid
collisions.
2023-05-12 15:20:44 -07:00
Martin von Zweigbergk
a95188ddbc backend: take commit to write by value and return new value
The internal backend at Google doesn't let you write any value you
want for in the committer field. The `Store` type still caches the
value it attempted to write, which gets a little weird when the
written value is not what we tried to write. We should use the value
the backend actually wrote. However, we don't know if the backend
changed anything without reading the value back, which is often
wasteful. This commit changes the API to return the written value.

I only changed the signature of `write_commit()` for now. Maybe we
should make a similar change to `write_tree()`.
2023-05-12 15:20:44 -07:00
Martin von Zweigbergk
e7419e76a1 backend: replace git_repo() by as_any()
This has several advantages:

 * Makes it possible to downcast to non-Git custom backends (might be
   useful at Google, but we haven't needed it yet)

 * Lets us access more specific functionality on the `GitBackend`,
   making it possible to access the `git2::Repository` without
   creating a copy of it.

 * Removes the dependency on Git from the backend
2023-05-12 08:05:09 -07:00
Martin von Zweigbergk
a87125d08b backend: rename ConflictPart to ConflictTerm
It took a while before I realized that conflicts could be modeled as
simple algebraic expressions with positive and negative terms (they
were modeled as recursive 3-way conflicts initially). We've been
thinking of them that way for a while now, so let's make the
`ConflictPart` name match that model.
2023-02-17 23:28:50 -08:00
Martin von Zweigbergk
d1dc22d957 backend: let backend decide length of change id
As mentioned in the previous commit, our internal backend at Google
uses a 32-byte long change id. This commit will make us able to use
that.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
e6693d0f68 backend: let backend choose root change id
Our internal backend at Google uses a 32-byte change id, so I'd like
to make the backend able to decide the length. To start with, let's
make the backend able to decide what the root change id should
be. That's consistent with how we already let the backend decide what
the root commit id should be.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
98259346df backend: make hash_length() specifically about commit IDs
The function is currently only about the length of commit IDs, so
let's clarify that. I'm going to add another function for the length
of change IDs next. I don't know if we're going to care about lengths
of other hashes in the future. We might even be able to remove the
current restriction that all commit IDs and all change IDs have the
same length.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
f4374086b3 git_backend: return error when told to write commit without parents
There should be no other commits than the root commit without parents.
2023-02-05 22:52:23 -08:00
Martin von Zweigbergk
8c63fbc4ed git_backend: don't panic if told to write merge with root commit
I think the CLI currently checks that the backend is not told to write
a merge commit with the root as one parent, but we should not panic if
those checks fail.
2023-02-05 22:52:23 -08:00
Martin von Zweigbergk
2b2a9a36d7 git_backend: test conversion of parents, including root
We didn't seem to have any tests showing how we convert the set of
parents, and especially how we handle the root commit, so let's add
some.
2023-02-05 22:52:23 -08:00
Martin von Zweigbergk
985555f393 git_backend: avoid redoing some steps when retrying in write_commit()
By inlining `wite_commit_internal()` into `write_commit()`, we can
avoid redoing some steps when we retry. This includes taking the mutex
lock, and reading the tree object and parent commits. It also means
that we avoid cloning the input commit object, which we otherwise
would even in the non-retrying case. I haven't measured if any of this
makes a significant difference, but I think it also slightly
simplifies the code, so it doesn't have to.
2023-01-17 23:12:50 -08:00
Ilya Grigoriev
12ee2b18cd Git backend: Allow simultaneous rebasing of duplicate commits
Fixes https://github.com/martinvonz/jj/issues/27
Fixes https://github.com/martinvonz/jj/issues/694
2023-01-17 21:17:27 -08:00
Yuya Nishihara
ca2e9fe6d1 git: simply use rand::random() to generate ref preventing gc
We don't care the ref content as long as it is unique, so using threaded
RNG should be fine.

This change means refs/jj/keep will now contain refs of the following
forms:

 - new create_no_gc_ref(): 0f8d6cd9721823906cfb55dac99d7bf5
 - old create_no_gc_ref(): 0f6d93fe-0507-4db8-ad0a-6317f02e27b9
 - prevent_gc(commit_id):  0f9c15100b6f1373f38186357e274a829fb6c4e2
2023-01-14 23:48:02 +09:00
Waleed Khan
af55d17a25 git_backend: propagate various errors
I needed this in the course of debugging an error. Before this commit, the error looked like this:

```
Error: Unexpected error from backend: Object not found
```

After this commit, it looks like this:

```
Error: Unexpected error from backend: Object with CommitId 8f59646bc9bb6bb44b5624f1248f4a708f37003c not found: object not found - no match for id (8f59646bc9bb6bb44b5624f1248f4a708f37003c); class=Odb (9); code=NotFound (-3)
```
2023-01-02 12:28:51 -06:00
Waleed Khan
456be4cc73 backend: create BackendError::InvalidHashLength
Strictly speaking, we could rely on e.g. `git2::Oid::from_str` to produce an error, but I figure that having an explicit error for a mismatching hash length might demystify some error condition in the future, since commit IDs and change IDs and potentially other backends' IDs may have different lengths, so this could flag a mismatch earlier/more obviously.
2023-01-02 12:28:51 -06:00
Waleed Khan
7f8a196ab2 backend: create ObjectId trait
This lets us operate over various kinds of objects polymorphically (e.g. call `.hex()` on any kind of object hash).
2023-01-02 12:28:51 -06:00
Benjamin Saunders
aaa175eca7 lib: replace protobuf crate with prost 2022-12-22 07:04:35 -08:00
Luke Granger-Brown
90ba55bd7b git: cache the extra metadata table
Performance on repositories with many commits is limited somewhat by repeatedly
stating the tablestore directory to work out what the head is. By caching the
table rather than looking it up from disk on every request, we can much more
rapidly satisfy requests. 

This avoids the pathological case in #845 where jj operations take several
minutes to complete.

This patch doesn't change the normal flow of the write path: that will still
always call get_head() on the underlying TableStore, which will stat the
directory before writing out changes. It will however empty the cache when the
metadata has been written.

Fixes #845.
2022-12-17 08:19:14 +00:00
Martin von Zweigbergk
7f9a0a2820 cleanup: let new Clippy move variables into format strings
I ran an upgraded Clippy on the codebase. All the changes seem to be
about using variables directly in format strings instead of passing
them as separate arguments.
2022-12-14 21:30:58 -08:00
Martin von Zweigbergk
be383cebc7 git: on import, add GC-preventing refs to all seen refs
To prevent git's GC from breaking a repo, we already add a git ref to
commits we create in the git backend. However, we don't add refs to
commits we import from git. This fixes that.

Closes #815.
2022-12-03 22:50:26 -08:00
Martin von Zweigbergk
d8feed9be4 copyright: change from "Google LLC" to "The Jujutsu Authors"
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).

Google employees can read about Google's policy at
go/releasing/contributions#copyright.
2022-11-28 06:05:45 -10:00
Martin von Zweigbergk
780d7fb59c backend: rename NormalFile to just File
There are no "non-normal" files, so "normal" is not needed. We have
symlinks and conflicts, but they are not files, so I think just "file"
is unambiguous.

I left `testutils::write_normal_file()` because there it's used to
mean "not executable file" (there's also a `write_executable_file()`).

I left `working_copy::FileType::Normal` since renaming `Normal` there
to `File` would also suggest we should rename `FileType`, and I don't
know what would be a better name for that type.
2022-11-14 23:36:43 -08:00
Martin von Zweigbergk
3c7c4e9f5c tests: move testutils module into separate crate
The `testutils` module should ideally not be part of the library
dependencies. Since they're used by the integration tests (and the CLI
tests), we need to move them to a separate crate to achieve that.
2022-11-08 07:29:35 -08:00
Martin von Zweigbergk
6703810c6e backend: remove Commit::is_open field from data model 2022-11-05 06:14:37 -07:00
Ilya Grigoriev
2b8dabaae4 Fixes suggested by new version of Clippy 2022-11-03 21:38:16 -07:00
Martin von Zweigbergk
3b3f6129e6 backend: allow negative timestamps in commits and operations
I was reading a draft of "Git Rev News: Edition 91" [1] where Peff
mentions some unfinished patches to allow negative timestamps in
Git. So I figured I should add support for that before I forget. I
haven't checked if libgit2 supports it, so it might be that our Git
backend still doesn't support it after this patch.

 [1] https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-91.md
2022-09-30 00:50:17 -07:00
Martin von Zweigbergk
de7b5cf8b0 repo: write format ("git" or "local") to disk on init
We currently determine if the repo uses the Git backend or the local
backend by checking for presence of a `.jj/repo/store/git_target`
file. To make it easier to add out-of-tree backends, let's instead add
a file that indicates which backend to use.
2022-09-25 09:40:42 -07:00
Martin von Zweigbergk
ea5aa0a96d cleanup: replace some PathBuf args by &Path
In many of these places, we don't need an owned value, so using a
reference means we don't force the caller to clone the value. I really
doubt it will have any noticeable impact on performance (I think these
are all once-per-repo paths); it's just a little simpler this way.
2022-09-25 09:40:42 -07:00
Martin von Zweigbergk
0108673087 backend: let each backend handle root commit on write
This moves the logic for handling the root commit when writing commits
from `CommitBuilder` into the individual backends. It always bothered
me a bit that the `commit::Commit` wrapper had a different idea of the
number of parents than the wrapped `backend::Commit` had.

With this change, the `LocalBackend` will now write the root commit in
the list of parents if it's there in the argument to
`write_commit()`. Note that root commit itself won't be written. The
main argument for not writing it is that we can then keep the fake
all-zeros hash for it. One argument for writing it, if we were to do
so, is that it would make the set of written objects consistent, so
any future processing of them (such as GC) doesn't have to know to
ignore the root commit in the list of parents.

We still treat the two backends the same, so the user won't be allowed
to create merges including the root commit even when using the
`LocalBackend`.
2022-09-20 21:20:57 -07:00
Martin von Zweigbergk
fb8d087882 backend: make backend aware of root commit
I had made the backends unaware of the virtual root commit because
they don't need to know about it, and we could avoid some duplicated
code by putting that in `Store` instead. However, as we saw in
b21a123bc894, the root commit being virtual has some user-visible
effects (they can't create a merge with the root and some other
commit). So I'm thinking that we may want to make the root commit an
actual commit, depending on which backend is used. Specificially, when
using the Git backend, we cannot record the root commit as an actual
parent since Git would fail when trying to look it up. Backends that
don't need compatibility can make the root commit an actual commit,
however.

This commit therefore makes the backends aware of the root commit. It
makes it remain a virtual commit in the Git backend, and makes it an
actual commit in the `LocalBackend`.

This commit breaks any existing repos using the `LocalBackend`, but
there shouldn't be any such repos other than for testing.
2022-09-20 21:20:57 -07:00
Martin von Zweigbergk
1d9f1720c5 backend: add a Tree::from_hex() helper 2022-09-20 21:20:57 -07:00
Yuya Nishihara
872081c867 tests: use testutils::new_temp_dir() thoroughly 2022-09-07 23:49:46 +09:00
Martin von Zweigbergk
540f2eb583 errors: avoid using Debug formatting on error types
The regular `Display` format is (not surprisingly) more user-friendly,
as pointed out by @yuja.

I also switched to using format strings for these cases, and some
nearby strings for consistency.
2022-05-25 19:33:59 -07:00
Martin von Zweigbergk
90c8cb0cba errors: add a custom error type for StackedTable 2022-05-01 23:35:09 -07:00
Martin von Zweigbergk
89476261c0 cleanup: move {read,write}_conflict() methods earlier in Backend trait
The methods working on conflicts are more closely related to those
working on files and trees, so it makes sense for them to be closer.
2022-05-01 23:35:09 -07:00
Martin von Zweigbergk
23d37f9060 git_backend: delete obsolete (?) comment about not avoiding use of index
It seems to me that we have never created a Git index in order to
create a commit, not even in the earliest versions of the code (before
it was moved to Git).
2022-04-29 13:26:27 -07:00
Martin von Zweigbergk
f5e9444456 cargo: upgrade uuid to 1.0.0 2022-04-20 14:18:59 -07:00
Martin von Zweigbergk
16994308fa git: remove code for upgrading from Git notes 2022-03-31 13:32:43 -07:00
Martin von Zweigbergk
f16d2a237b backend: pass in path when reading/writing conflicts as well
We do it for all the other kinds of objects already. It's useful to
have the path for backends that store objects by path (we don't have
any such backends yet). I think the reason I didn't do it from the
beginning was because we had separate `RepoPath` types for files and
directories back then.
2022-03-31 10:23:33 -07:00
Martin von Zweigbergk
6cd4e03c25 cleanup: use canonicalize() method instead of free function
I had somehow not noticed that `Path` and `PathBuf` have
`canonicalize()` methods. Using them saves a few characters of code.
2022-03-30 22:09:55 -07:00
Martin von Zweigbergk
42252a2f00 cli: on jj init --git-repo=., use relative path to .git/
When the backing Git repo is inside the workspace (typically directly
in `.git/`), let's point to it by a relative path so the whole
workspace can be moved without breaking the link.

Closes #72.
2022-03-05 09:37:48 -08:00
Martin von Zweigbergk
2d6b66a274 stacked_table: rename start_modification() to start_mutation()
`start_mutation()` better matches the return type's name.
2022-01-05 15:17:24 -08:00
Martin von Zweigbergk
8cf5dd286a backend: make Vec inside CommitId non-public
The recent e5dd93cbf712, whose description says "cleanup: make Vec
inside CommitId etc. non-public", made all ID types in the `backend`
module *except* for `CommitId` non-public :P This patch makes
2021-11-19 23:19:00 -08:00
Martin von Zweigbergk
f846112f80 cleanup: remove an unnecessary reference-taking noticed by Clippy 2021-11-14 12:47:14 -08:00
Martin von Zweigbergk
ced252f766 cleanup: replace some as_slice() by & 2021-11-10 10:55:58 -08:00
Martin von Zweigbergk
e5dd93cbf7 cleanup: make Vec inside CommitId etc. non-public 2021-11-10 10:46:10 -08:00
Martin von Zweigbergk
c260fea811 GitBackend: move extra metadata from Git notes to stacked-table storage
Git notes (at least as implemented by libgit2) quickly gets really
slow, as noted in issue #7. This patch replaces it by a custom storage
format.

I tested the performance in the git.git repo with just a few hundred
annotated commits (~450, I think) and no sharding. I listed the first
~2900 commits there using `jj log --no-graph -r ,,v1.0.0 -T 'author
"\n"' | wc -l`. That took about 882ms. After this patch, it dropped to
108ms.

I did a similar test in this repo with 12700 annotated commits and
sharding, listing all visible commits. That took 142ms before this
patch (the sharding helps a lot!) and 55ms after.

Closes #3.
Closes #7.
2021-10-20 13:22:59 -07:00
Martin von Zweigbergk
d8795b9ae7 store: move logic for initialization of GitBackend to that type 2021-10-18 08:49:22 -07:00