77 Commits

Author SHA1 Message Date
Martin von Zweigbergk
ef5f97f8d7 conflicts: move Merge<T> to merge module
The `merge` module now seems like the obvious place for this type.
2023-08-06 22:08:09 +00:00
Martin von Zweigbergk
ecc030848d conflicts: rename Conflict<T> to Merge<T>
Since `Conflict<T>` can also represent a non-conflict state (a single
term), `Merge<T>` seems like a better name.

Thanks to @ilyagr for the suggestion in
https://github.com/martinvonz/jj/pull/1774#discussion_r1257547709

Sorry about the churn. It would have been better if I thought of this
name before I introduced `Conflict<T>`.
2023-08-06 22:08:09 +00:00
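
To illustrate why ecc030848d argues that `Merge<T>` is the better name, here is a minimal sketch of a type that covers both the resolved and the conflicted state. The field names, the constructors, and the "one more add than removes" invariant are assumptions for illustration, not the actual jj-lib definition.

```rust
/// Hypothetical stand-in for a merge of values of type `T`.
/// `adds` are the positive terms and `removes` are the negative terms.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Merge<T> {
    removes: Vec<T>,
    adds: Vec<T>,
}

impl<T> Merge<T> {
    /// Builds a merge. Assumed invariant: exactly one more add than removes.
    pub fn new(removes: Vec<T>, adds: Vec<T>) -> Self {
        assert_eq!(adds.len(), removes.len() + 1);
        Merge { removes, adds }
    }

    /// A single term is not a conflict at all, hence the name `Merge`.
    pub fn resolved(value: T) -> Self {
        Merge::new(vec![], vec![value])
    }

    /// Returns the value if the merge consists of a single term.
    pub fn as_resolved(&self) -> Option<&T> {
        if self.removes.is_empty() {
            self.adds.first()
        } else {
            None
        }
    }
}
```

With this shape, `Merge::resolved("a").as_resolved()` yields the value, while a merge with a base and two sides yields `None`.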
Martin von Zweigbergk
006c764694 backend: learn to store tree-level conflicts
Tree-level conflicts (#1624) will be stored as multiple trees
associated with a single commit. This patch adds support for that in
`backend::Commit` and in the backends.

When the Git backend writes a tree conflict, it creates a special root
tree for the commit. That tree has only the individual trees from the
conflict as subtrees. That way we prevent the trees from getting
GC'd. We also write the tree ids to the extra metadata table
(i.e. outside of the Git repo) so we don't need to load the tree
object to determine if there are conflicts.

I also added a new flag to `backend::Commit` indicating whether the
commit is a new-style commit (with support for tree-level
conflicts). That will help with the migration. We will remove it once
we no longer care about old repos. When the flag is set, we know that
a commit with a single tree cannot have conflicts. When the flag is
not set, it's an old-style commit where we have to walk the whole tree
to find conflicts.
2023-07-19 22:04:16 -07:00
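
The migration logic described in 006c764694 can be sketched roughly as follows. The struct, field, and enum names are hypothetical, and the sketch assumes that more than one root tree always means a conflict.

```rust
/// Hypothetical, simplified view of a commit as stored by the backend.
pub struct Commit {
    /// Ids of the root trees; more than one means a tree-level conflict.
    pub root_trees: Vec<String>,
    /// Assumed name for the "new-style commit" flag from the message above.
    pub uses_tree_conflict_format: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ConflictCheck {
    /// New-style commit with a single tree: cannot have conflicts.
    NoConflicts,
    /// Multiple trees are stored, so the commit is conflicted.
    HasConflicts,
    /// Old-style commit: the whole tree must be walked to find conflicts.
    MustWalkTree,
}

pub fn conflict_check(commit: &Commit) -> ConflictCheck {
    if commit.root_trees.len() > 1 {
        ConflictCheck::HasConflicts
    } else if commit.uses_tree_conflict_format {
        ConflictCheck::NoConflicts
    } else {
        ConflictCheck::MustWalkTree
    }
}
```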
Waleed Khan
54dba51a08 docs: warn about missing docs for jj-lib crate 2023-07-10 18:28:59 +03:00
Yuya Nishihara
5346bd734f git_backend: translate io::Error of read_conflict() to ReadObject error
This is the last place in Git backend where io::Error is magically converted
to BackendError::Other.
2023-07-06 20:48:46 +09:00
Yuya Nishihara
4e4ca46998 git_backend: wrap TableStoreError to preserve source error object 2023-07-06 20:48:46 +09:00
Yuya Nishihara
cf8a0466c4 backend: introduce error types specific to init/load phases
Errors that may occur while loading a backend vary per backend, and
it's unlikely that these errors could be mapped to BackendError variants
other than BackendError::Other. So let's extract Other(_) of that kind as
a separate type to clarify there would be no other error variants.

Perhaps, Backend/Error will be renamed to CommitBackend/Error or
CommitStore/Error?, whereas I think BackendInit/LoadError can be shared
among store factories.
2023-07-06 20:48:46 +09:00
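
As a rough illustration of the idea in cf8a0466c4 (and of preserving the source error, as in e1e75daa8e), a dedicated init error can wrap an arbitrary underlying error. The name `BackendInitError` comes from the message above; its exact shape, the message text, and the `init_backend` helper are assumptions.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Error for the initialization phase only. It wraps whatever the concrete
/// backend failed with, so the original cause is not lost.
#[derive(Debug)]
pub struct BackendInitError(pub Box<dyn Error + Send + Sync>);

impl fmt::Display for BackendInitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to initialize backend")
    }
}

impl Error for BackendInitError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Expose the wrapped error so callers can print the full chain.
        let source: &(dyn Error + 'static) = &*self.0;
        Some(source)
    }
}

/// Hypothetical initializer that fails when the store directory is missing.
pub fn init_backend(store_dir_exists: bool) -> Result<(), BackendInitError> {
    if store_dir_exists {
        Ok(())
    } else {
        let cause = io::Error::new(io::ErrorKind::NotFound, "store directory is missing");
        Err(BackendInitError(Box::new(cause)))
    }
}
```

A caller can then walk `source()` to report both the high-level message and the underlying I/O error.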
Yuya Nishihara
e1e75daa8e backend: make BackendError::Other preserve source error object 2023-07-06 20:48:46 +09:00
Yuya Nishihara
5b78fe75b1 git_backend: propagate load() error to caller
#1794
2023-07-06 12:43:49 +09:00
Yuya Nishihara
84060d750b git_backend: propagate init_internal() error to caller 2023-07-06 12:43:49 +09:00
Yuya Nishihara
2db4c906ad git_backend: attach file path to initialization error 2023-07-06 12:43:49 +09:00
Yuya Nishihara
31bb68486e git_backend: insert error type specific to backend initialization
This helps to map initialization errors to BackendError without a too-general
From impl. I don't think io::Error (or our PathError) should be automatically
translated to BackendError::Other because BackendError has more specific
variants depending on context. If the error is specific to initialization,
it makes sense to translate it to Other variant.
2023-07-06 12:43:49 +09:00
Yuya Nishihara
a09a406817 git_backend: leverage std::fs::read/write() helpers 2023-07-06 12:43:49 +09:00
Kevin Liao
eac90fd113 Update init_external to return an error instead of unwrapping 2023-06-29 10:03:13 -07:00
Martin von Zweigbergk
da5db27bb0 backend: split up store.proto in git and local versions
It was convenient that what the git backend stored in its "extras"
table is exactly a subset of the fields that local backend stores, but
it's a bit ugly and limiting. For example, it makes it possible to
populate the `author` field in the git extras, but that would have no
effect. It's better that it's not possible to do that (we store the
author field in the git commit, of course).

What made me notice this now was that I'm working on tree-level
conflicts (#1624) and I'm thinking of adding a field to the git extras
saying "this commit has a single tree, but it's still a new-style
commit", so we know not to walk such trees to find path-level
conflicts. That's only needed for the git backend because we don't
care about compatibility for the local backend.
2023-06-22 13:49:46 +02:00
Kevin Liao
86b6a11e63 Fix jj init --git-repo fails and leaves broken .jj folder
This commit fixes #1305

Before this commit, running `jj init --git-repo=./` in a folder that
does not have a .git would cause jj to panic and leave an unfinished, corrupted jj repo.

This commit fixes that by changing the call chain to return an error
instead of calling .unwrap() and panicking. This commit also adds logic to delete the unfinished jj
repository when the git backend initialization fails.

Before this commit, running the above command would result in the following
```
Running `jj/target/debug/jj init --git-repo=./`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { code: -3, klass: 2, message: "failed to resolve path '/Users/kevincliao/github/jj/test-repo/.jj/repo/store/../../../.git': No such file or directory" }', lib/src/git_backend.rs:83:75
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

After this commit, the result is the following and the jj repo is deleted:
```
Running `jj/target/debug/jj init --git-repo=./`
Error: Failed to access the repository: Error: Failed to open git repository: failed to resolve path '/Users/kevincliao/github/jj/test-repo/.jj/repo/store/../../../.git': No such file or directory; class=Os (2); code=NotFound (-3)
```
2023-06-20 11:02:06 -07:00
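
The fix in 86b6a11e63 follows a general pattern: return the error instead of unwrapping, and remove the half-created directory on failure. A minimal sketch of that pattern, using a hypothetical `init_repo_dir` helper and a caller-supplied backend initializer rather than the real jj call chain:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Creates `jj_dir`, runs `init_backend` on it, and deletes the directory
/// again if initialization fails, so no broken repo is left behind.
pub fn init_repo_dir<F>(jj_dir: &Path, init_backend: F) -> io::Result<()>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    fs::create_dir(jj_dir)?;
    if let Err(err) = init_backend(jj_dir) {
        // Best-effort cleanup; the initialization error is what gets reported.
        let _ = fs::remove_dir_all(jj_dir);
        return Err(err);
    }
    Ok(())
}
```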
Yuya Nishihara
38a7e7fd62 git_backend: on read_commit(), bulk-update extra metadata table of ancestors
Otherwise, "jj init --git-repo ." would create extra table files per commit,
and merge them.

I considered adding an explicit GitBackend method to be called from
git::import_refs(), but the call order matters. The method should be invoked
before calling store.get_commit(..) or mut_repo.add_head(..). Since commits
are likely to be loaded from the head, we can instead make read_commit()
import all ancestor metadata at once.

Alternatively, we could make a Git commit hidden until it's inserted into
the extra table. It's a rather big change, and I wouldn't like to do that
without thinking more thoroughly.
2023-05-21 08:29:00 +09:00
Yuya Nishihara
fe97dccd02 git_backend: move add_entry() of extra metadata table to caller
I'm going to add a caller which will insert multiple entries at once.
2023-05-21 08:29:00 +09:00
Yuya Nishihara
e6addf7905 git_backend: extract helper that converts git2::Commit to backend::Commit
The root parent id is filled by caller because empty parents list is more
convenient while walking ancestors.
2023-05-21 08:29:00 +09:00
Yuya Nishihara
0149e7b311 git_backend: generate change id from git2::Commit object
I'm going to extract a helper function that converts git2::Commit to
backend::Commit struct, and the commit id can also be obtained from the
git2::Commit object.
2023-05-21 08:29:00 +09:00
Yuya Nishihara
5dba0502cb git_backend: cache head of saved extra metadata table
Just because we know the latest table head.
2023-05-21 08:29:00 +09:00
Yuya Nishihara
a9422460cb git_backend: ensure change id generated from git commit id never reassigned
Fixes #924
2023-05-20 15:53:23 +09:00
Yuya Nishihara
9aa72f6f1d git_backend: add lock to prevent racy change id assignments
My first attempt was to fix up corrupted index when merging, but it turned
out to be not easy because the self side may contain corrupted data. It's
also possible that two concurrent commit operations have exactly the same
view state (because change id isn't hashed into commit id), and only the
table heads diverge.

#924
2023-05-20 15:53:23 +09:00
Yuya Nishihara
e224044dea git_backend: consistently use CommitId type to look up extra metadata table 2023-05-20 15:53:23 +09:00
Yuya Nishihara
78c8dbc8fe git_backend: extract helper to add extra metadata entry and save table 2023-05-20 15:53:23 +09:00
Yuya Nishihara
8a0fcfb032 git_backend: leverage read_extra_metadata_table() in write_commit()
And use the readonly table for lookup, which allows us to extract a helper
method to add/save entry.
2023-05-20 15:53:23 +09:00
Yuya Nishihara
14243a85a0 git_backend: extract helper to read extra metadata table and maintain cache 2023-05-20 15:53:23 +09:00
Martin von Zweigbergk
87a925d736 git_backend: return timestamps for what was actually written
Now that we return the written commit from `write_commit()`, let's
make the timestamps match what was actually written, accounting for
the whole-second precision and the adjustment we do to avoid
collisions.
2023-05-12 15:20:44 -07:00
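
The whole-second precision mentioned in 87a925d736 can be illustrated with a small helper. The `MillisSinceEpoch` wrapper and the function name are assumptions for this sketch, and the collision adjustment is not modeled.

```rust
/// Hypothetical millisecond timestamp, as a commit might carry in memory.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MillisSinceEpoch(pub i64);

/// Rounds down to whole seconds, which is the precision assumed to survive
/// a round trip through a Git commit object.
pub fn truncate_to_seconds(ts: MillisSinceEpoch) -> MillisSinceEpoch {
    MillisSinceEpoch(ts.0.div_euclid(1000) * 1000)
}
```

Returning the truncated value from the write path means callers see exactly what a later read would produce.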
Martin von Zweigbergk
a95188ddbc backend: take commit to write by value and return new value
The internal backend at Google doesn't let you write any value you
want in the committer field. The `Store` type still caches the
value it attempted to write, which gets a little weird when the
written value is not what we tried to write. We should use the value
the backend actually wrote. However, we don't know if the backend
changed anything without reading the value back, which is often
wasteful. This commit changes the API to return the written value.

I only changed the signature of `write_commit()` for now. Maybe we
should make a similar change to `write_tree()`.
2023-05-12 15:20:44 -07:00
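
The API change in a95188ddbc can be sketched as a method that takes the commit by value and hands back what was actually stored. The trait, struct, and error type below are simplified stand-ins, not the real `Backend` trait, and `FixedCommitterBackend` is a made-up example of a backend that overrides a field.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Commit {
    pub committer: String,
    pub description: String,
}

pub trait Backend {
    /// Takes the commit by value and returns the value that was written,
    /// which may differ from the input.
    fn write_commit(&self, commit: Commit) -> Result<Commit, String>;
}

/// Hypothetical backend that always records its own committer.
pub struct FixedCommitterBackend {
    pub committer: String,
}

impl Backend for FixedCommitterBackend {
    fn write_commit(&self, mut commit: Commit) -> Result<Commit, String> {
        commit.committer = self.committer.clone();
        // A real backend would persist `commit` here.
        Ok(commit)
    }
}
```

A cache sitting in front of the backend can store the returned value instead of the attempted one, without a wasteful read-back.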
Martin von Zweigbergk
e7419e76a1 backend: replace git_repo() by as_any()
This has several advantages:

 * Makes it possible to downcast to non-Git custom backends (might be
   useful at Google, but we haven't needed it yet)

 * Lets us access more specific functionality on the `GitBackend`,
   making it possible to access the `git2::Repository` without
   creating a copy of it.

 * Removes the dependency on Git from the backend
2023-05-12 08:05:09 -07:00
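
The `as_any()` approach from e7419e76a1 relies on `std::any::Any` downcasting. A minimal sketch, with simplified stand-ins for the trait and the two backends (the `repo_path` accessor is hypothetical):

```rust
use std::any::Any;

pub trait Backend {
    fn name(&self) -> &str;
    /// Lets callers downcast to the concrete backend type.
    fn as_any(&self) -> &dyn Any;
}

pub struct GitBackend {
    repo_path: String,
}

impl GitBackend {
    pub fn new(repo_path: &str) -> Self {
        GitBackend { repo_path: repo_path.to_string() }
    }

    /// Git-specific functionality that is not part of the trait.
    pub fn repo_path(&self) -> &str {
        &self.repo_path
    }
}

impl Backend for GitBackend {
    fn name(&self) -> &str {
        "git"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

pub struct LocalBackend;

impl Backend for LocalBackend {
    fn name(&self) -> &str {
        "local"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Returns the Git repo path only if the backend really is a `GitBackend`.
pub fn git_repo_path(backend: &dyn Backend) -> Option<&str> {
    backend
        .as_any()
        .downcast_ref::<GitBackend>()
        .map(|git| git.repo_path())
}
```

The trait itself no longer mentions Git; only the caller that wants Git-specific access names `GitBackend`.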
Martin von Zweigbergk
a87125d08b backend: rename ConflictPart to ConflictTerm
It took a while before I realized that conflicts could be modeled as
simple algebraic expressions with positive and negative terms (they
were modeled as recursive 3-way conflicts initially). We've been
thinking of them that way for a while now, so let's make the
`ConflictPart` name match that model.
2023-02-17 23:28:50 -08:00
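
The "algebraic expression" view in a87125d08b suggests that a positive and a negative term with the same value cancel, like `+x - x = 0`. The sketch below illustrates that idea only; the `Term` enum and `simplify` function are hypothetical and not taken from the codebase.

```rust
/// Hypothetical signed term of a conflict expression.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Term<T> {
    Add(T),
    Remove(T),
}

/// Cancels each `Remove(x)` against one matching `Add(x)`.
/// Remaining adds come first in the result, followed by remaining removes.
pub fn simplify<T: PartialEq>(terms: Vec<Term<T>>) -> Vec<Term<T>> {
    let mut adds = Vec::new();
    let mut removes = Vec::new();
    for term in terms {
        match term {
            Term::Add(value) => adds.push(value),
            Term::Remove(value) => removes.push(value),
        }
    }
    let mut remaining_removes = Vec::new();
    for removed in removes {
        if let Some(pos) = adds.iter().position(|added| *added == removed) {
            adds.remove(pos);
        } else {
            remaining_removes.push(removed);
        }
    }
    adds.into_iter()
        .map(Term::Add)
        .chain(remaining_removes.into_iter().map(Term::Remove))
        .collect()
}
```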
Martin von Zweigbergk
d1dc22d957 backend: let backend decide length of change id
As mentioned in the previous commit, our internal backend at Google
uses a 32-byte long change id. This commit will make us able to use
that.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
e6693d0f68 backend: let backend choose root change id
Our internal backend at Google uses a 32-byte change id, so I'd like
to make the backend able to decide the length. To start with, let's
make the backend able to decide what the root change id should
be. That's consistent with how we already let the backend decide what
the root commit id should be.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
98259346df backend: make hash_length() specifically about commit IDs
The function is currently only about the length of commit IDs, so
let's clarify that. I'm going to add another function for the length
of change IDs next. I don't know if we're going to care about lengths
of other hashes in the future. We might even be able to remove the
current restriction that all commit IDs and all change IDs have the
same length.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
f4374086b3 git_backend: return error when told to write commit without parents
There should be no other commits than the root commit without parents.
2023-02-05 22:52:23 -08:00
Martin von Zweigbergk
8c63fbc4ed git_backend: don't panic if told to write merge with root commit
I think the CLI currently checks that the backend is not told to write
a merge commit with the root as one parent, but we should not panic if
those checks fail.
2023-02-05 22:52:23 -08:00
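
The parent checks from f4374086b3 and 8c63fbc4ed can be sketched as a validation step that returns errors instead of panicking. The error enum and function are hypothetical, and reporting a merge with the root as an error is an assumption; the message above only says it should not panic.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum WriteError {
    /// Only the root commit may have no parents.
    NoParents,
    /// Assumed error for a merge that has the root commit as a parent.
    MergeWithRoot,
}

/// Validates the parent list of a commit that is about to be written.
pub fn check_parents(parents: &[&str], root_id: &str) -> Result<(), WriteError> {
    if parents.is_empty() {
        return Err(WriteError::NoParents);
    }
    if parents.len() > 1 && parents.contains(&root_id) {
        return Err(WriteError::MergeWithRoot);
    }
    Ok(())
}
```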
Martin von Zweigbergk
2b2a9a36d7 git_backend: test conversion of parents, including root
We didn't seem to have any tests showing how we convert the set of
parents, and especially how we handle the root commit, so let's add
some.
2023-02-05 22:52:23 -08:00
Martin von Zweigbergk
985555f393 git_backend: avoid redoing some steps when retrying in write_commit()
By inlining `write_commit_internal()` into `write_commit()`, we can
avoid redoing some steps when we retry. This includes taking the mutex
lock, and reading the tree object and parent commits. It also means
that we avoid cloning the input commit object, which we otherwise
would even in the non-retrying case. I haven't measured if any of this
makes a significant difference, but I think it also slightly
simplifies the code, so it doesn't have to.
2023-01-17 23:12:50 -08:00
Ilya Grigoriev
12ee2b18cd Git backend: Allow simultaneous rebasing of duplicate commits
Fixes https://github.com/martinvonz/jj/issues/27
Fixes https://github.com/martinvonz/jj/issues/694
2023-01-17 21:17:27 -08:00
Yuya Nishihara
ca2e9fe6d1 git: simply use rand::random() to generate ref preventing gc
We don't care about the ref content as long as it is unique, so using threaded
RNG should be fine.

This change means refs/jj/keep will now contain refs of the following
forms:

 - new create_no_gc_ref(): 0f8d6cd9721823906cfb55dac99d7bf5
 - old create_no_gc_ref(): 0f6d93fe-0507-4db8-ad0a-6317f02e27b9
 - prevent_gc(commit_id):  0f9c15100b6f1373f38186357e274a829fb6c4e2
2023-01-14 23:48:02 +09:00
Waleed Khan
af55d17a25 git_backend: propagate various errors
I needed this in the course of debugging an error. Before this commit, the error looked like this:

```
Error: Unexpected error from backend: Object not found
```

After this commit, it looks like this:

```
Error: Unexpected error from backend: Object with CommitId 8f59646bc9bb6bb44b5624f1248f4a708f37003c not found: object not found - no match for id (8f59646bc9bb6bb44b5624f1248f4a708f37003c); class=Odb (9); code=NotFound (-3)
```
2023-01-02 12:28:51 -06:00
Waleed Khan
456be4cc73 backend: create BackendError::InvalidHashLength
Strictly speaking, we could rely on e.g. `git2::Oid::from_str` to produce an error, but I figure that having an explicit error for a mismatching hash length might demystify some error condition in the future, since commit IDs and change IDs and potentially other backends' IDs may have different lengths, so this could flag a mismatch earlier/more obviously.
2023-01-02 12:28:51 -06:00
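
A sketch of the kind of explicit check 456be4cc73 describes. The variant name comes from the commit subject; its fields and the `check_hash_length` helper are assumptions.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum BackendError {
    /// Assumed fields; the real variant may carry different information.
    InvalidHashLength { expected: usize, actual: usize },
}

/// Rejects ids of the wrong length before they reach the underlying store.
pub fn check_hash_length(id_bytes: &[u8], expected: usize) -> Result<(), BackendError> {
    if id_bytes.len() == expected {
        Ok(())
    } else {
        Err(BackendError::InvalidHashLength {
            expected,
            actual: id_bytes.len(),
        })
    }
}
```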
Waleed Khan
7f8a196ab2 backend: create ObjectId trait
This lets us operate over various kinds of objects polymorphically (e.g. call `.hex()` on any kind of object hash).
2023-01-02 12:28:51 -06:00
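
The polymorphism 7f8a196ab2 mentions can be sketched with a trait that provides `.hex()` for any id type. Only the trait name and `.hex()` come from the message; `as_bytes`, the two id structs, and `describe` are illustrative assumptions.

```rust
pub trait ObjectId {
    fn as_bytes(&self) -> &[u8];

    /// Lowercase hex encoding, shared by every kind of id.
    fn hex(&self) -> String {
        self.as_bytes().iter().map(|b| format!("{b:02x}")).collect()
    }
}

pub struct CommitId(pub Vec<u8>);
pub struct ChangeId(pub Vec<u8>);

impl ObjectId for CommitId {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

impl ObjectId for ChangeId {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

/// Works for any id type without knowing which one it is.
pub fn describe(id: &impl ObjectId) -> String {
    format!("object {}", id.hex())
}
```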
Benjamin Saunders
aaa175eca7 lib: replace protobuf crate with prost 2022-12-22 07:04:35 -08:00
Luke Granger-Brown
90ba55bd7b git: cache the extra metadata table
Performance on repositories with many commits is limited somewhat by repeatedly
stating the tablestore directory to work out what the head is. By caching the
table rather than looking it up from disk on every request, we can much more
rapidly satisfy requests. 

This avoids the pathological case in #845 where jj operations take several
minutes to complete.

This patch doesn't change the normal flow of the write path: that will still
always call get_head() on the underlying TableStore, which will stat the
directory before writing out changes. It will however empty the cache when the
metadata has been written.

Fixes #845.
2022-12-17 08:19:14 +00:00
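
The caching strategy in 90ba55bd7b (serve reads from memory, empty the cache after a write) can be sketched as below. `Table`, `CachedTableStore`, and the loader closure are hypothetical stand-ins for the real table store.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory form of the extra metadata table.
pub struct Table {
    pub entries: Vec<(String, String)>,
}

pub struct CachedTableStore<F: Fn() -> Table> {
    /// Stand-in for the expensive lookup that stats the directory.
    load_from_disk: F,
    cached: Mutex<Option<Arc<Table>>>,
}

impl<F: Fn() -> Table> CachedTableStore<F> {
    pub fn new(load_from_disk: F) -> Self {
        CachedTableStore {
            load_from_disk,
            cached: Mutex::new(None),
        }
    }

    /// Returns the cached table, loading it from disk only on a cache miss.
    pub fn get(&self) -> Arc<Table> {
        let mut guard = self.cached.lock().unwrap();
        if let Some(table) = guard.as_ref() {
            return Arc::clone(table);
        }
        let table = Arc::new((self.load_from_disk)());
        *guard = Some(Arc::clone(&table));
        table
    }

    /// Empties the cache; meant to be called after the metadata is written.
    pub fn invalidate(&self) {
        *self.cached.lock().unwrap() = None;
    }
}
```

Repeated calls to `get()` hit the disk once until `invalidate()` runs, which matches the read-heavy pattern the commit targets.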
Martin von Zweigbergk
7f9a0a2820 cleanup: let new Clippy move variables into format strings
I ran an upgraded Clippy on the codebase. All the changes seem to be
about using variables directly in format strings instead of passing
them as separate arguments.
2022-12-14 21:30:58 -08:00
Martin von Zweigbergk
be383cebc7 git: on import, add GC-preventing refs to all seen refs
To prevent git's GC from breaking a repo, we already add a git ref to
commits we create in the git backend. However, we don't add refs to
commits we import from git. This fixes that.

Closes #815.
2022-12-03 22:50:26 -08:00
Martin von Zweigbergk
d8feed9be4 copyright: change from "Google LLC" to "The Jujutsu Authors"
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).

Google employees can read about Google's policy at
go/releasing/contributions#copyright.
2022-11-28 06:05:45 -10:00
Martin von Zweigbergk
780d7fb59c backend: rename NormalFile to just File
There are no "non-normal" files, so "normal" is not needed. We have
symlinks and conflicts, but they are not files, so I think just "file"
is unambiguous.

I left `testutils::write_normal_file()` because there it's used to
mean "not executable file" (there's also a `write_executable_file()`).

I left `working_copy::FileType::Normal` since renaming `Normal` there
to `File` would also suggest we should rename `FileType`, and I don't
know what would be a better name for that type.
2022-11-14 23:36:43 -08:00
Martin von Zweigbergk
3c7c4e9f5c tests: move testutils module into separate crate
The `testutils` module should ideally not be part of the library
dependencies. Since they're used by the integration tests (and the CLI
tests), we need to move them to a separate crate to achieve that.
2022-11-08 07:29:35 -08:00