96 Commits

Author SHA1 Message Date
Waleed Khan
134d85e635 backend: reduce BackendError size somewhat
One of the error types that I later created embedded `BackendError`, but `clippy` complained that the size of the type was too large. This helps address that.
2023-08-23 21:11:15 -07:00
Martin von Zweigbergk
ef5f97f8d7 conflicts: move Merge<T> to merge module
The `merge` module now seems like the obvious place for this type.
2023-08-06 22:08:09 +00:00
Martin von Zweigbergk
ecc030848d conflicts: rename Conflict<T> to Merge<T>
Since `Conflict<T>` can also represent a non-conflict state (a single
term), `Merge<T>` seems like better name.

Thanks to @ilyagr for the suggestion in
https://github.com/martinvonz/jj/pull/1774#discussion_r1257547709

Sorry about the churn. It would have been better if I thought of this
name before I introduced `Conflict<T>`.
2023-08-06 22:08:09 +00:00
Martin von Zweigbergk
84a60d15bc op_store: make ViewId and OperationId implement ObjectId 2023-07-26 14:17:21 -07:00
Martin von Zweigbergk
006c764694 backend: learn to store tree-level conflicts
Tree-level conflicts (#1624) will be stored as multiple trees
associated with a single commit. This patch adds support for that in
`backend::Commit` and in the backends.

When the Git backend writes a tree conflict, it creates a special root
tree for the commit. That tree has only the individual trees from the
conflict as subtrees. That way we prevent the trees from getting
GC'd. We also write the tree ids to the extra metadata table
(i.e. outside of the Git repo) so we don't need to load the tree
object to determine if there are conflicts.

I also added new flag to `backend::Commit` indicating whether the
commit is a new-style commit (with support for tree-level
conflicts). That will help with the migration. We will remove it once
we no longer care about old repos. When the flag is set, we know that
a commit with a single tree cannot have conflicts. When the flag is
not set, it's an old-style commit where we have to walk the whole tree
to find conflicts.
2023-07-19 22:04:16 -07:00
Waleed Khan
54dba51a08 docs: warn about missing docs for jj-lib crate 2023-07-10 18:28:59 +03:00
Yuya Nishihara
cf8a0466c4 backend: introduce error types specific to init/load phases
Errors that may occur while loading backend would vary per backends, and
it's unlikely that these errors could be mapped to BackendError variants
other than BackendError::Other. So let's extract Other(_) of that kind as
a separate type to clarify there would be no other error variants.

Perhaps, Backend/Error will be renamed to CommitBackend/Error or
CommitStore/Error?, whereas I think BackendInit/LoadError can be shared
among store factories.
2023-07-06 20:48:46 +09:00
Yuya Nishihara
e1e75daa8e backend: make BackendError::Other preserve source error object 2023-07-06 20:48:46 +09:00
Martin von Zweigbergk
99226bb96d tree: simplify diff iterator by leveraging Tree::value()
This is much simpler and I was slightly surprised that it doesn't have
much impact on performance. I tried `jj --ignore-working-copy diff -s
--from root --to v5.15` in the Linux kernel repo, and there was
perhaps a 1.5% slowdown (508 ms -> 515 ms). In more normal cases (like
diffing a single commit against its parent), I couldn't measure any
difference at all.
2023-07-06 11:21:21 +02:00
Martin von Zweigbergk
651a3cbe15 rewrite: delete TODOs about labels for each term in a conflict
I don't think we'll want to record a label for each term, because such
labels would get stale, and it seems hard to make them make sense
after transferring a remote to another repo. I think we'll probably
want to infer labels on demand instead (#1176).
2023-07-05 16:50:27 +02:00
Martin von Zweigbergk
6bd13382f4 backend: add a function for setting or removing a tree entry 2023-06-30 14:43:58 +02:00
Martin von Zweigbergk
a95188ddbc backend: take commit to write by value and return new value
The internal backend at Google doesn't let you write any value you
want for in the committer field. The `Store` type still caches the
value it attempted to write, which gets a little weird when the
written value is not what we tried to write. We should use the value
the backend actually wrote. However, we don't know if the backend
changed anything without reading the value back, which is often
wasteful. This commit changes the API to return the written value.

I only changed the signature of `write_commit()` for now. Maybe we
should make a similar change to `write_tree()`.
2023-05-12 15:20:44 -07:00
Martin von Zweigbergk
e7419e76a1 backend: replace git_repo() by as_any()
This has several advantages:

 * Makes it possible to downcast to non-Git custom backends (might be
   useful at Google, but we haven't needed it yet)

 * Lets us access more specific functionality on the `GitBackend`,
   making it possible to access the `git2::Repository` without
   creating a copy of it.

 * Removes the dependency on Git from the backend
2023-05-12 08:05:09 -07:00
Martin von Zweigbergk
a87125d08b backend: rename ConflictPart to ConflictTerm
It took a while before I realized that conflicts could be modeled as
simple algebraic expressions with positive and negative terms (they
were modeled as recursive 3-way conflicts initially). We've been
thinking of them that way for a while now, so let's make the
`ConflictPart` name match that model.
2023-02-17 23:28:50 -08:00
Martin von Zweigbergk
d1dc22d957 backend: let backend decide length of change id
As mentioned in the previous commit, our internal backend at Google
uses a 32-byte long change id. This commit will make us able to use
that.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
e6693d0f68 backend: let backend choose root change id
Our internal backend at Google uses a 32-byte change id, so I'd like
to make the backend able to decide the length. To start with, let's
make the backend able to decide what the root change id should
be. That's consistent with how we already let the backend decide what
the root commit id should be.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
98259346df backend: make hash_length() specifically about commit IDs
The function is currently only about the length of commit IDs, so
let's clarify that. I'm going to add another function for the length
of change IDs next. I don't know if we're going to care about lengths
of other hashes in the future. We might even be able to remove the
current restriction that all commit IDs and all change IDs have the
same length.
2023-02-07 22:31:34 -08:00
Yuya Nishihara
1a4b5c5ee6 index: make IdIndex store raw bytes, not hex bytes
This helps us to migrate commit_id index to ReadonlyIndex. For large
repositories, this also reduces initialization cost, but that's not the main
intent of this change.

https://github.com/martinvonz/jj/pull/1041#issuecomment-1399225876

common_hex_len() and iter_half_bytes() are added to backend.rs since more
call sites will be added to index.rs, and I feel index.rs isn't a good place
to host this kind of utility functions.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
8c0f7d7707 backend: define root change id statically
I made it a free function. Alternatively, the root id could be instantiated
by and obtained through backend, but I don't think we'll need such level of
abstraction.

I'm going to add a workaround for shortest prefix calculation of the root ids,
where this function will be used.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
ef33bd76df backend: declare CHANGE_ID_HASH_LENGTH as constant 2023-01-22 12:03:08 +09:00
Martin von Zweigbergk
8a1b21ff73 backend: implement equality for commits and trees
It can be useful in tests to be able to compare two commits or
trees. Most other structs already implement equality.
2023-01-20 23:26:20 -08:00
Samuel Tardieu
bdaebf33c4 style: do not dereference self to perform pattern-matching
Dereferencing `self` as `*self` in order to perform patten-matching
using `ref` is unnecessary and will be done automatically by the
compiler (match ergonomics, introduced in Rust 1.26).
2023-01-14 19:28:24 +01:00
Waleed Khan
af55d17a25 git_backend: propagate various errors
I needed this in the course of debugging an error. Before this commit, the error looked like this:

```
Error: Unexpected error from backend: Object not found
```

After this commit, it looks like this:

```
Error: Unexpected error from backend: Object with CommitId 8f59646bc9bb6bb44b5624f1248f4a708f37003c not found: object not found - no match for id (8f59646bc9bb6bb44b5624f1248f4a708f37003c); class=Odb (9); code=NotFound (-3)
```
2023-01-02 12:28:51 -06:00
Waleed Khan
e299963fae backend: remove PartialEq/Eq implementations
As soon as we start tracking the `#[source]` for error variants, we won't be able to rely on the presence of `Eq` implementations.
2023-01-02 12:28:51 -06:00
Waleed Khan
456be4cc73 backend: create BackendError::InvalidHashLength
Strictly speaking, we could rely on e.g. `git2::Oid::from_str` to produce an error, but I figure that having an explicit error for a mismatching hash length might demystify some error condition in the future, since commit IDs and change IDs and potentially other backends' IDs may have different lengths, so this could flag a mismatch earlier/more obviously.
2023-01-02 12:28:51 -06:00
Waleed Khan
7f8a196ab2 backend: create ObjectId trait
This lets us operate over various kinds of objects polymorphically (e.g. call `.hex()` on any kind of object hash).
2023-01-02 12:28:51 -06:00
Yuya Nishihara
587e42d65d backend: deduplicate id type declarations by using declarative macro 2022-12-23 23:52:03 +09:00
Yuya Nishihara
b07c0db56b backend: deduplicate id type impls by using declarative macro
It's unlikely we'll need to customize these impls per type, so let's ensure
that these newtypes have identical implementations. This commit also adds
from_hex() to FileId, SymlinkId, and ConflictId.
2022-12-23 23:52:03 +09:00
Martin von Zweigbergk
d8feed9be4 copyright: change from "Google LLC" to "The Jujutsu Authors"
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).

Google employees can read about Google's policy at
go/releasing/contributions#copyright.
2022-11-28 06:05:45 -10:00
Martin von Zweigbergk
780d7fb59c backend: rename NormalFile to just File
There are no "non-normal" files, so "normal" is not needed. We have
symlinks and conflicts, but they are not files, so I think just "file"
is unambiguous.

I left `testutils::write_normal_file()` because there it's used to
mean "not executable file" (there's also a `write_executable_file()`).

I left `working_copy::FileType::Normal` since renaming `Normal` there
to `File` would also suggest we should rename `FileType`, and I don't
know what would be a better name for that type.
2022-11-14 23:36:43 -08:00
Benjamin Saunders
c3bfe72754 local_backend: use ContentHash rather than hashing protos
Insulates identifiers from the unstable serialized form.
2022-11-12 21:40:36 -08:00
Benjamin Saunders
2447dfeed8 simple_op_store: hash view/operation data directly
Decouples view/operation IDs from serialized forms, which are not
necessarily stable. Not breaking as these IDs are persistent, never
recomputed or used for integrity checking.
2022-11-12 21:40:36 -08:00
Martin von Zweigbergk
6703810c6e backend: remove Commit::is_open field from data model 2022-11-05 06:14:37 -07:00
Martin von Zweigbergk
3b3f6129e6 backend: allow negative timestamps in commits and operations
I was reading a draft of "Git Rev News: Edition 91" [1] where Peff
mentions some unfinished patches to allow negative timestamps in
Git. So I figured I should add support for that before I forget. I
haven't checked if libgit2 supports it, so it might be that our Git
backend still doesn't support it after this patch.

 [1] https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-91.md
2022-09-30 00:50:17 -07:00
Martin von Zweigbergk
de7b5cf8b0 repo: write format ("git" or "local") to disk on init
We currently determine if the repo uses the Git backend or the local
backend by checking for presence of a `.jj/repo/store/git_target`
file. To make it easier to add out-of-tree backends, let's instead add
a file that indicates which backend to use.
2022-09-25 09:40:42 -07:00
Martin von Zweigbergk
fb8d087882 backend: make backend aware of root commit
I had made the backends unaware of the virtual root commit because
they don't need to know about it, and we could avoid some duplicated
code by putting that in `Store` instead. However, as we saw in
b21a123bc894, the root commit being virtual has some user-visible
effects (they can't create a merge with the root and some other
commit). So I'm thinking that we may want to make the root commit an
actual commit, depending on which backend is used. Specificially, when
using the Git backend, we cannot record the root commit as an actual
parent since Git would fail when trying to look it up. Backends that
don't need compatibility can make the root commit an actual commit,
however.

This commit therefore makes the backends aware of the root commit. It
makes it remain a virtual commit in the Git backend, and makes it an
actual commit in the `LocalBackend`.

This commit breaks any existing repos using the `LocalBackend`, but
there shouldn't be any such repos other than for testing.
2022-09-20 21:20:57 -07:00
Martin von Zweigbergk
1d9f1720c5 backend: add a Tree::from_hex() helper 2022-09-20 21:20:57 -07:00
Martin von Zweigbergk
89476261c0 cleanup: move {read,write}_conflict() methods earlier in Backend trait
The methods working on conflicts are more closely related to those
working on files and trees, so it makes sense for them to be closer.
2022-05-01 23:35:09 -07:00
Martin von Zweigbergk
f16d2a237b backend: pass in path when reading/writing conflicts as well
We do it for all the other kinds of objects already. It's useful to
have the path for backends that store objects by path (we don't have
any such backends yet). I think the reason I didn't do it from the
beginning was because we had separate `RepoPath` types for files and
directories back then.
2022-03-31 10:23:33 -07:00
Martin von Zweigbergk
0c6d89581e tests: pass timestamps via env vars for reproducible hashes
This patch introduces a `JJ_TIMESTAMP` environment variable that lets
us specify the timestamp to use in tests. It also updates the tests to
use it, which means we get to simplify the tests a lot now that that
the hashes are predictable.
2022-03-05 08:48:42 -08:00
Martin von Zweigbergk
8cf5dd286a backend: make Vec inside CommitId non-public
The recent e5dd93cbf712, whose description says "cleanup: make Vec
inside CommitId etc. non-public", made all ID types in the `backend`
module *except* for `CommitId` non-public :P This patch makes
2021-11-19 23:19:00 -08:00
Martin von Zweigbergk
f2d5aa188d tree: refactor TreeEntriesIterator to look like TreeDiffIterator
Consistency is better, and I think this will make it a little easier
to add support for restricting the iterator by a `Matcher`.
2021-11-14 21:50:10 -08:00
Martin von Zweigbergk
e5dd93cbf7 cleanup: make Vec inside CommitId etc. non-public 2021-11-10 10:46:10 -08:00
Martin von Zweigbergk
fdb861b957 backend: remove unused Commit::is_pruned (#32) 2021-10-06 23:53:15 -07:00
Martin von Zweigbergk
de5bf90675 cleanup: fix issues found by latest rustc and clippy 2021-09-29 10:12:38 -07:00
Martin von Zweigbergk
ce5e95fa80 store: rename Store to Backend and StoreWrapper to Store
For what's currently called `Store` in the code, I have been using
"backend" in plain text. That probably means that `Backend` is a good
name for it.
2021-09-12 12:02:10 -07:00