Git reports a rename source as deleted if the rename target is excluded. I
think that's because Git restricts the search space to the specified paths. For
example, Git doesn't also recognize a rename if the source path is excluded
whereas jj does.
I don't think we need to copy the exact behavior of Git, so this patch just
moves matcher application to earlier stage. This change will help remove
collect_copied_sources().
The added get_copy_records() helper could be moved to jj_lib, but we'll probably
want a stream version of this function in library, and writing a stream adapter
isn't as simple as iterator.
In this patch, I use the number of adds<->removes alternation as a threshold,
which approximates the visual complexity of diff hunks. I don't think user can
choose the threshold intuitively, but we need a config knob to try out some.
I set `max-inline-alternation = 3` locally. 0 and 1 mean "disable inlining"
and "inline adds-only/removes-only lines" respectively.
I've added "diff.<format>" config namespace assuming "ui.diff" will be
reorganized as "ui.diff-formatter" or something. #3327
Some other metrics I've tried:
```
// Per-line alternation. This also works well, but can't measure complexity of
// changes across lines.
fn count_max_diff_alternation_per_line(diff_lines: &[DiffLine]) -> usize {
diff_lines
.iter()
.map(|line| {
let sides = line.hunks.iter().map(|&(side, _)| side);
sides
.filter(|&side| side != DiffLineHunkSide::Both)
.dedup() // omit e.g. left->both->left
.count()
})
.max()
.unwrap_or(0)
}
// Per-line occupancy of changes. Large diffs don't always look complex.
fn max_diff_token_ratio_per_line(diff_lines: &[DiffLine]) -> f32 {
diff_lines
.iter()
.filter_map(|line| {
let [both_len, left_len, right_len] =
line.hunks.iter().fold([0, 0, 0], |mut acc, (side, data)| {
let index = match side {
DiffLineHunkSide::Both => 0,
DiffLineHunkSide::Left => 1,
DiffLineHunkSide::Right => 2,
};
acc[index] += data.len();
acc
});
// left/right-only change is readable
(left_len != 0 && right_len != 0).then(|| {
let diff_len = left_len + right_len;
let total_len = both_len + left_len + right_len;
(diff_len as f32) / (total_len as f32)
})
})
.reduce(f32::max)
.unwrap_or(0.0)
}
// Total occupancy of changes. Large diffs don't always look complex.
fn total_change_ratio(diff_lines: &[DiffLine]) -> f32 {
let (diff_len, total_len) = diff_lines
.iter()
.flat_map(|line| &line.hunks)
.fold((0, 0), |(diff_len, total_len), (side, data)| {
let l = data.len();
match side {
DiffLineHunkSide::Both => (diff_len, total_len + l),
DiffLineHunkSide::Left => (diff_len + l, total_len + l),
DiffLineHunkSide::Right => (diff_len + l, total_len + l),
}
});
(diff_len as f32) / (total_len as f32)
}
```
Though this is needed only for the last line, checking it for each line is
cheap. As I'm going to add another rendering style, the condition to pad "\n"
would become more complicated.
I plan to provide a richer version of `TreeDiffEntry` with copy info
(and to make `TreeDiffEntry` itself "poorer"). Most callers want to
know about copies/renames, but at least working copy implementations
probably don't. This patch adds separate `diff_stream()` and
`diff_stream_with_copies()` so we can provide the simpler interface
for callers that don't need copy info.
This allows us to select rendering function hunk by hunk. For example, a hunk
with lots of small changes could be rendered without interleaving left/right
words. Another good thing is that context line handling can be simplified as
the whole context hunk is available.
I'm going to split color-words diffs to by_line() and by_word() stages.
Perhaps, Diff::default_refinement() can be removed once all non-test callers
are migrated.
I'm thinking of adding some heuristics to render hunks containing lots of
small word changes differently, in a similar manner to the unified diffs. This
patch might help add some pre/post-processing at consumer.
files::diff() is inlined to caller to get around 'self borrowing.
- add support for copy tracking to `diff --stat`
- switch `--summary` to match git's output more closely
- rework `show_diff_summary` signature to be more consistent
- force each diff command to explicitly enable copy tracking
- enable copy tracking in diff_summary
- post-process for diff iterator
- post-process for diff stream
- update changelog
It's not so important, but this removes duplicated "diff" labels from template
output. Perhaps, this also fixes "diff access-denied" label in file-by-file
external diffs.
The inner show_*() functions no longer add "diff" labels, but that's okay
because all CLI callers (except for the templater) use DiffRenderer.
The width parameter is mandatory so it wouldn't fall back to ui.term_width() by
mistake. The API is getting messy and we might want to extract some parameters
to separate struct.
Fixes#4158
This patch adds TreeDiff template type to host formatting options. The main
reason of this API design is that diff formats have various incompatible
parameters, so a single .diff(files, format[, options..]) method would become
messy pretty quickly. Another reason is that we can probably add custom
summary templating support as diff.files().map(|file| file.path()..).
RepoPathUiConverter is passed to templater explicitly because the one stored
in RevsetParseContext is behind Option<_>.
This will help remove lifetimed &dyn Repo from diff object in templater.
Function arguments are reordered in a way that all show_*() functions have
common parameters in the same order.
This helps migrate internal [u8] variables to BStr.
b"" literals in tests are changed to &str to get around potential type
incompatibility between &[u8; N].