This ensures that the data printed through the raw stream is colorized if the
formatter already had color labels, and if the raw data doesn't reset the
surrounding color. This would only matter in templates containing
a label(.., raw_escape_sequence() ..) expression.
Fixes #4631
This appears to be a bit faster if there are tons of unchanged ranges.
```
group                             new                      old
-----                             ---                      ---
bench_diff_git_git_read_tree_c    1.00  58.5±0.12µs        1.07  62.7±0.60µs
bench_diff_lines/modified/10k     1.00  34.2±0.72ms        1.08  37.0±1.09ms
bench_diff_lines/modified/1k      1.00  3.1±0.08ms         1.12  3.5±0.01ms
bench_diff_lines/reversed/10k     1.00  28.0±0.15ms        1.01  28.4±0.51ms
bench_diff_lines/reversed/1k      1.00  616.0±16.20µs      1.00  617.0±9.29µs
bench_diff_lines/unchanged/10k    1.00  3.5±0.04ms         1.10  3.9±0.06ms
bench_diff_lines/unchanged/1k     1.00  328.4±4.44µs       1.07  352.0±1.41µs
```
This adds `raw_escape_sequence(...)` support for things that use
FormatRecorder like wrapped text / `fill(...)` / `indent(...)`.
Change-Id: Id00000004248b10feb2acd54d90115b783fac0ff
These flags only apply to line-based diffs. This is easy to implement, and it
still seems useful to highlight whitespace changes (that could be ignored by
line diffing.)
I've added short options only to "diff"-like commands. Their meaning would seem
unclear if they were added to deeply-nested commands such as "op log".
Closes #3781
We're likely to use the right (or new) context lines in rendered diffs, but
it's odd that the hunks iterator chooses which context hunk to return. We'll
also need both contents to calculate left/right line numbers.
Since the hunk content types are the same, I also split enum DiffHunk into
a { kind, contents } pair.
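As a rough illustration of the { kind, contents } shape, here is a minimal sketch. The field types, the assumption of exactly two sides, and the `count_lines()` helper are hypothetical and only show why having both contents available makes left/right line counting straightforward; this is not the actual jj_lib definition.

```rust
/// Whether all sides of the hunk are identical (context) or differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DiffHunkKind {
    Matching,
    Different,
}

/// Hypothetical hunk type: the kind plus one content slice per side.
#[derive(Clone, Debug, PartialEq, Eq)]
struct DiffHunk<'a> {
    kind: DiffHunkKind,
    contents: Vec<&'a [u8]>,
}

fn count_newlines(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == b'\n').count()
}

/// Counts newlines on the left and right sides across all hunks.
/// Assumes two-sided hunks; hunks with another number of sides are skipped.
fn count_lines(hunks: &[DiffHunk<'_>]) -> (usize, usize) {
    let mut left = 0;
    let mut right = 0;
    for hunk in hunks {
        if let [l, r] = hunk.contents.as_slice() {
            left += count_newlines(l);
            right += count_newlines(r);
        }
    }
    (left, right)
}
```

Since a matching hunk carries both sides as well, a consumer can pick either side for rendering while still advancing both line counters.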
I'm testing simple conflicts diffs locally, and we'll probably need to handle
consecutive context hunks when we add some form of unmaterialized conflicts
diffs. Let's buffer context hunks (up to 1 right now.) The new code looks
simpler.
FileConflict will be changed to not materialize Merge<BString>. I also updated
the revset engine to ignore non-file conflicts. It doesn't make sense to grep
a conflict description.
Not all callers need this information, but I assumed it's relatively cheap to
look up the source path in the target tree compared to diffing.
This could be represented as Regular(_)|Copied(_, _)|Renamed(_, _), but it's
a bit weird if Copied and Renamed were separate variants. Instead, I decided
to wrap copy metadata in Option.
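The Option-based representation could look roughly like the sketch below. The `EntryPath` name, the use of `String` for paths, and the accessor names are assumptions made for illustration, not the actual types in the patch.

```rust
/// How the target came to exist from its source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CopyOperation {
    Copy,
    Rename,
}

/// Hypothetical diff entry path: copy metadata is optional instead of
/// being spread over Regular/Copied/Renamed variants.
#[derive(Clone, Debug, PartialEq, Eq)]
struct EntryPath {
    source: Option<(String, CopyOperation)>,
    target: String,
}

impl EntryPath {
    /// Returns the source path, which equals the target for regular entries.
    fn source(&self) -> &str {
        match &self.source {
            Some((path, _)) => path,
            None => &self.target,
        }
    }

    /// Returns the copy operation, if the entry was copied or renamed.
    fn copy_operation(&self) -> Option<CopyOperation> {
        self.source.as_ref().map(|(_, op)| *op)
    }
}
```

With this shape, callers that don't care about copies can read the target only, and callers that do can branch on a single `Option` rather than on three variants.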
This patch adds accessor methods as I'm going to change the underlying data
types. Since entry values are consumed separately, these methods are implemented
on CopiesTreeDiffEntryPath, not on *TreeDiffEntry.
Git reports a rename source as deleted if the rename target is excluded. I
think that's because Git restricts the search space to the specified paths. For
example, Git also doesn't recognize a rename if the source path is excluded
whereas jj does.
I don't think we need to copy the exact behavior of Git, so this patch just
moves matcher application to an earlier stage. This change will help remove
collect_copied_sources().
The added get_copy_records() helper could be moved to jj_lib, but we'll probably
want a stream version of this function in the library, and writing a stream
adapter isn't as simple as writing an iterator adapter.
In this patch, I use the number of adds<->removes alternations as a threshold,
which approximates the visual complexity of diff hunks. I don't think users can
choose the threshold intuitively, but we need a config knob to try out some
values.
I set `max-inline-alternation = 3` locally. 0 and 1 mean "disable inlining"
and "inline adds-only/removes-only lines" respectively.
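The alternation threshold can be sketched as follows. This is a simplified illustration and not the code in this patch: the `Side` enum and both function names are hypothetical, and the input is assumed to be the flat sequence of hunk sides within one diff hunk.

```rust
/// Which side(s) a piece of a diff hunk belongs to.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Side {
    Both,
    Left,
    Right,
}

/// Counts runs of left/right changes, ignoring unchanged pieces, so that
/// e.g. left -> both -> left counts as a single run.
fn count_alternation(sides: &[Side]) -> usize {
    let mut count = 0;
    let mut prev = None;
    for &side in sides {
        if side == Side::Both {
            continue;
        }
        if prev != Some(side) {
            count += 1;
            prev = Some(side);
        }
    }
    count
}

/// Decides whether a hunk is simple enough to render inline.
fn should_inline(sides: &[Side], max_inline_alternation: usize) -> bool {
    count_alternation(sides) <= max_inline_alternation
}
```

Under this sketch, a changed hunk always counts at least 1, so a threshold of 0 never inlines, and a threshold of 1 inlines only hunks whose changes are all on one side.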
I've added "diff.<format>" config namespace assuming "ui.diff" will be
reorganized as "ui.diff-formatter" or something. #3327
Some other metrics I've tried:
```
// Per-line alternation. This also works well, but can't measure complexity of
// changes across lines.
fn count_max_diff_alternation_per_line(diff_lines: &[DiffLine]) -> usize {
    diff_lines
        .iter()
        .map(|line| {
            let sides = line.hunks.iter().map(|&(side, _)| side);
            sides
                .filter(|&side| side != DiffLineHunkSide::Both)
                .dedup() // omit e.g. left->both->left
                .count()
        })
        .max()
        .unwrap_or(0)
}

// Per-line occupancy of changes. Large diffs don't always look complex.
fn max_diff_token_ratio_per_line(diff_lines: &[DiffLine]) -> f32 {
    diff_lines
        .iter()
        .filter_map(|line| {
            let [both_len, left_len, right_len] =
                line.hunks.iter().fold([0, 0, 0], |mut acc, (side, data)| {
                    let index = match side {
                        DiffLineHunkSide::Both => 0,
                        DiffLineHunkSide::Left => 1,
                        DiffLineHunkSide::Right => 2,
                    };
                    acc[index] += data.len();
                    acc
                });
            // left/right-only change is readable
            (left_len != 0 && right_len != 0).then(|| {
                let diff_len = left_len + right_len;
                let total_len = both_len + left_len + right_len;
                (diff_len as f32) / (total_len as f32)
            })
        })
        .reduce(f32::max)
        .unwrap_or(0.0)
}

// Total occupancy of changes. Large diffs don't always look complex.
fn total_change_ratio(diff_lines: &[DiffLine]) -> f32 {
    let (diff_len, total_len) = diff_lines
        .iter()
        .flat_map(|line| &line.hunks)
        .fold((0, 0), |(diff_len, total_len), (side, data)| {
            let l = data.len();
            match side {
                DiffLineHunkSide::Both => (diff_len, total_len + l),
                DiffLineHunkSide::Left => (diff_len + l, total_len + l),
                DiffLineHunkSide::Right => (diff_len + l, total_len + l),
            }
        });
    (diff_len as f32) / (total_len as f32)
}
```
Though this is needed only for the last line, checking it for each line is
cheap. As I'm going to add another rendering style, the condition to pad "\n"
would become more complicated.
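A minimal sketch of the per-line check, assuming a plain `std::io::Write` sink rather than the actual formatter type; the `write_line_padded()` name is hypothetical:

```rust
use std::io::{self, Write};

/// Writes a line and pads it with "\n" if it doesn't already end with one.
/// Only the last line of a file can lack the newline, but the check is cheap
/// enough to run on every line.
fn write_line_padded(out: &mut impl Write, line: &[u8]) -> io::Result<()> {
    out.write_all(line)?;
    if !line.ends_with(b"\n") {
        out.write_all(b"\n")?;
    }
    Ok(())
}
```

Keeping the check inside the line writer means each rendering style can call the same function without tracking whether it is at the last line.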
I plan to provide a richer version of `TreeDiffEntry` with copy info
(and to make `TreeDiffEntry` itself "poorer"). Most callers want to
know about copies/renames, but at least working copy implementations
probably don't. This patch adds separate `diff_stream()` and
`diff_stream_with_copies()` so we can provide the simpler interface
for callers that don't need copy info.
This allows us to select a rendering function hunk by hunk. For example, a hunk
with lots of small changes could be rendered without interleaving left/right
words. Another good thing is that context line handling can be simplified as
the whole context hunk is available.
I'm going to split color-words diffs into by_line() and by_word() stages.
Perhaps Diff::default_refinement() can be removed once all non-test callers
are migrated.
I'm thinking of adding some heuristics to render hunks containing lots of
small word changes differently, in a similar manner to the unified diffs. This
patch might help add some pre/post-processing on the consumer side.
files::diff() is inlined into the caller to get around 'self borrowing.