All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/8] gitweb: Highlight interesting parts of diff
@ 2012-03-23 22:56 Michał Kiedrowicz
  2012-03-23 22:56 ` [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements Michał Kiedrowicz
                   ` (7 more replies)
  0 siblings, 8 replies; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

Reading diff output is sometimes very hard, even if it's colored, especially if
lines differ only in few characters. This is often true when a commit fixes a
typo or renames some variables or functions.

This patch series teaches gitweb to highlight fragments that are different
between old and new line. This should mimic the same feature in Trac or GitHub.

Changes since v1:

* Rebased on origin/master.

* Added two patches at the beginning of the series.  First one is a small fix
  and second one adds a feature to esc_html_hl_regions() that will be needed
  by later patches.

* Squashed patch "gitweb: Format diff lines just before printing" and reworked
  changes to format_diff_line().  Now it accepts both raw line that needs to be
  html_esc()'ed and already escaped line.

* Squashed changes in CSS with previous patch as proposed by Jakub.

* Dropped esc_html_mark_change() and started using esc_html_hl_regions()
  because it was merged to origin/master.

* Fixed highlightning for combined diffs with more than two parents.

* Fixed indentation (tabs -> spaces) in conditions.

* Ensured that patches with "\ No newline at end of file" are properly
  displayed.

* Ensured that binary patches are not supported in HTML format, thus this
  series canot break it :) (answering Jakub's questions) 

* Reworded some commit messages and fixed typos.

* Renamed few variables after comments from Jakub.

* Added some comments to the code.

* Added Acked-by from Jakub for few patches.

Michał Kiedrowicz (8):
  gitweb: esc_html_hl_regions(): Don't create empty <span> elements
  gitweb: Pass esc_html_hl_regions() options to esc_html()
  gitweb: Extract print_sidebyside_diff_lines()
  gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
  gitweb: Move HTML-formatting diff line back to process_diff_line()
  gitweb: Push formatting diff lines to print_diff_chunk()
  gitweb: Highlight interesting parts of diff
  gitweb: Refinement highlightning in combined diffs

 gitweb/gitweb.perl       |  299 +++++++++++++++++++++++++++++++++-------------
 gitweb/static/gitweb.css |    8 ++
 2 files changed, 225 insertions(+), 82 deletions(-)

-- 
1.7.8.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-24 18:58   ` Jakub Narebski
  2012-03-23 22:56 ` [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html() Michał Kiedrowicz
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

if $s->[1] is equal to or less than $s->[0], esc_html_hl_regions()
generates an empty <span> element.  It normally shouldn't be visible in
the web broweser, but it doesn't look good when looking at page source.
It also minimally increases generated page size for no special reason.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
---

I don't know if any code currently can produce empty <span> elements,
but diff refinement highlighning does it.

 gitweb/gitweb.perl |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index a8b5fad..af645e5 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1738,6 +1738,9 @@ sub esc_html_hl_regions {
 	my $pos = 0;
 
 	for my $s (@sel) {
+		# Don't craete empty <span> elements.
+		next if $s->[1] <= $s->[0];
+
 		$out .= esc_html(substr($str, $pos, $s->[0] - $pos))
 			if ($s->[0] - $pos > 0);
 		$out .= $cgi->span({-class => $css_class},
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html()
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
  2012-03-23 22:56 ` [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-24 19:15   ` Jakub Narebski
  2012-03-23 22:56 ` [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines() Michał Kiedrowicz
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

With this change, esc_html_hl_regions() accepts options and passes them
down to esc_html().  This may be needed if a caller wants to pass
-nbsp=>1 to esc_html().

The idea and implementation example of this change was described in
337da8d2 (gitweb: Introduce esc_html_match_hl and esc_html_hl_regions,
2012-02-27).  While other suggestions may be more useful in some cases,
there is no need to implement them at the moment.  The
esc_html_hl_regions() interface may be changed later if it's needed.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
---

Jakub, The code was in fact stolen from a yours patch posted to the
list (gitweb: Use esc_html_match_hl() in 'grep' search) that wasn't
merged so maybe I should pass the authorship to you?

 gitweb/gitweb.perl |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index af645e5..1744c40 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1732,7 +1732,9 @@ sub chop_and_escape_str {
 # '<span class="mark">foo</span>bar'
 sub esc_html_hl_regions {
 	my ($str, $css_class, @sel) = @_;
-	return esc_html($str) unless @sel;
+	my %opts = grep { ref($_) ne 'ARRAY' } @sel;
+	@sel     = grep { ref($_) eq 'ARRAY' } @sel;
+	return esc_html($str, %opts) unless @sel;
 
 	my $out = '';
 	my $pos = 0;
@@ -1741,14 +1743,14 @@ sub esc_html_hl_regions {
 		# Don't craete empty <span> elements.
 		next if $s->[1] <= $s->[0];
 
-		$out .= esc_html(substr($str, $pos, $s->[0] - $pos))
+		$out .= esc_html(substr($str, $pos, $s->[0] - $pos), %opts)
 			if ($s->[0] - $pos > 0);
 		$out .= $cgi->span({-class => $css_class},
-		                   esc_html(substr($str, $s->[0], $s->[1] - $s->[0])));
+		                   esc_html(substr($str, $s->[0], $s->[1] - $s->[0]), %opts));
 
 		$pos = $s->[1];
 	}
-	$out .= esc_html(substr($str, $pos))
+	$out .= esc_html(substr($str, $pos), %opts)
 		if ($pos < length($str));
 
 	return $out;
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines()
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
  2012-03-23 22:56 ` [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements Michał Kiedrowicz
  2012-03-23 22:56 ` [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html() Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-28 14:33   ` Jakub Narebski
  2012-03-23 22:56 ` [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs Michał Kiedrowicz
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

Currently, print_sidebyside_diff_chunk() does two things: it
accumulates diff lines and prints them.  Accumulation may be used to
perform additional operations on diff lines,  so it makes sense to split
these two things.  Thus, the code that prints diff lines in a side-by-side
manner is moved out of print_sidebyside_diff_chunk() to a separate
subroutine.

The outcome of this patch is that print_sidebyside_diff_chunk() is now
much shorter and easier to read.

This is a preparation patch for diff refinement highlightning.  It should
not change the gitweb output, but it slightly changes its behavior.
Before this commit, context is printed on the class change. Now, it'it
printed just before printing added and removed lines.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
---
 gitweb/gitweb.perl |   97 ++++++++++++++++++++++++++++------------------------
 1 files changed, 52 insertions(+), 45 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 1744c40..86d5a02 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -4998,6 +4998,53 @@ sub git_difftree_body {
 	print "</table>\n";
 }
 
+# Print context lines and then rem/add lines in a side-by-side manner.
+sub print_sidebyside_diff_lines {
+	my ($ctx, $rem, $add) = @_;
+
+	# print context block before add/rem block
+	if (@$ctx) {
+		print join '',
+			'<div class="chunk_block ctx">',
+				'<div class="old">',
+				@$ctx,
+				'</div>',
+				'<div class="new">',
+				@$ctx,
+				'</div>',
+			'</div>';
+	}
+
+	if (!@$add) {
+		# pure removal
+		print join '',
+			'<div class="chunk_block rem">',
+				'<div class="old">',
+				@$rem,
+				'</div>',
+			'</div>';
+	} elsif (!@$rem) {
+		# pure addition
+		print join '',
+			'<div class="chunk_block add">',
+				'<div class="new">',
+				@$add,
+				'</div>',
+			'</div>';
+	} else {
+		# assume that it is change
+		print join '',
+			'<div class="chunk_block chg">',
+				'<div class="old">',
+				@$rem,
+				'</div>',
+				'<div class="new">',
+				@$add,
+				'</div>',
+			'</div>';
+	}
+}
+
 sub print_sidebyside_diff_chunk {
 	my @chunk = @_;
 	my (@ctx, @rem, @add);
@@ -5024,51 +5071,11 @@ sub print_sidebyside_diff_chunk {
 			next;
 		}
 
-		## print from accumulator when type of class of lines change
-		# empty contents block on start rem/add block, or end of chunk
-		if (@ctx && (!$class || $class eq 'rem' || $class eq 'add')) {
-			print join '',
-				'<div class="chunk_block ctx">',
-					'<div class="old">',
-					@ctx,
-					'</div>',
-					'<div class="new">',
-					@ctx,
-					'</div>',
-				'</div>';
-			@ctx = ();
-		}
-		# empty add/rem block on start context block, or end of chunk
-		if ((@rem || @add) && (!$class || $class eq 'ctx')) {
-			if (!@add) {
-				# pure removal
-				print join '',
-					'<div class="chunk_block rem">',
-						'<div class="old">',
-						@rem,
-						'</div>',
-					'</div>';
-			} elsif (!@rem) {
-				# pure addition
-				print join '',
-					'<div class="chunk_block add">',
-						'<div class="new">',
-						@add,
-						'</div>',
-					'</div>';
-			} else {
-				# assume that it is change
-				print join '',
-					'<div class="chunk_block chg">',
-						'<div class="old">',
-						@rem,
-						'</div>',
-						'<div class="new">',
-						@add,
-						'</div>',
-					'</div>';
-			}
-			@rem = @add = ();
+		## print from accumulator when have some add/rem lines or end
+		# of chunk (flush context lines)
+		if (((@rem || @add) && $class eq 'ctx') || !$class) {
+			print_sidebyside_diff_lines(\@ctx, \@rem, \@add);
+			@ctx = @rem = @add = ();
 		}
 
 		## adding lines to accumulator
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
                   ` (2 preceding siblings ...)
  2012-03-23 22:56 ` [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines() Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-28 15:56   ` Jakub Narebski
  2012-03-23 22:56 ` [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line() Michał Kiedrowicz
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

This renames print_sidebyside_diff_chunk() to print_diff_chunk() and
makes use of it for both side-by-side and inline diffs.  Now diff lines
are always accumulated before they are printed.  This opens the
possibility to preprocess diff output before it's printed, which is
needed for diff refinement highlightning (implemented in incoming
patches).

If left as is, the new function print_inline_diff_lines() could reorder
diff lines.  It first prints all context lines, then all removed lines
and finally all added lines.  If the diff output consisted of mixed added
and removed lines, gitweb would reorder these lines.  This is true for
combined diff output, for example:

	 - removed line for first parent
	 + added line for first parent
	  -removed line for second parent
	 ++added line for both parents

would be rendered as:

	- removed line for first parent
	 -removed line for second parent
	+ added line for first parent
	++added line for both parents

To prevent gitweb from reordering lines, print_diff_chunk() calls
print_diff_lines() as soon as it detects that both added and removed
lines are present and there was a class change.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
---
 gitweb/gitweb.perl |   55 +++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 86d5a02..1a2f258 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -5045,10 +5045,34 @@ sub print_sidebyside_diff_lines {
 	}
 }
 
-sub print_sidebyside_diff_chunk {
-	my @chunk = @_;
+# Print context lines and then rem/add lines in inline manner.
+sub print_inline_diff_lines {
+	my ($ctx, $rem, $add) = @_;
+
+	print foreach (@$ctx);
+	print foreach (@$rem);
+	print foreach (@$add);
+}
+
+# Print context lines and then rem/add lines.
+sub print_diff_lines {
+	my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
+
+	if ($diff_style eq 'sidebyside' && !$is_combined) {
+		print_sidebyside_diff_lines($ctx, $rem, $add);
+	} else {
+		# default 'inline' style and unknown styles
+		print_inline_diff_lines($ctx, $rem, $add);
+	}
+}
+
+sub print_diff_chunk {
+	my ($diff_style, $is_combined, @chunk) = @_;
 	my (@ctx, @rem, @add);
 
+	# The class of the previous line. 
+	my $prev_class = '';
+
 	return unless @chunk;
 
 	# incomplete last line might be among removed or added lines,
@@ -5072,9 +5096,13 @@ sub print_sidebyside_diff_chunk {
 		}
 
 		## print from accumulator when have some add/rem lines or end
-		# of chunk (flush context lines)
-		if (((@rem || @add) && $class eq 'ctx') || !$class) {
-			print_sidebyside_diff_lines(\@ctx, \@rem, \@add);
+		# of chunk (flush context lines), or when have add and rem
+		# lines and new block is reached (otherwise add/rem lines could
+		# be reordered)
+		if (((@rem || @add) && $class eq 'ctx') || !$class ||
+		    (@rem && @add && $class ne $prev_class)) {
+			print_diff_lines(\@ctx, \@rem, \@add, $diff_style,
+					$is_combined);
 			@ctx = @rem = @add = ();
 		}
 
@@ -5091,6 +5119,8 @@ sub print_sidebyside_diff_chunk {
 		if ($class eq 'ctx') {
 			push @ctx, $line;
 		}
+
+		$prev_class = $class;
 	}
 }
 
@@ -5217,22 +5247,17 @@ sub git_patchset_body {
 			$diff_classes .= " $class" if ($class);
 			$line = "<div class=\"$diff_classes\">$line</div>\n";
 
-			if ($diff_style eq 'sidebyside' && !$is_combined) {
-				if ($class eq 'chunk_header') {
-					print_sidebyside_diff_chunk(@chunk);
-					@chunk = ( [ $class, $line ] );
-				} else {
-					push @chunk, [ $class, $line ];
-				}
+			if ($class eq 'chunk_header') {
+				print_diff_chunk($diff_style, $is_combined, @chunk);
+				@chunk = ( [ $class, $line ] );
 			} else {
-				# default 'inline' style and unknown styles
-				print $line;
+				push @chunk, [ $class, $line ];
 			}
 		}
 
 	} continue {
 		if (@chunk) {
-			print_sidebyside_diff_chunk(@chunk);
+			print_diff_chunk($diff_style, $is_combined, @chunk);
 			@chunk = ();
 		}
 		print "</div>\n"; # class="patch"
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line()
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
                   ` (3 preceding siblings ...)
  2012-03-23 22:56 ` [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-29 16:14   ` Jakub Narebski
  2012-03-23 22:56 ` [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk() Michał Kiedrowicz
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

Commit 6ba1eb51b (gitweb: Add a feature to show side-by-side diff,
2011-10-31) for no special reason moved wrapping diff line in <div> out
of format_diff_line(). Bring back old behavior.

This simplifies code in git_patchset_body() and keeps formatting of a
diff line in one place.

The more long-term purpose of this patch is to move formatting diff
lines down to print_diff_chunk(), to allow processing lines without
HTML-formatting.

This is just a refactoring patch. It's not meant to change gitweb
output.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
---
 gitweb/gitweb.perl |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 1a2f258..aacf518 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2438,14 +2438,17 @@ sub process_diff_line {
 
 	if ($from && $to && $line =~ m/^\@{2} /) {
 		$line = format_unidiff_chunk_header($line, $from, $to);
-		return $diff_class, $line;
-
 	} elsif ($from && $to && $line =~ m/^\@{3}/) {
 		$line = format_cc_diff_chunk_header($line, $from, $to);
-		return $diff_class, $line;
-
+	} else {
+		$line = esc_html($line, -nbsp=>1);
 	}
-	return $diff_class, esc_html($line, -nbsp=>1);
+
+	my $diff_classes = "diff";
+	$diff_classes .= " $diff_class" if ($diff_class);
+	$line = "<div class=\"$diff_classes\">$line</div>\n";
+
+	return $diff_class, $line;
 }
 
 # Generates undef or something like "_snapshot_" or "snapshot (_tbz2_ _zip_)",
@@ -5243,9 +5246,6 @@ sub git_patchset_body {
 			next PATCH if ($patch_line =~ m/^diff /);
 
 			my ($class, $line) = process_diff_line($patch_line, \%from, \%to);
-			my $diff_classes = "diff";
-			$diff_classes .= " $class" if ($class);
-			$line = "<div class=\"$diff_classes\">$line</div>\n";
 
 			if ($class eq 'chunk_header') {
 				print_diff_chunk($diff_style, $is_combined, @chunk);
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk()
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
                   ` (4 preceding siblings ...)
  2012-03-23 22:56 ` [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line() Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-29 16:59   ` Jakub Narebski
  2012-03-23 22:56 ` [PATCH v2 7/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
  2012-03-23 22:56 ` [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs Michał Kiedrowicz
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

Now git_patchset_body() only calls diff_line_class(), which is removed
from process_diff_line(). The latter function is renamed to
format_diff_line() and its output is changed to return only
HTML-formatted line, which brings it in line with outher format_*
subroutined.

This slightly changes the order of operations performed on diff lines.
Before this commit, each read line was formatted and then put to the
@chunk accumulator. Now, lines are formatted inside print_diff_chunk(),

This is a preparation patch for diff refinement highlightning. It's not
meant to change gitweb output.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
---
 gitweb/gitweb.perl |   25 ++++++++++++-------------
 1 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index aacf518..db32588 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2426,12 +2426,9 @@ sub format_cc_diff_chunk_header {
 }
 
 # process patch (diff) line (not to be used for diff headers),
-# returning class and HTML-formatted (but not wrapped) line
-sub process_diff_line {
-	my $line = shift;
-	my ($from, $to) = @_;
-
-	my $diff_class = diff_line_class($line, $from, $to);
+# returning HTML-formatted (but not wrapped) line
+sub format_diff_line {
+	my ($line, $diff_class, $from, $to) = @_;
 
 	chomp $line;
 	$line = untabify($line);
@@ -2448,7 +2445,7 @@ sub process_diff_line {
 	$diff_classes .= " $diff_class" if ($diff_class);
 	$line = "<div class=\"$diff_classes\">$line</div>\n";
 
-	return $diff_class, $line;
+	return $line;
 }
 
 # Generates undef or something like "_snapshot_" or "snapshot (_tbz2_ _zip_)",
@@ -5070,7 +5067,7 @@ sub print_diff_lines {
 }
 
 sub print_diff_chunk {
-	my ($diff_style, $is_combined, @chunk) = @_;
+	my ($diff_style, $is_combined, $from, $to, @chunk) = @_;
 	my (@ctx, @rem, @add);
 
 	# The class of the previous line. 
@@ -5092,6 +5089,8 @@ sub print_diff_chunk {
 	foreach my $line_info (@chunk) {
 		my ($class, $line) = @$line_info;
 
+		$line = format_diff_line($line, $class, $from, $to);
+
 		# print chunk headers
 		if ($class && $class eq 'chunk_header') {
 			print $line;
@@ -5245,19 +5244,19 @@ sub git_patchset_body {
 
 			next PATCH if ($patch_line =~ m/^diff /);
 
-			my ($class, $line) = process_diff_line($patch_line, \%from, \%to);
+			my $class = diff_line_class($patch_line, \%from, \%to);
 
 			if ($class eq 'chunk_header') {
-				print_diff_chunk($diff_style, $is_combined, @chunk);
-				@chunk = ( [ $class, $line ] );
+				print_diff_chunk($diff_style, $is_combined, \%from, \%to, @chunk);
+				@chunk = ( [ $class, $patch_line ] );
 			} else {
-				push @chunk, [ $class, $line ];
+				push @chunk, [ $class, $patch_line ];
 			}
 		}
 
 	} continue {
 		if (@chunk) {
-			print_diff_chunk($diff_style, $is_combined, @chunk);
+			print_diff_chunk($diff_style, $is_combined, \%from, \%to, @chunk);
 			@chunk = ();
 		}
 		print "</div>\n"; # class="patch"
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 7/8] gitweb: Highlight interesting parts of diff
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
                   ` (5 preceding siblings ...)
  2012-03-23 22:56 ` [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk() Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-29 19:42   ` Jakub Narebski
  2012-03-23 22:56 ` [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs Michał Kiedrowicz
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

Reading diff output is sometimes very hard, even if it's colored,
especially if lines differ only in few characters.  This is often true
when a commit fixes a typo or renames some variables or functions.

This commit teaches gitweb to highlight characters that are different
between old and new line with a light green/red background.  This should
work in the similar manner as in Trac or GitHub.

The algorithm that compares lines is based on contrib/diff-highlight.
Basically, it works by determining common prefix/suffix of corresponding
lines and highlightning only the middle part of lines.  For more
information, see contrib/diff-highlight/README.

Combined diffs are not supported but a following commit will change it.

Two changes in format_diff_line() were needed to allow diff refinement
highlightning to work.  Firstly, format_diff_line() was taught to not
esc_html() line that was passed as a reference.  This was needed because
refining the highlight must be performed on unescaped lines and it uses
a <span> element to mark interesting parts of the line.  Secondly, the
lines are untabified before comparing because calling untabify()
after inserting <span>'s could improperly convert tabs to spaces.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
---
 gitweb/gitweb.perl       |   84 ++++++++++++++++++++++++++++++++++++++++++----
 gitweb/static/gitweb.css |    8 ++++
 2 files changed, 85 insertions(+), 7 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index db32588..872ba12 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2426,14 +2426,14 @@ sub format_cc_diff_chunk_header {
 }
 
 # process patch (diff) line (not to be used for diff headers),
-# returning HTML-formatted (but not wrapped) line
+# returning HTML-formatted (but not wrapped) line.
+# If the line is already esc_html()'ed, pass it as a reference.
 sub format_diff_line {
 	my ($line, $diff_class, $from, $to) = @_;
 
-	chomp $line;
-	$line = untabify($line);
-
-	if ($from && $to && $line =~ m/^\@{2} /) {
+	if (ref($line)) {
+		$line = $$line;
+	} elsif ($from && $to && $line =~ m/^\@{2} /) {
 		$line = format_unidiff_chunk_header($line, $from, $to);
 	} elsif ($from && $to && $line =~ m/^\@{3}/) {
 		$line = format_cc_diff_chunk_header($line, $from, $to);
@@ -5054,10 +5054,80 @@ sub print_inline_diff_lines {
 	print foreach (@$add);
 }
 
+# Format removed and added line, mark changed part and HTML-format them.
+# Impementation is based on contrib/diff-highlight
+sub format_rem_add_line {
+	my ($rem, $add) = @_;
+	my @rem = split(//, $rem);
+	my @add = split(//, $add);
+	my ($esc_rem, $esc_add);
+	# Ignore +/- character, thus $prefix_len is set to 1.
+	my ($prefix_len, $suffix_len) = (1, 0);
+	my ($prefix_is_space, $suffix_is_space) = (1, 1);
+
+	while ($prefix_len < @rem && $prefix_len < @add) {
+		last if ($rem[$prefix_len] ne $add[$prefix_len]);
+
+		$prefix_is_space = 0 if ($rem[$prefix_len] !~ /\s/);
+		$prefix_len++;
+	}
+
+	my $shorter = (@rem < @add) ? @rem : @add;
+	while ($prefix_len + $suffix_len < $shorter) {
+		last if ($rem[-1 - $suffix_len] ne $add[-1 - $suffix_len]);
+
+		$suffix_is_space = 0 if ($rem[-1 - $suffix_len] !~ /\s/);
+		$suffix_len++;
+	}
+
+	# Mark lines that are different from each other, but have some common
+	# part that isn't whitespace.  If lines are completely different, don't
+	# mark them because that would make output unreadable, especially if
+	# diff consists of multiple lines.
+	if (($prefix_len == 1 && $suffix_len == 0) ||
+	    ($prefix_is_space && $suffix_is_space)) {
+		$esc_rem = esc_html($rem, -nbsp=>1);
+		$esc_add = esc_html($add, -nbsp=>1);
+	} else {
+		$esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len, @rem - $suffix_len], -nbsp=>1);
+		$esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len, @add - $suffix_len], -nbsp=>1);
+	}
+
+	return format_diff_line(\$esc_rem, 'rem'),
+	        format_diff_line(\$esc_add, 'add');
+}
+
+# HTML-format diff context, removed and added lines.
+sub format_ctx_rem_add_lines {
+	my ($ctx, $rem, $add, $is_combined) = @_;
+	my (@new_ctx, @new_rem, @new_add);
+
+	# Highlight if every removed line has a corresponding added line.
+	# Combined diffs are not supported ATM.
+	if (!$is_combined && @$add > 0 && @$add == @$rem) {
+		for (my $i = 0; $i < @$add; $i++) {
+			my ($line_rem, $line_add) = format_rem_add_line(
+			        $rem->[$i], $add->[$i]);
+			push @new_rem, $line_rem;
+			push @new_add, $line_add;
+		}
+	} else {
+		@new_rem = map { format_diff_line($_, 'rem') } @$rem;
+		@new_add = map { format_diff_line($_, 'add') } @$add;
+	}
+
+	@new_ctx = map { format_diff_line($_, 'ctx') } @$ctx;
+
+	return (\@new_ctx, \@new_rem, \@new_add);
+}
+
 # Print context lines and then rem/add lines.
 sub print_diff_lines {
 	my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
 
+	($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add,
+		$is_combined);
+
 	if ($diff_style eq 'sidebyside' && !$is_combined) {
 		print_sidebyside_diff_lines($ctx, $rem, $add);
 	} else {
@@ -5089,11 +5159,11 @@ sub print_diff_chunk {
 	foreach my $line_info (@chunk) {
 		my ($class, $line) = @$line_info;
 
-		$line = format_diff_line($line, $class, $from, $to);
+		$line = untabify($line);
 
 		# print chunk headers
 		if ($class && $class eq 'chunk_header') {
-			print $line;
+			print format_diff_line($line, $class, $from, $to);
 			next;
 		}
 
diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
index c530355..3c4a3c9 100644
--- a/gitweb/static/gitweb.css
+++ b/gitweb/static/gitweb.css
@@ -438,6 +438,10 @@ div.diff.add {
 	color: #008800;
 }
 
+div.diff.add span.marked {
+	background-color: #77ff77;
+}
+
 div.diff.from_file a.path,
 div.diff.from_file {
 	color: #aa0000;
@@ -447,6 +451,10 @@ div.diff.rem {
 	color: #cc0000;
 }
 
+div.diff.rem span.marked {
+	background-color: #ff7777;
+}
+
 div.diff.chunk_header a,
 div.diff.chunk_header {
 	color: #990099;
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs
  2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
                   ` (6 preceding siblings ...)
  2012-03-23 22:56 ` [PATCH v2 7/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
@ 2012-03-23 22:56 ` Michał Kiedrowicz
  2012-03-29 23:37   ` Jakub Narebski
  7 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-23 22:56 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, Jeff King, Michał Kiedrowicz

The highlightning of combined diffs is currently disabled.  This is
because output from a combined diff is much harder to highlight because
it's not obvious which removed and added lines should be compared.

Moreover, code that compares added and removed lines doesn't care about
combined diffs. It only skips first +/- character, treating second +/-
as a line content.

Let's start with a simple case: only highlight changes that come from
one parent, i.e. when every removed line has a corresponding added line
for the same parent.  This way the highlightning cannot get wrong. For
example, following diffs would be highlighted:

	- removed line for first parent
	+ added line for first parent
	  context line
	 -removed line for second parent
	 +added line for second parent

or

	- removed line for first parent
	 -removed line for second parent
	+ added line for first parent
	 +added line for second parent

but following output will not:

	- removed line for first parent
	 -removed line for second parent
	 +added line for second parent
	++added line for both parents

Further changes may introduce more intelligent approach that better
handles combined diffs.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
---
 gitweb/gitweb.perl |   57 +++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 872ba12..c056e83 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -5057,12 +5057,12 @@ sub print_inline_diff_lines {
 # Format removed and added line, mark changed part and HTML-format them.
 # Impementation is based on contrib/diff-highlight
 sub format_rem_add_line {
-	my ($rem, $add) = @_;
+	my ($rem, $add, $num_parents) = @_;
 	my @rem = split(//, $rem);
 	my @add = split(//, $add);
 	my ($esc_rem, $esc_add);
-	# Ignore +/- character, thus $prefix_len is set to 1.
-	my ($prefix_len, $suffix_len) = (1, 0);
+	# Ignore leading +/- characters for each parent.
+	my ($prefix_len, $suffix_len) = ($num_parents, 0);
 	my ($prefix_is_space, $suffix_is_space) = (1, 1);
 
 	while ($prefix_len < @rem && $prefix_len < @add) {
@@ -5084,7 +5084,7 @@ sub format_rem_add_line {
 	# part that isn't whitespace.  If lines are completely different, don't
 	# mark them because that would make output unreadable, especially if
 	# diff consists of multiple lines.
-	if (($prefix_len == 1 && $suffix_len == 0) ||
+	if (($prefix_len == $num_parents && $suffix_len == 0) ||
 	    ($prefix_is_space && $suffix_is_space)) {
 		$esc_rem = esc_html($rem, -nbsp=>1);
 		$esc_add = esc_html($add, -nbsp=>1);
@@ -5099,15 +5099,43 @@ sub format_rem_add_line {
 
 # HTML-format diff context, removed and added lines.
 sub format_ctx_rem_add_lines {
-	my ($ctx, $rem, $add, $is_combined) = @_;
+	my ($ctx, $rem, $add, $num_parents) = @_;
 	my (@new_ctx, @new_rem, @new_add);
+	my $can_highlight = 0;
+	my $is_combined = ($num_parents > 1);
 
 	# Highlight if every removed line has a corresponding added line.
-	# Combined diffs are not supported ATM.
-	if (!$is_combined && @$add > 0 && @$add == @$rem) {
+	if (@$add > 0 && @$add == @$rem) {
+		$can_highlight = 1;
+
+		# Highlight lines in combined diff only if the chunk contains
+		# diff between the same version, e.g.
+		#
+		#    - a
+		#   -  b
+		#    + c
+		#   +  d
+		#
+		# Otherwise the highlightling would be confusing.
+		if ($is_combined) {
+			for (my $i = 0; $i < @$add; $i++) {
+				my $prefix_rem = substr($rem->[$i], 0, $num_parents);
+				my $prefix_add = substr($add->[$i], 0, $num_parents);
+
+				$prefix_rem =~ s/-/+/g;
+
+				if ($prefix_rem ne $prefix_add) {
+					$can_highlight = 0;
+					last;
+				}
+			}
+		}
+	}
+
+	if ($can_highlight) {
 		for (my $i = 0; $i < @$add; $i++) {
 			my ($line_rem, $line_add) = format_rem_add_line(
-			        $rem->[$i], $add->[$i]);
+			        $rem->[$i], $add->[$i], $num_parents);
 			push @new_rem, $line_rem;
 			push @new_add, $line_add;
 		}
@@ -5123,10 +5151,11 @@ sub format_ctx_rem_add_lines {
 
 # Print context lines and then rem/add lines.
 sub print_diff_lines {
-	my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
+	my ($ctx, $rem, $add, $diff_style, $num_parents) = @_;
+	my $is_combined = $num_parents > 1;
 
 	($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add,
-		$is_combined);
+		$num_parents);
 
 	if ($diff_style eq 'sidebyside' && !$is_combined) {
 		print_sidebyside_diff_lines($ctx, $rem, $add);
@@ -5137,7 +5166,7 @@ sub print_diff_lines {
 }
 
 sub print_diff_chunk {
-	my ($diff_style, $is_combined, $from, $to, @chunk) = @_;
+	my ($diff_style, $num_parents, $from, $to, @chunk) = @_;
 	my (@ctx, @rem, @add);
 
 	# The class of the previous line. 
@@ -5174,7 +5203,7 @@ sub print_diff_chunk {
 		if (((@rem || @add) && $class eq 'ctx') || !$class ||
 		    (@rem && @add && $class ne $prev_class)) {
 			print_diff_lines(\@ctx, \@rem, \@add, $diff_style,
-					$is_combined);
+					$num_parents);
 			@ctx = @rem = @add = ();
 		}
 
@@ -5317,7 +5346,7 @@ sub git_patchset_body {
 			my $class = diff_line_class($patch_line, \%from, \%to);
 
 			if ($class eq 'chunk_header') {
-				print_diff_chunk($diff_style, $is_combined, \%from, \%to, @chunk);
+				print_diff_chunk($diff_style, scalar @hash_parents, \%from, \%to, @chunk);
 				@chunk = ( [ $class, $patch_line ] );
 			} else {
 				push @chunk, [ $class, $patch_line ];
@@ -5326,7 +5355,7 @@ sub git_patchset_body {
 
 	} continue {
 		if (@chunk) {
-			print_diff_chunk($diff_style, $is_combined, \%from, \%to, @chunk);
+			print_diff_chunk($diff_style, scalar @hash_parents, \%from, \%to, @chunk);
 			@chunk = ();
 		}
 		print "</div>\n"; # class="patch"
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements
  2012-03-23 22:56 ` [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements Michał Kiedrowicz
@ 2012-03-24 18:58   ` Jakub Narebski
  2012-03-24 23:38     ` Michał Kiedrowicz
  2012-03-29 18:04     ` [PATCH] gitweb: Use descriptive names in esc_html_hl_regions() Michał Kiedrowicz
  0 siblings, 2 replies; 30+ messages in thread
From: Jakub Narebski @ 2012-03-24 18:58 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

Note: all those minor issues can be fixed while applying, I think.

On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> if $s->[1] is equal to or less than $s->[0], esc_html_hl_regions()
  ^^
s/if/If/

> generates an empty <span> element.  It normally shouldn't be visible in
> the web broweser, but it doesn't look good when looking at page source.
          ^^^^^^^^ 
s/broweser/browser/

> It also minimally increases generated page size for no special reason.
> 
> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> ---
> 
> I don't know if any code currently can produce empty <span> elements,
> but diff refinement highlighning does it.

I didn't want to make esc_html_hl_regions() paranoid and boggle it down
with checking that gitweb called it with sane values of parameters,
but this one is cheap and simple.  

It might be better to make esc_html_hl_regions() more robust instead
of modifying future caller to skip empty regions.

I have not read the rest of this series; for the time being conditional
ACK from me.

>  gitweb/gitweb.perl |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index a8b5fad..af645e5 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -1738,6 +1738,9 @@ sub esc_html_hl_regions {
>  	my $pos = 0;
>  
>  	for my $s (@sel) {
> +		# Don't craete empty <span> elements.
                        ^^^^^^
s/craete/create/

> +		next if $s->[1] <= $s->[0];
> +
>  		$out .= esc_html(substr($str, $pos, $s->[0] - $pos))
>  			if ($s->[0] - $pos > 0);
>  		$out .= $cgi->span({-class => $css_class},
> -- 
> 1.7.8.4
 
P.S. I wonder if it wouldn't be better if we created and used loop-local
variables with descriptive names, e.g.

  my ($beg, $end) = @$s;

and use $beg in place of $s->[0] and $end in place of $s->[1], which are
a bit cryptic.

This of course doesn't affect this patch.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html()
  2012-03-23 22:56 ` [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html() Michał Kiedrowicz
@ 2012-03-24 19:15   ` Jakub Narebski
  2012-03-24 23:31     ` Michał Kiedrowicz
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Narebski @ 2012-03-24 19:15 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> With this change, esc_html_hl_regions() accepts options and passes them
> down to esc_html().  This may be needed if a caller wants to pass
> -nbsp=>1 to esc_html().
> 
> The idea and implementation example of this change was described in
> 337da8d2 (gitweb: Introduce esc_html_match_hl and esc_html_hl_regions,
> 2012-02-27).  While other suggestions may be more useful in some cases,
> there is no need to implement them at the moment.  The
> esc_html_hl_regions() interface may be changed later if it's needed.
> 
> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> ---
> 
> Jakub, The code was in fact stolen from a yours patch posted to the
> list (gitweb: Use esc_html_match_hl() in 'grep' search) that wasn't
> merged so maybe I should pass the authorship to you?

Either passing authorship, with double signoff (mine and yours), and
note explaining modification, e.g.

  [mk: extracted from larger patch and wrote commit message]

or courtesy contribution in the form of signoff-like annotation just
before signoff, e.g.

  Based-on-patch-by: Jakub Narębski <jnareb@gmail.com>

is fine by me.
 
>  gitweb/gitweb.perl |   10 ++++++----
>  1 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index af645e5..1744c40 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -1732,7 +1732,9 @@ sub chop_and_escape_str {
>  # '<span class="mark">foo</span>bar'
>  sub esc_html_hl_regions {
>  	my ($str, $css_class, @sel) = @_;
> -	return esc_html($str) unless @sel;
> +	my %opts = grep { ref($_) ne 'ARRAY' } @sel;
> +	@sel     = grep { ref($_) eq 'ARRAY' } @sel;
> +	return esc_html($str, %opts) unless @sel;
>  
>  	my $out = '';
>  	my $pos = 0;
> @@ -1741,14 +1743,14 @@ sub esc_html_hl_regions {
>  		# Don't craete empty <span> elements.
>  		next if $s->[1] <= $s->[0];
>  
> -		$out .= esc_html(substr($str, $pos, $s->[0] - $pos))
> +		$out .= esc_html(substr($str, $pos, $s->[0] - $pos), %opts)
>  			if ($s->[0] - $pos > 0);
>  		$out .= $cgi->span({-class => $css_class},
> -		                   esc_html(substr($str, $s->[0], $s->[1] - $s->[0])));
> +		                   esc_html(substr($str, $s->[0], $s->[1] - $s->[0]), %opts));
>  
>  		$pos = $s->[1];
>  	}
> -	$out .= esc_html(substr($str, $pos))
> +	$out .= esc_html(substr($str, $pos), %opts)
>  		if ($pos < length($str));
>  
>  	return $out;
> -- 
> 1.7.8.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html()
  2012-03-24 19:15   ` Jakub Narebski
@ 2012-03-24 23:31     ` Michał Kiedrowicz
  0 siblings, 0 replies; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-24 23:31 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
> 
> > With this change, esc_html_hl_regions() accepts options and passes them
> > down to esc_html().  This may be needed if a caller wants to pass
> > -nbsp=>1 to esc_html().
> > 
> > The idea and implementation example of this change was described in
> > 337da8d2 (gitweb: Introduce esc_html_match_hl and esc_html_hl_regions,
> > 2012-02-27).  While other suggestions may be more useful in some cases,
> > there is no need to implement them at the moment.  The
> > esc_html_hl_regions() interface may be changed later if it's needed.
> > 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> > ---
> > 
> > Jakub, The code was in fact stolen from a yours patch posted to the
> > list (gitweb: Use esc_html_match_hl() in 'grep' search) that wasn't
> > merged so maybe I should pass the authorship to you?
> 
> Either passing authorship, with double signoff (mine and yours), and
> note explaining modification, e.g.
> 
>   [mk: extracted from larger patch and wrote commit message]
> 
> or courtesy contribution in the form of signoff-like annotation just
> before signoff, e.g.
> 
>   Based-on-patch-by: Jakub Narębski <jnareb@gmail.com>
> 
> is fine by me.

OK, will do.

>  
> >  gitweb/gitweb.perl |   10 ++++++----
> >  1 files changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index af645e5..1744c40 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -1732,7 +1732,9 @@ sub chop_and_escape_str {
> >  # '<span class="mark">foo</span>bar'
> >  sub esc_html_hl_regions {
> >  	my ($str, $css_class, @sel) = @_;
> > -	return esc_html($str) unless @sel;
> > +	my %opts = grep { ref($_) ne 'ARRAY' } @sel;
> > +	@sel     = grep { ref($_) eq 'ARRAY' } @sel;
> > +	return esc_html($str, %opts) unless @sel;
> >  
> >  	my $out = '';
> >  	my $pos = 0;
> > @@ -1741,14 +1743,14 @@ sub esc_html_hl_regions {
> >  		# Don't craete empty <span> elements.
> >  		next if $s->[1] <= $s->[0];
> >  
> > -		$out .= esc_html(substr($str, $pos, $s->[0] - $pos))
> > +		$out .= esc_html(substr($str, $pos, $s->[0] - $pos), %opts)
> >  			if ($s->[0] - $pos > 0);
> >  		$out .= $cgi->span({-class => $css_class},
> > -		                   esc_html(substr($str, $s->[0], $s->[1] - $s->[0])));
> > +		                   esc_html(substr($str, $s->[0], $s->[1] - $s->[0]), %opts));
> >  
> >  		$pos = $s->[1];
> >  	}
> > -	$out .= esc_html(substr($str, $pos))
> > +	$out .= esc_html(substr($str, $pos), %opts)
> >  		if ($pos < length($str));
> >  
> >  	return $out;
> > -- 
> > 1.7.8.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements
  2012-03-24 18:58   ` Jakub Narebski
@ 2012-03-24 23:38     ` Michał Kiedrowicz
  2012-03-29 18:04     ` [PATCH] gitweb: Use descriptive names in esc_html_hl_regions() Michał Kiedrowicz
  1 sibling, 0 replies; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-24 23:38 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> Note: all those minor issues can be fixed while applying, I think.
> 
> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
> 
> > if $s->[1] is equal to or less than $s->[0], esc_html_hl_regions()
>   ^^
> s/if/If/
> 
> > generates an empty <span> element.  It normally shouldn't be visible in
> > the web broweser, but it doesn't look good when looking at page source.
>           ^^^^^^^^ 
> s/broweser/browser/

Thanks.  I'm terribly sorry for these typos.  I really must tripple
check everything I write.

> 
> > It also minimally increases generated page size for no special reason.
> > 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> > ---
> > 
> > I don't know if any code currently can produce empty <span> elements,
> > but diff refinement highlighning does it.
> 
> I didn't want to make esc_html_hl_regions() paranoid and boggle it down
> with checking that gitweb called it with sane values of parameters,
> but this one is cheap and simple.  
> 
> It might be better to make esc_html_hl_regions() more robust instead
> of modifying future caller to skip empty regions.
> 

Yeah, the caller is just

+$esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len, @rem - $suffix_len], -nbsp=>1);
+$esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len, @add - $suffix_len], -nbsp=>1);

Without this patch I would have to do (untested)

+# Create <span> only if there is anything to mark.
+if($prefix_len < @rem - $suffix_len) {
+	$esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len, @rem - $suffix_len], -nbsp=>1);
+} else {
+	$esc_rem = esc_html($rem, -nbsp=1);
+}
+if($prefix_len < @add - $suffix_len) {
+	$esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len, @add - $suffix_len], -nbsp=>1);
+} else {
+	$esc_add = esc_html($add, -nbsp=1);
+}

or something like that.

> I have not read the rest of this series; for the time being conditional
> ACK from me.
> 
> >  gitweb/gitweb.perl |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> > 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index a8b5fad..af645e5 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -1738,6 +1738,9 @@ sub esc_html_hl_regions {
> >  	my $pos = 0;
> >  
> >  	for my $s (@sel) {
> > +		# Don't craete empty <span> elements.
>                         ^^^^^^
> s/craete/create/
> 
> > +		next if $s->[1] <= $s->[0];
> > +
> >  		$out .= esc_html(substr($str, $pos, $s->[0] - $pos))
> >  			if ($s->[0] - $pos > 0);
> >  		$out .= $cgi->span({-class => $css_class},
> > -- 
> > 1.7.8.4
>  
> P.S. I wonder if it wouldn't be better if we created and used loop-local
> variables with descriptive names, e.g.
> 
>   my ($beg, $end) = @$s;
> 
> and use $beg in place of $s->[0] and $end in place of $s->[1], which are
> a bit cryptic.
> 
> This of course doesn't affect this patch.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines()
  2012-03-23 22:56 ` [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines() Michał Kiedrowicz
@ 2012-03-28 14:33   ` Jakub Narebski
  2012-03-29 17:25     ` Michał Kiedrowicz
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Narebski @ 2012-03-28 14:33 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> Currently, print_sidebyside_diff_chunk() does two things: it
> accumulates diff lines and prints them.  Accumulation may be used to
> perform additional operations on diff lines,  so it makes sense to split
> these two things.  Thus, the code that prints diff lines in a side-by-side
> manner is moved out of print_sidebyside_diff_chunk() to a separate
> subroutine.
>
Right, that is quite sensible.

> The outcome of this patch is that print_sidebyside_diff_chunk() is now
> much shorter and easier to read.
> 
Nice effect.

> This is a preparation patch for diff refinement highlightning.  It should
> not change the gitweb output, but it slightly changes its behavior.
> Before this commit, context is printed on the class change. Now, it'it
> printed just before printing added and removed lines.

                                                      , and at the end
  of chunk.

IMVHO such change is irrelevant.

Acked-by: Jakub Narębski <jnareb@gmail.com>

> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> ---
>  gitweb/gitweb.perl |   97 ++++++++++++++++++++++++++++------------------------
>  1 files changed, 52 insertions(+), 45 deletions(-)

Nice code movement.
 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 1744c40..86d5a02 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -4998,6 +4998,53 @@ sub git_difftree_body {
>  	print "</table>\n";
>  }
>  
> +# Print context lines and then rem/add lines in a side-by-side manner.
> +sub print_sidebyside_diff_lines {
> +	my ($ctx, $rem, $add) = @_;
> +
> +	# print context block before add/rem block
> +	if (@$ctx) {
> +		print join '',
> +			'<div class="chunk_block ctx">',
> +				'<div class="old">',
> +				@$ctx,
> +				'</div>',
> +				'<div class="new">',
> +				@$ctx,
> +				'</div>',
> +			'</div>';
> +	}
> +
> +	if (!@$add) {
> +		# pure removal
> +		print join '',
> +			'<div class="chunk_block rem">',
> +				'<div class="old">',
> +				@$rem,
> +				'</div>',
> +			'</div>';
> +	} elsif (!@$rem) {
> +		# pure addition
> +		print join '',
> +			'<div class="chunk_block add">',
> +				'<div class="new">',
> +				@$add,
> +				'</div>',
> +			'</div>';
> +	} else {
> +		# assume that it is change
> +		print join '',
> +			'<div class="chunk_block chg">',
> +				'<div class="old">',
> +				@$rem,
> +				'</div>',
> +				'<div class="new">',
> +				@$add,
> +				'</div>',
> +			'</div>';
> +	}
> +}
> +
>  sub print_sidebyside_diff_chunk {
>  	my @chunk = @_;
>  	my (@ctx, @rem, @add);
> @@ -5024,51 +5071,11 @@ sub print_sidebyside_diff_chunk {
>  			next;
>  		}
>  
> -		## print from accumulator when type of class of lines change
> -		# empty contents block on start rem/add block, or end of chunk
> -		if (@ctx && (!$class || $class eq 'rem' || $class eq 'add')) {
> -			print join '',
> -				'<div class="chunk_block ctx">',
> -					'<div class="old">',
> -					@ctx,
> -					'</div>',
> -					'<div class="new">',
> -					@ctx,
> -					'</div>',
> -				'</div>';
> -			@ctx = ();
> -		}
> -		# empty add/rem block on start context block, or end of chunk
> -		if ((@rem || @add) && (!$class || $class eq 'ctx')) {
> -			if (!@add) {
> -				# pure removal
> -				print join '',
> -					'<div class="chunk_block rem">',
> -						'<div class="old">',
> -						@rem,
> -						'</div>',
> -					'</div>';
> -			} elsif (!@rem) {
> -				# pure addition
> -				print join '',
> -					'<div class="chunk_block add">',
> -						'<div class="new">',
> -						@add,
> -						'</div>',
> -					'</div>';
> -			} else {
> -				# assume that it is change
> -				print join '',
> -					'<div class="chunk_block chg">',
> -						'<div class="old">',
> -						@rem,
> -						'</div>',
> -						'<div class="new">',
> -						@add,
> -						'</div>',
> -					'</div>';
> -			}
> -			@rem = @add = ();
> +		## print from accumulator when have some add/rem lines or end
> +		# of chunk (flush context lines)
> +		if (((@rem || @add) && $class eq 'ctx') || !$class) {
> +			print_sidebyside_diff_lines(\@ctx, \@rem, \@add);
> +			@ctx = @rem = @add = ();
>  		}
>  
>  		## adding lines to accumulator

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
  2012-03-23 22:56 ` [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs Michał Kiedrowicz
@ 2012-03-28 15:56   ` Jakub Narebski
  2012-03-29 17:31     ` Michał Kiedrowicz
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Narebski @ 2012-03-28 15:56 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> This renames print_sidebyside_diff_chunk() to print_diff_chunk() and
> makes use of it for both side-by-side and inline diffs.  Now diff lines
> are always accumulated before they are printed.  This opens the
> possibility to preprocess diff output before it's printed, which is
> needed for diff refinement highlightning (implemented in incoming
> patches).
> 
While I don't like that we use accumulator for something that was 
straightforwardly printed, I understand that it is necessary preprocessing
required for diff refinement highlighting.  Having refactoring upfront
is a good idea, making for easier review.

> If left as is, the new function print_inline_diff_lines() could reorder

I think the previous paragraph is long enough that it would be better
to repeat subject of this sentence, i.e. start with

 "If print_diff_chunk() was left as is, the new function ..."

> diff lines.  It first prints all context lines, then all removed lines
> and finally all added lines.  If the diff output consisted of mixed added
> and removed lines, gitweb would reorder these lines.  This is true for
> combined diff output, for example:
> 
> 	 - removed line for first parent
> 	 + added line for first parent
> 	  -removed line for second parent
> 	 ++added line for both parents
> 
> would be rendered as:
> 
> 	- removed line for first parent
> 	 -removed line for second parent
> 	+ added line for first parent
> 	++added line for both parents
> 
> To prevent gitweb from reordering lines, print_diff_chunk() calls
> print_diff_lines() as soon as it detects that both added and removed
> lines are present and there was a class change.
> 
O.K.

> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> ---
[...]

> +# Print context lines and then rem/add lines in inline manner.
> +sub print_inline_diff_lines {
> +	my ($ctx, $rem, $add) = @_;
> +
> +	print foreach (@$ctx);
> +	print foreach (@$rem);
> +	print foreach (@$add);

Why not simply

  +	print @$ctx, @$rem, @$add;

It is "print LIST" for a reason (and $\ is empty here in gitweb, as it
is empty by default).

> +}
> +
> +# Print context lines and then rem/add lines.
> +sub print_diff_lines {
> +	my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
> +
> +	if ($diff_style eq 'sidebyside' && !$is_combined) {
> +		print_sidebyside_diff_lines($ctx, $rem, $add);
> +	} else {
> +		# default 'inline' style and unknown styles
> +		print_inline_diff_lines($ctx, $rem, $add);
> +	}
> +}

Very nice subroutine!

> +
> +sub print_diff_chunk {
> +	my ($diff_style, $is_combined, @chunk) = @_;
>  	my (@ctx, @rem, @add);
>  
> +	# The class of the previous line. 
> +	my $prev_class = '';
> +
>  	return unless @chunk;
>  
>  	# incomplete last line might be among removed or added lines,
> @@ -5072,9 +5096,13 @@ sub print_sidebyside_diff_chunk {
>  		}
>  
>  		## print from accumulator when have some add/rem lines or end
> -		# of chunk (flush context lines)
> -		if (((@rem || @add) && $class eq 'ctx') || !$class) {
> -			print_sidebyside_diff_lines(\@ctx, \@rem, \@add);
> +		# of chunk (flush context lines), or when have add and rem
> +		# lines and new block is reached (otherwise add/rem lines could
> +		# be reordered)
> +		if (((@rem || @add) && $class eq 'ctx') || !$class ||
> +		    (@rem && @add && $class ne $prev_class)) {
> +			print_diff_lines(\@ctx, \@rem, \@add, $diff_style,
> +					$is_combined);

Nitpick: the following _might_ be a tiny little bit more readable:

  +			print_diff_lines(\@ctx, \@rem, \@add,
  +			                 $diff_style, $is_combined);

>  			@ctx = @rem = @add = ();
>  		}
>  
> @@ -5091,6 +5119,8 @@ sub print_sidebyside_diff_chunk {
>  		if ($class eq 'ctx') {
>  			push @ctx, $line;
>  		}
> +
> +		$prev_class = $class;
>  	}
>  }

Anyway nice change.
  
> @@ -5217,22 +5247,17 @@ sub git_patchset_body {
>  			$diff_classes .= " $class" if ($class);
>  			$line = "<div class=\"$diff_classes\">$line</div>\n";
>  
> -			if ($diff_style eq 'sidebyside' && !$is_combined) {
> -				if ($class eq 'chunk_header') {
> -					print_sidebyside_diff_chunk(@chunk);
> -					@chunk = ( [ $class, $line ] );
> -				} else {
> -					push @chunk, [ $class, $line ];
> -				}
> +			if ($class eq 'chunk_header') {
> +				print_diff_chunk($diff_style, $is_combined, @chunk);

Nice, pushing acting on $diff_style down to print_diff_chunk(), which
simplifies code a bit.

> +				@chunk = ( [ $class, $line ] );

BTW. this could be simplified to

  +				@chunk = ();
  +				push @chunk, [ $class, $line ];

Well, simplified after noticing the common part of those two branches
of a conditional.

But it really doesn't matter.

>  			} else {
> -				# default 'inline' style and unknown styles
> -				print $line;
> +				push @chunk, [ $class, $line ];
>  			}
>  		}
>  
>  	} continue {
>  		if (@chunk) {
> -			print_sidebyside_diff_chunk(@chunk);
> +			print_diff_chunk($diff_style, $is_combined, @chunk);
>  			@chunk = ();
>  		}
>  		print "</div>\n"; # class="patch"
> -- 

BTW. I was wondering about binary files, but this commit wouldn't make
it worse anyway as we parse diff output assuming unified-like diff for
diff syntax highlighting... and binary diffs are shown as

  "Binary files a/foo.png and b/foo.png differ"

in extended diff header.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line()
  2012-03-23 22:56 ` [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line() Michał Kiedrowicz
@ 2012-03-29 16:14   ` Jakub Narebski
  2012-03-29 16:49     ` Jakub Narebski
  2012-03-29 17:36     ` Michał Kiedrowicz
  0 siblings, 2 replies; 30+ messages in thread
From: Jakub Narebski @ 2012-03-29 16:14 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> Commit 6ba1eb51b (gitweb: Add a feature to show side-by-side diff,
> 2011-10-31) for no special reason moved wrapping diff line in <div> out
> of format_diff_line(). Bring back old behavior.
> 
I remember that originally process_diff_line was format_diff_line... 

> This simplifies code in git_patchset_body() and keeps formatting of a
> diff line in one place.
> 
That is a good enough reason for me.

> The more long-term purpose of this patch is to move formatting diff
> lines down to print_diff_chunk(), to allow processing lines without
> HTML-formatting.
> 
Excuse me, but from this commit message (and from the patch itseld)
I don't see how this commit (patch) can help with this goal (and
don't remember details of discussion).

Please explain it in more detail, or simply remove above paragraph.

> This is just a refactoring patch. It's not meant to change gitweb
> output.
> 
> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> Acked-by: Jakub Narębski <jnareb@gmail.com>
> ---
[...]
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line()
  2012-03-29 16:14   ` Jakub Narebski
@ 2012-03-29 16:49     ` Jakub Narebski
  2012-03-29 17:36     ` Michał Kiedrowicz
  1 sibling, 0 replies; 30+ messages in thread
From: Jakub Narebski @ 2012-03-29 16:49 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

Jakub Narebski wrote:
> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> > The more long-term purpose of this patch is to move formatting diff
> > lines down to print_diff_chunk(), to allow processing lines without
> > HTML-formatting.
> > 
> Excuse me, but from this commit message (and from the patch itseld)
> I don't see how this commit (patch) can help with this goal (and
> don't remember details of discussion).
> 
> Please explain it in more detail, or simply remove above paragraph.

Or just squash it with next patch in this series.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk()
  2012-03-23 22:56 ` [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk() Michał Kiedrowicz
@ 2012-03-29 16:59   ` Jakub Narebski
  2012-03-29 17:41     ` Michał Kiedrowicz
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Narebski @ 2012-03-29 16:59 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Fri, 23 Mon 2012, Michał Kiedrowicz wrote:

> Now git_patchset_body() only calls diff_line_class(), which is removed
> from process_diff_line(). The latter function is renamed to
> format_diff_line() and its output is changed to return only
> HTML-formatted line, which brings it in line with outher format_*
> subroutined.
> 
> This slightly changes the order of operations performed on diff lines.
> Before this commit, each read line was formatted and then put to the
> @chunk accumulator. Now, lines are formatted inside print_diff_chunk(),

This is a bit convoluted description.


As I understand it, what happens here is that formatting lines is
pushed down to print_diff_chunk(), closer to the place where we
actually use HTML formatted output.

This means that we put raw lines in the @chunk accumulator, rather
than formatted lines.  Because we still need to know class (type)
of line when accumulating data to post-process and print, 
process_diff_line() subroutine was retired and replaced by 
diff_line_class() used in git_patchset_body() and new / resurrected
format_diff_line() used in print_diff_chunk().

Isn't it?
 

A side effect is that we have to pass \%from and \%to down the
callstack.

> This is a preparation patch for diff refinement highlightning. It's not
> meant to change gitweb output.
> 
This is a very nice refactoring.  I was never really comfortable with
the API of process_diff_line(), which was different from all other
subroutines in gitweb, and error prone to call.  I wish we used this
solution presented in this commit from the very beginning.

BTW. I think we can simply squash this commit with previous one; no
need to improve process_diff_line() if we are retiring it.

> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> Acked-by: Jakub Narębski <jnareb@gmail.com>
> ---
>  gitweb/gitweb.perl |   25 ++++++++++++-------------
>  1 files changed, 12 insertions(+), 13 deletions(-)

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines()
  2012-03-28 14:33   ` Jakub Narebski
@ 2012-03-29 17:25     ` Michał Kiedrowicz
  2012-03-30 13:37       ` Jakub Narebski
  0 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-29 17:25 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
> 
> > Currently, print_sidebyside_diff_chunk() does two things: it
> > accumulates diff lines and prints them.  Accumulation may be used to
> > perform additional operations on diff lines,  so it makes sense to split
> > these two things.  Thus, the code that prints diff lines in a side-by-side
> > manner is moved out of print_sidebyside_diff_chunk() to a separate
> > subroutine.
> >
> Right, that is quite sensible.
> 
> > The outcome of this patch is that print_sidebyside_diff_chunk() is now
> > much shorter and easier to read.
> > 
> Nice effect.
> 
> > This is a preparation patch for diff refinement highlightning.  It should
> > not change the gitweb output, but it slightly changes its behavior.
> > Before this commit, context is printed on the class change. Now, it'it
> > printed just before printing added and removed lines.
> 
>                                                       , and at the end
>   of chunk.
> 
> IMVHO such change is irrelevant.
> 
> Acked-by: Jakub Narębski <jnareb@gmail.com>

Thanks.

> 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> > ---
> >  gitweb/gitweb.perl |   97 ++++++++++++++++++++++++++++------------------------
> >  1 files changed, 52 insertions(+), 45 deletions(-)
> 
> Nice code movement.
>  

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
  2012-03-28 15:56   ` Jakub Narebski
@ 2012-03-29 17:31     ` Michał Kiedrowicz
  2012-03-30 13:34       ` Jakub Narebski
  0 siblings, 1 reply; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-29 17:31 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
> 
> > This renames print_sidebyside_diff_chunk() to print_diff_chunk() and
> > makes use of it for both side-by-side and inline diffs.  Now diff lines
> > are always accumulated before they are printed.  This opens the
> > possibility to preprocess diff output before it's printed, which is
> > needed for diff refinement highlightning (implemented in incoming
> > patches).
> > 
> While I don't like that we use accumulator for something that was 
> straightforwardly printed, I understand that it is necessary preprocessing
> required for diff refinement highlighting.  Having refactoring upfront
> is a good idea, making for easier review.

Yeah, this change doesn't make sense on its own but is required by later
patch that needs whole chunk at once.

> 
> > If left as is, the new function print_inline_diff_lines() could reorder
> 
> I think the previous paragraph is long enough that it would be better
> to repeat subject of this sentence, i.e. start with
> 
>  "If print_diff_chunk() was left as is, the new function ..."

OK.

> 
> > diff lines.  It first prints all context lines, then all removed lines
> > and finally all added lines.  If the diff output consisted of mixed added
> > and removed lines, gitweb would reorder these lines.  This is true for
> > combined diff output, for example:
> > 
> > 	 - removed line for first parent
> > 	 + added line for first parent
> > 	  -removed line for second parent
> > 	 ++added line for both parents
> > 
> > would be rendered as:
> > 
> > 	- removed line for first parent
> > 	 -removed line for second parent
> > 	+ added line for first parent
> > 	++added line for both parents
> > 
> > To prevent gitweb from reordering lines, print_diff_chunk() calls
> > print_diff_lines() as soon as it detects that both added and removed
> > lines are present and there was a class change.
> > 
> O.K.
> 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> > ---
> [...]
> 
> > +# Print context lines and then rem/add lines in inline manner.
> > +sub print_inline_diff_lines {
> > +	my ($ctx, $rem, $add) = @_;
> > +
> > +	print foreach (@$ctx);
> > +	print foreach (@$rem);
> > +	print foreach (@$add);
> 
> Why not simply
> 
>   +	print @$ctx, @$rem, @$add;
> 
> It is "print LIST" for a reason (and $\ is empty here in gitweb, as it
> is empty by default).
> 

OK, that's much better.  I really didn't like those 3 prints.

> > +}
> > +
> > +# Print context lines and then rem/add lines.
> > +sub print_diff_lines {
> > +	my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
> > +
> > +	if ($diff_style eq 'sidebyside' && !$is_combined) {
> > +		print_sidebyside_diff_lines($ctx, $rem, $add);
> > +	} else {
> > +		# default 'inline' style and unknown styles
> > +		print_inline_diff_lines($ctx, $rem, $add);
> > +	}
> > +}
> 
> Very nice subroutine!
> 
> > +
> > +sub print_diff_chunk {
> > +	my ($diff_style, $is_combined, @chunk) = @_;
> >  	my (@ctx, @rem, @add);
> >  
> > +	# The class of the previous line. 
> > +	my $prev_class = '';
> > +
> >  	return unless @chunk;
> >  
> >  	# incomplete last line might be among removed or added lines,
> > @@ -5072,9 +5096,13 @@ sub print_sidebyside_diff_chunk {
> >  		}
> >  
> >  		## print from accumulator when have some add/rem lines or end
> > -		# of chunk (flush context lines)
> > -		if (((@rem || @add) && $class eq 'ctx') || !$class) {
> > -			print_sidebyside_diff_lines(\@ctx, \@rem, \@add);
> > +		# of chunk (flush context lines), or when have add and rem
> > +		# lines and new block is reached (otherwise add/rem lines could
> > +		# be reordered)
> > +		if (((@rem || @add) && $class eq 'ctx') || !$class ||
> > +		    (@rem && @add && $class ne $prev_class)) {
> > +			print_diff_lines(\@ctx, \@rem, \@add, $diff_style,
> > +					$is_combined);
> 
> Nitpick: the following _might_ be a tiny little bit more readable:
> 
>   +			print_diff_lines(\@ctx, \@rem, \@add,
>   +			                 $diff_style, $is_combined);
> 

OK.

> >  			@ctx = @rem = @add = ();
> >  		}
> >  
> > @@ -5091,6 +5119,8 @@ sub print_sidebyside_diff_chunk {
> >  		if ($class eq 'ctx') {
> >  			push @ctx, $line;
> >  		}
> > +
> > +		$prev_class = $class;
> >  	}
> >  }
> 
> Anyway nice change.
>   
> > @@ -5217,22 +5247,17 @@ sub git_patchset_body {
> >  			$diff_classes .= " $class" if ($class);
> >  			$line = "<div class=\"$diff_classes\">$line</div>\n";
> >  
> > -			if ($diff_style eq 'sidebyside' && !$is_combined) {
> > -				if ($class eq 'chunk_header') {
> > -					print_sidebyside_diff_chunk(@chunk);
> > -					@chunk = ( [ $class, $line ] );
> > -				} else {
> > -					push @chunk, [ $class, $line ];
> > -				}
> > +			if ($class eq 'chunk_header') {
> > +				print_diff_chunk($diff_style, $is_combined, @chunk);
> 
> Nice, pushing acting on $diff_style down to print_diff_chunk(), which
> simplifies code a bit.
> 
> > +				@chunk = ( [ $class, $line ] );
> 
> BTW. this could be simplified to
> 
>   +				@chunk = ();
>   +				push @chunk, [ $class, $line ];
> 
> Well, simplified after noticing the common part of those two branches
> of a conditional.
> 
> But it really doesn't matter.

Sounds sensible.

> 
> >  			} else {
> > -				# default 'inline' style and unknown styles
> > -				print $line;
> > +				push @chunk, [ $class, $line ];
> >  			}
> >  		}
> >  
> >  	} continue {
> >  		if (@chunk) {
> > -			print_sidebyside_diff_chunk(@chunk);
> > +			print_diff_chunk($diff_style, $is_combined, @chunk);
> >  			@chunk = ();
> >  		}
> >  		print "</div>\n"; # class="patch"
> > -- 
> 
> BTW. I was wondering about binary files, but this commit wouldn't make
> it worse anyway as we parse diff output assuming unified-like diff for
> diff syntax highlighting... and binary diffs are shown as
> 
>   "Binary files a/foo.png and b/foo.png differ"
> 
> in extended diff header.

Yeah, this is what I wrote in the cover letter:

	* Ensured that binary patches are not supported in HTML format,
	  thus this series canot break it :) (answering Jakub's questions)

but maybe I wasn't clear enough.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line()
  2012-03-29 16:14   ` Jakub Narebski
  2012-03-29 16:49     ` Jakub Narebski
@ 2012-03-29 17:36     ` Michał Kiedrowicz
  1 sibling, 0 replies; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-29 17:36 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
> 
> > Commit 6ba1eb51b (gitweb: Add a feature to show side-by-side diff,
> > 2011-10-31) for no special reason moved wrapping diff line in <div> out
> > of format_diff_line(). Bring back old behavior.
> > 
> I remember that originally process_diff_line was format_diff_line... 
> 
> > This simplifies code in git_patchset_body() and keeps formatting of a
> > diff line in one place.
> > 
> That is a good enough reason for me.
> 
> > The more long-term purpose of this patch is to move formatting diff
> > lines down to print_diff_chunk(), to allow processing lines without
> > HTML-formatting.
> > 
> Excuse me, but from this commit message (and from the patch itseld)
> I don't see how this commit (patch) can help with this goal (and
> don't remember details of discussion).
> 
> Please explain it in more detail, or simply remove above paragraph.

The important part in this patch is that it removes some of HTML
formatting from git_patchset_body().  I need this because I want to
process whole chunk before formatting.  So I must push all lines in a
chunk to print_diff_chunk(). 

But I may remove this paragraph as well.

> 
> > This is just a refactoring patch. It's not meant to change gitweb
> > output.
> > 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> > Acked-by: Jakub Narębski <jnareb@gmail.com>
> > ---
> [...]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk()
  2012-03-29 16:59   ` Jakub Narebski
@ 2012-03-29 17:41     ` Michał Kiedrowicz
  0 siblings, 0 replies; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-29 17:41 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Fri, 23 Mon 2012, Michał Kiedrowicz wrote:
> 
> > Now git_patchset_body() only calls diff_line_class(), which is removed
> > from process_diff_line(). The latter function is renamed to
> > format_diff_line() and its output is changed to return only
> > HTML-formatted line, which brings it in line with outher format_*
> > subroutined.
> > 
> > This slightly changes the order of operations performed on diff lines.
> > Before this commit, each read line was formatted and then put to the
> > @chunk accumulator. Now, lines are formatted inside print_diff_chunk(),
> 
> This is a bit convoluted description.
> 
> 
> As I understand it, what happens here is that formatting lines is
> pushed down to print_diff_chunk(), closer to the place where we
> actually use HTML formatted output.

Yes.

> 
> This means that we put raw lines in the @chunk accumulator, rather
> than formatted lines.  Because we still need to know class (type)
> of line when accumulating data to post-process and print, 
> process_diff_line() subroutine was retired and replaced by 
> diff_line_class() used in git_patchset_body() and new / resurrected
> format_diff_line() used in print_diff_chunk().
> 
> Isn't it?

Very true.

>  
> 
> A side effect is that we have to pass \%from and \%to down the
> callstack.

Yes.

> 
> > This is a preparation patch for diff refinement highlightning. It's not
> > meant to change gitweb output.
> > 
> This is a very nice refactoring.  I was never really comfortable with
> the API of process_diff_line(), which was different from all other
> subroutines in gitweb, and error prone to call.  I wish we used this
> solution presented in this commit from the very beginning.
> 
> BTW. I think we can simply squash this commit with previous one; no
> need to improve process_diff_line() if we are retiring it.

OK, will do.

> 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> > Acked-by: Jakub Narębski <jnareb@gmail.com>
> > ---
> >  gitweb/gitweb.perl |   25 ++++++++++++-------------
> >  1 files changed, 12 insertions(+), 13 deletions(-)
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] gitweb: Use descriptive names in esc_html_hl_regions()
  2012-03-24 18:58   ` Jakub Narebski
  2012-03-24 23:38     ` Michał Kiedrowicz
@ 2012-03-29 18:04     ` Michał Kiedrowicz
  1 sibling, 0 replies; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-29 18:04 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King, Michał Kiedrowicz

The $s->[0] and $s->[1] variables look a bit cryptic.  Let's rename them
to $beg and $end so that it's clear what they do.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
---

> P.S. I wonder if it wouldn't be better if we created and used loop-local
> variables with descriptive names, e.g.
> 
>   my ($beg, $end) = @$s;
> 
> and use $beg in place of $s->[0] and $end in place of $s->[1], which are
> a bit cryptic.
> 
> This of course doesn't affect this patch.

Something like this? (patch not based on this series, may be applied
independently).

 gitweb/gitweb.perl |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index a8b5fad..a3754ff 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1738,12 +1738,14 @@ sub esc_html_hl_regions {
 	my $pos = 0;
 
 	for my $s (@sel) {
-		$out .= esc_html(substr($str, $pos, $s->[0] - $pos))
-			if ($s->[0] - $pos > 0);
+		my ($beg, $end) = @$s;
+
+		$out .= esc_html(substr($str, $pos, $beg - $pos))
+			if ($beg - $pos > 0);
 		$out .= $cgi->span({-class => $css_class},
-		                   esc_html(substr($str, $s->[0], $s->[1] - $s->[0])));
+		                   esc_html(substr($str, $beg, $end - $beg)));
 
-		$pos = $s->[1];
+		$pos = $end;
 	}
 	$out .= esc_html(substr($str, $pos))
 		if ($pos < length($str));
-- 
1.7.8.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 7/8] gitweb: Highlight interesting parts of diff
  2012-03-23 22:56 ` [PATCH v2 7/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
@ 2012-03-29 19:42   ` Jakub Narebski
  2012-03-29 19:59     ` Michał Kiedrowicz
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Narebski @ 2012-03-29 19:42 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> Reading diff output is sometimes very hard, even if it's colored,
> especially if lines differ only in few characters.  This is often true
> when a commit fixes a typo or renames some variables or functions.
> 
> This commit teaches gitweb to highlight characters that are different
> between old and new line with a light green/red background.  This should
> work in the similar manner as in Trac or GitHub.
> 
> The algorithm that compares lines is based on contrib/diff-highlight.
> Basically, it works by determining common prefix/suffix of corresponding
> lines and highlightning only the middle part of lines.  For more
> information, see contrib/diff-highlight/README.
>
Nice description.
 
> Combined diffs are not supported but a following commit will change it.
> 
O.K.

> Two changes in format_diff_line() were needed to allow diff refinement
> highlightning to work.  Firstly, format_diff_line() was taught to not
> esc_html() line that was passed as a reference.  This was needed because
> refining the highlight must be performed on unescaped lines and it uses
> a <span> element to mark interesting parts of the line.

Well, actually we just need to make format_diff_line to not esc_html
passed line which is already esc_html'ed or esc_html_hl_regions'ed,
i.e. to avoid double escaping.  Passing line as reference if it is
to be treated as HTML is one possibility, and I think it is a good
convention to start to use.

>                                                         Secondly, the 
> lines are untabified before comparing because calling untabify()
> after inserting <span>'s could improperly convert tabs to spaces.

Well, it is actually because untabify() must work on original text to
work correctly, i.e. to convert tabs to spaces according to tab stops.
It must precede esc_html'ing, and even more esc_html_hl_regions'ing.

> 
> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>

Anyway this is a good change, much cleaner than previous version

  Acked-by: Jakub Narębski <jnareb@gmail.com>

> ---
>  gitweb/gitweb.perl       |   84 ++++++++++++++++++++++++++++++++++++++++++----
>  gitweb/static/gitweb.css |    8 ++++
>  2 files changed, 85 insertions(+), 7 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index db32588..872ba12 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -2426,14 +2426,14 @@ sub format_cc_diff_chunk_header {
>  }
>  
>  # process patch (diff) line (not to be used for diff headers),
> -# returning HTML-formatted (but not wrapped) line
> +# returning HTML-formatted (but not wrapped) line.
> +# If the line is already esc_html()'ed, pass it as a reference.

Errr... I think the explanation here must be in reverse:

  +# If the line is passed as a reference, it is treated as HTML,
  +# and not esc_html()'ed.

>  sub format_diff_line {
>  	my ($line, $diff_class, $from, $to) = @_;
>  
> -	chomp $line;
> -	$line = untabify($line);
> -
> -	if ($from && $to && $line =~ m/^\@{2} /) {
> +	if (ref($line)) {
> +		$line = $$line;
> +	} elsif ($from && $to && $line =~ m/^\@{2} /) {
>  		$line = format_unidiff_chunk_header($line, $from, $to);
>  	} elsif ($from && $to && $line =~ m/^\@{3}/) {
>  		$line = format_cc_diff_chunk_header($line, $from, $to);
> @@ -5054,10 +5054,80 @@ sub print_inline_diff_lines {
>  	print foreach (@$add);
>  }
>  
> +# Format removed and added line, mark changed part and HTML-format them.
> +# Impementation is based on contrib/diff-highlight
> +sub format_rem_add_line {

Wouldn't a better name be format_rem_add_pair() or format_rem_add_lines(),
or format_rem_add_lines_pair()?

> +	my ($rem, $add) = @_;
> +	my @rem = split(//, $rem);
> +	my @add = split(//, $add);
> +	my ($esc_rem, $esc_add);
> +	# Ignore +/- character, thus $prefix_len is set to 1.
> +	my ($prefix_len, $suffix_len) = (1, 0);

I wonder if it wouldn't be slightly cleaner to count $prefix_len from
+/- rather than start of line.  It means

  +	# Prefix does not count initial +/- character.
  +	my ($prefix_len, $suffix_len) = (0, 0);

Perhaps even remove initial +/- from @add and @rem.

  +	shift @rem;
  +	shift @add;

Ehh... now I see that starting $prefix_len with 1 might be not so bad
idea after all.

> +	my ($prefix_is_space, $suffix_is_space) = (1, 1);

It is not entirely true: $prefix_is_space is really $prefix_is_space_or_empty.
It might be a better idea to use

  +	my ($prefix_has_nonspace, $suffix_has_nonspace);

using the fact that undefined value is false.

> +
> +	while ($prefix_len < @rem && $prefix_len < @add) {
> +		last if ($rem[$prefix_len] ne $add[$prefix_len]);
> +
> +		$prefix_is_space = 0 if ($rem[$prefix_len] !~ /\s/);

  +		$prefix_has_nonspace = 1 if ($rem[$prefix_len] =~ /\s/);

> +		$prefix_len++;
> +	}
> +
> +	my $shorter = (@rem < @add) ? @rem : @add;
> +	while ($prefix_len + $suffix_len < $shorter) {
> +		last if ($rem[-1 - $suffix_len] ne $add[-1 - $suffix_len]);
> +
> +		$suffix_is_space = 0 if ($rem[-1 - $suffix_len] !~ /\s/);

  +		$suffix_has_nonspace = 1 if ($rem[-1 - $suffix_len] =~ /\s/);

> +		$suffix_len++;
> +	}
> +
> +	# Mark lines that are different from each other, but have some common
> +	# part that isn't whitespace.  If lines are completely different, don't
> +	# mark them because that would make output unreadable, especially if
> +	# diff consists of multiple lines.
> +	if (($prefix_len == 1 && $suffix_len == 0) ||
> +	    ($prefix_is_space && $suffix_is_space)) {

Actually ($prefix_is_space && $suffix_is_space) is enough, but cryptic.
With $prefix_is_space (== $prefix_is_space_or_empty) -> $prefix_has_nonspace
it would be

  +	if (not ($prefix_has_nonspace || $suffix_has_nonspace)) {

Anyway, it is not as if it is not clear enough.

> +		$esc_rem = esc_html($rem, -nbsp=>1);
> +		$esc_add = esc_html($add, -nbsp=>1);
> +	} else {
> +		$esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len, @rem - $suffix_len], -nbsp=>1);
> +		$esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len, @add - $suffix_len], -nbsp=>1);

  +		$esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len+1, @rem+1 - $suffix_len], -nbsp=>1);
  +		$esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len+1, @add+1 - $suffix_len], -nbsp=>1);

So probably not worth it.

> +	}
> +
> +	return format_diff_line(\$esc_rem, 'rem'),
> +	        format_diff_line(\$esc_add, 'add');

  +	return format_diff_line(\$esc_rem, 'rem'),
  +	       format_diff_line(\$esc_add, 'add');

> +}
> +
> +# HTML-format diff context, removed and added lines.
> +sub format_ctx_rem_add_lines {
> +	my ($ctx, $rem, $add, $is_combined) = @_;
> +	my (@new_ctx, @new_rem, @new_add);
> +
> +	# Highlight if every removed line has a corresponding added line.
> +	# Combined diffs are not supported ATM.

ATM == at this moment?  Please say so.

> +	if (!$is_combined && @$add > 0 && @$add == @$rem) {
> +		for (my $i = 0; $i < @$add; $i++) {
> +			my ($line_rem, $line_add) = format_rem_add_line(
> +			        $rem->[$i], $add->[$i]);

  +			my ($line_rem, $line_add) =
  +			        format_rem_add_line($rem->[$i], $add->[$i]);

> +			push @new_rem, $line_rem;
> +			push @new_add, $line_add;
> +		}
> +	} else {
> +		@new_rem = map { format_diff_line($_, 'rem') } @$rem;
> +		@new_add = map { format_diff_line($_, 'add') } @$add;
> +	}
> +
> +	@new_ctx = map { format_diff_line($_, 'ctx') } @$ctx;
> +
> +	return (\@new_ctx, \@new_rem, \@new_add);
> +}

Nice.

> +
>  # Print context lines and then rem/add lines.
>  sub print_diff_lines {
>  	my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
>  
> +	($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add,
> +		$is_combined);
> +

  +	($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add,	$is_combined);
  +

Unless the line is too long.

>  	if ($diff_style eq 'sidebyside' && !$is_combined) {
>  		print_sidebyside_diff_lines($ctx, $rem, $add);
>  	} else {
> @@ -5089,11 +5159,11 @@ sub print_diff_chunk {
>  	foreach my $line_info (@chunk) {
>  		my ($class, $line) = @$line_info;
>  
> -		$line = format_diff_line($line, $class, $from, $to);
> +		$line = untabify($line);

O.K. you move untabify() out of format_diff_line(), and must now
ensure that caller uses it before calling format_diff_line() or 
format_ctx_rem_add_lines() (which uses format_diff_line() itself).

I wonder if leaving untabify() (and chomp) in format_diff_line(),
but not running it if passed reference, and running untabify() in
format_ctx_rem_add_lines() wouldn't be a better, less fragile code.

>  
>  		# print chunk headers
>  		if ($class && $class eq 'chunk_header') {
> -			print $line;
> +			print format_diff_line($line, $class, $from, $to);

O.K., only 'chunk_header' are not formatted.

>  			next;
>  		}

[I have to take a look how final version of print_diff_lines() looks
 like in this commit; the patch alone is not enough]

Right, so we format just before printing, and print_* do formatting
itself before printing.  Nice and clean.

>  
> diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
> index c530355..3c4a3c9 100644
> --- a/gitweb/static/gitweb.css
> +++ b/gitweb/static/gitweb.css
> @@ -438,6 +438,10 @@ div.diff.add {
>  	color: #008800;
>  }
>  
> +div.diff.add span.marked {
> +	background-color: #77ff77;
> +}
> +
>  div.diff.from_file a.path,
>  div.diff.from_file {
>  	color: #aa0000;
> @@ -447,6 +451,10 @@ div.diff.rem {
>  	color: #cc0000;
>  }
>  
> +div.diff.rem span.marked {
> +	background-color: #ff7777;
> +}
> +
>  div.diff.chunk_header a,
>  div.diff.chunk_header {
>  	color: #990099;
> -- 

Nice styling.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 7/8] gitweb: Highlight interesting parts of diff
  2012-03-29 19:42   ` Jakub Narebski
@ 2012-03-29 19:59     ` Michał Kiedrowicz
  0 siblings, 0 replies; 30+ messages in thread
From: Michał Kiedrowicz @ 2012-03-29 19:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
> 
> > Reading diff output is sometimes very hard, even if it's colored,
> > especially if lines differ only in few characters.  This is often true
> > when a commit fixes a typo or renames some variables or functions.
> > 
> > This commit teaches gitweb to highlight characters that are different
> > between old and new line with a light green/red background.  This should
> > work in the similar manner as in Trac or GitHub.
> > 
> > The algorithm that compares lines is based on contrib/diff-highlight.
> > Basically, it works by determining common prefix/suffix of corresponding
> > lines and highlightning only the middle part of lines.  For more
> > information, see contrib/diff-highlight/README.
> >
> Nice description.
>  
> > Combined diffs are not supported but a following commit will change it.
> > 
> O.K.
> 
> > Two changes in format_diff_line() were needed to allow diff refinement
> > highlightning to work.  Firstly, format_diff_line() was taught to not
> > esc_html() line that was passed as a reference.  This was needed because
> > refining the highlight must be performed on unescaped lines and it uses
> > a <span> element to mark interesting parts of the line.
> 
> Well, actually we just need to make format_diff_line to not esc_html
> passed line which is already esc_html'ed or esc_html_hl_regions'ed,
> i.e. to avoid double escaping.  Passing line as reference if it is
> to be treated as HTML is one possibility, and I think it is a good
> convention to start to use.
> 
> >                                                         Secondly, the 
> > lines are untabified before comparing because calling untabify()
> > after inserting <span>'s could improperly convert tabs to spaces.
> 
> Well, it is actually because untabify() must work on original text to
> work correctly, i.e. to convert tabs to spaces according to tab stops.
> It must precede esc_html'ing, and even more esc_html_hl_regions'ing.
> 
> > 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> 
> Anyway this is a good change, much cleaner than previous version
> 
>   Acked-by: Jakub Narębski <jnareb@gmail.com>

Thanks.

> 
> > ---
> >  gitweb/gitweb.perl       |   84 ++++++++++++++++++++++++++++++++++++++++++----
> >  gitweb/static/gitweb.css |    8 ++++
> >  2 files changed, 85 insertions(+), 7 deletions(-)
> > 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index db32588..872ba12 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -2426,14 +2426,14 @@ sub format_cc_diff_chunk_header {
> >  }
> >  
> >  # process patch (diff) line (not to be used for diff headers),
> > -# returning HTML-formatted (but not wrapped) line
> > +# returning HTML-formatted (but not wrapped) line.
> > +# If the line is already esc_html()'ed, pass it as a reference.
> 
> Errr... I think the explanation here must be in reverse:
> 
>   +# If the line is passed as a reference, it is treated as HTML,
>   +# and not esc_html()'ed.

Yeah, I thought about that option too :).

> 
> >  sub format_diff_line {
> >  	my ($line, $diff_class, $from, $to) = @_;
> >  
> > -	chomp $line;
> > -	$line = untabify($line);
> > -
> > -	if ($from && $to && $line =~ m/^\@{2} /) {
> > +	if (ref($line)) {
> > +		$line = $$line;
> > +	} elsif ($from && $to && $line =~ m/^\@{2} /) {
> >  		$line = format_unidiff_chunk_header($line, $from, $to);
> >  	} elsif ($from && $to && $line =~ m/^\@{3}/) {
> >  		$line = format_cc_diff_chunk_header($line, $from, $to);
> > @@ -5054,10 +5054,80 @@ sub print_inline_diff_lines {
> >  	print foreach (@$add);
> >  }
> >  
> > +# Format removed and added line, mark changed part and HTML-format them.
> > +# Impementation is based on contrib/diff-highlight
> > +sub format_rem_add_line {
> 
> Wouldn't a better name be format_rem_add_pair() or format_rem_add_lines(),
> or format_rem_add_lines_pair()?

OK.

> 
> > +	my ($rem, $add) = @_;
> > +	my @rem = split(//, $rem);
> > +	my @add = split(//, $add);
> > +	my ($esc_rem, $esc_add);
> > +	# Ignore +/- character, thus $prefix_len is set to 1.
> > +	my ($prefix_len, $suffix_len) = (1, 0);
> 
> I wonder if it wouldn't be slightly cleaner to count $prefix_len from
> +/- rather than start of line.  It means
> 
>   +	# Prefix does not count initial +/- character.
>   +	my ($prefix_len, $suffix_len) = (0, 0);
> 
> Perhaps even remove initial +/- from @add and @rem.
> 
>   +	shift @rem;
>   +	shift @add;
> 
> Ehh... now I see that starting $prefix_len with 1 might be not so bad
> idea after all.
> 
> > +	my ($prefix_is_space, $suffix_is_space) = (1, 1);
> 
> It is not entirely true: $prefix_is_space is really $prefix_is_space_or_empty.
> It might be a better idea to use
> 
>   +	my ($prefix_has_nonspace, $suffix_has_nonspace);
> 
> using the fact that undefined value is false.
> 
> > +
> > +	while ($prefix_len < @rem && $prefix_len < @add) {
> > +		last if ($rem[$prefix_len] ne $add[$prefix_len]);
> > +
> > +		$prefix_is_space = 0 if ($rem[$prefix_len] !~ /\s/);
> 
>   +		$prefix_has_nonspace = 1 if ($rem[$prefix_len] =~ /\s/);
> 
> > +		$prefix_len++;
> > +	}
> > +
> > +	my $shorter = (@rem < @add) ? @rem : @add;
> > +	while ($prefix_len + $suffix_len < $shorter) {
> > +		last if ($rem[-1 - $suffix_len] ne $add[-1 - $suffix_len]);
> > +
> > +		$suffix_is_space = 0 if ($rem[-1 - $suffix_len] !~ /\s/);
> 
>   +		$suffix_has_nonspace = 1 if ($rem[-1 - $suffix_len] =~ /\s/);
> 
> > +		$suffix_len++;
> > +	}
> > +
> > +	# Mark lines that are different from each other, but have some common
> > +	# part that isn't whitespace.  If lines are completely different, don't
> > +	# mark them because that would make output unreadable, especially if
> > +	# diff consists of multiple lines.
> > +	if (($prefix_len == 1 && $suffix_len == 0) ||
> > +	    ($prefix_is_space && $suffix_is_space)) {
> 
> Actually ($prefix_is_space && $suffix_is_space) is enough, but cryptic.

Yes, you caught was I thought too but hadn't wrote explicitly.  The
first condition ($prefix_len == 1 && $suffix_len == 0) is redundant but
makes the condition easier to understand.

> With $prefix_is_space (== $prefix_is_space_or_empty) -> $prefix_has_nonspace
> it would be
> 
>   +	if (not ($prefix_has_nonspace || $suffix_has_nonspace)) {
> 
> Anyway, it is not as if it is not clear enough.

I may play with this for a while to see which version looks best.

> 
> > +		$esc_rem = esc_html($rem, -nbsp=>1);
> > +		$esc_add = esc_html($add, -nbsp=>1);
> > +	} else {
> > +		$esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len, @rem - $suffix_len], -nbsp=>1);
> > +		$esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len, @add - $suffix_len], -nbsp=>1);
> 
>   +		$esc_rem = esc_html_hl_regions($rem, 'marked', [$prefix_len+1, @rem+1 - $suffix_len], -nbsp=>1);
>   +		$esc_add = esc_html_hl_regions($add, 'marked', [$prefix_len+1, @add+1 - $suffix_len], -nbsp=>1);
> 
> So probably not worth it.
> 
> > +	}
> > +
> > +	return format_diff_line(\$esc_rem, 'rem'),
> > +	        format_diff_line(\$esc_add, 'add');
> 
>   +	return format_diff_line(\$esc_rem, 'rem'),
>   +	       format_diff_line(\$esc_add, 'add');
> 
> > +}
> > +
> > +# HTML-format diff context, removed and added lines.
> > +sub format_ctx_rem_add_lines {
> > +	my ($ctx, $rem, $add, $is_combined) = @_;
> > +	my (@new_ctx, @new_rem, @new_add);
> > +
> > +	# Highlight if every removed line has a corresponding added line.
> > +	# Combined diffs are not supported ATM.
> 
> ATM == at this moment?  Please say so.

OK.

> 
> > +	if (!$is_combined && @$add > 0 && @$add == @$rem) {
> > +		for (my $i = 0; $i < @$add; $i++) {
> > +			my ($line_rem, $line_add) = format_rem_add_line(
> > +			        $rem->[$i], $add->[$i]);
> 
>   +			my ($line_rem, $line_add) =
>   +			        format_rem_add_line($rem->[$i], $add->[$i]);
> 
> > +			push @new_rem, $line_rem;
> > +			push @new_add, $line_add;
> > +		}
> > +	} else {
> > +		@new_rem = map { format_diff_line($_, 'rem') } @$rem;
> > +		@new_add = map { format_diff_line($_, 'add') } @$add;
> > +	}
> > +
> > +	@new_ctx = map { format_diff_line($_, 'ctx') } @$ctx;
> > +
> > +	return (\@new_ctx, \@new_rem, \@new_add);
> > +}
> 
> Nice.
> 
> > +
> >  # Print context lines and then rem/add lines.
> >  sub print_diff_lines {
> >  	my ($ctx, $rem, $add, $diff_style, $is_combined) = @_;
> >  
> > +	($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add,
> > +		$is_combined);
> > +
> 
>   +	($ctx, $rem, $add) = format_ctx_rem_add_lines($ctx, $rem, $add,	$is_combined);
>   +
> 
> Unless the line is too long.
> 
> >  	if ($diff_style eq 'sidebyside' && !$is_combined) {
> >  		print_sidebyside_diff_lines($ctx, $rem, $add);
> >  	} else {
> > @@ -5089,11 +5159,11 @@ sub print_diff_chunk {
> >  	foreach my $line_info (@chunk) {
> >  		my ($class, $line) = @$line_info;
> >  
> > -		$line = format_diff_line($line, $class, $from, $to);
> > +		$line = untabify($line);
> 
> O.K. you move untabify() out of format_diff_line(), and must now
> ensure that caller uses it before calling format_diff_line() or 
> format_ctx_rem_add_lines() (which uses format_diff_line() itself).
> 
> I wonder if leaving untabify() (and chomp) in format_diff_line(),
> but not running it if passed reference, and running untabify() in
> format_ctx_rem_add_lines() wouldn't be a better, less fragile code.
> 

I can try that.

> >  
> >  		# print chunk headers
> >  		if ($class && $class eq 'chunk_header') {
> > -			print $line;
> > +			print format_diff_line($line, $class, $from, $to);
> 
> O.K., only 'chunk_header' are not formatted.
> 
> >  			next;
> >  		}
> 
> [I have to take a look how final version of print_diff_lines() looks
>  like in this commit; the patch alone is not enough]
> 
> Right, so we format just before printing, and print_* do formatting
> itself before printing.  Nice and clean.

Thanks.

> 
> >  
> > diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
> > index c530355..3c4a3c9 100644
> > --- a/gitweb/static/gitweb.css
> > +++ b/gitweb/static/gitweb.css
> > @@ -438,6 +438,10 @@ div.diff.add {
> >  	color: #008800;
> >  }
> >  
> > +div.diff.add span.marked {
> > +	background-color: #77ff77;
> > +}
> > +
> >  div.diff.from_file a.path,
> >  div.diff.from_file {
> >  	color: #aa0000;
> > @@ -447,6 +451,10 @@ div.diff.rem {
> >  	color: #cc0000;
> >  }
> >  
> > +div.diff.rem span.marked {
> > +	background-color: #ff7777;
> > +}
> > +
> >  div.diff.chunk_header a,
> >  div.diff.chunk_header {
> >  	color: #990099;
> > -- 
> 
> Nice styling.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs
  2012-03-23 22:56 ` [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs Michał Kiedrowicz
@ 2012-03-29 23:37   ` Jakub Narebski
  2012-03-30  6:49     ` Michal Kiedrowicz
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Narebski @ 2012-03-29 23:37 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> The highlightning of combined diffs is currently disabled.  This is
> because output from a combined diff is much harder to highlight because
> it's not obvious which removed and added lines should be compared.
> 
Is -> was?

> Moreover, code that compares added and removed lines doesn't care about
> combined diffs. It only skips first +/- character, treating second +/-
> as a line content.

Well, we explicitly skip combined diffs.  I think what you want to say
here is that it is not possible to simply use existing algorithm
unchanged for combined diffs.

> 
> Let's start with a simple case: only highlight changes that come from
> one parent, i.e. when every removed line has a corresponding added line
> for the same parent.  This way the highlightning cannot get wrong. For
> example, following diffs would be highlighted:
> 
> 	- removed line for first parent
> 	+ added line for first parent
> 	  context line
> 	 -removed line for second parent
> 	 +added line for second parent
> 
> or
> 
> 	- removed line for first parent
> 	 -removed line for second parent
> 	+ added line for first parent
> 	 +added line for second parent
> 
> but following output will not:
> 
> 	- removed line for first parent
> 	 -removed line for second parent
> 	 +added line for second parent
> 	++added line for both parents

O.K., that's a nice and sensible first step.

I wonder if it would be worth to specify that we currently require that
pattern of '-'-es in pre-image match pattern of '+'-es in postimage.

Nb. the prefix of combined diff would either include '+', or '-',
but never mixed (this is documented, but I had trouble with this).

> 
> Further changes may introduce more intelligent approach that better
> handles combined diffs.

Very sensible approach.
> 
> Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>

  Acked-by: Jakub Narębski <jnareb@gmail.com>

> ---

BTW. I went and checked if this approach helps for non-trivial merges
in git.git history:

* b10656c - helps a bit, though one can see limitation of pre/post-fix
  matching here, but it is present also for non-combined diff.

* 8b132bc - helps a bit, though char-interdiff or word-interdiff might
  be better.  Nb. I think that red background for 'marked' is a bit
  too dark (intensive).

* c58499c - doesn't help too much.

* f629c23, aa145bf - helps.

>  gitweb/gitweb.perl |   57 +++++++++++++++++++++++++++++++++++++++------------
>  1 files changed, 43 insertions(+), 14 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 872ba12..c056e83 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -5057,12 +5057,12 @@ sub print_inline_diff_lines {
>  # Format removed and added line, mark changed part and HTML-format them.
>  # Impementation is based on contrib/diff-highlight

Implementation
   ^---

>  sub format_rem_add_line {
> -	my ($rem, $add) = @_;
> +	my ($rem, $add, $num_parents) = @_;
>  	my @rem = split(//, $rem);
>  	my @add = split(//, $add);
>  	my ($esc_rem, $esc_add);
> -	# Ignore +/- character, thus $prefix_len is set to 1.
> -	my ($prefix_len, $suffix_len) = (1, 0);
> +	# Ignore leading +/- characters for each parent.
> +	my ($prefix_len, $suffix_len) = ($num_parents, 0);

Nice.

[...]
> @@ -5099,15 +5099,43 @@ sub format_rem_add_line {
>  
>  # HTML-format diff context, removed and added lines.
>  sub format_ctx_rem_add_lines {
> -	my ($ctx, $rem, $add, $is_combined) = @_;
> +	my ($ctx, $rem, $add, $num_parents) = @_;
>  	my (@new_ctx, @new_rem, @new_add);
> +	my $can_highlight = 0;
> +	my $is_combined = ($num_parents > 1);
>  
>  	# Highlight if every removed line has a corresponding added line.
> -	# Combined diffs are not supported ATM.
> -	if (!$is_combined && @$add > 0 && @$add == @$rem) {
> +	if (@$add > 0 && @$add == @$rem) {
> +		$can_highlight = 1;
> +
> +		# Highlight lines in combined diff only if the chunk contains
> +		# diff between the same version, e.g.
> +		#
> +		#    - a
> +		#   -  b
> +		#    + c
> +		#   +  d
> +		#
> +		# Otherwise the highlightling would be confusing.
> +		if ($is_combined) {
> +			for (my $i = 0; $i < @$add; $i++) {
> +				my $prefix_rem = substr($rem->[$i], 0, $num_parents);
> +				my $prefix_add = substr($add->[$i], 0, $num_parents);
> +
> +				$prefix_rem =~ s/-/+/g;
> +
> +				if ($prefix_rem ne $prefix_add) {
> +					$can_highlight = 0;
> +					last;
> +				}
> +			}
> +		}
> +	}

Good.

> +
> +	if ($can_highlight) {
>  		for (my $i = 0; $i < @$add; $i++) {
>  			my ($line_rem, $line_add) = format_rem_add_line(
> -			        $rem->[$i], $add->[$i]);
> +			        $rem->[$i], $add->[$i], $num_parents);
>  			push @new_rem, $line_rem;
>  			push @new_add, $line_add;
>  		}

[...]
> @@ -5326,7 +5355,7 @@ sub git_patchset_body {
>  
>  	} continue {
>  		if (@chunk) {
> -			print_diff_chunk($diff_style, $is_combined, \%from, \%to, @chunk);
> +			print_diff_chunk($diff_style, scalar @hash_parents, \%from, \%to, @chunk);
>  			@chunk = ();
>  		}
>  		print "</div>\n"; # class="patch"

I was wondering about 'commitdiff' between two commits, which is not
combined even ifany of those commits is a merge commit... but it looks
like it works all right.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs
  2012-03-29 23:37   ` Jakub Narebski
@ 2012-03-30  6:49     ` Michal Kiedrowicz
  0 siblings, 0 replies; 30+ messages in thread
From: Michal Kiedrowicz @ 2012-03-30  6:49 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:
> 
> > The highlightning of combined diffs is currently disabled.  This is
> > because output from a combined diff is much harder to highlight
> > because it's not obvious which removed and added lines should be
> > compared.
> > 
> Is -> was?
> 
> > Moreover, code that compares added and removed lines doesn't care
> > about combined diffs. It only skips first +/- character, treating
> > second +/- as a line content.
> 
> Well, we explicitly skip combined diffs.  I think what you want to say
> here is that it is not possible to simply use existing algorithm
> unchanged for combined diffs.

Yep, your wording is much better.

> 
> > 
> > Let's start with a simple case: only highlight changes that come
> > from one parent, i.e. when every removed line has a corresponding
> > added line for the same parent.  This way the highlightning cannot
> > get wrong. For example, following diffs would be highlighted:
> > 
> > 	- removed line for first parent
> > 	+ added line for first parent
> > 	  context line
> > 	 -removed line for second parent
> > 	 +added line for second parent
> > 
> > or
> > 
> > 	- removed line for first parent
> > 	 -removed line for second parent
> > 	+ added line for first parent
> > 	 +added line for second parent
> > 
> > but following output will not:
> > 
> > 	- removed line for first parent
> > 	 -removed line for second parent
> > 	 +added line for second parent
> > 	++added line for both parents
> 
> O.K., that's a nice and sensible first step.
> 
> I wonder if it would be worth to specify that we currently require
> that pattern of '-'-es in pre-image match pattern of '+'-es in
> postimage.
> 
> Nb. the prefix of combined diff would either include '+', or '-',
> but never mixed (this is documented, but I had trouble with this).
> 
> > 
> > Further changes may introduce more intelligent approach that better
> > handles combined diffs.
> 
> Very sensible approach.
> > 
> > Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
> 
>   Acked-by: Jakub Narębski <jnareb@gmail.com>
> 
> > ---
> 
> BTW. I went and checked if this approach helps for non-trivial merges
> in git.git history:
> 
> * b10656c - helps a bit, though one can see limitation of pre/post-fix
>   matching here, but it is present also for non-combined diff.
> 
> * 8b132bc - helps a bit, though char-interdiff or word-interdiff might
>   be better.  Nb. I think that red background for 'marked' is a bit
>   too dark (intensive).
> 
> * c58499c - doesn't help too much.
> 
> * f629c23, aa145bf - helps.
> 

Thanks for testing.  I'll experiment with less intensive backgrounds.

> >  gitweb/gitweb.perl |   57
> > +++++++++++++++++++++++++++++++++++++++------------ 1 files
> > changed, 43 insertions(+), 14 deletions(-)
> > 
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index 872ba12..c056e83 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -5057,12 +5057,12 @@ sub print_inline_diff_lines {
> >  # Format removed and added line, mark changed part and HTML-format
> > them. # Impementation is based on contrib/diff-highlight
> 
> Implementation
>    ^---
> 
> >  sub format_rem_add_line {
> > -	my ($rem, $add) = @_;
> > +	my ($rem, $add, $num_parents) = @_;
> >  	my @rem = split(//, $rem);
> >  	my @add = split(//, $add);
> >  	my ($esc_rem, $esc_add);
> > -	# Ignore +/- character, thus $prefix_len is set to 1.
> > -	my ($prefix_len, $suffix_len) = (1, 0);
> > +	# Ignore leading +/- characters for each parent.
> > +	my ($prefix_len, $suffix_len) = ($num_parents, 0);
> 
> Nice.
> 
> [...]
> > @@ -5099,15 +5099,43 @@ sub format_rem_add_line {
> >  
> >  # HTML-format diff context, removed and added lines.
> >  sub format_ctx_rem_add_lines {
> > -	my ($ctx, $rem, $add, $is_combined) = @_;
> > +	my ($ctx, $rem, $add, $num_parents) = @_;
> >  	my (@new_ctx, @new_rem, @new_add);
> > +	my $can_highlight = 0;
> > +	my $is_combined = ($num_parents > 1);
> >  
> >  	# Highlight if every removed line has a corresponding
> > added line.
> > -	# Combined diffs are not supported ATM.
> > -	if (!$is_combined && @$add > 0 && @$add == @$rem) {
> > +	if (@$add > 0 && @$add == @$rem) {
> > +		$can_highlight = 1;
> > +
> > +		# Highlight lines in combined diff only if the
> > chunk contains
> > +		# diff between the same version, e.g.
> > +		#
> > +		#    - a
> > +		#   -  b
> > +		#    + c
> > +		#   +  d
> > +		#
> > +		# Otherwise the highlightling would be confusing.
> > +		if ($is_combined) {
> > +			for (my $i = 0; $i < @$add; $i++) {
> > +				my $prefix_rem =
> > substr($rem->[$i], 0, $num_parents);
> > +				my $prefix_add =
> > substr($add->[$i], 0, $num_parents); +
> > +				$prefix_rem =~ s/-/+/g;
> > +
> > +				if ($prefix_rem ne $prefix_add) {
> > +					$can_highlight = 0;
> > +					last;
> > +				}
> > +			}
> > +		}
> > +	}
> 
> Good.
> 
> > +
> > +	if ($can_highlight) {
> >  		for (my $i = 0; $i < @$add; $i++) {
> >  			my ($line_rem, $line_add) =
> > format_rem_add_line(
> > -			        $rem->[$i], $add->[$i]);
> > +			        $rem->[$i], $add->[$i],
> > $num_parents); push @new_rem, $line_rem;
> >  			push @new_add, $line_add;
> >  		}
> 
> [...]
> > @@ -5326,7 +5355,7 @@ sub git_patchset_body {
> >  
> >  	} continue {
> >  		if (@chunk) {
> > -			print_diff_chunk($diff_style,
> > $is_combined, \%from, \%to, @chunk);
> > +			print_diff_chunk($diff_style, scalar
> > @hash_parents, \%from, \%to, @chunk); @chunk = ();
> >  		}
> >  		print "</div>\n"; # class="patch"
> 
> I was wondering about 'commitdiff' between two commits, which is not
> combined even ifany of those commits is a merge commit... but it looks
> like it works all right.
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
  2012-03-29 17:31     ` Michał Kiedrowicz
@ 2012-03-30 13:34       ` Jakub Narebski
  2012-03-30 13:37         ` Michal Kiedrowicz
  0 siblings, 1 reply; 30+ messages in thread
From: Jakub Narebski @ 2012-03-30 13:34 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Thu, 29 Mar 2012, Michał Kiedrowicz wrote:
> Jakub Narebski <jnareb@gmail.com> wrote:

> > BTW. I was wondering about binary files, but this commit wouldn't make
> > it worse anyway as we parse diff output assuming unified-like diff for
> > diff syntax highlighting... and binary diffs are shown as
> > 
> >   "Binary files a/foo.png and b/foo.png differ"
> > 
> > in extended diff header.
> 
> Yeah, this is what I wrote in the cover letter:
> 
> 	* Ensured that binary patches are not supported in HTML format,
> 	  thus this series canot break it :) (answering Jakub's questions)
> 
> but maybe I wasn't clear enough.

Eh, sorry, this was more of reminder to myself, rather than questioning
you if you really checked that.  Sorry for that. 

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs
  2012-03-30 13:34       ` Jakub Narebski
@ 2012-03-30 13:37         ` Michal Kiedrowicz
  0 siblings, 0 replies; 30+ messages in thread
From: Michal Kiedrowicz @ 2012-03-30 13:37 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jeff King

Jakub Narebski <jnareb@gmail.com> wrote:

> On Thu, 29 Mar 2012, Michał Kiedrowicz wrote:
> > Jakub Narebski <jnareb@gmail.com> wrote:
> 
> > > BTW. I was wondering about binary files, but this commit wouldn't
> > > make it worse anyway as we parse diff output assuming
> > > unified-like diff for diff syntax highlighting... and binary
> > > diffs are shown as
> > > 
> > >   "Binary files a/foo.png and b/foo.png differ"
> > > 
> > > in extended diff header.
> > 
> > Yeah, this is what I wrote in the cover letter:
> > 
> > 	* Ensured that binary patches are not supported in HTML
> > format, thus this series canot break it :) (answering Jakub's
> > questions)
> > 
> > but maybe I wasn't clear enough.
> 
> Eh, sorry, this was more of reminder to myself, rather than
> questioning you if you really checked that.  Sorry for that. 
> 

OK, no problem.  Thanks for your very thorough review.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines()
  2012-03-29 17:25     ` Michał Kiedrowicz
@ 2012-03-30 13:37       ` Jakub Narebski
  0 siblings, 0 replies; 30+ messages in thread
From: Jakub Narebski @ 2012-03-30 13:37 UTC (permalink / raw)
  To: Michał Kiedrowicz; +Cc: git, Jeff King

On Thu, 29 Mar 2012, Michał Kiedrowicz wrote:
> Jakub Narebski <jnareb@gmail.com> wrote:
> > On Fri, 23 Mar 2012, Michał Kiedrowicz wrote:

> > > This is a preparation patch for diff refinement highlightning.  It should
> > > not change the gitweb output, but it slightly changes its behavior.
> > > Before this commit, context is printed on the class change. Now, it'it
> > > printed just before printing added and removed lines.
> > 
> >                                                       , and at the end
> >   of chunk.

Please don't forget to add that information.  I was wondering about this,
and checked code that it works all right.

> > IMVHO such change is irrelevant.

This refers to "but it slightly changes its behavior", which might be
not clear from the position of this sentence.

> > Acked-by: Jakub Narębski <jnareb@gmail.com>
> 
> Thanks.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2012-03-30 13:37 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-23 22:56 [PATCH v2 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 1/8] gitweb: esc_html_hl_regions(): Don't create empty <span> elements Michał Kiedrowicz
2012-03-24 18:58   ` Jakub Narebski
2012-03-24 23:38     ` Michał Kiedrowicz
2012-03-29 18:04     ` [PATCH] gitweb: Use descriptive names in esc_html_hl_regions() Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 2/8] gitweb: Pass esc_html_hl_regions() options to esc_html() Michał Kiedrowicz
2012-03-24 19:15   ` Jakub Narebski
2012-03-24 23:31     ` Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 3/8] gitweb: Extract print_sidebyside_diff_lines() Michał Kiedrowicz
2012-03-28 14:33   ` Jakub Narebski
2012-03-29 17:25     ` Michał Kiedrowicz
2012-03-30 13:37       ` Jakub Narebski
2012-03-23 22:56 ` [PATCH v2 4/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs Michał Kiedrowicz
2012-03-28 15:56   ` Jakub Narebski
2012-03-29 17:31     ` Michał Kiedrowicz
2012-03-30 13:34       ` Jakub Narebski
2012-03-30 13:37         ` Michal Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 5/8] gitweb: Move HTML-formatting diff line back to process_diff_line() Michał Kiedrowicz
2012-03-29 16:14   ` Jakub Narebski
2012-03-29 16:49     ` Jakub Narebski
2012-03-29 17:36     ` Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 6/8] gitweb: Push formatting diff lines to print_diff_chunk() Michał Kiedrowicz
2012-03-29 16:59   ` Jakub Narebski
2012-03-29 17:41     ` Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 7/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
2012-03-29 19:42   ` Jakub Narebski
2012-03-29 19:59     ` Michał Kiedrowicz
2012-03-23 22:56 ` [PATCH v2 8/8] gitweb: Refinement highlightning in combined diffs Michał Kiedrowicz
2012-03-29 23:37   ` Jakub Narebski
2012-03-30  6:49     ` Michal Kiedrowicz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.