All of lore.kernel.org
 help / color / mirror / Atom feed
* [WIP-PATCH 0/2] send-email: refactor the email parser loop
@ 2016-05-27 14:01 Samuel GROOT
  2016-05-27 14:01 ` [WIP-PATCH 1/2] send-email: create email parser subroutine Samuel GROOT
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Samuel GROOT @ 2016-05-27 14:01 UTC (permalink / raw)
  To: git
  Cc: e, erwan.mathoniere, jordan.de-gea, matthieu.moy, gitster, aaron,
	Samuel GROOT # This line is ignored .

While working on the new option `quote-email`, we needed to parse an
email file. Since the work is already done, but the parsing and data
processing are in the same loop, we wanted to refactor the parser, to
make the code more maintainable.

This is still WIP, and one of the main issue (and we need your
advice on that), is that around 30 tests will fail, and most of them
are false-negatives: to pass, they rely on the format of what is
displayed by `git send-email`, not only data.


For example, several tests will fail because they do a strict compare
between `git send-email`'s output and:

   (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'
   (mbox) Adding cc: One<one@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'

Though `git send-email` now outputs something like:

   (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>'
   (mbox) Adding cc: One<one@example.com>  from line 'Cc: One<one@example.com>'

The new behavior is explained because parsing and processing are not
done in the same function, and processing cannot know the exact line
the data came from.

We can think of several solutions:

   1. Modify the parser, to give the script the knowledge of the exact
      line the data came from.

   2. Hack the tests: modify the script using the parser to artificially
      recreate the supposedly parsed line.
      (e.g. with a list.join(', ')-like function)

   3. Modify the tests to replace exact cmp by greps.


IMHO, we should consider option 3, since the tests should rely on data
rather than exact outputs. It also makes the tests more maintainable,
in the sense that they will be more resilient to future output
modifications.

What do you think?


   [WIP-PATCH 1/2] send-email: create email parser subroutine
   [WIP-PATCH 2/2] send-email: use refactored subroutine to parse patches

git-send-email.perl | 284 ++++++++++++++++++++++++++++++++++++++++++++--------------------------------
  1 file changed, 164 insertions(+), 120 deletions(-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-05-27 14:01 [WIP-PATCH 0/2] send-email: refactor the email parser loop Samuel GROOT
@ 2016-05-27 14:01 ` Samuel GROOT
  2016-05-28 15:22   ` Matthieu Moy
  2016-05-27 14:01 ` [WIP-PATCH 2/2] send-email: use refactored subroutine to parse patches Samuel GROOT
  2016-05-27 20:14 ` [WIP-PATCH 0/2] send-email: refactor the email parser loop Eric Wong
  2 siblings, 1 reply; 18+ messages in thread
From: Samuel GROOT @ 2016-05-27 14:01 UTC (permalink / raw)
  To: git
  Cc: e, erwan.mathoniere, jordan.de-gea, matthieu.moy, gitster, aaron,
	Samuel GROOT, Tom RUSSELLO

Parsing and processing in send-email is done in the same loop.

To make the code more maintainable, we create two subroutines:
- `parse_email` to separate header and body
- `parse_header` to retrieve data from header

Signed-off-by: Samuel GROOT <samuel.groot@grenoble-inp.org>
Signed-off-by: Tom RUSSELLO <tom.russello@grenoble-inp.org>
Signed-off-by: Matthieu MOY <matthieu.moy@grenoble-inp.fr>
---
 git-send-email.perl | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/git-send-email.perl b/git-send-email.perl
index 6958785..f33a083 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1750,3 +1750,108 @@ sub body_or_subject_has_nonascii {
 	}
 	return 0;
 }
+
+sub parse_email {
+	my @header = ();
+	my @body = ();
+	my $fh = shift;
+
+	# First unfold multiline header fields
+	while (<$fh>) {
+		last if /^\s*$/;
+		if (/^\s+\S/ and @header) {
+			chomp($header[$#header]);
+			s/^\s+/ /;
+			$header[$#header] .= $_;
+		} else {
+			push(@header, $_);
+		}
+	}
+
+	# Now unfold the message body
+	while (<$fh>) {
+		push @body, $_;
+	}
+
+	return (@header, @body);
+}
+
+sub parse_header {
+	# Return variables
+	my $from = undef, $subject = undef;
+	my $date = undef, $message_id = undef;
+	my @to = (), @cc = (), @xh = ();
+	my %flags = ();
+
+
+	# Internal variables
+	my $input_format = undef;
+
+	foreach(@_) {
+		if (/^From /) {
+			$input_format = 'mbox';
+			next;
+		}
+		chomp;
+		if (!defined $input_format && /^[-A-Za-z]+:\s/) {
+			$input_format = 'mbox';
+		}
+
+		if (defined $input_format && $input_format eq 'mbox') {
+			if (/^Subject:\s+(.*)$/i) {
+				$subject = $1;
+			} elsif (/^From:\s+(.*)$/i) {
+				$from = $1;
+			} elsif (/^To:\s+(.*)$/i) {
+				foreach my $addr (parse_address_line($1)) {
+					push @to, $addr;
+				}
+			} elsif (/^Cc:\s+(.*)$/i) {
+				foreach my $addr (parse_address_line($1)) {
+					push @cc, $addr;
+				}
+			} elsif (/^Content-type:/i) {
+				$flags{"has_content_type"} = 1;
+				if (/charset="?([^ "]+)/) {
+					$flags{"body_encoding"} = 1;
+				}
+				push @xh, $_;
+			} elsif (/^MIME-Version/i) {
+				$flags{"has_mime_version"} = 1;
+				push @xh, $_;
+			} elsif (/^Message-Id: (.*)/i) {
+				$message_id = $1;
+			} elsif (/^Content-Transfer-Encoding: (.*)/i) {
+				$flags{"xfer_encoding"} = $1 if not defined $flags{"xfer_encoding"};
+			} elsif (/^Date:\s(.*)$/i) {
+				$date = $1;
+			} elsif (/^[-A-Za-z]+:\s+\S/) {
+				push @xh, $_;
+			}
+
+		} else {
+			# In the traditional
+			# "send lots of email" format,
+			# line 1 = cc
+			# line 2 = subject
+			# So let's support that, too.
+			$input_format = 'lots';
+			if (@cc == 0) {
+				push @cc, $_;
+			} elsif (!defined $subject) {
+				$subject = $_;
+			}
+		}
+	}
+
+	return (
+		"from" => $from,
+		"subject" => $subject,
+		"date" => $date,
+		"message_id" => $message_id,
+		"to" => [@to],
+		"cc" => [@cc],
+		"xh" => [@xh],
+		"flags" => {%flags}
+	);
+}
-- 
2.8.2.537.gb153d2a

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [WIP-PATCH 2/2] send-email: use refactored subroutine to parse patches
  2016-05-27 14:01 [WIP-PATCH 0/2] send-email: refactor the email parser loop Samuel GROOT
  2016-05-27 14:01 ` [WIP-PATCH 1/2] send-email: create email parser subroutine Samuel GROOT
@ 2016-05-27 14:01 ` Samuel GROOT
  2016-05-27 20:14 ` [WIP-PATCH 0/2] send-email: refactor the email parser loop Eric Wong
  2 siblings, 0 replies; 18+ messages in thread
From: Samuel GROOT @ 2016-05-27 14:01 UTC (permalink / raw)
  To: git
  Cc: e, erwan.mathoniere, jordan.de-gea, matthieu.moy, gitster, aaron,
	Samuel GROOT, Tom RUSSELLO

Use the two subroutines `parse_email` and `parse_header` introduced in
previous commit to parse patches.

Signed-off-by: Samuel GROOT <samuel.groot@grenoble-inp.org>
Signed-off-by: Tom RUSSELLO <tom.russello@grenoble-inp.org>
Signed-off-by: Matthieu MOY <matthieu.moy@grenoble-inp.fr>
---
 git-send-email.perl | 179 +++++++++++++++++-----------------------------------
 1 file changed, 59 insertions(+), 120 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index f33a083..7bb4a2d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -161,7 +161,7 @@ my $re_encoded_word = qr/=\?($re_token)\?($re_token)\?($re_encoded_text)\?=/;
 # Variables we fill in automatically, or via prompting:
 my (@to,$no_to,@initial_to,@cc,$no_cc,@initial_cc,@bcclist,$no_bcc,@xh,
 	$initial_reply_to,$initial_subject,@files,
-	$author,$sender,$smtp_authpass,$annotate,$use_xmailer,$compose,$time);
+	$sender,$smtp_authpass,$annotate,$use_xmailer,$compose,$time);
 
 my $envelope_sender;
 
@@ -1431,117 +1431,57 @@ $subject = $initial_subject;
 $message_num = 0;
 
 foreach my $t (@files) {
-	open my $fh, "<", $t or die "can't open file $t";
-
-	my $author = undef;
-	my $sauthor = undef;
-	my $author_encoding;
-	my $has_content_type;
-	my $body_encoding;
-	my $xfer_encoding;
-	my $has_mime_version;
-	@to = ();
-	@cc = ();
-	@xh = ();
-	my $input_format = undef;
-	my @header = ();
 	$message = "";
 	$message_num++;
-	# First unfold multiline header fields
-	while(<$fh>) {
-		last if /^\s*$/;
-		if (/^\s+\S/ and @header) {
-			chomp($header[$#header]);
-			s/^\s+/ /;
-			$header[$#header] .= $_;
-	    } else {
-			push(@header, $_);
-		}
-	}
-	# Now parse the header
-	foreach(@header) {
-		if (/^From /) {
-			$input_format = 'mbox';
-			next;
-		}
-		chomp;
-		if (!defined $input_format && /^[-A-Za-z]+:\s/) {
-			$input_format = 'mbox';
-		}
 
-		if (defined $input_format && $input_format eq 'mbox') {
-			if (/^Subject:\s+(.*)$/i) {
-				$subject = $1;
-			}
-			elsif (/^From:\s+(.*)$/i) {
-				($author, $author_encoding) = unquote_rfc2047($1);
-				$sauthor = sanitize_address($author);
-				next if $suppress_cc{'author'};
-				next if $suppress_cc{'self'} and $sauthor eq $sender;
-				printf("(mbox) Adding cc: %s from line '%s'\n",
-					$1, $_) unless $quiet;
-				push @cc, $1;
-			}
-			elsif (/^To:\s+(.*)$/i) {
-				foreach my $addr (parse_address_line($1)) {
-					printf("(mbox) Adding to: %s from line '%s'\n",
-						$addr, $_) unless $quiet;
-					push @to, $addr;
-				}
-			}
-			elsif (/^Cc:\s+(.*)$/i) {
-				foreach my $addr (parse_address_line($1)) {
-					my $qaddr = unquote_rfc2047($addr);
-					my $saddr = sanitize_address($qaddr);
-					if ($saddr eq $sender) {
-						next if ($suppress_cc{'self'});
-					} else {
-						next if ($suppress_cc{'cc'});
-					}
-					printf("(mbox) Adding cc: %s from line '%s'\n",
-						$addr, $_) unless $quiet;
-					push @cc, $addr;
-				}
-			}
-			elsif (/^Content-type:/i) {
-				$has_content_type = 1;
-				if (/charset="?([^ "]+)/) {
-					$body_encoding = $1;
-				}
-				push @xh, $_;
-			}
-			elsif (/^MIME-Version/i) {
-				$has_mime_version = 1;
-				push @xh, $_;
-			}
-			elsif (/^Message-Id: (.*)/i) {
-				$message_id = $1;
-			}
-			elsif (/^Content-Transfer-Encoding: (.*)/i) {
-				$xfer_encoding = $1 if not defined $xfer_encoding;
-			}
-			elsif (!/^Date:\s/i && /^[-A-Za-z]+:\s+\S/) {
-				push @xh, $_;
-			}
+	# Split email into header and body
+	open my $fh, "<", $t or die "can't open file $t";
+	my (@header, @body) = parse_email($fh);
+	close $fh;
 
+	# Parse header
+	my %parsed_header = parse_header(@header);
+	my $from = $parsed_header{"from"};
+	$subject = $parsed_header{"subject"};
+	$message_id = $parsed_header{"message_id"};
+	@to = @{$parsed_header{"to"}};
+	@cc = @{$parsed_header{"cc"}};
+	@xh = @{$parsed_header{"xh"}};
+	my %flags = %{$parsed_header{"flags"}};
+
+	# Process parsed headers
+	my ($author, $author_encoding) = unquote_rfc2047($from);
+	my $sauthor = sanitize_address($author);
+	unless ($suppress_cc{'author'} or
+		($suppress_cc{'self'} and $sauthor eq $sender)) {
+		printf("(mbox) Adding cc: %s from line 'From: %s'\n",
+			$from, $from) unless $quiet;
+		push @cc, $from;
+	}
+
+	foreach (@to) {
+		printf("(mbox) Adding to: %s from line 'To: %s'\n",
+			$_, $_) unless $quiet;
+	}
+
+	my @tmpcc = ();
+	foreach (@cc) {
+		my $qaddr = unquote_rfc2047($_);
+		my $saddr = sanitize_address($qaddr);
+		if ($saddr eq $sender) {
+			next if ($suppress_cc{'self'});
 		} else {
-			# In the traditional
-			# "send lots of email" format,
-			# line 1 = cc
-			# line 2 = subject
-			# So let's support that, too.
-			$input_format = 'lots';
-			if (@cc == 0 && !$suppress_cc{'cc'}) {
-				printf("(non-mbox) Adding cc: %s from line '%s'\n",
-					$_, $_) unless $quiet;
-				push @cc, $_;
-			} elsif (!defined $subject) {
-				$subject = $_;
-			}
+			next if ($suppress_cc{'cc'});
 		}
+		printf("(mbox) Adding cc: %s from line 'Cc: %s'\n",
+			$_, $_) unless $quiet;
+		push @tmpcc, $_;
 	}
+	@cc = @tmpcc;
+
+
 	# Now parse the message body
-	while(<$fh>) {
+	foreach (@body) {
 		$message .=  $_;
 		if (/^(Signed-off-by|Cc): (.*)$/i) {
 			chomp;
@@ -1559,18 +1499,17 @@ foreach my $t (@files) {
 				$c, $_) unless $quiet;
 		}
 	}
-	close $fh;
 
 	push @to, recipients_cmd("to-cmd", "to", $to_cmd, $t)
 		if defined $to_cmd;
 	push @cc, recipients_cmd("cc-cmd", "cc", $cc_cmd, $t)
 		if defined $cc_cmd && !$suppress_cc{'cccmd'};
 
-	if ($broken_encoding{$t} && !$has_content_type) {
-		$xfer_encoding = '8bit' if not defined $xfer_encoding;
-		$has_content_type = 1;
+	if ($broken_encoding{$t} && !$flags{"has_content_type"}) {
+		$flags{"xfer_encoding"} = '8bit' if not defined $flags{"xfer_encoding"};
+		$flags{"has_content_type"} = 1;
 		push @xh, "Content-Type: text/plain; charset=$auto_8bit_encoding";
-		$body_encoding = $auto_8bit_encoding;
+		$flags{"body_encoding"} = $auto_8bit_encoding;
 	}
 
 	if ($broken_encoding{$t} && !is_rfc2047_quoted($subject)) {
@@ -1580,8 +1519,8 @@ foreach my $t (@files) {
 	if (defined $sauthor and $sauthor ne $sender) {
 		$message = "From: $author\n\n$message";
 		if (defined $author_encoding) {
-			if ($has_content_type) {
-				if ($body_encoding eq $author_encoding) {
+			if ($flags{"has_content_type"}) {
+				if ($flags{"body_encoding"} eq $author_encoding) {
 					# ok, we already have the right encoding
 				}
 				else {
@@ -1589,24 +1528,24 @@ foreach my $t (@files) {
 				}
 			}
 			else {
-				$xfer_encoding = '8bit' if not defined $xfer_encoding;
-				$has_content_type = 1;
+				$flags{"xfer_encoding"} = '8bit' if not defined $flags{"xfer_encoding"};
+				$flags{"has_content_type"} = 1;
 				push @xh,
 				  "Content-Type: text/plain; charset=$author_encoding";
 			}
 		}
 	}
 	if (defined $target_xfer_encoding) {
-		$xfer_encoding = '8bit' if not defined $xfer_encoding;
+		$flags{"xfer_encoding"} = '8bit' if not defined $flags{"xfer_encoding"};
 		$message = apply_transfer_encoding(
-			$message, $xfer_encoding, $target_xfer_encoding);
-		$xfer_encoding = $target_xfer_encoding;
+			$message, $flags{"xfer_encoding"}, $target_xfer_encoding);
+		$flags{"xfer_encoding"} = $target_xfer_encoding;
 	}
-	if (defined $xfer_encoding) {
-		push @xh, "Content-Transfer-Encoding: $xfer_encoding";
+	if (defined $flags{"xfer_encoding"}) {
+		push @xh, "Content-Transfer-Encoding: ".$flags{"xfer_encoding"};
 	}
-	if (defined $xfer_encoding or $has_content_type) {
-		unshift @xh, 'MIME-Version: 1.0' unless $has_mime_version;
+	if (defined $flags{"xfer_encoding"} or $flags{"has_content_type"}) {
+		unshift @xh, 'MIME-Version: 1.0' unless $flags{"has_mime_version"};
 	}
 
 	$needs_confirm = (
-- 
2.8.2.537.gb153d2a

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-27 14:01 [WIP-PATCH 0/2] send-email: refactor the email parser loop Samuel GROOT
  2016-05-27 14:01 ` [WIP-PATCH 1/2] send-email: create email parser subroutine Samuel GROOT
  2016-05-27 14:01 ` [WIP-PATCH 2/2] send-email: use refactored subroutine to parse patches Samuel GROOT
@ 2016-05-27 20:14 ` Eric Wong
  2016-05-28 15:04   ` Matthieu Moy
  2 siblings, 1 reply; 18+ messages in thread
From: Eric Wong @ 2016-05-27 20:14 UTC (permalink / raw)
  To: Samuel GROOT
  Cc: git, erwan.mathoniere, jordan.de-gea, matthieu.moy, gitster, aaron

Samuel GROOT <samuel.groot@grenoble-inp.org> wrote:
> While working on the new option `quote-email`, we needed to parse an
> email file. Since the work is already done, but the parsing and data
> processing are in the same loop, we wanted to refactor the parser, to
> make the code more maintainable.

Thank you for doing this work :)

> This is still WIP, and one of the main issue (and we need your
> advice on that), is that around 30 tests will fail, and most of them
> are false-negatives: to pass, they rely on the format of what is
> displayed by `git send-email`, not only data.
> 
> 
> For example, several tests will fail because they do a strict compare
> between `git send-email`'s output and:
> 
>    (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'
>    (mbox) Adding cc: One<one@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'
> 
> Though `git send-email` now outputs something like:
> 
>    (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>'
>    (mbox) Adding cc: One<one@example.com>  from line 'Cc: One<one@example.com>'
I actually like neither, and would prefer something shorter:

    (mbox) Adding cc: A <author@example.com> from Cc: header
    (mbox) Adding cc: One <one@example.com> from Cc: header
    (mbox) Adding cc: SoB <SoB@example.com> from Signed-off-by: trailer

That way, there's no redundant addresses in each line and less
likely to wrap.

But I actually never noticed these lines in the past since they
scrolled off my terminal :x

Since the headers are already shown after those lines, it's
redundant to have the entire line.  And we could add
trailers after the headers (with a blank line to delimit):

    # existing header output:
    From: F <F@example.com>
    Cc: A <author@example.com>, One <one@example.com>
    Subject: foo

    # new trailer output
    Signed-off-by: SoB <SoB@example.com>
    Acked-by: ack <ack@example.com>

> We can think of several solutions:
> 
>    1. Modify the parser, to give the script the knowledge of the exact
>       line the data came from.
> 
>    2. Hack the tests: modify the script using the parser to artificially
>       recreate the supposedly parsed line.
>       (e.g. with a list.join(', ')-like function)
> 
>    3. Modify the tests to replace exact cmp by greps.
> 
> 
> IMHO, we should consider option 3, since the tests should rely on data
> rather than exact outputs. It also makes the tests more maintainable,
> in the sense that they will be more resilient to future output
> modifications.

Agreed on 3.

I am not sure if anybody outside of tests parses the stdout of
send-email.  It's certainly a porcelain and I don't think
stdout needs to be stable, and maybe the output in
question should go to stderr since it could be considered
debug output.

But I could be wrong...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-27 20:14 ` [WIP-PATCH 0/2] send-email: refactor the email parser loop Eric Wong
@ 2016-05-28 15:04   ` Matthieu Moy
  2016-05-29 17:21     ` Samuel GROOT
  0 siblings, 1 reply; 18+ messages in thread
From: Matthieu Moy @ 2016-05-28 15:04 UTC (permalink / raw)
  To: Eric Wong
  Cc: Samuel GROOT, git, erwan.mathoniere, jordan.de-gea, gitster, aaron

Eric Wong <e@80x24.org> writes:

> Samuel GROOT <samuel.groot@grenoble-inp.org> wrote:
>
>>    (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'
>>    (mbox) Adding cc: One<one@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'
>> 
>> Though `git send-email` now outputs something like:
>> 
>>    (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>'
>>    (mbox) Adding cc: One<one@example.com>  from line 'Cc: One<one@example.com>'
> I actually like neither, and would prefer something shorter:
>
>     (mbox) Adding cc: A <author@example.com> from Cc: header
>     (mbox) Adding cc: One <one@example.com> from Cc: header
>     (mbox) Adding cc: SoB <SoB@example.com> from Signed-off-by: trailer
>
> That way, there's no redundant addresses in each line and less
> likely to wrap.

I agree with this. Actually, I'd even say that the current output of
"git send-email" looks sloppy, and internal refactoring may be a good
opportunity to get it cleaner.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-05-27 14:01 ` [WIP-PATCH 1/2] send-email: create email parser subroutine Samuel GROOT
@ 2016-05-28 15:22   ` Matthieu Moy
  2016-05-28 23:33     ` Eric Wong
  0 siblings, 1 reply; 18+ messages in thread
From: Matthieu Moy @ 2016-05-28 15:22 UTC (permalink / raw)
  To: Samuel GROOT
  Cc: git, e, erwan.mathoniere, jordan.de-gea, gitster, aaron, Tom RUSSELLO

Samuel GROOT <samuel.groot@grenoble-inp.org> writes:

> Parsing and processing in send-email is done in the same loop.
>
> To make the code more maintainable, we create two subroutines:
> - `parse_email` to separate header and body
> - `parse_header` to retrieve data from header

These routines are not specific to git send-email, nor to Git.

Does it make sense to use an external library, like
http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm ,
either by depending on it, or by copying it in Git's source tree ?

If not, I think it would be better to introduce an email parsing library
in a dedicated Perl module in perl/ in our source tree, to keep
git-send-email.perl more focused on the "send-email" logic.

> +sub parse_email {
> +	my @header = ();
> +	my @body = ();
> +	my $fh = shift;
> +
> +	# First unfold multiline header fields
> +	while (<$fh>) {
> +		last if /^\s*$/;
> +		if (/^\s+\S/ and @header) {
> +			chomp($header[$#header]);
> +			s/^\s+/ /;
> +			$header[$#header] .= $_;
> +		} else {
> +			push(@header, $_);
> +		}
> +	}
> +
> +	# Now unfold the message body

Why "unfold"? Don't you mean "split message body into a list of lines"?

> +	while (<$fh>) {
> +		push @body, $_;
> +	}
> +
> +	return (@header, @body);
> +}

Please document your functions. See e.g. perl/Git.pm for an example of
what perldoc allows you to do.

This also lacks tests. One advantage of having a clean API is that it
also makes it simpler to do unit-testing. Grep "Test::More" in t/ to see
some existing unit-tests in Perl.

> +	foreach(@_) {

Style: space before (.

> +		if (defined $input_format && $input_format eq 'mbox') {
> +			if (/^Subject:\s+(.*)$/i) {
> +				$subject = $1;
> +			} elsif (/^From:\s+(.*)$/i) {
> +				$from = $1;

Not sure we need thes if/elsif/ for generic headers. Email::Simple's API
seems much simpler and general: $email->header("From");

> +				foreach my $addr (parse_address_line($1)) {
> +					push @to, $addr;
> +				}

3 lines for an array concatenation in a high-level language. It looks
like 2 more than needed ;-).

> +			}
> +
> +		} else {

Useless blank line.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-05-28 15:22   ` Matthieu Moy
@ 2016-05-28 23:33     ` Eric Wong
  2016-05-29 17:15       ` Samuel GROOT
  2016-06-02 16:57       ` Samuel GROOT
  0 siblings, 2 replies; 18+ messages in thread
From: Eric Wong @ 2016-05-28 23:33 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Samuel GROOT, git, erwan.mathoniere, jordan.de-gea, gitster,
	aaron, Tom RUSSELLO

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
> Samuel GROOT <samuel.groot@grenoble-inp.org> writes:
> 
> > Parsing and processing in send-email is done in the same loop.
> >
> > To make the code more maintainable, we create two subroutines:
> > - `parse_email` to separate header and body
> > - `parse_header` to retrieve data from header
> 
> These routines are not specific to git send-email, nor to Git.
> 
> Does it make sense to use an external library, like
> http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm ,
> either by depending on it, or by copying it in Git's source tree ?

That might be overkill and increase installation/maintenance
burden.  Bundling it would probably be problematic to distros,
too.

> If not, I think it would be better to introduce an email parsing library
> in a dedicated Perl module in perl/ in our source tree, to keep
> git-send-email.perl more focused on the "send-email" logic.

Sounds good, Git.pm already has parse_mailboxes

> > +sub parse_email {
> > +	my @header = ();
> > +	my @body = ();
> > +	my $fh = shift;
> > +
> > +	# First unfold multiline header fields
> > +	while (<$fh>) {
> > +		last if /^\s*$/;
> > +		if (/^\s+\S/ and @header) {
> > +			chomp($header[$#header]);
> > +			s/^\s+/ /;
> > +			$header[$#header] .= $_;
> > +		} else {
> > +			push(@header, $_);
> > +		}
> > +	}
> > +
> > +	# Now unfold the message body
> 
> Why "unfold"? Don't you mean "split message body into a list of lines"?
> 
> > +	while (<$fh>) {
> > +		push @body, $_;
> > +	}

I'd rather avoid the loops entirely and do this:

	local $/ = "\n"; # in case caller clobbers $/
	@body = (<$fh>);

> > +	return (@header, @body);
> > +}


> > +		if (defined $input_format && $input_format eq 'mbox') {
> > +			if (/^Subject:\s+(.*)$/i) {
> > +				$subject = $1;
> > +			} elsif (/^From:\s+(.*)$/i) {
> > +				$from = $1;
> 
> Not sure we need thes if/elsif/ for generic headers. Email::Simple's API
> seems much simpler and general: $email->header("From");

Right.  Reading this, it would've been easier to parse headers into a
hash (normalized keys to lowercase) up front inside parse_email.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-05-28 23:33     ` Eric Wong
@ 2016-05-29 17:15       ` Samuel GROOT
  2016-05-29 17:53         ` Matthieu Moy
  2016-06-02 16:57       ` Samuel GROOT
  1 sibling, 1 reply; 18+ messages in thread
From: Samuel GROOT @ 2016-05-29 17:15 UTC (permalink / raw)
  To: Eric Wong, Matthieu Moy
  Cc: git, erwan.mathoniere, jordan.de-gea, gitster, aaron, Tom RUSSELLO

On 05/29/2016 01:33 AM, Eric Wong wrote:
> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
>> Samuel GROOT <samuel.groot@grenoble-inp.org> writes:
>>
>>> Parsing and processing in send-email is done in the same loop.
>>>
>>> To make the code more maintainable, we create two subroutines:
>>> - `parse_email` to separate header and body
>>> - `parse_header` to retrieve data from header
>>
>> These routines are not specific to git send-email, nor to Git.
>>
>> Does it make sense to use an external library, like
>> http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm ,
>> either by depending on it, or by copying it in Git's source tree ?
>
> That might be overkill and increase installation/maintenance
> burden.  Bundling it would probably be problematic to distros,
> too.

I have no opinion on that topic, but it could be interesting to have 
other opinions. For the first patch I thought it would be easier and 
quicker to use code already written, and maybe use another method in the 
next iteration.

Email::Simple is licensed under Perl's Artistic License or GPL (v1 or 
any later version), so it's fine to bundle it.

>> If not, I think it would be better to introduce an email parsing library
>> in a dedicated Perl module in perl/ in our source tree, to keep
>> git-send-email.perl more focused on the "send-email" logic.
>
> Sounds good, Git.pm already has parse_mailboxes

I agree, I will look into that.

>>> +sub parse_email {
>>> +	my @header = ();
>>> +	my @body = ();
>>> +	my $fh = shift;
>>> +
>>> +	# First unfold multiline header fields
>>> +	while (<$fh>) {
>>> +		last if /^\s*$/;
>>> +		if (/^\s+\S/ and @header) {
>>> +			chomp($header[$#header]);
>>> +			s/^\s+/ /;
>>> +			$header[$#header] .= $_;
>>> +		} else {
>>> +			push(@header, $_);
>>> +		}
>>> +	}
>>> +
>>> +	# Now unfold the message body
>>
>> Why "unfold"? Don't you mean "split message body into a list of lines"?
>>
>>> +	while (<$fh>) {
>>> +		push @body, $_;
>>> +	}
>
> I'd rather avoid the loops entirely and do this:
>
> 	local $/ = "\n"; # in case caller clobbers $/
> 	@body = (<$fh>);

I didn't know this method before, thanks for suggesting it!

>>> +	return (@header, @body);
>>> +}
>
>
>>> +		if (defined $input_format && $input_format eq 'mbox') {
>>> +			if (/^Subject:\s+(.*)$/i) {
>>> +				$subject = $1;
>>> +			} elsif (/^From:\s+(.*)$/i) {
>>> +				$from = $1;
>>
>> Not sure we need thes if/elsif/ for generic headers. Email::Simple's API
>> seems much simpler and general: $email->header("From");
>
> Right.  Reading this, it would've been easier to parse headers into a
> hash (normalized keys to lowercase) up front inside parse_email.

So should we merge parse_email and parse_header in one unique subroutine?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-28 15:04   ` Matthieu Moy
@ 2016-05-29 17:21     ` Samuel GROOT
  2016-05-29 18:05       ` Matthieu Moy
  0 siblings, 1 reply; 18+ messages in thread
From: Samuel GROOT @ 2016-05-29 17:21 UTC (permalink / raw)
  To: Matthieu Moy, Eric Wong
  Cc: git, erwan.mathoniere, jordan.de-gea, gitster, aaron

On 05/28/2016 05:04 PM, Matthieu Moy wrote:
> Eric Wong <e@80x24.org> writes:
>
>> Samuel GROOT <samuel.groot@grenoble-inp.org> wrote:
>>
>>>    (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'
>>>    (mbox) Adding cc: One<one@example.com>  from line 'Cc: A<author@example.com>, One<one@example.com>'
>>>
>>> Though `git send-email` now outputs something like:
>>>
>>>    (mbox) Adding cc: A<author@example.com>  from line 'Cc: A<author@example.com>'
>>>    (mbox) Adding cc: One<one@example.com>  from line 'Cc: One<one@example.com>'
>> I actually like neither, and would prefer something shorter:
>>
>>     (mbox) Adding cc: A <author@example.com> from Cc: header
>>     (mbox) Adding cc: One <one@example.com> from Cc: header
>>     (mbox) Adding cc: SoB <SoB@example.com> from Signed-off-by: trailer
>>
>> That way, there's no redundant addresses in each line and less
>> likely to wrap.
>
> I agree with this. Actually, I'd even say that the current output of
> "git send-email" looks sloppy, and internal refactoring may be a good
> opportunity to get it cleaner.

I agree.

Should we take what Eric suggested (see below) as standard output?

> Since the headers are already shown after those lines, it's
> redundant to have the entire line.  And we could add
> trailers after the headers (with a blank line to delimit):
>
>     # existing header output:
>     From: F <F@example.com>
>     Cc: A <author@example.com>, One <one@example.com>
>     Subject: foo
>
>     # new trailer output
>     Signed-off-by: SoB <SoB@example.com>
>     Acked-by: ack <ack@example.com>

And keep "(mbox) Adding ..." lines as error output, or maybe displayable 
by a new option `--verbose`?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-05-29 17:15       ` Samuel GROOT
@ 2016-05-29 17:53         ` Matthieu Moy
  2016-05-30 13:28           ` Samuel GROOT
  0 siblings, 1 reply; 18+ messages in thread
From: Matthieu Moy @ 2016-05-29 17:53 UTC (permalink / raw)
  To: Samuel GROOT
  Cc: Eric Wong, git, erwan.mathoniere, jordan.de-gea, gitster, aaron,
	Tom RUSSELLO

Samuel GROOT <samuel.groot@grenoble-inp.org> writes:

> So should we merge parse_email and parse_header in one unique
> subroutine?

At least on the user (i.e. caller of the API) side, one function is
probably enough.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-29 17:21     ` Samuel GROOT
@ 2016-05-29 18:05       ` Matthieu Moy
  2016-05-30 14:01         ` Samuel GROOT
  0 siblings, 1 reply; 18+ messages in thread
From: Matthieu Moy @ 2016-05-29 18:05 UTC (permalink / raw)
  To: Samuel GROOT
  Cc: Eric Wong, git, erwan.mathoniere, jordan.de-gea, gitster, aaron

Samuel GROOT <samuel.groot@grenoble-inp.org> writes:

> Should we take what Eric suggested (see below) as standard output?
>
>> Since the headers are already shown after those lines, it's
>> redundant to have the entire line.  And we could add
>> trailers after the headers (with a blank line to delimit):
>>
>>     # existing header output:
>>     From: F <F@example.com>
>>     Cc: A <author@example.com>, One <one@example.com>
>>     Subject: foo
>>
>>     # new trailer output
>>     Signed-off-by: SoB <SoB@example.com>
>>     Acked-by: ack <ack@example.com>

I don't think adding the trailers in the output is needed. If the
message says

  Adding foo@example from Signed-off-by trailer

I can guess that it's from "Signed-off-by: foo@example" without having
it explicitly.

Perhaps others think differently, but for me, the shortest output would
be the better (if only to solve the "I never see these lines, they
scrolled out of my terminal" issue).

Ideally, I'd even like to shorten the current output a bit more (the
X-Mailer: header is useless IMHO, and the Date: and Message-id: headers
are probably not useful enough to be shown by default).

(Just thinking aloud, obviously none of this should be a prerequisite to
accept your refactoring patch)

> And keep "(mbox) Adding ..." lines as error output, or maybe
> displayable by a new option `--verbose`?

I think the "Adding ..." lines make sense by default at least for
beginners (just a few days ago, I received a bunch of test emails by
your team follow by a "Oops, I just noticed you got Cc-ed in my tests"
message ;-), that would probably have been worse without the message).
There could be an advice.* option to deactivate them, though.

The (mbox) prefix doesn't seem useful to me OTOH, I think it's a
leftover debugging message.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-05-29 17:53         ` Matthieu Moy
@ 2016-05-30 13:28           ` Samuel GROOT
  0 siblings, 0 replies; 18+ messages in thread
From: Samuel GROOT @ 2016-05-30 13:28 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Eric Wong, git, erwan.mathoniere, jordan.de-gea, gitster, aaron,
	Tom RUSSELLO

On 05/29/2016 07:53 PM, Matthieu Moy wrote:
> Samuel GROOT <samuel.groot@grenoble-inp.org> writes:
>
>> So should we merge parse_email and parse_header in one unique
>> subroutine?
>
> At least on the user (i.e. caller of the API) side, one function is
> probably enough.
>

I will do that, as soon as we decide what's best between including 
Email::Simple library or making our own API.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-29 18:05       ` Matthieu Moy
@ 2016-05-30 14:01         ` Samuel GROOT
  2016-05-30 14:20           ` Matthieu Moy
  0 siblings, 1 reply; 18+ messages in thread
From: Samuel GROOT @ 2016-05-30 14:01 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Eric Wong, git, erwan.mathoniere, jordan.de-gea, gitster, aaron

On 05/29/2016 08:05 PM, Matthieu Moy wrote:
> Samuel GROOT <samuel.groot@grenoble-inp.org> writes:
>
>> Should we take what Eric suggested (see below) as standard output?
>>
>>> Since the headers are already shown after those lines, it's
>>> redundant to have the entire line.  And we could add
>>> trailers after the headers (with a blank line to delimit):
>>>
>>>     # existing header output:
>>>     From: F <F@example.com>
>>>     Cc: A <author@example.com>, One <one@example.com>
>>>     Subject: foo
>>>
>>>     # new trailer output
>>>     Signed-off-by: SoB <SoB@example.com>
>>>     Acked-by: ack <ack@example.com>
>
> I don't think adding the trailers in the output is needed. If the
> message says
>
>   Adding foo@example from Signed-off-by trailer
>
> I can guess that it's from "Signed-off-by: foo@example" without having
> it explicitly.
>
> Perhaps others think differently, but for me, the shortest output would
> be the better (if only to solve the "I never see these lines, they
> scrolled out of my terminal" issue).

I agree, the shorter the better.

> Ideally, I'd even like to shorten the current output a bit more (the
> X-Mailer: header is useless IMHO, and the Date: and Message-id: headers
> are probably not useful enough to be shown by default).

Agreed.

> (Just thinking aloud, obviously none of this should be a prerequisite to
> accept your refactoring patch)
>
>> And keep "(mbox) Adding ..." lines as error output, or maybe
>> displayable by a new option `--verbose`?
>
> I think the "Adding ..." lines make sense by default at least for
> beginners (just a few days ago, I received a bunch of test emails by
> your team follow by a "Oops, I just noticed you got Cc-ed in my tests"
> message ;-), that would probably have been worse without the message).
> There could be an advice.* option to deactivate them, though.

An advice.* option seems a good solution to me.

> The (mbox) prefix doesn't seem useful to me OTOH, I think it's a
> leftover debugging message.

(mbox) prefix was introduced by Ryan Anderson in 2005 (can't find the 
exact commit though), in opposition with the (non-mbox) format ("lots of 
email") that was used before.

Is the "lots of email" format still used?

When adding Cc: from a Signed-off-by: field, (body) prefix is used.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-30 14:01         ` Samuel GROOT
@ 2016-05-30 14:20           ` Matthieu Moy
  2016-05-30 18:28             ` Samuel GROOT
  0 siblings, 1 reply; 18+ messages in thread
From: Matthieu Moy @ 2016-05-30 14:20 UTC (permalink / raw)
  To: Samuel GROOT
  Cc: Eric Wong, git, erwan.mathoniere, jordan.de-gea, gitster, aaron

Samuel GROOT <samuel.groot@grenoble-inp.org> writes:

> (mbox) prefix was introduced by Ryan Anderson in 2005 (can't find the
> exact commit though), in opposition with the (non-mbox) format ("lots
> of email") that was used before.

That is actually from the original commit introducing send-email:
83b2443 ([PATCH] Add git-send-email-script - tool to send emails from
git-format-patch-script, 2005-07-31), i.e. ~3 month after Git was born.

At that time, user-friendlyness was not really a priority ;-).

> Is the "lots of email" format still used?

AFAICT, it was initially supported for backward compatibility, and then
no one removed it, but I wouldn't be surprised if no one actually used
it.

I vaguely remember a message from Ryan Anderson being surprised to see
the old format still supported, but I can't find it in the archives.

In any case:

- git log --grep 'lots of email' => shows only 83b2443
- git log -S'lots of email' => likewise
- git grep 'lots of email' => just one answer in a comment

I'm not sure the feature is even tested.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-30 14:20           ` Matthieu Moy
@ 2016-05-30 18:28             ` Samuel GROOT
  2016-05-30 19:29               ` Matthieu Moy
  0 siblings, 1 reply; 18+ messages in thread
From: Samuel GROOT @ 2016-05-30 18:28 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Eric Wong, git, erwan.mathoniere, jordan.de-gea, gitster, aaron

On 05/30/2016 04:20 PM, Matthieu Moy wrote:
>> Is the "lots of email" format still used?
>
> AFAICT, it was initially supported for backward compatibility, and then
> no one removed it, but I wouldn't be surprised if no one actually used
> it.
>
> I vaguely remember a message from Ryan Anderson being surprised to see
> the old format still supported, but I can't find it in the archives.
>
> In any case:
>
> - git log --grep 'lots of email' => shows only 83b2443
> - git log -S'lots of email' => likewise
> - git grep 'lots of email' => just one answer in a comment
>
> I'm not sure the feature is even tested.

`grep "non-mbox" t/t9001-send-email.sh` didn't return anything, 
apparently it's not tested.

Can we consider this feature obsolete and remove it?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
  2016-05-30 18:28             ` Samuel GROOT
@ 2016-05-30 19:29               ` Matthieu Moy
  0 siblings, 0 replies; 18+ messages in thread
From: Matthieu Moy @ 2016-05-30 19:29 UTC (permalink / raw)
  To: Samuel GROOT
  Cc: Eric Wong, git, erwan.mathoniere, jordan.de-gea, gitster, aaron

Samuel GROOT <samuel.groot@grenoble-inp.org> writes:

> Can we consider this feature obsolete and remove it?

We're usually quite conservative with backward compatibility. If we
remove the feature, we may want to announce it in the next feature
release and actually remove it in the one after (unless we get valid
objection in the meantime).

I'm all for dropping a feature that no one uses if it turns out to be
the case.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-05-28 23:33     ` Eric Wong
  2016-05-29 17:15       ` Samuel GROOT
@ 2016-06-02 16:57       ` Samuel GROOT
  2016-06-02 19:58         ` Eric Wong
  1 sibling, 1 reply; 18+ messages in thread
From: Samuel GROOT @ 2016-06-02 16:57 UTC (permalink / raw)
  To: Eric Wong, Matthieu Moy
  Cc: git, erwan.mathoniere, jordan.de-gea, gitster, aaron, Tom RUSSELLO

On 05/29/2016 01:33 AM, Eric Wong wrote:
> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
>> Samuel GROOT <samuel.groot@grenoble-inp.org> writes:
>>
>>> Parsing and processing in send-email is done in the same loop.
>>>
>>> To make the code more maintainable, we create two subroutines:
>>> - `parse_email` to separate header and body
>>> - `parse_header` to retrieve data from header
>>
>> These routines are not specific to git send-email, nor to Git.
>>
>> Does it make sense to use an external library, like
>> http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm ,
>> either by depending on it, or by copying it in Git's source tree ?
>
> That might be overkill and increase installation/maintenance
> burden.  Bundling it would probably be problematic to distros,
> too.

We have 5 solutions here:

   1. Make a new dependence to Email::Simple.

   2. Bundle Email::Simple in Git's source tree.

   3. Use Email::Simple if installed, else use our library.

   4. Making our own email parser library.

   5. Duplicate parser loop as we did for our patch to implement
      `--quote-email` as proposed in $gmane/295772 .

Obviously, option (5) is the easiest one for us, but it leaves 
refactoring for later, and option (1) is also easier but adds a new 
dependence which is not that good.

Since our project ends next week, we might not have enough time to 
finish developing a custom parser API so (4) is not a viable option for 
now but could be done in the future.

We could consider bundling Email::Simple as the best option, as it's 
developed since 2003 and might be safer to use than anything we could 
write in several weeks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
  2016-06-02 16:57       ` Samuel GROOT
@ 2016-06-02 19:58         ` Eric Wong
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Wong @ 2016-06-02 19:58 UTC (permalink / raw)
  To: Samuel GROOT
  Cc: Matthieu Moy, git, erwan.mathoniere, jordan.de-gea, gitster,
	aaron, Tom RUSSELLO, Jonathan Nieder

Samuel GROOT <samuel.groot@grenoble-inp.org> wrote:
> On 05/29/2016 01:33 AM, Eric Wong wrote:
> >Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
> >>Samuel GROOT <samuel.groot@grenoble-inp.org> writes:
> >>
> >>>Parsing and processing in send-email is done in the same loop.
> >>>
> >>>To make the code more maintainable, we create two subroutines:
> >>>- `parse_email` to separate header and body
> >>>- `parse_header` to retrieve data from header
> >>
> >>These routines are not specific to git send-email, nor to Git.
> >>
> >>Does it make sense to use an external library, like
> >>http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm ,
> >>either by depending on it, or by copying it in Git's source tree ?
> >
> >That might be overkill and increase installation/maintenance
> >burden.  Bundling it would probably be problematic to distros,
> >too.
> 
> We have 5 solutions here:
> 
>   1. Make a new dependence to Email::Simple.
> 
>   2. Bundle Email::Simple in Git's source tree.
> 
>   3. Use Email::Simple if installed, else use our library.
> 
>   4. Making our own email parser library.
> 
>   5. Duplicate parser loop as we did for our patch to implement
>      `--quote-email` as proposed in $gmane/295772 .
> 
> Obviously, option (5) is the easiest one for us, but it leaves refactoring
> for later, and option (1) is also easier but adds a new dependence which is
> not that good.

I would go with (5) for now and leave (4) for later (which
might just be moving the function to a new file).

> Since our project ends next week, we might not have enough time to finish
> developing a custom parser API so (4) is not a viable option for now but
> could be done in the future.
> 
> We could consider bundling Email::Simple as the best option, as it's
> developed since 2003 and might be safer to use than anything we could write
> in several weeks.

In an ideal world, (1) would be nice.  But (IMHO) git-send-email
should remain installable on non-ideal systems which do not
provide Email::Simple as a package.

(2) would probably be non-ideal for distro maintainers
(+Cc: Jonathan for opinions), and (3) is the most complex
and difficult-to-support.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-06-02 19:58 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-27 14:01 [WIP-PATCH 0/2] send-email: refactor the email parser loop Samuel GROOT
2016-05-27 14:01 ` [WIP-PATCH 1/2] send-email: create email parser subroutine Samuel GROOT
2016-05-28 15:22   ` Matthieu Moy
2016-05-28 23:33     ` Eric Wong
2016-05-29 17:15       ` Samuel GROOT
2016-05-29 17:53         ` Matthieu Moy
2016-05-30 13:28           ` Samuel GROOT
2016-06-02 16:57       ` Samuel GROOT
2016-06-02 19:58         ` Eric Wong
2016-05-27 14:01 ` [WIP-PATCH 2/2] send-email: use refactored subroutine to parse patches Samuel GROOT
2016-05-27 20:14 ` [WIP-PATCH 0/2] send-email: refactor the email parser loop Eric Wong
2016-05-28 15:04   ` Matthieu Moy
2016-05-29 17:21     ` Samuel GROOT
2016-05-29 18:05       ` Matthieu Moy
2016-05-30 14:01         ` Samuel GROOT
2016-05-30 14:20           ` Matthieu Moy
2016-05-30 18:28             ` Samuel GROOT
2016-05-30 19:29               ` Matthieu Moy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.