From: Samuel GROOT <samuel.groot@grenoble-inp.org>
To: Eric Wong <e@80x24.org>, Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
Cc: git@vger.kernel.org, erwan.mathoniere@grenoble-inp.org,
	jordan.de-gea@grenoble-inp.org, gitster@pobox.com,
	aaron@schrab.com, Tom RUSSELLO <tom.russello@grenoble-inp.org>
Subject: Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
Date: Sun, 29 May 2016 19:15:10 +0200
Message-ID: <8904f487-985c-bd2d-a8d1-4a712c6ef558@grenoble-inp.org>
In-Reply-To: <20160528233329.GA1132@dcvr.yhbt.net>

On 05/29/2016 01:33 AM, Eric Wong wrote:
> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
>> Samuel GROOT <samuel.groot@grenoble-inp.org> writes:
>>
>>> Parsing and processing in send-email is done in the same loop.
>>>
>>> To make the code more maintainable, we create two subroutines:
>>> - `parse_email` to separate header and body
>>> - `parse_header` to retrieve data from header
>>
>> These routines are not specific to git send-email, nor to Git.
>>
>> Does it make sense to use an external library, like
>> http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm ,
>> either by depending on it, or by copying it in Git's source tree?
>
> That might be overkill and increase installation/maintenance
> burden.  Bundling it would probably be problematic to distros,
> too.

I have no strong opinion on this, but I would be interested to hear
what others think. For the first patch I thought it would be easier and
quicker to reuse the code already written, and possibly switch to
another approach in the next iteration.

Email::Simple is licensed under Perl's Artistic License or the GPL (v1
or any later version), so licensing should not prevent bundling it.

>> If not, I think it would be better to introduce an email parsing library
>> in a dedicated Perl module in perl/ in our source tree, to keep
>> git-send-email.perl more focused on the "send-email" logic.
>
> Sounds good, Git.pm already has parse_mailboxes

I agree, I will look into that.

>>> +sub parse_email {
>>> +	my @header = ();
>>> +	my @body = ();
>>> +	my $fh = shift;
>>> +
>>> +	# First unfold multiline header fields
>>> +	while (<$fh>) {
>>> +		last if /^\s*$/;
>>> +		if (/^\s+\S/ and @header) {
>>> +			chomp($header[$#header]);
>>> +			s/^\s+/ /;
>>> +			$header[$#header] .= $_;
>>> +		} else {
>>> +			push(@header, $_);
>>> +		}
>>> +	}
>>> +
>>> +	# Now unfold the message body
>>
>> Why "unfold"? Don't you mean "split message body into a list of lines"?
>>
>>> +	while (<$fh>) {
>>> +		push @body, $_;
>>> +	}
>
> I'd rather avoid the loops entirely and do this:
>
> 	local $/ = "\n"; # in case caller clobbers $/
> 	@body = (<$fh>);

I didn't know about this idiom, thanks for suggesting it!
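
For reference, here is a minimal sketch of how the suggestion could fit
into the subroutine. It is only an illustration, not the patch itself:
the name read_header_and_body is hypothetical, and returning array
references instead of the flat list used in the patch is an assumption
made here so that the caller can tell the header from the body.

```perl
use strict;
use warnings;

# Hypothetical helper (not the code from the patch): collect the unfolded
# header lines, then read the rest of the message as the body.
sub read_header_and_body {
	my ($fh) = @_;
	local $/ = "\n"; # in case the caller clobbered $/
	my @header;

	while (my $line = <$fh>) {
		last if $line =~ /^\s*$/; # a blank line ends the header
		if ($line =~ /^\s+\S/ and @header) {
			# continuation line: append it to the previous field
			chomp($header[-1]);
			$line =~ s/^\s+/ /;
			$header[-1] .= $line;
		} else {
			push @header, $line;
		}
	}

	# readline in list context returns all remaining lines at once
	my @body = <$fh>;

	# return references so the two lists are not flattened into one
	return (\@header, \@body);
}

# Usage with an in-memory filehandle (sample data is made up)
my $message = "Subject: a long\n subject line\n"
	. "From: A U Thor <author\@example.com>\n"
	. "\n"
	. "line 1\nline 2\n";
open(my $fh, '<', \$message) or die "cannot open in-memory file: $!";
my ($header, $body) = read_header_and_body($fh);
close($fh);
```

With `return (@header, @body)` the two arrays would be merged into a
single list, which is why the sketch returns references.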

>>> +	return (@header, @body);
>>> +}
>
>
>>> +		if (defined $input_format && $input_format eq 'mbox') {
>>> +			if (/^Subject:\s+(.*)$/i) {
>>> +				$subject = $1;
>>> +			} elsif (/^From:\s+(.*)$/i) {
>>> +				$from = $1;
>>
>> Not sure we need these if/elsif for generic headers. Email::Simple's API
>> seems much simpler and general: $email->header("From");
>
> Right.  Reading this, it would've been easier to parse headers into a
> hash (normalized keys to lowercase) up front inside parse_email.

So should we merge parse_email and parse_header into a single subroutine?
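
A rough sketch of what such a merged subroutine might look like, to
make the question concrete. Everything here is an assumption rather
than a proposal for the final code: the name parse_email_sketch is
hypothetical, and storing each field as a list of values under a
lowercased key is just one possible way to handle repeated fields
such as Cc.

```perl
use strict;
use warnings;

# Hypothetical merged parser (an illustration, not the patch): returns a
# hash of header fields keyed by lowercased name, plus the body lines.
sub parse_email_sketch {
	my ($fh) = @_;
	local $/ = "\n"; # in case the caller clobbered $/
	my (%header, $last_key);

	while (my $line = <$fh>) {
		last if $line =~ /^\s*$/; # a blank line ends the header
		chomp $line;
		if ($line =~ /^\s+(\S.*)$/ and defined $last_key) {
			# continuation line: unfold into the latest value of that field
			$header{$last_key}[-1] .= " $1";
		} elsif ($line =~ /^([^\s:]+):\s*(.*)$/) {
			$last_key = lc $1;
			# keep every occurrence, since a field may appear several times
			push @{$header{$last_key}}, $2;
		}
	}

	# the rest of the file is the body, one line per element
	my @body = <$fh>;
	return (\%header, \@body);
}

# Usage with an in-memory filehandle (sample data is made up)
my $raw = "From: A U Thor <author\@example.com>\n"
	. "CC: one\@example.com\n"
	. "Cc: two\@example.com\n"
	. "Subject: first\n second\n"
	. "\n"
	. "body\n";
open(my $in, '<', \$raw) or die "cannot open in-memory file: $!";
my ($fields, $lines) = parse_email_sketch($in);
close($in);
print "Subject: $fields->{subject}[0]\n";
```

The caller would then look up $fields->{from}, $fields->{cc} and so on
instead of going through an if/elsif chain for each header.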


Thread overview: 18+ messages
2016-05-27 14:01 [WIP-PATCH 0/2] send-email: refactor the email parser loop Samuel GROOT
2016-05-27 14:01 ` [WIP-PATCH 1/2] send-email: create email parser subroutine Samuel GROOT
2016-05-28 15:22   ` Matthieu Moy
2016-05-28 23:33     ` Eric Wong
2016-05-29 17:15       ` Samuel GROOT [this message]
2016-05-29 17:53         ` Matthieu Moy
2016-05-30 13:28           ` Samuel GROOT
2016-06-02 16:57       ` Samuel GROOT
2016-06-02 19:58         ` Eric Wong
2016-05-27 14:01 ` [WIP-PATCH 2/2] send-email: use refactored subroutine to parse patches Samuel GROOT
2016-05-27 20:14 ` [WIP-PATCH 0/2] send-email: refactor the email parser loop Eric Wong
2016-05-28 15:04   ` Matthieu Moy
2016-05-29 17:21     ` Samuel GROOT
2016-05-29 18:05       ` Matthieu Moy
2016-05-30 14:01         ` Samuel GROOT
2016-05-30 14:20           ` Matthieu Moy
2016-05-30 18:28             ` Samuel GROOT
2016-05-30 19:29               ` Matthieu Moy