All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "René Scharfe" <l.s.r@web.de>,
	"Carlo Marcelo Arenas Belón" <carenas@gmail.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Git List" <git@vger.kernel.org>,
	"Hamza Mahfooz" <someguy@effective-light.com>
Subject: Re: [PATCH 1/5] grep: stop modifying buffer in strip_timestamp
Date: Thu, 23 Sep 2021 02:53:15 +0200	[thread overview]
Message-ID: <87y27o5cxx.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <YUuQJUSl/jrnQR7n@coredump.intra.peff.net>


On Wed, Sep 22 2021, Jeff King wrote:

> On Tue, Sep 21, 2021 at 11:02:31PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> 
>> On Tue, Sep 21 2021, Jeff King wrote:
>> 
>> > On Tue, Sep 21, 2021 at 09:37:23AM +0200, René Scharfe wrote:
>> >
>> >> > @@ -965,9 +953,12 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
>> >> >  		bol += len;
>> >> >  		switch (p->field) {
>> >> >  		case GREP_HEADER_AUTHOR:
>> >> > -		case GREP_HEADER_COMMITTER:
>> >> > -			strip_timestamp(bol, &eol);
>> >> > +		case GREP_HEADER_COMMITTER: {
>> >> > +			char *em = memrchr(bol, '>', eol - bol);
>> >> > +			if (em)
>> >> > +				eol = em + 1;
>> >> 
>> >> The old code documents the intent via the function name.  The new one
>> >> goes into the nitty-gritty without further explanation, which I find
>> >> harder to read.
>> >
>> > Agreed. I do think the conversion is functionally correct, but it
>> > doesn't strike me as worth the change.
>> 
>> As far as some general improvement in thish area it seems to me that
>> this whole subthread is losing the forest for the trees.
>
> I'd definitely agree with that. :)
>
>> It probably makes sense to split up that commit.c code into something
>> that can give you structured output, i.e. headers with types and
>> start/end points for interesting data, then in grep.c we won't need a
>> strip_anything(), or strrchr() or memrchr() or whatever.
>
> I don't disagree with any of this, either, but I think it's a separate
> (and much more complicated) topic than what this series is dealing with.
> So my preference would be to take this as an immediate improvement, and
> let anything like that get built on top.
>
>> It would also be a lot faster for grepping if we could offload more of
>> this work to the regex engine, particularly if we've got a more capable
>> engine like PCREv2.
>> 
>> In many cases we're splitting lines ourselves, when we could have the
>> engine work in a multi-line mode, or translate the user's --author match
>> into something that can match the raw commit header. So have an implicit
>> /^author /m anchor if we're matching author headers, instead of having
>> grep.c string-twiddle that around one line at a time.
>
> Ditto here. Multi-line matching may make things a lot more efficient,
> but I think is out of scope for this series. That seems like an
> interesting area for future work.

Indeed, *way* outside this series. I'm I think the change you suggested
here makes sense for this change, just pointing out that for any
follow-up changes it's much more worthwhile to consider a few more
stackframes than just the 1-2 that involve those strings in that
particular form.

  reply	other threads:[~2021-09-23  0:54 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21  3:45 [PATCH 0/5] const-correctness in grep.c Jeff King
2021-09-21  3:46 ` [PATCH 1/5] grep: stop modifying buffer in strip_timestamp Jeff King
2021-09-21  5:18   ` Carlo Arenas
2021-09-21  5:24     ` Eric Sunshine
2021-09-21  5:40       ` Carlo Arenas
2021-09-21  5:43       ` Jeff King
2021-09-21  6:42         ` Carlo Marcelo Arenas Belón
2021-09-21  7:37           ` René Scharfe
2021-09-21 14:24             ` Jeff King
2021-09-21 21:02               ` Ævar Arnfjörð Bjarmason
2021-09-22 20:20                 ` Jeff King
2021-09-23  0:53                   ` Ævar Arnfjörð Bjarmason [this message]
2021-09-21  3:48 ` [PATCH 2/5] grep: stop modifying buffer in show_line() Jeff King
2021-09-21  4:22   ` Taylor Blau
2021-09-21  4:42     ` Jeff King
2021-09-21  4:45       ` Taylor Blau
2021-09-21  3:48 ` [PATCH 3/5] grep: stop modifying buffer in grep_source_1() Jeff King
2021-09-21  3:49 ` [PATCH 4/5] grep: mark "haystack" buffers as const Jeff King
2021-09-21 12:04   ` Ævar Arnfjörð Bjarmason
2021-09-21 14:27     ` Jeff King
2021-09-21  3:51 ` [PATCH 5/5] grep: store grep_source buffer " Jeff King
2021-09-21  4:30 ` [PATCH 0/5] const-correctness in grep.c Taylor Blau
2021-09-21 12:07 ` Ævar Arnfjörð Bjarmason
2021-09-21 14:49   ` Jeff King
2021-09-21 12:45 ` [PATCH 6/5] grep.c: mark eol/bol and derived as "const char * const" Ævar Arnfjörð Bjarmason
2021-09-21 14:53   ` Jeff King
2021-09-21 15:17     ` Ævar Arnfjörð Bjarmason
2021-09-21 19:18       ` Jeff King
2021-09-23 13:56         ` Ævar Arnfjörð Bjarmason
2021-09-24  4:22           ` Junio C Hamano
2021-09-22 19:02     ` Junio C Hamano
2021-09-22 18:57 ` [PATCH 0/5] const-correctness in grep.c Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y27o5cxx.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=carenas@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=l.s.r@web.de \
    --cc=peff@peff.net \
    --cc=someguy@effective-light.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.