All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Emily Shaffer <emilyshaffer@google.com>
Cc: git@vger.kernel.org, jrn@google.com
Subject: Re: [PATCH v2] grep: provide sane default to grep_source struct
Date: Thu, 16 May 2019 18:02:54 -0400	[thread overview]
Message-ID: <20190516220254.GG10787@sigill.intra.peff.net> (raw)
In-Reply-To: <20190516214444.191743-1-emilyshaffer@google.com>

On Thu, May 16, 2019 at 02:44:44PM -0700, Emily Shaffer wrote:

> grep_buffer creates a struct grep_source gs and calls grep_source()
> with it. However, gs.name is null, which eventually produces a
> segmentation fault in
> grep_source()->grep_source_1()->show_line() when grep_opt.status_only is
> not set.

Funny wrapping here.

> This seems to be unreachable from existing commands but is reachable in
> the event that someone rolls their own revision walk and neglects to set
> rev_info->grep_filter->status_only. Conceivably, someone might want to
> print all changed lines of all commits which match some regex.
> 
> To futureproof, give developers a heads-up if grep_source() is called
> and a valid name field is expected but not given. This path should be
> unreachable now, but in the future may become reachable if developers
> want to add the ability to use grep_buffer() in prepared in-core data
> and provide hints to the user about where the data came from.

So I guess this is probably my fault, as I was the one who factored out
the grep_source bits long ago (though I would also not be surprised if
it could be triggered before, too).

I think adding a BUG() is the right thing to do.

> I've added the BUG() line to grep_source(). I considered adding it to
> grep_source_1() but didn't see much difference. Looks like
> grep_source_1() exists purely as a helper to grep_source() and is never
> called by anybody else... but it's the function where the crash would
> happen. So I'm not sure.

I agree it probably doesn't matter. I'd probably stick at at the top of
grep_source_1(), just with the rationale that it could possibly catch
more cases if somebody ever refactors grep_source() to have two
different callers (e.g., imagine we later add a variant that takes more
options, but leave the old one to avoid disrupting existing callers).

> I also modified the wording for the BUG("") statement a little from
> jrn's suggestion to hopefully make it more explicit, but I welcome
> wordsmithing.
> [...]
> +		BUG("grep call which could print a name requires "
> +		    "grep_source.name be non-NULL");

It looks fine to me. The point is that nobody should ever see this. And
if they do, we should already get a file/line number telling us where
the problem is (and a coredump to poke at). So as long as it's enough to
convince a regular user that they should make a report to the mailing
list, it will have done its job.

> diff --git a/grep.c b/grep.c
> index 0d50598acd..6ceb880a8c 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -2021,6 +2021,9 @@ static int chk_hit_marker(struct grep_expr *x)
>  
>  int grep_source(struct grep_opt *opt, struct grep_source *gs)
>  {
> +	if (!opt->status_only && gs->name == NULL)
> +		BUG("grep call which could print a name requires "
> +		    "grep_source.name be non-NULL");
>  	/*
>  	 * we do not have to do the two-pass grep when we do not check
>  	 * buffer-wide "all-match".

Minor nit, but can we put a blank line between the new block and the
comment?

> @@ -2045,7 +2048,11 @@ int grep_buffer(struct grep_opt *opt, char *buf, unsigned long size)
>  	struct grep_source gs;
>  	int r;
>  
> -	grep_source_init(&gs, GREP_SOURCE_BUF, NULL, NULL, NULL);
> +	/* TODO: In the future it may become desirable to pass in the name as
> +	 * an argument to grep_buffer(). At that time, "(in core)" should be
> +	 * replaced.
> +	 */
> +	grep_source_init(&gs, GREP_SOURCE_BUF, _("(in core)"), NULL, NULL);

Hmm. I don't see much point in this one, as it would just avoid
triggering our BUG(). If somebody is adding new grep_buffer() callers
that don't use status_only, wouldn't we want them to see the BUG() to
know that they need to refactor grep_buffer() to provide a name?

-Peff

  reply	other threads:[~2019-05-16 22:02 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-16  2:00 [RFC PATCH] grep: provide sane default to grep_source struct Emily Shaffer
2019-05-16  3:07 ` Junio C Hamano
2019-05-16  3:13   ` Jonathan Nieder
2019-05-16  3:11 ` Jonathan Nieder
2019-05-16 21:05   ` Emily Shaffer
2019-05-16 21:44 ` [PATCH v2] " Emily Shaffer
2019-05-16 22:02   ` Jeff King [this message]
2019-05-21 23:52     ` Emily Shaffer
2019-05-22  0:17       ` Jonathan Nieder
2019-05-22  0:34   ` [PATCH v3] " Emily Shaffer
2019-05-22  0:58     ` Jonathan Nieder
2019-05-22  4:01     ` Jeff King
2019-05-23 20:23     ` [PATCH v4] grep: fail if call could output and name is null Emily Shaffer
2019-05-23 20:34       ` Jonathan Nieder
2019-05-28 17:57         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190516220254.GG10787@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=jrn@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.