From: Junio C Hamano <gitster@pobox.com>
To: "Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Ivan Frade" <ifrade@google.com>
Subject: Re: [PATCH v2] fetch-pack: redact packfile urls in traces
Date: Mon, 11 Oct 2021 13:39:34 -0700
Message-ID: <xmqqczobb8jd.fsf@gitster.g>
In-Reply-To: <pull.1052.v2.git.1633746024175.gitgitgadget@gmail.com> (Ivan Frade via GitGitGadget's message of "Sat, 09 Oct 2021 02:20:24 +0000")

"Ivan Frade via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Ivan Frade <ifrade@google.com>
>
> In some setups, packfile uris act as bearer token. It is not
> recommended to expose them plainly in logs, although in special
> circunstances (e.g. debug) it makes sense to write them.
>
> Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
> variable is set to false. This mimics the redacting of the
> Authorization header in HTTP.

Well explained.

Whether the idea itself is agreeable is of course a different
matter, though ;-).  Hiding the entire packet on the grounds that it
"might be sensitive in some setups" seems a bit too much.

Is it often the case that the whole URI is sensitive?  Or is the
leading "<scheme>://<host>/pack-<abc>.pack" part perhaps not
sensitive at all, with only what follows that "public" part carrying
the "nonce" material that makes it sensitive?
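
To make the narrower alternative concrete, here is a minimal sketch of
redacting only the tail of a URI. It rests on an assumption the thread
does not confirm: that the sensitive material lives in the query
string, after the first '?'. The helper name redact_uri_query() and
the example URI used with it are hypothetical and are not part of git.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git.  Copy `uri` into `out`,
 * replacing everything after the first '?' with "<redacted>".
 * Assumes the sensitive part of the URI is its query string.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int redact_uri_query(const char *uri, char *out, size_t outlen)
{
	const char *query;
	int n;

	assert(uri && out);

	query = strchr(uri, '?');
	if (!query)
		/* No query string: nothing is treated as sensitive. */
		n = snprintf(out, outlen, "%s", uri);
	else
		/* Keep the "public" prefix, drop the rest. */
		n = snprintf(out, outlen, "%.*s?<redacted>",
			     (int)(query - uri), uri);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this sketch, "https://cdn.example.com/pack-abc.pack?token=s3cr3t"
would be traced as "https://cdn.example.com/pack-abc.pack?<redacted>",
which still tells a reader of the trace which host and pack were used.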

> Changes since v1:
> - Removed non-POSIX flags in tests
> - More accurate regex for the non-encrypted packfile line
> - Dropped documentation change
> - Dropped redacting the die message in http-fetch

These are not for those who read "git log" in 3 months, as they may
not even have seen the "v1".  But these are very helpful for those
who read the "v1" to see how good this round is.  Please write such
material below the three-dash line.

> Signed-off-by: Ivan Frade <ifrade@google.com>
> ---

i.e. here.

>     fetch-pack: redact packfile urls in traces
>     
>     In some setups, packfile uris act as bearer token. It is not recommended
>     to expose them plainly in logs, although in special circunstances (e.g.
>     debug) it makes sense to write them.
>     
>     Redact the packfile-uri lines by default, unless the GIT_TRACE_REDACT
>     variable is set to false. This mimics the redacting of the Authorization
>     header in HTTP.
>     
>     Signed-off-by: Ivan Frade ifrade@google.com
>     
>     cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

And there is no need to duplicate the log message here ;-)

> diff --git a/fetch-pack.c b/fetch-pack.c
> index a9604f35a3e..05c85eeafa1 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1518,7 +1518,16 @@ static void receive_wanted_refs(struct packet_reader *reader,
>  static void receive_packfile_uris(struct packet_reader *reader,
>  				  struct string_list *uris)
>  {
> +	int original_options;
>  	process_section_header(reader, "packfile-uris", 0);
> +	/*
> +	 * In some setups, packfile-uris act as bearer tokens,
> +	 * redact them by default.
> +	 */
> +	original_options = reader->options;
> +	if (git_env_bool("GIT_TRACE_REDACT", 1))
> +		reader->options |= PACKET_READ_REDACT_ON_TRACE;
> +
>  	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
>  		if (reader->pktlen < the_hash_algo->hexsz ||
>  		    reader->line[the_hash_algo->hexsz] != ' ')
> @@ -1526,6 +1535,8 @@ static void receive_packfile_uris(struct packet_reader *reader,
>  
>  		string_list_append(uris, reader->line);
>  	}
> +	reader->options = original_options;

So "original_options" is used to save away the reader->options so
that it can be restored before returning to our caller?  

OK (it may be more common in this codebase to call such a variable
"saved_X", though).

> diff --git a/pkt-line.c b/pkt-line.c
> index de4a94b437e..8da8ed88ccf 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -443,7 +443,12 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
>  		len--;
>  
>  	buffer[len] = 0;
> -	packet_trace(buffer, len, 0);
> +	if (options & PACKET_READ_REDACT_ON_TRACE) {
> +		const char *redacted = "<redacted>";
> +		packet_trace(redacted, strlen(redacted), 0);
> +	} else {
> +		packet_trace(buffer, len, 0);
> +	}
> ...
> +	GIT_TRACE=1 GIT_TRACE_PACKET="$(pwd)/log" GIT_TEST_SIDEBAND_ALL=1 \
> +	git -c protocol.version=2 \
> +		-c fetch.uriprotocols=http,https \
> +		clone "$HTTPD_URL/smart/http_parent" http_child &&
> +
> +	grep "clone< <redacted>" log

This checks only that "redacted" string appears, but what the theme
of the change really cares about is different, no?  You want to
ensure that no sensitive substring of the URI appears in the log.

Imagine somebody breaking the redact logic by making it prepend that
string to the payload, instead of replacing the payload with that
string---this test will not catch such a regression.
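
The difference between the two checks can be sketched in isolation.
The functions below are hypothetical stand-ins for the trace code, not
git's implementation: one replaces the payload with the marker, the
other models the imagined regression that prepends it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REDACTED "<redacted>"

/* Intended behavior: the payload is replaced by the marker. */
static void trace_replaced(const char *payload, char *out, size_t outlen)
{
	assert(payload && out && outlen > 0);
	snprintf(out, outlen, "clone< %s", REDACTED);
}

/* Imagined regression: the marker is prepended to the payload. */
static void trace_prepended(const char *payload, char *out, size_t outlen)
{
	assert(payload && out && outlen > 0);
	snprintf(out, outlen, "clone< %s%s", REDACTED, payload);
}

/* What the quoted test checks: is the marker present? */
static int has_marker(const char *log)
{
	return strstr(log, "clone< " REDACTED) != NULL;
}

/* What the change is really about: does the secret show up? */
static int leaks(const char *log, const char *secret)
{
	return strstr(log, secret) != NULL;
}
```

Both variants satisfy has_marker(), so a test that only looks for the
marker passes either way.  Only leaks() tells them apart, which is why
the test should also assert that a known fragment of the URI is absent
from the log.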

Thanks.
