All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Martin Ågren" <martin.agren@gmail.com>
To: "Đoàn Trần Công Danh" <congdanhqx@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH v2 2/3] mailinfo.c::convert_to_utf8: reuse strlen info
Date: Sun, 19 Apr 2020 14:29:06 +0200	[thread overview]
Message-ID: <CAN0heSqi8KvbfSD80UA9s_aj9dGYTvbeW4GXVY59014_Sm-HmQ@mail.gmail.com> (raw)
In-Reply-To: <e3e542d38818b762f8d6d6b50bc42e01b070772b.1587289680.git.congdanhqx@gmail.com>

On Sun, 19 Apr 2020 at 13:03, Đoàn Trần Công Danh <congdanhqx@gmail.com> wrote:
>
> We're passing buffer from strbuf to reencode_string,
> which will call strlen(3) on that buffer,
> and discard the length of newly created buffer.
> Then, we compute the length of the return buffer to attach to strbuf.
>
> During this process, we introduce a discrimination between mail
> originally written in utf-8 and other encoding.
>
> * if the email was written in utf-8, we leave it as is. If there is
>   a NUL character in that line, we complains loudly:
>
>         error: a NUL byte in commit log message not allowed.
>
> * if the email was written in other encoding, we truncate the data as
>   the NUL character in that line, then we used the truncated line for
>   the metadata.
>
> We can do better by reusing all the available information,
> and call the underlying lower level function that will be called
> indirectly by reencode_string. By doing this, we will also postpone
> the NUL character processing to the commit step, which will
> complains about the faulty metadata.
>
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>

This makes sense to me.

I think the subject could be adapted though? Now it's not about "reusing
info" anymore, it's about using *other*, *better* info. Maybe

  mailinfo.c: avoid strlen on strings that might contain NUL

? Could probably be improved further..

> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -447,19 +447,21 @@ static int convert_to_utf8(struct mailinfo *mi,
>                            struct strbuf *line, const char *charset)
>  {
>         char *out;
> +       size_t out_len;
>
>         if (!mi->metainfo_charset || !charset || !*charset)
>                 return 0;
>
>         if (same_encoding(mi->metainfo_charset, charset))
>                 return 0;
> -       out = reencode_string(line->buf, mi->metainfo_charset, charset);
> +       out = reencode_string_len(line->buf, line->len,
> +                                 mi->metainfo_charset, charset, &out_len);
>         if (!out) {
>                 mi->input_error = -1;
>                 return error("cannot convert from %s to %s",
>                              charset, mi->metainfo_charset);
>         }
> -       strbuf_attach(line, out, strlen(out), strlen(out));
> +       strbuf_attach(line, out, out_len, out_len);
>         return 0;
>  }

Same diff as before, ok.

> +write_nul_patch() {
> +       space=' '
> +       qNUL=
> +       case "$1" in
> +               subject) qNUL='=00' ;;
> +       esac

Here it's case/esac...

> +       cat <<-EOF
> +       From ec7364544f690c560304f5a5de9428ea3b978b26 Mon Sep 17 00:00:00 2001
> +       From: A U Thor <author@example.com>
> +       Date: Sun, 19 Apr 2020 13:42:07 +0700
> +       Subject: [PATCH] =?ISO-8859-1?q?=C4=CB${qNUL}=D1=CF=D6?=
> +       MIME-Version: 1.0
> +       Content-Type: text/plain; charset=ISO-8859-1
> +       Content-Transfer-Encoding: 8bit
> +
> +       EOF
> +       if test "$1" = body
> +       then
> +               printf "%s\0%s\n" abc def
> +       fi

Here it's if/fi. Looks a bit inconsistent.

I suppose you tried to have a case for "body" above but couldn't get it
to work? Somehow, it would seem more consistent to have a qSubject and
qBody and handle them the same way, but maybe that's not possible?
Maybe using q_to_nul to create qBody containing a NUL?

> +       cat <<-\EOF
> +       ---
> +       diff --git a/afile b/afile
> +       new file mode 100644
> +       index 0000000000..e69de29bb2
> +       --$space
> +       2.26.1
> +       EOF
> +}

I think this helper function should use &&-chaining.

> +
>  test_expect_success setup '
>         # Note the missing "+++" line:
>         cat >bad-patch.diff <<-\EOF &&
> @@ -32,4 +62,18 @@ test_expect_success 'try to apply corrupted patch' '
>         test_i18ncmp expected actual
>  '
>
> +test_expect_success "NUL in commit message's body" '
> +       test_when_finished "git am --abort" &&
> +       write_nul_patch body >body.patch &&
> +       test_must_fail git am body.patch 2>err &&
> +       grep "a NUL byte in commit log message not allowed" err
> +'

Makes sense.

> +
> +test_expect_failure "NUL in commit message's header" '
> +       test_when_finished "git am --abort" &&
> +       write_nul_patch subject >subject.patch &&
> +       test_must_fail git am subject.patch 2>err &&
> +       grep "a NUL byte in Subject is not allowed" err
> +'

Interesting. :-)


Martin

  reply	other threads:[~2020-04-19 12:29 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-18  3:54 [PATCH] mailinfo.c::convert_to_utf8: reuse strlen info Đoàn Trần Công Danh
2020-04-18 19:56 ` Martin Ågren
2020-04-18 20:18   ` [PATCH 0/6] strbuf: simplify `strbuf_attach()` usage Martin Ågren
2020-04-18 20:18     ` [PATCH 1/6] am: use `strbuf_attach()` correctly Martin Ågren
2020-04-18 20:18     ` [PATCH 2/6] strbuf_attach: correctly pass in `strlen() + 1` for `alloc` Martin Ågren
2020-04-18 20:18     ` [PATCH 3/6] strbuf: use `strbuf_attach()` correctly Martin Ågren
2020-04-18 20:18     ` [PATCH 4/6] fast-import: avoid awkward use of `strbuf_attach()` Martin Ågren
2020-04-18 20:18     ` [PATCH 5/6] rerere: " Martin Ågren
2020-04-18 20:18     ` [PATCH 6/6] strbuf: simplify `strbuf_attach()` usage Martin Ågren
2020-04-19  4:44     ` [PATCH 0/6] " Martin Ågren
2020-04-19 12:32     ` [PATCH 0/4] strbuf: fix doc for `strbuf_attach()` and avoid it Martin Ågren
2020-04-19 12:32       ` [PATCH 1/4] strbuf: fix doc for `strbuf_attach()` Martin Ågren
2020-04-20 17:30         ` Junio C Hamano
2020-04-21 18:44           ` Martin Ågren
2020-04-19 12:32       ` [PATCH 2/4] strbuf: introduce `strbuf_attachstr_len()` Martin Ågren
2020-04-19 12:32       ` [PATCH 3/4] strbuf: introduce `strbuf_attachstr()` Martin Ågren
2020-04-20 19:39         ` Junio C Hamano
2020-04-21 18:47           ` Martin Ågren
2020-04-19 12:32       ` [PATCH 4/4] strbuf_attach: prefer `strbuf_attachstr_len()` Martin Ågren
2020-04-18 23:12   ` [PATCH] mailinfo.c::convert_to_utf8: reuse strlen info Junio C Hamano
2020-04-19  2:48     ` Danh Doan
2020-04-19  4:34       ` Martin Ågren
2020-04-19  5:32         ` Junio C Hamano
2020-04-19 11:00 ` [PATCH v2 0/3] mailinfo: disallow and complains about NUL character Đoàn Trần Công Danh
2020-04-19 11:00   ` [PATCH v2 1/3] t4254: merge 2 steps of a single test Đoàn Trần Công Danh
2020-04-19 12:25     ` Martin Ågren
2020-04-19 14:17       ` Danh Doan
2020-04-19 11:00   ` [PATCH v2 2/3] mailinfo.c::convert_to_utf8: reuse strlen info Đoàn Trần Công Danh
2020-04-19 12:29     ` Martin Ågren [this message]
2020-04-19 14:16       ` Danh Doan
2020-04-20 19:59     ` Junio C Hamano
2020-04-20 23:53       ` Danh Doan
2020-04-19 11:00   ` [PATCH v2 3/3] mailinfo: disallow NUL character in mail's header Đoàn Trần Công Danh
2020-04-19 12:30     ` Martin Ågren
2020-04-19 14:24       ` Danh Doan
2020-04-20 23:54 ` [PATCH v3 0/3] Disallow NUL character in mailinfo Đoàn Trần Công Danh
2020-04-20 23:54   ` [PATCH v3 1/3] t4254: merge 2 steps of a single test Đoàn Trần Công Danh
2020-04-20 23:54   ` [PATCH v3 2/3] mailinfo.c: avoid strlen on strings that can contains NUL Đoàn Trần Công Danh
2020-04-20 23:54   ` [PATCH v3 3/3] mailinfo: disallow NUL character in mail's header Đoàn Trần Công Danh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAN0heSqi8KvbfSD80UA9s_aj9dGYTvbeW4GXVY59014_Sm-HmQ@mail.gmail.com \
    --to=martin.agren@gmail.com \
    --cc=congdanhqx@gmail.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.