From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
  aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
  DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
  SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no
  version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
  by smtp.lore.kernel.org (Postfix) with ESMTP id 429FFC43460
  for ; Tue, 4 May 2021 21:47:01 +0000 (UTC)
Received: by mail.kernel.org (Postfix)
  id 27D6A613C1; Tue, 4 May 2021 21:47:01 +0000 (UTC)
Received: by mail.kernel.org (Postfix) with ESMTPSA id DB0B7613BE
  for ; Tue, 4 May 2021 21:47:00 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
  s=korg; t=1620164821;
  bh=/yZjhtdR78EKKD5q7jZrMZJHyKFtlz618hhhN90bxG8=;
  h=From:List-Id:To:Subject:Date:In-Reply-To:References:From;
  b=wjByeQ9aawlTM36VvFHySLyxCUAZUbhudH2nDULocRvj1zY/wqssQcOAFNwJmc7Qe
  HWa6EPjaj1rp7nzECKcBgzRi+aV0IzuVWOidI6ok5cg87KkrGl+emWn4cl9tcuK38p
  GlfxZLDsHvKapaXjZL3GTBq7d6rYuTgxMNSBbF5Y=
From: Konstantin Ryabitsev
List-Id:
To: signatures@kernel.org
Subject: [PATCH 2/3] Save to/cc headers as-is for tracking
Date: Tue, 4 May 2021 17:46:57 -0400
Message-Id: <20210504214658.295563-3-konstantin@linuxfoundation.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210504214658.295563-1-konstantin@linuxfoundation.org>
References: <20210504214658.295563-1-konstantin@linuxfoundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-Developer-Signature: v=1; a=openpgp-sha256; l=3041; h=from:subject;
  bh=/yZjhtdR78EKKD5q7jZrMZJHyKFtlz618hhhN90bxG8=;
  b=owGbwMvMwCG27YjM47CUmTmMp9WSGBImHljbrCJrXJxsuDdw45zZ/RXSff8nb29ZNn25gBnjg7tb
  J9/L7ChlYRDjYJAVU2Qp2xe7KajwoYdceo8pzBxWJpAhDFycAjCRL5oM/8umXtwtGn454sF9P41I+c
  mq+bF6sR+Wr477dyj3Xcv8vMcM/yuXN25a/p4jyspLYu/WGB7uQ5t4BGcVTg3Vkk22XB9kyg8A
Content-Transfer-Encoding: 8bit

If we
clean the to/cc headers to get rid of all unicode escaping, we run
into a Python bug that is unable to properly parse addresses, e.g.:

    In [5]: from email import utils
    In [6]: utils.getaddresses(['foo <foo@bar.com>'])
    Out[6]: [('foo', 'foo@bar.com')]
    In [7]: utils.getaddresses(['Shuming [范書銘] <shumingf@realtek.com>'])
    Out[7]:
    [('', 'Shuming'),
     ('', ''),
     ('', '范書銘'),
     ('', ''),
     ('', 'shumingf@realtek.com')]

If we store the headers as-is from the original message, we are less
likely to run into this bug, as all non-ascii sequences should be
qp-escaped in the original headers:

    =?big5?B?U2h1bWluZyBbrVOu0bvKXQ==?=

This doesn't fix the underlying bug in Python, but works around it.

Reported-by: Mark Brown
Signed-off-by: Konstantin Ryabitsev
---
 b4/__init__.py | 11 ++++++++---
 b4/mbox.py     |  4 ++--
 b4/pr.py       |  4 ++--
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/b4/__init__.py b/b4/__init__.py
index ee07f16..32b5c02 100644
--- a/b4/__init__.py
+++ b/b4/__init__.py
@@ -2375,11 +2375,16 @@ def git_get_toplevel(path=None):
     return topdir
 
 
-def format_addrs(pairs):
+def format_addrs(pairs, clean=True):
     addrs = set()
     for pair in pairs:
-        # Remove any quoted-printable header junk from the name
-        addrs.add(email.utils.formataddr((LoreMessage.clean_header(pair[0]), LoreMessage.clean_header(pair[1]))))
+        pair = list(pair)
+        if pair[0] == pair[1]:
+            pair[0] = ''
+        if clean:
+            # Remove any quoted-printable header junk from the name
+            pair[0] = LoreMessage.clean_header(pair[0])
+        addrs.add(email.utils.formataddr(pair))  # noqa
     return ', '.join(addrs)
 
 
diff --git a/b4/mbox.py b/b4/mbox.py
index d84d390..d3bde25 100644
--- a/b4/mbox.py
+++ b/b4/mbox.py
@@ -294,8 +294,8 @@ def thanks_record_am(lser, cherrypick=None):
         'subject': lmsg.full_subject,
         'fromname': lmsg.fromname,
         'fromemail': lmsg.fromemail,
-        'to': b4.format_addrs(allto),
-        'cc': b4.format_addrs(allcc),
+        'to': b4.format_addrs(allto, clean=False),
+        'cc': b4.format_addrs(allcc, clean=False),
         'references': b4.LoreMessage.clean_header(lmsg.msg['References']),
         'sentdate': b4.LoreMessage.clean_header(lmsg.msg['Date']),
         'quote': b4.make_quote(lmsg.body, maxlines=5),
diff --git a/b4/pr.py b/b4/pr.py
index 0ff68f8..5e6c7a1 100644
--- a/b4/pr.py
+++ b/b4/pr.py
@@ -225,8 +225,8 @@ def thanks_record_pr(lmsg):
         'subject': lmsg.full_subject,
         'fromname': lmsg.fromname,
         'fromemail': lmsg.fromemail,
-        'to': b4.format_addrs(allto),
-        'cc': b4.format_addrs(allcc),
+        'to': b4.format_addrs(allto, clean=False),
+        'cc': b4.format_addrs(allcc, clean=False),
         'references': b4.LoreMessage.clean_header(lmsg.msg['References']),
         'remote': lmsg.pr_repo,
         'ref': lmsg.pr_ref,
-- 
2.30.2
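
For reference, the workaround described in the commit message can be
sketched outside of b4 roughly as follows. This is a minimal illustration
and not b4 code: RAW_TO, parse_raw() and decode_name() are hypothetical
names, and the header value is assembled from the encoded-word and address
quoted in the commit message (the sample happens to use the RFC 2047 "B"
encoding). It assumes the original header arrives still encoded, so the
square brackets never reach the address parser in literal form.

```python
from email.header import decode_header, make_header
from email.utils import getaddresses

# Hypothetical raw To: value, kept exactly as it would appear in the
# original message: the display name is still an RFC 2047 encoded-word.
RAW_TO = '=?big5?B?U2h1bWluZyBbrVOu0bvKXQ==?= <shumingf@realtek.com>'


def parse_raw(value):
    """Parse the still-encoded header value.

    The '[' and ']' characters are hidden inside the encoded-word, so
    getaddresses() sees a single atom followed by one angle-addr.
    """
    return getaddresses([value])


def decode_name(name):
    """Decode an RFC 2047 display name for human-readable output only."""
    return str(make_header(decode_header(name)))


if __name__ == '__main__':
    for name, addr in parse_raw(RAW_TO):
        # Decode only after the address has been split off safely.
        print(decode_name(name), addr)
```

The point of the sketch is the ordering: parse first while the header is
still encoded, and decode the display name afterwards if a readable form
is needed.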