On 2020-04-06 at 13:30:40, Jeff King wrote:
> I suspect this is mostly orthogonal, as that deals only with the
> SMTP-level addresses, which include only the actual email part (not the
> name) and aren't RFC2047-encoded anyway. It looks like we already leave
> characters in addresses untouched (I'm not even 100% sure that RFC2047
> allows modifying within the local part of an addr):
>
>   $ echo foo >file
>   $ git add file
>   $ git -c user.email=péff@peff.net commit -m foo
>   $ git format-patch -1 --stdout | grep From:
>   From: Jeff King <péff@peff.net>
>
> I did wonder if there are any standards around 8bit headers. Certainly
> the de facto standard for local tools (e.g., mutt reading a message
> you've edited in vim) is that they can be treated like a stream of
> ASCII-compatible bytes, and that works pretty well in practice. But if
> there's an IETF-endorsed method for 8bit headers, it would be nice to
> use it. For 8bit bodies, we're able to give a content-transfer-encoding
> and a content-type with the charset. But I don't know of an equivalent
> for headers.

That's RFC 6532, Internationalized Email Headers, the companion document
to RFC 6531. (The RFC editor has cleverly kept the last digits in sync
between the RFC 532x and 653x series.)

The basic summary is that header field names are not internationalized,
but the field values do allow UTF-8 if they contain unstructured text
(e.g., Subject), anything using atoms (e.g., Message-ID), quoted strings
(e.g., local-parts of an email address), domains, and a few other
constructs.

RFC 2047 (MIME encoded words) is allowed "only in a subset of the places
allowed by" RFC 6532, so just not encoding should be safe here, as long
as it's UTF-8.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
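
For illustration only (this is not part of the original message and has
nothing to do with how git itself is implemented): a minimal Python
sketch of the two header styles discussed above, using the standard
library's email package. With the ASCII-only SMTP policy, a non-ASCII
Subject is emitted as an RFC 2047 encoded word; with the SMTPUTF8 policy,
the same value is written as raw UTF-8, which is what RFC 6532 permits.
The helper name render_subject and the sample subject are hypothetical.

```python
from email.message import EmailMessage
from email.policy import SMTP, SMTPUTF8


def render_subject(subject: str, utf8: bool) -> bytes:
    """Serialize a message carrying only a Subject header.

    utf8=False: ASCII-only output; non-ASCII text becomes RFC 2047
                encoded words.
    utf8=True:  non-ASCII text is written as raw UTF-8 bytes, in the
                style RFC 6532 allows.
    """
    msg = EmailMessage()
    msg["Subject"] = subject
    # Both policies ship with the standard library; SMTPUTF8 is the SMTP
    # policy with utf8=True.
    policy = SMTPUTF8 if utf8 else SMTP
    return msg.as_bytes(policy=policy)


if __name__ == "__main__":
    sample = "café"  # hypothetical sample value
    print(render_subject(sample, utf8=False))
    print(render_subject(sample, utf8=True))
```

The first call should yield a header that is pure ASCII, while the second
should contain the UTF-8 bytes of the subject unchanged. Whether a given
receiver accepts the second form depends on its RFC 6532 support.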