* [U-Boot] [PATCH] patman: encode CC list to UTF-8
@ 2017-04-19 13:24 Philipp Tomsich
  2017-04-22 23:53 ` Simon Glass
  0 siblings, 1 reply; 5+ messages in thread
From: Philipp Tomsich @ 2017-04-19 13:24 UTC (permalink / raw)
  To: u-boot

This change encodes the CC list to UTF-8 to avoid failures on
maintainer addresses that include non-ASCII characters (observed on
Debian 7.11 with Python 2.7.3).

Without this, I get the following failure:
  Traceback (most recent call last):
    File "tools/patman/patman", line 159, in <module>
      options.add_maintainers)
    File "[snip]/u-boot/tools/patman/series.py", line 234, in MakeCcFile
      print(commit.patch, ', '.join(set(list)), file=fd)
  UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 81: ordinal not in range(128)
from Heiko's email address:
  [..., u'"Heiko St\xfcbner" <heiko@sntech.de>', ...]

With this change applied, the address is encoded as:
  "=?UTF-8?q?Heiko=20St=C3=BCbner?= <heiko@sntech.de>"

Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
---

 tools/patman/series.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/patman/series.py b/tools/patman/series.py
index c1b8652..134a381 100644
--- a/tools/patman/series.py
+++ b/tools/patman/series.py
@@ -119,7 +119,7 @@ class Series(dict):
                     email = col.Color(col.YELLOW, "<alias '%s' not found>"
                             % tag)
                 if email:
-                    print('      Cc: ', email)
+                    print('      Cc: ', email.encode('utf-8'))
         print
         for item in to_set:
             print('To:\t ', item)
@@ -230,7 +230,7 @@ class Series(dict):
             if add_maintainers:
                 list += get_maintainer.GetMaintainer(commit.patch)
             all_ccs += list
-            print(commit.patch, ', '.join(set(list)), file=fd)
+            print(commit.patch, ', '.join(set(list)).encode('utf-8'), file=fd)
             self._generated_cc[commit.patch] = list
 
         if cover_fname:
-- 
1.9.1
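
The failure described in the commit message can be reproduced in miniature. The following is a minimal sketch in Python 3 rather than the Python 2.7 the patch targets, so it uses an explicitly ASCII-encoded stream to stand in for Python 2's implicit 'ascii' codec. The function name write_cc and the patch filename are hypothetical and not part of patman.

```python
import io

# Address taken from the traceback in the commit message.
cc_entry = '"Heiko St\xfcbner" <heiko@sntech.de>'


def write_cc(encoding):
    """Write one Cc line to an in-memory text stream that uses `encoding`."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    # '0001-example.patch' is a hypothetical patch filename.
    print('0001-example.patch', cc_entry, file=stream)
    stream.flush()
    return raw.getvalue()


# An ASCII-only stream cannot represent U+00FC, so the write fails.
try:
    write_cc('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# A UTF-8 stream stores the same character as the two bytes C3 BC.
utf8_bytes = write_cc('utf-8')
```

The point of the sketch is only that the error depends on the codec of the output stream, not on the address itself; choosing UTF-8 for the output removes the failure.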


* [U-Boot] [PATCH] patman: encode CC list to UTF-8
  2017-04-19 13:24 [U-Boot] [PATCH] patman: encode CC list to UTF-8 Philipp Tomsich
@ 2017-04-22 23:53 ` Simon Glass
  2017-04-25 17:12   ` Tom Rini
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Glass @ 2017-04-22 23:53 UTC (permalink / raw)
  To: u-boot

+Tom

On 19 April 2017 at 07:24, Philipp Tomsich
<philipp.tomsich@theobroma-systems.com> wrote:
>
> This change encodes the CC list to UTF-8 to avoid failures on
> maintainer addresses that include non-ASCII characters (observed on
> Debian 7.11 with Python 2.7.3).
>
> Without this, I get the following failure:
>   Traceback (most recent call last):
>     File "tools/patman/patman", line 159, in <module>
>       options.add_maintainers)
>     File "[snip]/u-boot/tools/patman/series.py", line 234, in MakeCcFile
>       print(commit.patch, ', '.join(set(list)), file=fd)
>   UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 81: ordinal not in range(128)
> from Heiko's email address:
>   [..., u'"Heiko St\xfcbner" <heiko@sntech.de>', ...]
>
> With this change applied, the address is encoded as:
>   "=?UTF-8?q?Heiko=20St=C3=BCbner?= <heiko@sntech.de>"
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> ---
>
>  tools/patman/series.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Simon Glass <sjg@chromium.org>


* [U-Boot] [PATCH] patman: encode CC list to UTF-8
  2017-04-22 23:53 ` Simon Glass
@ 2017-04-25 17:12   ` Tom Rini
  2017-04-25 20:31     ` Simon Glass
  0 siblings, 1 reply; 5+ messages in thread
From: Tom Rini @ 2017-04-25 17:12 UTC (permalink / raw)
  To: u-boot

On Sat, Apr 22, 2017 at 05:53:36PM -0600, Simon Glass wrote:
> Reviewed-by: Simon Glass <sjg@chromium.org>

Please put this in a PR for me, along with any other critical fixes to
the various python tools we have, thanks!

And also, do we need to perhaps whack something at a higher level, and
more consistently, about unicode?  This is, I gather, doing UTF-8 right.
In buildman we have a few patches to just translate to latin-1 instead.
We should do the same thing I think, and perhaps there's a higher level
up in the code where we need to do it too?  I don't know..
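
To make the UTF-8 versus latin-1 question concrete, here is a small sketch in Python 3 of how the two encodings differ for the name in this thread. The second name is a hypothetical example chosen because it falls outside latin-1; nothing here is taken from buildman or patman.

```python
name = 'Heiko St\xfcbner'  # the name from the traceback

utf8 = name.encode('utf-8')      # u-umlaut becomes two bytes: C3 BC
latin1 = name.encode('latin-1')  # u-umlaut becomes one byte: FC

# latin-1 only covers code points up to U+00FF, so some names cannot be
# encoded at all. 'Łukasz' is a hypothetical example (U+0141).
other = '\u0141ukasz'
try:
    other.encode('latin-1')
    latin1_covers_other = True
except UnicodeEncodeError:
    latin1_covers_other = False

# Bytes written as latin-1 are not valid UTF-8, so mixing the two
# conventions between tools can fail when the data is read back.
try:
    latin1.decode('utf-8')
    mixes_cleanly = True
except UnicodeDecodeError:
    mixes_cleanly = False
```

Under these assumptions, UTF-8 can represent every name while latin-1 cannot, and output from a tool using one convention is not safely readable by a tool expecting the other.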

-- 
Tom

* [U-Boot] [PATCH] patman: encode CC list to UTF-8
  2017-04-25 17:12   ` Tom Rini
@ 2017-04-25 20:31     ` Simon Glass
  2017-04-25 22:27       ` Dr. Philipp Tomsich
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Glass @ 2017-04-25 20:31 UTC (permalink / raw)
  To: u-boot

Hi Tom,

On 25 April 2017 at 11:12, Tom Rini <trini@konsulko.com> wrote:
>
> Please put this in a PR for me, along with any other critical fixes to
> the various python tools we have, thanks!
>
> And also, do we need to perhaps whack something at a higher level, and
> more consistently, about unicode?  This is, I gather, doing UTF-8 right.
> In buildman we have a few patches to just translate to latin-1 instead.
> We should do the same thing I think, and perhaps there's a higher level
> up in the code where we need to do it too?  I don't know..

Actually I don't think we are quite there yet. This really needs a
test with all the different places strings can come from, to make sure
patman does the right thing.
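
One possible shape for such a test, sketched in Python 3 with the standard unittest module. The helper to_text, the class name, and the three sample sources are all hypothetical; they only stand in for the different places a Cc string might originate and are not existing patman code.

```python
import unittest


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return `value` as text, decoding bytes if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


class CcListEncodingTest(unittest.TestCase):
    # Each constant stands in for one place a string might come from.
    FROM_TOOL = b'"Heiko St\xc3\xbcbner" <heiko@sntech.de>'  # raw tool output
    FROM_TAG = '"Heiko St\xfcbner" <heiko@sntech.de>'        # already text
    ALIAS = 'someone@example.com'                            # plain ASCII

    def test_sources_normalise_to_same_text(self):
        self.assertEqual(to_text(self.FROM_TOOL), to_text(self.FROM_TAG))

    def test_joined_cc_line_encodes(self):
        entries = [to_text(v)
                   for v in (self.FROM_TOOL, self.FROM_TAG, self.ALIAS)]
        line = ', '.join(sorted(set(entries)))
        # After normalisation the duplicate address collapses to one entry,
        # so the encoded u-umlaut appears exactly once.
        self.assertEqual(line.encode('utf-8').count(b'\xc3\xbc'), 1)
```

Saved as a module, this could be run with python -m unittest; a real test would need to feed strings through the actual code paths rather than a stand-in helper.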

Regards,
Simon


* [U-Boot] [PATCH] patman: encode CC list to UTF-8
  2017-04-25 20:31     ` Simon Glass
@ 2017-04-25 22:27       ` Dr. Philipp Tomsich
  0 siblings, 0 replies; 5+ messages in thread
From: Dr. Philipp Tomsich @ 2017-04-25 22:27 UTC (permalink / raw)
  To: u-boot

Hi Simon,

> On 25 Apr 2017, at 22:31, Simon Glass <sjg@chromium.org> wrote:
> 
> Hi Tom,
> 
> Actually I don't think we are quite there yet. This really needs a
> test with all the different places strings can come from, to make sure
> patman does the right thing.

On the topic of ‘different places strings can come from’, here’s another
change from my WIP tree that fixes some other UTF-8 issues in patman
and may point you towards another trouble spot:

@@ -229,14 +229,16 @@ class Series(dict):
                                            raise_on_error=raise_on_error)
             if add_maintainers:
                 list += get_maintainer.GetMaintainer(commit.patch)
+            list = [s.encode('utf-8') for s in list]
             all_ccs += list
-            print(commit.patch, ', '.join(set(list)).encode('utf-8'), file=fd)
+            print(commit.patch, ', '.join(set(list)), file=fd)
             self._generated_cc[commit.patch] = list
 
         if cover_fname:
             cover_cc = gitutil.BuildEmailList(self.get('cover_cc', ''))
-            cc_list = ', '.join([x.decode('utf-8') for x in set(cover_cc + all_ccs)])
-            print(cover_fname, cc_list.encode('utf-8'), file=fd)
+            cover_cc = [s.encode('utf-8') for s in cover_cc]
+            cc_list = ', '.join([x for x in set(cover_cc + all_ccs)])
+            print(cover_fname, cc_list, file=fd)
 
         fd.close()
         return fname
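
The idea in the diff above, converting every entry to one representation before joining, can also be sketched for Python 3. Note that this sketch goes the opposite direction from the diff: it decodes everything to text and lets the file object do the UTF-8 encoding, which is an assumption about how a Python 3 version might look and not what the patch does. The function make_cc_file and the patch filename are hypothetical.

```python
import io
import os
import tempfile


def make_cc_file(path, cc_by_patch):
    """Write one 'patch cc1, cc2' line per patch, encoded as UTF-8.

    `cc_by_patch` maps a patch filename to a list of addresses, each of
    which may be UTF-8 bytes or text. Hypothetical stand-in, not patman's
    MakeCcFile.
    """
    with io.open(path, 'w', encoding='utf-8') as fd:
        for patch, ccs in sorted(cc_by_patch.items()):
            # Normalise at the boundary so the set holds one type only.
            texts = {c.decode('utf-8') if isinstance(c, bytes) else c
                     for c in ccs}
            print(patch, ', '.join(sorted(texts)), file=fd)


fd_num, path = tempfile.mkstemp()
os.close(fd_num)
make_cc_file(path, {
    '0001-example.patch': [
        '"Heiko St\xfcbner" <heiko@sntech.de>',       # text
        b'"Heiko St\xc3\xbcbner" <heiko@sntech.de>',  # same address as bytes
    ],
})
with io.open(path, 'rb') as fd:
    written = fd.read()
os.remove(path)
```

Because both forms of the address are normalised before the set is built, the duplicate collapses to a single entry and the file contains the name as UTF-8 bytes.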


Regards,
Philipp.
