All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] decodetree: Open files with encoding='utf-8'
@ 2021-01-10  0:02 Philippe Mathieu-Daudé
  2021-01-11 18:28 ` Richard Henderson
  0 siblings, 1 reply; 2+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-01-10  0:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, Richard Henderson,
	Philippe Mathieu-Daudé,
	Yonggang Luo, Cleber Rosa, Paolo Bonzini

When decodetree.py was added in commit 568ae7efae7, QEMU was
using Python 2 which happily reads UTF-8 files in text mode.
Python 3 requires either UTF-8 locale or an explicit encoding
passed to open(). Now that Python 3 is required, explicit
UTF-8 encoding for decodetree source files.

To avoid further problems with the user locale, also explicit
UTF-8 encoding for the generated C files.

Explicit both input/output are plain text by using the 't' mode.

This fixes:

  $ /usr/bin/python3 scripts/decodetree.py test.decode
  Traceback (most recent call last):
    File "scripts/decodetree.py", line 1397, in <module>
      main()
    File "scripts/decodetree.py", line 1308, in main
      parse_file(f, toppat)
    File "scripts/decodetree.py", line 994, in parse_file
      for line in f:
    File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
      return codecs.ascii_decode(input, self.errors)[0]
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80:
  ordinal not in range(128)

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
v3: utf-8 stdout (Eduardo and Yonggang Luo)
v2: utf-8 output too (Peter)
    explicit default text mode.
---
 scripts/decodetree.py | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index 47aa9caf6d1..4637b633e70 100644
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -20,6 +20,7 @@
 # See the syntax and semantics in docs/devel/decodetree.rst.
 #
 
+import io
 import os
 import re
 import sys
@@ -1304,7 +1305,7 @@ def main():
 
     for filename in args:
         input_file = filename
-        f = open(filename, 'r')
+        f = open(filename, 'rt', encoding='utf-8')
         parse_file(f, toppat)
         f.close()
 
@@ -1324,9 +1325,11 @@ def main():
         prop_size(stree)
 
     if output_file:
-        output_fd = open(output_file, 'w')
+        output_fd = open(output_file, 'wt', encoding='utf-8')
     else:
-        output_fd = sys.stdout
+        output_fd = io.TextIOWrapper(sys.stdout.buffer,
+                                     encoding=sys.stdout.encoding,
+                                     errors="ignore")
 
     output_autogen()
     for n in sorted(arguments.keys()):
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH v3] decodetree: Open files with encoding='utf-8'
  2021-01-10  0:02 [PATCH v3] decodetree: Open files with encoding='utf-8' Philippe Mathieu-Daudé
@ 2021-01-11 18:28 ` Richard Henderson
  0 siblings, 0 replies; 2+ messages in thread
From: Richard Henderson @ 2021-01-11 18:28 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel
  Cc: Peter Maydell, Paolo Bonzini, Yonggang Luo, Eduardo Habkost, Cleber Rosa

On 1/9/21 2:02 PM, Philippe Mathieu-Daudé wrote:
> When decodetree.py was added in commit 568ae7efae7, QEMU was
> using Python 2 which happily reads UTF-8 files in text mode.
> Python 3 requires either UTF-8 locale or an explicit encoding
> passed to open(). Now that Python 3 is required, explicit
> UTF-8 encoding for decodetree source files.
> 
> To avoid further problems with the user locale, also explicit
> UTF-8 encoding for the generated C files.
> 
> Explicit both input/output are plain text by using the 't' mode.
> 
> This fixes:
> 
>   $ /usr/bin/python3 scripts/decodetree.py test.decode
>   Traceback (most recent call last):
>     File "scripts/decodetree.py", line 1397, in <module>
>       main()
>     File "scripts/decodetree.py", line 1308, in main
>       parse_file(f, toppat)
>     File "scripts/decodetree.py", line 994, in parse_file
>       for line in f:
>     File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
>       return codecs.ascii_decode(input, self.errors)[0]
>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80:
>   ordinal not in range(128)
> 
> Reported-by: Peter Maydell <peter.maydell@linaro.org>
> Suggested-by: Yonggang Luo <luoyonggang@gmail.com>
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> v3: utf-8 stdout (Eduardo and Yonggang Luo)
> v2: utf-8 output too (Peter)
>     explicit default text mode.
> ---
>  scripts/decodetree.py | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 

Thanks.  Queued to tcg-next.


r~


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-01-11 18:29 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-10  0:02 [PATCH v3] decodetree: Open files with encoding='utf-8' Philippe Mathieu-Daudé
2021-01-11 18:28 ` Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.