* [PATCH V2] scripts/spdxcheck.py: Strictly read license files in utf-8
@ 2021-07-07 20:48 Nishanth Menon
2021-07-12 15:58 ` Jonathan Corbet
From: Nishanth Menon @ 2021-07-07 20:48 UTC
To: Greg Kroah-Hartman, Thomas Gleixner, Jonathan Corbet
Cc: Ravikumar, Rahul, lkml, linux-spdx, Nishanth Menon
Commit bc41a7f36469 ("LICENSES: Add the CC-BY-4.0 license")
unfortunately introduced LICENSES/dual/CC-BY-4.0 as UTF-8 Unicode text,
which python fails to decode with:
FAIL: 'ascii' codec can't decode byte 0xe2 in position 2109: ordinal not in range(128)
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 244, in <module>
    spdx = read_spdxdata(repo)
  File "scripts/spdxcheck.py", line 47, in read_spdxdata
    for l in open(el.path).readlines():
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2109: ordinal not in range(128)
While it is debatable whether 'Licensor.' as used in the license file
needs unicode quotes, force spdxcheck to read utf-8 instead.
Reported-by: Rahul T R <r-ravikumar@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
Changes since V1:
* Commit message update to drop "Let's" "Let us".
* Picked up Thomas' Reviewed-by
V1: https://lore.kernel.org/linux-spdx/20210703012128.27946-1-nm@ti.com/
scripts/spdxcheck.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/spdxcheck.py b/scripts/spdxcheck.py
index 3e784cf9f401..ebd06ae642c9 100755
--- a/scripts/spdxcheck.py
+++ b/scripts/spdxcheck.py
@@ -44,7 +44,7 @@ def read_spdxdata(repo):
                 continue
 
             exception = None
-            for l in open(el.path).readlines():
+            for l in open(el.path, encoding="utf-8").readlines():
                 if l.startswith('Valid-License-Identifier:'):
                     lid = l.split(':')[1].strip().upper()
                     if lid in spdx.licenses:
--
2.32.0
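
For readers following along, the failure and the fix can be reproduced
outside the kernel tree. This is a minimal sketch, not part of the patch:
the sample text, read_lines() and demo() are hypothetical names made up for
illustration, and the ASCII locale default is simulated by passing
encoding="ascii" explicitly rather than by changing the environment.

```python
import os
import tempfile

# Hypothetical sample text; the curly quotes stand in for the non-ASCII
# characters found in LICENSES/dual/CC-BY-4.0.
SAMPLE = "Valid-License-Identifier: CC-BY-4.0\nthe \u2018Licensor.\u2019\n"


def read_lines(path, encoding):
    """Read all lines of path, decoding with the given encoding."""
    with open(path, encoding=encoding) as f:
        return f.readlines()


def demo():
    # Write the sample to a temporary file as UTF-8 bytes.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(SAMPLE.encode("utf-8"))

        # Simulate an ASCII default encoding (assumption: this mirrors
        # what open() does without an encoding under an ASCII locale).
        try:
            read_lines(path, "ascii")
            ascii_failed = False
        except UnicodeDecodeError:
            ascii_failed = True

        # Same shape as the patched call: decode as UTF-8 regardless
        # of the locale.
        lines = read_lines(path, "utf-8")
    finally:
        os.unlink(path)
    return ascii_failed, lines


if __name__ == "__main__":
    failed, lines = demo()
    print("ascii read failed:", failed)
    print("utf-8 read returned", len(lines), "lines")
```

Passing the encoding explicitly makes the result independent of the
caller's locale settings, which is why the one-line change is enough.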
* Re: [PATCH V2] scripts/spdxcheck.py: Strictly read license files in utf-8
2021-07-07 20:48 [PATCH V2] scripts/spdxcheck.py: Strictly read license files in utf-8 Nishanth Menon
@ 2021-07-12 15:58 ` Jonathan Corbet
From: Jonathan Corbet @ 2021-07-12 15:58 UTC
To: Nishanth Menon, Greg Kroah-Hartman, Thomas Gleixner
Cc: Ravikumar, Rahul, lkml, linux-spdx, Nishanth Menon
Nishanth Menon <nm@ti.com> writes:
> Commit bc41a7f36469 ("LICENSES: Add the CC-BY-4.0 license")
> unfortunately introduced LICENSES/dual/CC-BY-4.0 in UTF-8 Unicode text
> While python will barf at it with:
>
> FAIL: 'ascii' codec can't decode byte 0xe2 in position 2109: ordinal not in range(128)
> Traceback (most recent call last):
>   File "scripts/spdxcheck.py", line 244, in <module>
>     spdx = read_spdxdata(repo)
>   File "scripts/spdxcheck.py", line 47, in read_spdxdata
>     for l in open(el.path).readlines():
>   File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
>     return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2109: ordinal not in range(128)
>
> While it is indeed debatable if 'Licensor.' used in the license file
> needs unicode quotes, instead, force spdxcheck to read utf-8.
>
> Reported-by: Rahul T R <r-ravikumar@ti.com>
> Signed-off-by: Nishanth Menon <nm@ti.com>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
I've applied this, thanks.
jon