* [PATCH v2] kunit: tool: continue past invalid utf-8 output
@ 2021-10-20 23:21 Daniel Latypov
From: Daniel Latypov @ 2021-10-20 23:21 UTC
To: brendanhiggins, davidgow
Cc: linux-kernel, kunit-dev, linux-kselftest, skhan, Daniel Latypov
kunit.py currently crashes and fails to parse kernel output if it's not
fully valid utf-8.
This can come from memory corruption or just inadvertently printing
out binary data as strings.
E.g. adding this line into a kunit test
pr_info("\x80");
will cause this exception
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 1961: invalid start byte
We can tell Python how to handle errors, see
https://docs.python.org/3/library/codecs.html#error-handlers
Unfortunately, it doesn't seem like there's a way to specify this in
just one location, so we need to repeat ourselves quite a bit.
Specify `errors='backslashreplace'` so we instead:
* print out the offending byte as '\x80'
* try and continue parsing the output.
* as long as the TAP lines themselves are valid, we're fine.
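The effect of the error handler can be seen in isolation with a minimal
sketch. The byte string below is a made-up stand-in for kernel output (not
real kunit output), and the variable names are purely illustrative:

```python
# Hypothetical kernel output: two valid TAP lines around one stray 0x80 byte.
raw = b"ok 1 - example_test\n\x80 stray byte\nok 2 - other_test\n"

# Default (strict) decoding raises on the 0x80 byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# backslashreplace keeps decoding and renders the bad byte as the
# four characters \x80, so the surrounding lines survive intact.
decoded = raw.decode("utf-8", errors="backslashreplace")
lines = decoded.splitlines()

print(strict_failed)
print(lines)
```

The valid TAP lines before and after the bad byte come through unchanged,
which is what lets the parser carry on.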
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
---
v1 -> v2: add comment to silence erroneous pytype error
---
tools/testing/kunit/kunit.py | 3 ++-
tools/testing/kunit/kunit_kernel.py | 4 ++--
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index e1dd3180f0d1..68e6f461c758 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -477,9 +477,10 @@ def main(argv, linux=None):
sys.exit(1)
elif cli_args.subcommand == 'parse':
if cli_args.file == None:
+ sys.stdin.reconfigure(errors='backslashreplace') # pytype: disable=attribute-error
kunit_output = sys.stdin
else:
- with open(cli_args.file, 'r') as f:
+ with open(cli_args.file, 'r', errors='backslashreplace') as f:
kunit_output = f.read().splitlines()
request = KunitParseRequest(cli_args.raw_output,
None,
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index faa6320e900e..f08c6c36a947 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -135,7 +135,7 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
- text=True, shell=True)
+ text=True, shell=True, errors='backslashreplace')
class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
"""An abstraction over command line operations performed on a source tree."""
@@ -172,7 +172,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
- text=True)
+ text=True, errors='backslashreplace')
def get_kconfig_path(build_dir) -> str:
return get_file_path(build_dir, KCONFIG_PATH)
base-commit: 63b136c634a2bdffd78795bc33ac2d488152ffe8
--
2.33.0.1079.g6e70778dc9-goog
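For anyone wanting to try the three entry points outside kunit.py, here is a
rough standalone sketch. It is not code from the patch: the payload, the
temporary file, and the child command are all invented for illustration, an
in-memory io.TextIOWrapper stands in for sys.stdin, and encoding="utf-8" is
passed explicitly (the patch does not do this) only so the result does not
depend on the locale.

```python
import io
import os
import subprocess
import sys
import tempfile

# Hypothetical output containing one invalid byte.
payload = b"TAP version 14\n\x80\nok 1 - demo\n"

# 1. A file on disk: pass errors= to open().
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
try:
    with open(tmp.name, "r", encoding="utf-8",
              errors="backslashreplace") as f:
        file_lines = f.read().splitlines()
finally:
    os.unlink(tmp.name)

# 2. A child process: pass errors= to Popen alongside text=True.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.buffer.write(%r)" % payload],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
    text=True, encoding="utf-8", errors="backslashreplace")
proc_lines = child.communicate()[0].splitlines()

# 3. An already-open text stream (stand-in for sys.stdin): reconfigure()
#    it before reading anything.
stream = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8")
stream.reconfigure(errors="backslashreplace")
stream_lines = stream.read().splitlines()

print(file_lines, proc_lines, stream_lines)
```

Each path needs its own errors= argument because the decoding happens in a
separate text wrapper every time, which is why the patch repeats itself.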
* Re: [PATCH v2] kunit: tool: continue past invalid utf-8 output
From: David Gow @ 2021-10-21 2:32 UTC
To: Daniel Latypov
Cc: Brendan Higgins, Linux Kernel Mailing List, KUnit Development,
open list:KERNEL SELFTEST FRAMEWORK, Shuah Khan
On Thu, Oct 21, 2021 at 7:21 AM 'Daniel Latypov' via KUnit Development
<kunit-dev@googlegroups.com> wrote:
>
> kunit.py currently crashes and fails to parse kernel output if it's not
> fully valid utf-8.
>
> This can come from memory corruption or just inadvertently printing
> out binary data as strings.
>
> E.g. adding this line into a kunit test
> pr_info("\x80");
> will cause this exception
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 1961: invalid start byte
>
> We can tell Python how to handle errors, see
> https://docs.python.org/3/library/codecs.html#error-handlers
>
> Unfortunately, it doesn't seem like there's a way to specify this in
> just one location, so we need to repeat ourselves quite a bit.
>
> Specify `errors='backslashreplace'` so we instead:
> * print out the offending byte as '\x80'
> * try and continue parsing the output.
> * as long as the TAP lines themselves are valid, we're fine.
>
> Signed-off-by: Daniel Latypov <dlatypov@google.com>
> Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
> ---
> v1 -> v2: add comment to silence erroneous pytype error
> ---
Thanks. I've tested this, and it works well for me. I don't mind the
pytype comment, even though I don't use pytype, so I'm glad it's
there.
Tested-by: David Gow <davidgow@google.com>
Cheers,
-- David
> tools/testing/kunit/kunit.py | 3 ++-
> tools/testing/kunit/kunit_kernel.py | 4 ++--
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
> index e1dd3180f0d1..68e6f461c758 100755
> --- a/tools/testing/kunit/kunit.py
> +++ b/tools/testing/kunit/kunit.py
> @@ -477,9 +477,10 @@ def main(argv, linux=None):
> sys.exit(1)
> elif cli_args.subcommand == 'parse':
> if cli_args.file == None:
> + sys.stdin.reconfigure(errors='backslashreplace') # pytype: disable=attribute-error
> kunit_output = sys.stdin
> else:
> - with open(cli_args.file, 'r') as f:
> + with open(cli_args.file, 'r', errors='backslashreplace') as f:
> kunit_output = f.read().splitlines()
> request = KunitParseRequest(cli_args.raw_output,
> None,
> diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
> index faa6320e900e..f08c6c36a947 100644
> --- a/tools/testing/kunit/kunit_kernel.py
> +++ b/tools/testing/kunit/kunit_kernel.py
> @@ -135,7 +135,7 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
> stdin=subprocess.PIPE,
> stdout=subprocess.PIPE,
> stderr=subprocess.STDOUT,
> - text=True, shell=True)
> + text=True, shell=True, errors='backslashreplace')
>
> class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
> """An abstraction over command line operations performed on a source tree."""
> @@ -172,7 +172,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
> stdin=subprocess.PIPE,
> stdout=subprocess.PIPE,
> stderr=subprocess.STDOUT,
> - text=True)
> + text=True, errors='backslashreplace')
>
> def get_kconfig_path(build_dir) -> str:
> return get_file_path(build_dir, KCONFIG_PATH)
>
> base-commit: 63b136c634a2bdffd78795bc33ac2d488152ffe8
> --
> 2.33.0.1079.g6e70778dc9-goog