kernelci.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [RFC] KTAP spec v2: prefix to KTAP data
@ 2022-05-12  5:59 Frank Rowand
  2022-05-12  6:01 ` Frank Rowand
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Frank Rowand @ 2022-05-12  5:59 UTC (permalink / raw)
  To: Frank Rowand, David Gow, Shuah Khan, Kees Cook, Tim.Bird,
	Brendan Higgins
  Cc: Jonathan Corbet, rmr167, guillaume.tucker, dlatypov, kernelci,
	kunit-dev, linux-kselftest, linux-doc, linux-kernel

In the middle of the "RFC - kernel test result specification (KTAP)" thread,
started in August 2021, Tim Bird made a suggestion to allow a prefix to the
KTAP data format:

> Just as a side note, in some Fuego tests, it was very useful to include an identifier
> in thethe prefix nested tests.  The output looked like this:
>
> TAP version 13
> 1..2
> [batch_id 4] TAP version 13
> [batch_id 4] 1..2
> [batch_id 4] ok 1 - cyclictest with 1000 cycles
> [batch_id 4] # problem setting CLOCK_REALTIME
> [batch_id 4] not ok 2 - cyclictest with CLOCK_REALTIME
> not ok 1 - check realtime
> [batch_id 4] TAP version 13
> [batch_id 4] 1..1
> [batch_id 4] ok 1 - IOZone read/write 4k blocks
> ok 2 - check I/O performance
>
> Can I propose that the prefix not be fixed by the spec, but that the spec indicates that
> whatever the prefix is on the TAP version line, that prefix must be used with the output for
> all lines from the test (with the exception of unknown lines)?

The thread was discussing many other items, but this is the one that I want
to focus on in this new RFC thread.

Tim's original email was:

   https://lore.kernel.org/r/BYAPR13MB2503A4B79074D8ED5579345DFDCB9@BYAPR13MB2503.namprd13.prod.outlook.com

There was one reply to this that commented on Tim's suggestion (and also many
other items in the thread) at:

   https://lore.kernel.org/r/202108301226.800F3D6D4@keescook

> Oh, interesting. This would also allow parallel (unique) test execution
> to be parsable. That sounds workable. (Again, this needs LAVA patching
> again...)

I found Tim's original suggestion to be useful, so I have come up with
two possible ways to modify the KTAP specification to implement what Tim
was thinking about.  I would not be surprised if someone else has a better
suggestion than mine, but I will reply to this email with my two alternatives
to start a discussion.  My alternatives are not in the form of patches, but
if discussion leads to a good result then I will create a patch for review.

-Frank

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] KTAP spec v2: prefix to KTAP data
  2022-05-12  5:59 [RFC] KTAP spec v2: prefix to KTAP data Frank Rowand
@ 2022-05-12  6:01 ` Frank Rowand
  2022-05-12 15:25   ` Daniel Latypov
  2022-05-12  6:02 ` Frank Rowand
  2022-05-12 15:12 ` Daniel Latypov
  2 siblings, 1 reply; 8+ messages in thread
From: Frank Rowand @ 2022-05-12  6:01 UTC (permalink / raw)
  To: David Gow, Shuah Khan, Kees Cook, Tim.Bird, Brendan Higgins
  Cc: Jonathan Corbet, rmr167, guillaume.tucker, dlatypov, kernelci,
	kunit-dev, linux-kselftest, linux-doc, linux-kernel

On 5/12/22 00:59, Frank Rowand wrote:
> In the middle of the "RFC - kernel test result specification (KTAP)" thread,
> started in August 2021, Tim Bird made a suggestion to allow a prefix to the
> KTAP data format:
> 
>> Just as a side note, in some Fuego tests, it was very useful to include an identifier
>> in thethe prefix nested tests.  The output looked like this:
>>
>> TAP version 13
>> 1..2
>> [batch_id 4] TAP version 13
>> [batch_id 4] 1..2
>> [batch_id 4] ok 1 - cyclictest with 1000 cycles
>> [batch_id 4] # problem setting CLOCK_REALTIME
>> [batch_id 4] not ok 2 - cyclictest with CLOCK_REALTIME
>> not ok 1 - check realtime
>> [batch_id 4] TAP version 13
>> [batch_id 4] 1..1
>> [batch_id 4] ok 1 - IOZone read/write 4k blocks
>> ok 2 - check I/O performance
>>
>> Can I propose that the prefix not be fixed by the spec, but that the spec indicates that
>> whatever the prefix is on the TAP version line, that prefix must be used with the output for
>> all lines from the test (with the exception of unknown lines)?
> 
> The thread was discussing many other items, but this is the one that I want
> to focus on in this new RFC thread.
> 
> Tim's original email was:
> 
>    https://lore.kernel.org/r/BYAPR13MB2503A4B79074D8ED5579345DFDCB9@BYAPR13MB2503.namprd13.prod.outlook.com
> 
> There was one reply to this that commented on Tim's suggestion (and also many
> other items in the thread) at:
> 
>    https://lore.kernel.org/r/202108301226.800F3D6D4@keescook
> 
>> Oh, interesting. This would also allow parallel (unique) test execution
>> to be parsable. That sounds workable. (Again, this needs LAVA patching
>> again...)
> 
> I found Tim's original suggestion to be useful, so I have come up with
> two possible ways to modify the KTAP specification to implement what Tim
> was thinking about.  I would not be surprised if someone else has a better
> suggestion than mine, but I will reply to this email with my two alternatives
> to start a discussion.  My alternatives are not in the form of patches, but
> if discussion leads to a good result then I will create a patch for review.
> 
> -Frank

================================================================================
Alternative 1
   - Add an optional <prefix string> to test output.
================================================================================
## <prefix string>

If the optional [<prefix string>] is present for any line of the KTAP formatted
output, then it must be present for all lines of the KTAP formatted output
(except for "Diagnostic lines", are allowed to not be prefixed with
<prefix string>, as described below).

The "Version lines" format is changed from:

   KTAP version 1

to:

   [<prefix string>] KTAP version 1

The "Plan lines" format is changed from:

   "1..N"

to:

   [<prefix string>] "1..N"

The "Test case result lines" format is changed from:

   <result> <number> [<description>][ # [<directive>] [<diagnostic data>]]

to:

   [<prefix string>] <result> <number> [<description>][ # [<directive>] [<diagnostic data>]]


   <prefix content is a constant string>

   All subsequent test lines are prefixed with the <prefix string>.

   Indentation for "Nested tests" follows <prefix string>.  The indentation
   does NOT precede <prefix string>.

   "Diagnostic lines" are prefixed with the <prefix string>.

   "Unknown lines" may optionally be prefixed with the <prefix string>, but
   are not required to be prefixed with the <prefix string>.  It is allowed
   for some "Unknown lines" to not be prefixed with the <prefix string>, even
   if one or more other "Unknown lines" are prefixed with the <prefix string>.


================================================================================
#### discussion notes:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
PRO: minimally invasive to specification.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CON:

KTAP does not include any mechanism to describe the value of <prefix string>
to test harnesses and test output processing programs.  The test output
processing programs must infer the value of <prefix string> by detecting
the <prefix string> in the "Version lines".

The detection of a "Version lines" might be a match of the regex:

   "^.*KTAP version 2$"

This risks falsely detecting a "Version lines", but the risk is small???

The impact of falsely detecting a "Version lines" may be large if the
beginning of test detection is a state machine that moves into a "detect
either Plan lines or Test case result lines" after falsely detecting a
"Version lines".  The impact may be small if the state maching instead
moves into a "detect Version lines, Plan lines, or Test case result lines".


================================================================================


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] KTAP spec v2: prefix to KTAP data
  2022-05-12  5:59 [RFC] KTAP spec v2: prefix to KTAP data Frank Rowand
  2022-05-12  6:01 ` Frank Rowand
@ 2022-05-12  6:02 ` Frank Rowand
  2022-05-13 22:19   ` Bird, Tim
  2022-05-12 15:12 ` Daniel Latypov
  2 siblings, 1 reply; 8+ messages in thread
From: Frank Rowand @ 2022-05-12  6:02 UTC (permalink / raw)
  To: David Gow, Shuah Khan, Kees Cook, Tim.Bird, Brendan Higgins
  Cc: Jonathan Corbet, rmr167, guillaume.tucker, dlatypov, kernelci,
	kunit-dev, linux-kselftest, linux-doc, linux-kernel

On 5/12/22 00:59, Frank Rowand wrote:
> In the middle of the "RFC - kernel test result specification (KTAP)" thread,
> started in August 2021, Tim Bird made a suggestion to allow a prefix to the
> KTAP data format:
> 
>> Just as a side note, in some Fuego tests, it was very useful to include an identifier
>> in thethe prefix nested tests.  The output looked like this:
>>
>> TAP version 13
>> 1..2
>> [batch_id 4] TAP version 13
>> [batch_id 4] 1..2
>> [batch_id 4] ok 1 - cyclictest with 1000 cycles
>> [batch_id 4] # problem setting CLOCK_REALTIME
>> [batch_id 4] not ok 2 - cyclictest with CLOCK_REALTIME
>> not ok 1 - check realtime
>> [batch_id 4] TAP version 13
>> [batch_id 4] 1..1
>> [batch_id 4] ok 1 - IOZone read/write 4k blocks
>> ok 2 - check I/O performance
>>
>> Can I propose that the prefix not be fixed by the spec, but that the spec indicates that
>> whatever the prefix is on the TAP version line, that prefix must be used with the output for
>> all lines from the test (with the exception of unknown lines)?
> 
> The thread was discussing many other items, but this is the one that I want
> to focus on in this new RFC thread.
> 
> Tim's original email was:
> 
>    https://lore.kernel.org/r/BYAPR13MB2503A4B79074D8ED5579345DFDCB9@BYAPR13MB2503.namprd13.prod.outlook.com
> 
> There was one reply to this that commented on Tim's suggestion (and also many
> other items in the thread) at:
> 
>    https://lore.kernel.org/r/202108301226.800F3D6D4@keescook
> 
>> Oh, interesting. This would also allow parallel (unique) test execution
>> to be parsable. That sounds workable. (Again, this needs LAVA patching
>> again...)
> 
> I found Tim's original suggestion to be useful, so I have come up with
> two possible ways to modify the KTAP specification to implement what Tim
> was thinking about.  I would not be surprised if someone else has a better
> suggestion than mine, but I will reply to this email with my two alternatives
> to start a discussion.  My alternatives are not in the form of patches, but
> if discussion leads to a good result then I will create a patch for review.
> 
> -Frank

================================================================================
Alternative 2
   - Add an optional <prefix string> to test output.
   - Add "Configuration info lines", which are used to provide information
     about the KTAP format to programs that interpret the KTAP data.  The
     only type of "Configuration info lines" proposed provides the value
     of <prefix string> for programs that process the KTAP output.
   - Further types of "Configuration info lines" could be added.

Alternative 2b
   - Add an optional <prefix string> to test output.
   - Add a <prefix string> definition line of the form:
        ok 0 <prefix string>
     This line is essentially a phony "Test case result lines" for test 0.

================================================================================
## Configuration info lines

Occur in zero or more test case result lines, where <number> is 0 (zero),
following the "Plan line", before any other "Test case result" line.

   If there is no "Plan line" at the beginning of the test, then the
   "Configuration info lines" follow the "Version line", before any
   other "Test case result" line.

format:

   ok 0 <description> # [<directive>] [<diagnostic data>]]

<result> must be "ok".

The type of each "Configuration info line" is defined by the value
of <description>.

Each value of <description> used for a "Configuration info line" must be
listed in the specification.

Whether # <directive> is optional or required is defined for each type of
"Configuration info line".

#### Should '[<diagnostic data>]' be included in the format?


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# [output] prefix

ok 0 output_prefix #<prefix string>

   <prefix content is a constant string>
   Note: there is no space between "#" and "<prefix string>".  If there is one
   or more spaces after the "#", then there are part of the <prefix string>

   ## <diagnostic data> must not exist unless there is a way to determine the
   ## end of <prefix string>.
   ##
   ## Adding a way to determine the end of <prefix string> adds much complexity
   ## to the consumers of ktap output.  For example, if the format was:
   ##
   ##   ok 0 output_prefix #<prefix string>[#<diagnostic data>]
   ##
   ## then it is not possible for <prefix string> to contain '#', unless
   ## a method to escape '#' is provided.  E.G.:
   ##
   ##   ok 0 output_prefix #XXX test result \#\#\##<diagnostic data>
   ##
   ## would result in <prefix string> value of 'XXX test result ###'
   ##
   ## My recomendation: do not allow <diagnostic data> in output_prefix format.

   All subsequent test lines are prefixed with the <prefix string>.

   Indentation for "Nested tests" follows <prefix string>.  The indentation
   does NOT precede <prefix string>.

   "Diagnostic lines" are prefixed with the <prefix string>.

   "Unknown lines" may optionally be prefixed with the <prefix string>, but
   are not required to be prefixed with the <prefix string>.  It is allowed
   for some "Unknown lines" to not be prefixed with the <prefix string>, even
   if one or more other "Unknown lines" are prefixed with the <prefix string>.

   #### Does prefixing begin immediately (even for a subsequent "Configuration
   #### info line" or does it begin with the test 1 "Test case result line"
   #### line?
   ####
   ####    This question might be simplified if the output_prefix line is
   ####    required to be the last "Configuration info line".  (Or maybe even if
   ####    required to be the first "Configuration info line".


================================================================================
#### discussion notes:

PRO:
   Test output processing programs can detect the value of <prefix string>
   more deterministically than Alternative 1.

CON:
   More complex than Alternative 1.  (But alternative 2b is less complex than
   Alternative 2.)

================================================================================


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] KTAP spec v2: prefix to KTAP data
  2022-05-12  5:59 [RFC] KTAP spec v2: prefix to KTAP data Frank Rowand
  2022-05-12  6:01 ` Frank Rowand
  2022-05-12  6:02 ` Frank Rowand
@ 2022-05-12 15:12 ` Daniel Latypov
  2 siblings, 0 replies; 8+ messages in thread
From: Daniel Latypov @ 2022-05-12 15:12 UTC (permalink / raw)
  To: Frank Rowand
  Cc: David Gow, Shuah Khan, Kees Cook, Tim.Bird, Brendan Higgins,
	Jonathan Corbet, rmr167, guillaume.tucker, kernelci, kunit-dev,
	linux-kselftest, linux-doc, linux-kernel

On Wed, May 11, 2022 at 10:59 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> In the middle of the "RFC - kernel test result specification (KTAP)" thread,
> started in August 2021, Tim Bird made a suggestion to allow a prefix to the
> KTAP data format:
>
> > Just as a side note, in some Fuego tests, it was very useful to include an identifier
> > in thethe prefix nested tests.  The output looked like this:
> >
> > TAP version 13
> > 1..2
> > [batch_id 4] TAP version 13
> > [batch_id 4] 1..2
> > [batch_id 4] ok 1 - cyclictest with 1000 cycles
> > [batch_id 4] # problem setting CLOCK_REALTIME
> > [batch_id 4] not ok 2 - cyclictest with CLOCK_REALTIME
> > not ok 1 - check realtime
> > [batch_id 4] TAP version 13
> > [batch_id 4] 1..1
> > [batch_id 4] ok 1 - IOZone read/write 4k blocks
> > ok 2 - check I/O performance
> >
> > Can I propose that the prefix not be fixed by the spec, but that the spec indicates that
> > whatever the prefix is on the TAP version line, that prefix must be used with the output for
> > all lines from the test (with the exception of unknown lines)?

Just chiming in since I didn't see it mentioned after a quick skim of
the original thread:

This is already basically the behavior of kunit.py's TAP parser since
commit afc63da64f1e5e41875c98707020e85050f8a0c5
Author: Heidi Fahim <heidifahim@google.com>
Date:   Mon Mar 16 13:21:24 2020 -0700

    kunit: kunit_parser: make parser more robust

    Previously, kunit_parser did not properly handle kunit TAP output that
    - had any prefixes (generated from different configs e.g.
    CONFIG_PRINTK_TIME)
...

The notable difference is that only the prefix _length_ is fixed, not
the contents of the string itself.

So ignoring a dynamic prefix is a practical necessity if we want to
parse TAP from kernelspace/printk across a range of configs.
But I don't know if this dynamic version is worth including in the spec.
The static prefix makes more sense to me to formalize, and if we go
down that route, at least kunit.py will already be compliant :)

Daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] KTAP spec v2: prefix to KTAP data
  2022-05-12  6:01 ` Frank Rowand
@ 2022-05-12 15:25   ` Daniel Latypov
  2022-05-12 17:26     ` Frank Rowand
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Latypov @ 2022-05-12 15:25 UTC (permalink / raw)
  To: Frank Rowand
  Cc: David Gow, Shuah Khan, Kees Cook, Tim.Bird, Brendan Higgins,
	Jonathan Corbet, rmr167, guillaume.tucker, kernelci, kunit-dev,
	linux-kselftest, linux-doc, linux-kernel

On Wed, May 11, 2022 at 11:01 PM Frank Rowand <frowand.list@gmail.com> wrote:
> ================================================================================
> #### discussion notes:
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> PRO: minimally invasive to specification.
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> CON:
>
> KTAP does not include any mechanism to describe the value of <prefix string>
> to test harnesses and test output processing programs.  The test output
> processing programs must infer the value of <prefix string> by detecting
> the <prefix string> in the "Version lines".
>
> The detection of a "Version lines" might be a match of the regex:
>
>    "^.*KTAP version 2$"
>
> This risks falsely detecting a "Version lines", but the risk is small???

Agree this is a risk and also think it's probably small, but it's hard to say.
I think the $ anchoring the regex is probably safe enough.

As noted earlier, this tracks with what kunit.py already does.
That was necessitated by dynamic prefixes such as timestamps, etc.
So I think this is probably a fine risk to take.

I imagine we could add constraints of prefix string, e.g. must have []
around it, etc. if we want to try and minimize this risk.
But I don't know if it's necessarily worth it, given what we know right now.

Along those lines, I think I like this approach (Alternative 1) more
than Alternative 2/2b.
I'm not sure we need a structured way to specify metadata in KTAP yet?
The prefix seems like a reasonable candidate, but do others have ideas
of other bits of metadata we'd want to be able to declare?

Daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] KTAP spec v2: prefix to KTAP data
  2022-05-12 15:25   ` Daniel Latypov
@ 2022-05-12 17:26     ` Frank Rowand
  2022-05-14  8:26       ` David Gow
  0 siblings, 1 reply; 8+ messages in thread
From: Frank Rowand @ 2022-05-12 17:26 UTC (permalink / raw)
  To: Daniel Latypov
  Cc: David Gow, Shuah Khan, Kees Cook, Tim.Bird, Brendan Higgins,
	Jonathan Corbet, rmr167, guillaume.tucker, kernelci, kunit-dev,
	linux-kselftest, linux-doc, linux-kernel

On 5/12/22 10:25, Daniel Latypov wrote:
> On Wed, May 11, 2022 at 11:01 PM Frank Rowand <frowand.list@gmail.com> wrote:
>> ================================================================================
>> #### discussion notes:
>>
>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> PRO: minimally invasive to specification.
>>
>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> CON:
>>
>> KTAP does not include any mechanism to describe the value of <prefix string>
>> to test harnesses and test output processing programs.  The test output
>> processing programs must infer the value of <prefix string> by detecting
>> the <prefix string> in the "Version lines".
>>
>> The detection of a "Version lines" might be a match of the regex:
>>
>>    "^.*KTAP version 2$"
>>
>> This risks falsely detecting a "Version lines", but the risk is small???
> 
> Agree this is a risk and also think it's probably small, but it's hard to say.
> I think the $ anchoring the regex is probably safe enough.
> 
> As noted earlier, this tracks with what kunit.py already does.
> That was necessitated by dynamic prefixes such as timestamps, etc.f

That is a good observation.  I nearly always have the prefix timestamp
on my console output, and thus remove the timestamp with a regex when
processing the data.

-Frank

> So I think this is probably a fine risk to take.
> 
> I imagine we could add constraints of prefix string, e.g. must have []
> around it, etc. if we want to try and minimize this risk.
> But I don't know if it's necessarily worth it, given what we know right now.
> 
> Along those lines, I think I like this approach (Alternative 1) more
> than Alternative 2/2b.
> I'm not sure we need a structured way to specify metadata in KTAP yet?
> The prefix seems like a reasonable candidate, but do others have ideas
> of other bits of metadata we'd want to be able to declare?
> 
> Daniel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] KTAP spec v2: prefix to KTAP data
  2022-05-12  6:02 ` Frank Rowand
@ 2022-05-13 22:19   ` Bird, Tim
  0 siblings, 0 replies; 8+ messages in thread
From: Bird, Tim @ 2022-05-13 22:19 UTC (permalink / raw)
  To: Frank Rowand, David Gow, Shuah Khan, Kees Cook, Brendan Higgins
  Cc: Jonathan Corbet, rmr167, guillaume.tucker, dlatypov, kernelci,
	kunit-dev, linux-kselftest, linux-doc, linux-kernel



> -----Original Message-----
> From: Frank Rowand <frowand.list@gmail.com>
> 
> On 5/12/22 00:59, Frank Rowand wrote:
> > In the middle of the "RFC - kernel test result specification (KTAP)" thread,
> > started in August 2021, Tim Bird made a suggestion to allow a prefix to the
> > KTAP data format:
> >
> >> Just as a side note, in some Fuego tests, it was very useful to include an identifier
> >> in thethe prefix nested tests.  The output looked like this:
> >>
> >> TAP version 13
> >> 1..2
> >> [batch_id 4] TAP version 13
> >> [batch_id 4] 1..2
> >> [batch_id 4] ok 1 - cyclictest with 1000 cycles
> >> [batch_id 4] # problem setting CLOCK_REALTIME
> >> [batch_id 4] not ok 2 - cyclictest with CLOCK_REALTIME
> >> not ok 1 - check realtime
> >> [batch_id 4] TAP version 13
> >> [batch_id 4] 1..1
> >> [batch_id 4] ok 1 - IOZone read/write 4k blocks
> >> ok 2 - check I/O performance
> >>
> >> Can I propose that the prefix not be fixed by the spec, but that the spec indicates that
> >> whatever the prefix is on the TAP version line, that prefix must be used with the output for
> >> all lines from the test (with the exception of unknown lines)?
> >
> > The thread was discussing many other items, but this is the one that I want
> > to focus on in this new RFC thread.
> >
> > Tim's original email was:
> >
> >    https://lore.kernel.org/r/BYAPR13MB2503A4B79074D8ED5579345DFDCB9@BYAPR13MB2503.namprd13.prod.outlook.com
> >
> > There was one reply to this that commented on Tim's suggestion (and also many
> > other items in the thread) at:
> >
> >    https://lore.kernel.org/r/202108301226.800F3D6D4@keescook
> >
> >> Oh, interesting. This would also allow parallel (unique) test execution
> >> to be parsable. That sounds workable. (Again, this needs LAVA patching
> >> again...)
> >
> > I found Tim's original suggestion to be useful, so I have come up with
> > two possible ways to modify the KTAP specification to implement what Tim
> > was thinking about.  I would not be surprised if someone else has a better
> > suggestion than mine, but I will reply to this email with my two alternatives
> > to start a discussion.  My alternatives are not in the form of patches, but
> > if discussion leads to a good result then I will create a patch for review.
> >
> > -Frank
> 
> ================================================================================
> Alternative 2
>    - Add an optional <prefix string> to test output.
>    - Add "Configuration info lines", which are used to provide information
>      about the KTAP format to programs that interpret the KTAP data.  The
>      only type of "Configuration info lines" proposed provides the value
>      of <prefix string> for programs that process the KTAP output.
>    - Further types of "Configuration info lines" could be added.
> 
> Alternative 2b
>    - Add an optional <prefix string> to test output.
>    - Add a <prefix string> definition line of the form:
>         ok 0 <prefix string>
>      This line is essentially a phony "Test case result lines" for test 0.
> 
> ================================================================================

For what it's worth, I much prefer Alternative 1, using the prefix on the KTAP line as
the prefix for the lines associated with that test's output.  I think it's much simpler.

Thanks for following up on this.
 -- Tim




> ## Configuration info lines
> 
> Occur in zero or more test case result lines, where <number> is 0 (zero),
> following the "Plan line", before any other "Test case result" line.
> 
>    If there is no "Plan line" at the beginning of the test, then the
>    "Configuration info lines" follow the "Version line", before any
>    other "Test case result" line.
> 
> format:
> 
>    ok 0 <description> # [<directive>] [<diagnostic data>]]
> 
> <result> must be "ok".
> 
> The type of each "Configuration info line" is defined by the value
> of <description>.
> 
> Each value of <description> used for a "Configuration info line" must be
> listed in the specification.
> 
> Whether # <directive> is optional or required is defined for each type of
> "Configuration info line".
> 
> #### Should '[<diagnostic data>]' be included in the format?
> 
> 
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> # [output] prefix
> 
> ok 0 output_prefix #<prefix string>
> 
>    <prefix content is a constant string>
>    Note: there is no space between "#" and "<prefix string>".  If there is one
>    or more spaces after the "#", then there are part of the <prefix string>
> 
>    ## <diagnostic data> must not exist unless there is a way to determine the
>    ## end of <prefix string>.
>    ##
>    ## Adding a way to determine the end of <prefix string> adds much complexity
>    ## to the consumers of ktap output.  For example, if the format was:
>    ##
>    ##   ok 0 output_prefix #<prefix string>[#<diagnostic data>]
>    ##
>    ## then it is not possible for <prefix string> to contain '#', unless
>    ## a method to escape '#' is provided.  E.G.:
>    ##
>    ##   ok 0 output_prefix #XXX test result \#\#\##<diagnostic data>
>    ##
>    ## would result in <prefix string> value of 'XXX test result ###'
>    ##
>    ## My recomendation: do not allow <diagnostic data> in output_prefix format.
> 
>    All subsequent test lines are prefixed with the <prefix string>.
> 
>    Indentation for "Nested tests" follows <prefix string>.  The indentation
>    does NOT precede <prefix string>.
> 
>    "Diagnostic lines" are prefixed with the <prefix string>.
> 
>    "Unknown lines" may optionally be prefixed with the <prefix string>, but
>    are not required to be prefixed with the <prefix string>.  It is allowed
>    for some "Unknown lines" to not be prefixed with the <prefix string>, even
>    if one or more other "Unknown lines" are prefixed with the <prefix string>.
> 
>    #### Does prefixing begin immediately (even for a subsequent "Configuration
>    #### info line" or does it begin with the test 1 "Test case result line"
>    #### line?
>    ####
>    ####    This question might be simplified if the output_prefix line is
>    ####    required to be the last "Configuration info line".  (Or maybe even if
>    ####    required to be the first "Configuration info line".
> 
> 
> ================================================================================
> #### discussion notes:
> 
> PRO:
>    Test output processing programs can detect the value of <prefix string>
>    more deterministically than Alternative 1.
> 
> CON:
>    More complex than Alternative 1.  (But alternative 2b is less complex than
>    Alternative 2.)
> 
> ================================================================================


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] KTAP spec v2: prefix to KTAP data
  2022-05-12 17:26     ` Frank Rowand
@ 2022-05-14  8:26       ` David Gow
  0 siblings, 0 replies; 8+ messages in thread
From: David Gow @ 2022-05-14  8:26 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Daniel Latypov, Shuah Khan, Kees Cook, Bird, Tim,
	Brendan Higgins, Jonathan Corbet, Rae Moar, Guillaume Tucker,
	kernelci, KUnit Development, open list:KERNEL SELFTEST FRAMEWORK,
	open list:DOCUMENTATION, Linux Kernel Mailing List

On Fri, May 13, 2022 at 1:26 AM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 5/12/22 10:25, Daniel Latypov wrote:
> > On Wed, May 11, 2022 at 11:01 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >> ================================================================================
> >> #### discussion notes:
> >>
> >> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >> PRO: minimally invasive to specification.
> >>
> >> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >> CON:
> >>
> >> KTAP does not include any mechanism to describe the value of <prefix string>
> >> to test harnesses and test output processing programs.  The test output
> >> processing programs must infer the value of <prefix string> by detecting
> >> the <prefix string> in the "Version lines".
> >>
> >> The detection of a "Version lines" might be a match of the regex:
> >>
> >>    "^.*KTAP version 2$"
> >>
> >> This risks falsely detecting a "Version lines", but the risk is small???
> >
> > Agree this is a risk and also think it's probably small, but it's hard to say.
> > I think the $ anchoring the regex is probably safe enough.
> >
> > As noted earlier, this tracks with what kunit.py already does.
> > That was necessitated by dynamic prefixes such as timestamps, etc.f
>
> That is a good observation.  I nearly always have the prefix timestamp
> on my console output, and thus remove the timestamp with a regex when
> processing the data.
>

Yeah: the use of a prefix length for timestamps (and similar 'dynamic
prefixes') works really well as a way to automatically find KTAP
output in the kernel log.
It's definitely less useful as a way of disambiguating several
(potentially interleaved) KTAP streams, though.

Personally, my conception of KTAP is that the prefix is not a part of
the format, but a way (possibly one we should recommend in the spec)
to isolate KTAP output from (e.g.) the kernel log.
So, in theory, the processing pipeline for log -> result is:
[dmesg] ---<remove prefix>--> [KTAP] ---<parse>---> [results]

I wouldn't object to having the prefix be a part of KTAP if it were
particularly useful to anyone, though, and definitely agree we should
note ways to do it.

Either way, we do end up with two (slightly conflicting) uses of prefixes:
a) To isolate tests from a log.
b) To allow multiple concurrent tests.

For a), Alternative 1 seems obviously correct to me, since whatever
module is being used might not know what random prefixes are being
added to its log lines.
For b), either would work, but we do potentially get conflicts with a)
if we're trying to do both. So maybe alternative 2 makes sense for
that.

>
> > So I think this is probably a fine risk to take.
> >
> > I imagine we could add constraints of prefix string, e.g. must have []
> > around it, etc. if we want to try and minimize this risk.
> > But I don't know if it's necessarily worth it, given what we know right now.
> >
> > Along those lines, I think I like this approach (Alternative 1) more
> > than Alternative 2/2b.

I agree, though note that we might need both if we want two separate
types of prefix as mentioned above.

> > I'm not sure we need a structured way to specify metadata in KTAP yet?
> > The prefix seems like a reasonable candidate, but do others have ideas
> > of other bits of metadata we'd want to be able to declare?
> >

That's the important takeaway from Alternative 2: some generic way of
having metadata could be quite handy. (Of them, Alternative 2b feels a
little more like a hack than I'd prefer, albeit a clever one. I'd
prefer a cleaner "configuration info line" personally, though if
reusing the test results for test 0 makes parsing significantly easier
for others, it may nevertheless be worth doing.)

As for other pieces of metadata, there are all sorts of useful
information about the test which could be useful to store alongside
the results (what tool generated it, kernel version, etc), though
those might be more useful when saving results elsewhere than embedded
into the kernel output.

Equally, if we extended these configuration/info lines to individual
test/subtests, things like test size, originating module, etc could
potentially be stored. Suite/test init functions could potentially add
useful metadata, too (e.g., number of cores/threads for kcsan, etc).

None of those are, I think, quite compelling enough individually to
add a config system, though. If we already had one for the prefix, and
other things which are necessary for parsing, maybe...

-- David

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-05-14  8:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-12  5:59 [RFC] KTAP spec v2: prefix to KTAP data Frank Rowand
2022-05-12  6:01 ` Frank Rowand
2022-05-12 15:25   ` Daniel Latypov
2022-05-12 17:26     ` Frank Rowand
2022-05-14  8:26       ` David Gow
2022-05-12  6:02 ` Frank Rowand
2022-05-13 22:19   ` Bird, Tim
2022-05-12 15:12 ` Daniel Latypov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).