Re: [PATCH 1/3] simplebench: add img_bench_templater.py

From: Hanna Reitz <hreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org
Cc: kwolf@redhat.com, den@openvz.org, qemu-devel@nongnu.org,
	mreitz@redhat.com
Subject: Re: [PATCH 1/3] simplebench: add img_bench_templater.py
Date: Thu, 19 Aug 2021 18:37:20 +0200	[thread overview]
Message-ID: <784f21b4-f990-f0af-1f24-caa2c66144bf@redhat.com> (raw)
In-Reply-To: <20210724133846.64614-2-vsementsov@virtuozzo.com>

On 24.07.21 15:38, Vladimir Sementsov-Ogievskiy wrote:
> Add simple grammar-parsing template benchmark.

This doesn’t really say much, and FWIW, for like ten minutes I thought 
this would do something completely different than it did (while I was 
trying to parse the help text).

(I thought this was about formatting an existing test’s output, and that 
“template” were kind of the wrong word, but then it turned out it’s 
exactly the right word, only that this is not about using a test’s 
output as a template, but actually using a template of a test (i.e. a 
test template, not a template test) to generate test instances to 
run...  Which of course is much cooler.)

Functionality-wise, as far as I understand (of course I have no 
knowledge of lark), this looks good to me.  And it’s really quite cool.

I just found the documentation confusing, so I have some suggestions for 
it below.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   scripts/simplebench/img_bench_templater.py | 85 ++++++++++++++++++++++
>   scripts/simplebench/table_templater.py     | 62 ++++++++++++++++
>   2 files changed, 147 insertions(+)
>   create mode 100755 scripts/simplebench/img_bench_templater.py
>   create mode 100644 scripts/simplebench/table_templater.py
>
> diff --git a/scripts/simplebench/img_bench_templater.py b/scripts/simplebench/img_bench_templater.py
> new file mode 100755
> index 0000000000..d18a243d35
> --- /dev/null
> +++ b/scripts/simplebench/img_bench_templater.py
> @@ -0,0 +1,85 @@
> +#!/usr/bin/env python3
> +#
> +# Run img-bench template tests
> +#
> +# Copyright (c) 2021 Virtuozzo International GmbH.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +
> +import sys
> +import subprocess
> +import re
> +import json
> +
> +import simplebench
> +from results_to_text import results_to_text
> +from table_templater import Templater
> +
> +
> +def bench_func(env, case):
> +    test = templater.gen(env['data'], case['data'])
> +
> +    p = subprocess.run(test, shell=True, stdout=subprocess.PIPE,
> +                       stderr=subprocess.STDOUT, universal_newlines=True)
> +
> +    if p.returncode == 0:
> +        try:
> +            m = re.search(r'Run completed in (\d+.\d+) seconds.', p.stdout)
> +            return {'seconds': float(m.group(1))}
> +        except Exception:
> +            return {'error': f'failed to parse qemu-img output: {p.stdout}'}
> +    else:
> +        return {'error': f'qemu-img failed: {p.returncode}: {p.stdout}'}
> +
> +
> +if __name__ == '__main__':
> +    if len(sys.argv) > 1:
> +        print("""
> +Usage: no arguments. Just pass template test to stdin. Template test is

FWIW, I completely misunderstood this.

At first, this sounded really ambiguous to me; then I thought that 
clearly this must mean that one should pipe the test’s output to this 
script, i.e.

$ path/to/test.sh | scripts/simplebench/img_bench_templater.py

But now after reading more, I finally understand that this is not what 
is meant, but actually literally passing some template of a test script 
to this script, i.e.

$ scripts/simplebench/img_bench_templater.py < path/to/test-template.sh

So, two things; first, I believe it should be a “test template”, not a 
“template test”, because this is about templates for a test, not about a 
test that has something to do with templates.

Second, perhaps we should start with what this does.

Perhaps:

“This script generates performance tests from a test template (example 
below), runs them, and displays the results in a table. The template is 
read from stdin.  It must be written in bash and end with a `qemu-img 
bench` invocation (whose result is parsed to get the test instance’s 
result).”?

> +a bash script, last command should be qemu-img bench (it's output is parsed
> +to get a result). For templating use the following synax:

“Use the following syntax in the template to create the various 
different test instances:”?

> +
> +  column templating: {var1|var2|...} - test will use different values in
> +  different columns. You may use several {} constructions in the test, in this
> +  case product of all choice-sets will be used.
> +
> +  row templating: [var1|var2|...] - similar thing to define rows (test-cases)
> +
> +Test tempalate example:

*template

> +
> +Assume you want to compare two qemu-img binaries, called qemu-img-old and
> +qemu-img-new in your build directory in two test-cases with 4K writes and 64K
> +writes. Test may look like this:

I’d prefer s/Test/The template/.

> +
> +qemu_img=/path/to/qemu/build/qemu-img-{old|new}
> +$qemu_img create -f qcow2 /ssd/x.qcow2 1G
> +$qemu_img bench -c 100 -d 8 [-s 4K|-s 64K] -w -t none -n /ssd/x.qcow2
> +
> +If pass it to stdin of img_bench_templater.py, the resulting comparison table

s/If pass it/When passing this/

> +will contain two columns (for two binaries) and two rows (for two test-cases).
> +""")
> +        sys.exit()
> +
> +    templater = Templater(sys.stdin.read())
> +
> +    envs = [{'id': ' / '.join(x), 'data': x} for x in templater.columns]
> +    cases = [{'id': ' / '.join(x), 'data': x} for x in templater.rows]
> +
> +    result = simplebench.bench(bench_func, envs, cases, count=5,
> +                               initial_run=False)
> +    print(results_to_text(result))
> +    with open('results.json', 'w') as f:
> +        json.dump(result, f, indent=4)

Is this worth documenting?

> diff --git a/scripts/simplebench/table_templater.py b/scripts/simplebench/table_templater.py
> new file mode 100644
> index 0000000000..950f3b3024
> --- /dev/null
> +++ b/scripts/simplebench/table_templater.py
> @@ -0,0 +1,62 @@
> +# Parser for test templates
> +#
> +# Copyright (c) 2021 Virtuozzo International GmbH.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +import itertools
> +from lark import Lark
> +
> +grammar = """
> +start: ( text | column_switch | row_switch )+
> +
> +column_switch: "{" text ["|" text]+ "}"
> +row_switch: "[" text ["|" text]+ "]"
> +text: /[^|{}\[\]]+/

So I have no idea how this really works, of course, but does this mean 
that the `text` pattern cannot contain pipe symbols?  I.e. that you 
cannot use pipes in the test template?

Hanna

> +"""
> +
> +parser = Lark(grammar)
> +
> +class Templater:
> +    def __init__(self, template):
> +        self.tree = parser.parse(template)
> +
> +        c_switches = []
> +        r_switches = []
> +        for x in self.tree.children:
> +            if x.data == 'column_switch':
> +                c_switches.append([el.children[0].value for el in x.children])
> +            elif x.data == 'row_switch':
> +                r_switches.append([el.children[0].value for el in x.children])
> +
> +        self.columns = list(itertools.product(*c_switches))
> +        self.rows = list(itertools.product(*r_switches))
> +
> +    def gen(self, column, row):
> +        i = 0
> +        j = 0
> +        result = []
> +
> +        for x in self.tree.children:
> +            if x.data == 'text':
> +                result.append(x.children[0].value)
> +            elif x.data == 'column_switch':
> +                result.append(column[i])
> +                i += 1
> +            elif x.data == 'row_switch':
> +                result.append(row[j])
> +                j += 1
> +
> +        return ''.join(result)