All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] iotests: multiprocessing!!
@ 2021-12-03 12:22 Vladimir Sementsov-Ogievskiy
  2021-12-03 12:22 ` [PATCH 1/3] iotests/testrunner.py: add doc string for run_test() Vladimir Sementsov-Ogievskiy
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-03 12:22 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, jsnow, vsementsov, den

Hi all!

Finally, I can not stand it any longer. So, I'm happy to present
multiprocessing support for iotests test runner.

testing on tmpfs:

Before:

time check -qcow2
...
real    12m28.095s
user    9m53.398s
sys     2m55.548s

After:

time check -qcow2 -j 12
...
real    2m12.136s
user    12m40.948s
sys     4m7.449s


Hmm, seems -j 6 should be enough. I have 6 cores, 2 threads per core.
Anyway, that's so fast!

Vladimir Sementsov-Ogievskiy (3):
  iotests/testrunner.py: add doc string for run_test()
  iotests/testrunner.py: move updating last_elapsed to run_tests
  iotests: check: multiprocessing support

 tests/qemu-iotests/check         |  4 +-
 tests/qemu-iotests/testrunner.py | 86 ++++++++++++++++++++++++++++----
 2 files changed, 80 insertions(+), 10 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/3] iotests/testrunner.py: add doc string for run_test()
  2021-12-03 12:22 [PATCH 0/3] iotests: multiprocessing!! Vladimir Sementsov-Ogievskiy
@ 2021-12-03 12:22 ` Vladimir Sementsov-Ogievskiy
  2021-12-06 17:52   ` John Snow
  2021-12-10 14:12   ` Kevin Wolf
  2021-12-03 12:22 ` [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-03 12:22 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, jsnow, vsementsov, den

We are going to modify these methods and will add more documentation in
further commit. As a preparation add basic documentation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/testrunner.py | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 0e29c2fddd..fa842252d3 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -220,6 +220,12 @@ def find_reference(self, test: str) -> str:
         return f'{test}.out'
 
     def do_run_test(self, test: str) -> TestResult:
+        """
+        Run one test
+
+        :param test: test file path
+        """
+
         f_test = Path(test)
         f_bad = Path(f_test.name + '.out.bad')
         f_notrun = Path(f_test.name + '.notrun')
@@ -287,6 +293,13 @@ def do_run_test(self, test: str) -> TestResult:
 
     def run_test(self, test: str,
                  test_field_width: Optional[int] = None) -> TestResult:
+        """
+        Run one test and print short status
+
+        :param test: test file path
+        :param test_field_width: width for first field of status format
+        """
+
         last_el = self.last_elapsed.get(test)
         start = datetime.datetime.now().strftime('%H:%M:%S')
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests
  2021-12-03 12:22 [PATCH 0/3] iotests: multiprocessing!! Vladimir Sementsov-Ogievskiy
  2021-12-03 12:22 ` [PATCH 1/3] iotests/testrunner.py: add doc string for run_test() Vladimir Sementsov-Ogievskiy
@ 2021-12-03 12:22 ` Vladimir Sementsov-Ogievskiy
  2021-12-06 17:59   ` John Snow
  2021-12-03 12:22 ` [PATCH 3/3] iotests: check: multiprocessing support Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-03 12:22 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, jsnow, vsementsov, den

We are going to use do_run_test() in multiprocessing environment, where
we'll not be able to change original runner object.

Happily, the only thing we change is that last_elapsed and it's simple
to do it in run_tests() instead. All other accesses to self in
do_runt_test() and in run_test() are read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/testrunner.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index fa842252d3..a9f2feb58c 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -287,7 +287,6 @@ def do_run_test(self, test: str) -> TestResult:
                               diff=diff, casenotrun=casenotrun)
         else:
             f_bad.unlink()
-            self.last_elapsed.update(test, elapsed)
             return TestResult(status='pass', elapsed=elapsed,
                               casenotrun=casenotrun)
 
@@ -353,6 +352,9 @@ def run_tests(self, tests: List[str]) -> bool:
                     print('\n'.join(res.diff))
             elif res.status == 'not run':
                 notrun.append(name)
+            elif res.status == 'pass':
+                assert res.elapsed is not None
+                self.last_elapsed.update(t, res.elapsed)
 
             sys.stdout.flush()
             if res.interrupted:
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/3] iotests: check: multiprocessing support
  2021-12-03 12:22 [PATCH 0/3] iotests: multiprocessing!! Vladimir Sementsov-Ogievskiy
  2021-12-03 12:22 ` [PATCH 1/3] iotests/testrunner.py: add doc string for run_test() Vladimir Sementsov-Ogievskiy
  2021-12-03 12:22 ` [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests Vladimir Sementsov-Ogievskiy
@ 2021-12-03 12:22 ` Vladimir Sementsov-Ogievskiy
  2021-12-06 18:35   ` John Snow
  2021-12-10 14:36   ` Kevin Wolf
  2021-12-06 18:37 ` [PATCH 0/3] iotests: multiprocessing!! John Snow
  2021-12-09 14:33 ` Hanna Reitz
  4 siblings, 2 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-03 12:22 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, hreitz, kwolf, jsnow, vsementsov, den

Add -j <JOBS> parameter, to run tests in several jobs simultaneously.
For realization - simply utilize multiprocessing.Pool class.

Notes:

1. Of course, tests can't run simultaneously in same TEST_DIR. So,
   use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
   instead of simply TEST_DIR and SOCK_DIR

2. multiprocessing.Pool.starmap function doesn't support passing
   context managers, so we can't simply pass "self". Happily, we need
   self only for read-only access, and it just works if it is defined
   in global space. So, add a temporary link TestRunner.shared_self
   during run_tests().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/check         |  4 +-
 tests/qemu-iotests/testrunner.py | 69 ++++++++++++++++++++++++++++----
 2 files changed, 64 insertions(+), 9 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 43a4b694cc..0c27721a41 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -34,6 +34,8 @@ def make_argparser() -> argparse.ArgumentParser:
                    help='show me, do not run tests')
     p.add_argument('-makecheck', action='store_true',
                    help='pretty print output for make check')
+    p.add_argument('-j', dest='jobs', type=int, default=1,
+                   help='run tests in multiple parallel jobs')
 
     p.add_argument('-d', dest='debug', action='store_true', help='debug')
     p.add_argument('-p', dest='print', action='store_true',
@@ -165,6 +167,6 @@ if __name__ == '__main__':
         with TestRunner(env, makecheck=args.makecheck,
                         color=args.color) as tr:
             paths = [os.path.join(env.source_iotests, t) for t in tests]
-            ok = tr.run_tests(paths)
+            ok = tr.run_tests(paths, args.jobs)
             if not ok:
                 sys.exit(1)
diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index a9f2feb58c..0feaa396d0 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -26,6 +26,7 @@
 import json
 import termios
 import sys
+from multiprocessing import Pool
 from contextlib import contextmanager
 from typing import List, Optional, Iterator, Any, Sequence, Dict, \
         ContextManager
@@ -126,6 +127,31 @@ def __init__(self, status: str, description: str = '',
 
 
 class TestRunner(ContextManager['TestRunner']):
+    shared_self = None
+
+    @staticmethod
+    def proc_run_test(test: str, test_field_width: int) -> TestResult:
+        # We are in a subprocess, we can't change the runner object!
+        runner = TestRunner.shared_self
+        assert runner is not None
+        return runner.run_test(test, test_field_width, mp=True)
+
+    def run_tests_pool(self, tests: List[str],
+                       test_field_width: int, jobs: int) -> List[TestResult]:
+
+        # passing self directly to Pool.starmap() just doesn't work, because
+        # it's a context manager.
+        assert TestRunner.shared_self is None
+        TestRunner.shared_self = self
+
+        with Pool(jobs) as p:
+            results = p.starmap(self.proc_run_test,
+                                zip(tests, [test_field_width] * len(tests)))
+
+        TestRunner.shared_self = None
+
+        return results
+
     def __init__(self, env: TestEnv, makecheck: bool = False,
                  color: str = 'auto') -> None:
         self.env = env
@@ -219,11 +245,16 @@ def find_reference(self, test: str) -> str:
 
         return f'{test}.out'
 
-    def do_run_test(self, test: str) -> TestResult:
+    def do_run_test(self, test: str, mp: bool) -> TestResult:
         """
         Run one test
 
         :param test: test file path
+        :param mp: if true, we are in a multiprocessing environment, use
+                   personal subdirectories for test run
+
+        Note: this method may be called from subprocess, so it does not
+        change ``self`` object in any way!
         """
 
         f_test = Path(test)
@@ -249,6 +280,12 @@ def do_run_test(self, test: str) -> TestResult:
 
         args = [str(f_test.resolve())]
         env = self.env.prepare_subprocess(args)
+        if mp:
+            # Split test directories, so that tests running in parallel don't
+            # break each other.
+            for d in ['TEST_DIR', 'SOCK_DIR']:
+                env[d] = os.path.join(env[d], f_test.name)
+                Path(env[d]).mkdir(parents=True, exist_ok=True)
 
         t0 = time.time()
         with f_bad.open('w', encoding="utf-8") as f:
@@ -291,23 +328,32 @@ def do_run_test(self, test: str) -> TestResult:
                               casenotrun=casenotrun)
 
     def run_test(self, test: str,
-                 test_field_width: Optional[int] = None) -> TestResult:
+                 test_field_width: Optional[int] = None,
+                 mp: bool = False) -> TestResult:
         """
         Run one test and print short status
 
         :param test: test file path
         :param test_field_width: width for first field of status format
+        :param mp: if true, we are in a multiprocessing environment, don't try
+                   to rewrite things in stdout
+
+        Note: this method may be called from subprocess, so it does not
+        change ``self`` object in any way!
         """
 
         last_el = self.last_elapsed.get(test)
         start = datetime.datetime.now().strftime('%H:%M:%S')
 
         if not self.makecheck:
-            self.test_print_one_line(test=test, starttime=start,
-                                     lasttime=last_el, end='\r',
+            self.test_print_one_line(test=test,
+                                     status = 'started' if mp else '...',
+                                     starttime=start,
+                                     lasttime=last_el,
+                                     end = '\n' if mp else '\r',
                                      test_field_width=test_field_width)
 
-        res = self.do_run_test(test)
+        res = self.do_run_test(test, mp)
 
         end = datetime.datetime.now().strftime('%H:%M:%S')
         self.test_print_one_line(test=test, status=res.status,
@@ -321,7 +367,7 @@ def run_test(self, test: str,
 
         return res
 
-    def run_tests(self, tests: List[str]) -> bool:
+    def run_tests(self, tests: List[str], jobs: int = 1) -> bool:
         n_run = 0
         failed = []
         notrun = []
@@ -332,9 +378,16 @@ def run_tests(self, tests: List[str]) -> bool:
 
         test_field_width = max(len(os.path.basename(t)) for t in tests) + 2
 
-        for t in tests:
+        if jobs > 1:
+            results = self.run_tests_pool(tests, test_field_width, jobs)
+
+        for i, t in enumerate(tests):
             name = os.path.basename(t)
-            res = self.run_test(t, test_field_width=test_field_width)
+
+            if jobs > 1:
+                res = results[i]
+            else:
+                res = self.run_test(t, test_field_width)
 
             assert res.status in ('pass', 'fail', 'not run')
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] iotests/testrunner.py: add doc string for run_test()
  2021-12-03 12:22 ` [PATCH 1/3] iotests/testrunner.py: add doc string for run_test() Vladimir Sementsov-Ogievskiy
@ 2021-12-06 17:52   ` John Snow
  2021-12-10 14:12   ` Kevin Wolf
  1 sibling, 0 replies; 21+ messages in thread
From: John Snow @ 2021-12-06 17:52 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, den, Hanna Reitz, qemu-devel, Qemu-block

[-- Attachment #1: Type: text/plain, Size: 1558 bytes --]

On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
vsementsov@virtuozzo.com> wrote:

> We are going to modify these methods and will add more documentation in
> further commit. As a preparation add basic documentation.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/testrunner.py | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> index 0e29c2fddd..fa842252d3 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -220,6 +220,12 @@ def find_reference(self, test: str) -> str:
>          return f'{test}.out'
>
>      def do_run_test(self, test: str) -> TestResult:
> +        """
> +        Run one test
> +
> +        :param test: test file path
> +        """
> +
>          f_test = Path(test)
>          f_bad = Path(f_test.name + '.out.bad')
>          f_notrun = Path(f_test.name + '.notrun')
> @@ -287,6 +293,13 @@ def do_run_test(self, test: str) -> TestResult:
>
>      def run_test(self, test: str,
>                   test_field_width: Optional[int] = None) -> TestResult:
> +        """
> +        Run one test and print short status
> +
> +        :param test: test file path
> +        :param test_field_width: width for first field of status format
> +        """
> +
>          last_el = self.last_elapsed.get(test)
>          start = datetime.datetime.now().strftime('%H:%M:%S')
>
> --
> 2.31.1
>
>
Reviewed-by: John Snow <jsnow@redhat.com>

[-- Attachment #2: Type: text/html, Size: 2467 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests
  2021-12-03 12:22 ` [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests Vladimir Sementsov-Ogievskiy
@ 2021-12-06 17:59   ` John Snow
  2021-12-10 14:25     ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: John Snow @ 2021-12-06 17:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, den, Hanna Reitz, qemu-devel, Qemu-block

[-- Attachment #1: Type: text/plain, Size: 1848 bytes --]

On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
vsementsov@virtuozzo.com> wrote:

> We are going to use do_run_test() in multiprocessing environment, where
> we'll not be able to change original runner object.
>
> Happily, the only thing we change is that last_elapsed and it's simple
> to do it in run_tests() instead. All other accesses to self in
> do_runt_test() and in run_test() are read-only.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/testrunner.py | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> index fa842252d3..a9f2feb58c 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -287,7 +287,6 @@ def do_run_test(self, test: str) -> TestResult:
>                                diff=diff, casenotrun=casenotrun)
>          else:
>              f_bad.unlink()
> -            self.last_elapsed.update(test, elapsed)
>              return TestResult(status='pass', elapsed=elapsed,
>                                casenotrun=casenotrun)
>
> @@ -353,6 +352,9 @@ def run_tests(self, tests: List[str]) -> bool:
>                      print('\n'.join(res.diff))
>              elif res.status == 'not run':
>                  notrun.append(name)
> +            elif res.status == 'pass':
> +                assert res.elapsed is not None
> +                self.last_elapsed.update(t, res.elapsed)
>
>              sys.stdout.flush()
>              if res.interrupted:
> --
> 2.31.1
>
>
(I continue to be annoyed by the "None" problem in mypy, but I suppose it
really can't be helped. Nothing for you to change with this patch or
series. I just wish we didn't need so many assertions ...)

Reviewed-by: John Snow <jsnow@redhat.com>

[-- Attachment #2: Type: text/html, Size: 2621 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/3] iotests: check: multiprocessing support
  2021-12-03 12:22 ` [PATCH 3/3] iotests: check: multiprocessing support Vladimir Sementsov-Ogievskiy
@ 2021-12-06 18:35   ` John Snow
  2021-12-06 20:20     ` Vladimir Sementsov-Ogievskiy
  2021-12-10 14:36   ` Kevin Wolf
  1 sibling, 1 reply; 21+ messages in thread
From: John Snow @ 2021-12-06 18:35 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, den, Hanna Reitz, qemu-devel, Qemu-block

[-- Attachment #1: Type: text/plain, Size: 8718 bytes --]

On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
vsementsov@virtuozzo.com> wrote:

> Add -j <JOBS> parameter, to run tests in several jobs simultaneously.
> For realization - simply utilize multiprocessing.Pool class.
>
> Notes:
>
> 1. Of course, tests can't run simultaneously in same TEST_DIR. So,
>    use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
>    instead of simply TEST_DIR and SOCK_DIR
>
>
SOCK_DIR was introduced because TEST_DIR was getting too long, and the
length of the filepath was causing problems on some platforms. I hope that
we aren't pushing our luck by making the directory longer here. Using the
test name is nice because a human operator can quickly correlate
directories to the tests that populated them, but if test names get kind of
long, I wonder if we'll cause problems again.

Just a stray thought.


> 2. multiprocessing.Pool.starmap function doesn't support passing
>    context managers, so we can't simply pass "self". Happily, we need
>    self only for read-only access, and it just works if it is defined
>    in global space. So, add a temporary link TestRunner.shared_self
>    during run_tests().
>

I'm a little confused on this point, see below


> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/check         |  4 +-
>  tests/qemu-iotests/testrunner.py | 69 ++++++++++++++++++++++++++++----
>  2 files changed, 64 insertions(+), 9 deletions(-)
>
> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> index 43a4b694cc..0c27721a41 100755
> --- a/tests/qemu-iotests/check
> +++ b/tests/qemu-iotests/check
> @@ -34,6 +34,8 @@ def make_argparser() -> argparse.ArgumentParser:
>                     help='show me, do not run tests')
>      p.add_argument('-makecheck', action='store_true',
>                     help='pretty print output for make check')
> +    p.add_argument('-j', dest='jobs', type=int, default=1,
> +                   help='run tests in multiple parallel jobs')
>
>      p.add_argument('-d', dest='debug', action='store_true', help='debug')
>      p.add_argument('-p', dest='print', action='store_true',
> @@ -165,6 +167,6 @@ if __name__ == '__main__':
>          with TestRunner(env, makecheck=args.makecheck,
>                          color=args.color) as tr:
>              paths = [os.path.join(env.source_iotests, t) for t in tests]
> -            ok = tr.run_tests(paths)
> +            ok = tr.run_tests(paths, args.jobs)
>              if not ok:
>                  sys.exit(1)
>

(OK)


> diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> index a9f2feb58c..0feaa396d0 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -26,6 +26,7 @@
>  import json
>  import termios
>  import sys
> +from multiprocessing import Pool
>  from contextlib import contextmanager
>  from typing import List, Optional, Iterator, Any, Sequence, Dict, \
>          ContextManager
> @@ -126,6 +127,31 @@ def __init__(self, status: str, description: str = '',
>
>
>  class TestRunner(ContextManager['TestRunner']):
> +    shared_self = None
>
+
> +    @staticmethod
> +    def proc_run_test(test: str, test_field_width: int) -> TestResult:
> +        # We are in a subprocess, we can't change the runner object!
>

*can't*, or shouldn't? I thought changing anything would just result in the
child process diverging, but nothing harmful overall. Am I mistaken?


> +        runner = TestRunner.shared_self
> +        assert runner is not None
> +        return runner.run_test(test, test_field_width, mp=True)
> +
> +    def run_tests_pool(self, tests: List[str],
> +                       test_field_width: int, jobs: int) ->
> List[TestResult]:
> +
> +        # passing self directly to Pool.starmap() just doesn't work,
> because
> +        # it's a context manager.
>

Why, what happens? (Or what doesn't happen?)

(I believe you that it doesn't work, but it's not immediately obvious to me
why.)


> +        assert TestRunner.shared_self is None
> +        TestRunner.shared_self = self
> +
> +        with Pool(jobs) as p:
> +            results = p.starmap(self.proc_run_test,
> +                                zip(tests, [test_field_width] *
> len(tests)))
> +
> +        TestRunner.shared_self = None
> +
> +        return results
> +
>      def __init__(self, env: TestEnv, makecheck: bool = False,
>                   color: str = 'auto') -> None:
>          self.env = env
> @@ -219,11 +245,16 @@ def find_reference(self, test: str) -> str:
>
>          return f'{test}.out'
>
> -    def do_run_test(self, test: str) -> TestResult:
> +    def do_run_test(self, test: str, mp: bool) -> TestResult:
>          """
>          Run one test
>
>          :param test: test file path
> +        :param mp: if true, we are in a multiprocessing environment, use
> +                   personal subdirectories for test run
> +
> +        Note: this method may be called from subprocess, so it does not
> +        change ``self`` object in any way!
>          """
>

Maybe worth mentioning that it *does* change environment variables, but
because this is "mp", it won't affect the parent execution environment.


>
>          f_test = Path(test)
> @@ -249,6 +280,12 @@ def do_run_test(self, test: str) -> TestResult:
>
>          args = [str(f_test.resolve())]
>          env = self.env.prepare_subprocess(args)
> +        if mp:
> +            # Split test directories, so that tests running in parallel
> don't
> +            # break each other.
> +            for d in ['TEST_DIR', 'SOCK_DIR']:
> +                env[d] = os.path.join(env[d], f_test.name)
> +                Path(env[d]).mkdir(parents=True, exist_ok=True)
>
>          t0 = time.time()
>          with f_bad.open('w', encoding="utf-8") as f:
> @@ -291,23 +328,32 @@ def do_run_test(self, test: str) -> TestResult:
>                                casenotrun=casenotrun)
>
>      def run_test(self, test: str,
> -                 test_field_width: Optional[int] = None) -> TestResult:
> +                 test_field_width: Optional[int] = None,
> +                 mp: bool = False) -> TestResult:
>          """
>          Run one test and print short status
>
>          :param test: test file path
>          :param test_field_width: width for first field of status format
> +        :param mp: if true, we are in a multiprocessing environment,
> don't try
> +                   to rewrite things in stdout
> +
> +        Note: this method may be called from subprocess, so it does not
> +        change ``self`` object in any way!
>          """
>
>          last_el = self.last_elapsed.get(test)
>          start = datetime.datetime.now().strftime('%H:%M:%S')
>
>          if not self.makecheck:
> -            self.test_print_one_line(test=test, starttime=start,
> -                                     lasttime=last_el, end='\r',
> +            self.test_print_one_line(test=test,
> +                                     status = 'started' if mp else '...',
> +                                     starttime=start,
> +                                     lasttime=last_el,
> +                                     end = '\n' if mp else '\r',
>                                       test_field_width=test_field_width)
>
> -        res = self.do_run_test(test)
> +        res = self.do_run_test(test, mp)
>
>          end = datetime.datetime.now().strftime('%H:%M:%S')
>          self.test_print_one_line(test=test, status=res.status,
>
@@ -321,7 +367,7 @@ def run_test(self, test: str,
>
>          return res
>
> -    def run_tests(self, tests: List[str]) -> bool:
> +    def run_tests(self, tests: List[str], jobs: int = 1) -> bool:
>          n_run = 0
>          failed = []
>          notrun = []
> @@ -332,9 +378,16 @@ def run_tests(self, tests: List[str]) -> bool:
>
>          test_field_width = max(len(os.path.basename(t)) for t in tests) +
> 2
>
> -        for t in tests:
> +        if jobs > 1:
> +            results = self.run_tests_pool(tests, test_field_width, jobs)
> +
> +        for i, t in enumerate(tests):
>              name = os.path.basename(t)
> -            res = self.run_test(t, test_field_width=test_field_width)
> +
> +            if jobs > 1:
> +                res = results[i]
> +            else:
> +                res = self.run_test(t, test_field_width)
>
>              assert res.status in ('pass', 'fail', 'not run')
>
>
Looks good and surprisingly minimal, I just have a curiosity about the
nature of the workaround here.

Either way, I believe this will probably work as written, so I can give it
an ACK at a minimum while I wait for answers.

Acked-by: John Snow <jsnow@redhat.com>

[-- Attachment #2: Type: text/html, Size: 12558 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/3] iotests: multiprocessing!!
  2021-12-03 12:22 [PATCH 0/3] iotests: multiprocessing!! Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2021-12-03 12:22 ` [PATCH 3/3] iotests: check: multiprocessing support Vladimir Sementsov-Ogievskiy
@ 2021-12-06 18:37 ` John Snow
  2021-12-06 20:26   ` Vladimir Sementsov-Ogievskiy
  2021-12-09 14:33 ` Hanna Reitz
  4 siblings, 1 reply; 21+ messages in thread
From: John Snow @ 2021-12-06 18:37 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, den, Hanna Reitz, qemu-devel, Qemu-block

[-- Attachment #1: Type: text/plain, Size: 1288 bytes --]

On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
vsementsov@virtuozzo.com> wrote:

> Hi all!
>
> Finally, I can not stand it any longer. So, I'm happy to present
> multiprocessing support for iotests test runner.
>
> testing on tmpfs:
>
> Before:
>
> time check -qcow2
> ...
> real    12m28.095s
> user    9m53.398s
> sys     2m55.548s
>
> After:
>
> time check -qcow2 -j 12
> ...
> real    2m12.136s
> user    12m40.948s
> sys     4m7.449s
>
>
VERY excellent. And this will probably flush a lot more bugs loose, too.
(Which I consider a good thing!)
We could look into utilizing it for 'make check', but we'll have to be
prepared for a greater risk of race conditions on the CI if we do. But...
it's seriously hard to argue with this kind of optimization, very well done!


>
> Hmm, seems -j 6 should be enough. I have 6 cores, 2 threads per core.
> Anyway, that's so fast!
>
> Vladimir Sementsov-Ogievskiy (3):
>   iotests/testrunner.py: add doc string for run_test()
>   iotests/testrunner.py: move updating last_elapsed to run_tests
>   iotests: check: multiprocessing support
>
>  tests/qemu-iotests/check         |  4 +-
>  tests/qemu-iotests/testrunner.py | 86 ++++++++++++++++++++++++++++----
>  2 files changed, 80 insertions(+), 10 deletions(-)
>
> --
> 2.31.1
>
>

[-- Attachment #2: Type: text/html, Size: 1958 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/3] iotests: check: multiprocessing support
  2021-12-06 18:35   ` John Snow
@ 2021-12-06 20:20     ` Vladimir Sementsov-Ogievskiy
  2021-12-06 21:00       ` John Snow
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-06 20:20 UTC (permalink / raw)
  To: John Snow; +Cc: Qemu-block, qemu-devel, Hanna Reitz, Kevin Wolf, den

06.12.2021 21:35, John Snow wrote:
> 
> 
> On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com <mailto:vsementsov@virtuozzo.com>> wrote:
> 
>     Add -j <JOBS> parameter, to run tests in several jobs simultaneously.
>     For realization - simply utilize multiprocessing.Pool class.
> 
>     Notes:
> 
>     1. Of course, tests can't run simultaneously in same TEST_DIR. So,
>         use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
>         instead of simply TEST_DIR and SOCK_DIR
> 
> 
> SOCK_DIR was introduced because TEST_DIR was getting too long, and the length of the filepath was causing problems on some platforms. I hope that we aren't pushing our luck by making the directory longer here. Using the test name is nice because a human operator can quickly correlate directories to the tests that populated them, but if test names get kind of long, I wonder if we'll cause problems again.
> 
> Just a stray thought.
> 
>     2. multiprocessing.Pool.starmap function doesn't support passing
>         context managers, so we can't simply pass "self". Happily, we need
>         self only for read-only access, and it just works if it is defined
>         in global space. So, add a temporary link TestRunner.shared_self
>         during run_tests().
> 
> 
> I'm a little confused on this point, see below
> 
>     Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com <mailto:vsementsov@virtuozzo.com>>
>     ---
>       tests/qemu-iotests/check         |  4 +-
>       tests/qemu-iotests/testrunner.py | 69 ++++++++++++++++++++++++++++----
>       2 files changed, 64 insertions(+), 9 deletions(-)
> 
>     diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
>     index 43a4b694cc..0c27721a41 100755
>     --- a/tests/qemu-iotests/check
>     +++ b/tests/qemu-iotests/check
>     @@ -34,6 +34,8 @@ def make_argparser() -> argparse.ArgumentParser:
>                          help='show me, do not run tests')
>           p.add_argument('-makecheck', action='store_true',
>                          help='pretty print output for make check')
>     +    p.add_argument('-j', dest='jobs', type=int, default=1,
>     +                   help='run tests in multiple parallel jobs')
> 
>           p.add_argument('-d', dest='debug', action='store_true', help='debug')
>           p.add_argument('-p', dest='print', action='store_true',
>     @@ -165,6 +167,6 @@ if __name__ == '__main__':
>               with TestRunner(env, makecheck=args.makecheck,
>                               color=args.color) as tr:
>                   paths = [os.path.join(env.source_iotests, t) for t in tests]
>     -            ok = tr.run_tests(paths)
>     +            ok = tr.run_tests(paths, args.jobs <http://args.jobs>)
>                   if not ok:
>                       sys.exit(1)
> 
> 
> (OK)
> 
>     diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
>     index a9f2feb58c..0feaa396d0 100644
>     --- a/tests/qemu-iotests/testrunner.py
>     +++ b/tests/qemu-iotests/testrunner.py
>     @@ -26,6 +26,7 @@
>       import json
>       import termios
>       import sys
>     +from multiprocessing import Pool
>       from contextlib import contextmanager
>       from typing import List, Optional, Iterator, Any, Sequence, Dict, \
>               ContextManager
>     @@ -126,6 +127,31 @@ def __init__(self, status: str, description: str = '',
> 
> 
>       class TestRunner(ContextManager['TestRunner']):
>     +    shared_self = None
> 
>     +
>     +    @staticmethod
>     +    def proc_run_test(test: str, test_field_width: int) -> TestResult:
>     +        # We are in a subprocess, we can't change the runner object!
> 
> 
> *can't*, or shouldn't? I thought changing anything would just result in the child process diverging, but nothing harmful overall. Am I mistaken?

Yes you are right. "Shouldn't" is OK

> 
>     +        runner = TestRunner.shared_self
>     +        assert runner is not None
>     +        return runner.run_test(test, test_field_width, mp=True)
>     +
>     +    def run_tests_pool(self, tests: List[str],
>     +                       test_field_width: int, jobs: int) -> List[TestResult]:
>     +
>     +        # passing self directly to Pool.starmap() just doesn't work, because
>     +        # it's a context manager.
> 
> 
> Why, what happens? (Or what doesn't happen?)
> 
> (I believe you that it doesn't work, but it's not immediately obvious to me why.)

Simple check:

diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 0feaa396d0..49c1780697 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -130,7 +130,7 @@ class TestRunner(ContextManager['TestRunner']):
      shared_self = None
  
      @staticmethod
-    def proc_run_test(test: str, test_field_width: int) -> TestResult:
+    def proc_run_test(x, test: str, test_field_width: int) -> TestResult:
          # We are in a subprocess, we can't change the runner object!
          runner = TestRunner.shared_self
          assert runner is not None
@@ -146,7 +146,7 @@ def run_tests_pool(self, tests: List[str],
  
          with Pool(jobs) as p:
              results = p.starmap(self.proc_run_test,
-                                zip(tests, [test_field_width] * len(tests)))
+                                [(self, t, test_field_width) for t in tests])
  
          TestRunner.shared_self = None
  



Something like this happens:

   File "/work/src/qemu/up/up-iotests-multiprocessing/build/tests/qemu-iotests/./check", line 170, in <module>
     ok = tr.run_tests(paths, args.jobs)
   File "/work/src/qemu/up/up-iotests-multiprocessing/tests/qemu-iotests/testrunner.py", line 383, in run_tests
     results = self.run_tests_pool(tests, test_field_width, jobs)
   File "/work/src/qemu/up/up-iotests-multiprocessing/tests/qemu-iotests/testrunner.py", line 149, in run_tests_pool
     results = p.starmap(self.proc_run_test,
   File "/usr/lib64/python3.9/multiprocessing/pool.py", line 372, in starmap
     return self._map_async(func, iterable, starmapstar, chunksize).get()
   File "/usr/lib64/python3.9/multiprocessing/pool.py", line 771, in get
     raise self._value
   File "/usr/lib64/python3.9/multiprocessing/pool.py", line 537, in _handle_tasks
     put(task)
   File "/usr/lib64/python3.9/multiprocessing/connection.py", line 211, in send
     self._send_bytes(_ForkingPickler.dumps(obj))
   File "/usr/lib64/python3.9/multiprocessing/reduction.py", line 51, in dumps
     cls(buf, protocol).dump(obj)
TypeError: cannot pickle 'generator' object


Hmm, I remember that was cannot pickle context manager.. Probably I remember wrong :) Honestly I didn't dig into it except for detecting that not passing "self" fixes the problem.

> 
>     +        assert TestRunner.shared_self is None
>     +        TestRunner.shared_self = self
>     +
>     +        with Pool(jobs) as p:
>     +            results = p.starmap(self.proc_run_test,
>     +                                zip(tests, [test_field_width] * len(tests)))
>     +
>     +        TestRunner.shared_self = None
>     +
>     +        return results
>     +
>           def __init__(self, env: TestEnv, makecheck: bool = False,
>                        color: str = 'auto') -> None:
>               self.env = env
>     @@ -219,11 +245,16 @@ def find_reference(self, test: str) -> str:
> 
>               return f'{test}.out'
> 
>     -    def do_run_test(self, test: str) -> TestResult:
>     +    def do_run_test(self, test: str, mp: bool) -> TestResult:
>               """
>               Run one test
> 
>               :param test: test file path
>     +        :param mp: if true, we are in a multiprocessing environment, use
>     +                   personal subdirectories for test run
>     +
>     +        Note: this method may be called from subprocess, so it does not
>     +        change ``self`` object in any way!
>               """
> 
> 
> Maybe worth mentioning that it *does* change environment variables, but because this is "mp", it won't affect the parent execution environment.


Hmm. actually, it does not change it. And yes the reason is that we'll not change the original object anyway, so any logic that change the runner object in hope that it will make some effect would be wrong.


> 
> 
>               f_test = Path(test)
>     @@ -249,6 +280,12 @@ def do_run_test(self, test: str) -> TestResult:
> 
>               args = [str(f_test.resolve())]
>               env = self.env.prepare_subprocess(args)
>     +        if mp:
>     +            # Split test directories, so that tests running in parallel don't
>     +            # break each other.
>     +            for d in ['TEST_DIR', 'SOCK_DIR']:
>     +                env[d] = os.path.join(env[d], f_test.name <http://f_test.name>)
>     +                Path(env[d]).mkdir(parents=True, exist_ok=True)
> 
>               t0 = time.time()
>               with f_bad.open('w', encoding="utf-8") as f:
>     @@ -291,23 +328,32 @@ def do_run_test(self, test: str) -> TestResult:
>                                     casenotrun=casenotrun)
> 
>           def run_test(self, test: str,
>     -                 test_field_width: Optional[int] = None) -> TestResult:
>     +                 test_field_width: Optional[int] = None,
>     +                 mp: bool = False) -> TestResult:
>               """
>               Run one test and print short status
> 
>               :param test: test file path
>               :param test_field_width: width for first field of status format
>     +        :param mp: if true, we are in a multiprocessing environment, don't try
>     +                   to rewrite things in stdout
>     +
>     +        Note: this method may be called from subprocess, so it does not
>     +        change ``self`` object in any way!
>               """
> 
>               last_el = self.last_elapsed.get(test)
>               start = datetime.datetime.now().strftime('%H:%M:%S')
> 
>               if not self.makecheck:
>     -            self.test_print_one_line(test=test, starttime=start,
>     -                                     lasttime=last_el, end='\r',
>     +            self.test_print_one_line(test=test,
>     +                                     status = 'started' if mp else '...',
>     +                                     starttime=start,
>     +                                     lasttime=last_el,
>     +                                     end = '\n' if mp else '\r',
>                                            test_field_width=test_field_width)
> 
>     -        res = self.do_run_test(test)
>     +        res = self.do_run_test(test, mp)
> 
>               end = datetime.datetime.now().strftime('%H:%M:%S')
>               self.test_print_one_line(test=test, status=res.status,
> 
>     @@ -321,7 +367,7 @@ def run_test(self, test: str,
> 
>               return res
> 
>     -    def run_tests(self, tests: List[str]) -> bool:
>     +    def run_tests(self, tests: List[str], jobs: int = 1) -> bool:
>               n_run = 0
>               failed = []
>               notrun = []
>     @@ -332,9 +378,16 @@ def run_tests(self, tests: List[str]) -> bool:
> 
>               test_field_width = max(len(os.path.basename(t)) for t in tests) + 2
> 
>     -        for t in tests:
>     +        if jobs > 1:
>     +            results = self.run_tests_pool(tests, test_field_width, jobs)
>     +
>     +        for i, t in enumerate(tests):
>                   name = os.path.basename(t)
>     -            res = self.run_test(t, test_field_width=test_field_width)
>     +
>     +            if jobs > 1:
>     +                res = results[i]
>     +            else:
>     +                res = self.run_test(t, test_field_width)
> 
>                   assert res.status in ('pass', 'fail', 'not run')
> 
> 
> Looks good and surprisingly minimal, I just have a curiosity about the nature of the workaround here.
> 
> Either way, I believe this will probably work as written, so I can give it an ACK at a minimum while I wait for answers.
> 
> Acked-by: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> 

Thanks!

Yes, the workaround is a ugly.. But it's small, so I think we could live with.

I don't think that refactoring TestRunner to move all needed things to some simple structure supported by Pool is good idea: actually, we don't want to copy these data for each subprocess, we are OK with readonly access to shared object. And we do call methods on self, and on self.env, so refactoring would not be simple.

But about shared object, I didn't find any way to pass a link to shared object to Pool.map()..   Something like Pool.map( , ... , shared_state=self) would be good. But were is such an option? Note that this is my first experience with multiprocessing.

The only thing I find is passing through global variable. I started with real global variably, but then thought that hiding it inside the class would be a bit better.

-- 
Best regards,
Vladimir


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/3] iotests: multiprocessing!!
  2021-12-06 18:37 ` [PATCH 0/3] iotests: multiprocessing!! John Snow
@ 2021-12-06 20:26   ` Vladimir Sementsov-Ogievskiy
  2021-12-07 18:20     ` John Snow
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-06 20:26 UTC (permalink / raw)
  To: John Snow; +Cc: Qemu-block, qemu-devel, Hanna Reitz, Kevin Wolf, den

06.12.2021 21:37, John Snow wrote:
> 
> 
> On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com <mailto:vsementsov@virtuozzo.com>> wrote:
> 
>     Hi all!
> 
>     Finally, I can not stand it any longer. So, I'm happy to present
>     multiprocessing support for iotests test runner.
> 
>     testing on tmpfs:
> 
>     Before:
> 
>     time check -qcow2
>     ...
>     real    12m28.095s
>     user    9m53.398s
>     sys     2m55.548s
> 
>     After:
> 
>     time check -qcow2 -j 12
>     ...
>     real    2m12.136s
>     user    12m40.948s
>     sys     4m7.449s
> 
> 
> VERY excellent. And this will probably flush a lot more bugs loose, too. (Which I consider a good thing!)

Thanks!)

> We could look into utilizing it for 'make check', but we'll have to be prepared for a greater risk of race conditions on the CI if we do. But... it's seriously hard to argue with this kind of optimization, very well done!

I thought about this too. I think, we can at least passthrought -j flag of "make -j9 check" to ./check

I think, CIs mostly call make check without -j flag. But I always call make -j9 check. And it always upset me that all tests run in parallel except for iotests. So if it possible to detect that we are called through "make -j9 check" and use "-j 9" for iotests/check in this case, it would be good.

> 
> 
>     Hmm, seems -j 6 should be enough. I have 6 cores, 2 threads per core.
>     Anyway, that's so fast!
> 
>     Vladimir Sementsov-Ogievskiy (3):
>        iotests/testrunner.py: add doc string for run_test()
>        iotests/testrunner.py: move updating last_elapsed to run_tests
>        iotests: check: multiprocessing support
> 
>       tests/qemu-iotests/check         |  4 +-
>       tests/qemu-iotests/testrunner.py | 86 ++++++++++++++++++++++++++++----
>       2 files changed, 80 insertions(+), 10 deletions(-)
> 
>     -- 
>     2.31.1
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/3] iotests: check: multiprocessing support
  2021-12-06 20:20     ` Vladimir Sementsov-Ogievskiy
@ 2021-12-06 21:00       ` John Snow
  0 siblings, 0 replies; 21+ messages in thread
From: John Snow @ 2021-12-06 21:00 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, den, Hanna Reitz, qemu-devel, Qemu-block

[-- Attachment #1: Type: text/plain, Size: 14543 bytes --]

On Mon, Dec 6, 2021 at 3:20 PM Vladimir Sementsov-Ogievskiy <
vsementsov@virtuozzo.com> wrote:

> 06.12.2021 21:35, John Snow wrote:
> >
> >
> > On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
> vsementsov@virtuozzo.com <mailto:vsementsov@virtuozzo.com>> wrote:
> >
> >     Add -j <JOBS> parameter, to run tests in several jobs simultaneously.
> >     For realization - simply utilize multiprocessing.Pool class.
> >
> >     Notes:
> >
> >     1. Of course, tests can't run simultaneously in same TEST_DIR. So,
> >         use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
> >         instead of simply TEST_DIR and SOCK_DIR
> >
> >
> > SOCK_DIR was introduced because TEST_DIR was getting too long, and the
> length of the filepath was causing problems on some platforms. I hope that
> we aren't pushing our luck by making the directory longer here. Using the
> test name is nice because a human operator can quickly correlate
> directories to the tests that populated them, but if test names get kind of
> long, I wonder if we'll cause problems again.
> >
> > Just a stray thought.
> >
> >     2. multiprocessing.Pool.starmap function doesn't support passing
> >         context managers, so we can't simply pass "self". Happily, we
> need
> >         self only for read-only access, and it just works if it is
> defined
> >         in global space. So, add a temporary link TestRunner.shared_self
> >         during run_tests().
> >
> >
> > I'm a little confused on this point, see below
> >
> >     Signed-off-by: Vladimir Sementsov-Ogievskiy <
> vsementsov@virtuozzo.com <mailto:vsementsov@virtuozzo.com>>
> >     ---
> >       tests/qemu-iotests/check         |  4 +-
> >       tests/qemu-iotests/testrunner.py | 69
> ++++++++++++++++++++++++++++----
> >       2 files changed, 64 insertions(+), 9 deletions(-)
> >
> >     diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> >     index 43a4b694cc..0c27721a41 100755
> >     --- a/tests/qemu-iotests/check
> >     +++ b/tests/qemu-iotests/check
> >     @@ -34,6 +34,8 @@ def make_argparser() -> argparse.ArgumentParser:
> >                          help='show me, do not run tests')
> >           p.add_argument('-makecheck', action='store_true',
> >                          help='pretty print output for make check')
> >     +    p.add_argument('-j', dest='jobs', type=int, default=1,
> >     +                   help='run tests in multiple parallel jobs')
> >
> >           p.add_argument('-d', dest='debug', action='store_true',
> help='debug')
> >           p.add_argument('-p', dest='print', action='store_true',
> >     @@ -165,6 +167,6 @@ if __name__ == '__main__':
> >               with TestRunner(env, makecheck=args.makecheck,
> >                               color=args.color) as tr:
> >                   paths = [os.path.join(env.source_iotests, t) for t in
> tests]
> >     -            ok = tr.run_tests(paths)
> >     +            ok = tr.run_tests(paths, args.jobs <http://args.jobs>)
> >                   if not ok:
> >                       sys.exit(1)
> >
> >
> > (OK)
> >
> >     diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> >     index a9f2feb58c..0feaa396d0 100644
> >     --- a/tests/qemu-iotests/testrunner.py
> >     +++ b/tests/qemu-iotests/testrunner.py
> >     @@ -26,6 +26,7 @@
> >       import json
> >       import termios
> >       import sys
> >     +from multiprocessing import Pool
> >       from contextlib import contextmanager
> >       from typing import List, Optional, Iterator, Any, Sequence, Dict, \
> >               ContextManager
> >     @@ -126,6 +127,31 @@ def __init__(self, status: str, description:
> str = '',
> >
> >
> >       class TestRunner(ContextManager['TestRunner']):
> >     +    shared_self = None
> >
> >     +
> >     +    @staticmethod
> >     +    def proc_run_test(test: str, test_field_width: int) ->
> TestResult:
> >     +        # We are in a subprocess, we can't change the runner object!
> >
> >
> > *can't*, or shouldn't? I thought changing anything would just result in
> the child process diverging, but nothing harmful overall. Am I mistaken?
>
> Yes you are right. "Shouldn't" is OK
>
> >
> >     +        runner = TestRunner.shared_self
> >     +        assert runner is not None
> >     +        return runner.run_test(test, test_field_width, mp=True)
> >     +
> >     +    def run_tests_pool(self, tests: List[str],
> >     +                       test_field_width: int, jobs: int) ->
> List[TestResult]:
> >     +
> >     +        # passing self directly to Pool.starmap() just doesn't
> work, because
> >     +        # it's a context manager.
> >
> >
> > Why, what happens? (Or what doesn't happen?)
> >
> > (I believe you that it doesn't work, but it's not immediately obvious to
> me why.)
>
> Simple check:
>
> diff --git a/tests/qemu-iotests/testrunner.py
> b/tests/qemu-iotests/testrunner.py
> index 0feaa396d0..49c1780697 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -130,7 +130,7 @@ class TestRunner(ContextManager['TestRunner']):
>       shared_self = None
>
>       @staticmethod
> -    def proc_run_test(test: str, test_field_width: int) -> TestResult:
> +    def proc_run_test(x, test: str, test_field_width: int) -> TestResult:
>           # We are in a subprocess, we can't change the runner object!
>           runner = TestRunner.shared_self
>           assert runner is not None
> @@ -146,7 +146,7 @@ def run_tests_pool(self, tests: List[str],
>
>           with Pool(jobs) as p:
>               results = p.starmap(self.proc_run_test,
> -                                zip(tests, [test_field_width] *
> len(tests)))
> +                                [(self, t, test_field_width) for t in
> tests])
>
>           TestRunner.shared_self = None
>
>
>
>
> Something like this happens:
>
>    File
> "/work/src/qemu/up/up-iotests-multiprocessing/build/tests/qemu-iotests/./check",
> line 170, in <module>
>      ok = tr.run_tests(paths, args.jobs)
>    File
> "/work/src/qemu/up/up-iotests-multiprocessing/tests/qemu-iotests/testrunner.py",
> line 383, in run_tests
>      results = self.run_tests_pool(tests, test_field_width, jobs)
>    File
> "/work/src/qemu/up/up-iotests-multiprocessing/tests/qemu-iotests/testrunner.py",
> line 149, in run_tests_pool
>      results = p.starmap(self.proc_run_test,
>    File "/usr/lib64/python3.9/multiprocessing/pool.py", line 372, in
> starmap
>      return self._map_async(func, iterable, starmapstar, chunksize).get()
>    File "/usr/lib64/python3.9/multiprocessing/pool.py", line 771, in get
>      raise self._value
>    File "/usr/lib64/python3.9/multiprocessing/pool.py", line 537, in
> _handle_tasks
>      put(task)
>    File "/usr/lib64/python3.9/multiprocessing/connection.py", line 211, in
> send
>      self._send_bytes(_ForkingPickler.dumps(obj))
>    File "/usr/lib64/python3.9/multiprocessing/reduction.py", line 51, in
> dumps
>      cls(buf, protocol).dump(obj)
> TypeError: cannot pickle 'generator' object
>
>
> Hmm, I remember that was cannot pickle context manager.. Probably I
> remember wrong :) Honestly I didn't dig into it except for detecting that
> not passing "self" fixes the problem.
>
>
Oh, I see. Even using a *bound method*, it still wants to pass 'self' as an
argument, but it's unable to do so ... uh, interesting! but having it as
global state somehow works. That's ... fascinating. Well, without spending
much time on it myself, I think your workaround is probably the best
possible thing without really tearing things apart and refactoring.
Asserting that shared_self is None will prevent run_tests_pool from being
called twice concurrently, so the limitation of the workaround is
well-contained.

Good enough.


> >
> >     +        assert TestRunner.shared_self is None
> >     +        TestRunner.shared_self = self
> >     +
> >     +        with Pool(jobs) as p:
> >     +            results = p.starmap(self.proc_run_test,
> >     +                                zip(tests, [test_field_width] *
> len(tests)))
> >     +
> >     +        TestRunner.shared_self = None
> >     +
> >     +        return results
> >     +
> >           def __init__(self, env: TestEnv, makecheck: bool = False,
> >                        color: str = 'auto') -> None:
> >               self.env = env
> >     @@ -219,11 +245,16 @@ def find_reference(self, test: str) -> str:
> >
> >               return f'{test}.out'
> >
> >     -    def do_run_test(self, test: str) -> TestResult:
> >     +    def do_run_test(self, test: str, mp: bool) -> TestResult:
> >               """
> >               Run one test
> >
> >               :param test: test file path
> >     +        :param mp: if true, we are in a multiprocessing
> environment, use
> >     +                   personal subdirectories for test run
> >     +
> >     +        Note: this method may be called from subprocess, so it does
> not
> >     +        change ``self`` object in any way!
> >               """
> >
> >
> > Maybe worth mentioning that it *does* change environment variables, but
> because this is "mp", it won't affect the parent execution environment.
>
>
> Hmm. actually, it does not change it. And yes the reason is that we'll not
> change the original object anyway, so any logic that change the runner
> object in hope that it will make some effect would be wrong.
>
>
> >
> >
> >               f_test = Path(test)
> >     @@ -249,6 +280,12 @@ def do_run_test(self, test: str) -> TestResult:
> >
> >               args = [str(f_test.resolve())]
> >               env = self.env.prepare_subprocess(args)
> >     +        if mp:
> >     +            # Split test directories, so that tests running in
> parallel don't
> >     +            # break each other.
> >     +            for d in ['TEST_DIR', 'SOCK_DIR']:
> >     +                env[d] = os.path.join(env[d], f_test.name <
> http://f_test.name>)
> >     +                Path(env[d]).mkdir(parents=True, exist_ok=True)
> >
> >               t0 = time.time()
> >               with f_bad.open('w', encoding="utf-8") as f:
> >     @@ -291,23 +328,32 @@ def do_run_test(self, test: str) -> TestResult:
> >                                     casenotrun=casenotrun)
> >
> >           def run_test(self, test: str,
> >     -                 test_field_width: Optional[int] = None) ->
> TestResult:
> >     +                 test_field_width: Optional[int] = None,
> >     +                 mp: bool = False) -> TestResult:
> >               """
> >               Run one test and print short status
> >
> >               :param test: test file path
> >               :param test_field_width: width for first field of status
> format
> >     +        :param mp: if true, we are in a multiprocessing
> environment, don't try
> >     +                   to rewrite things in stdout
> >     +
> >     +        Note: this method may be called from subprocess, so it does
> not
> >     +        change ``self`` object in any way!
> >               """
> >
> >               last_el = self.last_elapsed.get(test)
> >               start = datetime.datetime.now().strftime('%H:%M:%S')
> >
> >               if not self.makecheck:
> >     -            self.test_print_one_line(test=test, starttime=start,
> >     -                                     lasttime=last_el, end='\r',
> >     +            self.test_print_one_line(test=test,
> >     +                                     status = 'started' if mp else
> '...',
> >     +                                     starttime=start,
> >     +                                     lasttime=last_el,
> >     +                                     end = '\n' if mp else '\r',
> >
> test_field_width=test_field_width)
> >
> >     -        res = self.do_run_test(test)
> >     +        res = self.do_run_test(test, mp)
> >
> >               end = datetime.datetime.now().strftime('%H:%M:%S')
> >               self.test_print_one_line(test=test, status=res.status,
> >
> >     @@ -321,7 +367,7 @@ def run_test(self, test: str,
> >
> >               return res
> >
> >     -    def run_tests(self, tests: List[str]) -> bool:
> >     +    def run_tests(self, tests: List[str], jobs: int = 1) -> bool:
> >               n_run = 0
> >               failed = []
> >               notrun = []
> >     @@ -332,9 +378,16 @@ def run_tests(self, tests: List[str]) -> bool:
> >
> >               test_field_width = max(len(os.path.basename(t)) for t in
> tests) + 2
> >
> >     -        for t in tests:
> >     +        if jobs > 1:
> >     +            results = self.run_tests_pool(tests, test_field_width,
> jobs)
> >     +
> >     +        for i, t in enumerate(tests):
> >                   name = os.path.basename(t)
> >     -            res = self.run_test(t,
> test_field_width=test_field_width)
> >     +
> >     +            if jobs > 1:
> >     +                res = results[i]
> >     +            else:
> >     +                res = self.run_test(t, test_field_width)
> >
> >                   assert res.status in ('pass', 'fail', 'not run')
> >
> >
> > Looks good and surprisingly minimal, I just have a curiosity about the
> nature of the workaround here.
> >
> > Either way, I believe this will probably work as written, so I can give
> it an ACK at a minimum while I wait for answers.
> >
> > Acked-by: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> >
>
> Thanks!
>
> Yes, the workaround is a ugly.. But it's small, so I think we could live
> with.
>

I agree, I just wanted to make sure I understood what was happening and why.


> I don't think that refactoring TestRunner to move all needed things to
> some simple structure supported by Pool is good idea: actually, we don't
> want to copy these data for each subprocess, we are OK with readonly access
> to shared object. And we do call methods on self, and on self.env, so
> refactoring would not be simple.
>
> But about shared object, I didn't find any way to pass a link to shared
> object to Pool.map()..   Something like Pool.map( , ... ,
> shared_state=self) would be good. But were is such an option? Note that
> this is my first experience with multiprocessing.
>
> The only thing I find is passing through global variable. I started with
> real global variably, but then thought that hiding it inside the class
> would be a bit better.
>

Yeah, don't worry about making it absolutely beautiful.  Thanks for
explaining the problem to me, I agree that your workaround is a good
compromise.

Reviewed-by: John Snow <jsnow@redhat.com>

[-- Attachment #2: Type: text/html, Size: 19482 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/3] iotests: multiprocessing!!
  2021-12-06 20:26   ` Vladimir Sementsov-Ogievskiy
@ 2021-12-07 18:20     ` John Snow
  0 siblings, 0 replies; 21+ messages in thread
From: John Snow @ 2021-12-07 18:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, den, Hanna Reitz, qemu-devel, Qemu-block

[-- Attachment #1: Type: text/plain, Size: 2451 bytes --]

On Mon, Dec 6, 2021 at 3:26 PM Vladimir Sementsov-Ogievskiy <
vsementsov@virtuozzo.com> wrote:

> 06.12.2021 21:37, John Snow wrote:
> >
> >
> > On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
> vsementsov@virtuozzo.com <mailto:vsementsov@virtuozzo.com>> wrote:
> >
> >     Hi all!
> >
> >     Finally, I can not stand it any longer. So, I'm happy to present
> >     multiprocessing support for iotests test runner.
> >
> >     testing on tmpfs:
> >
> >     Before:
> >
> >     time check -qcow2
> >     ...
> >     real    12m28.095s
> >     user    9m53.398s
> >     sys     2m55.548s
> >
> >     After:
> >
> >     time check -qcow2 -j 12
> >     ...
> >     real    2m12.136s
> >     user    12m40.948s
> >     sys     4m7.449s
> >
> >
> > VERY excellent. And this will probably flush a lot more bugs loose, too.
> (Which I consider a good thing!)
>
> Thanks!)
>
> > We could look into utilizing it for 'make check', but we'll have to be
> prepared for a greater risk of race conditions on the CI if we do. But...
> it's seriously hard to argue with this kind of optimization, very well done!
>
> I thought about this too. I think, we can at least passthrought -j flag of
> "make -j9 check" to ./check
>
> I think, CIs mostly call make check without -j flag. But I always call
> make -j9 check. And it always upset me that all tests run in parallel
> except for iotests. So if it possible to detect that we are called through
> "make -j9 check" and use "-j 9" for iotests/check in this case, it would be
> good.
>
> >
> >
> >     Hmm, seems -j 6 should be enough. I have 6 cores, 2 threads per core.
> >     Anyway, that's so fast!
> >
> >     Vladimir Sementsov-Ogievskiy (3):
> >        iotests/testrunner.py: add doc string for run_test()
> >        iotests/testrunner.py: move updating last_elapsed to run_tests
> >        iotests: check: multiprocessing support
> >
> >       tests/qemu-iotests/check         |  4 +-
> >       tests/qemu-iotests/testrunner.py | 86
> ++++++++++++++++++++++++++++----
> >       2 files changed, 80 insertions(+), 10 deletions(-)
> >
> >     --
> >     2.31.1
> >
>
>
>
I'll also now add:

Tested-by: John Snow <jsnow@redhat.com>

I tried to find a different workaround in just a few minutes, but that just
made it clear that your solution was right. While I had it checked out, I
ran it a few times and it looks good to me!
(And no new problems from the Python CI stuff, so it looks good to me.)

[-- Attachment #2: Type: text/html, Size: 3548 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/3] iotests: multiprocessing!!
  2021-12-03 12:22 [PATCH 0/3] iotests: multiprocessing!! Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2021-12-06 18:37 ` [PATCH 0/3] iotests: multiprocessing!! John Snow
@ 2021-12-09 14:33 ` Hanna Reitz
  4 siblings, 0 replies; 21+ messages in thread
From: Hanna Reitz @ 2021-12-09 14:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: kwolf, den, jsnow, qemu-devel

On 03.12.21 13:22, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
>
> Finally, I can not stand it any longer. So, I'm happy to present
> multiprocessing support for iotests test runner.

Thanks, looks great!

Applied to my block-next branch:

https://gitlab.com/hreitz/qemu/-/commits/block-next

Hanna



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] iotests/testrunner.py: add doc string for run_test()
  2021-12-03 12:22 ` [PATCH 1/3] iotests/testrunner.py: add doc string for run_test() Vladimir Sementsov-Ogievskiy
  2021-12-06 17:52   ` John Snow
@ 2021-12-10 14:12   ` Kevin Wolf
  2021-12-10 14:40     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2021-12-10 14:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy; +Cc: den, hreitz, jsnow, qemu-devel, qemu-block

Am 03.12.2021 um 13:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
> We are going to modify these methods and will add more documentation in
> further commit. As a preparation add basic documentation.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/testrunner.py | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
> index 0e29c2fddd..fa842252d3 100644
> --- a/tests/qemu-iotests/testrunner.py
> +++ b/tests/qemu-iotests/testrunner.py
> @@ -220,6 +220,12 @@ def find_reference(self, test: str) -> str:
>          return f'{test}.out'
>  
>      def do_run_test(self, test: str) -> TestResult:
> +        """
> +        Run one test
> +
> +        :param test: test file path
> +        """
> +
>          f_test = Path(test)
>          f_bad = Path(f_test.name + '.out.bad')
>          f_notrun = Path(f_test.name + '.notrun')
> @@ -287,6 +293,13 @@ def do_run_test(self, test: str) -> TestResult:
>  
>      def run_test(self, test: str,
>                   test_field_width: Optional[int] = None) -> TestResult:
> +        """
> +        Run one test and print short status
> +
> +        :param test: test file path
> +        :param test_field_width: width for first field of status format
> +        """
> +
>          last_el = self.last_elapsed.get(test)
>          start = datetime.datetime.now().strftime('%H:%M:%S')

test_field_width is Optional[int], so the documentation should specify
what happens when you pass None.

Seems it doesn't change the behaviour, but just picks a default value of
8. 'test_field_width: int = 8' might have been the more obvious solution
for this in the first place.

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests
  2021-12-06 17:59   ` John Snow
@ 2021-12-10 14:25     ` Kevin Wolf
  2021-12-10 14:47       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2021-12-10 14:25 UTC (permalink / raw)
  To: John Snow
  Cc: den, Hanna Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel, Qemu-block

Am 06.12.2021 um 18:59 hat John Snow geschrieben:
> On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
> vsementsov@virtuozzo.com> wrote:
> 
> > We are going to use do_run_test() in multiprocessing environment, where
> > we'll not be able to change original runner object.
> >
> > Happily, the only thing we change is that last_elapsed and it's simple
> > to do it in run_tests() instead. All other accesses to self in
> > do_runt_test() and in run_test() are read-only.
> >
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > ---
> >  tests/qemu-iotests/testrunner.py | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/qemu-iotests/testrunner.py
> > b/tests/qemu-iotests/testrunner.py
> > index fa842252d3..a9f2feb58c 100644
> > --- a/tests/qemu-iotests/testrunner.py
> > +++ b/tests/qemu-iotests/testrunner.py
> > @@ -287,7 +287,6 @@ def do_run_test(self, test: str) -> TestResult:
> >                                diff=diff, casenotrun=casenotrun)
> >          else:
> >              f_bad.unlink()
> > -            self.last_elapsed.update(test, elapsed)
> >              return TestResult(status='pass', elapsed=elapsed,
> >                                casenotrun=casenotrun)
> >
> > @@ -353,6 +352,9 @@ def run_tests(self, tests: List[str]) -> bool:
> >                      print('\n'.join(res.diff))
> >              elif res.status == 'not run':
> >                  notrun.append(name)
> > +            elif res.status == 'pass':
> > +                assert res.elapsed is not None
> > +                self.last_elapsed.update(t, res.elapsed)
> >
> >              sys.stdout.flush()
> >              if res.interrupted:
> > --
> > 2.31.1
> >
> >
> (I continue to be annoyed by the "None" problem in mypy, but I suppose it
> really can't be helped. Nothing for you to change with this patch or
> series. I just wish we didn't need so many assertions ...)

I'm inclined to say it's a None problem in our code, not in mypy.
Essentially it comes from the fact that we're abusing a string
(res.status) and None values to distinguish different types of results
that have a different set of valid attributes.

Of course, Python already provides a language feature to distinguish
different types of results that have a different set of attributes and
that wouldn't run into this problem: subclasses.

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/3] iotests: check: multiprocessing support
  2021-12-03 12:22 ` [PATCH 3/3] iotests: check: multiprocessing support Vladimir Sementsov-Ogievskiy
  2021-12-06 18:35   ` John Snow
@ 2021-12-10 14:36   ` Kevin Wolf
  2021-12-10 14:46     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2021-12-10 14:36 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy; +Cc: den, hreitz, jsnow, qemu-devel, qemu-block

Am 03.12.2021 um 13:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Add -j <JOBS> parameter, to run tests in several jobs simultaneously.
> For realization - simply utilize multiprocessing.Pool class.
> 
> Notes:
> 
> 1. Of course, tests can't run simultaneously in same TEST_DIR. So,
>    use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
>    instead of simply TEST_DIR and SOCK_DIR
> 
> 2. multiprocessing.Pool.starmap function doesn't support passing
>    context managers, so we can't simply pass "self". Happily, we need
>    self only for read-only access, and it just works if it is defined
>    in global space. So, add a temporary link TestRunner.shared_self
>    during run_tests().
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Just wondering, is it worth even supporting the mp=false case or can we
simplify the code a bit by always going through multiprocessing and
using nice directory names even if only one process is spawned?

Maybe John's observation that directory names get longer might be a
reason not to do that by default. Any other reasons you're aware of?

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/3] iotests/testrunner.py: add doc string for run_test()
  2021-12-10 14:12   ` Kevin Wolf
@ 2021-12-10 14:40     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-10 14:40 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, hreitz, jsnow, den

10.12.2021 17:12, Kevin Wolf wrote:
> Am 03.12.2021 um 13:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> We are going to modify these methods and will add more documentation in
>> further commit. As a preparation add basic documentation.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   tests/qemu-iotests/testrunner.py | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
>> index 0e29c2fddd..fa842252d3 100644
>> --- a/tests/qemu-iotests/testrunner.py
>> +++ b/tests/qemu-iotests/testrunner.py
>> @@ -220,6 +220,12 @@ def find_reference(self, test: str) -> str:
>>           return f'{test}.out'
>>   
>>       def do_run_test(self, test: str) -> TestResult:
>> +        """
>> +        Run one test
>> +
>> +        :param test: test file path
>> +        """
>> +
>>           f_test = Path(test)
>>           f_bad = Path(f_test.name + '.out.bad')
>>           f_notrun = Path(f_test.name + '.notrun')
>> @@ -287,6 +293,13 @@ def do_run_test(self, test: str) -> TestResult:
>>   
>>       def run_test(self, test: str,
>>                    test_field_width: Optional[int] = None) -> TestResult:
>> +        """
>> +        Run one test and print short status
>> +
>> +        :param test: test file path
>> +        :param test_field_width: width for first field of status format
>> +        """
>> +
>>           last_el = self.last_elapsed.get(test)
>>           start = datetime.datetime.now().strftime('%H:%M:%S')
> 
> test_field_width is Optional[int], so the documentation should specify
> what happens when you pass None.
> 
> Seems it doesn't change the behaviour, but just picks a default value of
> 8. 'test_field_width: int = 8' might have been the more obvious solution
> for this in the first place.
> 
> Kevin
> 

Agree, I'll make a follow-up patch for it.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/3] iotests: check: multiprocessing support
  2021-12-10 14:36   ` Kevin Wolf
@ 2021-12-10 14:46     ` Vladimir Sementsov-Ogievskiy
  2021-12-10 16:26       ` Kevin Wolf
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-10 14:46 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, hreitz, jsnow, den

10.12.2021 17:36, Kevin Wolf wrote:
> Am 03.12.2021 um 13:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Add -j <JOBS> parameter, to run tests in several jobs simultaneously.
>> For realization - simply utilize multiprocessing.Pool class.
>>
>> Notes:
>>
>> 1. Of course, tests can't run simultaneously in same TEST_DIR. So,
>>     use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
>>     instead of simply TEST_DIR and SOCK_DIR
>>
>> 2. multiprocessing.Pool.starmap function doesn't support passing
>>     context managers, so we can't simply pass "self". Happily, we need
>>     self only for read-only access, and it just works if it is defined
>>     in global space. So, add a temporary link TestRunner.shared_self
>>     during run_tests().
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> Just wondering, is it worth even supporting the mp=false case or can we
> simplify the code a bit by always going through multiprocessing and
> using nice directory names even if only one process is spawned?
> 
> Maybe John's observation that directory names get longer might be a
> reason not to do that by default. Any other reasons you're aware of?
> 

I just wanted to keep the behavior without a new option unchanged, to not deal with possible CI failures on "make check": who know what multiprocessing brings together with performance.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests
  2021-12-10 14:25     ` Kevin Wolf
@ 2021-12-10 14:47       ` Vladimir Sementsov-Ogievskiy
  2021-12-10 20:05         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-10 14:47 UTC (permalink / raw)
  To: Kevin Wolf, John Snow; +Cc: Qemu-block, qemu-devel, Hanna Reitz, den

10.12.2021 17:25, Kevin Wolf wrote:
> Am 06.12.2021 um 18:59 hat John Snow geschrieben:
>> On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
>> vsementsov@virtuozzo.com> wrote:
>>
>>> We are going to use do_run_test() in multiprocessing environment, where
>>> we'll not be able to change original runner object.
>>>
>>> Happily, the only thing we change is that last_elapsed and it's simple
>>> to do it in run_tests() instead. All other accesses to self in
>>> do_runt_test() and in run_test() are read-only.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   tests/qemu-iotests/testrunner.py | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tests/qemu-iotests/testrunner.py
>>> b/tests/qemu-iotests/testrunner.py
>>> index fa842252d3..a9f2feb58c 100644
>>> --- a/tests/qemu-iotests/testrunner.py
>>> +++ b/tests/qemu-iotests/testrunner.py
>>> @@ -287,7 +287,6 @@ def do_run_test(self, test: str) -> TestResult:
>>>                                 diff=diff, casenotrun=casenotrun)
>>>           else:
>>>               f_bad.unlink()
>>> -            self.last_elapsed.update(test, elapsed)
>>>               return TestResult(status='pass', elapsed=elapsed,
>>>                                 casenotrun=casenotrun)
>>>
>>> @@ -353,6 +352,9 @@ def run_tests(self, tests: List[str]) -> bool:
>>>                       print('\n'.join(res.diff))
>>>               elif res.status == 'not run':
>>>                   notrun.append(name)
>>> +            elif res.status == 'pass':
>>> +                assert res.elapsed is not None
>>> +                self.last_elapsed.update(t, res.elapsed)
>>>
>>>               sys.stdout.flush()
>>>               if res.interrupted:
>>> --
>>> 2.31.1
>>>
>>>
>> (I continue to be annoyed by the "None" problem in mypy, but I suppose it
>> really can't be helped. Nothing for you to change with this patch or
>> series. I just wish we didn't need so many assertions ...)
> 
> I'm inclined to say it's a None problem in our code, not in mypy.
> Essentially it comes from the fact that we're abusing a string
> (res.status) and None values to distinguish different types of results
> that have a different set of valid attributes.
> 
> Of course, Python already provides a language feature to distinguish
> different types of results that have a different set of attributes and
> that wouldn't run into this problem: subclasses.
> 

Agree, you are right. I'll look if it is simple to refactor.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/3] iotests: check: multiprocessing support
  2021-12-10 14:46     ` Vladimir Sementsov-Ogievskiy
@ 2021-12-10 16:26       ` Kevin Wolf
  0 siblings, 0 replies; 21+ messages in thread
From: Kevin Wolf @ 2021-12-10 16:26 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy; +Cc: den, hreitz, jsnow, qemu-devel, qemu-block

Am 10.12.2021 um 15:46 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 10.12.2021 17:36, Kevin Wolf wrote:
> > Am 03.12.2021 um 13:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > Add -j <JOBS> parameter, to run tests in several jobs simultaneously.
> > > For realization - simply utilize multiprocessing.Pool class.
> > > 
> > > Notes:
> > > 
> > > 1. Of course, tests can't run simultaneously in same TEST_DIR. So,
> > >     use subdirectories TEST_DIR/testname/ and SOCK_DIR/testname/
> > >     instead of simply TEST_DIR and SOCK_DIR
> > > 
> > > 2. multiprocessing.Pool.starmap function doesn't support passing
> > >     context managers, so we can't simply pass "self". Happily, we need
> > >     self only for read-only access, and it just works if it is defined
> > >     in global space. So, add a temporary link TestRunner.shared_self
> > >     during run_tests().
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > 
> > Just wondering, is it worth even supporting the mp=false case or can we
> > simplify the code a bit by always going through multiprocessing and
> > using nice directory names even if only one process is spawned?
> > 
> > Maybe John's observation that directory names get longer might be a
> > reason not to do that by default. Any other reasons you're aware of?
> 
> I just wanted to keep the behavior without a new option unchanged, to
> not deal with possible CI failures on "make check": who know what
> multiprocessing brings together with performance.

So basically just err on the side of caution. Makes sense.

Kevin



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests
  2021-12-10 14:47       ` Vladimir Sementsov-Ogievskiy
@ 2021-12-10 20:05         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-12-10 20:05 UTC (permalink / raw)
  To: Kevin Wolf, John Snow; +Cc: Qemu-block, qemu-devel, Hanna Reitz, den

10.12.2021 17:47, Vladimir Sementsov-Ogievskiy wrote:
> 10.12.2021 17:25, Kevin Wolf wrote:
>> Am 06.12.2021 um 18:59 hat John Snow geschrieben:
>>> On Fri, Dec 3, 2021 at 7:22 AM Vladimir Sementsov-Ogievskiy <
>>> vsementsov@virtuozzo.com> wrote:
>>>
>>>> We are going to use do_run_test() in multiprocessing environment, where
>>>> we'll not be able to change original runner object.
>>>>
>>>> Happily, the only thing we change is that last_elapsed and it's simple
>>>> to do it in run_tests() instead. All other accesses to self in
>>>> do_runt_test() and in run_test() are read-only.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>   tests/qemu-iotests/testrunner.py | 4 +++-
>>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/tests/qemu-iotests/testrunner.py
>>>> b/tests/qemu-iotests/testrunner.py
>>>> index fa842252d3..a9f2feb58c 100644
>>>> --- a/tests/qemu-iotests/testrunner.py
>>>> +++ b/tests/qemu-iotests/testrunner.py
>>>> @@ -287,7 +287,6 @@ def do_run_test(self, test: str) -> TestResult:
>>>>                                 diff=diff, casenotrun=casenotrun)
>>>>           else:
>>>>               f_bad.unlink()
>>>> -            self.last_elapsed.update(test, elapsed)
>>>>               return TestResult(status='pass', elapsed=elapsed,
>>>>                                 casenotrun=casenotrun)
>>>>
>>>> @@ -353,6 +352,9 @@ def run_tests(self, tests: List[str]) -> bool:
>>>>                       print('\n'.join(res.diff))
>>>>               elif res.status == 'not run':
>>>>                   notrun.append(name)
>>>> +            elif res.status == 'pass':
>>>> +                assert res.elapsed is not None
>>>> +                self.last_elapsed.update(t, res.elapsed)
>>>>
>>>>               sys.stdout.flush()
>>>>               if res.interrupted:
>>>> -- 
>>>> 2.31.1
>>>>
>>>>
>>> (I continue to be annoyed by the "None" problem in mypy, but I suppose it
>>> really can't be helped. Nothing for you to change with this patch or
>>> series. I just wish we didn't need so many assertions ...)
>>
>> I'm inclined to say it's a None problem in our code, not in mypy.
>> Essentially it comes from the fact that we're abusing a string
>> (res.status) and None values to distinguish different types of results
>> that have a different set of valid attributes.
>>
>> Of course, Python already provides a language feature to distinguish
>> different types of results that have a different set of attributes and
>> that wouldn't run into this problem: subclasses.
>>
> 
> Agree, you are right. I'll look if it is simple to refactor.
> 

No it's not simple)

Actually, it means making TestResult more smart, and moving (most of) logic of test result result parsing (checking different files, etc) to TestResult.. But we'll still want to update last_elapsed.. Adding a method "TestResult.update_runnner(runner)", which will be no-op in base TestResult and will be actually realized only in TestResult subclass that have .elapsed to call runner.update_last_elapsed()? And we'll have a hierarhy like TestResultBase -> TestResultWithElapsed -> (TestResultFail, TestResultPass).. At least, that's what I imagine, and I don't like it :)

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-12-10 20:07 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-03 12:22 [PATCH 0/3] iotests: multiprocessing!! Vladimir Sementsov-Ogievskiy
2021-12-03 12:22 ` [PATCH 1/3] iotests/testrunner.py: add doc string for run_test() Vladimir Sementsov-Ogievskiy
2021-12-06 17:52   ` John Snow
2021-12-10 14:12   ` Kevin Wolf
2021-12-10 14:40     ` Vladimir Sementsov-Ogievskiy
2021-12-03 12:22 ` [PATCH 2/3] iotests/testrunner.py: move updating last_elapsed to run_tests Vladimir Sementsov-Ogievskiy
2021-12-06 17:59   ` John Snow
2021-12-10 14:25     ` Kevin Wolf
2021-12-10 14:47       ` Vladimir Sementsov-Ogievskiy
2021-12-10 20:05         ` Vladimir Sementsov-Ogievskiy
2021-12-03 12:22 ` [PATCH 3/3] iotests: check: multiprocessing support Vladimir Sementsov-Ogievskiy
2021-12-06 18:35   ` John Snow
2021-12-06 20:20     ` Vladimir Sementsov-Ogievskiy
2021-12-06 21:00       ` John Snow
2021-12-10 14:36   ` Kevin Wolf
2021-12-10 14:46     ` Vladimir Sementsov-Ogievskiy
2021-12-10 16:26       ` Kevin Wolf
2021-12-06 18:37 ` [PATCH 0/3] iotests: multiprocessing!! John Snow
2021-12-06 20:26   ` Vladimir Sementsov-Ogievskiy
2021-12-07 18:20     ` John Snow
2021-12-09 14:33 ` Hanna Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.