All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Add qemu-img checksum command using blkhash
@ 2022-11-28 14:15 Nir Soffer
  2022-11-28 14:15 ` [PATCH v2 1/5] qemu-img.c: Move IO_BUF_SIZE to the top of the file Nir Soffer
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Nir Soffer @ 2022-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé,
	Hanna Reitz, Nir Soffer

Since blkhash is available only via copr now, the new command is added as
optional feature, built only if blkhash-devel package is installed.

Changes since v1 (Hanna):
- Move IO_BUF_SIZE to top of the file
- Extend TestFinder to support format or cache specific out files
- Improve online help (note about optimization and lint to blkhash project)
- Guard blkhash.h include with CONFIG_BLKHASH
- Using user_creatable_process_cmdline() instead of user_creatable_add_from_str()
- Rename ret to exit_code
- Add static assert to ensure that read buffer is algined to block size
- Drop unneeded pnum variable
- Change test to work like other tests; use iotest.imgfmt and iotest.cachemode
- Simplify test to test only raw and qcow2 format using file protocol
- Fix code style issues (multi-line comments, missing braces)
- Make error checking more clear (checksum_block_status(s) < 0)

v1:
https://lists.nongnu.org/archive/html/qemu-block/2022-09/msg00021.html

v1 discussion:
- https://lists.nongnu.org/archive/html/qemu-block/2022-10/msg00602.html
- https://lists.nongnu.org/archive/html/qemu-block/2022-10/msg00603.html
- https://lists.nongnu.org/archive/html/qemu-block/2022-10/msg00604.html
- https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00171.html
- https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00173.html

Nir Soffer (5):
  qemu-img.c: Move IO_BUF_SIZE to the top of the file
  Support format or cache specific out file
  qemu-img: Add checksum command
  iotests: Test qemu-img checksum
  qemu-img: Speed up checksum

 docs/tools/qemu-img.rst                       |  24 ++
 meson.build                                   |  10 +-
 meson_options.txt                             |   2 +
 qemu-img-cmds.hx                              |   8 +
 qemu-img.c                                    | 390 +++++++++++++++++-
 tests/qemu-iotests/findtests.py               |  10 +-
 tests/qemu-iotests/tests/qemu-img-checksum    |  63 +++
 .../tests/qemu-img-checksum.out.qcow2         |  11 +
 .../tests/qemu-img-checksum.out.raw           |  10 +
 9 files changed, 523 insertions(+), 5 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/qemu-img-checksum
 create mode 100644 tests/qemu-iotests/tests/qemu-img-checksum.out.qcow2
 create mode 100644 tests/qemu-iotests/tests/qemu-img-checksum.out.raw

-- 
2.38.1



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/5] qemu-img.c: Move IO_BUF_SIZE to the top of the file
  2022-11-28 14:15 [PATCH v2 0/5] Add qemu-img checksum command using blkhash Nir Soffer
@ 2022-11-28 14:15 ` Nir Soffer
  2022-12-12 10:35   ` Hanna Reitz
  2022-11-28 14:15 ` [PATCH v2 2/5] Support format or cache specific out file Nir Soffer
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Nir Soffer @ 2022-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé,
	Hanna Reitz, Nir Soffer

This macro is used by various commands (compare, convert, rebase) but it
is defined somewhere in the middle of the file. I'm going to use it in
the new checksum command so lets clean up a bit before that.
---
 qemu-img.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index a9b3a8103c..c03d6b4b31 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -49,20 +49,21 @@
 #include "block/block_int.h"
 #include "block/blockjob.h"
 #include "block/qapi.h"
 #include "crypto/init.h"
 #include "trace/control.h"
 #include "qemu/throttle.h"
 #include "block/throttle-groups.h"
 
 #define QEMU_IMG_VERSION "qemu-img version " QEMU_FULL_VERSION \
                           "\n" QEMU_COPYRIGHT "\n"
+#define IO_BUF_SIZE (2 * MiB)
 
 typedef struct img_cmd_t {
     const char *name;
     int (*handler)(int argc, char **argv);
 } img_cmd_t;
 
 enum {
     OPTION_OUTPUT = 256,
     OPTION_BACKING_CHAIN = 257,
     OPTION_OBJECT = 258,
@@ -1281,22 +1282,20 @@ static int compare_buffers(const uint8_t *buf1, const uint8_t *buf2,
         if (!!memcmp(buf1 + i, buf2 + i, len) != res) {
             break;
         }
         i += len;
     }
 
     *pnum = i;
     return res;
 }
 
-#define IO_BUF_SIZE (2 * MiB)
-
 /*
  * Check if passed sectors are empty (not allocated or contain only 0 bytes)
  *
  * Intended for use by 'qemu-img compare': Returns 0 in case sectors are
  * filled with 0, 1 if sectors contain non-zero data (this is a comparison
  * failure), and 4 on error (the exit status for read errors), after emitting
  * an error message.
  *
  * @param blk:  BlockBackend for the image
  * @param offset: Starting offset to check
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/5] Support format or cache specific out file
  2022-11-28 14:15 [PATCH v2 0/5] Add qemu-img checksum command using blkhash Nir Soffer
  2022-11-28 14:15 ` [PATCH v2 1/5] qemu-img.c: Move IO_BUF_SIZE to the top of the file Nir Soffer
@ 2022-11-28 14:15 ` Nir Soffer
  2022-12-12 10:38   ` Hanna Reitz
  2022-11-28 14:15 ` [PATCH v2 3/5] qemu-img: Add checksum command Nir Soffer
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Nir Soffer @ 2022-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé,
	Hanna Reitz, Nir Soffer

Extend the test finder to find tests with format (*.out.qcow2) or cache
specific (*.out.nocache) out file. This worked before only for the
numbered tests.
---
 tests/qemu-iotests/findtests.py | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/findtests.py b/tests/qemu-iotests/findtests.py
index dd77b453b8..f4344ce78c 100644
--- a/tests/qemu-iotests/findtests.py
+++ b/tests/qemu-iotests/findtests.py
@@ -38,31 +38,37 @@ def chdir(path: Optional[str] = None) -> Iterator[None]:
         os.chdir(saved_dir)
 
 
 class TestFinder:
     def __init__(self, test_dir: Optional[str] = None) -> None:
         self.groups = defaultdict(set)
 
         with chdir(test_dir):
             self.all_tests = glob.glob('[0-9][0-9][0-9]')
             self.all_tests += [f for f in glob.iglob('tests/*')
-                               if not f.endswith('.out') and
-                               os.path.isfile(f + '.out')]
+                               if self.is_test(f)]
 
             for t in self.all_tests:
                 with open(t, encoding="utf-8") as f:
                     for line in f:
                         if line.startswith('# group: '):
                             for g in line.split()[2:]:
                                 self.groups[g].add(t)
                             break
 
+    def is_test(self, fname: str) -> bool:
+        """
+        The tests directory contains tests (no extension) and out files
+        (*.out, *.out.{format}, *.out.{option}).
+        """
+        return re.search(r'.+\.out(\.\w+)?$', fname) is None
+
     def add_group_file(self, fname: str) -> None:
         with open(fname, encoding="utf-8") as f:
             for line in f:
                 line = line.strip()
 
                 if (not line) or line[0] == '#':
                     continue
 
                 words = line.split()
                 test_file = self.parse_test_name(words[0])
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/5] qemu-img: Add checksum command
  2022-11-28 14:15 [PATCH v2 0/5] Add qemu-img checksum command using blkhash Nir Soffer
  2022-11-28 14:15 ` [PATCH v2 1/5] qemu-img.c: Move IO_BUF_SIZE to the top of the file Nir Soffer
  2022-11-28 14:15 ` [PATCH v2 2/5] Support format or cache specific out file Nir Soffer
@ 2022-11-28 14:15 ` Nir Soffer
  2022-12-12 10:42   ` Hanna Reitz
  2022-11-28 14:15 ` [PATCH v2 4/5] iotests: Test qemu-img checksum Nir Soffer
  2022-11-28 14:15 ` [PATCH v2 5/5] qemu-img: Speed up checksum Nir Soffer
  4 siblings, 1 reply; 14+ messages in thread
From: Nir Soffer @ 2022-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé,
	Hanna Reitz, Nir Soffer

The checksum command compute a checksum for disk image content using the
blkhash library[1]. The blkhash library is not packaged yet, but it is
available via copr[2].

Example run:

    $ ./qemu-img checksum -p fedora-35.qcow2
    6e5c00c995056319d52395f8d91c7f84725ae3da69ffcba4de4c7d22cff713a5  fedora-35.qcow2

The block checksum is constructed by splitting the image to fixed sized
blocks and computing a digest of every block. The image checksum is the
digest of the all block digests.

The checksum uses internally the "sha256" algorithm but it cannot be
compared with checksums created by other tools such as `sha256sum`.

The blkhash library supports sparse images, zero detection, and
optimizes zero block hashing (they are practically free). The library
uses multiple threads to speed up the computation.

Comparing to `sha256sum`, `qemu-img checksum` is 3.5-4800[3] times
faster, depending on the amount of data in the image:

    $ ./qemu-img info /scratch/50p.raw
    file format: raw
    virtual size: 6 GiB (6442450944 bytes)
    disk size: 2.91 GiB

    $ hyperfine -w2 -r5 -p "sleep 1" "./qemu-img checksum /scratch/50p.raw" \
                                     "sha256sum /scratch/50p.raw"
    Benchmark 1: ./qemu-img checksum /scratch/50p.raw
      Time (mean ± σ):      1.849 s ±  0.037 s    [User: 7.764 s, System: 0.962 s]
      Range (min … max):    1.813 s …  1.908 s    5 runs

    Benchmark 2: sha256sum /scratch/50p.raw
      Time (mean ± σ):     14.585 s ±  0.072 s    [User: 13.537 s, System: 1.003 s]
      Range (min … max):   14.501 s … 14.697 s    5 runs

    Summary
      './qemu-img checksum /scratch/50p.raw' ran
        7.89 ± 0.16 times faster than 'sha256sum /scratch/50p.raw'

The new command is available only when `blkhash` is available during
build. To test the new command please install the `blkhash-devel`
package:

    $ dnf copr enable nsoffer/blkhash
    $ sudo dnf install blkhash-devel

[1] https://gitlab.com/nirs/blkhash
[2] https://copr.fedorainfracloud.org/coprs/nsoffer/blkhash/
[3] Computing checksum for 8T empty image: qemu-img checksum: 3.7s,
    sha256sum (estimate): 17,749s

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
---
 docs/tools/qemu-img.rst |  24 ++++++
 meson.build             |  10 ++-
 meson_options.txt       |   2 +
 qemu-img-cmds.hx        |   8 ++
 qemu-img.c              | 183 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 226 insertions(+), 1 deletion(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 15aeddc6d8..d856785ecc 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -347,20 +347,44 @@ Command description:
     Check completed, image is corrupted
   3
     Check completed, image has leaked clusters, but is not corrupted
   63
     Checks are not supported by the image format
 
   If ``-r`` is specified, exit codes representing the image state refer to the
   state after (the attempt at) repairing it. That is, a successful ``-r all``
   will yield the exit code 0, independently of the image state before.
 
+.. option:: checksum [--object OBJECTDEF] [--image-opts] [-f FMT] [-T SRC_CACHE] [-p] FILENAME
+
+  Print a checksum for image *FILENAME* guest visible content. Images with
+  different format or settings will have the same checksum.
+
+  The format is probed unless you specify it by ``-f``.
+
+  The checksum is computed for guest visible content. Allocated areas full of
+  zeroes, zero clusters, and unallocated areas are read as zeros so they will
+  have the same checksum. Images with single or multiple files or backing files
+  will have the same checksums if the guest will see the same content when
+  reading the image.
+
+  Image metadata that is not visible to the guest such as dirty bitmaps does
+  not affect the checksum.
+
+  Computing a checksum requires a read-only image. You cannot compute a
+  checksum of an active image used by a guest, but you can compute a checksum
+  of a guest during pull mode incremental backup using NBD URL.
+
+  The checksum is not compatible with other tools such as *sha256sum* for
+  optimization purposes; using multithreading and optimized handling of zero
+  areas. For more info please see https://gitlab.com/nirs/blkhash.
+
 .. option:: commit [--object OBJECTDEF] [--image-opts] [-q] [-f FMT] [-t CACHE] [-b BASE] [-r RATE_LIMIT] [-d] [-p] FILENAME
 
   Commit the changes recorded in *FILENAME* in its base image or backing file.
   If the backing file is smaller than the snapshot, then the backing file will be
   resized to be the same size as the snapshot.  If the snapshot is smaller than
   the backing file, the backing file will not be truncated.  If you want the
   backing file to match the size of the smaller snapshot, you can safely truncate
   it yourself once the commit operation successfully completes.
 
   The image *FILENAME* is emptied after the operation has succeeded. If you do
diff --git a/meson.build b/meson.build
index 5c6b5a1c75..18071dd653 100644
--- a/meson.build
+++ b/meson.build
@@ -790,20 +790,24 @@ if not get_option('curl').auto() or have_block
                     kwargs: static_kwargs)
 endif
 libudev = not_found
 if targetos == 'linux' and (have_system or have_tools)
   libudev = dependency('libudev',
                        method: 'pkg-config',
                        required: get_option('libudev'),
                        kwargs: static_kwargs)
 endif
 
+blkhash = dependency('blkhash', version: '>=0.5.1',
+                     method: 'pkg-config',
+                     required: get_option('blkhash'))
+
 mpathlibs = [libudev]
 mpathpersist = not_found
 mpathpersist_new_api = false
 if targetos == 'linux' and have_tools and get_option('mpath').allowed()
   mpath_test_source_new = '''
     #include <libudev.h>
     #include <mpath_persist.h>
     unsigned mpath_mx_alloc_len = 1024;
     int logsink;
     static struct config *multipath_conf;
@@ -1917,20 +1921,22 @@ config_host_data.set('CONFIG_CFI', get_option('cfi'))
 config_host_data.set('CONFIG_SELINUX', selinux.found())
 config_host_data.set('CONFIG_XEN_BACKEND', xen.found())
 if xen.found()
   # protect from xen.version() having less than three components
   xen_version = xen.version().split('.') + ['0', '0']
   xen_ctrl_version = xen_version[0] + \
     ('0' + xen_version[1]).substring(-2) + \
     ('0' + xen_version[2]).substring(-2)
   config_host_data.set('CONFIG_XEN_CTRL_INTERFACE_VERSION', xen_ctrl_version)
 endif
+config_host_data.set('CONFIG_BLKHASH', blkhash.found())
+
 config_host_data.set('QEMU_VERSION', '"@0@"'.format(meson.project_version()))
 config_host_data.set('QEMU_VERSION_MAJOR', meson.project_version().split('.')[0])
 config_host_data.set('QEMU_VERSION_MINOR', meson.project_version().split('.')[1])
 config_host_data.set('QEMU_VERSION_MICRO', meson.project_version().split('.')[2])
 
 config_host_data.set_quoted('CONFIG_HOST_DSOSUF', host_dsosuf)
 config_host_data.set('HAVE_HOST_BLOCK_DEVICE', have_host_block_device)
 
 have_coroutine_pool = get_option('coroutine_pool')
 if get_option('debug_stack_usage') and have_coroutine_pool
@@ -3582,21 +3588,22 @@ subdir('qga')
 # Don't build qemu-keymap if xkbcommon is not explicitly enabled
 # when we don't build tools or system
 if xkbcommon.found()
   # used for the update-keymaps target, so include rules even if !have_tools
   qemu_keymap = executable('qemu-keymap', files('qemu-keymap.c', 'ui/input-keymap.c') + genh,
                            dependencies: [qemuutil, xkbcommon], install: have_tools)
 endif
 
 if have_tools
   qemu_img = executable('qemu-img', [files('qemu-img.c'), hxdep],
-             dependencies: [authz, block, crypto, io, qom, qemuutil], install: true)
+             dependencies: [authz, block, crypto, io, qom, qemuutil, blkhash],
+             install: true)
   qemu_io = executable('qemu-io', files('qemu-io.c'),
              dependencies: [block, qemuutil], install: true)
   qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
                dependencies: [blockdev, qemuutil, gnutls, selinux],
                install: true)
 
   subdir('storage-daemon')
   subdir('contrib/rdmacm-mux')
   subdir('contrib/elf2dmp')
 
@@ -3968,20 +3975,21 @@ summary_info += {'bzip2 support':     libbzip2}
 summary_info += {'lzfse support':     liblzfse}
 summary_info += {'zstd support':      zstd}
 summary_info += {'NUMA host support': numa}
 summary_info += {'capstone':          capstone}
 summary_info += {'libpmem support':   libpmem}
 summary_info += {'libdaxctl support': libdaxctl}
 summary_info += {'libudev':           libudev}
 # Dummy dependency, keep .found()
 summary_info += {'FUSE lseek':        fuse_lseek.found()}
 summary_info += {'selinux':           selinux}
+summary_info += {'blkhash':           blkhash}
 summary(summary_info, bool_yn: true, section: 'Dependencies')
 
 if not supported_cpus.contains(cpu)
   message()
   warning('SUPPORT FOR THIS HOST CPU WILL GO AWAY IN FUTURE RELEASES!')
   message()
   message('CPU host architecture ' + cpu + ' support is not currently maintained.')
   message('The QEMU project intends to remove support for this host CPU in')
   message('a future release if nobody volunteers to maintain it and to')
   message('provide a build host for our continuous integration setup.')
diff --git a/meson_options.txt b/meson_options.txt
index 4b749ca549..475ddcf211 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -314,10 +314,12 @@ option('debug_mutex', type: 'boolean', value: false,
 option('debug_stack_usage', type: 'boolean', value: false,
        description: 'measure coroutine stack usage')
 option('qom_cast_debug', type: 'boolean', value: false,
        description: 'cast debugging support')
 option('gprof', type: 'boolean', value: false,
        description: 'QEMU profiling with gprof')
 option('profiler', type: 'boolean', value: false,
        description: 'profiler support')
 option('slirp_smbd', type : 'feature', value : 'auto',
        description: 'use smbd (at path --smbd=*) in slirp networking')
+option('blkhash', type: 'feature', value: 'auto',
+       description: 'blkhash support for computing image checksum')
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 1b1dab5b17..c4fc935a37 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -26,20 +26,28 @@ DEF("bitmap", img_bitmap,
 SRST
 .. option:: bitmap (--merge SOURCE | --add | --remove | --clear | --enable | --disable)... [-b SOURCE_FILE [-F SOURCE_FMT]] [-g GRANULARITY] [--object OBJECTDEF] [--image-opts | -f FMT] FILENAME BITMAP
 ERST
 
 DEF("check", img_check,
     "check [--object objectdef] [--image-opts] [-q] [-f fmt] [--output=ofmt] [-r [leaks | all]] [-T src_cache] [-U] filename")
 SRST
 .. option:: check [--object OBJECTDEF] [--image-opts] [-q] [-f FMT] [--output=OFMT] [-r [leaks | all]] [-T SRC_CACHE] [-U] FILENAME
 ERST
 
+#ifdef CONFIG_BLKHASH
+DEF("checksum", img_checksum,
+    "checksum [--object objectdef] [--image-opts] [-f fmt] [-T src_cache] [-p] filename")
+SRST
+.. option:: checksum [--object OBJECTDEF] [--image-opts] [-f FMT] [-T SRC_CACHE] [-p] FILENAME
+ERST
+#endif /* CONFIG_BLKHASH */
+
 DEF("commit", img_commit,
     "commit [--object objectdef] [--image-opts] [-q] [-f fmt] [-t cache] [-b base] [-r rate_limit] [-d] [-p] filename")
 SRST
 .. option:: commit [--object OBJECTDEF] [--image-opts] [-q] [-f FMT] [-t CACHE] [-b BASE] [-r RATE_LIMIT] [-d] [-p] FILENAME
 ERST
 
 DEF("compare", img_compare,
     "compare [--object objectdef] [--image-opts] [-f fmt] [-F fmt] [-T src_cache] [-p] [-q] [-s] [-U] filename1 filename2")
 SRST
 .. option:: compare [--object OBJECTDEF] [--image-opts] [-f FMT] [-F FMT] [-T SRC_CACHE] [-p] [-q] [-s] [-U] FILENAME1 FILENAME2
diff --git a/qemu-img.c b/qemu-img.c
index c03d6b4b31..4b4ca7add3 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -18,20 +18,24 @@
  * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
  * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
 
 #include "qemu/osdep.h"
 #include <getopt.h>
 
+#ifdef CONFIG_BLKHASH
+#include <blkhash.h>
+#endif
+
 #include "qemu/help-texts.h"
 #include "qemu/qemu-progress.h"
 #include "qemu-version.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-block-core.h"
 #include "qapi/qapi-visit-block-core.h"
 #include "qapi/qobject-output-visitor.h"
 #include "qapi/qmp/qjson.h"
 #include "qapi/qmp/qdict.h"
 #include "qemu/cutils.h"
@@ -1613,20 +1617,199 @@ out:
     qemu_vfree(buf1);
     qemu_vfree(buf2);
     blk_unref(blk2);
 out2:
     blk_unref(blk1);
 out3:
     qemu_progress_end();
     return ret;
 }
 
+#ifdef CONFIG_BLKHASH
+/*
+ * Compute image checksum.
+ */
+static int img_checksum(int argc, char **argv)
+{
+    const char *digest_name = "sha256";
+    const size_t block_size = 64 * KiB;
+
+    _Static_assert(QEMU_IS_ALIGNED(IO_BUF_SIZE, block_size),
+                   "IO_BUF_SIZE should be alligned to block_size");
+
+    const char *format = NULL;
+    const char *cache = BDRV_DEFAULT_CACHE;
+    const char *filename;
+    BlockBackend *blk;
+    BlockDriverState *bs;
+    uint8_t *buf = NULL;
+    bool progress = false;
+    int flags = 0;
+    bool writethrough;
+    int64_t total_size;
+    int64_t offset = 0;
+    int c;
+    bool image_opts = false;
+    struct blkhash *h = NULL;
+    unsigned char digest[64];
+    unsigned int digest_len;
+    int err;
+    int exit_status = 1;
+
+    for (;;) {
+        static const struct option long_options[] = {
+            {"help", no_argument, 0, 'h'},
+            {"object", required_argument, 0, OPTION_OBJECT},
+            {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
+            {0, 0, 0, 0}
+        };
+        c = getopt_long(argc, argv, ":hf:T:p",
+                        long_options, NULL);
+        if (c == -1) {
+            break;
+        }
+        switch (c) {
+        case ':':
+            missing_argument(argv[optind - 1]);
+            break;
+        case '?':
+            unrecognized_option(argv[optind - 1]);
+            break;
+        case 'h':
+            help();
+            break;
+        case 'f':
+            format = optarg;
+            break;
+        case 'T':
+            cache = optarg;
+            break;
+        case 'p':
+            progress = true;
+            break;
+        case OPTION_OBJECT:
+            user_creatable_process_cmdline(optarg);
+            break;
+        case OPTION_IMAGE_OPTS:
+            image_opts = true;
+            break;
+        }
+    }
+
+    if (optind != argc - 1) {
+        error_exit("Expecting image file name");
+    }
+
+    filename = argv[optind++];
+
+    err = bdrv_parse_cache_mode(cache, &flags, &writethrough);
+    if (err < 0) {
+        error_report("Invalid source cache option: %s", cache);
+        return exit_status;
+    }
+
+    blk = img_open(image_opts, filename, format, flags, writethrough, false,
+                   false);
+    if (!blk) {
+        return exit_status;
+    }
+
+    /* Initialize before using goto out. */
+    qemu_progress_init(progress, 2.0);
+
+    bs = blk_bs(blk);
+    buf = blk_blockalign(blk, IO_BUF_SIZE);
+
+    total_size = blk_getlength(blk);
+    if (total_size < 0) {
+        error_report("Can't get size of %s: %s",
+                     filename, strerror(-total_size));
+        goto out;
+    }
+
+    h = blkhash_new(block_size, digest_name);
+    if (!h) {
+        error_report("Can't create blkhash: %s", strerror(errno));
+        goto out;
+    }
+
+    qemu_progress_print(0, 100);
+
+    while (offset < total_size) {
+        int status;
+        int64_t chunk;
+
+        status = bdrv_block_status_above(bs, NULL, offset,
+                                         total_size - offset, &chunk, NULL,
+                                         NULL);
+        if (status < 0) {
+            error_report("Error checking status at offset %" PRId64 " for %s",
+                         offset, filename);
+            goto out;
+        }
+
+        assert(chunk);
+
+        if (status & BDRV_BLOCK_ZERO) {
+            chunk = MIN(chunk, SIZE_MAX);
+            err = blkhash_zero(h, chunk);
+            if (err) {
+                error_report("Error zeroing hash at offset %" PRId64
+                             " of %s: %s",
+                             offset, filename, strerror(err));
+                goto out;
+            }
+        } else {
+            chunk = MIN(chunk, IO_BUF_SIZE);
+            err = blk_pread(blk, offset, chunk, buf, 0);
+            if (err < 0) {
+                error_report("Error reading at offset %" PRId64 " of %s: %s",
+                             offset, filename, strerror(-err));
+                goto out;
+            }
+            err = blkhash_update(h, buf, chunk);
+            if (err) {
+                error_report("Error updating hash at offset %" PRId64
+                             " of %s: %s",
+                             offset, filename, strerror(err));
+                goto out;
+            }
+        }
+
+        offset += chunk;
+        qemu_progress_print(((float) chunk / total_size) * 100, 100);
+    }
+
+    err = blkhash_final(h, digest, &digest_len);
+    if (err) {
+        error_report("Error finalizing hash of %s: %s",
+                     filename, strerror(err));
+        goto out;
+    }
+
+    for (unsigned i = 0; i < digest_len; i++) {
+        printf("%02x", digest[i]);
+    }
+    printf("  %s%s", filename, progress ? "" : "\n");
+
+    exit_status = 0;
+
+out:
+    blkhash_free(h);
+    qemu_vfree(buf);
+    blk_unref(blk);
+    qemu_progress_end();
+
+    return exit_status;
+}
+#endif /* CONFIG_BLKHASH */
+
 /* Convenience wrapper around qmp_block_dirty_bitmap_merge */
 static void do_dirty_bitmap_merge(const char *dst_node, const char *dst_name,
                                   const char *src_node, const char *src_name,
                                   Error **errp)
 {
     BlockDirtyBitmapOrStr *merge_src;
     BlockDirtyBitmapOrStrList *list = NULL;
 
     merge_src = g_new0(BlockDirtyBitmapOrStr, 1);
     merge_src->type = QTYPE_QDICT;
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 4/5] iotests: Test qemu-img checksum
  2022-11-28 14:15 [PATCH v2 0/5] Add qemu-img checksum command using blkhash Nir Soffer
                   ` (2 preceding siblings ...)
  2022-11-28 14:15 ` [PATCH v2 3/5] qemu-img: Add checksum command Nir Soffer
@ 2022-11-28 14:15 ` Nir Soffer
  2022-12-12 10:43   ` Hanna Reitz
  2022-11-28 14:15 ` [PATCH v2 5/5] qemu-img: Speed up checksum Nir Soffer
  4 siblings, 1 reply; 14+ messages in thread
From: Nir Soffer @ 2022-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé,
	Hanna Reitz, Nir Soffer

Add simple tests computing a checksum for image with all kinds of
extents in raw and qcow2 formats.

The test can be extended later for other formats, format options (e..g
compressed qcow2), protocols (e.g. nbd), and image with a backing chain,
but I'm not sure this is really needed.

To help debugging in case of failures, the output includes a json map of
the test image.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
---
 tests/qemu-iotests/tests/qemu-img-checksum    | 63 +++++++++++++++++++
 .../tests/qemu-img-checksum.out.qcow2         | 11 ++++
 .../tests/qemu-img-checksum.out.raw           | 10 +++
 3 files changed, 84 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/qemu-img-checksum
 create mode 100644 tests/qemu-iotests/tests/qemu-img-checksum.out.qcow2
 create mode 100644 tests/qemu-iotests/tests/qemu-img-checksum.out.raw

diff --git a/tests/qemu-iotests/tests/qemu-img-checksum b/tests/qemu-iotests/tests/qemu-img-checksum
new file mode 100755
index 0000000000..3577a0bc41
--- /dev/null
+++ b/tests/qemu-iotests/tests/qemu-img-checksum
@@ -0,0 +1,63 @@
+#!/usr/bin/env python3
+# group: rw auto quick
+#
+# Test cases for qemu-img checksum.
+#
+# Copyright (C) 2022 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+import re
+
+import iotests
+
+from iotests import (
+    filter_testfiles,
+    qemu_img,
+    qemu_img_log,
+    qemu_io,
+)
+
+
+def checksum_available():
+    out = qemu_img("--help").stdout
+    return re.search(r"\bchecksum .+ filename\b", out) is not None
+
+
+if not checksum_available():
+    iotests.notrun("checksum command not available")
+
+iotests.script_initialize(
+    supported_fmts=["raw", "qcow2"],
+    supported_cache_modes=["none", "writeback"],
+    supported_protocols=["file"],
+)
+
+print("=== Create test image ===\n")
+
+disk = iotests.file_path('disk')
+qemu_img("create", "-f", iotests.imgfmt, disk, "10m")
+qemu_io("-f", iotests.imgfmt,
+        "-c", "write -P 0x1 0 2m",      # data
+        "-c", "write -P 0x0 2m 2m",     # data with zeroes
+        "-c", "write -z 4m 2m",         # zero allocated
+        "-c", "write -z -u 6m 2m",      # zero hole
+                                        # unallocated
+        disk)
+print(filter_testfiles(disk))
+qemu_img_log("map", "--output", "json", disk)
+
+print("=== Compute checksum ===\n")
+
+qemu_img_log("checksum", "-T", iotests.cachemode, disk)
diff --git a/tests/qemu-iotests/tests/qemu-img-checksum.out.qcow2 b/tests/qemu-iotests/tests/qemu-img-checksum.out.qcow2
new file mode 100644
index 0000000000..02b9616e5b
--- /dev/null
+++ b/tests/qemu-iotests/tests/qemu-img-checksum.out.qcow2
@@ -0,0 +1,11 @@
+=== Create test image ===
+
+TEST_DIR/PID-disk
+[{ "start": 0, "length": 4194304, "depth": 0, "present": true, "zero": false, "data": true, "offset": 327680},
+{ "start": 4194304, "length": 4194304, "depth": 0, "present": true, "zero": true, "data": false},
+{ "start": 8388608, "length": 2097152, "depth": 0, "present": false, "zero": true, "data": false}]
+
+=== Compute checksum ===
+
+57cd8ef0cfad106d737f8fb0de3a0306a8a1a41db7bf7c0c36e2dfe75ee9bd26  TEST_DIR/PID-disk
+
diff --git a/tests/qemu-iotests/tests/qemu-img-checksum.out.raw b/tests/qemu-iotests/tests/qemu-img-checksum.out.raw
new file mode 100644
index 0000000000..6294e4dace
--- /dev/null
+++ b/tests/qemu-iotests/tests/qemu-img-checksum.out.raw
@@ -0,0 +1,10 @@
+=== Create test image ===
+
+TEST_DIR/PID-disk
+[{ "start": 0, "length": 4194304, "depth": 0, "present": true, "zero": false, "data": true, "offset": 0},
+{ "start": 4194304, "length": 6291456, "depth": 0, "present": true, "zero": true, "data": false, "offset": 4194304}]
+
+=== Compute checksum ===
+
+57cd8ef0cfad106d737f8fb0de3a0306a8a1a41db7bf7c0c36e2dfe75ee9bd26  TEST_DIR/PID-disk
+
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 5/5] qemu-img: Speed up checksum
  2022-11-28 14:15 [PATCH v2 0/5] Add qemu-img checksum command using blkhash Nir Soffer
                   ` (3 preceding siblings ...)
  2022-11-28 14:15 ` [PATCH v2 4/5] iotests: Test qemu-img checksum Nir Soffer
@ 2022-11-28 14:15 ` Nir Soffer
  2022-12-12 10:43   ` Hanna Reitz
  4 siblings, 1 reply; 14+ messages in thread
From: Nir Soffer @ 2022-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé,
	Hanna Reitz, Nir Soffer

Add coroutine based loop inspired by `qemu-img convert` design.

Changes compared to `qemu-img convert`:

- State for the entire image is kept in ImgChecksumState

- State for single worker coroutine is kept in ImgChecksumworker.

- "Writes" are always in-order, ensured using a queue.

- Calling block status once per image extent, when the current extent is
  consumed by the workers.

- Using 1m buffer size - testings shows that this gives best read
  performance both with buffered and direct I/O.

- Number of coroutines is not configurable. Testing does not show
  improvement when using more than 8 coroutines.

- Progress include entire image, not only the allocated state.

Comparing to the simple read loop shows that this version is up to 4.67
times faster when computing a checksum for an image full of zeroes. For
real images it is 1.59 times faster with direct I/O, and with buffered
I/O there is no difference.

Test results on Dell PowerEdge R640 in a CentOS Stream 9 container:

| image    | size | i/o       | before         | after          | change |
|----------|------|-----------|----------------|----------------|--------|
| zero [1] |   6g | buffered  | 1.600s ±0.014s | 0.342s ±0.016s |  x4.67 |
| zero     |   6g | direct    | 4.684s ±0.093s | 2.211s ±0.009s |  x2.12 |
| real [2] |   6g | buffered  | 1.841s ±0.075s | 1.806s ±0.036s |  x1.02 |
| real     |   6g | direct    | 3.094s ±0.079s | 1.947s ±0.017s |  x1.59 |
| nbd  [3] |   6g | buffered  | 2.455s ±0.183s | 1.808s ±0.016s |  x1.36 |
| nbd      |   6g | direct    | 3.540s ±0.020s | 1.749s ±0.018s |  x2.02 |

[1] raw image full of zeroes
[2] raw fedora 35 image with additional random data, 50% full
[3] image [2] exported by qemu-nbd via unix socket

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
---
 qemu-img.c | 350 ++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 277 insertions(+), 73 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 4b4ca7add3..5f63a769a9 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1618,50 +1618,296 @@ out:
     qemu_vfree(buf2);
     blk_unref(blk2);
 out2:
     blk_unref(blk1);
 out3:
     qemu_progress_end();
     return ret;
 }
 
 #ifdef CONFIG_BLKHASH
+
+#define CHECKSUM_COROUTINES 8
+#define CHECKSUM_BUF_SIZE (1 * MiB)
+#define CHECKSUM_ZERO_SIZE MIN(16 * GiB, SIZE_MAX)
+
+typedef struct ImgChecksumState ImgChecksumState;
+
+typedef struct ImgChecksumWorker {
+    QTAILQ_ENTRY(ImgChecksumWorker) entry;
+    ImgChecksumState *state;
+    Coroutine *co;
+    uint8_t *buf;
+
+    /* The current chunk. */
+    int64_t offset;
+    int64_t length;
+    bool zero;
+
+    /*
+     * Always true for zero extent, false for data extent. Set to true
+     * when reading the chunk completes.
+     */
+    bool ready;
+} ImgChecksumWorker;
+
+struct ImgChecksumState {
+    const char *filename;
+    BlockBackend *blk;
+    BlockDriverState *bs;
+    int64_t total_size;
+
+    /* Current extent, modified in checksum_co_next. */
+    int64_t offset;
+    int64_t length;
+    bool zero;
+
+    int running_coroutines;
+    CoMutex lock;
+    ImgChecksumWorker workers[CHECKSUM_COROUTINES];
+
+    /*
+     * Ensure in-order updates. Update are scheduled at the tail of the
+     * queue and processed from the head of the queue when a worker is
+     * ready.
+     */
+    QTAILQ_HEAD(, ImgChecksumWorker) update_queue;
+
+    struct blkhash *hash;
+    int ret;
+};
+
+static int checksum_block_status(ImgChecksumState *s)
+{
+    int64_t length;
+    int status;
+
+    /* Must be called when current extent is consumed. */
+    assert(s->length == 0);
+
+    status = bdrv_block_status_above(s->bs, NULL, s->offset,
+                                     s->total_size - s->offset, &length, NULL,
+                                     NULL);
+    if (status < 0) {
+        error_report("Error checking status at offset %" PRId64 " for %s",
+                     s->offset, s->filename);
+        s->ret = status;
+        return -1;
+    }
+
+    assert(length > 0);
+
+    s->length = length;
+    s->zero = !!(status & BDRV_BLOCK_ZERO);
+
+    return 0;
+}
+
+/**
+ * Grab the next chunk from the current extent, getting the next extent if
+ * needed, and schecule the next update at the end fo the update queue.
+ *
+ * Retrun true if the worker has work to do, false if the worker has
+ * finished or there was an error getting the next extent.
+ */
+static coroutine_fn bool checksum_co_next(ImgChecksumWorker *w)
+{
+    ImgChecksumState *s = w->state;
+
+    qemu_co_mutex_lock(&s->lock);
+
+    if (s->offset == s->total_size || s->ret != -EINPROGRESS) {
+        qemu_co_mutex_unlock(&s->lock);
+        return false;
+    }
+
+    if (s->length == 0 && checksum_block_status(s) < 0) {
+        qemu_co_mutex_unlock(&s->lock);
+        return false;
+    }
+
+    /* Grab one chunk from current extent. */
+    w->offset = s->offset;
+    w->length = MIN(s->length,
+                    s->zero ?  CHECKSUM_ZERO_SIZE : CHECKSUM_BUF_SIZE);
+    w->zero = s->zero;
+    w->ready = s->zero;
+
+    /* Advance state to the next chunk. */
+    s->offset += w->length;
+    s->length -= w->length;
+
+    /* Schedule this chunk update. */
+    QTAILQ_INSERT_TAIL(&s->update_queue, w, entry);
+
+    qemu_co_mutex_unlock(&s->lock);
+
+    return true;
+}
+
+/* Wait until this chunk is in the head of the update queue. */
+static void coroutine_fn checksum_co_wait_for_update(ImgChecksumWorker *w)
+{
+    ImgChecksumState *s = w->state;
+
+    /* Must be called only when we are ready to update the hash. */
+    assert(w->ready);
+
+    while (QTAILQ_FIRST(&s->update_queue) != w) {
+        qemu_coroutine_yield();
+    }
+
+    QTAILQ_REMOVE(&s->update_queue, w, entry);
+}
+
+/* Enter the next worker coroutine if the worker is ready. */
+static void coroutine_fn checksum_co_enter_next(ImgChecksumWorker *w)
+{
+    ImgChecksumState *s = w->state;
+    ImgChecksumWorker *next;
+
+    if (!QTAILQ_EMPTY(&s->update_queue)) {
+        next = QTAILQ_FIRST(&s->update_queue);
+        if (next->ready) {
+            qemu_coroutine_enter(next->co);
+        }
+    }
+}
+
+static int coroutine_fn checksum_co_read(ImgChecksumWorker *w)
+{
+    ImgChecksumState *s = w->state;
+    int err;
+
+    err = blk_co_pread(s->blk, w->offset, w->length, w->buf, 0);
+    if (err < 0) {
+        error_report("Error reading at offset %" PRId64 " of %s: %s",
+                     w->offset, s->filename, strerror(-err));
+        /* Unschedule this chunk. */
+        QTAILQ_REMOVE(&s->update_queue, w, entry);
+        s->ret = err;
+        return -1;
+    }
+
+    w->ready = true;
+
+    return 0;
+}
+
+static int checksum_update_hash(ImgChecksumWorker *w)
+{
+    ImgChecksumState *s = w->state;
+    int err;
+
+    if (w->zero) {
+        err = blkhash_zero(s->hash, w->length);
+        if (err) {
+            error_report("Error zeroing hash at offset %" PRId64 " of %s: %s",
+                         w->offset, s->filename, strerror(err));
+            s->ret = -err;
+            return -1;
+        }
+    } else {
+        err = blkhash_update(s->hash, w->buf, w->length);
+        if (err) {
+            error_report("Error updating hash at offset %" PRId64 " of %s: %s",
+                         w->offset, s->filename, strerror(err));
+            s->ret = -err;
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+static void coroutine_fn checksum_co_loop(void *opaque)
+{
+    ImgChecksumWorker *w = opaque;
+    ImgChecksumState *s = w->state;
+
+    s->running_coroutines++;
+    w->buf = blk_blockalign(s->blk, CHECKSUM_BUF_SIZE);
+
+    while (checksum_co_next(w)) {
+        if (!w->zero && checksum_co_read(w) < 0) {
+            break;
+        }
+        checksum_co_wait_for_update(w);
+        if (s->ret == -EINPROGRESS) {
+            if (checksum_update_hash(w) < 0) {
+                break;
+            }
+            qemu_progress_print(((float) w->length / s->total_size) * 100, 100);
+        }
+        checksum_co_enter_next(w);
+    }
+
+    qemu_vfree(w->buf);
+    w->co = NULL;
+    s->running_coroutines--;
+
+    if (s->running_coroutines == 0 && s->ret == -EINPROGRESS) {
+        /* the checksum job finished successfully */
+        s->ret = 0;
+    }
+}
+
+static int checksum_loop(ImgChecksumState *s)
+{
+    int i;
+
+    s->ret = -EINPROGRESS;
+
+    qemu_co_mutex_init(&s->lock);
+    QTAILQ_INIT(&s->update_queue);
+
+    for (i = 0; i < CHECKSUM_COROUTINES; i++) {
+        ImgChecksumWorker *w = &s->workers[i];
+        w->state = s;
+        w->co = qemu_coroutine_create(checksum_co_loop, w);
+        qemu_coroutine_enter(w->co);
+    }
+
+    while (s->running_coroutines) {
+        main_loop_wait(false);
+    }
+
+    return s->ret;
+}
+
 /*
  * Compute image checksum.
  */
 static int img_checksum(int argc, char **argv)
 {
     const char *digest_name = "sha256";
     const size_t block_size = 64 * KiB;
 
-    _Static_assert(QEMU_IS_ALIGNED(IO_BUF_SIZE, block_size),
-                   "IO_BUF_SIZE should be alligned to block_size");
+    _Static_assert(QEMU_IS_ALIGNED(CHECKSUM_BUF_SIZE, block_size),
+                   "CHECKSUM_BUF_SIZE should be alligned to block_size");
 
     const char *format = NULL;
     const char *cache = BDRV_DEFAULT_CACHE;
-    const char *filename;
-    BlockBackend *blk;
-    BlockDriverState *bs;
-    uint8_t *buf = NULL;
     bool progress = false;
+    bool image_opts = false;
+
     int flags = 0;
-    bool writethrough;
-    int64_t total_size;
-    int64_t offset = 0;
+    bool writethrough = false;
     int c;
-    bool image_opts = false;
-    struct blkhash *h = NULL;
-    unsigned char digest[64];
-    unsigned int digest_len;
+
     int err;
     int exit_status = 1;
 
+    ImgChecksumState s = {0};
+    unsigned char digest[64];
+    unsigned int digest_len;
+
     for (;;) {
         static const struct option long_options[] = {
             {"help", no_argument, 0, 'h'},
             {"object", required_argument, 0, OPTION_OBJECT},
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {0, 0, 0, 0}
         };
         c = getopt_long(argc, argv, ":hf:T:p",
                         long_options, NULL);
         if (c == -1) {
@@ -1692,118 +1938,76 @@ static int img_checksum(int argc, char **argv)
         case OPTION_IMAGE_OPTS:
             image_opts = true;
             break;
         }
     }
 
     if (optind != argc - 1) {
         error_exit("Expecting image file name");
     }
 
-    filename = argv[optind++];
+    s.filename = argv[optind++];
 
     err = bdrv_parse_cache_mode(cache, &flags, &writethrough);
     if (err < 0) {
         error_report("Invalid source cache option: %s", cache);
         return exit_status;
     }
 
-    blk = img_open(image_opts, filename, format, flags, writethrough, false,
-                   false);
-    if (!blk) {
+    s.blk = img_open(image_opts, s.filename, format, flags, writethrough,
+                     false, false);
+    if (!s.blk) {
         return exit_status;
     }
 
     /* Initialize before using goto out. */
     qemu_progress_init(progress, 2.0);
 
-    bs = blk_bs(blk);
-    buf = blk_blockalign(blk, IO_BUF_SIZE);
+    s.bs = blk_bs(s.blk);
 
-    total_size = blk_getlength(blk);
-    if (total_size < 0) {
+    s.total_size = blk_getlength(s.blk);
+    if (s.total_size < 0) {
         error_report("Can't get size of %s: %s",
-                     filename, strerror(-total_size));
+                     s.filename, strerror(-s.total_size));
         goto out;
     }
 
-    h = blkhash_new(block_size, digest_name);
-    if (!h) {
+    s.hash = blkhash_new(block_size, digest_name);
+    if (!s.hash) {
         error_report("Can't create blkhash: %s", strerror(errno));
         goto out;
     }
 
     qemu_progress_print(0, 100);
 
-    while (offset < total_size) {
-        int status;
-        int64_t chunk;
-
-        status = bdrv_block_status_above(bs, NULL, offset,
-                                         total_size - offset, &chunk, NULL,
-                                         NULL);
-        if (status < 0) {
-            error_report("Error checking status at offset %" PRId64 " for %s",
-                         offset, filename);
-            goto out;
-        }
-
-        assert(chunk);
-
-        if (status & BDRV_BLOCK_ZERO) {
-            chunk = MIN(chunk, SIZE_MAX);
-            err = blkhash_zero(h, chunk);
-            if (err) {
-                error_report("Error zeroing hash at offset %" PRId64
-                             " of %s: %s",
-                             offset, filename, strerror(err));
-                goto out;
-            }
-        } else {
-            chunk = MIN(chunk, IO_BUF_SIZE);
-            err = blk_pread(blk, offset, chunk, buf, 0);
-            if (err < 0) {
-                error_report("Error reading at offset %" PRId64 " of %s: %s",
-                             offset, filename, strerror(-err));
-                goto out;
-            }
-            err = blkhash_update(h, buf, chunk);
-            if (err) {
-                error_report("Error updating hash at offset %" PRId64
-                             " of %s: %s",
-                             offset, filename, strerror(err));
-                goto out;
-            }
-        }
-
-        offset += chunk;
-        qemu_progress_print(((float) chunk / total_size) * 100, 100);
+    err = checksum_loop(&s);
+    if (err) {
+        goto out;
     }
 
-    err = blkhash_final(h, digest, &digest_len);
+    err = blkhash_final(s.hash, digest, &digest_len);
     if (err) {
         error_report("Error finalizing hash of %s: %s",
-                     filename, strerror(err));
+                     s.filename, strerror(err));
         goto out;
     }
 
     for (unsigned i = 0; i < digest_len; i++) {
         printf("%02x", digest[i]);
     }
-    printf("  %s%s", filename, progress ? "" : "\n");
+    printf("  %s%s", s.filename, progress ? "" : "\n");
 
     exit_status = 0;
 
 out:
-    blkhash_free(h);
-    qemu_vfree(buf);
-    blk_unref(blk);
+    blkhash_free(s.hash);
+    blk_unref(s.blk);
     qemu_progress_end();
 
     return exit_status;
 }
 #endif /* CONFIG_BLKHASH */
 
 /* Convenience wrapper around qmp_block_dirty_bitmap_merge */
 static void do_dirty_bitmap_merge(const char *dst_node, const char *dst_name,
                                   const char *src_node, const char *src_name,
                                   Error **errp)
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/5] qemu-img.c: Move IO_BUF_SIZE to the top of the file
  2022-11-28 14:15 ` [PATCH v2 1/5] qemu-img.c: Move IO_BUF_SIZE to the top of the file Nir Soffer
@ 2022-12-12 10:35   ` Hanna Reitz
  0 siblings, 0 replies; 14+ messages in thread
From: Hanna Reitz @ 2022-12-12 10:35 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On 28.11.22 15:15, Nir Soffer wrote:
> This macro is used by various commands (compare, convert, rebase) but it
> is defined somewhere in the middle of the file. I'm going to use it in
> the new checksum command so lets clean up a bit before that.
> ---
>   qemu-img.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)

Looks good, but is missing your S-o-b – with it added:

Reviewed-by: Hanna Reitz <hreitz@redhat.com>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/5] Support format or cache specific out file
  2022-11-28 14:15 ` [PATCH v2 2/5] Support format or cache specific out file Nir Soffer
@ 2022-12-12 10:38   ` Hanna Reitz
  2022-12-13 15:56     ` Nir Soffer
  0 siblings, 1 reply; 14+ messages in thread
From: Hanna Reitz @ 2022-12-12 10:38 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On 28.11.22 15:15, Nir Soffer wrote:
> Extend the test finder to find tests with format (*.out.qcow2) or cache
> specific (*.out.nocache) out file. This worked before only for the
> numbered tests.
> ---
>   tests/qemu-iotests/findtests.py | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)

This patch lacks an S-o-b, too.

> diff --git a/tests/qemu-iotests/findtests.py b/tests/qemu-iotests/findtests.py
> index dd77b453b8..f4344ce78c 100644
> --- a/tests/qemu-iotests/findtests.py
> +++ b/tests/qemu-iotests/findtests.py
> @@ -38,31 +38,37 @@ def chdir(path: Optional[str] = None) -> Iterator[None]:
>           os.chdir(saved_dir)
>   
>   
>   class TestFinder:
>       def __init__(self, test_dir: Optional[str] = None) -> None:
>           self.groups = defaultdict(set)
>   
>           with chdir(test_dir):
>               self.all_tests = glob.glob('[0-9][0-9][0-9]')
>               self.all_tests += [f for f in glob.iglob('tests/*')
> -                               if not f.endswith('.out') and
> -                               os.path.isfile(f + '.out')]
> +                               if self.is_test(f)]

So previously a file was only considered a test file if there was a 
corresponding reference output file (`f + '.out'`), so files without 
such a reference output aren’t considered test files...

>               for t in self.all_tests:
>                   with open(t, encoding="utf-8") as f:
>                       for line in f:
>                           if line.startswith('# group: '):
>                               for g in line.split()[2:]:
>                                   self.groups[g].add(t)
>                               break
>   
> +    def is_test(self, fname: str) -> bool:
> +        """
> +        The tests directory contains tests (no extension) and out files
> +        (*.out, *.out.{format}, *.out.{option}).
> +        """
> +        return re.search(r'.+\.out(\.\w+)?$', fname) is None

...but this new function doesn’t check that.  I think we should check it 
(just whether there’s any variant of `/{fname}\.out(\.\w+)?/` to go with 
`fname`) so that behavior isn’t changed.

Hanna



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/5] qemu-img: Add checksum command
  2022-11-28 14:15 ` [PATCH v2 3/5] qemu-img: Add checksum command Nir Soffer
@ 2022-12-12 10:42   ` Hanna Reitz
  0 siblings, 0 replies; 14+ messages in thread
From: Hanna Reitz @ 2022-12-12 10:42 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On 28.11.22 15:15, Nir Soffer wrote:
> The checksum command compute a checksum for disk image content using the
> blkhash library[1]. The blkhash library is not packaged yet, but it is
> available via copr[2].
>
> Example run:
>
>      $ ./qemu-img checksum -p fedora-35.qcow2
>      6e5c00c995056319d52395f8d91c7f84725ae3da69ffcba4de4c7d22cff713a5  fedora-35.qcow2
>
> The block checksum is constructed by splitting the image to fixed sized
> blocks and computing a digest of every block. The image checksum is the
> digest of the all block digests.
>
> The checksum uses internally the "sha256" algorithm but it cannot be
> compared with checksums created by other tools such as `sha256sum`.
>
> The blkhash library supports sparse images, zero detection, and
> optimizes zero block hashing (they are practically free). The library
> uses multiple threads to speed up the computation.
>
> Comparing to `sha256sum`, `qemu-img checksum` is 3.5-4800[3] times
> faster, depending on the amount of data in the image:
>
>      $ ./qemu-img info /scratch/50p.raw
>      file format: raw
>      virtual size: 6 GiB (6442450944 bytes)
>      disk size: 2.91 GiB
>
>      $ hyperfine -w2 -r5 -p "sleep 1" "./qemu-img checksum /scratch/50p.raw" \
>                                       "sha256sum /scratch/50p.raw"
>      Benchmark 1: ./qemu-img checksum /scratch/50p.raw
>        Time (mean ± σ):      1.849 s ±  0.037 s    [User: 7.764 s, System: 0.962 s]
>        Range (min … max):    1.813 s …  1.908 s    5 runs
>
>      Benchmark 2: sha256sum /scratch/50p.raw
>        Time (mean ± σ):     14.585 s ±  0.072 s    [User: 13.537 s, System: 1.003 s]
>        Range (min … max):   14.501 s … 14.697 s    5 runs
>
>      Summary
>        './qemu-img checksum /scratch/50p.raw' ran
>          7.89 ± 0.16 times faster than 'sha256sum /scratch/50p.raw'
>
> The new command is available only when `blkhash` is available during
> build. To test the new command please install the `blkhash-devel`
> package:
>
>      $ dnf copr enable nsoffer/blkhash
>      $ sudo dnf install blkhash-devel
>
> [1] https://gitlab.com/nirs/blkhash
> [2] https://copr.fedorainfracloud.org/coprs/nsoffer/blkhash/
> [3] Computing checksum for 8T empty image: qemu-img checksum: 3.7s,
>      sha256sum (estimate): 17,749s
>
> Signed-off-by: Nir Soffer <nsoffer@redhat.com>
> ---
>   docs/tools/qemu-img.rst |  24 ++++++
>   meson.build             |  10 ++-
>   meson_options.txt       |   2 +
>   qemu-img-cmds.hx        |   8 ++
>   qemu-img.c              | 183 ++++++++++++++++++++++++++++++++++++++++
>   5 files changed, 226 insertions(+), 1 deletion(-)

[...]

> diff --git a/qemu-img.c b/qemu-img.c
> index c03d6b4b31..4b4ca7add3 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c

[...]

> @@ -1613,20 +1617,199 @@ out:
>       qemu_vfree(buf1);
>       qemu_vfree(buf2);
>       blk_unref(blk2);
>   out2:
>       blk_unref(blk1);
>   out3:
>       qemu_progress_end();
>       return ret;
>   }
>   
> +#ifdef CONFIG_BLKHASH
> +/*
> + * Compute image checksum.
> + */
> +static int img_checksum(int argc, char **argv)
> +{
> +    const char *digest_name = "sha256";
> +    const size_t block_size = 64 * KiB;
> +
> +    _Static_assert(QEMU_IS_ALIGNED(IO_BUF_SIZE, block_size),
> +                   "IO_BUF_SIZE should be alligned to block_size");

(s/alligned/aligned/)

A suggestion: We have a `QEMU_BUILD_BUG_MSG()` macro in 
include/qemu/compiler.h.  Nowadays it just unconditionally resolves to 
_Static_assert, I think before C11 was adopted it used a custom 
implementation.  Still, it is what seems to be used throughout the 
actual qemu code (disregarding roms/ and pc-bios/), so I think it would 
be more fitting to use.

But that’s just a suggestion.  It always resolves to _Static_assert 
anyway, so using _Static_assert seems by no means wrong.

So with the spelling fixed:

Reviewed-by: Hanna Reitz <hreitz@redhat.com>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 4/5] iotests: Test qemu-img checksum
  2022-11-28 14:15 ` [PATCH v2 4/5] iotests: Test qemu-img checksum Nir Soffer
@ 2022-12-12 10:43   ` Hanna Reitz
  0 siblings, 0 replies; 14+ messages in thread
From: Hanna Reitz @ 2022-12-12 10:43 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On 28.11.22 15:15, Nir Soffer wrote:
> Add simple tests computing a checksum for image with all kinds of
> extents in raw and qcow2 formats.
>
> The test can be extended later for other formats, format options (e..g
> compressed qcow2), protocols (e.g. nbd), and image with a backing chain,
> but I'm not sure this is really needed.
>
> To help debugging in case of failures, the output includes a json map of
> the test image.
>
> Signed-off-by: Nir Soffer <nsoffer@redhat.com>
> ---
>   tests/qemu-iotests/tests/qemu-img-checksum    | 63 +++++++++++++++++++
>   .../tests/qemu-img-checksum.out.qcow2         | 11 ++++
>   .../tests/qemu-img-checksum.out.raw           | 10 +++
>   3 files changed, 84 insertions(+)
>   create mode 100755 tests/qemu-iotests/tests/qemu-img-checksum
>   create mode 100644 tests/qemu-iotests/tests/qemu-img-checksum.out.qcow2
>   create mode 100644 tests/qemu-iotests/tests/qemu-img-checksum.out.raw

Thanks!

Reviewed-by: Hanna Reitz <hreitz@redhat.com>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 5/5] qemu-img: Speed up checksum
  2022-11-28 14:15 ` [PATCH v2 5/5] qemu-img: Speed up checksum Nir Soffer
@ 2022-12-12 10:43   ` Hanna Reitz
  0 siblings, 0 replies; 14+ messages in thread
From: Hanna Reitz @ 2022-12-12 10:43 UTC (permalink / raw)
  To: Nir Soffer, qemu-devel
  Cc: Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On 28.11.22 15:15, Nir Soffer wrote:
> Add coroutine based loop inspired by `qemu-img convert` design.
>
> Changes compared to `qemu-img convert`:
>
> - State for the entire image is kept in ImgChecksumState
>
> - State for single worker coroutine is kept in ImgChecksumworker.
>
> - "Writes" are always in-order, ensured using a queue.
>
> - Calling block status once per image extent, when the current extent is
>    consumed by the workers.
>
> - Using 1m buffer size - testings shows that this gives best read
>    performance both with buffered and direct I/O.
>
> - Number of coroutines is not configurable. Testing does not show
>    improvement when using more than 8 coroutines.
>
> - Progress include entire image, not only the allocated state.
>
> Comparing to the simple read loop shows that this version is up to 4.67
> times faster when computing a checksum for an image full of zeroes. For
> real images it is 1.59 times faster with direct I/O, and with buffered
> I/O there is no difference.
>
> Test results on Dell PowerEdge R640 in a CentOS Stream 9 container:
>
> | image    | size | i/o       | before         | after          | change |
> |----------|------|-----------|----------------|----------------|--------|
> | zero [1] |   6g | buffered  | 1.600s ±0.014s | 0.342s ±0.016s |  x4.67 |
> | zero     |   6g | direct    | 4.684s ±0.093s | 2.211s ±0.009s |  x2.12 |
> | real [2] |   6g | buffered  | 1.841s ±0.075s | 1.806s ±0.036s |  x1.02 |
> | real     |   6g | direct    | 3.094s ±0.079s | 1.947s ±0.017s |  x1.59 |
> | nbd  [3] |   6g | buffered  | 2.455s ±0.183s | 1.808s ±0.016s |  x1.36 |
> | nbd      |   6g | direct    | 3.540s ±0.020s | 1.749s ±0.018s |  x2.02 |
>
> [1] raw image full of zeroes
> [2] raw fedora 35 image with additional random data, 50% full
> [3] image [2] exported by qemu-nbd via unix socket
>
> Signed-off-by: Nir Soffer <nsoffer@redhat.com>
> ---
>   qemu-img.c | 350 ++++++++++++++++++++++++++++++++++++++++++-----------
>   1 file changed, 277 insertions(+), 73 deletions(-)

Reviewed-by: Hanna Reitz <hreitz@redhat.com>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/5] Support format or cache specific out file
  2022-12-12 10:38   ` Hanna Reitz
@ 2022-12-13 15:56     ` Nir Soffer
  2022-12-13 18:09       ` Hanna Reitz
  0 siblings, 1 reply; 14+ messages in thread
From: Nir Soffer @ 2022-12-13 15:56 UTC (permalink / raw)
  To: Hanna Reitz
  Cc: qemu-devel, Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On Mon, Dec 12, 2022 at 12:38 PM Hanna Reitz <hreitz@redhat.com> wrote:
>
> On 28.11.22 15:15, Nir Soffer wrote:
> > Extend the test finder to find tests with format (*.out.qcow2) or cache
> > specific (*.out.nocache) out file. This worked before only for the
> > numbered tests.
> > ---
> >   tests/qemu-iotests/findtests.py | 10 ++++++++--
> >   1 file changed, 8 insertions(+), 2 deletions(-)
>
> This patch lacks an S-o-b, too.
>
> > diff --git a/tests/qemu-iotests/findtests.py b/tests/qemu-iotests/findtests.py
> > index dd77b453b8..f4344ce78c 100644
> > --- a/tests/qemu-iotests/findtests.py
> > +++ b/tests/qemu-iotests/findtests.py
> > @@ -38,31 +38,37 @@ def chdir(path: Optional[str] = None) -> Iterator[None]:
> >           os.chdir(saved_dir)
> >
> >
> >   class TestFinder:
> >       def __init__(self, test_dir: Optional[str] = None) -> None:
> >           self.groups = defaultdict(set)
> >
> >           with chdir(test_dir):
> >               self.all_tests = glob.glob('[0-9][0-9][0-9]')
> >               self.all_tests += [f for f in glob.iglob('tests/*')
> > -                               if not f.endswith('.out') and
> > -                               os.path.isfile(f + '.out')]
> > +                               if self.is_test(f)]
>
> So previously a file was only considered a test file if there was a
> corresponding reference output file (`f + '.out'`), so files without
> such a reference output aren’t considered test files...
>
> >               for t in self.all_tests:
> >                   with open(t, encoding="utf-8") as f:
> >                       for line in f:
> >                           if line.startswith('# group: '):
> >                               for g in line.split()[2:]:
> >                                   self.groups[g].add(t)
> >                               break
> >
> > +    def is_test(self, fname: str) -> bool:
> > +        """
> > +        The tests directory contains tests (no extension) and out files
> > +        (*.out, *.out.{format}, *.out.{option}).
> > +        """
> > +        return re.search(r'.+\.out(\.\w+)?$', fname) is None
>
> ...but this new function doesn’t check that.  I think we should check it
> (just whether there’s any variant of `/{fname}\.out(\.\w+)?/` to go with
> `fname`) so that behavior isn’t changed.

This means that you cannot add a test without a *.out* file, which may
 be useful when you don't use the out file for validation, but we can
add this later if needed.

I'll change the code to check both conditions.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/5] Support format or cache specific out file
  2022-12-13 15:56     ` Nir Soffer
@ 2022-12-13 18:09       ` Hanna Reitz
  2022-12-13 19:53         ` Nir Soffer
  0 siblings, 1 reply; 14+ messages in thread
From: Hanna Reitz @ 2022-12-13 18:09 UTC (permalink / raw)
  To: Nir Soffer
  Cc: qemu-devel, Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On 13.12.22 16:56, Nir Soffer wrote:
> On Mon, Dec 12, 2022 at 12:38 PM Hanna Reitz <hreitz@redhat.com> wrote:
>> On 28.11.22 15:15, Nir Soffer wrote:
>>> Extend the test finder to find tests with format (*.out.qcow2) or cache
>>> specific (*.out.nocache) out file. This worked before only for the
>>> numbered tests.
>>> ---
>>>    tests/qemu-iotests/findtests.py | 10 ++++++++--
>>>    1 file changed, 8 insertions(+), 2 deletions(-)
>> This patch lacks an S-o-b, too.
>>
>>> diff --git a/tests/qemu-iotests/findtests.py b/tests/qemu-iotests/findtests.py
>>> index dd77b453b8..f4344ce78c 100644
>>> --- a/tests/qemu-iotests/findtests.py
>>> +++ b/tests/qemu-iotests/findtests.py
>>> @@ -38,31 +38,37 @@ def chdir(path: Optional[str] = None) -> Iterator[None]:
>>>            os.chdir(saved_dir)
>>>
>>>
>>>    class TestFinder:
>>>        def __init__(self, test_dir: Optional[str] = None) -> None:
>>>            self.groups = defaultdict(set)
>>>
>>>            with chdir(test_dir):
>>>                self.all_tests = glob.glob('[0-9][0-9][0-9]')
>>>                self.all_tests += [f for f in glob.iglob('tests/*')
>>> -                               if not f.endswith('.out') and
>>> -                               os.path.isfile(f + '.out')]
>>> +                               if self.is_test(f)]
>> So previously a file was only considered a test file if there was a
>> corresponding reference output file (`f + '.out'`), so files without
>> such a reference output aren’t considered test files...
>>
>>>                for t in self.all_tests:
>>>                    with open(t, encoding="utf-8") as f:
>>>                        for line in f:
>>>                            if line.startswith('# group: '):
>>>                                for g in line.split()[2:]:
>>>                                    self.groups[g].add(t)
>>>                                break
>>>
>>> +    def is_test(self, fname: str) -> bool:
>>> +        """
>>> +        The tests directory contains tests (no extension) and out files
>>> +        (*.out, *.out.{format}, *.out.{option}).
>>> +        """
>>> +        return re.search(r'.+\.out(\.\w+)?$', fname) is None
>> ...but this new function doesn’t check that.  I think we should check it
>> (just whether there’s any variant of `/{fname}\.out(\.\w+)?/` to go with
>> `fname`) so that behavior isn’t changed.
> This means that you cannot add a test without a *.out* file, which may
>   be useful when you don't use the out file for validation, but we can
> add this later if needed.

I don’t think tests work without a reference output, do they?  At least 
a couple of years ago, the ./check script would refuse to run tests 
without a corresponding .out file.

Hanna



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/5] Support format or cache specific out file
  2022-12-13 18:09       ` Hanna Reitz
@ 2022-12-13 19:53         ` Nir Soffer
  0 siblings, 0 replies; 14+ messages in thread
From: Nir Soffer @ 2022-12-13 19:53 UTC (permalink / raw)
  To: Hanna Reitz
  Cc: qemu-devel, Thomas Huth, Paolo Bonzini, Kevin Wolf,
	Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-block, Daniel P. Berrangé

On Tue, Dec 13, 2022 at 8:09 PM Hanna Reitz <hreitz@redhat.com> wrote:
>
> On 13.12.22 16:56, Nir Soffer wrote:
> > On Mon, Dec 12, 2022 at 12:38 PM Hanna Reitz <hreitz@redhat.com> wrote:
> >> On 28.11.22 15:15, Nir Soffer wrote:
> >>> Extend the test finder to find tests with format (*.out.qcow2) or cache
> >>> specific (*.out.nocache) out file. This worked before only for the
> >>> numbered tests.
> >>> ---
> >>>    tests/qemu-iotests/findtests.py | 10 ++++++++--
> >>>    1 file changed, 8 insertions(+), 2 deletions(-)
> >> This patch lacks an S-o-b, too.
> >>
> >>> diff --git a/tests/qemu-iotests/findtests.py b/tests/qemu-iotests/findtests.py
> >>> index dd77b453b8..f4344ce78c 100644
> >>> --- a/tests/qemu-iotests/findtests.py
> >>> +++ b/tests/qemu-iotests/findtests.py
> >>> @@ -38,31 +38,37 @@ def chdir(path: Optional[str] = None) -> Iterator[None]:
> >>>            os.chdir(saved_dir)
> >>>
> >>>
> >>>    class TestFinder:
> >>>        def __init__(self, test_dir: Optional[str] = None) -> None:
> >>>            self.groups = defaultdict(set)
> >>>
> >>>            with chdir(test_dir):
> >>>                self.all_tests = glob.glob('[0-9][0-9][0-9]')
> >>>                self.all_tests += [f for f in glob.iglob('tests/*')
> >>> -                               if not f.endswith('.out') and
> >>> -                               os.path.isfile(f + '.out')]
> >>> +                               if self.is_test(f)]
> >> So previously a file was only considered a test file if there was a
> >> corresponding reference output file (`f + '.out'`), so files without
> >> such a reference output aren’t considered test files...
> >>
> >>>                for t in self.all_tests:
> >>>                    with open(t, encoding="utf-8") as f:
> >>>                        for line in f:
> >>>                            if line.startswith('# group: '):
> >>>                                for g in line.split()[2:]:
> >>>                                    self.groups[g].add(t)
> >>>                                break
> >>>
> >>> +    def is_test(self, fname: str) -> bool:
> >>> +        """
> >>> +        The tests directory contains tests (no extension) and out files
> >>> +        (*.out, *.out.{format}, *.out.{option}).
> >>> +        """
> >>> +        return re.search(r'.+\.out(\.\w+)?$', fname) is None
> >> ...but this new function doesn’t check that.  I think we should check it
> >> (just whether there’s any variant of `/{fname}\.out(\.\w+)?/` to go with
> >> `fname`) so that behavior isn’t changed.
> > This means that you cannot add a test without a *.out* file, which may
> >   be useful when you don't use the out file for validation, but we can
> > add this later if needed.
>
> I don’t think tests work without a reference output, do they?  At least
> a couple of years ago, the ./check script would refuse to run tests
> without a corresponding .out file.

This may be true, but most tests do not really need an out file and better be
verified by asserting. There are some python tests that have pointless out
file with the output of python unittest:

    $ cat tests/qemu-iotests/tests/nbd-multiconn.out
    ...
    ----------------------------------------------------------------------
    Ran 3 tests

    OK

This is not only unhelpful (update the output when adding a 4th test)
but fragile.
if unitests changes the output, maybe adding info about skipped tests, or
changing "---" to "****", the test will break.

But for now I agree the test framework should keep the current behavior.

Nir



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-12-13 19:54 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-28 14:15 [PATCH v2 0/5] Add qemu-img checksum command using blkhash Nir Soffer
2022-11-28 14:15 ` [PATCH v2 1/5] qemu-img.c: Move IO_BUF_SIZE to the top of the file Nir Soffer
2022-12-12 10:35   ` Hanna Reitz
2022-11-28 14:15 ` [PATCH v2 2/5] Support format or cache specific out file Nir Soffer
2022-12-12 10:38   ` Hanna Reitz
2022-12-13 15:56     ` Nir Soffer
2022-12-13 18:09       ` Hanna Reitz
2022-12-13 19:53         ` Nir Soffer
2022-11-28 14:15 ` [PATCH v2 3/5] qemu-img: Add checksum command Nir Soffer
2022-12-12 10:42   ` Hanna Reitz
2022-11-28 14:15 ` [PATCH v2 4/5] iotests: Test qemu-img checksum Nir Soffer
2022-12-12 10:43   ` Hanna Reitz
2022-11-28 14:15 ` [PATCH v2 5/5] qemu-img: Speed up checksum Nir Soffer
2022-12-12 10:43   ` Hanna Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.