All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hanna Reitz <hreitz@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>, qemu-block@nongnu.org
Cc: "Kevin Wolf" <kwolf@redhat.com>, "Fam Zheng" <fam@euphon.net>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@virtuozzo.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	qemu-devel@nongnu.org, "John Snow" <jsnow@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v4 19/25] block_int-common.h: split function pointers in BlockDriver
Date: Mon, 15 Nov 2021 13:00:07 +0100	[thread overview]
Message-ID: <619cd983-4339-f70d-af25-678f5a7ebd83@redhat.com> (raw)
In-Reply-To: <20211025101735.2060852-20-eesposit@redhat.com>

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:
> Similar to the header split, also the function pointers in BlockDriver
> can be split in I/O and global state.
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   include/block/block_int-common.h | 458 ++++++++++++++++---------------
>   1 file changed, 237 insertions(+), 221 deletions(-)
>
> diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
> index 79a3d801d2..9857e775fe 100644
> --- a/include/block/block_int-common.h
> +++ b/include/block/block_int-common.h
> @@ -96,6 +96,7 @@ typedef struct BdrvTrackedRequest {
>   
>   
>   struct BlockDriver {
> +    /* Fields initialized in struct definition and never changed. */

I find this a bit difficult to understand.  Perhaps we could be more 
verbose to make it easier to read?  Like

“These fields are initialized when this object is created, and are never 
changed afterwards”

And I’d also add an empty line below to make visually clear that this 
does not describe the single subsequent field (format_name) but the 
whole chunk of fields.

I also wonder if we could substantiate the claim “never changed 
afterwards” by making these fields consts.  Then again, I suppose 
technically none of the fields in BlockDriver is supposed to be mutable 
(except for .list), neither these fields nor the function pointers...  
Yeah, maybe scratch that idea.

>       const char *format_name;
>       int instance_size;

[...]

> +    int coroutine_fn (*bdrv_save_vmstate)(BlockDriverState *bs,
> +                                          QEMUIOVector *qiov,
> +                                          int64_t pos);
> +    int coroutine_fn (*bdrv_load_vmstate)(BlockDriverState *bs,
> +                                          QEMUIOVector *qiov,
> +                                          int64_t pos);

In patch 18, you classified bdrv_co_readv_vmstate() and 
bdrv_co_writev_vmstate() as I/O functions.  They call these methods, 
though, so something doesn’t seem quite right.

Now I also remember you classified the global bdrv_save_vmstate() and 
bdrv_load_vmstate() functions as GS.  I don’t mind either way; AFAIU 
they are I/O functions that are generally called under the BQL.  But the 
classification should be consistent, of course.

> +
> +    int (*bdrv_change_backing_file)(BlockDriverState *bs,
> +        const char *backing_file, const char *backing_fmt);
> +
> +    void (*bdrv_eject)(BlockDriverState *bs, bool eject_flag);
> +    void (*bdrv_lock_medium)(BlockDriverState *bs, bool locked);

Above bdrv_is_inserted(), there’s a comment (“removable device 
specific”) that I think should be duplicated here.

> +
> +    /* TODO Better pass a option string/QDict/QemuOpts to add any rule? */
> +    int (*bdrv_debug_breakpoint)(BlockDriverState *bs, const char *event,
> +        const char *tag);
> +    int (*bdrv_debug_remove_breakpoint)(BlockDriverState *bs,
> +        const char *tag);
> +    int (*bdrv_debug_resume)(BlockDriverState *bs, const char *tag);
> +    bool (*bdrv_debug_is_suspended)(BlockDriverState *bs, const char *tag);
> +    void (*bdrv_refresh_limits)(BlockDriverState *bs, Error **errp);

Originally, there was an empty line before bdrv_refresh_limits(), and 
I’d keep it.

[...]

> +    /**
> +     * Try to get @bs's logical and physical block size.
> +     * On success, store them in @bsz and return zero.
> +     * On failure, return negative errno.
> +     */
> +    int (*bdrv_probe_blocksizes)(BlockDriverState *bs, BlockSizes *bsz);

Originally, there was a comment above this function (“I/O API, even 
though if it's a filter jumps on parent”) that you added in patch 6 and 
that I didn’t understand.  Given this, I’m not unhappy to see it go 
again, but now I wonder about its purpose even more...

> +    /**
> +     * Try to get @bs's geometry (cyls, heads, sectors)
> +     * On success, store them in @geo and return 0.
> +     * On failure return -errno.
> +     * Only drivers that want to override guest geometry implement this
> +     * callback; see hd_geometry_guess().
> +     */
> +    int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);

(Here, too)

[...]

> +    /**
> +     * Register/unregister a buffer for I/O. For example, when the driver is
> +     * interested to know the memory areas that will later be used in iovs, so
> +     * that it can do IOMMU mapping with VFIO etc., in order to get better
> +     * performance. In the case of VFIO drivers, this callback is used to do
> +     * DMA mapping for hot buffers.
> +     */
> +    void (*bdrv_register_buf)(BlockDriverState *bs, void *host, size_t size);
> +    void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host);
> +    QLIST_ENTRY(BlockDriver) list;

This field is interesting because it’s the only mutable field in the 
whole struct.  I think it should be separated from the function pointers 
above and given a comment like “This field is modified only under the 
BQL, and is part of the global state”.

> +
> +    /*
> +     * I/O API functions. These functions are thread-safe.
> +     *
> +     * See include/block/block-io.h for more information about
> +     * the I/O API.
> +     */
> +
> +    int coroutine_fn (*bdrv_co_create)(BlockdevCreateOptions *opts,
> +                                       Error **errp);
> +    int coroutine_fn (*bdrv_co_create_opts)(BlockDriver *drv,
> +                                            const char *filename,
> +                                            QemuOpts *opts,
> +                                            Error **errp);

Now this is really interesting.  Technically I suppose these should work 
in any thread, but trying to do so results in:

$ touch /tmp/iothread-create-test.qcow2
$ ./qemu-system-x86_64 -object iothread,id=iothr0 -qmp stdio <<EOF
{"execute": "qmp_capabilities"}
{"execute":"blockdev-add","arguments":{"node-name":"proto","driver":"file","filename":"/tmp/iothread-create-test.qcow2"}}
{"execute":"x-blockdev-set-iothread","arguments":{"node-name":"proto","iothread":"iothr0"}}
{"execute":"blockdev-create","arguments":{"job-id":"create","options":{"driver":"qcow2","file":"proto","size":0}}}
EOF
{"QMP": {"version": {"qemu": {"micro": 90, "minor": 1, "major": 6}, 
"package": "v6.2.0-rc0-40-gd02d5fe5fb-dirty"}, "capabilities": ["oob"]}}
{"return": {}}
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": 1636973542, "microseconds": 338117}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "created", "id": "create"}}
{"timestamp": {"seconds": 1636973542, "microseconds": 338197}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "create"}}
{"return": {}}
qemu: qemu_mutex_unlock_impl: Operation not permitted
[1]    86154 IOT instruction (core dumped)  ./qemu-system-x86_64 -object 
iothread,id=iothr0 -qmp stdio <<<''

So something’s fishy and perhaps we should investigate this...  I mean, 
I can’t really imagine a case where someone would need to run a 
blockdev-create job in an I/O thread, but right now the interface allows 
for it.

And then bdrv_create() is classified as global state, and also 
bdrv_co_create_opts_simple(), which is supposed to be a drop-in function 
for this .bdrv_co_create_opts function.  So that can’t work.

Also, I believe there might have been some functions you classified as 
GS that are called from .create* implementations.  I accepted that, 
given the abort I sketched above.  However, if we classify image 
creation as I/O, then those would need to be re-evaluated. For example, 
qcow2_co_create_opts() calls bdrv_create_file(), which is a GS function.

Some of this issues could be addressed by making .bdrv_co_create_opts a 
GS function and .bdrv_co_create an I/O function.  I believe that would 
be the ideal split, even though as shown above .bdrv_co_create doesn’t 
work in an I/O thread, and then you have the issue of probably all 
format drivers’ .bdrv_co_create implementations calling 
bdrv_open_blockdev_ref(), which is a GS function.

(VMDK even calls blk_new_open(), blk_new_with_bs(), and blk_unref(), 
none of which can ever be I/O functions, I think.)

I believe in practice the best is to for now classify all create-related 
functions as GS functions.  This is supported by the fact that 
qmp_blockdev_create() specifically creates the create job in the main 
context (with a TODO comment) and requires block drivers to error out 
when they encounter a node in a different AioContext.

> +    int coroutine_fn (*bdrv_co_amend)(BlockDriverState *bs,
> +                                      BlockdevAmendOptions *opts,
> +                                      bool force,
> +                                      Error **errp);
> +
>       /* aio */
>       BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
>           int64_t offset, int64_t bytes, QEMUIOVector *qiov,

[...]

> @@ -443,47 +632,20 @@ struct BlockDriver {

Right above here (i.e. line 631), there’s an attribute 
`has_variable_length`.  I believe that should go up to the immutable 
fields section.

>       int64_t (*bdrv_get_allocated_file_size)(BlockDriverState *bs);
>       BlockMeasureInfo *(*bdrv_measure)(QemuOpts *opts, BlockDriverState *in_bs,
>                                         Error **errp);
> -

I’d keep this empty line.

Hanna

>       int coroutine_fn (*bdrv_co_pwritev_compressed)(BlockDriverState *bs,
>           int64_t offset, int64_t bytes, QEMUIOVector *qiov);
>       int coroutine_fn (*bdrv_co_pwritev_compressed_part)(BlockDriverState *bs,
>           int64_t offset, int64_t bytes, QEMUIOVector *qiov,
>           size_t qiov_offset);



  reply	other threads:[~2021-11-15 12:02 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-25 10:17 [PATCH v4 00/25] block layer: split block APIs in global state and I/O Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 01/25] main-loop.h: introduce qemu_in_main_thread() Emanuele Giuseppe Esposito
2021-10-25 11:33   ` Philippe Mathieu-Daudé
2021-10-25 10:17 ` [PATCH v4 02/25] include/block/block: split header into I/O and global state API Emanuele Giuseppe Esposito
2021-10-25 11:37   ` Philippe Mathieu-Daudé
2021-10-25 12:22     ` Emanuele Giuseppe Esposito
2021-11-11 15:00   ` Hanna Reitz
2021-11-15 12:08     ` Emanuele Giuseppe Esposito
2021-11-12 12:25   ` Hanna Reitz
2021-11-16 14:00     ` Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 03/25] assertions for block " Emanuele Giuseppe Esposito
2021-11-11 16:32   ` Hanna Reitz
2021-11-15 12:27     ` Emanuele Giuseppe Esposito
2021-11-15 15:27       ` Hanna Reitz
2021-11-12 11:31   ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 04/25] include/sysemu/block-backend: split header into I/O and global state (GS) API Emanuele Giuseppe Esposito
2021-11-12 10:23   ` Hanna Reitz
2021-11-16 10:16     ` Emanuele Giuseppe Esposito
2021-11-12 12:30   ` Hanna Reitz
2021-11-16 14:24     ` Emanuele Giuseppe Esposito
2021-11-16 15:07       ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 05/25] block/block-backend.c: assertions for block-backend Emanuele Giuseppe Esposito
2021-11-12 11:01   ` Hanna Reitz
2021-11-16 10:15     ` Emanuele Giuseppe Esposito
2021-11-16 12:29       ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 06/25] include/block/block_int: split header into I/O and global state API Emanuele Giuseppe Esposito
2021-11-12 12:17   ` Hanna Reitz
2021-11-16 10:24     ` Emanuele Giuseppe Esposito
2021-11-16 12:30       ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 07/25] assertions for block_int " Emanuele Giuseppe Esposito
2021-11-12 13:51   ` Hanna Reitz
2021-11-16 15:43     ` Emanuele Giuseppe Esposito
2021-11-16 16:46       ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 08/25] block: introduce assert_bdrv_graph_writable Emanuele Giuseppe Esposito
2021-11-12 14:40   ` Hanna Reitz
2021-11-18  9:55     ` Emanuele Giuseppe Esposito
2021-11-18 10:24       ` Emanuele Giuseppe Esposito
2021-11-18 15:17       ` Hanna Reitz
2021-11-19  8:55         ` Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 09/25] include/block/blockjob_int.h: split header into I/O and GS API Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 10/25] assertions for blockjob_int.h Emanuele Giuseppe Esposito
2021-11-12 15:17   ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 11/25] include/block/blockjob.h: global state API Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 12/25] assertions for blockob.h " Emanuele Giuseppe Esposito
2021-11-12 15:26   ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 13/25] include/sysemu/blockdev.h: move drive_add and inline drive_def Emanuele Giuseppe Esposito
2021-11-12 15:41   ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 14/25] include/systemu/blockdev.h: global state API Emanuele Giuseppe Esposito
2021-10-28 15:48   ` Stefan Hajnoczi
2021-11-12 15:46   ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 15/25] assertions for blockdev.h " Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 16/25] include/block/snapshot: global state API + assertions Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 17/25] block/copy-before-write.h: " Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 18/25] block/coroutines: I/O API Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 19/25] block_int-common.h: split function pointers in BlockDriver Emanuele Giuseppe Esposito
2021-11-15 12:00   ` Hanna Reitz [this message]
2021-11-18 12:42     ` Emanuele Giuseppe Esposito
2021-10-25 10:17 ` [PATCH v4 20/25] block_int-common.h: assertion in the callers of BlockDriver function pointers Emanuele Giuseppe Esposito
2021-11-15 12:48   ` Hanna Reitz
2021-11-15 14:15     ` Hanna Reitz
2021-11-17 11:33     ` Emanuele Giuseppe Esposito
2021-11-17 12:51       ` Hanna Reitz
2021-11-17 13:09         ` Emanuele Giuseppe Esposito
2021-11-17 13:34           ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 21/25] block_int-common.h: split function pointers in BdrvChildClass Emanuele Giuseppe Esposito
2021-11-15 14:36   ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 22/25] block_int-common.h: assertions in the callers of BdrvChildClass function pointers Emanuele Giuseppe Esposito
2021-11-15 14:48   ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 23/25] block-backend-common.h: split function pointers in BlockDevOps Emanuele Giuseppe Esposito
2021-10-25 14:10   ` Philippe Mathieu-Daudé
2021-10-25 10:17 ` [PATCH v4 24/25] job.h: split function pointers in JobDriver Emanuele Giuseppe Esposito
2021-11-15 15:11   ` Hanna Reitz
2021-11-17 13:43     ` Emanuele Giuseppe Esposito
2021-11-17 13:44       ` Hanna Reitz
2021-10-25 10:17 ` [PATCH v4 25/25] job.h: assertions in the callers of JobDriver funcion pointers Emanuele Giuseppe Esposito
2021-10-25 14:09 ` [PATCH v4 00/25] block layer: split block APIs in global state and I/O Philippe Mathieu-Daudé
2021-10-28 15:45   ` Stefan Hajnoczi
2021-10-28 15:49 ` Stefan Hajnoczi
2021-11-15 16:03 ` Hanna Reitz
2021-11-15 16:11   ` Daniel P. Berrangé
2021-11-18 13:50   ` Paolo Bonzini
2021-11-18 15:31     ` Hanna Reitz
2021-11-19  3:13       ` Paolo Bonzini
2021-11-19 10:42         ` Emanuele Giuseppe Esposito
2021-11-18 14:04   ` Paolo Bonzini
2021-11-18 15:22     ` Hanna Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=619cd983-4339-f70d-af25-678f5a7ebd83@redhat.com \
    --to=hreitz@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=fam@euphon.net \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.