* [Qemu-devel] [PATCH v2 0/3] xen-disk: performance improvements

From: Paul Durrant @ 2017-06-21 12:52 UTC (permalink / raw)
To: xen-devel, qemu-devel, qemu-block; +Cc: Paul Durrant

Paul Durrant (3):
  xen-disk: only advertize feature-persistent if grant copy is not
    available
  xen-disk: add support for multi-page shared rings
  xen-disk: use an IOThread per instance

 hw/block/trace-events |   7 ++
 hw/block/xen_disk.c   | 228 +++++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 188 insertions(+), 47 deletions(-)

--
2.11.0
* [Qemu-devel] [PATCH v2 1/3] xen-disk: only advertize feature-persistent if grant copy is not available

From: Paul Durrant @ 2017-06-21 12:52 UTC (permalink / raw)
To: xen-devel, qemu-devel, qemu-block
Cc: Paul Durrant, Stefano Stabellini, Anthony Perard, Kevin Wolf, Max Reitz

If grant copy is available then it will always be used in preference to
persistent maps. In this case feature-persistent should not be advertized
to the frontend, otherwise it may needlessly copy data into persistently
granted buffers.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
---
 hw/block/xen_disk.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 3a22805fbc..9b06e3aa81 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -1023,11 +1023,18 @@ static int blk_init(struct XenDevice *xendev)
 
     blkdev->file_blk = BLOCK_SIZE;
 
+    blkdev->feature_grant_copy =
+        (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
+
+    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
+                  blkdev->feature_grant_copy ? "enabled" : "disabled");
+
     /* fill info
      * blk_connect supplies sector-size and sectors
      */
     xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1);
-    xenstore_write_be_int(&blkdev->xendev, "feature-persistent", 1);
+    xenstore_write_be_int(&blkdev->xendev, "feature-persistent",
+                          !blkdev->feature_grant_copy);
     xenstore_write_be_int(&blkdev->xendev, "info", info);
 
     blk_parse_discard(blkdev);
@@ -1202,12 +1209,6 @@ static int blk_connect(struct XenDevice *xendev)
 
     xen_be_bind_evtchn(&blkdev->xendev);
 
-    blkdev->feature_grant_copy =
-        (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
-
-    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
-                  blkdev->feature_grant_copy ? "enabled" : "disabled");
-
     xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
                   "remote port %d, local port %d\n",
                   blkdev->xendev.protocol, blkdev->ring_ref,
--
2.11.0
* Re: [PATCH v2 1/3] xen-disk: only advertize feature-persistent if grant copy is not available

From: Stefano Stabellini @ 2017-06-22  0:40 UTC (permalink / raw)
To: Paul Durrant
Cc: Kevin Wolf, Stefano Stabellini, qemu-block, qemu-devel, Max Reitz, Anthony Perard, xen-devel

On Wed, 21 Jun 2017, Paul Durrant wrote:
> If grant copy is available then it will always be used in preference to
> persistent maps. In this case feature-persistent should not be advertized
> to the frontend, otherwise it may needlessly copy data into persistently
> granted buffers.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
* [Qemu-devel] [PATCH v2 2/3] xen-disk: add support for multi-page shared rings

From: Paul Durrant @ 2017-06-21 12:52 UTC (permalink / raw)
To: xen-devel, qemu-devel, qemu-block
Cc: Paul Durrant, Stefano Stabellini, Anthony Perard, Kevin Wolf, Max Reitz

The blkif protocol has had provision for negotiation of multi-page shared
rings for some time now and many guest OSes have support in their frontend
drivers.

This patch makes the necessary modifications to xen-disk to support a
shared ring up to order 4 (i.e. 16 pages).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>

v2:
 - Fix memory leak in error path
 - Print warning if ring-page-order exceeds limits
---
 hw/block/xen_disk.c | 144 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 113 insertions(+), 31 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 9b06e3aa81..0e6513708e 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -36,8 +36,6 @@
 
 static int batch_maps = 0;
 
-static int max_requests = 32;
-
 /* ------------------------------------------------------------- */
 
 #define BLOCK_SIZE 512
@@ -84,6 +82,8 @@ struct ioreq {
     BlockAcctCookie acct;
 };
 
+#define MAX_RING_PAGE_ORDER 4
+
 struct XenBlkDev {
     struct XenDevice xendev; /* must be first */
     char *params;
@@ -94,7 +94,8 @@ struct XenBlkDev {
     bool directiosafe;
     const char *fileproto;
     const char *filename;
-    int ring_ref;
+    unsigned int ring_ref[1 << MAX_RING_PAGE_ORDER];
+    unsigned int nr_ring_ref;
     void *sring;
     int64_t file_blk;
     int64_t file_size;
@@ -110,6 +111,7 @@ struct XenBlkDev {
     int requests_total;
     int requests_inflight;
     int requests_finished;
+    unsigned int max_requests;
 
     /* Persistent grants extension */
     gboolean feature_discard;
@@ -199,7 +201,7 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
     struct ioreq *ioreq = NULL;
 
     if (QLIST_EMPTY(&blkdev->freelist)) {
-        if (blkdev->requests_total >= max_requests) {
+        if (blkdev->requests_total >= blkdev->max_requests) {
             goto out;
         }
         /* allocate new struct */
@@ -905,7 +907,7 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
         ioreq_runio_qemu_aio(ioreq);
     }
 
-    if (blkdev->more_work && blkdev->requests_inflight < max_requests) {
+    if (blkdev->more_work && blkdev->requests_inflight < blkdev->max_requests) {
         qemu_bh_schedule(blkdev->bh);
     }
 }
@@ -918,15 +920,6 @@ static void blk_bh(void *opaque)
     blk_handle_requests(blkdev);
 }
 
-/*
- * We need to account for the grant allocations requiring contiguous
- * chunks; the worst case number would be
- * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1,
- * but in order to keep things simple just use
- * 2 * max_req * max_seg.
- */
-#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg))
-
 static void blk_alloc(struct XenDevice *xendev)
 {
     struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
@@ -938,11 +931,6 @@ static void blk_alloc(struct XenDevice *xendev)
     if (xen_mode != XEN_EMULATE) {
         batch_maps = 1;
     }
-    if (xengnttab_set_max_grants(xendev->gnttabdev,
-            MAX_GRANTS(max_requests, BLKIF_MAX_SEGMENTS_PER_REQUEST)) < 0) {
-        xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n",
-                      strerror(errno));
-    }
 }
 
 static void blk_parse_discard(struct XenBlkDev *blkdev)
@@ -1037,6 +1025,9 @@ static int blk_init(struct XenDevice *xendev)
                           !blkdev->feature_grant_copy);
     xenstore_write_be_int(&blkdev->xendev, "info", info);
 
+    xenstore_write_be_int(&blkdev->xendev, "max-ring-page-order",
+                          MAX_RING_PAGE_ORDER);
+
     blk_parse_discard(blkdev);
 
     g_free(directiosafe);
@@ -1058,12 +1049,25 @@ out_error:
     return -1;
 }
 
+/*
+ * We need to account for the grant allocations requiring contiguous
+ * chunks; the worst case number would be
+ * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1,
+ * but in order to keep things simple just use
+ * 2 * max_req * max_seg.
+ */
+#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg))
+
 static int blk_connect(struct XenDevice *xendev)
 {
     struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
     int pers, index, qflags;
     bool readonly = true;
     bool writethrough = true;
+    int order, ring_ref;
+    unsigned int ring_size, max_grants;
+    unsigned int i;
+    uint32_t *domids;
 
     /* read-only ? */
     if (blkdev->directiosafe) {
@@ -1138,9 +1142,42 @@ static int blk_connect(struct XenDevice *xendev)
     xenstore_write_be_int64(&blkdev->xendev, "sectors",
                             blkdev->file_size / blkdev->file_blk);
 
-    if (xenstore_read_fe_int(&blkdev->xendev, "ring-ref", &blkdev->ring_ref) == -1) {
+    if (xenstore_read_fe_int(&blkdev->xendev, "ring-page-order",
+                             &order) == -1) {
+        blkdev->nr_ring_ref = 1;
+
+        if (xenstore_read_fe_int(&blkdev->xendev, "ring-ref",
+                                 &ring_ref) == -1) {
+            return -1;
+        }
+        blkdev->ring_ref[0] = ring_ref;
+
+    } else if (order >= 0 && order <= MAX_RING_PAGE_ORDER) {
+        blkdev->nr_ring_ref = 1 << order;
+
+        for (i = 0; i < blkdev->nr_ring_ref; i++) {
+            char *key;
+
+            key = g_strdup_printf("ring-ref%u", i);
+            if (!key) {
+                return -1;
+            }
+
+            if (xenstore_read_fe_int(&blkdev->xendev, key,
+                                     &ring_ref) == -1) {
+                g_free(key);
+                return -1;
+            }
+            blkdev->ring_ref[i] = ring_ref;
+
+            g_free(key);
+        }
+    } else {
+        xen_pv_printf(xendev, 0, "invalid ring-page-order: %d\n",
+                      order);
         return -1;
     }
+
     if (xenstore_read_fe_int(&blkdev->xendev, "event-channel",
                              &blkdev->xendev.remote_port) == -1) {
         return -1;
@@ -1163,41 +1200,85 @@ static int blk_connect(struct XenDevice *xendev)
         blkdev->protocol = BLKIF_PROTOCOL_NATIVE;
     }
 
-    blkdev->sring = xengnttab_map_grant_ref(blkdev->xendev.gnttabdev,
-                                            blkdev->xendev.dom,
-                                            blkdev->ring_ref,
-                                            PROT_READ | PROT_WRITE);
+    ring_size = XC_PAGE_SIZE * blkdev->nr_ring_ref;
+    switch (blkdev->protocol) {
+    case BLKIF_PROTOCOL_NATIVE:
+    {
+        blkdev->max_requests = __CONST_RING_SIZE(blkif, ring_size);
+        break;
+    }
+    case BLKIF_PROTOCOL_X86_32:
+    {
+        blkdev->max_requests = __CONST_RING_SIZE(blkif_x86_32, ring_size);
+        break;
+    }
+    case BLKIF_PROTOCOL_X86_64:
+    {
+        blkdev->max_requests = __CONST_RING_SIZE(blkif_x86_64, ring_size);
+        break;
+    }
+    default:
+        return -1;
+    }
+
+    /* Calculate the maximum number of grants needed by ioreqs */
+    max_grants = MAX_GRANTS(blkdev->max_requests,
+                            BLKIF_MAX_SEGMENTS_PER_REQUEST);
+    /* Add on the number needed for the ring pages */
+    max_grants += blkdev->nr_ring_ref;
+
+    if (xengnttab_set_max_grants(blkdev->xendev.gnttabdev, max_grants)) {
+        xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n",
+                      strerror(errno));
+        return -1;
+    }
+
+    domids = g_malloc0_n(blkdev->nr_ring_ref, sizeof(uint32_t));
+    for (i = 0; i < blkdev->nr_ring_ref; i++) {
+        domids[i] = blkdev->xendev.dom;
+    }
+
+    blkdev->sring = xengnttab_map_grant_refs(blkdev->xendev.gnttabdev,
+                                             blkdev->nr_ring_ref,
+                                             domids,
+                                             blkdev->ring_ref,
+                                             PROT_READ | PROT_WRITE);
+
+    g_free(domids);
+
     if (!blkdev->sring) {
         return -1;
     }
+
     blkdev->cnt_map++;
 
     switch (blkdev->protocol) {
     case BLKIF_PROTOCOL_NATIVE:
     {
         blkif_sring_t *sring_native = blkdev->sring;
-        BACK_RING_INIT(&blkdev->rings.native, sring_native, XC_PAGE_SIZE);
+        BACK_RING_INIT(&blkdev->rings.native, sring_native, ring_size);
         break;
     }
     case BLKIF_PROTOCOL_X86_32:
     {
         blkif_x86_32_sring_t *sring_x86_32 = blkdev->sring;
-        BACK_RING_INIT(&blkdev->rings.x86_32_part, sring_x86_32, XC_PAGE_SIZE);
+        BACK_RING_INIT(&blkdev->rings.x86_32_part, sring_x86_32, ring_size);
         break;
     }
     case BLKIF_PROTOCOL_X86_64:
     {
         blkif_x86_64_sring_t *sring_x86_64 = blkdev->sring;
-        BACK_RING_INIT(&blkdev->rings.x86_64_part, sring_x86_64, XC_PAGE_SIZE);
+        BACK_RING_INIT(&blkdev->rings.x86_64_part, sring_x86_64, ring_size);
         break;
     }
     }
 
     if (blkdev->feature_persistent) {
         /* Init persistent grants */
-        blkdev->max_grants = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST;
+        blkdev->max_grants = blkdev->max_requests *
+                             BLKIF_MAX_SEGMENTS_PER_REQUEST;
         blkdev->persistent_gnts = g_tree_new_full((GCompareDataFunc)int_cmp,
                                                   NULL, NULL,
                                                   batch_maps ?
@@ -1209,9 +1290,9 @@ static int blk_connect(struct XenDevice *xendev)
 
     xen_be_bind_evtchn(&blkdev->xendev);
 
-    xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
+    xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, "
                   "remote port %d, local port %d\n",
-                  blkdev->xendev.protocol, blkdev->ring_ref,
+                  blkdev->xendev.protocol, blkdev->nr_ring_ref,
                   blkdev->xendev.remote_port, blkdev->xendev.local_port);
     return 0;
 }
@@ -1228,7 +1309,8 @@ static void blk_disconnect(struct XenDevice *xendev)
     xen_pv_unbind_evtchn(&blkdev->xendev);
 
     if (blkdev->sring) {
-        xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, 1);
+        xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring,
+                        blkdev->nr_ring_ref);
        blkdev->cnt_map--;
         blkdev->sring = NULL;
     }
--
2.11.0
* Re: [Qemu-devel] [PATCH v2 2/3] xen-disk: add support for multi-page shared rings

From: Stefano Stabellini @ 2017-06-22  0:39 UTC (permalink / raw)
To: Paul Durrant
Cc: xen-devel, qemu-devel, qemu-block, Stefano Stabellini, Anthony Perard, Kevin Wolf, Max Reitz

On Wed, 21 Jun 2017, Paul Durrant wrote:
> The blkif protocol has had provision for negotiation of multi-page shared
> rings for some time now and many guest OS have support in their frontend
> drivers.
>
> This patch makes the necessary modifications to xen-disk support a shared
> ring up to order 4 (i.e. 16 pages).
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
number of grants needed by ioreqs */ > + max_grants = MAX_GRANTS(blkdev->max_requests, > + BLKIF_MAX_SEGMENTS_PER_REQUEST); > + /* Add on the number needed for the ring pages */ > + max_grants += blkdev->nr_ring_ref; > + > + if (xengnttab_set_max_grants(blkdev->xendev.gnttabdev, max_grants)) { > + xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n", > + strerror(errno)); > + return -1; > + } > + > + domids = g_malloc0_n(blkdev->nr_ring_ref, sizeof(uint32_t)); > + for (i = 0; i < blkdev->nr_ring_ref; i++) { > + domids[i] = blkdev->xendev.dom; > + } > + > + blkdev->sring = xengnttab_map_grant_refs(blkdev->xendev.gnttabdev, > + blkdev->nr_ring_ref, > + domids, > + blkdev->ring_ref, > + PROT_READ | PROT_WRITE); > + > + g_free(domids); > + > if (!blkdev->sring) { > return -1; > } > + > blkdev->cnt_map++; > > switch (blkdev->protocol) { > case BLKIF_PROTOCOL_NATIVE: > { > blkif_sring_t *sring_native = blkdev->sring; > - BACK_RING_INIT(&blkdev->rings.native, sring_native, XC_PAGE_SIZE); > + BACK_RING_INIT(&blkdev->rings.native, sring_native, ring_size); > break; > } > case BLKIF_PROTOCOL_X86_32: > { > blkif_x86_32_sring_t *sring_x86_32 = blkdev->sring; > > - BACK_RING_INIT(&blkdev->rings.x86_32_part, sring_x86_32, XC_PAGE_SIZE); > + BACK_RING_INIT(&blkdev->rings.x86_32_part, sring_x86_32, ring_size); > break; > } > case BLKIF_PROTOCOL_X86_64: > { > blkif_x86_64_sring_t *sring_x86_64 = blkdev->sring; > > - BACK_RING_INIT(&blkdev->rings.x86_64_part, sring_x86_64, XC_PAGE_SIZE); > + BACK_RING_INIT(&blkdev->rings.x86_64_part, sring_x86_64, ring_size); > break; > } > } > > if (blkdev->feature_persistent) { > /* Init persistent grants */ > - blkdev->max_grants = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST; > + blkdev->max_grants = blkdev->max_requests * > + BLKIF_MAX_SEGMENTS_PER_REQUEST; > blkdev->persistent_gnts = g_tree_new_full((GCompareDataFunc)int_cmp, > NULL, NULL, > batch_maps ? 
> @@ -1209,9 +1290,9 @@ static int blk_connect(struct XenDevice *xendev) > > xen_be_bind_evtchn(&blkdev->xendev); > > - xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, " > + xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " > "remote port %d, local port %d\n", > - blkdev->xendev.protocol, blkdev->ring_ref, > + blkdev->xendev.protocol, blkdev->nr_ring_ref, > blkdev->xendev.remote_port, blkdev->xendev.local_port); > return 0; > } > @@ -1228,7 +1309,8 @@ static void blk_disconnect(struct XenDevice *xendev) > xen_pv_unbind_evtchn(&blkdev->xendev); > > if (blkdev->sring) { > - xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, 1); > + xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, > + blkdev->nr_ring_ref); > blkdev->cnt_map--; > blkdev->sring = NULL; > } > -- > 2.11.0 > ^ permalink raw reply [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v2 3/3] xen-disk: use an IOThread per instance 2017-06-21 12:52 ` Paul Durrant @ 2017-06-21 12:52 ` Paul Durrant -1 siblings, 0 replies; 24+ messages in thread From: Paul Durrant @ 2017-06-21 12:52 UTC (permalink / raw) To: xen-devel, qemu-devel, qemu-block Cc: Paul Durrant, Stefano Stabellini, Anthony Perard, Kevin Wolf, Max Reitz This patch allocates an IOThread object for each xen_disk instance and sets the AIO context appropriately on connect. This allows processing of I/O to proceed in parallel. The patch also adds tracepoints into xen_disk to make it possible to follow the state transtions of an instance in the log. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> --- Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> v2: - explicitly acquire and release AIO context in qemu_aio_complete() and blk_bh() --- hw/block/trace-events | 7 ++++++ hw/block/xen_disk.c | 69 ++++++++++++++++++++++++++++++++++++++++++++------- 2 files changed, 67 insertions(+), 9 deletions(-) diff --git a/hw/block/trace-events b/hw/block/trace-events index 65e83dc258..608b24ba66 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int num_reqs, uint64_t offset, # hw/block/hd-geometry.c hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d" hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d" + +# hw/block/xen_disk.c +xen_disk_alloc(char *name) "%s" +xen_disk_init(char *name) "%s" +xen_disk_connect(char *name) "%s" +xen_disk_disconnect(char *name) "%s" +xen_disk_free(char *name) "%s" diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 0e6513708e..8548195195 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -27,10 +27,13 @@ #include "hw/xen/xen_backend.h" 
#include "xen_blkif.h" #include "sysemu/blockdev.h" +#include "sysemu/iothread.h" #include "sysemu/block-backend.h" #include "qapi/error.h" #include "qapi/qmp/qdict.h" #include "qapi/qmp/qstring.h" +#include "qom/object_interfaces.h" +#include "trace.h" /* ------------------------------------------------------------- */ @@ -128,6 +131,9 @@ struct XenBlkDev { DriveInfo *dinfo; BlockBackend *blk; QEMUBH *bh; + + IOThread *iothread; + AioContext *ctx; }; /* ------------------------------------------------------------- */ @@ -599,9 +605,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq); static void qemu_aio_complete(void *opaque, int ret) { struct ioreq *ioreq = opaque; + struct XenBlkDev *blkdev = ioreq->blkdev; + + aio_context_acquire(blkdev->ctx); if (ret != 0) { - xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n", + xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n", ioreq->req.operation == BLKIF_OP_READ ? "read" : "write"); ioreq->aio_errors++; } @@ -610,13 +619,13 @@ static void qemu_aio_complete(void *opaque, int ret) if (ioreq->presync) { ioreq->presync = 0; ioreq_runio_qemu_aio(ioreq); - return; + goto done; } if (ioreq->aio_inflight > 0) { - return; + goto done; } - if (ioreq->blkdev->feature_grant_copy) { + if (blkdev->feature_grant_copy) { switch (ioreq->req.operation) { case BLKIF_OP_READ: /* in case of failure ioreq->aio_errors is increased */ @@ -638,7 +647,7 @@ static void qemu_aio_complete(void *opaque, int ret) } ioreq->status = ioreq->aio_errors ? 
BLKIF_RSP_ERROR : BLKIF_RSP_OKAY; - if (!ioreq->blkdev->feature_grant_copy) { + if (!blkdev->feature_grant_copy) { ioreq_unmap(ioreq); } ioreq_finish(ioreq); @@ -650,16 +659,19 @@ static void qemu_aio_complete(void *opaque, int ret) } case BLKIF_OP_READ: if (ioreq->status == BLKIF_RSP_OKAY) { - block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); + block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct); } else { - block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); + block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct); } break; case BLKIF_OP_DISCARD: default: break; } - qemu_bh_schedule(ioreq->blkdev->bh); + qemu_bh_schedule(blkdev->bh); + +done: + aio_context_release(blkdev->ctx); } static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t sector_number, @@ -917,17 +929,40 @@ static void blk_handle_requests(struct XenBlkDev *blkdev) static void blk_bh(void *opaque) { struct XenBlkDev *blkdev = opaque; + + aio_context_acquire(blkdev->ctx); blk_handle_requests(blkdev); + aio_context_release(blkdev->ctx); } static void blk_alloc(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + Object *obj; + char *name; + Error *err = NULL; + + trace_xen_disk_alloc(xendev->name); QLIST_INIT(&blkdev->inflight); QLIST_INIT(&blkdev->finished); QLIST_INIT(&blkdev->freelist); - blkdev->bh = qemu_bh_new(blk_bh, blkdev); + + obj = object_new(TYPE_IOTHREAD); + name = g_strdup_printf("iothread-%s", xendev->name); + + object_property_add_child(object_get_objects_root(), name, obj, &err); + assert(!err); + + g_free(name); + + user_creatable_complete(obj, &err); + assert(!err); + + blkdev->iothread = (IOThread *)object_dynamic_cast(obj, TYPE_IOTHREAD); + blkdev->ctx = iothread_get_aio_context(blkdev->iothread); + blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); + if (xen_mode != XEN_EMULATE) { batch_maps = 1; } @@ -954,6 +989,8 @@ static int blk_init(struct XenDevice *xendev) int info = 
0; char *directiosafe = NULL; + trace_xen_disk_init(xendev->name); + /* read xenstore entries */ if (blkdev->params == NULL) { char *h = NULL; @@ -1069,6 +1106,8 @@ static int blk_connect(struct XenDevice *xendev) unsigned int i; uint32_t *domids; + trace_xen_disk_connect(xendev->name); + /* read-only ? */ if (blkdev->directiosafe) { qflags = BDRV_O_NOCACHE | BDRV_O_NATIVE_AIO; @@ -1288,6 +1327,8 @@ static int blk_connect(struct XenDevice *xendev) blkdev->persistent_gnt_count = 0; } + blk_set_aio_context(blkdev->blk, blkdev->ctx); + xen_be_bind_evtchn(&blkdev->xendev); xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " @@ -1301,13 +1342,20 @@ static void blk_disconnect(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + trace_xen_disk_disconnect(xendev->name); + + aio_context_acquire(blkdev->ctx); + if (blkdev->blk) { + blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); blk_detach_dev(blkdev->blk, blkdev); blk_unref(blkdev->blk); blkdev->blk = NULL; } xen_pv_unbind_evtchn(&blkdev->xendev); + aio_context_release(blkdev->ctx); + if (blkdev->sring) { xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, blkdev->nr_ring_ref); @@ -1341,6 +1389,8 @@ static int blk_free(struct XenDevice *xendev) struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); struct ioreq *ioreq; + trace_xen_disk_free(xendev->name); + if (blkdev->blk || blkdev->sring) { blk_disconnect(xendev); } @@ -1358,6 +1408,7 @@ static int blk_free(struct XenDevice *xendev) g_free(blkdev->dev); g_free(blkdev->devtype); qemu_bh_delete(blkdev->bh); + object_unparent(OBJECT(blkdev->iothread)); return 0; } -- 2.11.0 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v2 3/3] xen-disk: use an IOThread per instance 2017-06-21 12:52 ` Paul Durrant (?) @ 2017-06-22 22:14 ` Stefano Stabellini 2017-07-07 8:20 ` Paul Durrant 2017-07-07 8:20 ` Paul Durrant -1 siblings, 2 replies; 24+ messages in thread From: Stefano Stabellini @ 2017-06-22 22:14 UTC (permalink / raw) To: Paul Durrant Cc: xen-devel, qemu-devel, qemu-block, Stefano Stabellini, Anthony Perard, Kevin Wolf, Max Reitz, afaerber CC'ing Andreas Färber. Could you please give a quick look below at the way the iothread object is instantiated and destroyed? I am no object model expert and would appreciate a second opinion. On Wed, 21 Jun 2017, Paul Durrant wrote: > This patch allocates an IOThread object for each xen_disk instance and > sets the AIO context appropriately on connect. This allows processing > of I/O to proceed in parallel. > > The patch also adds tracepoints into xen_disk to make it possible to > follow the state transitions of an instance in the log. > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com> > --- > Cc: Stefano Stabellini <sstabellini@kernel.org> > Cc: Anthony Perard <anthony.perard@citrix.com> > Cc: Kevin Wolf <kwolf@redhat.com> > Cc: Max Reitz <mreitz@redhat.com> > > v2: > - explicitly acquire and release AIO context in qemu_aio_complete() and > blk_bh() > --- > hw/block/trace-events | 7 ++++++ > hw/block/xen_disk.c | 69 ++++++++++++++++++++++++++++++++++++++++++++------- > 2 files changed, 67 insertions(+), 9 deletions(-) > > diff --git a/hw/block/trace-events b/hw/block/trace-events > index 65e83dc258..608b24ba66 100644 > --- a/hw/block/trace-events > +++ b/hw/block/trace-events > @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int num_reqs, uint64_t offset, > # hw/block/hd-geometry.c > hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d" > hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d" > + > +# 
hw/block/xen_disk.c > +xen_disk_alloc(char *name) "%s" > +xen_disk_init(char *name) "%s" > +xen_disk_connect(char *name) "%s" > +xen_disk_disconnect(char *name) "%s" > +xen_disk_free(char *name) "%s" > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 0e6513708e..8548195195 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -27,10 +27,13 @@ > #include "hw/xen/xen_backend.h" > #include "xen_blkif.h" > #include "sysemu/blockdev.h" > +#include "sysemu/iothread.h" > #include "sysemu/block-backend.h" > #include "qapi/error.h" > #include "qapi/qmp/qdict.h" > #include "qapi/qmp/qstring.h" > +#include "qom/object_interfaces.h" > +#include "trace.h" > > /* ------------------------------------------------------------- */ > > @@ -128,6 +131,9 @@ struct XenBlkDev { > DriveInfo *dinfo; > BlockBackend *blk; > QEMUBH *bh; > + > + IOThread *iothread; > + AioContext *ctx; > }; > > /* ------------------------------------------------------------- */ > @@ -599,9 +605,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq); > static void qemu_aio_complete(void *opaque, int ret) > { > struct ioreq *ioreq = opaque; > + struct XenBlkDev *blkdev = ioreq->blkdev; > + > + aio_context_acquire(blkdev->ctx); I think that Paolo was right that we need a aio_context_acquire here, however the issue is that with the current code: blk_handle_requests -> ioreq_runio_qemu_aio -> qemu_aio_complete leading to aio_context_acquire being called twice on the same lock, which I don't think is allowed? I think we need to get rid of the qemu_aio_complete call from ioreq_runio_qemu_aio, but to do that we need to be careful with the accounting of aio_inflight (today it's incremented unconditionally at the beginning of ioreq_runio_qemu_aio, I think we would have to change that to increment it only if presync). 
> if (ret != 0) { > - xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n", > + xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n", > ioreq->req.operation == BLKIF_OP_READ ? "read" : "write"); > ioreq->aio_errors++; > } > @@ -610,13 +619,13 @@ static void qemu_aio_complete(void *opaque, int ret) > if (ioreq->presync) { > ioreq->presync = 0; > ioreq_runio_qemu_aio(ioreq); > - return; > + goto done; > } > if (ioreq->aio_inflight > 0) { > - return; > + goto done; > } > > - if (ioreq->blkdev->feature_grant_copy) { > + if (blkdev->feature_grant_copy) { > switch (ioreq->req.operation) { > case BLKIF_OP_READ: > /* in case of failure ioreq->aio_errors is increased */ > @@ -638,7 +647,7 @@ static void qemu_aio_complete(void *opaque, int ret) > } > > ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY; > - if (!ioreq->blkdev->feature_grant_copy) { > + if (!blkdev->feature_grant_copy) { > ioreq_unmap(ioreq); > } > ioreq_finish(ioreq); > @@ -650,16 +659,19 @@ static void qemu_aio_complete(void *opaque, int ret) > } > case BLKIF_OP_READ: > if (ioreq->status == BLKIF_RSP_OKAY) { > - block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > + block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct); > } else { > - block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > + block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct); > } > break; > case BLKIF_OP_DISCARD: > default: > break; > } > - qemu_bh_schedule(ioreq->blkdev->bh); > + qemu_bh_schedule(blkdev->bh); > + > +done: > + aio_context_release(blkdev->ctx); > } > > static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t sector_number, > @@ -917,17 +929,40 @@ static void blk_handle_requests(struct XenBlkDev *blkdev) > static void blk_bh(void *opaque) > { > struct XenBlkDev *blkdev = opaque; > + > + aio_context_acquire(blkdev->ctx); > blk_handle_requests(blkdev); > + aio_context_release(blkdev->ctx); > } > > static void blk_alloc(struct XenDevice *xendev) > { > struct 
XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); > + Object *obj; > + char *name; > + Error *err = NULL; > + > + trace_xen_disk_alloc(xendev->name); > > QLIST_INIT(&blkdev->inflight); > QLIST_INIT(&blkdev->finished); > QLIST_INIT(&blkdev->freelist); > - blkdev->bh = qemu_bh_new(blk_bh, blkdev); > + > + obj = object_new(TYPE_IOTHREAD); > + name = g_strdup_printf("iothread-%s", xendev->name); > + > + object_property_add_child(object_get_objects_root(), name, obj, &err); > + assert(!err); Would it be enough to call object_ref? > + g_free(name); > + > + user_creatable_complete(obj, &err); Why do we need to call this? > + assert(!err); > + > + blkdev->iothread = (IOThread *)object_dynamic_cast(obj, TYPE_IOTHREAD); > + blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > + blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > + > if (xen_mode != XEN_EMULATE) { > batch_maps = 1; > } > @@ -1288,6 +1327,8 @@ static int blk_connect(struct XenDevice *xendev) > blkdev->persistent_gnt_count = 0; > } > > + blk_set_aio_context(blkdev->blk, blkdev->ctx); > + > xen_be_bind_evtchn(&blkdev->xendev); > > xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " > @@ -1301,13 +1342,20 @@ static void blk_disconnect(struct XenDevice *xendev) > { > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); > > + trace_xen_disk_disconnect(xendev->name); > + > + aio_context_acquire(blkdev->ctx); > + > if (blkdev->blk) { > + blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); > blk_detach_dev(blkdev->blk, blkdev); > blk_unref(blkdev->blk); > blkdev->blk = NULL; > } > xen_pv_unbind_evtchn(&blkdev->xendev); > > + aio_context_release(blkdev->ctx); > + > if (blkdev->sring) { > xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, > blkdev->nr_ring_ref); > @@ -1358,6 +1408,7 @@ static int blk_free(struct XenDevice *xendev) > g_free(blkdev->dev); > g_free(blkdev->devtype); > qemu_bh_delete(blkdev->bh); > + 
object_unparent(OBJECT(blkdev->iothread)); Shouldn't this be object_unref? > return 0; > } ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH v2 3/3] xen-disk: use an IOThread per instance 2017-06-22 22:14 ` [Qemu-devel] " Stefano Stabellini @ 2017-07-07 8:20 ` Paul Durrant 2017-07-07 22:06 ` Stefano Stabellini 2017-07-07 22:06 ` [Qemu-devel] " Stefano Stabellini 2017-07-07 8:20 ` Paul Durrant 1 sibling, 2 replies; 24+ messages in thread From: Paul Durrant @ 2017-07-07 8:20 UTC (permalink / raw) To: 'Stefano Stabellini' Cc: xen-devel, qemu-devel, qemu-block, Anthony Perard, Kevin Wolf, Max Reitz, afaerber > -----Original Message----- > From: Stefano Stabellini [mailto:sstabellini@kernel.org] > Sent: 22 June 2017 23:15 > To: Paul Durrant <Paul.Durrant@citrix.com> > Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu- > block@nongnu.org; Stefano Stabellini <sstabellini@kernel.org>; Anthony > Perard <anthony.perard@citrix.com>; Kevin Wolf <kwolf@redhat.com>; > Max Reitz <mreitz@redhat.com>; afaerber@suse.de > Subject: Re: [PATCH v2 3/3] xen-disk: use an IOThread per instance > > CC'ing Andreas Färber. Could you please give a quick look below at the > way the iothread object is instantiate and destroyed? I am no object > model expert and would appreaciate a second opinion. > I have not seen any response so far. > > On Wed, 21 Jun 2017, Paul Durrant wrote: > > This patch allocates an IOThread object for each xen_disk instance and > > sets the AIO context appropriately on connect. This allows processing > > of I/O to proceed in parallel. > > > > The patch also adds tracepoints into xen_disk to make it possible to > > follow the state transtions of an instance in the log. 
> > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com> > > --- > > Cc: Stefano Stabellini <sstabellini@kernel.org> > > Cc: Anthony Perard <anthony.perard@citrix.com> > > Cc: Kevin Wolf <kwolf@redhat.com> > > Cc: Max Reitz <mreitz@redhat.com> > > > > v2: > > - explicitly acquire and release AIO context in qemu_aio_complete() and > > blk_bh() > > --- > > hw/block/trace-events | 7 ++++++ > > hw/block/xen_disk.c | 69 > ++++++++++++++++++++++++++++++++++++++++++++------- > > 2 files changed, 67 insertions(+), 9 deletions(-) > > > > diff --git a/hw/block/trace-events b/hw/block/trace-events > > index 65e83dc258..608b24ba66 100644 > > --- a/hw/block/trace-events > > +++ b/hw/block/trace-events > > @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int > num_reqs, uint64_t offset, > > # hw/block/hd-geometry.c > > hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p > LCHS %d %d %d" > > hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t > secs, int trans) "blk %p CHS %u %u %u trans %d" > > + > > +# hw/block/xen_disk.c > > +xen_disk_alloc(char *name) "%s" > > +xen_disk_init(char *name) "%s" > > +xen_disk_connect(char *name) "%s" > > +xen_disk_disconnect(char *name) "%s" > > +xen_disk_free(char *name) "%s" > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > > index 0e6513708e..8548195195 100644 > > --- a/hw/block/xen_disk.c > > +++ b/hw/block/xen_disk.c > > @@ -27,10 +27,13 @@ > > #include "hw/xen/xen_backend.h" > > #include "xen_blkif.h" > > #include "sysemu/blockdev.h" > > +#include "sysemu/iothread.h" > > #include "sysemu/block-backend.h" > > #include "qapi/error.h" > > #include "qapi/qmp/qdict.h" > > #include "qapi/qmp/qstring.h" > > +#include "qom/object_interfaces.h" > > +#include "trace.h" > > > > /* ------------------------------------------------------------- */ > > > > @@ -128,6 +131,9 @@ struct XenBlkDev { > > DriveInfo *dinfo; > > BlockBackend *blk; > > QEMUBH *bh; > > + > > + IOThread 
*iothread; > > + AioContext *ctx; > > }; > > > > /* ------------------------------------------------------------- */ > > @@ -599,9 +605,12 @@ static int ioreq_runio_qemu_aio(struct ioreq > *ioreq); > > static void qemu_aio_complete(void *opaque, int ret) > > { > > struct ioreq *ioreq = opaque; > > + struct XenBlkDev *blkdev = ioreq->blkdev; > > + > > + aio_context_acquire(blkdev->ctx); > > I think that Paolo was right that we need a aio_context_acquire here, > however the issue is that with the current code: > > blk_handle_requests -> ioreq_runio_qemu_aio -> qemu_aio_complete > > leading to aio_context_acquire being called twice on the same lock, > which I don't think is allowed? It resolves to a qemu_rec_mutex_lock() which I believed is a recursive lock, so I think that's ok. > > I think we need to get rid of the qemu_aio_complete call from > ioreq_runio_qemu_aio, but to do that we need to be careful with the > accounting of aio_inflight (today it's incremented unconditionally at > the beginning of ioreq_runio_qemu_aio, I think we would have to change > that to increment it only if presync). > If the lock is indeed recursive then I think we can avoid this complication. > > > if (ret != 0) { > > - xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n", > > + xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n", > > ioreq->req.operation == BLKIF_OP_READ ? 
"read" : "write"); > > ioreq->aio_errors++; > > } > > @@ -610,13 +619,13 @@ static void qemu_aio_complete(void *opaque, int > ret) > > if (ioreq->presync) { > > ioreq->presync = 0; > > ioreq_runio_qemu_aio(ioreq); > > - return; > > + goto done; > > } > > if (ioreq->aio_inflight > 0) { > > - return; > > + goto done; > > } > > > > - if (ioreq->blkdev->feature_grant_copy) { > > + if (blkdev->feature_grant_copy) { > > switch (ioreq->req.operation) { > > case BLKIF_OP_READ: > > /* in case of failure ioreq->aio_errors is increased */ > > @@ -638,7 +647,7 @@ static void qemu_aio_complete(void *opaque, int > ret) > > } > > > > ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : > BLKIF_RSP_OKAY; > > - if (!ioreq->blkdev->feature_grant_copy) { > > + if (!blkdev->feature_grant_copy) { > > ioreq_unmap(ioreq); > > } > > ioreq_finish(ioreq); > > @@ -650,16 +659,19 @@ static void qemu_aio_complete(void *opaque, int > ret) > > } > > case BLKIF_OP_READ: > > if (ioreq->status == BLKIF_RSP_OKAY) { > > - block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > > + block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct); > > } else { > > - block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > > + block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct); > > } > > break; > > case BLKIF_OP_DISCARD: > > default: > > break; > > } > > - qemu_bh_schedule(ioreq->blkdev->bh); > > + qemu_bh_schedule(blkdev->bh); > > + > > +done: > > + aio_context_release(blkdev->ctx); > > } > > > > static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t > sector_number, > > @@ -917,17 +929,40 @@ static void blk_handle_requests(struct XenBlkDev > *blkdev) > > static void blk_bh(void *opaque) > > { > > struct XenBlkDev *blkdev = opaque; > > + > > + aio_context_acquire(blkdev->ctx); > > blk_handle_requests(blkdev); > > + aio_context_release(blkdev->ctx); > > } > > > > static void blk_alloc(struct XenDevice *xendev) > > { > > struct XenBlkDev *blkdev = 
container_of(xendev, struct XenBlkDev, > xendev); > > + Object *obj; > > + char *name; > > + Error *err = NULL; > > + > > + trace_xen_disk_alloc(xendev->name); > > > > QLIST_INIT(&blkdev->inflight); > > QLIST_INIT(&blkdev->finished); > > QLIST_INIT(&blkdev->freelist); > > - blkdev->bh = qemu_bh_new(blk_bh, blkdev); > > + > > + obj = object_new(TYPE_IOTHREAD); > > + name = g_strdup_printf("iothread-%s", xendev->name); > > + > > + object_property_add_child(object_get_objects_root(), name, obj, > &err); > > + assert(!err); > > Would it be enough to call object_ref? > You mean to avoid the assert? I guess so but I think any failure here would be indicative of a larger problem. > > > + g_free(name); > > + > > + user_creatable_complete(obj, &err); > > Why do we need to call this? > I'm not entirely sure but looking around the object code it seemed to be a necessary part of instantiation. Maybe it is not required for iothread objects, but I could not figure that out from looking at the code and comments in the header suggest it is harmless if it is not required. 
> > > + assert(!err); > > + > > + blkdev->iothread = (IOThread *)object_dynamic_cast(obj, > TYPE_IOTHREAD); > > + blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > > + blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > > + > > if (xen_mode != XEN_EMULATE) { > > batch_maps = 1; > > } > > @@ -1288,6 +1327,8 @@ static int blk_connect(struct XenDevice *xendev) > > blkdev->persistent_gnt_count = 0; > > } > > > > + blk_set_aio_context(blkdev->blk, blkdev->ctx); > > + > > xen_be_bind_evtchn(&blkdev->xendev); > > > > xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " > > @@ -1301,13 +1342,20 @@ static void blk_disconnect(struct XenDevice > *xendev) > > { > > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > xendev); > > > > + trace_xen_disk_disconnect(xendev->name); > > + > > + aio_context_acquire(blkdev->ctx); > > + > > if (blkdev->blk) { > > + blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); > > blk_detach_dev(blkdev->blk, blkdev); > > blk_unref(blkdev->blk); > > blkdev->blk = NULL; > > } > > xen_pv_unbind_evtchn(&blkdev->xendev); > > > > + aio_context_release(blkdev->ctx); > > + > > if (blkdev->sring) { > > xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, > > blkdev->nr_ring_ref); > > @@ -1358,6 +1408,7 @@ static int blk_free(struct XenDevice *xendev) > > g_free(blkdev->dev); > > g_free(blkdev->devtype); > > qemu_bh_delete(blkdev->bh); > > + object_unparent(OBJECT(blkdev->iothread)); > > Shouldn't this be object_unref? > I don't think so. I think this is required to undo what was done by calling object_property_add_child() on the root object. Looking at other code such as object_new_with_propv() it looks like the right thing to do is to call object_unref() after calling object_property_add_child() to drop the implicit ref taken by object_new() so I'd need to add the call in blk_alloc(). Paul > > > return 0; > > } ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 3/3] xen-disk: use an IOThread per instance 2017-07-07 8:20 ` Paul Durrant @ 2017-07-07 22:06 ` Stefano Stabellini 2017-07-07 22:06 ` [Qemu-devel] " Stefano Stabellini 1 sibling, 0 replies; 24+ messages in thread From: Stefano Stabellini @ 2017-07-07 22:06 UTC (permalink / raw) To: Paul Durrant Cc: Kevin Wolf, 'Stefano Stabellini', qemu-block, armbru, qemu-devel, Max Reitz, Anthony Perard, xen-devel, afaerber On Fri, 7 Jul 2017, Paul Durrant wrote: > > -----Original Message----- > > From: Stefano Stabellini [mailto:sstabellini@kernel.org] > > Sent: 22 June 2017 23:15 > > To: Paul Durrant <Paul.Durrant@citrix.com> > > Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu- > > block@nongnu.org; Stefano Stabellini <sstabellini@kernel.org>; Anthony > > Perard <anthony.perard@citrix.com>; Kevin Wolf <kwolf@redhat.com>; > > Max Reitz <mreitz@redhat.com>; afaerber@suse.de > > Subject: Re: [PATCH v2 3/3] xen-disk: use an IOThread per instance > > > > CC'ing Andreas Färber. Could you please give a quick look below at the > > way the iothread object is instantiate and destroyed? I am no object > > model expert and would appreaciate a second opinion. > > > > I have not seen any response so far. > > > > > On Wed, 21 Jun 2017, Paul Durrant wrote: > > > This patch allocates an IOThread object for each xen_disk instance and > > > sets the AIO context appropriately on connect. This allows processing > > > of I/O to proceed in parallel. > > > > > > The patch also adds tracepoints into xen_disk to make it possible to > > > follow the state transtions of an instance in the log. 
> > > > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com> > > > --- > > > Cc: Stefano Stabellini <sstabellini@kernel.org> > > > Cc: Anthony Perard <anthony.perard@citrix.com> > > > Cc: Kevin Wolf <kwolf@redhat.com> > > > Cc: Max Reitz <mreitz@redhat.com> > > > > > > v2: > > > - explicitly acquire and release AIO context in qemu_aio_complete() and > > > blk_bh() > > > --- > > > hw/block/trace-events | 7 ++++++ > > > hw/block/xen_disk.c | 69 > > ++++++++++++++++++++++++++++++++++++++++++++------- > > > 2 files changed, 67 insertions(+), 9 deletions(-) > > > > > > diff --git a/hw/block/trace-events b/hw/block/trace-events > > > index 65e83dc258..608b24ba66 100644 > > > --- a/hw/block/trace-events > > > +++ b/hw/block/trace-events > > > @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int > > num_reqs, uint64_t offset, > > > # hw/block/hd-geometry.c > > > hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p > > LCHS %d %d %d" > > > hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t > > secs, int trans) "blk %p CHS %u %u %u trans %d" > > > + > > > +# hw/block/xen_disk.c > > > +xen_disk_alloc(char *name) "%s" > > > +xen_disk_init(char *name) "%s" > > > +xen_disk_connect(char *name) "%s" > > > +xen_disk_disconnect(char *name) "%s" > > > +xen_disk_free(char *name) "%s" > > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > > > index 0e6513708e..8548195195 100644 > > > --- a/hw/block/xen_disk.c > > > +++ b/hw/block/xen_disk.c > > > @@ -27,10 +27,13 @@ > > > #include "hw/xen/xen_backend.h" > > > #include "xen_blkif.h" > > > #include "sysemu/blockdev.h" > > > +#include "sysemu/iothread.h" > > > #include "sysemu/block-backend.h" > > > #include "qapi/error.h" > > > #include "qapi/qmp/qdict.h" > > > #include "qapi/qmp/qstring.h" > > > +#include "qom/object_interfaces.h" > > > +#include "trace.h" > > > > > > /* ------------------------------------------------------------- */ > > > > > > @@ -128,6 +131,9 @@ 
struct XenBlkDev { > > > DriveInfo *dinfo; > > > BlockBackend *blk; > > > QEMUBH *bh; > > > + > > > + IOThread *iothread; > > > + AioContext *ctx; > > > }; > > > > > > /* ------------------------------------------------------------- */ > > > @@ -599,9 +605,12 @@ static int ioreq_runio_qemu_aio(struct ioreq > > *ioreq); > > > static void qemu_aio_complete(void *opaque, int ret) > > > { > > > struct ioreq *ioreq = opaque; > > > + struct XenBlkDev *blkdev = ioreq->blkdev; > > > + > > > + aio_context_acquire(blkdev->ctx); > > > > I think that Paolo was right that we need a aio_context_acquire here, > > however the issue is that with the current code: > > > > blk_handle_requests -> ioreq_runio_qemu_aio -> qemu_aio_complete > > > > leading to aio_context_acquire being called twice on the same lock, > > which I don't think is allowed? > > It resolves to a qemu_rec_mutex_lock() which I believed is a recursive lock, so I think that's ok. On Linux it becomes pthread_mutex_lock. The lock is created by qemu_rec_mutex_init which specifies PTHREAD_MUTEX_RECURSIVE, so yes, it should be recursive. Good. > > > > I think we need to get rid of the qemu_aio_complete call from > > ioreq_runio_qemu_aio, but to do that we need to be careful with the > > accounting of aio_inflight (today it's incremented unconditionally at > > the beginning of ioreq_runio_qemu_aio, I think we would have to change > > that to increment it only if presync). > > > > If the lock is indeed recursive then I think we can avoid this complication. OK > > > > > if (ret != 0) { > > > - xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n", > > > + xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n", > > > ioreq->req.operation == BLKIF_OP_READ ? 
"read" : "write"); > > > ioreq->aio_errors++; > > > } > > > @@ -610,13 +619,13 @@ static void qemu_aio_complete(void *opaque, int > > ret) > > > if (ioreq->presync) { > > > ioreq->presync = 0; > > > ioreq_runio_qemu_aio(ioreq); > > > - return; > > > + goto done; > > > } > > > if (ioreq->aio_inflight > 0) { > > > - return; > > > + goto done; > > > } > > > > > > - if (ioreq->blkdev->feature_grant_copy) { > > > + if (blkdev->feature_grant_copy) { > > > switch (ioreq->req.operation) { > > > case BLKIF_OP_READ: > > > /* in case of failure ioreq->aio_errors is increased */ > > > @@ -638,7 +647,7 @@ static void qemu_aio_complete(void *opaque, int > > ret) > > > } > > > > > > ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : > > BLKIF_RSP_OKAY; > > > - if (!ioreq->blkdev->feature_grant_copy) { > > > + if (!blkdev->feature_grant_copy) { > > > ioreq_unmap(ioreq); > > > } > > > ioreq_finish(ioreq); > > > @@ -650,16 +659,19 @@ static void qemu_aio_complete(void *opaque, int > > ret) > > > } > > > case BLKIF_OP_READ: > > > if (ioreq->status == BLKIF_RSP_OKAY) { > > > - block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > > > + block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct); > > > } else { > > > - block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > > > + block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct); > > > } > > > break; > > > case BLKIF_OP_DISCARD: > > > default: > > > break; > > > } > > > - qemu_bh_schedule(ioreq->blkdev->bh); > > > + qemu_bh_schedule(blkdev->bh); > > > + > > > +done: > > > + aio_context_release(blkdev->ctx); > > > } > > > > > > static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t > > sector_number, > > > @@ -917,17 +929,40 @@ static void blk_handle_requests(struct XenBlkDev > > *blkdev) > > > static void blk_bh(void *opaque) > > > { > > > struct XenBlkDev *blkdev = opaque; > > > + > > > + aio_context_acquire(blkdev->ctx); > > > blk_handle_requests(blkdev); > > > + 
aio_context_release(blkdev->ctx); > > > } > > > > > > static void blk_alloc(struct XenDevice *xendev) > > > { > > > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > > xendev); > > > + Object *obj; > > > + char *name; > > > + Error *err = NULL; > > > + > > > + trace_xen_disk_alloc(xendev->name); > > > > > > QLIST_INIT(&blkdev->inflight); > > > QLIST_INIT(&blkdev->finished); > > > QLIST_INIT(&blkdev->freelist); > > > - blkdev->bh = qemu_bh_new(blk_bh, blkdev); > > > + > > > + obj = object_new(TYPE_IOTHREAD); > > > + name = g_strdup_printf("iothread-%s", xendev->name); > > > + > > > + object_property_add_child(object_get_objects_root(), name, obj, > > &err); > > > + assert(!err); > > > > Would it be enough to call object_ref? > > > > You mean to avoid the assert? I guess so but I think any failure here would be indicative of a larger problem. No, I meant calling object_ref instead of object_property_add_child. > > > > > + g_free(name); > > > + > > > + user_creatable_complete(obj, &err); > > > > Why do we need to call this? > > > > I'm not entirely sure but looking around the object code it seemed to be a necessary part of instantiation. Maybe it is not required for iothread objects, but I could not figure that out from looking at the code and comments in the header suggest it is harmless if it is not required. 
> > > > + assert(!err); > > > + > > > + blkdev->iothread = (IOThread *)object_dynamic_cast(obj, > > TYPE_IOTHREAD); > > > + blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > > > + blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > > > + > > > if (xen_mode != XEN_EMULATE) { > > > batch_maps = 1; > > > } > > > @@ -1288,6 +1327,8 @@ static int blk_connect(struct XenDevice *xendev) > > > blkdev->persistent_gnt_count = 0; > > > } > > > > > > + blk_set_aio_context(blkdev->blk, blkdev->ctx); > > > + > > > xen_be_bind_evtchn(&blkdev->xendev); > > > > > > xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " > > > @@ -1301,13 +1342,20 @@ static void blk_disconnect(struct XenDevice > > *xendev) > > > { > > > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > > xendev); > > > > > > + trace_xen_disk_disconnect(xendev->name); > > > + > > > + aio_context_acquire(blkdev->ctx); > > > + > > > if (blkdev->blk) { > > > + blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); > > > blk_detach_dev(blkdev->blk, blkdev); > > > blk_unref(blkdev->blk); > > > blkdev->blk = NULL; > > > } > > > xen_pv_unbind_evtchn(&blkdev->xendev); > > > > > > + aio_context_release(blkdev->ctx); > > > + > > > if (blkdev->sring) { > > > xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, > > > blkdev->nr_ring_ref); > > > @@ -1358,6 +1408,7 @@ static int blk_free(struct XenDevice *xendev) > > > g_free(blkdev->dev); > > > g_free(blkdev->devtype); > > > qemu_bh_delete(blkdev->bh); > > > + object_unparent(OBJECT(blkdev->iothread)); > > > > Shouldn't this be object_unref? > > > > I don't think so. I think this is required to undo what was done by calling object_property_add_child() on the root object. Right, so if object_property_add_child is not actually required, then you might be able to turn object_unparent into object_unref. 
Unfortunately I don't know enough about QOM to be able to tell which is the right way of doing things, but looking at hw/block/dataplane/virtio-blk.c, it would seem that only object_ref and object_unref are required? > Looking at other code such as object_new_with_propv() it looks like the right thing to do is to call object_unref() after calling object_property_add_child() to drop the implicit ref taken by object_new() so I'd need to add the call in blk_alloc(). ^ permalink raw reply [flat|nested] 24+ messages in thread
> > > > + assert(!err); > > > + > > > + blkdev->iothread = (IOThread *)object_dynamic_cast(obj, > > TYPE_IOTHREAD); > > > + blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > > > + blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > > > + > > > if (xen_mode != XEN_EMULATE) { > > > batch_maps = 1; > > > } > > > @@ -1288,6 +1327,8 @@ static int blk_connect(struct XenDevice *xendev) > > > blkdev->persistent_gnt_count = 0; > > > } > > > > > > + blk_set_aio_context(blkdev->blk, blkdev->ctx); > > > + > > > xen_be_bind_evtchn(&blkdev->xendev); > > > > > > xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " > > > @@ -1301,13 +1342,20 @@ static void blk_disconnect(struct XenDevice > > *xendev) > > > { > > > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > > xendev); > > > > > > + trace_xen_disk_disconnect(xendev->name); > > > + > > > + aio_context_acquire(blkdev->ctx); > > > + > > > if (blkdev->blk) { > > > + blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); > > > blk_detach_dev(blkdev->blk, blkdev); > > > blk_unref(blkdev->blk); > > > blkdev->blk = NULL; > > > } > > > xen_pv_unbind_evtchn(&blkdev->xendev); > > > > > > + aio_context_release(blkdev->ctx); > > > + > > > if (blkdev->sring) { > > > xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, > > > blkdev->nr_ring_ref); > > > @@ -1358,6 +1408,7 @@ static int blk_free(struct XenDevice *xendev) > > > g_free(blkdev->dev); > > > g_free(blkdev->devtype); > > > qemu_bh_delete(blkdev->bh); > > > + object_unparent(OBJECT(blkdev->iothread)); > > > > Shouldn't this be object_unref? > > > > I don't think so. I think this is required to undo what was done by calling object_property_add_child() on the root object. Right, so if object_property_add_child is not actually required, then you might be able to turn object_unparent into object_unref. 
Unfortunately I don't know enough about QOM to be able to tell which is the right way of doing things, but looking at hw/block/dataplane/virtio-blk.c, it would seem that only object_ref and object_unref are required? > Looking at other code such as object_new_with_propv() it looks like the right thing to do is to call object_unref() after calling object_property_add_child() to drop the implicit ref taken by object_new() so I'd need to add the call in blk_alloc().
* Re: [PATCH v2 3/3] xen-disk: use an IOThread per instance 2017-07-07 22:06 ` [Qemu-devel] " Stefano Stabellini @ 2017-07-10 12:11 ` Paul Durrant 2017-07-10 12:11 ` [Qemu-devel] " Paul Durrant 1 sibling, 0 replies; 24+ messages in thread From: Paul Durrant @ 2017-07-10 12:11 UTC (permalink / raw) To: 'Stefano Stabellini' Cc: Kevin Wolf, qemu-block, armbru, qemu-devel, Max Reitz, Anthony Perard, xen-devel, afaerber > -----Original Message----- [snip] > > > > + object_unparent(OBJECT(blkdev->iothread)); > > > > > > Shouldn't this be object_unref? > > > > > > > I don't think so. I think this is required to undo what was done by calling > object_property_add_child() on the root object. > > Right, so if object_property_add_child is not actually required, then > you might be able to turn object_unparent into object_unref. > > Unfortunately I don't know enough about QOM to be able to tell which is > the right way of doing things, but looking at > hw/block/dataplane/virtio-blk.c, it would seem that only object_ref and > object_unref are required? > I guess I can give it a try. I was working on the assumption that all objects were required to have a parent, but maybe that's not true. Can someone more familiar with QOM comment? Cheers, Paul > > > Looking at other code such as object_new_with_propv() it looks like the > right thing to do is to call object_unref() after calling > object_property_add_child() to drop the implicit ref taken by object_new() > so I'd need to add the call in blk_alloc(). _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 3/3] xen-disk: use an IOThread per instance 2017-06-22 22:14 ` [Qemu-devel] " Stefano Stabellini 2017-07-07 8:20 ` Paul Durrant @ 2017-07-07 8:20 ` Paul Durrant 1 sibling, 0 replies; 24+ messages in thread From: Paul Durrant @ 2017-07-07 8:20 UTC (permalink / raw) To: 'Stefano Stabellini' Cc: Kevin Wolf, qemu-block, qemu-devel, Max Reitz, Anthony Perard, xen-devel, afaerber > -----Original Message----- > From: Stefano Stabellini [mailto:sstabellini@kernel.org] > Sent: 22 June 2017 23:15 > To: Paul Durrant <Paul.Durrant@citrix.com> > Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu- > block@nongnu.org; Stefano Stabellini <sstabellini@kernel.org>; Anthony > Perard <anthony.perard@citrix.com>; Kevin Wolf <kwolf@redhat.com>; > Max Reitz <mreitz@redhat.com>; afaerber@suse.de > Subject: Re: [PATCH v2 3/3] xen-disk: use an IOThread per instance > > CC'ing Andreas Färber. Could you please give a quick look below at the > way the iothread object is instantiate and destroyed? I am no object > model expert and would appreaciate a second opinion. > I have not seen any response so far. > > On Wed, 21 Jun 2017, Paul Durrant wrote: > > This patch allocates an IOThread object for each xen_disk instance and > > sets the AIO context appropriately on connect. This allows processing > > of I/O to proceed in parallel. > > > > The patch also adds tracepoints into xen_disk to make it possible to > > follow the state transtions of an instance in the log. 
> > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com> > > --- > > Cc: Stefano Stabellini <sstabellini@kernel.org> > > Cc: Anthony Perard <anthony.perard@citrix.com> > > Cc: Kevin Wolf <kwolf@redhat.com> > > Cc: Max Reitz <mreitz@redhat.com> > > > > v2: > > - explicitly acquire and release AIO context in qemu_aio_complete() and > > blk_bh() > > --- > > hw/block/trace-events | 7 ++++++ > > hw/block/xen_disk.c | 69 > ++++++++++++++++++++++++++++++++++++++++++++------- > > 2 files changed, 67 insertions(+), 9 deletions(-) > > > > diff --git a/hw/block/trace-events b/hw/block/trace-events > > index 65e83dc258..608b24ba66 100644 > > --- a/hw/block/trace-events > > +++ b/hw/block/trace-events > > @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int > num_reqs, uint64_t offset, > > # hw/block/hd-geometry.c > > hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p > LCHS %d %d %d" > > hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t > secs, int trans) "blk %p CHS %u %u %u trans %d" > > + > > +# hw/block/xen_disk.c > > +xen_disk_alloc(char *name) "%s" > > +xen_disk_init(char *name) "%s" > > +xen_disk_connect(char *name) "%s" > > +xen_disk_disconnect(char *name) "%s" > > +xen_disk_free(char *name) "%s" > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > > index 0e6513708e..8548195195 100644 > > --- a/hw/block/xen_disk.c > > +++ b/hw/block/xen_disk.c > > @@ -27,10 +27,13 @@ > > #include "hw/xen/xen_backend.h" > > #include "xen_blkif.h" > > #include "sysemu/blockdev.h" > > +#include "sysemu/iothread.h" > > #include "sysemu/block-backend.h" > > #include "qapi/error.h" > > #include "qapi/qmp/qdict.h" > > #include "qapi/qmp/qstring.h" > > +#include "qom/object_interfaces.h" > > +#include "trace.h" > > > > /* ------------------------------------------------------------- */ > > > > @@ -128,6 +131,9 @@ struct XenBlkDev { > > DriveInfo *dinfo; > > BlockBackend *blk; > > QEMUBH *bh; > > + > > + IOThread 
*iothread; > > + AioContext *ctx; > > }; > > > > /* ------------------------------------------------------------- */ > > @@ -599,9 +605,12 @@ static int ioreq_runio_qemu_aio(struct ioreq > *ioreq); > > static void qemu_aio_complete(void *opaque, int ret) > > { > > struct ioreq *ioreq = opaque; > > + struct XenBlkDev *blkdev = ioreq->blkdev; > > + > > + aio_context_acquire(blkdev->ctx); > > I think that Paolo was right that we need a aio_context_acquire here, > however the issue is that with the current code: > > blk_handle_requests -> ioreq_runio_qemu_aio -> qemu_aio_complete > > leading to aio_context_acquire being called twice on the same lock, > which I don't think is allowed? It resolves to a qemu_rec_mutex_lock() which I believed is a recursive lock, so I think that's ok. > > I think we need to get rid of the qemu_aio_complete call from > ioreq_runio_qemu_aio, but to do that we need to be careful with the > accounting of aio_inflight (today it's incremented unconditionally at > the beginning of ioreq_runio_qemu_aio, I think we would have to change > that to increment it only if presync). > If the lock is indeed recursive then I think we can avoid this complication. > > > if (ret != 0) { > > - xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n", > > + xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n", > > ioreq->req.operation == BLKIF_OP_READ ? 
"read" : "write"); > > ioreq->aio_errors++; > > } > > @@ -610,13 +619,13 @@ static void qemu_aio_complete(void *opaque, int > ret) > > if (ioreq->presync) { > > ioreq->presync = 0; > > ioreq_runio_qemu_aio(ioreq); > > - return; > > + goto done; > > } > > if (ioreq->aio_inflight > 0) { > > - return; > > + goto done; > > } > > > > - if (ioreq->blkdev->feature_grant_copy) { > > + if (blkdev->feature_grant_copy) { > > switch (ioreq->req.operation) { > > case BLKIF_OP_READ: > > /* in case of failure ioreq->aio_errors is increased */ > > @@ -638,7 +647,7 @@ static void qemu_aio_complete(void *opaque, int > ret) > > } > > > > ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : > BLKIF_RSP_OKAY; > > - if (!ioreq->blkdev->feature_grant_copy) { > > + if (!blkdev->feature_grant_copy) { > > ioreq_unmap(ioreq); > > } > > ioreq_finish(ioreq); > > @@ -650,16 +659,19 @@ static void qemu_aio_complete(void *opaque, int > ret) > > } > > case BLKIF_OP_READ: > > if (ioreq->status == BLKIF_RSP_OKAY) { > > - block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > > + block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct); > > } else { > > - block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > > + block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct); > > } > > break; > > case BLKIF_OP_DISCARD: > > default: > > break; > > } > > - qemu_bh_schedule(ioreq->blkdev->bh); > > + qemu_bh_schedule(blkdev->bh); > > + > > +done: > > + aio_context_release(blkdev->ctx); > > } > > > > static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t > sector_number, > > @@ -917,17 +929,40 @@ static void blk_handle_requests(struct XenBlkDev > *blkdev) > > static void blk_bh(void *opaque) > > { > > struct XenBlkDev *blkdev = opaque; > > + > > + aio_context_acquire(blkdev->ctx); > > blk_handle_requests(blkdev); > > + aio_context_release(blkdev->ctx); > > } > > > > static void blk_alloc(struct XenDevice *xendev) > > { > > struct XenBlkDev *blkdev = 
container_of(xendev, struct XenBlkDev, > xendev); > > + Object *obj; > > + char *name; > > + Error *err = NULL; > > + > > + trace_xen_disk_alloc(xendev->name); > > > > QLIST_INIT(&blkdev->inflight); > > QLIST_INIT(&blkdev->finished); > > QLIST_INIT(&blkdev->freelist); > > - blkdev->bh = qemu_bh_new(blk_bh, blkdev); > > + > > + obj = object_new(TYPE_IOTHREAD); > > + name = g_strdup_printf("iothread-%s", xendev->name); > > + > > + object_property_add_child(object_get_objects_root(), name, obj, > &err); > > + assert(!err); > > Would it be enough to call object_ref? > You mean to avoid the assert? I guess so but I think any failure here would be indicative of a larger problem. > > > + g_free(name); > > + > > + user_creatable_complete(obj, &err); > > Why do we need to call this? > I'm not entirely sure but looking around the object code it seemed to be a necessary part of instantiation. Maybe it is not required for iothread objects, but I could not figure that out from looking at the code and comments in the header suggest it is harmless if it is not required. 
> > > + assert(!err); > > + > > + blkdev->iothread = (IOThread *)object_dynamic_cast(obj, > TYPE_IOTHREAD); > > + blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > > + blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > > + > > if (xen_mode != XEN_EMULATE) { > > batch_maps = 1; > > } > > @@ -1288,6 +1327,8 @@ static int blk_connect(struct XenDevice *xendev) > > blkdev->persistent_gnt_count = 0; > > } > > > > + blk_set_aio_context(blkdev->blk, blkdev->ctx); > > + > > xen_be_bind_evtchn(&blkdev->xendev); > > > > xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " > > @@ -1301,13 +1342,20 @@ static void blk_disconnect(struct XenDevice > *xendev) > > { > > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > xendev); > > > > + trace_xen_disk_disconnect(xendev->name); > > + > > + aio_context_acquire(blkdev->ctx); > > + > > if (blkdev->blk) { > > + blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); > > blk_detach_dev(blkdev->blk, blkdev); > > blk_unref(blkdev->blk); > > blkdev->blk = NULL; > > } > > xen_pv_unbind_evtchn(&blkdev->xendev); > > > > + aio_context_release(blkdev->ctx); > > + > > if (blkdev->sring) { > > xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, > > blkdev->nr_ring_ref); > > @@ -1358,6 +1408,7 @@ static int blk_free(struct XenDevice *xendev) > > g_free(blkdev->dev); > > g_free(blkdev->devtype); > > qemu_bh_delete(blkdev->bh); > > + object_unparent(OBJECT(blkdev->iothread)); > > Shouldn't this be object_unref? > I don't think so. I think this is required to undo what was done by calling object_property_add_child() on the root object. Looking at other code such as object_new_with_propv() it looks like the right thing to do is to call object_unref() after calling object_property_add_child() to drop the implicit ref taken by object_new() so I'd need to add the call in blk_alloc(). 
Paul > > return 0; > > }
* Re: [PATCH v2 3/3] xen-disk: use an IOThread per instance 2017-06-21 12:52 ` Paul Durrant (?) (?) @ 2017-06-22 22:14 ` Stefano Stabellini -1 siblings, 0 replies; 24+ messages in thread From: Stefano Stabellini @ 2017-06-22 22:14 UTC (permalink / raw) To: Paul Durrant Cc: Kevin Wolf, Stefano Stabellini, qemu-block, qemu-devel, Max Reitz, Anthony Perard, xen-devel, afaerber [-- Attachment #1: Type: TEXT/PLAIN, Size: 8170 bytes --] CC'ing Andreas Färber. Could you please give a quick look below at the way the iothread object is instantiate and destroyed? I am no object model expert and would appreaciate a second opinion. On Wed, 21 Jun 2017, Paul Durrant wrote: > This patch allocates an IOThread object for each xen_disk instance and > sets the AIO context appropriately on connect. This allows processing > of I/O to proceed in parallel. > > The patch also adds tracepoints into xen_disk to make it possible to > follow the state transtions of an instance in the log. > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com> > --- > Cc: Stefano Stabellini <sstabellini@kernel.org> > Cc: Anthony Perard <anthony.perard@citrix.com> > Cc: Kevin Wolf <kwolf@redhat.com> > Cc: Max Reitz <mreitz@redhat.com> > > v2: > - explicitly acquire and release AIO context in qemu_aio_complete() and > blk_bh() > --- > hw/block/trace-events | 7 ++++++ > hw/block/xen_disk.c | 69 ++++++++++++++++++++++++++++++++++++++++++++------- > 2 files changed, 67 insertions(+), 9 deletions(-) > > diff --git a/hw/block/trace-events b/hw/block/trace-events > index 65e83dc258..608b24ba66 100644 > --- a/hw/block/trace-events > +++ b/hw/block/trace-events > @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int num_reqs, uint64_t offset, > # hw/block/hd-geometry.c > hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d" > hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d" > + > +# 
hw/block/xen_disk.c > +xen_disk_alloc(char *name) "%s" > +xen_disk_init(char *name) "%s" > +xen_disk_connect(char *name) "%s" > +xen_disk_disconnect(char *name) "%s" > +xen_disk_free(char *name) "%s" > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 0e6513708e..8548195195 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -27,10 +27,13 @@ > #include "hw/xen/xen_backend.h" > #include "xen_blkif.h" > #include "sysemu/blockdev.h" > +#include "sysemu/iothread.h" > #include "sysemu/block-backend.h" > #include "qapi/error.h" > #include "qapi/qmp/qdict.h" > #include "qapi/qmp/qstring.h" > +#include "qom/object_interfaces.h" > +#include "trace.h" > > /* ------------------------------------------------------------- */ > > @@ -128,6 +131,9 @@ struct XenBlkDev { > DriveInfo *dinfo; > BlockBackend *blk; > QEMUBH *bh; > + > + IOThread *iothread; > + AioContext *ctx; > }; > > /* ------------------------------------------------------------- */ > @@ -599,9 +605,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq); > static void qemu_aio_complete(void *opaque, int ret) > { > struct ioreq *ioreq = opaque; > + struct XenBlkDev *blkdev = ioreq->blkdev; > + > + aio_context_acquire(blkdev->ctx); I think that Paolo was right that we need a aio_context_acquire here, however the issue is that with the current code: blk_handle_requests -> ioreq_runio_qemu_aio -> qemu_aio_complete leading to aio_context_acquire being called twice on the same lock, which I don't think is allowed? I think we need to get rid of the qemu_aio_complete call from ioreq_runio_qemu_aio, but to do that we need to be careful with the accounting of aio_inflight (today it's incremented unconditionally at the beginning of ioreq_runio_qemu_aio, I think we would have to change that to increment it only if presync). 
> if (ret != 0) { > - xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n", > + xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n", > ioreq->req.operation == BLKIF_OP_READ ? "read" : "write"); > ioreq->aio_errors++; > } > @@ -610,13 +619,13 @@ static void qemu_aio_complete(void *opaque, int ret) > if (ioreq->presync) { > ioreq->presync = 0; > ioreq_runio_qemu_aio(ioreq); > - return; > + goto done; > } > if (ioreq->aio_inflight > 0) { > - return; > + goto done; > } > > - if (ioreq->blkdev->feature_grant_copy) { > + if (blkdev->feature_grant_copy) { > switch (ioreq->req.operation) { > case BLKIF_OP_READ: > /* in case of failure ioreq->aio_errors is increased */ > @@ -638,7 +647,7 @@ static void qemu_aio_complete(void *opaque, int ret) > } > > ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY; > - if (!ioreq->blkdev->feature_grant_copy) { > + if (!blkdev->feature_grant_copy) { > ioreq_unmap(ioreq); > } > ioreq_finish(ioreq); > @@ -650,16 +659,19 @@ static void qemu_aio_complete(void *opaque, int ret) > } > case BLKIF_OP_READ: > if (ioreq->status == BLKIF_RSP_OKAY) { > - block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > + block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct); > } else { > - block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct); > + block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct); > } > break; > case BLKIF_OP_DISCARD: > default: > break; > } > - qemu_bh_schedule(ioreq->blkdev->bh); > + qemu_bh_schedule(blkdev->bh); > + > +done: > + aio_context_release(blkdev->ctx); > } > > static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t sector_number, > @@ -917,17 +929,40 @@ static void blk_handle_requests(struct XenBlkDev *blkdev) > static void blk_bh(void *opaque) > { > struct XenBlkDev *blkdev = opaque; > + > + aio_context_acquire(blkdev->ctx); > blk_handle_requests(blkdev); > + aio_context_release(blkdev->ctx); > } > > static void blk_alloc(struct XenDevice *xendev) > { > struct 
XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); > + Object *obj; > + char *name; > + Error *err = NULL; > + > + trace_xen_disk_alloc(xendev->name); > > QLIST_INIT(&blkdev->inflight); > QLIST_INIT(&blkdev->finished); > QLIST_INIT(&blkdev->freelist); > - blkdev->bh = qemu_bh_new(blk_bh, blkdev); > + > + obj = object_new(TYPE_IOTHREAD); > + name = g_strdup_printf("iothread-%s", xendev->name); > + > + object_property_add_child(object_get_objects_root(), name, obj, &err); > + assert(!err); Would it be enough to call object_ref? > + g_free(name); > + > + user_creatable_complete(obj, &err); Why do we need to call this? > + assert(!err); > + > + blkdev->iothread = (IOThread *)object_dynamic_cast(obj, TYPE_IOTHREAD); > + blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > + blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > + > if (xen_mode != XEN_EMULATE) { > batch_maps = 1; > } > @@ -1288,6 +1327,8 @@ static int blk_connect(struct XenDevice *xendev) > blkdev->persistent_gnt_count = 0; > } > > + blk_set_aio_context(blkdev->blk, blkdev->ctx); > + > xen_be_bind_evtchn(&blkdev->xendev); > > xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, nr-ring-ref %u, " > @@ -1301,13 +1342,20 @@ static void blk_disconnect(struct XenDevice *xendev) > { > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); > > + trace_xen_disk_disconnect(xendev->name); > + > + aio_context_acquire(blkdev->ctx); > + > if (blkdev->blk) { > + blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); > blk_detach_dev(blkdev->blk, blkdev); > blk_unref(blkdev->blk); > blkdev->blk = NULL; > } > xen_pv_unbind_evtchn(&blkdev->xendev); > > + aio_context_release(blkdev->ctx); > + > if (blkdev->sring) { > xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, > blkdev->nr_ring_ref); > @@ -1358,6 +1408,7 @@ static int blk_free(struct XenDevice *xendev) > g_free(blkdev->dev); > g_free(blkdev->devtype); > qemu_bh_delete(blkdev->bh); > + 
object_unparent(OBJECT(blkdev->iothread)); Shouldn't this be object_unref? > return 0; > }
* Re: [Qemu-devel] [Xen-devel] [PATCH v2 0/3] xen-disk: performance improvements 2017-06-21 12:52 ` Paul Durrant @ 2017-06-27 22:07 ` Stefano Stabellini -1 siblings, 0 replies; 24+ messages in thread From: Stefano Stabellini @ 2017-06-27 22:07 UTC (permalink / raw) To: Paul Durrant; +Cc: xen-devel, qemu-devel, qemu-block On Wed, 21 Jun 2017, Paul Durrant wrote: > Paul Durrant (3): > xen-disk: only advertize feature-persistent if grant copy is not > available > xen-disk: add support for multi-page shared rings > xen-disk: use an IOThread per instance > > hw/block/trace-events | 7 ++ > hw/block/xen_disk.c | 228 +++++++++++++++++++++++++++++++++++++++----------- > 2 files changed, 188 insertions(+), 47 deletions(-) While waiting for an answer on patch #3, I sent a pull request for the first 2 patches ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [Xen-devel] [PATCH v2 0/3] xen-disk: performance improvements 2017-06-27 22:07 ` Stefano Stabellini @ 2017-06-28 12:52 ` Paul Durrant -1 siblings, 0 replies; 24+ messages in thread From: Paul Durrant @ 2017-06-28 12:52 UTC (permalink / raw) To: 'Stefano Stabellini'; +Cc: xen-devel, qemu-devel, qemu-block > -----Original Message----- > From: Stefano Stabellini [mailto:sstabellini@kernel.org] > Sent: 27 June 2017 23:07 > To: Paul Durrant <Paul.Durrant@citrix.com> > Cc: xen-devel@lists.xenproject.org; qemu-devel@nongnu.org; qemu- > block@nongnu.org > Subject: Re: [Xen-devel] [PATCH v2 0/3] xen-disk: performance > improvements > > On Wed, 21 Jun 2017, Paul Durrant wrote: > > Paul Durrant (3): > > xen-disk: only advertize feature-persistent if grant copy is not > > available > > xen-disk: add support for multi-page shared rings > > xen-disk: use an IOThread per instance > > > > hw/block/trace-events | 7 ++ > > hw/block/xen_disk.c | 228 > +++++++++++++++++++++++++++++++++++++++----------- > > 2 files changed, 188 insertions(+), 47 deletions(-) > > While waiting for an answer on patch #3, I sent a pull request for the > first 2 patches Cool. Thanks. Hopefully we won't have to wait too long for review on patch #3. Cheers, Paul ^ permalink raw reply [flat|nested] 24+ messages in thread