All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
@ 2021-01-20 23:01 Peter Lieven
  2021-01-21 19:42 ` Jason Dillaman
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Lieven @ 2021-01-20 23:01 UTC (permalink / raw)
  To: dillaman; +Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz


> Am 19.01.2021 um 15:20 schrieb Jason Dillaman <jdillama@redhat.com>:
> 
> On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven <pl@kamp.de> wrote:
>>> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
>>> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
>>>> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
>>>>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>>>>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
>>>>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>>>>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>>>>>>>> the 512 byte alignment.
>>>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>>>>> ---
>>>>>>>> block/rbd.c | 2 --
>>>>>>>> 1 file changed, 2 deletions(-)
>>>>>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>>>>>> index 27b4404adf..8673e8f553 100644
>>>>>>>> --- a/block/rbd.c
>>>>>>>> +++ b/block/rbd.c
>>>>>>>> @@ -223,8 +223,6 @@ done:
>>>>>>>> static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>>> {
>>>>>>>>    BDRVRBDState *s = bs->opaque;
>>>>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>>>>>>>> -    bs->bl.request_alignment = 512;
>>>>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
>>>>>>> optimal alignment (if that's something QEMU handles internally) if not
>>>>>>> overridden by the user.
>>>>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>>>>>> from librbd?
>>>>>> What do you mean by optimal_alignment? The object size?
>>>>> krbd does a good job of initializing defaults [1] where optimal and
>>>>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
>>>>> writes, discards, and write-zeroes is the object size * the stripe
>>>>> count.
>>>> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
>>>> obj_size  * stripe count will librbd split it internally or will the request fail?
>>> librbd will handle it as needed. My goal is really just to get the
>>> hints down the guest OS.
>>>> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
>>>> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
>>> librbd doesn't really have the information either. The 64KiB guess
>>> that krbd uses was a compromise since that was the default OSD
>>> allocation size for HDDs since Luminous. Starting with Pacific that
>>> default is going down to 4KiB.
>> I will try to adjust these values as far as it is possible and makes sense.
>> Is there a way to check the minimum supported OSD release in the backend from librbd / librados?
> 
> It's not a minimum -- RADOS will gladly access 1 byte writes as well.
> It's really just the optimal (performance and space-wise). Sadly,
> there is no realistic way to query this data from the backend.

So you would suggest to advertise an optimal transfer length of 64k and max transfer length of obj size * stripe count to the guest unless we have an API in the future to query these limits from the backend?

I would leave request alignment at 1 byte as otherwise Qemu will issue RMWs for all write requests that do not align. Everything that comes from a guest OS is very likely 4k aligned anyway.

Peter




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-20 23:01 [PATCH 7/7] block/rbd: change request alignment to 1 byte Peter Lieven
@ 2021-01-21 19:42 ` Jason Dillaman
  2021-01-21 20:29   ` Peter Lieven
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Dillaman @ 2021-01-21 19:42 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

On Wed, Jan 20, 2021 at 6:01 PM Peter Lieven <pl@kamp.de> wrote:
>
>
> > Am 19.01.2021 um 15:20 schrieb Jason Dillaman <jdillama@redhat.com>:
> >
> > On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven <pl@kamp.de> wrote:
> >>> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
> >>> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
> >>>> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> >>>>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
> >>>>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> >>>>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >>>>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >>>>>>>> the 512 byte alignment.
> >>>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>>>>>> ---
> >>>>>>>> block/rbd.c | 2 --
> >>>>>>>> 1 file changed, 2 deletions(-)
> >>>>>>>> diff --git a/block/rbd.c b/block/rbd.c
> >>>>>>>> index 27b4404adf..8673e8f553 100644
> >>>>>>>> --- a/block/rbd.c
> >>>>>>>> +++ b/block/rbd.c
> >>>>>>>> @@ -223,8 +223,6 @@ done:
> >>>>>>>> static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>>>>>> {
> >>>>>>>>    BDRVRBDState *s = bs->opaque;
> >>>>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >>>>>>>> -    bs->bl.request_alignment = 512;
> >>>>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
> >>>>>>> optimal alignment (if that's something QEMU handles internally) if not
> >>>>>>> overridden by the user.
> >>>>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
> >>>>>> from librbd?
> >>>>>> What do you mean by optimal_alignment? The object size?
> >>>>> krbd does a good job of initializing defaults [1] where optimal and
> >>>>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> >>>>> writes, discards, and write-zeroes is the object size * the stripe
> >>>>> count.
> >>>> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
> >>>> obj_size  * stripe count will librbd split it internally or will the request fail?
> >>> librbd will handle it as needed. My goal is really just to get the
> >>> hints down the guest OS.
> >>>> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
> >>>> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
> >>> librbd doesn't really have the information either. The 64KiB guess
> >>> that krbd uses was a compromise since that was the default OSD
> >>> allocation size for HDDs since Luminous. Starting with Pacific that
> >>> default is going down to 4KiB.
> >> I will try to adjust these values as far as it is possible and makes sense.
> >> Is there a way to check the minimum supported OSD release in the backend from librbd / librados?
> >
> > It's not a minimum -- RADOS will gladly access 1 byte writes as well.
> > It's really just the optimal (performance and space-wise). Sadly,
> > there is no realistic way to query this data from the backend.
>
> So you would suggest to advertise an optimal transfer length of 64k and max transfer length of obj size * stripe count to the guest unless we have an API in the future to query these limits from the backend?

I'll open a Ceph tracker ticket to expose these via the API in a future release.

> I would leave request alignment at 1 byte as otherwise Qemu will issue RMWs for all write requests that do not align. Everything that comes from a guest OS is very likely 4k aligned anyway.

My goal is definitely not to have QEMU do any extra work for splitting
or aligning IOs. I am really only trying to get hints passed down the
guest via the virtio drivers. If there isn't the plumbing in QEMU for
that yet, disregard.

> Peter
>
>


-- 
Jason



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-21 19:42 ` Jason Dillaman
@ 2021-01-21 20:29   ` Peter Lieven
  2021-01-21 20:55     ` Jason Dillaman
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Lieven @ 2021-01-21 20:29 UTC (permalink / raw)
  To: dillaman; +Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

Am 21.01.21 um 20:42 schrieb Jason Dillaman:
> On Wed, Jan 20, 2021 at 6:01 PM Peter Lieven <pl@kamp.de> wrote:
>>
>>> Am 19.01.2021 um 15:20 schrieb Jason Dillaman <jdillama@redhat.com>:
>>>
>>> On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven <pl@kamp.de> wrote:
>>>>> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
>>>>> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
>>>>>> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
>>>>>>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>>>>>>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
>>>>>>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>>>>>>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>>>>>>>>>> the 512 byte alignment.
>>>>>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>>>>>>> ---
>>>>>>>>>> block/rbd.c | 2 --
>>>>>>>>>> 1 file changed, 2 deletions(-)
>>>>>>>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>>>>>>>> index 27b4404adf..8673e8f553 100644
>>>>>>>>>> --- a/block/rbd.c
>>>>>>>>>> +++ b/block/rbd.c
>>>>>>>>>> @@ -223,8 +223,6 @@ done:
>>>>>>>>>> static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>>>>> {
>>>>>>>>>>    BDRVRBDState *s = bs->opaque;
>>>>>>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>>>>>>>>>> -    bs->bl.request_alignment = 512;
>>>>>>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
>>>>>>>>> optimal alignment (if that's something QEMU handles internally) if not
>>>>>>>>> overridden by the user.
>>>>>>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>>>>>>>> from librbd?
>>>>>>>> What do you mean by optimal_alignment? The object size?
>>>>>>> krbd does a good job of initializing defaults [1] where optimal and
>>>>>>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
>>>>>>> writes, discards, and write-zeroes is the object size * the stripe
>>>>>>> count.
>>>>>> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
>>>>>> obj_size  * stripe count will librbd split it internally or will the request fail?
>>>>> librbd will handle it as needed. My goal is really just to get the
>>>>> hints down the guest OS.
>>>>>> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
>>>>>> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
>>>>> librbd doesn't really have the information either. The 64KiB guess
>>>>> that krbd uses was a compromise since that was the default OSD
>>>>> allocation size for HDDs since Luminous. Starting with Pacific that
>>>>> default is going down to 4KiB.
>>>> I will try to adjust these values as far as it is possible and makes sense.
>>>> Is there a way to check the minimum supported OSD release in the backend from librbd / librados?
>>> It's not a minimum -- RADOS will gladly access 1 byte writes as well.
>>> It's really just the optimal (performance and space-wise). Sadly,
>>> there is no realistic way to query this data from the backend.
>> So you would suggest to advertise an optimal transfer length of 64k and max transfer length of obj size * stripe count to the guest unless we have an API in the future to query these limits from the backend?
> I'll open a Ceph tracker ticket to expose these via the API in a future release.


That would be good to have!


>
>> I would leave request alignment at 1 byte as otherwise Qemu will issue RMWs for all write requests that do not align. Everything that comes from a guest OS is very likely 4k aligned anyway.
> My goal is definitely not to have QEMU do any extra work for splitting
> or aligning IOs. I am really only trying to get hints passed down the
> guest via the virtio drivers. If there isn't the plumbing in QEMU for
> that yet, disregard.


From what I read from the code Qemu emulates the Block Limits VPD Page for virtio-scsi, but the limits there are not taken from the backend driver. They can be passed as config to the virtio-scsi device.

However, Qemu uses all the Block Limit that can be found in include/block/block_int.h in the BlockLimits struct to generate optimal requests if it comes to block operations like mirroring, or zeroing of a whole

device etc. Some of the alignments (e.g. the request alignment) have direct influence and make Qemu split requests from the guest or even make RMW cycles.

If my understanding is incorrect please anyone correct me.


It would indeed be nice to have an option to propagate the settings from the backend driver into the Guest. But from my understanding this is not there yet.


So I would leave it as it. Drop the request_alignment = 512 (like in the patch) and just advertise the cluster_size at obj_size. This is already in the rbd driver today.

The cluster_size e.g. is used in any qemu-img convert operation to align read / write requests and size them apropiately.


Peter




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-21 20:29   ` Peter Lieven
@ 2021-01-21 20:55     ` Jason Dillaman
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Dillaman @ 2021-01-21 20:55 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

On Thu, Jan 21, 2021 at 3:29 PM Peter Lieven <pl@kamp.de> wrote:
>
> Am 21.01.21 um 20:42 schrieb Jason Dillaman:
> > On Wed, Jan 20, 2021 at 6:01 PM Peter Lieven <pl@kamp.de> wrote:
> >>
> >>> Am 19.01.2021 um 15:20 schrieb Jason Dillaman <jdillama@redhat.com>:
> >>>
> >>> On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven <pl@kamp.de> wrote:
> >>>>> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
> >>>>> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
> >>>>>> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> >>>>>>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
> >>>>>>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> >>>>>>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >>>>>>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >>>>>>>>>> the 512 byte alignment.
> >>>>>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>>>>>>>> ---
> >>>>>>>>>> block/rbd.c | 2 --
> >>>>>>>>>> 1 file changed, 2 deletions(-)
> >>>>>>>>>> diff --git a/block/rbd.c b/block/rbd.c
> >>>>>>>>>> index 27b4404adf..8673e8f553 100644
> >>>>>>>>>> --- a/block/rbd.c
> >>>>>>>>>> +++ b/block/rbd.c
> >>>>>>>>>> @@ -223,8 +223,6 @@ done:
> >>>>>>>>>> static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>>>>>>>> {
> >>>>>>>>>>    BDRVRBDState *s = bs->opaque;
> >>>>>>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >>>>>>>>>> -    bs->bl.request_alignment = 512;
> >>>>>>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
> >>>>>>>>> optimal alignment (if that's something QEMU handles internally) if not
> >>>>>>>>> overridden by the user.
> >>>>>>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
> >>>>>>>> from librbd?
> >>>>>>>> What do you mean by optimal_alignment? The object size?
> >>>>>>> krbd does a good job of initializing defaults [1] where optimal and
> >>>>>>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> >>>>>>> writes, discards, and write-zeroes is the object size * the stripe
> >>>>>>> count.
> >>>>>> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
> >>>>>> obj_size  * stripe count will librbd split it internally or will the request fail?
> >>>>> librbd will handle it as needed. My goal is really just to get the
> >>>>> hints down the guest OS.
> >>>>>> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
> >>>>>> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
> >>>>> librbd doesn't really have the information either. The 64KiB guess
> >>>>> that krbd uses was a compromise since that was the default OSD
> >>>>> allocation size for HDDs since Luminous. Starting with Pacific that
> >>>>> default is going down to 4KiB.
> >>>> I will try to adjust these values as far as it is possible and makes sense.
> >>>> Is there a way to check the minimum supported OSD release in the backend from librbd / librados?
> >>> It's not a minimum -- RADOS will gladly access 1 byte writes as well.
> >>> It's really just the optimal (performance and space-wise). Sadly,
> >>> there is no realistic way to query this data from the backend.
> >> So you would suggest to advertise an optimal transfer length of 64k and max transfer length of obj size * stripe count to the guest unless we have an API in the future to query these limits from the backend?
> > I'll open a Ceph tracker ticket to expose these via the API in a future release.
>
>
> That would be good to have!
>
>
> >
> >> I would leave request alignment at 1 byte as otherwise Qemu will issue RMWs for all write requests that do not align. Everything that comes from a guest OS is very likely 4k aligned anyway.
> > My goal is definitely not to have QEMU do any extra work for splitting
> > or aligning IOs. I am really only trying to get hints passed down the
> > guest via the virtio drivers. If there isn't the plumbing in QEMU for
> > that yet, disregard.
>
>
> From what I read from the code Qemu emulates the Block Limits VPD Page for virtio-scsi, but the limits there are not taken from the backend driver. They can be passed as config to the virtio-scsi device.
>
> However, Qemu uses all the Block Limit that can be found in include/block/block_int.h in the BlockLimits struct to generate optimal requests if it comes to block operations like mirroring, or zeroing of a whole
>
> device etc. Some of the alignments (e.g. the request alignment) have direct influence and make Qemu split requests from the guest or even make RMW cycles.
>
> If my understanding is incorrect please anyone correct me.
>
>
> It would indeed be nice to have an option to propagate the settings from the backend driver into the Guest. But from my understanding this is not there yet.
>
>
> So I would leave it as it. Drop the request_alignment = 512 (like in the patch) and just advertise the cluster_size at obj_size. This is already in the rbd driver today.
>
> The cluster_size e.g. is used in any qemu-img convert operation to align read / write requests and size them apropiately.

+1 to leave as-is until we can pass those hints down.

> Peter
>
>

Thanks!

-- 
Jason



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-19  9:36             ` Peter Lieven
@ 2021-01-19 14:20               ` Jason Dillaman
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Dillaman @ 2021-01-19 14:20 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven <pl@kamp.de> wrote:
>
> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
> > On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
> >> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> >>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
> >>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> >>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >>>>>> the 512 byte alignment.
> >>>>>>
> >>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>>>> ---
> >>>>>>  block/rbd.c | 2 --
> >>>>>>  1 file changed, 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/block/rbd.c b/block/rbd.c
> >>>>>> index 27b4404adf..8673e8f553 100644
> >>>>>> --- a/block/rbd.c
> >>>>>> +++ b/block/rbd.c
> >>>>>> @@ -223,8 +223,6 @@ done:
> >>>>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>>>>  {
> >>>>>>      BDRVRBDState *s = bs->opaque;
> >>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >>>>>> -    bs->bl.request_alignment = 512;
> >>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
> >>>>> optimal alignment (if that's something QEMU handles internally) if not
> >>>>> overridden by the user.
> >>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
> >>>>
> >>>> from librbd?
> >>>>
> >>>>
> >>>> What do you mean by optimal_alignment? The object size?
> >>> krbd does a good job of initializing defaults [1] where optimal and
> >>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> >>> writes, discards, and write-zeroes is the object size * the stripe
> >>> count.
> >>
> >> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
> >>
> >> obj_size  * stripe count will librbd split it internally or will the request fail?
> > librbd will handle it as needed. My goal is really just to get the
> > hints down the guest OS.
> >
> >> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
> >>
> >> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
> > librbd doesn't really have the information either. The 64KiB guess
> > that krbd uses was a compromise since that was the default OSD
> > allocation size for HDDs since Luminous. Starting with Pacific that
> > default is going down to 4KiB.
>
>
> I will try to adjust these values as far as it is possible and makes sense.
>
>
> Is there a way to check the minimum supported OSD release in the backend from librbd / librados?

It's not a minimum -- RADOS will gladly access 1 byte writes as well.
It's really just the optimal (performance and space-wise). Sadly,
there is no realistic way to query this data from the backend.

>
> Anyway, I want to sent a V2 by the end of this week.
>
>
> Peter
>
>


-- 
Jason



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-18 22:33           ` Jason Dillaman
@ 2021-01-19  9:36             ` Peter Lieven
  2021-01-19 14:20               ` Jason Dillaman
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Lieven @ 2021-01-19  9:36 UTC (permalink / raw)
  To: dillaman; +Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

Am 18.01.21 um 23:33 schrieb Jason Dillaman:
> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
>> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
>>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
>>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>>>>>> the 512 byte alignment.
>>>>>>
>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>>> ---
>>>>>>  block/rbd.c | 2 --
>>>>>>  1 file changed, 2 deletions(-)
>>>>>>
>>>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>>>> index 27b4404adf..8673e8f553 100644
>>>>>> --- a/block/rbd.c
>>>>>> +++ b/block/rbd.c
>>>>>> @@ -223,8 +223,6 @@ done:
>>>>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>  {
>>>>>>      BDRVRBDState *s = bs->opaque;
>>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>>>>>> -    bs->bl.request_alignment = 512;
>>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
>>>>> optimal alignment (if that's something QEMU handles internally) if not
>>>>> overridden by the user.
>>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>>>>
>>>> from librbd?
>>>>
>>>>
>>>> What do you mean by optimal_alignment? The object size?
>>> krbd does a good job of initializing defaults [1] where optimal and
>>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
>>> writes, discards, and write-zeroes is the object size * the stripe
>>> count.
>>
>> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
>>
>> obj_size  * stripe count will librbd split it internally or will the request fail?
> librbd will handle it as needed. My goal is really just to get the
> hints down the guest OS.
>
>> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
>>
>> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
> librbd doesn't really have the information either. The 64KiB guess
> that krbd uses was a compromise since that was the default OSD
> allocation size for HDDs since Luminous. Starting with Pacific that
> default is going down to 4KiB.


I will try to adjust these values as far as it is possible and makes sense.


Is there a way to check the minimum supported OSD release in the backend from librbd / librados?


Anyway, I want to sent a V2 by the end of this week.


Peter




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-15 15:39         ` Peter Lieven
@ 2021-01-18 22:33           ` Jason Dillaman
  2021-01-19  9:36             ` Peter Lieven
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Dillaman @ 2021-01-18 22:33 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
>
> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> > On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
> >> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> >>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >>>> the 512 byte alignment.
> >>>>
> >>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>> ---
> >>>>  block/rbd.c | 2 --
> >>>>  1 file changed, 2 deletions(-)
> >>>>
> >>>> diff --git a/block/rbd.c b/block/rbd.c
> >>>> index 27b4404adf..8673e8f553 100644
> >>>> --- a/block/rbd.c
> >>>> +++ b/block/rbd.c
> >>>> @@ -223,8 +223,6 @@ done:
> >>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>>  {
> >>>>      BDRVRBDState *s = bs->opaque;
> >>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >>>> -    bs->bl.request_alignment = 512;
> >>> Just a suggestion, but perhaps improve discard alignment, max discard,
> >>> optimal alignment (if that's something QEMU handles internally) if not
> >>> overridden by the user.
> >>
> >> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
> >>
> >> from librbd?
> >>
> >>
> >> What do you mean by optimal_alignment? The object size?
> > krbd does a good job of initializing defaults [1] where optimal and
> > discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> > writes, discards, and write-zeroes is the object size * the stripe
> > count.
>
>
> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
>
> obj_size  * stripe count will librbd split it internally or will the request fail?

librbd will handle it as needed. My goal is really just to get the
hints down the guest OS.

> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
>
> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.

librbd doesn't really have the information either. The 64KiB guess
that krbd uses was a compromise since that was the default OSD
allocation size for HDDs since Luminous. Starting with Pacific that
default is going down to 4KiB.

>
> Peter
>
>


-- 
Jason



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-15 15:27       ` Jason Dillaman
@ 2021-01-15 15:39         ` Peter Lieven
  2021-01-18 22:33           ` Jason Dillaman
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Lieven @ 2021-01-15 15:39 UTC (permalink / raw)
  To: dillaman; +Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>>>> the 512 byte alignment.
>>>>
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>> ---
>>>>  block/rbd.c | 2 --
>>>>  1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>> index 27b4404adf..8673e8f553 100644
>>>> --- a/block/rbd.c
>>>> +++ b/block/rbd.c
>>>> @@ -223,8 +223,6 @@ done:
>>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>  {
>>>>      BDRVRBDState *s = bs->opaque;
>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>>>> -    bs->bl.request_alignment = 512;
>>> Just a suggestion, but perhaps improve discard alignment, max discard,
>>> optimal alignment (if that's something QEMU handles internally) if not
>>> overridden by the user.
>>
>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>>
>> from librbd?
>>
>>
>> What do you mean by optimal_alignment? The object size?
> krbd does a good job of initializing defaults [1] where optimal and
> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> writes, discards, and write-zeroes is the object size * the stripe
> count.


Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than

obj_size  * stripe count will librbd split it internally or will the request fail?


Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device

configuration and not from rbd? I don't have that information inside the Qemu RBD driver.


Peter




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-14 19:59     ` Peter Lieven
@ 2021-01-15 15:27       ` Jason Dillaman
  2021-01-15 15:39         ` Peter Lieven
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Dillaman @ 2021-01-15 15:27 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>
> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> > On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >> the 512 byte alignment.
> >>
> >> Signed-off-by: Peter Lieven <pl@kamp.de>
> >> ---
> >>  block/rbd.c | 2 --
> >>  1 file changed, 2 deletions(-)
> >>
> >> diff --git a/block/rbd.c b/block/rbd.c
> >> index 27b4404adf..8673e8f553 100644
> >> --- a/block/rbd.c
> >> +++ b/block/rbd.c
> >> @@ -223,8 +223,6 @@ done:
> >>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>  {
> >>      BDRVRBDState *s = bs->opaque;
> >> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >> -    bs->bl.request_alignment = 512;
> > Just a suggestion, but perhaps improve discard alignment, max discard,
> > optimal alignment (if that's something QEMU handles internally) if not
> > overridden by the user.
>
>
> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>
> from librbd?
>
>
> What do you mean by optimal_alignment? The object size?

krbd does a good job of initializing defaults [1] where optimal and
discard alignment is 64KiB (can actually be 4KiB now), max IO size for
writes, discards, and write-zeroes is the object size * the stripe
count.

> Peter
>
>
>

[1] https://github.com/torvalds/linux/blob/master/drivers/block/rbd.c#L4981

-- 
Jason



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2021-01-14 19:19   ` Jason Dillaman
@ 2021-01-14 19:59     ` Peter Lieven
  2021-01-15 15:27       ` Jason Dillaman
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Lieven @ 2021-01-14 19:59 UTC (permalink / raw)
  To: dillaman; +Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>> the 512 byte alignment.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>>  block/rbd.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/block/rbd.c b/block/rbd.c
>> index 27b4404adf..8673e8f553 100644
>> --- a/block/rbd.c
>> +++ b/block/rbd.c
>> @@ -223,8 +223,6 @@ done:
>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>  {
>>      BDRVRBDState *s = bs->opaque;
>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>> -    bs->bl.request_alignment = 512;
> Just a suggestion, but perhaps improve discard alignment, max discard,
> optimal alignment (if that's something QEMU handles internally) if not
> overridden by the user.


Qemu supports max_discard and discard_alignment. Is there a call to get these limits

from librbd?


What do you mean by optimal_alignment? The object size?


Peter





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2020-12-27 16:42 ` [PATCH 7/7] block/rbd: change request alignment to 1 byte Peter Lieven
@ 2021-01-14 19:19   ` Jason Dillaman
  2021-01-14 19:59     ` Peter Lieven
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Dillaman @ 2021-01-14 19:19 UTC (permalink / raw)
  To: Peter Lieven
  Cc: Kevin Wolf, Christian Theune, qemu-devel, qemu-block, Max Reitz

On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>
> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> the 512 byte alignment.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/rbd.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index 27b4404adf..8673e8f553 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -223,8 +223,6 @@ done:
>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>      BDRVRBDState *s = bs->opaque;
> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> -    bs->bl.request_alignment = 512;

Just a suggestion, but perhaps improve discard alignment, max discard,
optimal alignment (if that's something QEMU handles internally) if not
overridden by the user.


>  #ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
>      bs->bl.pwrite_zeroes_alignment = s->object_size;
>  #endif
> --
> 2.17.1
>
>


--
Jason



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 7/7] block/rbd: change request alignment to 1 byte
  2020-12-27 16:42 [PATCH 0/7] block/rbd: migrate to coroutines and add write zeroes support Peter Lieven
@ 2020-12-27 16:42 ` Peter Lieven
  2021-01-14 19:19   ` Jason Dillaman
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Lieven @ 2020-12-27 16:42 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, Peter Lieven, qemu-devel, ct, mreitz, dillaman

since we implement byte interfaces and librbd supports aio on byte granularity we can lift
the 512 byte alignment.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/rbd.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 27b4404adf..8673e8f553 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -223,8 +223,6 @@ done:
 static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVRBDState *s = bs->opaque;
-    /* XXX Does RBD support AIO on less than 512-byte alignment? */
-    bs->bl.request_alignment = 512;
 #ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
     bs->bl.pwrite_zeroes_alignment = s->object_size;
 #endif
-- 
2.17.1




^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-01-21 20:56 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20 23:01 [PATCH 7/7] block/rbd: change request alignment to 1 byte Peter Lieven
2021-01-21 19:42 ` Jason Dillaman
2021-01-21 20:29   ` Peter Lieven
2021-01-21 20:55     ` Jason Dillaman
  -- strict thread matches above, loose matches on Subject: below --
2020-12-27 16:42 [PATCH 0/7] block/rbd: migrate to coroutines and add write zeroes support Peter Lieven
2020-12-27 16:42 ` [PATCH 7/7] block/rbd: change request alignment to 1 byte Peter Lieven
2021-01-14 19:19   ` Jason Dillaman
2021-01-14 19:59     ` Peter Lieven
2021-01-15 15:27       ` Jason Dillaman
2021-01-15 15:39         ` Peter Lieven
2021-01-18 22:33           ` Jason Dillaman
2021-01-19  9:36             ` Peter Lieven
2021-01-19 14:20               ` Jason Dillaman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.