* fio SGL support
@ 2016-09-20 18:03 Jeff Furlong
  2016-09-21  4:03 ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Furlong @ 2016-09-20 18:03 UTC (permalink / raw)
  To: fio

Hi All,
Is it possible to add Scatter Gather List (SGL) support to fio?  Suppose we have a 16KB write to our device.  I believe fio will allocate memory in 4KB pages (depending upon machine architecture).  For SGL use, it would be valuable to define the SGL descriptors, such that we could control the 16KB write as (example): 512B, 3.5KB, 4KB, 8KB.  Similarly for reads, we may define the SGL descriptors for how to place the IO in memory.

If fio SGL support is possible, then the descriptors would be passed down to the kernel and driver level, where SGL supported drivers would be required.  However, the latest Linux kernels are offering support.

If fio SGL support is possible, adding an option such as sglsplit (similar to bssplit) may be an easy way to specify the descriptors.

Thanks.

Regards,
Jeff



* Re: fio SGL support
  2016-09-20 18:03 fio SGL support Jeff Furlong
@ 2016-09-21  4:03 ` Jens Axboe
  2016-09-21 23:14   ` Jeff Furlong
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2016-09-21  4:03 UTC (permalink / raw)
  To: Jeff Furlong, fio

On 09/20/2016 12:03 PM, Jeff Furlong wrote:
> Hi All,
> Is it possible to add Scatter Gather List (SGL) support to fio?
> Suppose we have a 16KB write to our device.  I believe fio will
> allocate memory in 4KB pages (depending upon machine architecture).
> For SGL use, it would be valuable to define the SGL descriptors, such
> that we could control the 16KB write as (example): 512B, 3.5KB, 4KB,
> 8KB.  Similarly for reads, we may define the SGL descriptors for how
> to place the IO in memory.

Fio will allocate buffers in multiples of the maximum blocksize
specified in the job. If you ask for bs=16k or similar, then fio will
allocate a virtually contiguous block of memory that is 16k in length
and properly aligned, if needed.

> If fio SGL support is possible, then the descriptors would be passed
> down to the kernel and driver level, where SGL supported drivers would
> be required.  However, the latest Linux kernels are offering support.

What interface is this?

> If fio SGL support is possible, adding an option such as sglsplit
> (similar to bssplit) may be an easy way to specify the descriptors.

I'm a little puzzled by this. Since fio doesn't deal with physical
addresses, any IO that fio performs can be described by a virtual
address and a total length. That may map to different physical pages in
the case of O_DIRECT, and subsequently an SG list will be needed on the
driver side to describe those pages in a single command.

So you probably have to be a bit more specific here.

-- 
Jens Axboe




* RE: fio SGL support
  2016-09-21  4:03 ` Jens Axboe
@ 2016-09-21 23:14   ` Jeff Furlong
  2016-09-21 23:26     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Furlong @ 2016-09-21 23:14 UTC (permalink / raw)
  To: Jens Axboe, fio

I believe kernel 4.8 offers an NVMe over fabrics driver that supports SGL descriptors.

In the case of a 16KB IO that is described as 512B, 3.5KB, 4KB, and 8KB, there would be separate data transfers to the device under test (such as an NVMe over fabrics device).  In the case of a 16KB IO allocated as contiguous pages, one large data transfer could be sufficient.  Is it possible to force one way or the other with fio (assuming the driver has support)?  We could simply use bssplit as 512B/3.5KB/4KB/8KB, but then the 8KB might actually be on noncontiguous pages, in which case it could really end up as 512B/3.5KB/4KB/4KB/4KB.
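
To make the worst case concrete, here is a minimal Python sketch (not fio or driver code) that splits back-to-back segments wherever they cross a page boundary. It assumes a 4 KiB page size, a page-aligned start, and that no two adjacent pages happen to be physically contiguous; the helper name `split_at_page_boundaries` is hypothetical.

```python
PAGE_SIZE = 4096  # assumed host page size; this varies by architecture


def split_at_page_boundaries(segments, start=0, page_size=PAGE_SIZE):
    """Split back-to-back segments so that no piece crosses a page boundary.

    This models the worst case, where every page is physically
    discontiguous from its neighbour and needs its own descriptor.
    """
    pieces = []
    offset = start
    for length in segments:
        remaining = length
        while remaining > 0:
            # Bytes left before the next page boundary.
            room = page_size - (offset % page_size)
            piece = min(remaining, room)
            pieces.append(piece)
            offset += piece
            remaining -= piece
    return pieces


requested = [512, 3584, 4096, 8192]  # 512B, 3.5KB, 4KB, 8KB
worst_case = split_at_page_boundaries(requested)
print(sum(requested), worst_case)
# prints: 16384 [512, 3584, 4096, 4096, 4096]
```

Only the 8KB segment is split, because it is the only one that spans more than one page in this layout.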

Thanks.

Regards,
Jeff


* Re: fio SGL support
  2016-09-21 23:14   ` Jeff Furlong
@ 2016-09-21 23:26     ` Jens Axboe
  2016-09-22  0:05       ` Jeff Furlong
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2016-09-21 23:26 UTC (permalink / raw)
  To: Jeff Furlong, fio

On 09/21/2016 05:14 PM, Jeff Furlong wrote:
> I believe kernel 4.8 offers an NVMe over fabrics driver that supports
> SGL descriptors.

This is just NVMe being a bit different from everyone else; basically
all storage drivers use SG lists. This is completely detached from any
application-level interface. It's just an artifact of how physical
memory is mapped for DMA and how that information is passed to the
driver.

> In the case of a 16KB IO that is described as 512B, 3.5KB, 4KB, and
> 8KB, there would be separate data transfers to the device under test
> (such as NVMe over fabrics device).  In the case of a 16KB IO
> allocated as contiguous pages, one large data transfer could be
> sufficient.  Is it possible to force one way or the other with fio
> (assuming the driver has support)?  We could simply use bssplit as
> 512B/3.5KB/4KB/8KB, but then the 8KB might actually be on
> noncontiguous pages, in which case it could really end up as
> 512B/3.5KB/4KB/4KB/4KB.

I'm still confused; honestly, I don't see the point of this at all. Fio
makes no attempt to allocate (nor can it) physical pages, so we can't
control how this memory ends up being laid out. The best we can do is
align buffers to the same offset as their length, assuming we then get
pages mapped that are physically contiguous. This will reduce the
required number of SG elements on the driver side. 'iomem_align' will
align the IO buffers to the given alignment. If you do:

bssplit=512B/3.5KB/4KB/8KB
iomem_align=8k

then ANY IO buffer will be aligned to 8k, and 8k in length, even if we
only do shorter transfers to it.
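
A rough Python sketch of why alignment helps, assuming 4 KiB pages: it counts how many pages an 8k buffer touches before and after rounding its start address up to an 8k boundary. The address and the helper names (`align_up`, `pages_spanned`) are made up for illustration. Whether the touched pages are physically contiguous is still up to the kernel and is not visible here.

```python
PAGE_SIZE = 4096  # assumed host page size


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def pages_spanned(addr, length, page_size=PAGE_SIZE):
    """Number of pages touched by the byte range [addr, addr + length)."""
    return (addr + length - 1) // page_size - addr // page_size + 1


raw = 0x7F0000000200            # hypothetical unaligned virtual address
aligned = align_up(raw, 8192)   # what an 8k alignment would give

print(pages_spanned(raw, 8192), pages_spanned(aligned, 8192))
# prints: 3 2
```

The unaligned buffer straddles three pages, so it needs at least one more SG element than the aligned one even in the best case.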

I hope that explains it.

-- 
Jens Axboe




* RE: fio SGL support
  2016-09-21 23:26     ` Jens Axboe
@ 2016-09-22  0:05       ` Jeff Furlong
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff Furlong @ 2016-09-22  0:05 UTC (permalink / raw)
  To: Jens Axboe, fio

The motivation for the application (fio) to allocate physical pages is really about performance.  If the IO size is more extreme, say 1MB, then descriptors of 512B each vs 1MB each means potentially many DMAs.  All of those DMAs can really change performance.  

The iomem_align seems workable.  My understanding now is that in the case of:

bssplit=512B/3.5KB/4KB/8KB
iomem_align=8k

then all bs buffers are 8KB aligned, and on a best-effort basis (it is up to the kernel) each 8KB buffer is backed by two physically contiguous pages (assuming a 4KB physical page size on the host).  In the case above, the 512B/3.5KB/4KB block sizes only use (DMA) one page of the 8KB-aligned buffer.

In the case of the extreme 1MB IO size, we can use bssplit and iomem_align to cause 512B DMAs.  However, the probability of getting a single 1MB DMA is probably low.  I'm guessing the odds of the kernel allocating a 1MB region of physically contiguous pages (4KB each) are small, especially when that has to happen once per buffer, iodepth times over.  But I think that is more of an extreme usage.
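
As a back-of-the-envelope check on those numbers, a minimal Python sketch that counts how many fixed-size descriptors are needed to cover a 1MB IO. The helper name `descriptor_count` is hypothetical, and a real SG list may mix descriptor sizes.

```python
def descriptor_count(io_size, descriptor_size):
    """Descriptors needed to cover io_size bytes (ceiling division)."""
    return -(-io_size // descriptor_size)


ONE_MIB = 1024 * 1024

for size in (512, 4096, ONE_MIB):
    print(size, descriptor_count(ONE_MIB, size))
# prints:
# 512 2048
# 4096 256
# 1048576 1
```

So the spread is between 2048 transfers with 512B descriptors, 256 with one descriptor per 4KB page, and a single transfer if the whole 1MB were physically contiguous.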

Thanks for the help.

Regards,
Jeff

