* How to utilize a PCIE4.0 SSD?
@ 2020-12-29 13:40 Stefan Lederer
  2020-12-29 16:19 ` Keith Busch
  0 siblings, 1 reply; 4+ messages in thread
From: Stefan Lederer @ 2020-12-29 13:40 UTC (permalink / raw)
  To: linux-block

Hello dear list,

(I hope I do not annoy you as a simple application programmer)

For a seminar paper at my university we reproduced the 2009 paper
"The Pathologies of Big Data" by Jacobs, in which he basically reads
a 100GB file sequentially from an HDD with some light processing.

We have a PCIe 4.0 SSD rated at up to 7GB/s for reads (Samsung 980),
but nothing we have programmed so far comes close to that speed
(plain read(), mmap() with optional threads, io_uring, multi-process),
so we wonder whether it is possible at all.

According to iostat, mmap() is the fastest at 4GB/s with a queue
depth of ~3. None of the other approaches go beyond 2.5GB/s.

We also see some strange effects, such as sequential read() with 16KB
buffers being faster than with 16MB buffers, and io_uring being a lot
slower than mmap() (all tested on Manjaro with kernels 5.8/5.10 and ext4).

So now we are quite lost and would appreciate a hint in the right
direction :)

What is necessary to simply read 100GB of data at 7GB/s?

I wish everybody a happy new year!
Stefan Lederer


* Re: How to utilize a PCIE4.0 SSD?
  2020-12-29 13:40 How to utilize a PCIE4.0 SSD? Stefan Lederer
@ 2020-12-29 16:19 ` Keith Busch
  2020-12-29 18:48   ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: Keith Busch @ 2020-12-29 16:19 UTC (permalink / raw)
  To: Stefan Lederer; +Cc: linux-block

On Tue, Dec 29, 2020 at 02:40:57PM +0100, Stefan Lederer wrote:
> Hello dear list,
> 
> (I hope I do not annoy you as a simple application programmer)
> 
> For a seminar paper at my university we reproduced the 2009 paper
> "The Pathologies of Big Data" by Jacobs, in which he basically reads
> a 100GB file sequentially from an HDD with some light processing.
> 
> We have a PCIe 4.0 SSD rated at up to 7GB/s for reads (Samsung 980),
> but nothing we have programmed so far comes close to that speed
> (plain read(), mmap() with optional threads, io_uring, multi-process),
> so we wonder whether it is possible at all.
> 
> According to iostat, mmap() is the fastest at 4GB/s with a queue
> depth of ~3. None of the other approaches go beyond 2.5GB/s.
> 
> We also see some strange effects, such as sequential read() with 16KB
> buffers being faster than with 16MB buffers, and io_uring being a lot
> slower than mmap() (all tested on Manjaro with kernels 5.8/5.10 and ext4).
> 
> So now we are quite lost and would appreciate a hint in the right
> direction :)
> 
> What is necessary to simply read 100GB of data at 7GB/s?

Is your device running at gen4 speed? The easiest way to tell with an
NVMe SSD (assuming you're reading from /dev/nvme0n1) is something like:

 # cat /sys/block/nvme0n1/device/device/current_link_speed

If it says less than 16GT/s, then it can't read at 7GB/s.


* Re: How to utilize a PCIE4.0 SSD?
  2020-12-29 16:19 ` Keith Busch
@ 2020-12-29 18:48   ` Jens Axboe
  2020-12-29 20:18     ` Stefan Lederer
  0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2020-12-29 18:48 UTC (permalink / raw)
  To: Keith Busch, Stefan Lederer; +Cc: linux-block

On 12/29/20 9:19 AM, Keith Busch wrote:
> On Tue, Dec 29, 2020 at 02:40:57PM +0100, Stefan Lederer wrote:
>> Hello dear list,
>>
>> (I hope I do not annoy you as a simple application programmer)
>>
>> For a seminar paper at my university we reproduced the 2009 paper
>> "The Pathologies of Big Data" by Jacobs, in which he basically reads
>> a 100GB file sequentially from an HDD with some light processing.
>>
>> We have a PCIe 4.0 SSD rated at up to 7GB/s for reads (Samsung 980),
>> but nothing we have programmed so far comes close to that speed
>> (plain read(), mmap() with optional threads, io_uring, multi-process),
>> so we wonder whether it is possible at all.
>>
>> According to iostat, mmap() is the fastest at 4GB/s with a queue
>> depth of ~3. None of the other approaches go beyond 2.5GB/s.
>>
>> We also see some strange effects, such as sequential read() with 16KB
>> buffers being faster than with 16MB buffers, and io_uring being a lot
>> slower than mmap() (all tested on Manjaro with kernels 5.8/5.10 and ext4).
>>
>> So now we are quite lost and would appreciate a hint in the right
>> direction :)
>>
>> What is necessary to simply read 100GB of data at 7GB/s?
> 
> Is your device running at gen4 speed? The easiest way to tell with an
> NVMe SSD (assuming you're reading from /dev/nvme0n1) is something like:
> 
>  # cat /sys/block/nvme0n1/device/device/current_link_speed
> 
> If it says less than 16GT/s, then it can't read at 7GB/s.

That does sound likely. Simple test here on a gen4 device:

# cat /sys/block/nvme3n1/device/device/current_link_speed
16.0 GT/s PCIe

# ~axboe/git/fio/fio --name=bw --filename=/dev/nvme3n1 --direct=1 --bs=32k --ioengine=io_uring --iodepth=16 --rw=randread --norandommap
[snip]
   READ: bw=6630MiB/s (6952MB/s), 6630MiB/s-6630MiB/s (6952MB/s-6952MB/s), io=36.4GiB (39.1GB), run=5621-5621msec

-- 
Jens Axboe



* Re: How to utilize a PCIE4.0 SSD?
  2020-12-29 18:48   ` Jens Axboe
@ 2020-12-29 20:18     ` Stefan Lederer
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Lederer @ 2020-12-29 20:18 UTC (permalink / raw)
  To: Jens Axboe, Keith Busch; +Cc: linux-block

Sorry for the noise!

I didn't use O_DIRECT. With it, things look much better: >5GB/s in our
program. Unbelievable how fast SSDs have become.

Thanks for the cool new interface BTW :)


On 29.12.20 19:48, Jens Axboe wrote:
> On 12/29/20 9:19 AM, Keith Busch wrote:
>> On Tue, Dec 29, 2020 at 02:40:57PM +0100, Stefan Lederer wrote:
>>> Hello dear list,
>>>
>>> (I hope I do not annoy you as a simple application programmer)
>>>
>>> For a seminar paper at my university we reproduced the 2009 paper
>>> "The Pathologies of Big Data" by Jacobs, in which he basically reads
>>> a 100GB file sequentially from an HDD with some light processing.
>>>
>>> We have a PCIe 4.0 SSD rated at up to 7GB/s for reads (Samsung 980),
>>> but nothing we have programmed so far comes close to that speed
>>> (plain read(), mmap() with optional threads, io_uring, multi-process),
>>> so we wonder whether it is possible at all.
>>>
>>> According to iostat, mmap() is the fastest at 4GB/s with a queue
>>> depth of ~3. None of the other approaches go beyond 2.5GB/s.
>>>
>>> We also see some strange effects, such as sequential read() with 16KB
>>> buffers being faster than with 16MB buffers, and io_uring being a lot
>>> slower than mmap() (all tested on Manjaro with kernels 5.8/5.10 and ext4).
>>>
>>> So now we are quite lost and would appreciate a hint in the right
>>> direction :)
>>>
>>> What is necessary to simply read 100GB of data at 7GB/s?
>>
>> Is your device running at gen4 speed? The easiest way to tell with an
>> NVMe SSD (assuming you're reading from /dev/nvme0n1) is something like:
>>
>>   # cat /sys/block/nvme0n1/device/device/current_link_speed
>>
>> If it says less than 16GT/s, then it can't read at 7GB/s.
> 
> That does sound likely. Simple test here on a gen4 device:
> 
> # cat /sys/block/nvme3n1/device/device/current_link_speed
> 16.0 GT/s PCIe
> 
> # ~axboe/git/fio/fio --name=bw --filename=/dev/nvme3n1 --direct=1 --bs=32k --ioengine=io_uring --iodepth=16 --rw=randread --norandommap
> [snip]
>     READ: bw=6630MiB/s (6952MB/s), 6630MiB/s-6630MiB/s (6952MB/s-6952MB/s), io=36.4GiB (39.1GB), run=5621-5621msec
> 

