* NVMe SSD + compression - benchmarking
@ 2018-04-27 17:41 Brendan Hide
  2018-04-28  2:05 ` Qu Wenruo
  0 siblings, 1 reply; 4+ messages in thread
From: Brendan Hide @ 2018-04-27 17:41 UTC (permalink / raw)
  To: Btrfs BTRFS

Hey, all

I'm following up on the queries I had last week, now that I have installed 
the NVMe SSD into the PCI-e adapter. I'm having difficulty telling 
whether I'm running these benchmarks correctly.

As a first test, I put together a 4.7GB .tar containing mostly 
duplicated copies of the kernel source code (rather compressible). 
Writing this to the SSD I was seeing repeatable numbers - but noted that 
the new (supposedly faster) zstd compression is noticeably slower than 
all the other methods. Perhaps that is partly due to a lack of 
multi-threading? No matter - I also noticed a supposedly impossible 
stat with no compression: the read speed seems to be faster than the 
PCI-E 2.0 bus can theoretically deliver:

compression type / write speed / read speed (in GBps)
zlib / 1.24 / 2.07
lzo / 1.17 / 2.04
zstd / 0.75 / 1.97
no / 1.42 / 2.79

The SSD is PCI-E 3.0 4-lane capable and is connected to a PCI-E 2.0 
16-lane slot. lspci -vv confirms it is using 4 lanes. This means its 
peak throughput *should* be 2.0 GBps - but above you can see the average 
read benchmark is 2.79 GBps. :-/
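For reference, the back-of-envelope arithmetic behind that 2.0 GBps 
figure (these are the standard PCI-E 2.0 rates, not anything measured 
on this box):

```shell
# PCI-E 2.0 signals at 5 GT/s per lane; 8b/10b encoding leaves
# 4 Gbit/s = 500 MB/s of usable bandwidth per lane.
lanes=4
per_lane=500                 # usable MB/s per PCI-E 2.0 lane
peak=$(( lanes * per_lane ))
echo "$peak MB/s"            # 2000 MB/s = 2.0 GBps
```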

The crude timing script I've put together does the following:
- Format the SSD anew with btrfs and no custom settings
- Wait 180 seconds for possible hardware TRIM to settle (probably 
overkill since the SSD is new)
- Mount the fs using all defaults except for compression, which is one 
of zlib, lzo, zstd, or no
- sync
- Drop all caches
- Time the following
  - Copy the file to the test fs (source is a ramdisk)
  - sync
- Drop all caches
- Time the following
  - Copy back from the test fs to ramdisk
  - sync
- unmount
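The steps above, sketched as a shell script. The device node, mount 
point, and source path are hypothetical placeholders (adjust before 
use), and run_bench is destructive - it reformats $DEV - so the call is 
left commented out:

```shell
#!/bin/sh
# Sketch of the crude timing script described above.
# DEV, MNT and SRC below are assumed placeholder names.

drop_caches() {
    sync
    # Needs root; 2>/dev/null comes first so a permission
    # error on the redirection is silenced.
    echo 3 2>/dev/null > /proc/sys/vm/drop_caches || true
}

# Copy $1 to $2, sync, and print the effective throughput in MB/s.
time_copy() {
    bytes=$(stat -c %s "$1")
    start=$(date +%s.%N)
    cp "$1" "$2" && sync
    end=$(date +%s.%N)
    awk -v b="$bytes" -v s="$start" -v e="$end" \
        'BEGIN { printf "%.2f MB/s\n", b / ((e - s) * 1e6) }'
}

# One full pass for a given compression setting (zlib|lzo|zstd|no).
run_bench() {
    DEV=/dev/nvme0n1            # assumption: the NVMe device node
    MNT=/mnt/test               # assumption: an empty mount point
    SRC=/dev/shm/kernels.tar    # assumption: the tar on a ramdisk
    mkfs.btrfs -f "$DEV"
    sleep 180                   # let any device-side TRIM settle
    mount -o "compress=$1" "$DEV" "$MNT"
    drop_caches
    time_copy "$SRC" "$MNT/kernels.tar"   # write speed
    drop_caches
    time_copy "$MNT/kernels.tar" "$SRC"   # read speed
    umount "$MNT"
}

# run_bench zstd    # uncomment to run (destructive: reformats $DEV!)
```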

I can see how, with compression, it *can* be faster than 2 GBps (though 
it isn't). But I cannot see how having no compression could possibly be 
faster than 2 GBps. :-/

I can of course get more info if it'd help figure out this puzzle:

Kernel info:
Linux localhost.localdomain 4.16.3-1-vfio #1 SMP PREEMPT Sun Apr 22 
12:35:45 SAST 2018 x86_64 GNU/Linux
^ Close to the regular ArchLinux kernel - but with vfio, and compiled 
with -arch=native. See https://aur.archlinux.org/pkgbase/linux-vfio/

CPU model:
model name    : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

Motherboard model:
Product Name: Z68MA-G45 (MS-7676)

lspci output for the slot:
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe 
SSD Controller SM961/PM961
^ The disk id sans serial is Samsung_SSD_960_EVO_1TB

dmidecode output for the slot:
Handle 0x001E, DMI type 9, 17 bytes
System Slot Information
         Designation: J8B4
         Type: x16 PCI Express
         Current Usage: In Use
         Length: Long
         ID: 4
         Characteristics:
                 3.3 V is provided
                 Opening is shared
                 PME signal is supported
         Bus Address: 0000:02:01.1


Thread overview: 4 messages
2018-04-27 17:41 NVMe SSD + compression - benchmarking Brendan Hide
2018-04-28  2:05 ` Qu Wenruo
2018-04-28  7:30   ` Brendan Hide
2018-04-29  8:28     ` Duncan
