From: Maxim Levitsky <mlevitsk@redhat.com>
To: linux-nvme@lists.infradead.org
Cc: Fam Zheng <fam@euphon.net>, Keith Busch <keith.busch@intel.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	kvm@vger.kernel.org, Wolfram Sang <wsa@the-dreams.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Liang Cunming <cunming.liang@intel.com>,
	Nicolas Ferre <nicolas.ferre@microchip.com>,
	linux-kernel@vger.kernel.org,
	Kirti Wankhede <kwankhede@nvidia.com>,
	"David S . Miller" <davem@davemloft.net>,
	Jens Axboe <axboe@fb.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	John Ferlan <jferlan@redhat.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Liu Changpeng <changpeng.liu@intel.com>,
	"Paul E . McKenney" <paulmck@linux.ibm.com>,
	Amnon Ilan <ailan@redhat.com>, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 0/9] RFC: NVME VFIO mediated device [BENCHMARKS]
Date: Mon, 25 Mar 2019 20:52:32 +0200
Message-ID: <dbec1f2bee53ab786bb4f6204f9a930eac279970.camel@redhat.com>
In-Reply-To: <d41484848b1832192c6978c7054bec5c326afa6d.camel@redhat.com>

Hi

This is the first round of benchmarks.

The system is Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz

The system has 2 NUMA nodes, but only CPUs and memory from node 0 were used to
avoid NUMA noise.

The SSD is Intel® Optane™ SSD 900P Series, 280 GB version


https://ark.intel.com/content/www/us/en/ark/products/123628/intel-optane-ssd-900p-series-280gb-1-2-height-pcie-x4-20nm-3d-xpoint.html


** Latency benchmark with no interrupts at all **

spdk was compiled with the fio plugin on the host and in the guest.
spdk was first run on the host; then the VM was started with one of spdk,
pci passthrough, or mdev, and spdk was run with the fio plugin inside the VM.

spdk was taken from my branch on gitlab, and fio was compiled from source from
the 3.4 branch, as needed by the spdk fio plugin.

The following fio command line (using the spdk ioengine) was used:

$WORK/fio/fio \
	--name=job --runtime=40 --ramp_time=0 --time_based \
	--filename="trtype=PCIe traddr=$DEVICE_FOR_FIO ns=1" --ioengine=spdk \
	--direct=1 --rw=randread --bs=4K --cpus_allowed=0 \
	--iodepth=1 --thread

The average values for slat (submission latency), clat (completion latency) and
their sum (lat = slat + clat) were noted.
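
A minimal sketch of that sum, using the first "spdk fio host" run from the
results as input (slat 112.00 ns, clat 6.400 us). The function and variable
names are only illustrative. fio averages lat on its own, so the reported value
can differ from this sum in the last digit.

```python
# Sum of the average submission and completion latency for one fio run.
NS_PER_US = 1000.0


def total_latency_us(slat_ns: float, clat_us: float) -> float:
    """Return slat + clat, expressed in microseconds."""
    return slat_ns / NS_PER_US + clat_us


# First "spdk fio host" run: slat 112.00 ns, clat 6.400 us.
host_lat_us = total_latency_us(112.00, 6.400)
print(f"lat = {host_lat_us:.3f} us")
```

The sum lands in the single-digit microsecond range, in line with the 6.52
reported in the lat column for that run.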

The results:

spdk fio host:
	573 MiB/s - slat 112.00ns, clat 6.400us, lat 6.52us
	573 MiB/s - slat 111.50ns, clat 6.406us, lat 6.52us


pci passthrough host/
spdk fio guest:
	571 MiB/s - slat 124.56ns, clat 6.422us, lat 6.55us
	571 MiB/s - slat 122.86ns, clat 6.410us, lat 6.53us
	570 MiB/s - slat 124.95ns, clat 6.425us, lat 6.55us

spdk host/
spdk fio guest:
	535 MiB/s - slat 125.00ns, clat 6.895us, lat 7.02us
	534 MiB/s - slat 125.36ns, clat 6.896us, lat 7.02us
	534 MiB/s - slat 125.82ns, clat 6.892us, lat 7.02us

mdev host/
spdk fio guest:
	534 MiB/s - slat 128.04ns, clat 6.902us, lat 7.03us
	535 MiB/s - slat 126.97ns, clat 6.900us, lat 7.03us
	535 MiB/s - slat 127.00ns, clat 6.898us, lat 7.03us


As you can see, native latency is 6.52us and pci passthrough barely adds any
latency, while mdev and spdk add about 0.51us and 0.50us of latency
respectively (7.03 - 6.52 and 7.02 - 6.52).
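
The subtraction can be written out as below. This is only a sketch: the lat
values are read in microseconds (which is what slat + clat add up to), the
passthrough entry uses the 6.55 figure from two of its three runs, and the
dictionary names are made up for illustration.

```python
# Added latency of each guest setup relative to spdk fio on the host.
NATIVE_LAT_US = 6.52  # "spdk fio host" lat

# Representative lat values per setup, taken from the results above.
guest_lat_us = {
    "pci passthrough": 6.55,
    "spdk": 7.02,
    "mdev": 7.03,
}

overhead_us = {
    name: round(lat - NATIVE_LAT_US, 2) for name, lat in guest_lat_us.items()
}

for name, extra in overhead_us.items():
    print(f"{name:16s} +{extra:.2f} us")
```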

In addition to that, I added a few 'rdtsc' calls at strategic points in my mdev
driver to capture the cycle count it takes to do 3 things:

1. translate a just received command (till it is copied to the hardware
submission queue)

2. receive a completion (divided by the number of completion received in one
round of polling)

3. deliver an interrupt to the guest (call to eventfd_signal)

This is not the whole latency, as there is also the latency between the point
the submission entry is written and the point it becomes visible on the polling
cpu, plus the latency until the polling cpu gets to the code which reads the
submission entry, and of course the latency of interrupt delivery. Still, the
above measurements mostly capture the latency I can control.
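
One plausible way to keep such per-event averages is sketched below. It is not
the driver's code: the CycleStat class, its method names, and the cycle values
fed into it are all hypothetical, and the driver itself would do this in C
around rdtsc reads. The count argument covers the completion case, where one
polling round can reap several completions.

```python
from dataclasses import dataclass


@dataclass
class CycleStat:
    """Accumulates elapsed cycles and the number of events they covered."""

    total_cycles: int = 0
    events: int = 0

    def record(self, start: int, end: int, count: int = 1) -> None:
        # start/end stand in for two cycle-counter reads around the work.
        self.total_cycles += end - start
        self.events += count

    def average(self) -> float:
        # Average cycles per event; 0.0 if nothing was recorded yet.
        return self.total_cycles / self.events if self.events else 0.0


# Hypothetical numbers: one polling round reaps 3 completions in 900 cycles,
# the next reaps 1 completion in 300 cycles.
completions = CycleStat()
completions.record(start=1_000, end=1_900, count=3)
completions.record(start=5_000, end=5_300, count=1)
print(f"avg cycles per completion: {completions.average():.1f}")
```

Note that this averages over all events rather than averaging the per-round
quotients; the two only agree when every round handles the same number of
completions.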

The results are:

commands translated : avg cycles: 459.844     avg time(usec): 0.135        
commands completed  : avg cycles: 354.61      avg time(usec): 0.104        
interrupts sent     : avg cycles: 590.227     avg time(usec): 0.174

avg time total: 0.413 usec

All measurements were done in the host kernel. The time was calculated using
the tsc_khz kernel variable.
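
The conversion can be reproduced roughly as follows. tsc_khz is the TSC
frequency in kHz, i.e. cycles per millisecond. The actual value on the test
machine is not given here, so the sketch assumes the nominal 3.40 GHz of the
CPU; the constant and function names are illustrative.

```python
# ASSUMPTION: TSC runs at the nominal 3.40 GHz; the real tsc_khz may differ.
ASSUMED_TSC_KHZ = 3_400_000


def cycles_to_usec(cycles: float, tsc_khz: int = ASSUMED_TSC_KHZ) -> float:
    """Convert a TSC cycle count to microseconds."""
    # cycles / tsc_khz gives milliseconds; multiply by 1000 for microseconds.
    return cycles * 1000.0 / tsc_khz


# Average cycle counts from the table above.
avg_cycles = {
    "commands translated": 459.844,
    "commands completed": 354.61,
    "interrupts sent": 590.227,
}

avg_usec = {name: cycles_to_usec(c) for name, c in avg_cycles.items()}
total_usec = sum(avg_usec.values())

for name, usec in avg_usec.items():
    print(f"{name:20s}: {usec:.3f} usec")
print(f"{'total':20s}: {total_usec:.3f} usec")
```

Under that assumption the per-step times and the total round to the same
values as in the table (0.135, 0.104, 0.174, total 0.413 usec).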

The biggest takeaway from this is that both spdk and my driver are very fast,
and the overhead is on the order of a thousand cpu cycles, give or take.

** Throughput benchmarks **

https://paste.fedoraproject.org/paste/ecijclLMG2B11MVCVIst-w

Here you can find the throughput benchmarks.

The biggest takeaway is that when using no interrupts (spdk fio in guest or
spdk fio in host), the bottleneck is in the device, and throughput is about
2290 MiB/s.

As for mdev vs spdk with interrupts, my driver slightly wins, giving a
throughput of about 2015 MiB/s while spdk gives about 2005 MiB/s, mostly due to
slightly different timings, as the latency of both is about the same.
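
To put those figures in proportion, the relative gaps can be computed as in
this small sketch. It uses only the approximate throughput numbers quoted
above; the helper name is illustrative.

```python
def relative_gap_pct(value: float, reference: float) -> float:
    """Return how far value is from reference, as a percentage of reference."""
    return (value - reference) / reference * 100.0


DEVICE_BOUND = 2290.0  # no interrupts, bottleneck in the device
MDEV_IRQ = 2015.0      # mdev, with interrupts
SPDK_IRQ = 2005.0      # spdk, with interrupts

mdev_vs_spdk = relative_gap_pct(MDEV_IRQ, SPDK_IRQ)
mdev_vs_device = relative_gap_pct(MDEV_IRQ, DEVICE_BOUND)

print(f"mdev vs spdk (interrupts): {mdev_vs_spdk:+.2f}%")
print(f"mdev vs device-bound:      {mdev_vs_device:+.2f}%")
```

The mdev advantage over spdk comes out at about half a percent, while both
interrupt-driven setups sit roughly 12% below the device-bound figure.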

Disabling the meltdown mitigation didn't have much effect on the performance.

Best regards,
	Maxim Levitsky

