From: Johannes Thumshirn <jthumshirn@suse.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>,
	"dmaengine@vger.kernel.org" <dmaengine@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH 5/5] libnvdimm: add DMA support for pmem blk-mq
Date: Thu, 3 Aug 2017 10:06:15 +0200	[thread overview]
Message-ID: <20170803080615.GB4333@linux-x5ow.site> (raw)
In-Reply-To: <CAPcyv4hvY4swcyijRaSR8QpMEv1w=g52udpW7F9Qs-1oS0bskQ@mail.gmail.com>

On Tue, Aug 01, 2017 at 10:43:30AM -0700, Dan Williams wrote:
> On Tue, Aug 1, 2017 at 12:34 AM, Johannes Thumshirn <jthumshirn@suse.de> wrote:
> > Dave Jiang <dave.jiang@intel.com> writes:
> >
> >> Adding DMA support for pmem blk reads. This provides a significant CPU
> >> utilization reduction for large memory reads while maintaining good
> >> performance. DMAs are triggered by a test against bio_multiple_segments(),
> >> so small I/Os (4k or less?) are still performed by the CPU in order to
> >> reduce latency. By default the pmem driver will use blk-mq with DMA.
> >>
> >> Numbers below are measured against pmem simulated via DRAM using
> >> memmap=NN!SS.  DMA engine used is the ioatdma on Intel Skylake Xeon
> >> platform.  Keep in mind the performance for actual persistent memory
> >> will differ.
> >> Fio 2.21 was used.
> >>
> >> 64k: 1 task queuedepth=1
> >> CPU Read:  7631 MB/s  99.7% CPU    DMA Read: 2415 MB/s  54% CPU
> >> CPU Write: 3552 MB/s  100% CPU     DMA Write: 2173 MB/s  54% CPU
> >>
> >> 64k: 16 tasks queuedepth=16
> >> CPU Read: 36800 MB/s  1593% CPU    DMA Read:  29100 MB/s  607% CPU
> >> CPU Write: 20900 MB/s 1589% CPU    DMA Write: 23400 MB/s  585% CPU
> >>
> >> 2M: 1 task queuedepth=1
> >> CPU Read:  6013 MB/s  99.3% CPU    DMA Read:  7986 MB/s  59.3% CPU
> >> CPU Write: 3579 MB/s  100% CPU     DMA Write: 5211 MB/s  58.3% CPU
> >>
> >> 2M: 16 tasks queuedepth=16
> >> CPU Read:  18100 MB/s 1588% CPU    DMA Read:  21300 MB/s 180.9% CPU
> >> CPU Write: 14100 MB/s 1594% CPU    DMA Write: 20400 MB/s 446.9% CPU
> >>
> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> >> ---
> >
> > Hi Dave,
> >
> > The above table shows that there's a performance benefit for 2M
> > transfers but a regression for 64k transfers, if we forget about the CPU
> > utilization for a second.
> 
> I don't think we can forget about CPU utilization, and I would expect
> most users to value CPU efficiency over small bandwidth increases,
> especially when those numbers show efficiency like this:
> 
> +160% cpu +26% read bandwidth
> +171% cpu -10% write bandwidth
> 
> > Would it be beneficial to have a heuristic on
> > the transfer size that decides when to use DMA and when not? You
> > introduced this hunk:
> >
> > -   rc = pmem_handle_cmd(cmd);
> > +   if (cmd->chan && bio_multiple_segments(req->bio))
> > +       rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
> > +   else
> > +       rc = pmem_handle_cmd(cmd);
> >
> > It uses DMA for bios with multiple segments, while single-segment bios
> > take the old path. Maybe the single/multi-segment logic could be amended
> > to something like:
> >
> >     if (cmd->chan && bio_segments(req->bio) > PMEM_DMA_THRESH)
> >        rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
> >     else
> >        rc = pmem_handle_cmd(cmd);
> >
> > Just something worth considering IMHO.
> 
> The thing we want to avoid most is having the CPU stall doing nothing
> while waiting for memory access, so I think once the efficiency of adding
> more CPU goes non-linear it's better to let the DMA engine handle
> that. The current heuristic seems to achieve that, but the worry is
> whether it over-selects the CPU for cases where requests could be merged
> into a bulkier I/O that is more suitable for DMA.
> 
> I'm not sure we want another tunable vs predictable / efficient
> behavior by default.

Sorry for my late reply, but lots of automated performance regression CI tools
_will_ report errors because of the performance drop. Yes, these may be false
positives, but it will cost hours to resolve them. If we have a tunable, it's
much easier to set it correctly for a given use case. And please keep in mind
that some people don't really care about CPU utilization but about maximum
throughput.
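Roughly what I have in mind for such a tunable (just a sketch, the parameter
name and default are made up and not part of Dave's patch):

    /* Segment-count threshold, settable at module load time or via sysfs. */
    static unsigned int dma_seg_thresh = 2;
    module_param(dma_seg_thresh, uint, 0644);
    MODULE_PARM_DESC(dma_seg_thresh,
                     "Minimum number of bio segments before offloading to DMA");

    ...
        /* Only offload once a request spans enough segments. */
        if (cmd->chan && bio_segments(req->bio) >= dma_seg_thresh)
                rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
        else
                rc = pmem_handle_cmd(cmd);

That way CI setups could raise dma_seg_thresh to effectively keep the CPU path
and their existing baselines, while throughput-focused users could lower it,
and the default would still favor CPU efficiency.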

Thanks,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm


Thread overview: 20+ messages
2017-07-31 22:24 [PATCH 0/5] Adding blk-mq and DMA support to pmem block driver Dave Jiang
2017-07-31 22:24 ` [PATCH 1/5] dmaengine: ioatdma: revert 7618d035 to allow sharing of DMA channels Dave Jiang
2017-07-31 22:24 ` [PATCH 2/5] dmaengine: ioatdma: dma_prep_memcpy_to/from_sg support Dave Jiang
2017-08-01  2:14   ` Dan Williams
2017-08-01 16:39     ` Dave Jiang
2017-08-02  4:57     ` Vinod Koul
2017-07-31 22:24 ` [PATCH 3/5] dmaengine: add SG support to dmaengine_unmap Dave Jiang
2017-07-31 22:24 ` [PATCH 4/5] libnvdimm: Adding blk-mq support to the pmem driver Dave Jiang
2017-08-01 19:02   ` Ross Zwisler
2017-07-31 22:24 ` [PATCH 5/5] libnvdimm: add DMA support for pmem blk-mq Dave Jiang
2017-08-01  7:34   ` Johannes Thumshirn
2017-08-01 16:40     ` Dave Jiang
2017-08-01 17:43     ` Dan Williams
2017-08-03  8:06       ` Johannes Thumshirn [this message]
2017-08-03 15:41         ` Dan Williams
2017-08-03 16:12           ` Dave Jiang
2017-08-03 16:15             ` Dan Williams
2017-08-04  6:07               ` Johannes Thumshirn
2017-08-04 15:47                 ` Dan Williams
2017-08-01 20:42   ` Ross Zwisler
