From: Ming Lei <ming.lei@redhat.com>
To: John Garry <john.garry@huawei.com>
Cc: Marc Zyngier <maz@kernel.org>,
	tglx@linutronix.de, "chenxiang (M)" <chenxiang66@hisilicon.com>,
	bigeasy@linutronix.de, linux-kernel@vger.kernel.org,
	hare@suse.com, hch@lst.de, axboe@kernel.dk, bvanassche@acm.org,
	peterz@infradead.org, mingo@redhat.com,
	Zhang Yi <yi.zhang@redhat.com>
Subject: Re: [PATCH RFC 1/1] genirq: Make threaded handler use irq affinity for managed interrupt
Date: Fri, 3 Jan 2020 08:46:25 +0800
Message-ID: <20200103004625.GA5219@ming.t460p>
In-Reply-To: <72a6a738-f04b-3792-627a-fbfcb7b297e1@huawei.com>

On Thu, Jan 02, 2020 at 10:35:31AM +0000, John Garry wrote:
> On 25/12/2019 00:48, Ming Lei wrote:
> > On Tue, Dec 24, 2019 at 11:20:25AM +0000, Marc Zyngier wrote:
> > > On 2019-12-24 01:59, Ming Lei wrote:
> > > > On Mon, Dec 23, 2019 at 10:47:07AM +0000, Marc Zyngier wrote:
> > > > > On 2019-12-23 10:26, John Garry wrote:
> > > > > > > > > > I've also managed to trigger some of them now that I have
> > > > > > > > > > access to a decent box with nvme storage.
> > > > > > > > >
> > > > > > > > > I only have 2x NVMe SSDs when this occurs - I should not be
> > > > > > > > > hitting this...
> > > > > > > > >
> > > > > > > > > > Out of curiosity, have you tried with the SMMU disabled?
> > > > > > > > > > I'm wondering whether we hit some livelock condition on
> > > > > > > > > > unmapping buffers...
> > > > > > > > >
> > > > > > > > > No, but I can give it a try. Doing that should lower the CPU
> > > > > > > > > usage, though, so maybe masks the issue - probably not.
> > > > > > > > 
> > > > > > > > Lots of CPU lockups can be a performance issue if there isn't
> > > > > > > > an obvious bug.
> > > > > > > >
> > > > > > > > I am wondering if you could explain a bit why enabling the
> > > > > > > > SMMU may save a bit of CPU?
> > > > > > > The other way around: mapping/unmapping IOVAs doesn't come for
> > > > > > > free. I'm trying to find out whether the NVMe map/unmap patterns
> > > > > > > trigger something unexpected in the SMMU driver, but that's a
> > > > > > > very long shot.
> > > > > > 
> > > > > > So I tested v5.5-rc3 with and without the SMMU enabled, and
> > > > > > without the SMMU enabled I don't get the lockup.
> > > > > 
> > > > > OK, so my hunch wasn't completely off... At least we have something
> > > > > to look into.
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > Obviously this is not conclusive, especially with such limited
> > > > > > testing - 5 minute runs each. The CPU load goes up when disabling
> > > > > > the SMMU, but that could be attributed to extra throughput
> > > > > > (1183K -> 1539K) loading.
> > > > > >
> > > > > > I do notice that since we complete the NVMe request in irq
> > > > > > context, we also do the DMA unmap, i.e. talk to the SMMU, in the
> > > > > > same context, which is less than ideal.
> > > > > 
> > > > > It depends on how much overhead invalidating the TLB adds to the
> > > > > equation, but we should be able to do some tracing and find out.
> > > > > 
> > > > > > I need to finish for the Christmas break today, so can't check
> > > > > > this much further ATM.
> > > > > 
> > > > > No worries. May I suggest creating a new thread in the new year,
> > > > > maybe
> > > > > involving Robin and Will as well?
> > > > 
> > > > Zhang Yi has observed the CPU lockup issue once when running heavy IO
> > > > on a single NVMe drive, so please CC him if you have a new patch to try.
> > > 
> > > On which architecture? John was indicating that this also happens on x86.
> > 
> > ARM64.
> > 
> > To be honest, I have never seen such a CPU lockup issue on x86 when running
> > heavy IO on a single NVMe drive.
> > 
> > > 
> > > > Then it looks like the DMA unmap cost is too big on aarch64 if the
> > > > SMMU is involved.
> > > 
> > > So far, we don't have any data suggesting that this is actually the case.
> > > Also, other workloads (such as networking) do not exhibit this behaviour,
> > > while being at least as unmap-heavy as NVMe is.
> > 
> > Maybe it is because networking workloads usually complete IO in softirq
> > context, instead of hard interrupt context.
> > 
> > > 
> > > If the cross-architecture aspect is confirmed, this points more towards
> > > an interaction between the NVMe subsystem and the DMA API than an
> > > architecture-specific problem.
> > > 
> > > Given that we have so far very little data, I'd hold off any conclusion.
> > 
> > We can start to collect latency data for DMA unmapping vs nvme_irq()
> > on both x86 and arm64.
> > 
> > I will see if I can get such a box for collecting the latency data.
> 
> To reiterate what I mentioned before about IOMMU DMA unmap on x86, a key
> difference is that by default it uses non-strict (lazy) mode unmap, i.e.
> we unmap in batches. ARM64 uses the general default, which is strict mode,
> i.e. every unmap results in an IOTLB flush.
> 
> In my setup, if I switch to lazy unmap (set iommu.strict=0 on the cmdline),
> then there is no lockup.
> 
> Are any special IOMMU setups being used for x86, like enabling strict mode?
> I don't know...

BTW, I have run the test on a 224-core ARM64 machine with one 32-hw_queue
NVMe drive, and the soft lockup can be triggered within one minute.

nvme_irq() often takes ~5us to complete on this machine, so there is a real
risk of CPU lockup once IOPS goes above 200K.
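
Back-of-the-envelope, with all completions landing on one CPU: 200K IOPS at
~5us per nvme_irq() invocation is 200,000 * 5us = 1s of CPU time per second,
i.e. that core is fully occupied by hard interrupt work, which is exactly the
situation the soft lockup watchdog complains about.

For reference, the ~5us figure can be reproduced with the function_graph
tracer, roughly like below (assuming tracefs is mounted in the usual place):

    cd /sys/kernel/debug/tracing
    echo nvme_irq > set_graph_function
    echo function_graph > current_tracer
    head -50 trace      # per-call duration shows up in the DURATION column
    echo nop > current_tracer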

The soft lockup can also be triggered with 'iommu.strict=0' passed in; it
just takes a bit longer and more IO jobs to reach it.

In the above test, I submit IO to one single NVMe drive from 4 CPU cores via
8 jobs (or 12 jobs with iommu.strict=0), while the NVMe interrupt is handled
on just one dedicated CPU core.
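
For completeness, the workload looks roughly like the fio job below (device
name, queue depth and runtime are illustrative, not the exact values used):

    [global]
    filename=/dev/nvme0n1
    direct=1
    rw=randread
    bs=4k
    ioengine=libaio
    iodepth=64
    runtime=60
    time_based
    cpus_allowed=0-3

    [randread]
    numjobs=8

The interrupt-side CPU placement isn't set in the job file; with managed
interrupts it follows from the irq affinity of the hw queues that the
submitting CPUs map to.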

Is there lock contention between the IOMMU DMA map and unmap callbacks?
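
One way to check that (a sketch, assuming a debug kernel with
CONFIG_LOCK_STAT=y; the 'iova' pattern is only what I would expect the
contended locks, e.g. the IOVA allocator lock, to match):

    echo 1 > /proc/sys/kernel/lock_stat   # enable collection
    echo 0 > /proc/lock_stat              # reset the counters
    # ... run the IO workload for a while ...
    grep -i -A 3 iova /proc/lock_stat

If map/unmap are serialising on a shared lock, it should show up near the
top of lock_stat with large contention and wait-time numbers.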

Thanks,
Ming

