Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Long Li <longli@microsoft.com>
Cc: Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
	Christoph Hellwig <hch@lst.de>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Sagi Grimberg <sagi@grimberg.me>
Subject: Re: [PATCH 2/2] nvme-pci: poll IO after batch submission for multi-mapping queue
Date: Tue, 12 Nov 2019 10:39:20 +0800
Message-ID: <20191112023920.GD15079@ming.t460p> (raw)
In-Reply-To: <CY4PR21MB0741004E62F9C50B8EF7DA9ECE770@CY4PR21MB0741.namprd21.prod.outlook.com>

On Tue, Nov 12, 2019 at 12:33:50AM +0000, Long Li wrote:
> >From: Christoph Hellwig <hch@lst.de>
> >Sent: Monday, November 11, 2019 12:45 PM
> >To: Ming Lei <ming.lei@redhat.com>
> >Cc: linux-nvme@lists.infradead.org; Keith Busch <kbusch@kernel.org>; Jens
> >Axboe <axboe@fb.com>; Christoph Hellwig <hch@lst.de>; Sagi Grimberg
> ><sagi@grimberg.me>; Long Li <longli@microsoft.com>
> >Subject: Re: [PATCH 2/2] nvme-pci: poll IO after batch submission for multi-
> >mapping queue
> >
> >On Fri, Nov 08, 2019 at 11:55:08AM +0800, Ming Lei wrote:
> >> f9dde187fa92("nvme-pci: remove cq check after submission") removes cq
> >> check after submission, this change actually causes performance
> >> regression on some NVMe drive in which single nvmeq handles requests
> >> originated from more than one blk-mq sw queues(call it multi-mapping
> >> queue).
> >
> >> Follows test result done on Azure L80sv2 guest with NVMe drive(
> >> Microsoft Corporation Device b111). This guest has 80 CPUs and 10 numa
> >> nodes, and each NVMe drive supports 8 hw queues.
> >
> >Have you actually seen this on a real nvme drive as well?
> >
> >Note that it is kinda silly to limit queues like that in VMs, so I really don't think
> >we should optimize the driver for this particular case.
> 
> I tested on an Azure L80s_v2 VM with newer Samsung P983 NVMe SSD (with 32 hardware queues). Tests also showed soft lockup when 32 queues are shared by 80 CPUs. 
> 

BTW, do you see if this simple change makes a difference?

> The issue will likely show up if the number of NVMe hardware queues is less than the number of CPUs. I think this may be a likely configuration on a very large system. (e.g. the largest VM on Azure has 416 cores)
> 

'the number of NVMe hardware queues' above should be the number of single NVMe drive.
I believe 32 hw queues is common, also poll queues may take several from the total 32.
When interrupt handling on single CPU core can't catch up with NVMe's IO handling,
soft lockup could be triggered. Of course, there are lot kinds of supported processors
by Linux.

Also when (nr_nvme_drives * nr_nvme_hw_queues) > nr_cpu_cores, one same CPU
can be assigned to handle more than 1 nvme IO queue interrupt from different
NVMe drive, the situation becomes worse.


Thanks,
Ming


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  parent reply index

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-08  3:55 [PATCH 0/2] nvme-pci: improve IO performance via poll after batch submission Ming Lei
2019-11-08  3:55 ` [PATCH 1/2] nvme-pci: move sq/cq_poll lock initialization into nvme_init_queue Ming Lei
2019-11-08  4:12   ` Keith Busch
2019-11-08  7:09     ` Ming Lei
2019-11-08  3:55 ` [PATCH 2/2] nvme-pci: poll IO after batch submission for multi-mapping queue Ming Lei
2019-11-11 20:44   ` Christoph Hellwig
2019-11-12  0:33     ` Long Li
2019-11-12  1:35       ` Sagi Grimberg
2019-11-12  2:39       ` Ming Lei [this message]
2019-11-12 16:25         ` Hannes Reinecke
2019-11-12 16:49           ` Keith Busch
2019-11-12 17:29             ` Hannes Reinecke
2019-11-13  3:05               ` Ming Lei
2019-11-13  3:17                 ` Keith Busch
2019-11-13  3:57                   ` Ming Lei
2019-11-12 21:20         ` Long Li
2019-11-12 21:36           ` Keith Busch
2019-11-13  0:50             ` Long Li
2019-11-13  2:24           ` Ming Lei
2019-11-12  2:07     ` Ming Lei
2019-11-12  1:44   ` Sagi Grimberg
2019-11-12  9:56     ` Ming Lei
2019-11-12 17:35       ` Sagi Grimberg
2019-11-12 21:17         ` Long Li
2019-11-12 23:44         ` Jens Axboe
2019-11-13  2:47         ` Ming Lei
2019-11-12 18:11   ` Nadolski, Edmund
2019-11-13 13:46     ` Ming Lei

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191112023920.GD15079@ming.t460p \
    --to=ming.lei@redhat.com \
    --cc=axboe@fb.com \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=longli@microsoft.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git