* [PATCH v4 0/9] mpt3sas and dmapool scalability
@ 2018-11-12 15:40 Tony Battersby
From: Tony Battersby @ 2018-11-12 15:40 UTC
  To: Matthew Wilcox, Christoph Hellwig, Marek Szyprowski, iommu, linux-mm
  Cc: linux-scsi

I posted v3 on August 7.  Nobody acked or merged the patches, and then
I got too busy with other stuff to repost until now.

The only change since v3:
*) Dropped patch #10 (the mpt3sas patch) since the mpt3sas maintainers
didn't show any interest.

I believe these patches are ready for merging.

---

drivers/scsi/mpt3sas is running into a scalability problem with the
kernel's DMA pool implementation.  With an LSI/Broadcom SAS 9300-8i
12Gb/s HBA and max_sgl_entries=256, during modprobe, mpt3sas does the
equivalent of:

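/* dev and align stand in for the driver's device and alignment arguments */
chain_dma_pool = dma_pool_create("chain pool", dev, 128, align, 0);
for (i = 0; i < 373959; i++) {
    /* returns the CPU address; the DMA address is stored in dma_addr[i] */
    vaddr[i] = dma_pool_alloc(chain_dma_pool, GFP_KERNEL, &dma_addr[i]);
}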

And at rmmod, system shutdown, or system reboot, mpt3sas does the
equivalent of:

for (i = 0; i < 373959; i++) {
    /* vaddr[i] is the CPU address that dma_pool_alloc() returned */
    dma_pool_free(chain_dma_pool, vaddr[i], dma_addr[i]);
}
dma_pool_destroy(chain_dma_pool);

With this usage, both dma_pool_alloc() and dma_pool_free() exhibit
O(n^2) complexity, although dma_pool_free() is much worse due to
implementation details.  On my system, the dma_pool_free() loop above
takes about 9 seconds to run.  Note that the problem was even worse
before commit 74522a92bbf0 ("scsi: mpt3sas: Optimize I/O memory
consumption in driver."), where the dma_pool_free() loop could take ~30
seconds.
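
One way to see where a quadratic cost in the free path can come from is
to model a pool whose pages sit on a linked list that is searched
linearly for the page owning each freed block.  The sketch below is a
userspace model, not the kernel code.  The name count_page_visits() and
the assumptions that the newest page is at the list head, that empty
pages stay on the list, that every page is full, and that blocks are
freed in allocation order are all illustrative.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count list-node visits when every block in the pool is freed.
 * Model assumptions (hypothetical, for illustration only):
 *   - list position 0 is the head and holds the newest page;
 *   - the oldest page is at position n_pages - 1;
 *   - pages are never removed from the list when they become empty;
 *   - blocks are freed oldest page first.
 */
unsigned long long count_page_visits(size_t n_pages, size_t blocks_per_page)
{
    unsigned long long visits = 0;

    for (size_t age = 0; age < n_pages; age++) {
        /* List position of the page that owns the blocks being freed. */
        size_t target = n_pages - 1 - age;

        for (size_t b = 0; b < blocks_per_page; b++) {
            /* Linear walk from the head until the owning page is found. */
            for (size_t pos = 0; pos <= target; pos++)
                visits++;
        }
    }

    /* Sanity check against the closed form: bpp * n * (n + 1) / 2. */
    unsigned long long n = n_pages;
    assert(visits == (unsigned long long)blocks_per_page * (n * (n + 1) / 2));
    return visits;
}
```

Under these assumptions the visit count grows with the square of the
number of pages.  For 12064 pages of 32 blocks each, the closed form
gives 32 * 12064 * 12065 / 2 = 2,328,834,560 visits, which is the kind
of work that a per-free linear search implies at this pool size.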

mpt3sas also has some other DMA pools, but chain_dma_pool is the only
one with so many allocations:

cat /sys/devices/pci0000:80/0000:80:07.0/0000:85:00.0/pools
(manually cleaned up column alignment; columns are pool name, blocks in
use, total blocks, block size in bytes, and pages)
poolinfo - 0.1
reply_post_free_array pool  1      21     192     1
reply_free pool             1      1      41728   1
reply pool                  1      1      1335296 1
sense pool                  1      1      970272  1
chain pool                  373959 386048 128     12064
reply_post_free pool        12     12     166528  12
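
The numbers in this listing can be cross-checked with a little
arithmetic.  The sketch below assumes that the columns are pool name,
blocks in use, total blocks, block size in bytes, and pages, and that
each pool "page" is max(block size, PAGE_SIZE) bytes with a 4096-byte
PAGE_SIZE.  The helper name total_blocks() and the ASSUMED_PAGE_SIZE
constant are hypothetical.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages, as on typical x86-64 systems. */
#define ASSUMED_PAGE_SIZE 4096u

/*
 * Expected "total blocks" column for a pool, given its block size and
 * page count.  Each pool page is modeled as max(block_size, PAGE_SIZE)
 * bytes, carved into as many whole blocks as fit.
 */
size_t total_blocks(size_t block_size, size_t pages)
{
    size_t allocation = block_size > ASSUMED_PAGE_SIZE
                            ? block_size
                            : ASSUMED_PAGE_SIZE;

    return pages * (allocation / block_size);
}
```

For the chain pool, total_blocks(128, 12064) gives 12064 * 32 = 386048,
matching the listing, and 373959 of those blocks (about 97%) are in use.
The same helper gives 21 for the reply_post_free_array pool (one page of
192-byte blocks).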

The patches in this series improve the scalability of the DMA pool
implementation, which cuts the running time of the DMA alloc/free loops
from hundreds or thousands of milliseconds to under 20 ms.  With the
patches applied, "modprobe mpt3sas", "rmmod mpt3sas", and system
shutdown/reboot with mpt3sas loaded are correspondingly faster.  Here
are some benchmarks (of DMA alloc/free only, not the entire
modprobe/rmmod):

dma_pool_create() + dma_pool_alloc() loop, size = 128, count = 373959
  original:        350 ms ( 1x)
  dmapool patches:  17 ms (21x)

dma_pool_free() loop + dma_pool_destroy(), size = 128, count = 373959
  original:        8901 ms (   1x)
  dmapool patches:   15 ms ( 618x)


* Re: [PATCH v4 0/9] mpt3sas and dmapool scalability
@ 2018-11-13  6:44 ` Matthew Wilcox
From: Matthew Wilcox @ 2018-11-13  6:44 UTC
  To: Tony Battersby
  Cc: Christoph Hellwig, Marek Szyprowski, iommu, linux-mm, linux-scsi

On Mon, Nov 12, 2018 at 10:40:57AM -0500, Tony Battersby wrote:
> I posted v3 on August 7.  Nobody acked or merged the patches, and then
> I got too busy with other stuff to repost until now.

Thanks for resending.  They were in my pile of things to look at, but
that's an ever-growing pile.

> I believe these patches are ready for merging.

I agree.

> cat /sys/devices/pci0000:80/0000:80:07.0/0000:85:00.0/pools
> (manually cleaned up column alignment)
> poolinfo - 0.1
> reply_post_free_array pool  1      21     192     1
> reply_free pool             1      1      41728   1
> reply pool                  1      1      1335296 1
> sense pool                  1      1      970272  1
> chain pool                  373959 386048 128     12064
> reply_post_free pool        12     12     166528  12

That reply pool ... 1 object of 1.3MB?  That's a lot of strain to put
on the page allocator.  I wonder if anything can be done about that.

(I'm equally non-thrilled about the sense pool, the reply_post_free pool
and the reply_free pool, but they seem a little less stressful than the
reply pool)
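
To put rough numbers on that strain: if each of these objects has to
come from a physically contiguous, power-of-two sized run of pages, the
required allocation order can be estimated from the object size.  The
sketch below assumes 4096-byte pages and a buddy-style allocator.
Whether a given platform actually serves these DMA allocations that way
(rather than through an IOMMU or a reserved region) is also an
assumption, and the helper name alloc_order() is hypothetical.

```c
#include <stddef.h>

/*
 * Smallest order such that (page_size << order) >= size, i.e. the
 * power-of-two number of contiguous pages needed to hold one object.
 */
unsigned int alloc_order(size_t size, size_t page_size)
{
    size_t pages = (size + page_size - 1) / page_size;  /* round up */
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

With 4096-byte pages this gives order 9 (512 pages, 2 MiB) for the
1335296-byte reply pool object, order 8 for the sense pool, order 6 for
the reply_post_free pool, and order 4 for the reply_free pool.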
