* [RFC 0/8] Reliably generate large request from SRP
@ 2011-01-19  4:27 David Dillow
From: David Dillow @ 2011-01-19  4:27 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

A persistent thorn in our side has been getting large (1 MB+) requests
from SRP on a system that has been up for any length of time. As we're
using RAID6 8+2 LUNs, we need to generate a full 1 MB IO to avoid a
read-modify-write (R/M/W) cycle on some hardware, and other hardware just
likes the larger requests, even without the penalty of an R/M/W cycle. The
existing code wouldn't reliably generate the requests because its
sg_tablesize was limited to 255 or less by the number of descriptors we
could fit in the SRP_CMD message.
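
To make the arithmetic behind that limit concrete, here is a minimal
sketch. It assumes 4 KiB pages and that every SG entry ends up covering
exactly one page (no coalescing), which is the worst case on a fragmented,
long-running system. The helper names are hypothetical and do not come
from ib_srp.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest request, in bytes, that fits in one SG table when each entry
 * maps exactly one page (assumption: nothing gets coalesced).
 */
static size_t max_request_bytes(unsigned int sg_tablesize, size_t page_size)
{
	assert(page_size > 0);
	return (size_t)sg_tablesize * page_size;
}

/* Number of requests needed to carry io_bytes under that limit. */
static size_t requests_needed(size_t io_bytes, unsigned int sg_tablesize,
			      size_t page_size)
{
	size_t max = max_request_bytes(sg_tablesize, page_size);

	assert(max > 0);
	return (io_bytes + max - 1) / max;	/* round up */
}
```

Under those assumptions, max_request_bytes(255, 4096) is 1044480 bytes
(1020 KiB), one page short of 1 MiB, so requests_needed(1048576, 255, 4096)
is 2 and the full-stripe write is split.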

Now that at least one vendor is implementing full support for the SRP
indirect memory descriptor tables, we can safely expand the sg_tablesize
and realize some performance gains, in many cases quite large. I don't
have vendor code that implements the full support needed for safety, but
the rarity of FMR mapping failures allows the mapping code to function,
at some risk, with existing targets.

I've done some quick testing against an older generation of hardware RAID6
for these numbers.  They are streaming writes using a queue depth of 64.
The SATA numbers are against a LUN built with 8+2 1 TB SATA drives; the SAS
numbers are against a LUN built with two volumes of 8+2 1 TB SAS drives in
a RAID 0 config.  In all cases, the write cache is disabled, and
dma_boundary on the SRP initiator is set such that no coalescing occurs on
the SG list. The IOMMU has been disabled, and max_sectors_kb has been set
to the IO size under test, which matches the IO request size from the
application. For the baseline testing, the IO request is broken into
multiple pieces before being sent due to the sg_tablesize being capped at
255. For the patched numbers, the request was sent intact. These numbers
are for SRP_FMR_SIZE == 256, but I expect the 512 numbers to be similar.


Device  IO size  Baseline   Patched
SAS     1M       524 MB/s   1004 MB/s
SAS     2M       520 MB/s    861 MB/s
SAS     4M       529 MB/s    921 MB/s
SAS     8M       600 MB/s    951 MB/s

SATA    1M       385 MB/s    515 MB/s
SATA    2M       394 MB/s    591 MB/s
SATA    4M       377 MB/s    565 MB/s
SATA    8M       419 MB/s    616 MB/s
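
For reading the table as relative gains, a small helper like the one below
can be applied to any row. This is only an illustrative sketch; the
function name is hypothetical and the inputs are the MB/s figures above.

```c
#include <assert.h>

/*
 * Relative throughput gain of the patched run over the baseline run,
 * in percent. Both arguments are in the same unit (MB/s here).
 */
static double gain_percent(double baseline, double patched)
{
	assert(baseline > 0.0);
	return (patched - baseline) / baseline * 100.0;
}
```

For example, gain_percent(524, 1004) for the SAS 1M row comes out a little
under 92%, and gain_percent(385, 515) for the SATA 1M row a little under
34%.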


Similar gains are found at other queue depths, but I've not done a full
parameter search.

Testing the lock scaling capability with fio indicates an increase in
command throughput except in the single-threaded case. This is an
unexpected improvement and needs further examination.

I've only played with performance testing; I need to test data integrity
as well.



David Dillow (8):
  IB/srp: always avoid non-zero offsets into an FMR
  IB/srp: move IB CM setup completion into its own function
  IB/srp: allow sg_tablesize to be set for each target
  IB/srp: rework mapping engine to use multiple FMR entries
  IB/srp: add safety valve for large SG tables without HW support
  IB/srp: add support for indirect tables that don't fit in SRP_CMD
  IB/srp: try to use larger FMR sizes to cover our mappings
  IB/srp and direct IO: patches for testing large indirect tables

 drivers/infiniband/ulp/srp/ib_srp.c |  736 +++++++++++++++++++++++------------
 drivers/infiniband/ulp/srp/ib_srp.h |   38 ++-
 fs/direct-io.c                      |    1 +
 3 files changed, 525 insertions(+), 250 deletions(-)

