linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Guenter Roeck <linux@roeck-us.net>
To: Ming Lei <ming.lei@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Bart Van Assche <bvanassche@acm.org>,
	"Ewan D . Milne" <emilne@redhat.com>,
	Hannes Reinecke <hare@suse.com>
Subject: Re: [PATCH V4 3/3] scsi: core: avoid to pre-allocate big chunk for sg list
Date: Mon, 3 Jun 2019 13:44:22 -0700	[thread overview]
Message-ID: <20190603204422.GA7240@roeck-us.net> (raw)
In-Reply-To: <20190428073932.9898-4-ming.lei@redhat.com>

On Sun, Apr 28, 2019 at 03:39:32PM +0800, Ming Lei wrote:
> Now scsi_mq_setup_tags() pre-allocates a big buffer for IO sg list,
> and the buffer size is scsi_mq_sgl_size() which depends on smaller
> value between shost->sg_tablesize and SG_CHUNK_SIZE.
> 
> Modern HBA's DMA is often capable of deadling with very big segment
> number, so scsi_mq_sgl_size() is often big. Suppose the max sg number
> of SG_CHUNK_SIZE is taken, scsi_mq_sgl_size() will be 4KB.
> 
> Then if one HBA has lots of queues, and each hw queue's depth is
> high, pre-allocation for sg list can consume huge memory.
> For example of lpfc, nr_hw_queues can be 70, each queue's depth
> can be 3781, so the pre-allocation for data sg list is 70*3781*2k
> =517MB for single HBA.
> 
> There is Red Hat internal report that scsi_debug based tests can't
> be run any more since legacy io path is killed because too big
> pre-allocation.
> 
> So switch to runtime allocation for sg list, meantime pre-allocate 2
> inline sg entries. This way has been applied to NVMe PCI for a while,
> so it should be fine for SCSI too. Also runtime sg entries allocation
> has verified and run always in the original legacy io path.
> 
> Not see performance effect in my big BS test on scsi_debug.
> 

This patch causes a variety of boot failures in -next. Typical failure
pattern is scsi hangs or failure to find a root file system. For example,
on alpha, trying to boot from usb:

scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
sd 0:0:0:0: Power-on or device reset occurred
sd 0:0:0:0: [sda] 20480 512-byte logical blocks: (10.5 MB/10.0 MiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] Attached SCSI disk
usb 1-1: reset full-speed USB device number 2 using ohci-pci
sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 00 58 00 00 42 00
print_req_error: I/O error, dev sda, sector 88 flags 80000
EXT4-fs error (device sda): __ext4_get_inode_loc:4703: inode #2: block 44: comm
swapper: unable to read itable block
EXT4-fs (sda): get root inode failed
EXT4-fs (sda): mount failed
VFS: Cannot open root device "sda" or unknown-block(8,0): error -5

Similar problems are seen with other architectures.
Reverting the patch fixes the problem. Bisect log attached.

Guenter

---
# bad: [ae3cad8f39ccf8d31775d9737488bccf0e44d370] Add linux-next specific files for 20190603
# good: [f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a] Linux 5.2-rc3
git bisect start 'HEAD' 'v5.2-rc3'
# good: [8ff6f4c6e067a9d3f3bbacf02c4ea5eb81fe2c6a] Merge remote-tracking branch 'crypto/master'
git bisect good 8ff6f4c6e067a9d3f3bbacf02c4ea5eb81fe2c6a
# good: [6c93755861ce6a6dd904df9cdae9f08671132dbe] Merge remote-tracking branch 'iommu/next'
git bisect good 6c93755861ce6a6dd904df9cdae9f08671132dbe
# good: [1a567956cb3be5754d94ce9380a2151e57e204a7] Merge remote-tracking branch 'cgroup/for-next'
git bisect good 1a567956cb3be5754d94ce9380a2151e57e204a7
# bad: [a6878ca73cf30b83efbdfb1ecc443d7cfb2d8193] Merge remote-tracking branch 'rtc/rtc-next'
git bisect bad a6878ca73cf30b83efbdfb1ecc443d7cfb2d8193
# bad: [25e7f57cf9f86076704b543628a0f02d3d733726] Merge remote-tracking branch 'gpio/for-next'
git bisect bad 25e7f57cf9f86076704b543628a0f02d3d733726
# bad: [28e85db94534c92fa00d787264e3ea1843d1dc42] scsi: lpfc: Fix nvmet handling of received ABTS for unmapped frames
git bisect bad 28e85db94534c92fa00d787264e3ea1843d1dc42
# bad: [cf9eddf616bb03387e597629142b0be34111e8d0] scsi: hpsa: check for tag collision
git bisect bad cf9eddf616bb03387e597629142b0be34111e8d0
# good: [4e74166c52a836f05d4bd8270835703908b34d3e] scsi: libsas: switch sas_ata.[ch] to SPDX tags
git bisect good 4e74166c52a836f05d4bd8270835703908b34d3e
# good: [f186090846c29fd9760917fb3d01f095c39262e0] scsi: lib/sg_pool.c: improve APIs for allocating sg pool
git bisect good f186090846c29fd9760917fb3d01f095c39262e0
# bad: [12b6b55806928690359bb21de3a19199412289fd] scsi: sd: Inline sd_probe_part2()
git bisect bad 12b6b55806928690359bb21de3a19199412289fd
# bad: [c3288dd8c2328a74cb2157dff18d13f6a75cedd2] scsi: core: avoid pre-allocating big SGL for data
git bisect bad c3288dd8c2328a74cb2157dff18d13f6a75cedd2
# good: [0f0e744eae6c8af361d227d3a2286973351e5a2a] scsi: core: avoid pre-allocating big SGL for protection information
git bisect good 0f0e744eae6c8af361d227d3a2286973351e5a2a
# first bad commit: [c3288dd8c2328a74cb2157dff18d13f6a75cedd2] scsi: core: avoid pre-allocating big SGL for data

  parent reply	other threads:[~2019-06-03 21:43 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-28  7:39 [PATCH V4 0/3] scsi: core: avoid big pre-allocation for sg list Ming Lei
2019-04-28  7:39 ` [PATCH V4 1/3] lib/sg_pool.c: improve APIs for allocating sg pool Ming Lei
2019-04-28 12:04   ` Christoph Hellwig
2019-04-28  7:39 ` [PATCH V4 2/3] scsi: core: avoid to pre-allocate big chunk for protection meta data Ming Lei
2019-04-29 18:15   ` Bart Van Assche
2019-04-28  7:39 ` [PATCH V4 3/3] scsi: core: avoid to pre-allocate big chunk for sg list Ming Lei
2019-04-29 18:16   ` Bart Van Assche
2019-06-03 20:44   ` Guenter Roeck [this message]
2019-06-04  1:00     ` Ming Lei
2019-06-04  3:49       ` Guenter Roeck
2019-06-04  4:10         ` Ming Lei
2019-06-04 14:51           ` Guenter Roeck
2019-06-04  6:55         ` Ming Lei
2019-05-05  1:10 ` [PATCH V4 0/3] scsi: core: avoid big pre-allocation " Ming Lei
2019-05-14  2:06 ` Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190603204422.GA7240@roeck-us.net \
    --to=linux@roeck-us.net \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=bvanassche@acm.org \
    --cc=emilne@redhat.com \
    --cc=hare@suse.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).