From: Roger Willcocks <roger@filmlight.ltd.uk>
To: Don.Brace@microchip.com
Cc: Roger Willcocks <roger@filmlight.ltd.uk>,
mwilck@suse.com, john.garry@huawei.com, buczek@molgen.mpg.de,
martin.petersen@oracle.com, ming.lei@redhat.com,
jejb@linux.vnet.ibm.com, linux-scsi@vger.kernel.org,
hare@suse.de, Kevin.Barnett@microchip.com, pmenzel@molgen.mpg.de,
hare@suse.com
Subject: Re: [PATCH] scsi: scsi_host_queue_ready: increase busy count early
Date: Mon, 22 Feb 2021 14:23:59 +0000
Message-ID: <0DB85ADC-B962-4AF9-B106-3F3B412CE4DB@filmlight.ltd.uk>
In-Reply-To: <SN6PR11MB28482D89B75197B742459063E1B49@SN6PR11MB2848.namprd11.prod.outlook.com>
FYI, we see exactly this issue on a machine here running CentOS 8.3 (kernel 4.18.0-240.1.1), so presumably it happens in RHEL 8 too.
The controller is an MSCC / Adaptec 3154-8i16e driving 60 x 12TB HGST drives, configured as five twelve-drive RAID-6 arrays, software-striped using md and formatted with XFS.
The test software writes to the array from multiple threads in parallel.
The smartpqi driver would report the controller offline within ten minutes or so, with status code 0x6100c.
I changed the driver to set 'nr_hw_queues = 1' and then tested by filling the array with random files, which took a couple of days and completed fine, so it looks like that one-line change fixes it.
It would, of course, be helpful if this were back-ported.
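To illustrate why the one-line change helps, here is a rough sketch of the queue-depth arithmetic as I understand it from this thread. It assumes a kernel where each hardware queue gets its own tag set of depth can_queue and nothing enforces a host-wide limit. The controller limit and queue count are made-up numbers, and HostConfig, max_in_flight and overcommitted are hypothetical names, not kernel or smartpqi code.

```python
from dataclasses import dataclass


@dataclass
class HostConfig:
    can_queue: int             # per-tag-set depth advertised by the driver
    nr_hw_queues: int          # blk-mq hardware queues exposed by the driver
    host_tagset: bool = False  # True: one tag set shared by all hardware queues


def max_in_flight(cfg: HostConfig) -> int:
    """Upper bound on commands the block layer may have outstanding at once."""
    if cfg.host_tagset:
        # Shared tags: the total is capped at can_queue regardless of queue count.
        return cfg.can_queue
    # Assumed model: every hardware queue has its own can_queue tags.
    return cfg.can_queue * cfg.nr_hw_queues


def overcommitted(cfg: HostConfig, controller_limit: int) -> bool:
    """True if the host could be sent more commands than it can handle."""
    return max_in_flight(cfg) > controller_limit


LIMIT = 1024  # hypothetical controller capacity, for illustration only
QUEUES = 8    # hypothetical number of hardware queues

configs = {
    "unpatched": HostConfig(can_queue=LIMIT, nr_hw_queues=QUEUES),
    "nr_hw_queues = 1": HostConfig(can_queue=LIMIT, nr_hw_queues=1),
    "can_queue / nr_hw_queues": HostConfig(can_queue=LIMIT // QUEUES, nr_hw_queues=QUEUES),
    "host_tagset = 1": HostConfig(can_queue=LIMIT, nr_hw_queues=QUEUES, host_tagset=True),
}

for name, cfg in configs.items():
    print(f"{name:26s} max in flight: {max_in_flight(cfg):5d}  "
          f"over limit: {overcommitted(cfg, LIMIT)}")
```

Under these assumptions only the unpatched configuration can exceed the controller limit; the three workarounds discussed below all keep the total at or under it, and differ only in how the tags are spread across queues.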
--
Roger
> On 3 Feb 2021, at 15:56, Don.Brace@microchip.com wrote:
>
> -----Original Message-----
> From: Martin Wilck [mailto:mwilck@suse.com]
> Subject: Re: [PATCH] scsi: scsi_host_queue_ready: increase busy count early
>
>>
>>
>> Confirmed my suspicions - it looks like the host is sent more commands
>> than it can handle. We would need many disks to see this issue though,
>> which you have.
>>
>> So for stable kernels, 6eb045e092ef is not in 5.4. Next is 5.10, and
>> I suppose it could be simply fixed by setting .host_tagset in scsi
>> host template there.
>>
>> Thanks,
>> John
>> --
>> Don: Even though this works for current kernels, what would be the chances of
>> this getting back-ported to 5.9 or even further?
>>
>> Otherwise the original patch smartpqi_fix_host_qdepth_limit would
>> correct this issue for older kernels.
>
> True. However this is 5.12 material, so we shouldn't be bothered by that here. For 5.5 up to 5.9, you need a workaround. But I'm unsure whether smartpqi_fix_host_qdepth_limit would be the solution.
> You could simply divide can_queue by nr_hw_queues, as suggested before, or even simpler, set nr_hw_queues = 1.
>
> How much performance would that cost you?
>
> Don: For my HBA disk tests...
>
> Dividing can_queue / nr_hw_queues is about a 40% drop.
> ~380K - 400K IOPS
> Setting nr_hw_queues = 1 results in a 1.5 X gain in performance.
> ~980K IOPS
> Setting host_tagset = 1
> ~640K IOPS
>
> So, it seems that setting nr_hw_queues = 1 results in the best performance.
>
> Is this expected? Would this also be true for the future?
>
> Thanks,
> Don Brace
>
> Below is my setup.
> ---
> [3:0:0:0] disk HP EG0900FBLSK HPD7 /dev/sdd
> [3:0:1:0] disk HP EG0900FBLSK HPD7 /dev/sde
> [3:0:2:0] disk HP EG0900FBLSK HPD7 /dev/sdf
> [3:0:3:0] disk HP EH0300FBQDD HPD5 /dev/sdg
> [3:0:4:0] disk HP EG0900FDJYR HPD4 /dev/sdh
> [3:0:5:0] disk HP EG0300FCVBF HPD9 /dev/sdi
> [3:0:6:0] disk HP EG0900FBLSK HPD7 /dev/sdj
> [3:0:7:0] disk HP EG0900FBLSK HPD7 /dev/sdk
> [3:0:8:0] disk HP EG0900FBLSK HPD7 /dev/sdl
> [3:0:9:0] disk HP MO0200FBRWB HPD9 /dev/sdm
> [3:0:10:0] disk HP MM0500FBFVQ HPD8 /dev/sdn
> [3:0:11:0] disk ATA MM0500GBKAK HPGC /dev/sdo
> [3:0:12:0] disk HP EG0900FBVFQ HPDC /dev/sdp
> [3:0:13:0] disk HP VO006400JWZJT HP00 /dev/sdq
> [3:0:14:0] disk HP VO015360JWZJN HP00 /dev/sdr
> [3:0:15:0] enclosu HP D3700 5.04 -
> [3:0:16:0] enclosu HP D3700 5.04 -
> [3:0:17:0] enclosu HPE Smart Adapter 3.00 -
> [3:1:0:0] disk HPE LOGICAL VOLUME 3.00 /dev/sds
> [3:2:0:0] storage HPE P408e-p SR Gen10 3.00 -
> -----
> [global]
> ioengine=libaio
> ; rw=randwrite
> ; percentage_random=40
> rw=write
> size=100g
> bs=4k
> direct=1
> ramp_time=15
> ; filename=/mnt/fio_test
> ; cpus_allowed=0-27
> iodepth=4096
>
> [/dev/sdd]
> [/dev/sde]
> [/dev/sdf]
> [/dev/sdg]
> [/dev/sdh]
> [/dev/sdi]
> [/dev/sdj]
> [/dev/sdk]
> [/dev/sdl]
> [/dev/sdm]
> [/dev/sdn]
> [/dev/sdo]
> [/dev/sdp]
> [/dev/sdq]
> [/dev/sdr]
>
>
> Distribution kernels would be yet another issue; distros can backport host_tagset and get rid of the issue.
>
> Regards
> Martin