From: Jens Axboe <axboe@kernel.dk>
To: James Bottomley <James.Bottomley@HansenPartnership.com>,
Mikael Pettersson <mikpelinux@gmail.com>
Cc: Linux SPARC Kernel Mailing List <sparclinux@vger.kernel.org>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [5.0-rc5 regression] "scsi: kill off the legacy IO path" causes 5 minute delay during boot on Sun Blade 2500
Date: Mon, 11 Feb 2019 08:28:46 -0700 [thread overview]
Message-ID: <44bb4374-0b7c-733b-a53e-92d2f03f2f49@kernel.dk> (raw)
In-Reply-To: <1549898730.2831.6.camel@HansenPartnership.com>
On 2/11/19 8:25 AM, James Bottomley wrote:
> On Sun, 2019-02-10 at 09:35 -0700, Jens Axboe wrote:
>> On 2/10/19 9:25 AM, James Bottomley wrote:
>>> On Sun, 2019-02-10 at 09:05 -0700, Jens Axboe wrote:
>>>> On 2/10/19 8:44 AM, James Bottomley wrote:
>>>>> On Sun, 2019-02-10 at 10:17 +0100, Mikael Pettersson wrote:
>>>>>> On Sat, Feb 9, 2019 at 7:19 PM James Bottomley
>>>>>> <James.Bottomley@hansenpartnership.com> wrote:
>>>>>
>>>>> [...]
>>>>>>> I think the reason for this is that the block mq path
>>>>>>> doesn't feed the kernel entropy pool correctly, hence the
>>>>>>> need to install an entropy gatherer for systems that don't
>>>>>>> have other good random number sources.
>>>>>>
>>>>>> That does sound plausible, I admit I didn't even consider the
>>>>>> possibility that the old block I/O path also was an entropy
>>>>>> source.
>>>>>
>>>>> In theory, the new one should be as well since the rotational
>>>>> entropy collector is on the SCSI completion path. I'd seen
>>>>> the same problem but had assumed it was something someone had
>>>>> done to our internal entropy pool and thus hadn't bisected it.
>>>>
>>>> The difference is that the old stack included ADD_RANDOM by
>>>> default, so this check:
>>>>
>>>> if (blk_queue_add_random(q))
>>>> add_disk_randomness(req->rq_disk);
>>>>
>>>> in scsi_end_request() would be true, and we'd add the randomness.
>>>> For sd, it seems to set it just fine for non-rotational drives.
>>>> Could this be because other devices don't? Maybe the below makes
>>>> a difference.
>>>
>>> No, in both we set it per the rotational parameters of the disk in
>>>
>>> sd.c:sd_read_block_characteristics()
>>>
>>> rot = get_unaligned_be16(&buffer[4]);
>>>
>>> if (rot == 1) {
>>>
>>> blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
>>>
>>> blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
>>> } else {
>>>
>>> blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
>>>
>>> blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q);
>>> }
>>>
>>>
>>> That check wasn't changed by the code removal.
>>
>> As I said above, for sd. This isn't true for non-disks.
>
> Yes, but the behaviour above doesn't change across a switch to MQ, so I
> don't quite understand how it bisects back to that change. If we're
> not gathering entropy for the device now, we wouldn't have been before
> the switch, so the entropy characteristics shouldn't have changed.
But it does, as I also wrote in that first email. The legacy queue
flags had QUEUE_FLAG_ADD_RANDOM set by default, the MQ ones do not.
Hence any non-sd device would previously ALWAYS have ADD_RANDOM
set, now none of them do. Also see the patch I sent.
>>> Although I suspect it should be unconditional: even SSDs have what
>>> would appear as seek latencies at least during writes depending on
>>> the time taken to find an erased block or even trigger garbage
>>> collection. The entropy collector is good at taking something
>>> completely regular and spotting the inconsistencies, so it won't
>>> matter that loads of "seeks" are deterministic.
>>
>> The reason it isn't is that it's of limited use for SSDs where it's a
>> lot more predictable. And they are also a lot faster, which means the
>> adding randomness is more problematic from an efficiency pov.
>
> But that's my point: our entropy extractor is good at weeding out
> predictable signals. Fine, it won't extract any entropy if the disk
> seek time is entirely regular, but it won't contaminate the entropy
> pool. The computational delay, I grant ... it takes a while to
> determine if any entropy is present in the signal.
But you are missing my point - if we're mostly weeding out predictable
signals, then it's pointless to take the overhead of the randomness.
This is why the MQ flag don't include it by default.
> What about feeding it with something like discard timings, which should
> be much less predictable.
That's not true, lots of devices have VERY predictable discard timings.
Most of them will have a fixed discards-per-second rate, even.
--
Jens Axboe
next prev parent reply other threads:[~2019-02-11 15:28 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-09 17:04 [5.0-rc5 regression] "scsi: kill off the legacy IO path" causes 5 minute delay during boot on Sun Blade 2500 Mikael Pettersson
2019-02-09 18:19 ` James Bottomley
2019-02-10 9:17 ` Mikael Pettersson
2019-02-10 15:44 ` James Bottomley
2019-02-10 16:05 ` Jens Axboe
2019-02-10 16:25 ` James Bottomley
2019-02-10 16:35 ` Jens Axboe
2019-02-11 15:25 ` James Bottomley
2019-02-11 15:28 ` Jens Axboe [this message]
2019-02-11 15:42 ` James Bottomley
2019-02-11 15:46 ` Jens Axboe
2019-02-11 16:28 ` James Bottomley
2019-02-11 16:31 ` Jens Axboe
2019-02-12 2:13 ` James Bottomley
2019-02-12 2:50 ` Jens Axboe
2019-02-12 3:37 ` Elliott, Robert (Persistent Memory)
2019-02-12 4:15 ` James Bottomley
2019-02-12 15:24 ` James Bottomley
2019-02-12 15:27 ` Jens Axboe
2019-02-14 18:35 ` Mikael Pettersson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=44bb4374-0b7c-733b-a53e-92d2f03f2f49@kernel.dk \
--to=axboe@kernel.dk \
--cc=James.Bottomley@HansenPartnership.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mikpelinux@gmail.com \
--cc=sparclinux@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).