* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-13 19:21 ` Salman Qazi
@ 2020-02-13 22:08 ` Salman Qazi
2020-02-14 0:25 ` Ming Lei
2020-02-14 5:49 ` Bart Van Assche
2 siblings, 0 replies; 19+ messages in thread
From: Salman Qazi @ 2020-02-13 22:08 UTC (permalink / raw)
To: Bart Van Assche
Cc: Ming Lei, Jens Axboe, Christoph Hellwig,
Linux Kernel Mailing List, linux-block, Gwendal Grignou,
Jesse Barnes
On Thu, Feb 13, 2020 at 11:21 AM Salman Qazi <sqazi@google.com> wrote:
>
> On Thu, Feb 13, 2020 at 9:48 AM Bart Van Assche <bvanassche@acm.org> wrote:
> >
> > On 2/13/20 12:26 AM, Ming Lei wrote:
> > > The approach used in blk_execute_rq() can be borrowed to work around the
> > > issue, such as:
> > >
> > > diff --git a/block/bio.c b/block/bio.c
> > > index 94d697217887..c9ce19a86de7 100644
> > > --- a/block/bio.c
> > > +++ b/block/bio.c
> > > @@ -17,6 +17,7 @@
> > > #include <linux/cgroup.h>
> > > #include <linux/blk-cgroup.h>
> > > #include <linux/highmem.h>
> > > +#include <linux/sched/sysctl.h>
> > >
> > > #include <trace/events/block.h>
> > > #include "blk.h"
> > > @@ -1019,12 +1020,19 @@ static void submit_bio_wait_endio(struct bio *bio)
> > > int submit_bio_wait(struct bio *bio)
> > > {
> > > DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
> > > + unsigned long hang_check;
> > >
> > > bio->bi_private = &done;
> > > bio->bi_end_io = submit_bio_wait_endio;
> > > bio->bi_opf |= REQ_SYNC;
> > > submit_bio(bio);
> > > - wait_for_completion_io(&done);
> > > +
> > > + /* Prevent hang_check timer from firing at us during very long I/O */
> > > + hang_check = sysctl_hung_task_timeout_secs;
> > > + if (hang_check)
> > > + while (!wait_for_completion_io_timeout(&done, hang_check * (HZ/2)));
> > > + else
> > > + wait_for_completion_io(&done);
> > >
> > > return blk_status_to_errno(bio->bi_status);
> > > }
> >
> > Instead of suppressing the hung task complaints, has it been considered
> > to use the bio splitting mechanism to make discard bios smaller? Block
> > drivers may set a limit by calling blk_queue_max_discard_segments().
> > From block/blk-settings.c:
> >
> > /**
> > * blk_queue_max_discard_segments - set max segments for discard
> > * requests
> > * @q: the request queue for the device
> > * @max_segments: max number of segments
> > *
> > * Description:
> > * Enables a low level driver to set an upper limit on the number of
> > * segments in a discard request.
> > **/
> > void blk_queue_max_discard_segments(struct request_queue *q,
> > unsigned short max_segments)
> > {
> > q->limits.max_discard_segments = max_segments;
> > }
> > EXPORT_SYMBOL_GPL(blk_queue_max_discard_segments);
> >
>
> AFAICT, this is not actually sufficient, because the issuer of the bio
> is waiting for the entire bio, regardless of how it is split later.
> But there also isn't a good mapping between the size of a secure
> discard and how long it will take. Given the geometry of a flash
> device, it is not hard to construct a scenario where a relatively
> small secure discard (a few thousand sectors) will take a very long
> time (multiple seconds).
>
> Having said that, I don't like neutering the hung task timer either.
In fact, it's worse than that. Today, I was able to construct a case
of a 4K discard on a particular device that took 100 seconds. I did
this by arranging to write a single copy of page 0 for every erase
unit of the device, while writing random LBAs to the rest of each
erase unit. I suspect the slow speed comes from the need to copy
almost the entire device to erase all the stale copies of page 0.
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <assert.h>
#include <unistd.h>
#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

char page[8192];

int main(int argc, char **argv)
{
	unsigned long start;
	int fd;
	unsigned long i;
	char *page_aligned = (char *)(((unsigned long)page + 4095) & ~4095UL);
	unsigned long range[2];

	fd = open(argv[1], O_RDWR | O_DIRECT);
	assert(fd >= 0);
	range[0] = 0;
	assert(ioctl(fd, BLKGETSIZE64, &range[1]) >= 0);
	for (i = 0; i < range[1]; i += 4096) {
		/* paranoia: in case there is any deduping */
		page_aligned[0] = i;
		if (i % (4 * 1024 * 1024) != 0)
			/* almost always write randomly */
			assert(pwrite(fd, page_aligned, 4096,
				      (lrand48() % range[1]) & ~4095UL) == 4096);
		else
			/* except, once per erase block, write page 0 */
			assert(pwrite(fd, page_aligned, 4096, 0) == 4096);
	}
	start = time(NULL);
	/* discard exactly one page */
	range[1] = 4096;
	printf("Starting discard %lu!\n", start);
	assert(ioctl(fd, BLKSECDISCARD, &range) >= 0);
	printf("Finished discard. Took %lu!\n", time(NULL) - start);
	close(fd);
	return 0;
}
>
> > Thanks,
> >
> > Bart.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-13 19:21 ` Salman Qazi
2020-02-13 22:08 ` Salman Qazi
@ 2020-02-14 0:25 ` Ming Lei
2020-02-14 5:49 ` Bart Van Assche
2 siblings, 0 replies; 19+ messages in thread
From: Ming Lei @ 2020-02-14 0:25 UTC (permalink / raw)
To: Salman Qazi
Cc: Bart Van Assche, Jens Axboe, Christoph Hellwig,
Linux Kernel Mailing List, linux-block, Gwendal Grignou,
Jesse Barnes
On Thu, Feb 13, 2020 at 11:21:37AM -0800, Salman Qazi wrote:
> On Thu, Feb 13, 2020 at 9:48 AM Bart Van Assche <bvanassche@acm.org> wrote:
> >
> > On 2/13/20 12:26 AM, Ming Lei wrote:
> > > The approach used in blk_execute_rq() can be borrowed to work around the
> > > issue, such as:
> > >
> > > diff --git a/block/bio.c b/block/bio.c
> > > index 94d697217887..c9ce19a86de7 100644
> > > --- a/block/bio.c
> > > +++ b/block/bio.c
> > > @@ -17,6 +17,7 @@
> > > #include <linux/cgroup.h>
> > > #include <linux/blk-cgroup.h>
> > > #include <linux/highmem.h>
> > > +#include <linux/sched/sysctl.h>
> > >
> > > #include <trace/events/block.h>
> > > #include "blk.h"
> > > @@ -1019,12 +1020,19 @@ static void submit_bio_wait_endio(struct bio *bio)
> > > int submit_bio_wait(struct bio *bio)
> > > {
> > > DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
> > > + unsigned long hang_check;
> > >
> > > bio->bi_private = &done;
> > > bio->bi_end_io = submit_bio_wait_endio;
> > > bio->bi_opf |= REQ_SYNC;
> > > submit_bio(bio);
> > > - wait_for_completion_io(&done);
> > > +
> > > + /* Prevent hang_check timer from firing at us during very long I/O */
> > > + hang_check = sysctl_hung_task_timeout_secs;
> > > + if (hang_check)
> > > + while (!wait_for_completion_io_timeout(&done, hang_check * (HZ/2)));
> > > + else
> > > + wait_for_completion_io(&done);
> > >
> > > return blk_status_to_errno(bio->bi_status);
> > > }
> >
> > Instead of suppressing the hung task complaints, has it been considered
> > to use the bio splitting mechanism to make discard bios smaller? Block
> > drivers may set a limit by calling blk_queue_max_discard_segments().
> > From block/blk-settings.c:
> >
> > /**
> > * blk_queue_max_discard_segments - set max segments for discard
> > * requests
> > * @q: the request queue for the device
> > * @max_segments: max number of segments
> > *
> > * Description:
> > * Enables a low level driver to set an upper limit on the number of
> > * segments in a discard request.
> > **/
> > void blk_queue_max_discard_segments(struct request_queue *q,
> > unsigned short max_segments)
> > {
> > q->limits.max_discard_segments = max_segments;
> > }
> > EXPORT_SYMBOL_GPL(blk_queue_max_discard_segments);
> >
>
> AFAICT, this is not actually sufficient, because the issuer of the bio
> is waiting for the entire bio, regardless of how it is split later.
Right.
> But there also isn't a good mapping between the size of a secure
> discard and how long it will take. Given the geometry of a flash
> device, it is not hard to construct a scenario where a relatively
> small secure discard (a few thousand sectors) will take a very long
> time (multiple seconds).
It isn't strange to see such an implementation, and I have also seen
discard requests time out.
>
> Having said that, I don't like neutering the hung task timer either.
But it does fix the hung warning, doesn't it?
thanks,
Ming
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-13 19:21 ` Salman Qazi
2020-02-13 22:08 ` Salman Qazi
2020-02-14 0:25 ` Ming Lei
@ 2020-02-14 5:49 ` Bart Van Assche
2020-02-14 9:22 ` Ming Lei
2 siblings, 1 reply; 19+ messages in thread
From: Bart Van Assche @ 2020-02-14 5:49 UTC (permalink / raw)
To: Salman Qazi
Cc: Ming Lei, Jens Axboe, Christoph Hellwig,
Linux Kernel Mailing List, linux-block, Gwendal Grignou,
Jesse Barnes
On 2020-02-13 11:21, Salman Qazi wrote:
> AFAICT, this is not actually sufficient, because the issuer of the bio
> is waiting for the entire bio, regardless of how it is split later.
> But there also isn't a good mapping between the size of a secure
> discard and how long it will take. Given the geometry of a flash
> device, it is not hard to construct a scenario where a relatively
> small secure discard (a few thousand sectors) will take a very long
> time (multiple seconds).
>
> Having said that, I don't like neutering the hung task timer either.
Hi Salman,
How about modifying the block layer such that completions of bio
fragments are considered as task activity? I think that bio splitting is
rare enough for such a change not to affect performance of the hot path.
How about setting max_discard_segments such that a discard always
completes in less than half the hung task timeout? This may make
discards a bit slower for one particular block driver but I think that's
better than hung task complaints.
Thanks,
Bart.
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-14 5:49 ` Bart Van Assche
@ 2020-02-14 9:22 ` Ming Lei
2020-02-14 19:42 ` Salman Qazi
0 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2020-02-14 9:22 UTC (permalink / raw)
To: Bart Van Assche
Cc: Salman Qazi, Ming Lei, Jens Axboe, Christoph Hellwig,
Linux Kernel Mailing List, linux-block, Gwendal Grignou,
Jesse Barnes
On Fri, Feb 14, 2020 at 1:50 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 2020-02-13 11:21, Salman Qazi wrote:
> > AFAICT, this is not actually sufficient, because the issuer of the bio
> > is waiting for the entire bio, regardless of how it is split later.
> > But there also isn't a good mapping between the size of a secure
> > discard and how long it will take. Given the geometry of a flash
> > device, it is not hard to construct a scenario where a relatively
> > small secure discard (a few thousand sectors) will take a very long
> > time (multiple seconds).
> >
> > Having said that, I don't like neutering the hung task timer either.
>
> Hi Salman,
>
> How about modifying the block layer such that completions of bio
> fragments are considered as task activity? I think that bio splitting is
> rare enough for such a change not to affect performance of the hot path.
Are you sure that the task hung warning won't be triggered in case of
non-splitting?
>
> How about setting max_discard_segments such that a discard always
> completes in less than half the hung task timeout? This may make
> discards a bit slower for one particular block driver but I think that's
> better than hung task complaints.
I am afraid you can't find a golden max_discard_segments setting that
works for every driver. Even if one is found, performance may be affected.
So just wondering why not take the simple approach used in blk_execute_rq()?
Thanks,
Ming Lei
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-14 9:22 ` Ming Lei
@ 2020-02-14 19:42 ` Salman Qazi
2020-02-15 3:46 ` Ming Lei
0 siblings, 1 reply; 19+ messages in thread
From: Salman Qazi @ 2020-02-14 19:42 UTC (permalink / raw)
To: Ming Lei
Cc: Bart Van Assche, Ming Lei, Jens Axboe, Christoph Hellwig,
Linux Kernel Mailing List, linux-block, Gwendal Grignou,
Jesse Barnes
On Fri, Feb 14, 2020 at 1:23 AM Ming Lei <tom.leiming@gmail.com> wrote:
>
> On Fri, Feb 14, 2020 at 1:50 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >
> > On 2020-02-13 11:21, Salman Qazi wrote:
> > > AFAICT, this is not actually sufficient, because the issuer of the bio
> > > is waiting for the entire bio, regardless of how it is split later.
> > > But there also isn't a good mapping between the size of a secure
> > > discard and how long it will take. Given the geometry of a flash
> > > device, it is not hard to construct a scenario where a relatively
> > > small secure discard (a few thousand sectors) will take a very long
> > > time (multiple seconds).
> > >
> > > Having said that, I don't like neutering the hung task timer either.
> >
> > Hi Salman,
> >
> > How about modifying the block layer such that completions of bio
> > fragments are considered as task activity? I think that bio splitting is
> > rare enough for such a change not to affect performance of the hot path.
>
> Are you sure that the task hung warning won't be triggered in case of
> non-splitting?
I demonstrated a few emails ago that it doesn't take a very large
secure discard command to trigger this. So, I am sceptical that we
will be able to use splitting to solve this.
>
> >
> > How about setting max_discard_segments such that a discard always
> > completes in less than half the hung task timeout? This may make
> > discards a bit slower for one particular block driver but I think that's
> > better than hung task complaints.
>
> I am afraid you can't find a golden max_discard_segments setting that
> works for every driver. Even if one is found, performance may be affected.
>
> So just wondering why not take the simple approach used in blk_execute_rq()?
My colleague Gwendal pointed out another issue, which I had missed:
secure discard is an exclusive command; it monopolizes the device.
Even if we fix this via your approach, the problem will show up
somewhere else, because other operations to the drive will not make
progress for that length of time.
For Chromium OS purposes, if we had a blank slate, this is how I would solve it:
* Under the assumption that the truly sensitive data is not very big:
* Keep secure data on a separate partition to make sure that those
LBAs have controlled history
* Treat the files in that partition as immutable (i.e. no
overwriting the contents of the file without first secure erasing the
existing contents).
* By never letting more than one version of the file accumulate,
we can guarantee that the secure erase will always be fast for
moderate sized files.
But for all the existing machines with keys on them, we will need to
do something else.
>
> Thanks,
> Ming Lei
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-14 19:42 ` Salman Qazi
@ 2020-02-15 3:46 ` Ming Lei
2020-02-18 16:11 ` Jesse Barnes
0 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2020-02-15 3:46 UTC (permalink / raw)
To: Salman Qazi
Cc: Ming Lei, Bart Van Assche, Jens Axboe, Christoph Hellwig,
Linux Kernel Mailing List, linux-block, Gwendal Grignou,
Jesse Barnes
On Fri, Feb 14, 2020 at 11:42:32AM -0800, Salman Qazi wrote:
> On Fri, Feb 14, 2020 at 1:23 AM Ming Lei <tom.leiming@gmail.com> wrote:
> >
> > On Fri, Feb 14, 2020 at 1:50 PM Bart Van Assche <bvanassche@acm.org> wrote:
> > >
> > > On 2020-02-13 11:21, Salman Qazi wrote:
> > > > AFAICT, this is not actually sufficient, because the issuer of the bio
> > > > is waiting for the entire bio, regardless of how it is split later.
> > > > But there also isn't a good mapping between the size of a secure
> > > > discard and how long it will take. Given the geometry of a flash
> > > > device, it is not hard to construct a scenario where a relatively
> > > > small secure discard (a few thousand sectors) will take a very long
> > > > time (multiple seconds).
> > > >
> > > > Having said that, I don't like neutering the hung task timer either.
> > >
> > > Hi Salman,
> > >
> > > How about modifying the block layer such that completions of bio
> > > fragments are considered as task activity? I think that bio splitting is
> > > rare enough for such a change not to affect performance of the hot path.
> >
> > Are you sure that the task hung warning won't be triggered in case of
> > non-splitting?
>
> I demonstrated a few emails ago that it doesn't take a very large
> secure discard command to trigger this. So, I am sceptical that we
> will be able to use splitting to solve this.
>
> >
> > >
> > > How about setting max_discard_segments such that a discard always
> > > completes in less than half the hung task timeout? This may make
> > > discards a bit slower for one particular block driver but I think that's
> > > better than hung task complaints.
> >
> > I am afraid you can't find a golden max_discard_segments setting that
> > works for every driver. Even if one is found, performance may be affected.
> >
> > So just wondering why not take the simple approach used in blk_execute_rq()?
>
> My colleague Gwendal pointed out another issue, which I had missed:
> secure discard is an exclusive command; it monopolizes the device.
> Even if we fix this via your approach, the problem will show up
> somewhere else, because other operations to the drive will not make
> progress for that length of time.
What are the 'other operations'? Are they block IOs?
If yes, that is why I suggest fixing submit_bio_wait(), which should cover
most sync bio submissions.
Anyway, the fix is simple and generic enough; I plan to post a formal
patch if no one comes up with a better workable approach.
>
> For Chromium OS purposes, if we had a blank slate, this is how I would solve it:
>
> * Under the assumption that the truly sensitive data is not very big:
> * Keep secure data on a separate partition to make sure that those
> LBAs have controlled history
> * Treat the files in that partition as immutable (i.e. no
> overwriting the contents of the file without first secure erasing the
> existing contents).
> * By never letting more than one version of the file accumulate,
> we can guarantee that the secure erase will always be fast for
> moderate sized files.
>
> But for all the existing machines with keys on them, we will need to
> do something else.
The issue you reported is a generic one, not Chromium only.
Thanks,
Ming
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-15 3:46 ` Ming Lei
@ 2020-02-18 16:11 ` Jesse Barnes
2020-02-19 1:37 ` Ming Lei
2020-02-19 2:54 ` Ming Lei
0 siblings, 2 replies; 19+ messages in thread
From: Jesse Barnes @ 2020-02-18 16:11 UTC (permalink / raw)
To: Ming Lei
Cc: Salman Qazi, Ming Lei, Bart Van Assche, Jens Axboe,
Christoph Hellwig, Linux Kernel Mailing List, linux-block,
Gwendal Grignou
On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@redhat.com> wrote:
> What are the 'other operations'? Are they block IOs?
>
> If yes, that is why I suggest fixing submit_bio_wait(), which should cover
> most sync bio submissions.
>
> Anyway, the fix is simple and generic enough; I plan to post a formal
> patch if no one comes up with a better workable approach.
Yeah I think any block I/O operation that occurs after the
BLKSECDISCARD is submitted will also potentially be affected by the
hung task timeouts, and I think your patch will address that. My only
concern with it is that it might hide some other I/O "hangs" that are
due to device misbehavior instead. Yes driver and device timeouts
should generally catch those, but with this in place we might miss a
few bugs.
Given the nature of these types of storage devices though, I think
that's a minor issue and not worth blocking the patch on, given that
it should prevent a lot of false positive hang reports as Salman
demonstrated.
Thanks,
Jesse
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-18 16:11 ` Jesse Barnes
@ 2020-02-19 1:37 ` Ming Lei
2020-02-19 2:54 ` Ming Lei
1 sibling, 0 replies; 19+ messages in thread
From: Ming Lei @ 2020-02-19 1:37 UTC (permalink / raw)
To: Jesse Barnes
Cc: Salman Qazi, Ming Lei, Bart Van Assche, Jens Axboe,
Christoph Hellwig, Linux Kernel Mailing List, linux-block,
Gwendal Grignou
On Tue, Feb 18, 2020 at 08:11:53AM -0800, Jesse Barnes wrote:
> On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@redhat.com> wrote:
> > What are the 'other operations'? Are they block IOs?
> >
> > If yes, that is why I suggest fixing submit_bio_wait(), which should cover
> > most sync bio submissions.
> >
> > Anyway, the fix is simple and generic enough; I plan to post a formal
> > patch if no one comes up with a better workable approach.
>
> Yeah I think any block I/O operation that occurs after the
> BLKSECDISCARD is submitted will also potentially be affected by the
> hung task timeouts, and I think your patch will address that. My only
> concern with it is that it might hide some other I/O "hangs" that are
> due to device misbehavior instead. Yes driver and device timeouts
> should generally catch those, but with this in place we might miss a
> few bugs.
We could add a warning (device name, request, ...) when an IO isn't
completed within sysctl_hung_task_timeout_secs, so this kind of device
misbehavior can still be caught.
>
> Given the nature of these types of storage devices though, I think
> that's a minor issue and not worth blocking the patch on, given that
> it should prevent a lot of false positive hang reports as Salman
> demonstrated.
OK, I will post it soon.
Thanks,
Ming
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-18 16:11 ` Jesse Barnes
2020-02-19 1:37 ` Ming Lei
@ 2020-02-19 2:54 ` Ming Lei
2020-02-19 17:54 ` Salman Qazi
1 sibling, 1 reply; 19+ messages in thread
From: Ming Lei @ 2020-02-19 2:54 UTC (permalink / raw)
To: Jesse Barnes
Cc: Salman Qazi, Ming Lei, Bart Van Assche, Jens Axboe,
Christoph Hellwig, Linux Kernel Mailing List, linux-block,
Gwendal Grignou
On Tue, Feb 18, 2020 at 08:11:53AM -0800, Jesse Barnes wrote:
> On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@redhat.com> wrote:
> > What are the 'other operations'? Are they block IOs?
> >
> > If yes, that is why I suggest fixing submit_bio_wait(), which should cover
> > most sync bio submissions.
> >
> > Anyway, the fix is simple and generic enough; I plan to post a formal
> > patch if no one comes up with a better workable approach.
>
> Yeah I think any block I/O operation that occurs after the
> BLKSECDISCARD is submitted will also potentially be affected by the
> hung task timeouts, and I think your patch will address that. My only
> concern with it is that it might hide some other I/O "hangs" that are
> due to device misbehavior instead. Yes driver and device timeouts
> should generally catch those, but with this in place we might miss a
> few bugs.
>
> Given the nature of these types of storage devices though, I think
> that's a minor issue and not worth blocking the patch on, given that
> it should prevent a lot of false positive hang reports as Salman
> demonstrated.
Hello Jesse and Salman,
One more question about this issue, do you enable BLK_WBT on your test
kernel?
Thanks,
Ming
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-19 2:54 ` Ming Lei
@ 2020-02-19 17:54 ` Salman Qazi
2020-02-19 22:22 ` Ming Lei
0 siblings, 1 reply; 19+ messages in thread
From: Salman Qazi @ 2020-02-19 17:54 UTC (permalink / raw)
To: Ming Lei
Cc: Jesse Barnes, Ming Lei, Bart Van Assche, Jens Axboe,
Christoph Hellwig, Linux Kernel Mailing List, linux-block,
Gwendal Grignou
On Tue, Feb 18, 2020 at 6:55 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Tue, Feb 18, 2020 at 08:11:53AM -0800, Jesse Barnes wrote:
> > On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@redhat.com> wrote:
> > > What are the 'other operations'? Are they block IOs?
> > >
> > > If yes, that is why I suggest fixing submit_bio_wait(), which should cover
> > > most sync bio submissions.
> > >
> > > Anyway, the fix is simple and generic enough; I plan to post a formal
> > > patch if no one comes up with a better workable approach.
> >
> > Yeah I think any block I/O operation that occurs after the
> > BLKSECDISCARD is submitted will also potentially be affected by the
> > hung task timeouts, and I think your patch will address that. My only
> > concern with it is that it might hide some other I/O "hangs" that are
> > due to device misbehavior instead. Yes driver and device timeouts
> > should generally catch those, but with this in place we might miss a
> > few bugs.
> >
> > Given the nature of these types of storage devices though, I think
> > that's a minor issue and not worth blocking the patch on, given that
> > it should prevent a lot of false positive hang reports as Salman
> > demonstrated.
>
> Hello Jesse and Salman,
>
> One more question about this issue, do you enable BLK_WBT on your test
> kernel?
It doesn't exist on the original 4.4-based kernel where we reproduced
this bug. I am curious how BLK_WBT interacts with this bug.
>
> Thanks,
> Ming
>
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-19 17:54 ` Salman Qazi
@ 2020-02-19 22:22 ` Ming Lei
2020-02-19 22:26 ` Salman Qazi
0 siblings, 1 reply; 19+ messages in thread
From: Ming Lei @ 2020-02-19 22:22 UTC (permalink / raw)
To: Salman Qazi
Cc: Jesse Barnes, Ming Lei, Bart Van Assche, Jens Axboe,
Christoph Hellwig, Linux Kernel Mailing List, linux-block,
Gwendal Grignou
On Wed, Feb 19, 2020 at 09:54:31AM -0800, Salman Qazi wrote:
> On Tue, Feb 18, 2020 at 6:55 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Tue, Feb 18, 2020 at 08:11:53AM -0800, Jesse Barnes wrote:
> > > On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@redhat.com> wrote:
> > > > What are the 'other operations'? Are they block IOs?
> > > >
> > > > If yes, that is why I suggest fixing submit_bio_wait(), which should cover
> > > > most sync bio submissions.
> > > >
> > > > Anyway, the fix is simple and generic enough; I plan to post a formal
> > > > patch if no one comes up with a better workable approach.
> > >
> > > Yeah I think any block I/O operation that occurs after the
> > > BLKSECDISCARD is submitted will also potentially be affected by the
> > > hung task timeouts, and I think your patch will address that. My only
> > > concern with it is that it might hide some other I/O "hangs" that are
> > > due to device misbehavior instead. Yes driver and device timeouts
> > > should generally catch those, but with this in place we might miss a
> > > few bugs.
> > >
> > > Given the nature of these types of storage devices though, I think
> > > that's a minor issue and not worth blocking the patch on, given that
> > > it should prevent a lot of false positive hang reports as Salman
> > > demonstrated.
> >
> > Hello Jesse and Salman,
> >
> > One more question about this issue, do you enable BLK_WBT on your test
> > kernel?
>
> It doesn't exist on the original 4.4-based kernel where we reproduced
> this bug. I am curious how this interacts with this bug.
blk-wbt can throttle discard requests and keep the discard queue from
getting too deep.
However, given that a block erase is involved in BLKSECDISCARD, I guess
blk-wbt may not avoid this hung task issue completely.
Thanks,
Ming
* Re: BLKSECDISCARD ioctl and hung tasks
2020-02-19 22:22 ` Ming Lei
@ 2020-02-19 22:26 ` Salman Qazi
0 siblings, 0 replies; 19+ messages in thread
From: Salman Qazi @ 2020-02-19 22:26 UTC (permalink / raw)
To: Ming Lei
Cc: Jesse Barnes, Ming Lei, Bart Van Assche, Jens Axboe,
Christoph Hellwig, Linux Kernel Mailing List, linux-block,
Gwendal Grignou
On Wed, Feb 19, 2020 at 2:23 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Wed, Feb 19, 2020 at 09:54:31AM -0800, Salman Qazi wrote:
> > On Tue, Feb 18, 2020 at 6:55 PM Ming Lei <ming.lei@redhat.com> wrote:
> > >
> > > On Tue, Feb 18, 2020 at 08:11:53AM -0800, Jesse Barnes wrote:
> > > > On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@redhat.com> wrote:
> > > > > What are the 'other operations'? Are they block IOs?
> > > > >
> > > > > If yes, that is why I suggest fixing submit_bio_wait(), which should cover
> > > > > most sync bio submissions.
> > > > >
> > > > > Anyway, the fix is simple and generic enough; I plan to post a formal
> > > > > patch if no one comes up with a better workable approach.
> > > >
> > > > Yeah I think any block I/O operation that occurs after the
> > > > BLKSECDISCARD is submitted will also potentially be affected by the
> > > > hung task timeouts, and I think your patch will address that. My only
> > > > concern with it is that it might hide some other I/O "hangs" that are
> > > > due to device misbehavior instead. Yes driver and device timeouts
> > > > should generally catch those, but with this in place we might miss a
> > > > few bugs.
> > > >
> > > > Given the nature of these types of storage devices though, I think
> > > > that's a minor issue and not worth blocking the patch on, given that
> > > > it should prevent a lot of false positive hang reports as Salman
> > > > demonstrated.
> > >
> > > Hello Jesse and Salman,
> > >
> > > One more question about this issue, do you enable BLK_WBT on your test
> > > kernel?
> >
> > It doesn't exist on the original 4.4-based kernel where we reproduced
> > this bug. I am curious how this interacts with this bug.
>
> blk-wbt can throttle discard requests and keep the discard queue from
> getting too deep.
>
> However, given that a block erase is involved in BLKSECDISCARD, I guess
> blk-wbt may not avoid this hung task issue completely.
Thanks for the explanation.
As I said, it takes one 4K BLKSECDISCARD to get to a 100-second delay
during which the entire device is unusable. So the queue doesn't have
to be deep at all; it's a single tiny ioctl.
>
>
> Thanks,
> Ming
>