* [PATCH] Revert "dm bufio: fix deadlock with loop device"
@ 2019-08-08  9:40 Mikulas Patocka
  2019-08-08 14:40 ` Mike Snitzer
  0 siblings, 1 reply; 5+ messages in thread
From: Mikulas Patocka @ 2019-08-08  9:40 UTC (permalink / raw)
  To: Mike Snitzer, junxiao.bi; +Cc: honglei.wang, dm-devel, Alasdair Kergon

Revert the patch bd293d071ffe65e645b4d8104f9d8fe15ea13862. A proper fix
would be to avoid using GFP_KERNEL in the function do_blockdev_direct_IO.

Note that the patch bd293d071ffe doesn't really prevent the deadlock from
occurring - if we look at the stacktrace reported by Junxiao Bi, we see
that it hangs in bit_wait_io and not on the mutex - i.e. it has already
successfully taken the mutex. Changing the mutex from mutex_lock to
mutex_trylock won't help with deadlocks that happen afterwards.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device")

---
 drivers/md/dm-bufio.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2019-08-02 08:52:35.000000000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2019-08-08 11:19:13.000000000 +0200
@@ -1604,7 +1604,9 @@ dm_bufio_shrink_scan(struct shrinker *sh
 	unsigned long freed;
 
 	c = container_of(shrink, struct dm_bufio_client, shrinker);
-	if (!dm_bufio_trylock(c))
+	if (sc->gfp_mask & __GFP_FS)
+		dm_bufio_lock(c);
+	else if (!dm_bufio_trylock(c))
 		return SHRINK_STOP;
 
 	freed  = __scan(c, sc->nr_to_scan, sc->gfp_mask);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Revert "dm bufio: fix deadlock with loop device"
  2019-08-08  9:40 [PATCH] Revert "dm bufio: fix deadlock with loop device" Mikulas Patocka
@ 2019-08-08 14:40 ` Mike Snitzer
  2019-08-08 15:01   ` Mikulas Patocka
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Snitzer @ 2019-08-08 14:40 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: honglei.wang, dm-devel, Alasdair Kergon, junxiao.bi

On Thu, Aug 08 2019 at  5:40am -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Revert the patch bd293d071ffe65e645b4d8104f9d8fe15ea13862. A proper fix
> would be to avoid using GFP_KERNEL in the function do_blockdev_direct_IO.

Matthew Wilcox pointed out that the "proper fix" is for loop to use
memalloc_noio_{save,restore}.

> Note that the patch bd293d071ffe doesn't really prevent the deadlock from
> occurring - if we look at the stacktrace reported by Junxiao Bi, we see
> that it hangs in bit_wait_io and not on the mutex - i.e. it has already
> successfully taken the mutex. Changing the mutex from mutex_lock to
> mutex_trylock won't help with deadlocks that happen afterwards.
> 
> PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
>    #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
>    #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
>    #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
>    #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
>    #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
>    #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
>    #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
>    #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
>    #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
>    #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
>   #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
>   #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
>   #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
>   #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
>   #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

The above stack trace doesn't tell the entire story though.  Yes, one
process will have already gotten the lock and is left waiting on
IO.  But that IO isn't able to complete because it is blocked on mm's
reclaim also trying to evict via the same shrinker... so another thread will
be blocked waiting on the mutex (e.g. PID 14127 in your recent patch
header).

Please re-read the header for commit bd293d071ffe -- I think that fix is
good.  But, I could still be wrong... ;)

> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
> Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device")
> 
> ---
>  drivers/md/dm-bufio.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6/drivers/md/dm-bufio.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm-bufio.c	2019-08-02 08:52:35.000000000 +0200
> +++ linux-2.6/drivers/md/dm-bufio.c	2019-08-08 11:19:13.000000000 +0200
> @@ -1604,7 +1604,9 @@ dm_bufio_shrink_scan(struct shrinker *sh
>  	unsigned long freed;
>  
>  	c = container_of(shrink, struct dm_bufio_client, shrinker);
> -	if (!dm_bufio_trylock(c))
> +	if (sc->gfp_mask & __GFP_FS)
> +		dm_bufio_lock(c);
> +	else if (!dm_bufio_trylock(c))
>  		return SHRINK_STOP;
>  
>  	freed  = __scan(c, sc->nr_to_scan, sc->gfp_mask);

I don't see the performance win of micro-optimizing away
dm_bufio_trylock() as worth it.  BUT on the flip side, dm_bufio_lock()
_was_ the canary in the coal mine here: meaning it caught that loop
isn't properly using memalloc_noio_{save,restore}...

I'm just not liking the prospect of bufio being the smoking gun for
future deadlocks.  But I could go either way with this...

But if we do revert, this patch header needs to include:

Depends-on: XXXXXXXXX ("loop: use memalloc_noio_{save,restore}")

Mike


* Re: Revert "dm bufio: fix deadlock with loop device"
  2019-08-08 14:40 ` Mike Snitzer
@ 2019-08-08 15:01   ` Mikulas Patocka
  2019-08-08 16:28     ` Junxiao Bi
  0 siblings, 1 reply; 5+ messages in thread
From: Mikulas Patocka @ 2019-08-08 15:01 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: honglei.wang, dm-devel, Alasdair Kergon, junxiao.bi



On Thu, 8 Aug 2019, Mike Snitzer wrote:

> On Thu, Aug 08 2019 at  5:40am -0400,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > Revert the patch bd293d071ffe65e645b4d8104f9d8fe15ea13862. A proper fix
> > would be to avoid using GFP_KERNEL in the function do_blockdev_direct_IO.
> 
> Matthew Wilcox pointed out that the "proper fix" is for loop to use
> memalloc_noio_{save,restore}.

He's right. I will post another patch that uses memalloc_noio_save in
the loop driver.

> > Note that the patch bd293d071ffe doesn't really prevent the deadlock from
> > occurring - if we look at the stacktrace reported by Junxiao Bi, we see
> > that it hangs in bit_wait_io and not on the mutex - i.e. it has already
> > successfully taken the mutex. Changing the mutex from mutex_lock to
> > mutex_trylock won't help with deadlocks that happen afterwards.
> > 
> > PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
> >    #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
> >    #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
> >    #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
> >    #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
> >    #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
> >    #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
> >    #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
> >    #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
> >    #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
> >    #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
> >   #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
> >   #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
> >   #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
> >   #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
> >   #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242
> 
> The above stack trace doesn't tell the entire story though.  Yes, one
> process will have already gotten the lock and is left waiting on
> IO.  But that IO isn't able to complete because it is blocked on mm's
> reclaim also trying to evict via the same shrinker... so another thread will
> be blocked waiting on the mutex (e.g. PID 14127 in your recent patch
> header).
> 
> Please re-read the header for commit bd293d071ffe -- I think that fix is
> good.  But, I could still be wrong... ;)

The problem with the "dm_bufio_trylock" patch is - suppose that we are 
called with GFP_KERNEL context - we succeed with dm_bufio_trylock and then 
go to __make_buffer_clean->out_of_line_wait_on_bit->__wait_on_bit - if 
this wait depends on some I/O completion on the loop device, we still get 
a deadlock.

The patch fixes some cases of the deadlock, but it doesn't fix it entirely.

Mikulas

> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> > Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device")
> > 
> > ---
> >  drivers/md/dm-bufio.c |    4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > Index: linux-2.6/drivers/md/dm-bufio.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/md/dm-bufio.c	2019-08-02 08:52:35.000000000 +0200
> > +++ linux-2.6/drivers/md/dm-bufio.c	2019-08-08 11:19:13.000000000 +0200
> > @@ -1604,7 +1604,9 @@ dm_bufio_shrink_scan(struct shrinker *sh
> >  	unsigned long freed;
> >  
> >  	c = container_of(shrink, struct dm_bufio_client, shrinker);
> > -	if (!dm_bufio_trylock(c))
> > +	if (sc->gfp_mask & __GFP_FS)
> > +		dm_bufio_lock(c);
> > +	else if (!dm_bufio_trylock(c))
> >  		return SHRINK_STOP;
> >  
> >  	freed  = __scan(c, sc->nr_to_scan, sc->gfp_mask);
> 
> I don't see the performance win of micro-optimizing away
> dm_bufio_trylock() as worth it.  BUT on the flip side, dm_bufio_lock()
> _was_ the canary in the coal mine here: meaning it caught that loop
> isn't properly using memalloc_noio_{save,restore}...
> 
> I'm just not liking the prospect of bufio being the smoking gun for
> future deadlocks.  But I could go either way with this...
> 
> But if we do revert, this patch header needs to include:
> 
> Depends-on: XXXXXXXXX ("loop: use memalloc_noio_{save,restore}")
> 
> Mike
> 


* Re: Revert "dm bufio: fix deadlock with loop device"
  2019-08-08 15:01   ` Mikulas Patocka
@ 2019-08-08 16:28     ` Junxiao Bi
  2019-08-08 17:07       ` Mikulas Patocka
  0 siblings, 1 reply; 5+ messages in thread
From: Junxiao Bi @ 2019-08-08 16:28 UTC (permalink / raw)
  To: Mikulas Patocka, Mike Snitzer; +Cc: honglei.wang, dm-devel, Alasdair Kergon



On 8/8/19 8:01 AM, Mikulas Patocka wrote:

>>> Note that the patch bd293d071ffe doesn't really prevent the deadlock from
>>> occurring - if we look at the stacktrace reported by Junxiao Bi, we see
>>> that it hangs in bit_wait_io and not on the mutex - i.e. it has already
>>> successfully taken the mutex. Changing the mutex from mutex_lock to
>>> mutex_trylock won't help with deadlocks that happen afterwards.
>>>
>>> PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
>>>     #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
>>>     #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
>>>     #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
>>>     #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
>>>     #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
>>>     #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
>>>     #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
>>>     #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
>>>     #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
>>>     #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
>>>    #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
>>>    #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
>>>    #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
>>>    #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
>>>    #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242
>> The above stack trace doesn't tell the entire story though.  Yes, one
>> process will have already gotten the lock and is left waiting on
>> IO.  But that IO isn't able to complete because it is blocked on mm's
>> reclaim also trying to evict via the same shrinker... so another thread will
>> be blocked waiting on the mutex (e.g. PID 14127 in your recent patch
>> header).
>>
>> Please re-read the header for commit bd293d071ffe -- I think that fix is
>> good.  But, I could still be wrong... ;)
> The problem with the "dm_bufio_trylock" patch is - suppose that we are
> called with GFP_KERNEL context - we succeed with dm_bufio_trylock and then
> go to __make_buffer_clean->out_of_line_wait_on_bit->__wait_on_bit - if
> this wait depends on some I/O completion on the loop device, we still get
> a deadlock.

No, this is not right; see the source code in __try_evict_buffer(). It 
will never wait on I/O in the GFP_KERNEL case.

1546     if (!(gfp & __GFP_FS)) {
1547         if (test_bit(B_READING, &b->state) ||
1548             test_bit(B_WRITING, &b->state) ||
1549             test_bit(B_DIRTY, &b->state))
1550             return false;
1551     }

Thanks,

Junxiao.

>
> The patch fixes some cases of the deadlock, but it doesn't fix it entirely.
>
> Mikulas
>


* Re: Revert "dm bufio: fix deadlock with loop device"
  2019-08-08 16:28     ` Junxiao Bi
@ 2019-08-08 17:07       ` Mikulas Patocka
  0 siblings, 0 replies; 5+ messages in thread
From: Mikulas Patocka @ 2019-08-08 17:07 UTC (permalink / raw)
  To: Junxiao Bi; +Cc: honglei.wang, dm-devel, Alasdair Kergon, Mike Snitzer




On Thu, 8 Aug 2019, Junxiao Bi wrote:

> 
> On 8/8/19 8:01 AM, Mikulas Patocka wrote:
> 
> Note that the patch bd293d071ffe doesn't really prevent the deadlock from
> occurring - if we look at the stacktrace reported by Junxiao Bi, we see
> that it hangs in bit_wait_io and not on the mutex - i.e. it has already
> successfully taken the mutex. Changing the mutex from mutex_lock to
> mutex_trylock won't help with deadlocks that happen afterwards.
> 
> PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
>    #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
>    #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
>    #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
>    #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
>    #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
>    #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
>    #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
>    #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
>    #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
>    #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
>   #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
>   #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
>   #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
>   #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
>   #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242
> 
> The above stack trace doesn't tell the entire story though.  Yes, one
> process will have already gotten the lock and is left waiting on
> IO.  But that IO isn't able to complete because it is blocked on mm's
> reclaim also trying to evict via the same shrinker... so another thread will
> be blocked waiting on the mutex (e.g. PID 14127 in your recent patch
> header).
> 
> Please re-read the header for commit bd293d071ffe -- I think that fix is
> good.  But, I could still be wrong... ;)

I think that your patch fixes the stacktrace that you observed, but it 
doesn't fix the problem entirely.

Now that Jens Axboe has applied the loop driver patch, we don't need any 
patch to dm-bufio at all.

> The problem with the "dm_bufio_trylock" patch is - suppose that we are 
> called with GFP_KERNEL context - we succeed with dm_bufio_trylock and then 
> go to __make_buffer_clean->out_of_line_wait_on_bit->__wait_on_bit - if 
> this wait depends on some I/O completion on the loop device, we still get 
> a deadlock.
> 
> No, this is not right; see the source code in __try_evict_buffer(). It will never wait on I/O in the GFP_KERNEL case.
> 
> 1546     if (!(gfp & __GFP_FS)) {
> 1547         if (test_bit(B_READING, &b->state) ||
> 1548             test_bit(B_WRITING, &b->state) ||
> 1549             test_bit(B_DIRTY, &b->state))
> 1550             return false;
> 1551     }

GFP_KERNEL includes the bit __GFP_FS. GFP_NOIO doesn't include __GFP_FS.

So, for GFP_NOIO, we exit if either B_READING, B_WRITING or B_DIRTY is 
set.

For GFP_KERNEL, we don't exit and continue with __make_buffer_clean (that 
writes the buffer if it's dirty and waits for I/O).

Mikulas

> Thanks,
> 
> Junxiao.
> 
> 
> The patch fixes some cases of the deadlock, but it doesn't fix it entirely.
> 
> Mikulas
> 
> 
> 

