linux-bcache.vger.kernel.org archive mirror
* [PATCH] bcache: remove unnecessary flush_workqueue
@ 2022-03-27  7:20 Li Lei
  2022-04-07 16:53 ` Coly Li
  0 siblings, 1 reply; 6+ messages in thread
From: Li Lei @ 2022-03-27  7:20 UTC (permalink / raw)
  To: linux-bcache; +Cc: colyli, kent.overstreet, noctis.akm, Li Lei

All pending works will be drained by destroy_workqueue(), no need to call
flush_workqueue() explicitly.

Signed-off-by: Li Lei <lilei@szsandstone.com>
---
 drivers/md/bcache/writeback.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 9ee0005874cd..2a6d9f39a9b1 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -793,10 +793,9 @@ static int bch_writeback_thread(void *arg)
 		}
 	}
 
-	if (dc->writeback_write_wq) {
-		flush_workqueue(dc->writeback_write_wq);
+	if (dc->writeback_write_wq)
 		destroy_workqueue(dc->writeback_write_wq);
-	}
+
 	cached_dev_put(dc);
 	wait_for_kthread_stop();
 
-- 
2.25.1
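[Archive note: the commit message's claim holds because destroy_workqueue() calls drain_workqueue() internally before tearing the queue down. A userspace analogy, purely illustrative and not kernel code, with Python's ThreadPoolExecutor standing in for the workqueue:]

```python
import time
from concurrent.futures import ThreadPoolExecutor

done = []
pool = ThreadPoolExecutor(max_workers=2)
for i in range(8):
    # queue some slow "works"
    pool.submit(lambda i=i: (time.sleep(0.05), done.append(i)))

# No separate flush step: shutdown(wait=True) already drains every
# queued item before returning, analogous to destroy_workqueue()
# draining pending works via drain_workqueue().
pool.shutdown(wait=True)
print(len(done))  # 8
```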


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] bcache: remove unnecessary flush_workqueue
  2022-03-27  7:20 [PATCH] bcache: remove unnecessary flush_workqueue Li Lei
@ 2022-04-07 16:53 ` Coly Li
  2022-04-09  1:17   ` 李磊
  0 siblings, 1 reply; 6+ messages in thread
From: Coly Li @ 2022-04-07 16:53 UTC (permalink / raw)
  To: Li Lei; +Cc: kent.overstreet, linux-bcache, Li Lei

On 3/27/22 3:20 PM, Li Lei wrote:
> All pending works will be drained by destroy_workqueue(), no need to call
> flush_workqueue() explicitly.
>
> Signed-off-by: Li Lei <lilei@szsandstone.com>
> ---
>   drivers/md/bcache/writeback.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> index 9ee0005874cd..2a6d9f39a9b1 100644
> --- a/drivers/md/bcache/writeback.c
> +++ b/drivers/md/bcache/writeback.c
> @@ -793,10 +793,9 @@ static int bch_writeback_thread(void *arg)
>   		}
>   	}
>   
> -	if (dc->writeback_write_wq) {
> -		flush_workqueue(dc->writeback_write_wq);
> +	if (dc->writeback_write_wq)
>   		destroy_workqueue(dc->writeback_write_wq);
> -	}
> +
>   	cached_dev_put(dc);
>   	wait_for_kthread_stop();
>   

The above code is from commit 7e865eba00a3 ("bcache: fix potential 
deadlock in cached_def_free()"). I explicitly added the extra 
flush_workqueue() because of workqueue_sysfs_unregister(wq) in 
destroy_workqueue().

workqueue_sysfs_unregister() is not simple because it calls 
device_unregister(), and the code path is long. I was not sure whether 
an extra deadlock might be introduced during reboot, so the safe way 
was to explicitly call flush_workqueue() first to wait for all kworks 
to finish, then destroy the workqueue.

It has been ~3 years, and now I am totally OK with your above change. 
But could you please test your patch with lockdep enabled and confirm 
that no lock warning is observed? Then I'd like to add it into 
my test directory.
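[Archive note: the lockdep test asked for here needs a kernel built with lock debugging. A typical config fragment — real option names from the "Kernel hacking" menu; CONFIG_DEBUG_LOCK_ALLOC is selected automatically by CONFIG_PROVE_LOCKING — would be:]

```
CONFIG_DEBUG_KERNEL=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCK_ALLOC=y
```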


Thanks.

Coly Li


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bcache: remove unnecessary flush_workqueue
  2022-04-07 16:53 ` Coly Li
@ 2022-04-09  1:17   ` 李磊
  2022-05-13 15:17     ` Coly Li
  0 siblings, 1 reply; 6+ messages in thread
From: 李磊 @ 2022-04-09  1:17 UTC (permalink / raw)
  To: Coly Li; +Cc: kent.overstreet, linux-bcache, Li Lei

Coly Li <colyli@suse.de> wrote on Fri, Apr 8, 2022 at 00:54:
>
> On 3/27/22 3:20 PM, Li Lei wrote:
> > All pending works will be drained by destroy_workqueue(), no need to call
> > flush_workqueue() explicitly.
> >
> > Signed-off-by: Li Lei <lilei@szsandstone.com>
> > ---
> >   drivers/md/bcache/writeback.c | 5 ++---
> >   1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> > index 9ee0005874cd..2a6d9f39a9b1 100644
> > --- a/drivers/md/bcache/writeback.c
> > +++ b/drivers/md/bcache/writeback.c
> > @@ -793,10 +793,9 @@ static int bch_writeback_thread(void *arg)
> >               }
> >       }
> >
> > -     if (dc->writeback_write_wq) {
> > -             flush_workqueue(dc->writeback_write_wq);
> > +     if (dc->writeback_write_wq)
> >               destroy_workqueue(dc->writeback_write_wq);
> > -     }
> > +
> >       cached_dev_put(dc);
> >       wait_for_kthread_stop();
> >
>
> The above code is from commit 7e865eba00a3 ("bcache: fix potential
> deadlock in cached_def_free()"). I explicitly added the extra
> flush_workqueue() because of workqueue_sysfs_unregister(wq) in
> destroy_workqueue().
>
> workqueue_sysfs_unregister() is not simple because it calls
> device_unregister(), and the code path is long. I was not sure whether
> an extra deadlock might be introduced during reboot, so the safe way
> was to explicitly call flush_workqueue() first to wait for all kworks
> to finish, then destroy the workqueue.
>
> It has been ~3 years, and now I am totally OK with your above change.
> But could you please test your patch with lockdep enabled and confirm
> that no lock warning is observed? Then I'd like to add it into
> my test directory.
>

OK, I will test this scenario.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bcache: remove unnecessary flush_workqueue
  2022-04-09  1:17   ` 李磊
@ 2022-05-13 15:17     ` Coly Li
  2022-05-28 18:11       ` Lei Li
  0 siblings, 1 reply; 6+ messages in thread
From: Coly Li @ 2022-05-13 15:17 UTC (permalink / raw)
  To: 李磊; +Cc: kent.overstreet, linux-bcache, Li Lei

On 4/9/22 9:17 AM, 李磊 wrote:
> Coly Li <colyli@suse.de> wrote on Fri, Apr 8, 2022 at 00:54:
>> On 3/27/22 3:20 PM, Li Lei wrote:
>>> All pending works will be drained by destroy_workqueue(), no need to call
>>> flush_workqueue() explicitly.
>>>
>>> Signed-off-by: Li Lei <lilei@szsandstone.com>
>>> ---
>>>    drivers/md/bcache/writeback.c | 5 ++---
>>>    1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>>> index 9ee0005874cd..2a6d9f39a9b1 100644
>>> --- a/drivers/md/bcache/writeback.c
>>> +++ b/drivers/md/bcache/writeback.c
>>> @@ -793,10 +793,9 @@ static int bch_writeback_thread(void *arg)
>>>                }
>>>        }
>>>
>>> -     if (dc->writeback_write_wq) {
>>> -             flush_workqueue(dc->writeback_write_wq);
>>> +     if (dc->writeback_write_wq)
>>>                destroy_workqueue(dc->writeback_write_wq);
>>> -     }
>>> +
>>>        cached_dev_put(dc);
>>>        wait_for_kthread_stop();
>>>
> >> The above code is from commit 7e865eba00a3 ("bcache: fix potential
> >> deadlock in cached_def_free()"). I explicitly added the extra
> >> flush_workqueue() because of workqueue_sysfs_unregister(wq) in
> >> destroy_workqueue().
> >>
> >> workqueue_sysfs_unregister() is not simple because it calls
> >> device_unregister(), and the code path is long. I was not sure whether
> >> an extra deadlock might be introduced during reboot, so the safe way
> >> was to explicitly call flush_workqueue() first to wait for all kworks
> >> to finish, then destroy the workqueue.
> >>
> >> It has been ~3 years, and now I am totally OK with your above change.
> >> But could you please test your patch with lockdep enabled and confirm
> >> that no lock warning is observed? Then I'd like to add it into
> >> my test directory.
>>
> OK, I will test this scenario.


Any progress?


Coly Li


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bcache: remove unnecessary flush_workqueue
  2022-05-13 15:17     ` Coly Li
@ 2022-05-28 18:11       ` Lei Li
  2022-06-06  3:22         ` Coly Li
  0 siblings, 1 reply; 6+ messages in thread
From: Lei Li @ 2022-05-28 18:11 UTC (permalink / raw)
  To: Coly Li; +Cc: kent.overstreet, linux-bcache, Li Lei

Sorry for the late reply; I was blocked by my testing environment. After
several rounds of fio random-write and reboot tests, this patch works
well. Here are the test results.

1. I successfully reproduced the deadlock warning by reverting commit
7e865eba00a3 ("bcache: fix potential deadlock in cached_def_free()").
[  105.448074] ======================================================
[  105.449634] WARNING: possible circular locking dependency detected
[  105.451185] 5.4.191-MANJARO #1 Not tainted
[  105.452221] ------------------------------------------------------
[  105.453742] kworker/9:15/1272 is trying to acquire lock:
[  105.455042] ffff9e9ef1c44348
((wq_completion)bcache_writeback_wq){+.+.}, at:
flush_workqueue+0x8a/0x550
[  105.457482]
[  105.457482] but task is already holding lock:
[  105.458926] ffffbaba01203e68
((work_completion)(&cl->work)#2){+.+.}, at:
process_one_work+0x1c3/0x5a0
[  105.461568]
[  105.461568] which lock already depends on the new lock.
[  105.461568]
[  105.463810]
[  105.463810] the existing dependency chain (in reverse order) is:
[  105.465570]
[  105.465570] -> #1 ((work_completion)(&cl->work)#2){+.+.}:
[  105.467230]        process_one_work+0x21a/0x5a0
[  105.468449]        worker_thread+0x52/0x3c0
[  105.469653]        kthread+0x132/0x150
[  105.470695]        ret_from_fork+0x3a/0x50
[  105.471649]
[  105.471649] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}:
[  105.473453]        __lock_acquire+0x105b/0x1c50
[  105.474678]        lock_acquire+0xc4/0x1b0
[  105.476007]        flush_workqueue+0xad/0x550
[  105.477160]        drain_workqueue+0xb6/0x170
[  105.478237]        destroy_workqueue+0x36/0x290
[  105.479328]        cached_dev_free+0x45/0x1e0 [bcache]
[  105.480595]        process_one_work+0x243/0x5a0
[  105.481714]        worker_thread+0x52/0x3c0
[  105.482687]        kthread+0x132/0x150
[  105.483667]        ret_from_fork+0x3a/0x50
[  105.484707]
[  105.484707] other info that might help us debug this:
[  105.484707]
[  105.486856]  Possible unsafe locking scenario:
[  105.486856]
[  105.488314]        CPU0                    CPU1
[  105.489336]        ----                    ----
[  105.490438]   lock((work_completion)(&cl->work)#2);
[  105.491587]
lock((wq_completion)bcache_writeback_wq);
[  105.493554]
lock((work_completion)(&cl->work)#2);
[  105.495360]   lock((wq_completion)bcache_writeback_wq);
[  105.496878]
[  105.496878]  *** DEADLOCK ***
[  105.496878]
[  105.498303] 2 locks held by kworker/9:15/1272:
[  105.499373]  #0: ffff9e9f16c21148 ((wq_completion)events){+.+.},
at: process_one_work+0x1c3/0x5a0
[  105.501514]  #1: ffffbaba01203e68
((work_completion)(&cl->work)#2){+.+.}, at:
process_one_work+0x1c3/0x5a0
[  105.504221]
[  105.504221] stack backtrace:
[  105.505271] CPU: 9 PID: 1272 Comm: kworker/9:15 Not tainted
5.4.191-MANJARO #1
[  105.507069] Hardware name: VMware, Inc. VMware Virtual
Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[  105.509485] Workqueue: events cached_dev_free [bcache]
[  105.510805] Call Trace:
[  105.511471]  dump_stack+0x8b/0xbd
[  105.512327]  check_noncircular+0x198/0x1b0
[  105.513462]  __lock_acquire+0x105b/0x1c50
[  105.514675]  lock_acquire+0xc4/0x1b0
[  105.515557]  ? flush_workqueue+0x8a/0x550
[  105.516535]  flush_workqueue+0xad/0x550
[  105.517587]  ? flush_workqueue+0x8a/0x550
[  105.518530]  ? drain_workqueue+0xb6/0x170
[  105.519518]  drain_workqueue+0xb6/0x170
[  105.520572]  destroy_workqueue+0x36/0x290
[  105.521658]  cached_dev_free+0x45/0x1e0 [bcache]
[  105.522879]  process_one_work+0x243/0x5a0
[  105.523854]  worker_thread+0x52/0x3c0
[  105.524734]  ? process_one_work+0x5a0/0x5a0
[  105.525768]  kthread+0x132/0x150
[  105.526594]  ? __kthread_bind_mask+0x60/0x60
[  105.527706]  ret_from_fork+0x3a/0x50
[  105.571043] bcache: bcache_device_free() bcache0 stopped
[  105.730801] bcache: cache_set_free() Cache set
74f01341-0881-4c77-a49b-39fafcabb99e unregistered
[  105.730819] bcache: bcache_reboot() All devices stopped
[  105.930055] reboot: Restarting system
[  105.930996] reboot: machine restart
[  105.932182] ACPI MEMORY or I/O RESET_REG.

2. After applying 7e865eba00a3 and this patch, the warning no longer appeared.
[   58.296057] bcache: bcache_reboot() Stopping all devices:
[   58.337730] bcache: bcache_device_free() bcache0 stopped
[   58.484177] bcache: cache_set_free() Cache set
74f01341-0881-4c77-a49b-39fafcabb99e unregistered
[   58.484202] bcache: bcache_reboot() All devices stopped
[   58.731929] reboot: Restarting system
[   58.732845] reboot: machine restart
[   58.734044] ACPI MEMORY or I/O RESET_REG.
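
[Archive note: the "Possible unsafe locking scenario" table in the splat above is the classic ABBA pattern. A minimal userspace sketch — Python, hypothetical lock names, with timeouts so the demonstration fails cleanly instead of hanging the way the kernel would:]

```python
import threading

work_lock = threading.Lock()  # stands in for (work_completion)(&cl->work)
wq_lock = threading.Lock()    # stands in for (wq_completion)bcache_writeback_wq
both_hold = threading.Barrier(2)
both_done = threading.Barrier(2)
result = {}

def cpu(name, first, second):
    with first:
        both_hold.wait()  # both sides now hold their first lock
        # Try the other side's lock. In the kernel this wait lasts
        # forever; here a timeout lets the sketch terminate.
        result[name] = second.acquire(timeout=0.2)
        both_done.wait()  # keep holding until both attempts have finished

t0 = threading.Thread(target=cpu, args=("cpu0", work_lock, wq_lock))
t1 = threading.Thread(target=cpu, args=("cpu1", wq_lock, work_lock))
t0.start(); t1.start(); t0.join(); t1.join()
# Circular wait: neither side can take its second lock.
print(result["cpu0"], result["cpu1"])  # False False
```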

Coly Li <colyli@suse.de> wrote on Fri, May 13, 2022 at 23:17:
>
> On 4/9/22 9:17 AM, 李磊 wrote:
> > Coly Li <colyli@suse.de> wrote on Fri, Apr 8, 2022 at 00:54:
> >> On 3/27/22 3:20 PM, Li Lei wrote:
> >>> All pending works will be drained by destroy_workqueue(), no need to call
> >>> flush_workqueue() explicitly.
> >>>
> >>> Signed-off-by: Li Lei <lilei@szsandstone.com>
> >>> ---
> >>>    drivers/md/bcache/writeback.c | 5 ++---
> >>>    1 file changed, 2 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> >>> index 9ee0005874cd..2a6d9f39a9b1 100644
> >>> --- a/drivers/md/bcache/writeback.c
> >>> +++ b/drivers/md/bcache/writeback.c
> >>> @@ -793,10 +793,9 @@ static int bch_writeback_thread(void *arg)
> >>>                }
> >>>        }
> >>>
> >>> -     if (dc->writeback_write_wq) {
> >>> -             flush_workqueue(dc->writeback_write_wq);
> >>> +     if (dc->writeback_write_wq)
> >>>                destroy_workqueue(dc->writeback_write_wq);
> >>> -     }
> >>> +
> >>>        cached_dev_put(dc);
> >>>        wait_for_kthread_stop();
> >>>
> >> The above code is from commit 7e865eba00a3 ("bcache: fix potential
> >> deadlock in cached_def_free()"). I explicitly added the extra
> >> flush_workqueue() because of workqueue_sysfs_unregister(wq) in
> >> destroy_workqueue().
> >>
> >> workqueue_sysfs_unregister() is not simple because it calls
> >> device_unregister(), and the code path is long. I was not sure whether
> >> an extra deadlock might be introduced during reboot, so the safe way
> >> was to explicitly call flush_workqueue() first to wait for all kworks
> >> to finish, then destroy the workqueue.
> >>
> >> It has been ~3 years, and now I am totally OK with your above change.
> >> But could you please test your patch with lockdep enabled and confirm
> >> that no lock warning is observed? Then I'd like to add it into
> >> my test directory.
> >>
> > OK, I will test this scenario.
>
>
> Any progress?
>
>
> Coly Li
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] bcache: remove unnecessary flush_workqueue
  2022-05-28 18:11       ` Lei Li
@ 2022-06-06  3:22         ` Coly Li
  0 siblings, 0 replies; 6+ messages in thread
From: Coly Li @ 2022-06-06  3:22 UTC (permalink / raw)
  To: Lei Li; +Cc: kent.overstreet, linux-bcache



> On May 29, 2022, at 02:11, Lei Li <noctis.akm@gmail.com> wrote:
> 
> Sorry for the late reply; I was blocked by my testing environment. After
> several rounds of fio random-write and reboot tests, this patch works
> well. Here are the test results.
> 
> 1. I successfully reproduced the deadlock warning by reverting commit
> 7e865eba00a3 ("bcache: fix potential deadlock in cached_def_free()").
> 
> [full lockdep splat snipped]
> 
> 2. After applying 7e865eba00a3 and this patch, the warning no longer appeared.
> 
> [clean reboot log snipped]

Hi Lei,

Nice, that proves your patch removes the redundant code without regression.

I will take this patch for 5.20. Thank you for the extra effort.

Coly Li



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-06-06  3:23 UTC | newest]

Thread overview: 6+ messages
2022-03-27  7:20 [PATCH] bcache: remove unnecessary flush_workqueue Li Lei
2022-04-07 16:53 ` Coly Li
2022-04-09  1:17   ` 李磊
2022-05-13 15:17     ` Coly Li
2022-05-28 18:11       ` Lei Li
2022-06-06  3:22         ` Coly Li
