All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
@ 2018-01-04 12:41 tang.junhui
  2018-01-05  4:01 ` Coly Li
  0 siblings, 1 reply; 10+ messages in thread
From: tang.junhui @ 2018-01-04 12:41 UTC (permalink / raw)
  To: colyli; +Cc: mlyle, linux-bcache, linux-block, tang.junhui

From: Tang Junhui <tang.junhui@zte.com.cn>

Hello Coly,

>When cache set is stopping, calculating writeback rate is wast of time.
>This is the purpose of the first checking, to avoid unnecessary delay
>from bcache_flash_devs_sectors_dirty() inside __update_writeback_rate().

I thought twice, and found there is still a race.

work update_writeback_rate is runing:                                                                                
>@@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>     struct cached_dev *dc = container_of(to_delayed_work(work),
>                          struct cached_dev,
>                          writeback_rate_update);
>+    struct cache_set *c = dc->disk.c;
>+
>+    /* quit directly if cache set is stopping */
>+    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>+        return;
> 
>     down_read(&dc->writeback_lock);
> 
>@@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
> 
>     up_read(&dc->writeback_lock);
> 
>+    /* do not schedule delayed work if cache set is stopping */
>+    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>+        return;
>+

and if cancel_delayed_wsrksync() runs at this moment, the code can not prevent the
work re-arming itself to workqueue. So maybe we need a locker to protect it.    

>     schedule_delayed_work(&dc->writeback_rate_update,
>                   dc->writeback_rate_update_seconds * HZ);
> }

Thanks,
Tang Junhui

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
  2018-01-04 12:41 [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping tang.junhui
@ 2018-01-05  4:01 ` Coly Li
  0 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2018-01-05  4:01 UTC (permalink / raw)
  To: tang.junhui; +Cc: mlyle, linux-bcache, linux-block

On 04/01/2018 8:41 PM, tang.junhui@zte.com.cn wrote:
> From: Tang Junhui <tang.junhui@zte.com.cn>
> 
> Hello Coly,
> 
>> When cache set is stopping, calculating writeback rate is wast of time.
>> This is the purpose of the first checking, to avoid unnecessary delay
>>from bcache_flash_devs_sectors_dirty() inside __update_writeback_rate().
> 
> I thought twice, and found there is still a race.
> 
> work update_writeback_rate is runing:                                                                                
>> @@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>>     struct cached_dev *dc = container_of(to_delayed_work(work),
>>                          struct cached_dev,
>>                          writeback_rate_update);
>> +    struct cache_set *c = dc->disk.c;
>> +
>> +    /* quit directly if cache set is stopping */
>> +    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +        return;
>>
>>     down_read(&dc->writeback_lock);
>>
>> @@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
>>
>>     up_read(&dc->writeback_lock);
>>
>> +    /* do not schedule delayed work if cache set is stopping */
>> +    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +        return;
>> +
> 
> and if cancel_delayed_wsrksync() runs at this moment, the code can not prevent the
> work re-arming itself to workqueue. So maybe we need a locker to protect it.    
> 

Hi Junhui,

Nice catch! Yes, there should be something here to avoid such race. I
will fix it in v2 patch.

Thanks for the hint.

Coly Li

>>     schedule_delayed_work(&dc->writeback_rate_update,
>>                   dc->writeback_rate_update_seconds * HZ);
>> }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
  2018-01-08  7:22   ` Hannes Reinecke
@ 2018-01-08 16:01     ` Coly Li
  0 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2018-01-08 16:01 UTC (permalink / raw)
  To: Hannes Reinecke, linux-bcache; +Cc: linux-block, mlyle, tang.junhui

On 08/01/2018 3:22 PM, Hannes Reinecke wrote:
> On 01/03/2018 03:03 PM, Coly Li wrote:
>> struct delayed_work writeback_rate_update in struct cache_dev is a delayed
>> worker to call function update_writeback_rate() in period (the interval is
>> defined by dc->writeback_rate_update_seconds).
>>
>> When a metadate I/O error happens on cache device, bcache error handling
>> routine bch_cache_set_error() will call bch_cache_set_unregister() to
>> retire whole cache set. On the unregister code path, cached_dev_free()
>> calls cancel_delayed_work_sync(&dc->writeback_rate_update) to stop this
>> delayed work.
>>
>> dc->writeback_rate_update is a special delayed work from others in bcache.
>> In its routine update_writeback_rate(), this delayed work is re-armed
>> after a piece of time. That means when cancel_delayed_work_sync() returns,
>> this delayed work can still be executed after several seconds defined by
>> dc->writeback_rate_update_seconds.
>>
>> The problem is, after cancel_delayed_work_sync() returns, the cache set
>> unregister code path will eventually release memory of struct cache set.
>> Then the delayed work is scheduled to run, and inside its routine
>> update_writeback_rate() that already released cache set NULL pointer will
>> be accessed. Now a NULL pointer deference panic is triggered.
>>
>> In order to avoid the above problem, this patch checks cache set flags in
>> delayed work routine update_writeback_rate(). If flag CACHE_SET_STOPPING
>> is set, this routine will quit without re-arm the delayed work. Then the
>> NULL pointer deference panic won't happen after cache set is released.
>>
>> Signed-off-by: Coly Li <colyli@suse.de>
>> ---
>>  drivers/md/bcache/writeback.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>> index 0789a9e18337..745d9b2a326f 100644
>> --- a/drivers/md/bcache/writeback.c
>> +++ b/drivers/md/bcache/writeback.c
>> @@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>>  	struct cached_dev *dc = container_of(to_delayed_work(work),
>>  					     struct cached_dev,
>>  					     writeback_rate_update);
>> +	struct cache_set *c = dc->disk.c;
>> +
>> +	/* quit directly if cache set is stopping */
>> +	if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +		return;
>>  
>>  	down_read(&dc->writeback_lock);
>>  
>> @@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
>>  
>>  	up_read(&dc->writeback_lock);
>>  
>> +	/* do not schedule delayed work if cache set is stopping */
>> +	if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +		return;
>> +
>>  	schedule_delayed_work(&dc->writeback_rate_update,
>>  			      dc->writeback_rate_update_seconds * HZ);
>>  }
>>
> This is actually not quite correct; the function might still be called
> after 'struct cached_dev' has been removed.
> The correct way of fixing is to either take a reference to struct
> cached_dev and release it once 'update_writeback_rate' is finished, or
> to call 'cancel_delayed_work_sync()' before deleting struct cached_dev.

Hi Hannes,

The problem is not cached_dev, it is cache_set. In
__update_writeback_rate(), struct cache_set is referenced. The solutions
is similar as you suggested, call cancel_delayed_work_sync() before
deleting struct cache_set. Junhui posted another patch to fix duplicated
writeback threads issue, but also fixes this problem too. Therefore just
prevent this kworker from re-arm itself again should be enough, and my
next patche to stop dc->writeback_thread and dc->writeback_rate_update
can be ignored, Junhui's patch is in bcache-for-next already.

Thanks.

Coly Li

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
  2018-01-03 14:03 ` [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping Coly Li
@ 2018-01-08  7:22   ` Hannes Reinecke
  2018-01-08 16:01     ` Coly Li
  0 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2018-01-08  7:22 UTC (permalink / raw)
  To: Coly Li, linux-bcache; +Cc: linux-block, mlyle, tang.junhui

On 01/03/2018 03:03 PM, Coly Li wrote:
> struct delayed_work writeback_rate_update in struct cache_dev is a delayed
> worker to call function update_writeback_rate() in period (the interval is
> defined by dc->writeback_rate_update_seconds).
> 
> When a metadate I/O error happens on cache device, bcache error handling
> routine bch_cache_set_error() will call bch_cache_set_unregister() to
> retire whole cache set. On the unregister code path, cached_dev_free()
> calls cancel_delayed_work_sync(&dc->writeback_rate_update) to stop this
> delayed work.
> 
> dc->writeback_rate_update is a special delayed work from others in bcache.
> In its routine update_writeback_rate(), this delayed work is re-armed
> after a piece of time. That means when cancel_delayed_work_sync() returns,
> this delayed work can still be executed after several seconds defined by
> dc->writeback_rate_update_seconds.
> 
> The problem is, after cancel_delayed_work_sync() returns, the cache set
> unregister code path will eventually release memory of struct cache set.
> Then the delayed work is scheduled to run, and inside its routine
> update_writeback_rate() that already released cache set NULL pointer will
> be accessed. Now a NULL pointer deference panic is triggered.
> 
> In order to avoid the above problem, this patch checks cache set flags in
> delayed work routine update_writeback_rate(). If flag CACHE_SET_STOPPING
> is set, this routine will quit without re-arm the delayed work. Then the
> NULL pointer deference panic won't happen after cache set is released.
> 
> Signed-off-by: Coly Li <colyli@suse.de>
> ---
>  drivers/md/bcache/writeback.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> index 0789a9e18337..745d9b2a326f 100644
> --- a/drivers/md/bcache/writeback.c
> +++ b/drivers/md/bcache/writeback.c
> @@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>  	struct cached_dev *dc = container_of(to_delayed_work(work),
>  					     struct cached_dev,
>  					     writeback_rate_update);
> +	struct cache_set *c = dc->disk.c;
> +
> +	/* quit directly if cache set is stopping */
> +	if (test_bit(CACHE_SET_STOPPING, &c->flags))
> +		return;
>  
>  	down_read(&dc->writeback_lock);
>  
> @@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
>  
>  	up_read(&dc->writeback_lock);
>  
> +	/* do not schedule delayed work if cache set is stopping */
> +	if (test_bit(CACHE_SET_STOPPING, &c->flags))
> +		return;
> +
>  	schedule_delayed_work(&dc->writeback_rate_update,
>  			      dc->writeback_rate_update_seconds * HZ);
>  }
> 
This is actually not quite correct; the function might still be called
after 'struct cached_dev' has been removed.
The correct way of fixing is to either take a reference to struct
cached_dev and release it once 'update_writeback_rate' is finished, or
to call 'cancel_delayed_work_sync()' before deleting struct cached_dev.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
  2018-01-03 18:54 tang.junhui
@ 2018-01-04  9:05 ` Coly Li
  0 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2018-01-04  9:05 UTC (permalink / raw)
  To: tang.junhui; +Cc: mlyle, linux-bcache, linux-block

On 04/01/2018 2:54 AM, tang.junhui@zte.com.cn wrote:
> From: Tang Junhui <tang.junhui@zte.com.cn>
> 
> Hi Coly,
> 
>> It is about an implicit and interesting ordering, a simple patch with a
>> lot detail behind. Let me explain why it's safe,
>>
>> - cancel_delayed_work_sync() is called in bcache_device_detach() when
>> dc->count is 0. But cancel_delayed_work_sync() is called before
>> closure_put(&d->c->caching) in bcache_device_detach().
>>
>> - After closure_put(&d->c->caching) called in bcache_device_detach(),
>> refcount of c->caching becomes 0, and cache_set_flush() set in
>> __cache_set_unregister() continue to execute.
>>
>> - See continue_at() and closure_put_after_sub(), after c->caching
>> refoucnt reaches 0, and c->caching->fn gets called, closure_put(parent)
>> will be called. Here the parent of c->caching is c->cl, so refcount of
>> c->cl will reach 0, and closure_put_after_sub() will be called again for
>> c->cl and call its cl->fn which is cache_set_free(). It is set in
>> bch_cache_set_alloc().
>>
>> - So before closure_put(&d->c->caching) is called in
>> bcache_device_detach(), cache_set_flush() in __cache_set_unregister()
>> won't be executed and memory of cache_set won't be freed.
>>
>> Now back to delayed worker routine update_writeback_rate(), I check
>> c->flags inside it, in this moment cancel_delayed_work_sync() is still
>> waiting for update_writeback_rate() to exit, so refcount of c->caching
>> is still hold and cache set memory will only be freed after
>> cancel_delayed_work_sync() returns and refcount of c->caching dropped.
>>
>> And in cached_dev_free(), bcache_device_free() and
>> kobject_put(&dc->disk.kobj) are called after cancel_delayed_work_sync()
>> returns. Therefore when update_writeback_rate() is executing and
>> accessing c->flgas inside, memory of cache_set, bcache_device and
>> cached_dev are all there.
> 
> OK, I got your mean. It is safe run work update_writeback_rate when 
> cancel_delayed_work_sync() is called, but it is not safe to re-arm itself
> to workqueue again.
> 
> So, the first judgement  "if (test_bit(CACHE_SET_STOPPING, &c->flags))"
> is not very nessary, we only need the second one to prevent the work 
> from re-arming itself to workqueue.

Hi Junhui,

When cache set is stopping, calculating writeback rate is wast of time.
This is the purpose of the first checking, to avoid unnecessary delay
from bcache_flash_devs_sectors_dirty() inside __update_writeback_rate().

Thanks.

Coly Li

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
  2018-01-03 16:47 tang.junhui
@ 2018-01-04  8:06 ` Coly Li
  0 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2018-01-04  8:06 UTC (permalink / raw)
  To: tang.junhui; +Cc: mlyle, linux-bcache, linux-block

On 04/01/2018 12:47 AM, tang.junhui@zte.com.cn wrote:
> From: Tang Junhui <tang.junhui@zte.com.cn>
> 
> Hello Coly,
> 
>> dc->writeback_rate_update is a special delayed worker, it re-arms itself
>> to run after several seconds by,
>>>>     schedule_delayed_work(&dc->writeback_rate_update,
>>>>                   dc->writeback_rate_update_seconds * HZ);
>>
>> I check the workqueue code, it seems cancel_delayed_work_sync() does not
>> prevent a delayed work to re-arm itself inside worker routine. And in my
>> test, around 5 seconds after cancel_delayed_work_sync() called, a NULL
>> pointer difference oops happens. 
> I think I got how this issue would be occured:
> 1)work writeback_rate_update is running;
> 2)cancel_delayed_work_sync() is called, and waiting for work 
>   writeback_rate_update to run over, and conntinue to release cche set 
>   and cached device resources.
> 3)In the end of step 1, work writeback_rate_update re-armed again, and
>   5s later, the work runs again.
> 
>> Cache set memory is freed within 2 seconds
>> after __cache_set_unregister() called, so inside
>> __update_writeback_rate() struct cache_set is referenced and causes the
>> NULL pointer deference bug.
>>
> So, if it occured like I said above, it is not safty to judged by:
>>    struct cache_set *c = dc->disk.c;
>>    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>>        return;
> because cache_set, even bcache device and cache device are all released.                                             
> Is that right?

Hi Junhui,

It is about an implicit and interesting ordering, a simple patch with a
lot detail behind. Let me explain why it's safe,

- cancel_delayed_work_sync() is called in bcache_device_detach() when
dc->count is 0. But cancel_delayed_work_sync() is called before
closure_put(&d->c->caching) in bcache_device_detach().

- After closure_put(&d->c->caching) called in bcache_device_detach(),
refcount of c->caching becomes 0, and cache_set_flush() set in
__cache_set_unregister() continue to execute.

- See continue_at() and closure_put_after_sub(), after c->caching
refoucnt reaches 0, and c->caching->fn gets called, closure_put(parent)
will be called. Here the parent of c->caching is c->cl, so refcount of
c->cl will reach 0, and closure_put_after_sub() will be called again for
c->cl and call its cl->fn which is cache_set_free(). It is set in
bch_cache_set_alloc().

- So before closure_put(&d->c->caching) is called in
bcache_device_detach(), cache_set_flush() in __cache_set_unregister()
won't be executed and memory of cache_set won't be freed.

Now back to delayed worker routine update_writeback_rate(), I check
c->flags inside it, in this moment cancel_delayed_work_sync() is still
waiting for update_writeback_rate() to exit, so refcount of c->caching
is still hold and cache set memory will only be freed after
cancel_delayed_work_sync() returns and refcount of c->caching dropped.

And in cached_dev_free(), bcache_device_free() and
kobject_put(&dc->disk.kobj) are called after cancel_delayed_work_sync()
returns. Therefore when update_writeback_rate() is executing and
accessing c->flgas inside, memory of cache_set, bcache_device and
cached_dev are all there.

This is why it is safe ;-)

Coly Li

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
  2018-01-03 13:15 tang.junhui
@ 2018-01-04  3:32 ` Coly Li
  0 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2018-01-04  3:32 UTC (permalink / raw)
  To: tang.junhui; +Cc: mlyle, linux-bcache, linux-block

On 03/01/2018 9:15 PM, tang.junhui@zte.com.cn wrote:
> From: Tang Junhui <tang.junhui@zte.com.cn>
> 
> Hello Coly,
> 
> Thanks for this serials.
> 
>> struct delayed_work writeback_rate_update in struct cache_dev is a delayed
>> worker to call function update_writeback_rate() in period (the interval is
>> defined by dc->writeback_rate_update_seconds).
>>
>> When a metadate I/O error happens on cache device, bcache error handling
>> routine bch_cache_set_error() will call bch_cache_set_unregister() to
>> retire whole cache set. On the unregister code path, cached_dev_free()
>> calls cancel_delayed_work_sync(&dc->writeback_rate_update) to stop this
>> delayed work.
>>
>> dc->writeback_rate_update is a special delayed work from others in bcache.
>> In its routine update_writeback_rate(), this delayed work is re-armed
>> after a piece of time. That means when cancel_delayed_work_sync() returns,
>> this delayed work can still be executed after several seconds defined by
>> dc->writeback_rate_update_seconds.
>>
>> The problem is, after cancel_delayed_work_sync() returns, the cache set
>> unregister code path will eventually release memory of struct cache set.
>> Then the delayed work is scheduled to run, and inside its routine
>> update_writeback_rate() that already released cache set NULL pointer will
>> be accessed. Now a NULL pointer deference panic is triggered.
>>
> As I known that, after calling cancel_delayed_work_sync(), if the work queue
> is running, cancel_delayed_work_sync() would return only AFTER the work queue
> runs over, else if the work queue does not run yet, cancel_delayed_work_sync()
> will cancel the work queue imediately, so I think it is safe in 
> update_writeback_rate() to access the cache set resource. Point me out if I
> am wrong.
> 

Hi Junhui,

dc->writeback_rate_update is a special delayed worker, it re-arms itself
to run after several seconds by,
>>     schedule_delayed_work(&dc->writeback_rate_update,
>>                   dc->writeback_rate_update_seconds * HZ);

I check the workqueue code, it seems cancel_delayed_work_sync() does not
prevent a delayed work to re-arm itself inside worker routine. And in my
test, around 5 seconds after cancel_delayed_work_sync() called, a NULL
pointer
difference oops happens. Cache set memory is freed within 2 seconds
after __cache_set_unregister() called, so inside
__update_writeback_rate() struct cache_set is referenced and causes the
NULL pointer deference bug.

There are several delayed work in bcache code, but only
dc->writeback_rate_update is special to re-arm itself.

Coly Li

>> In order to avoid the above problem, this patch checks cache set flags in
>> delayed work routine update_writeback_rate(). If flag CACHE_SET_STOPPING
>> is set, this routine will quit without re-arm the delayed work. Then the
>> NULL pointer deference panic won't happen after cache set is released.
>>
>> Signed-off-by: Coly Li <colyli@suse.de>
>> ---
>> drivers/md/bcache/writeback.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>> index 0789a9e18337..745d9b2a326f 100644
>> --- a/drivers/md/bcache/writeback.c
>> +++ b/drivers/md/bcache/writeback.c
>> @@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>>     struct cached_dev *dc = container_of(to_delayed_work(work),
>>                          struct cached_dev,
>>                          writeback_rate_update);
>> +    struct cache_set *c = dc->disk.c;
>> +
>> +    /* quit directly if cache set is stopping */
>> +    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +        return;
>>
>>     down_read(&dc->writeback_lock);
>>
>> @@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
>>
>>     up_read(&dc->writeback_lock);
>>
>> +    /* do not schedule delayed work if cache set is stopping */
>> +    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +        return;
>> +
>>     schedule_delayed_work(&dc->writeback_rate_update,
>>                   dc->writeback_rate_update_seconds * HZ);
>> }
>> -- 
>> 2.15.1                                                             
> 
> Thanks,
> Tang Junhui
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
@ 2018-01-03 16:47 tang.junhui
  2018-01-04  8:06 ` Coly Li
  0 siblings, 1 reply; 10+ messages in thread
From: tang.junhui @ 2018-01-03 16:47 UTC (permalink / raw)
  To: colyli; +Cc: mlyle, linux-bcache, linux-block, tang.junhui

From: Tang Junhui <tang.junhui@zte.com.cn>

Hello Coly,

>dc->writeback_rate_update is a special delayed worker, it re-arms itself
>to run after several seconds by,
>>>     schedule_delayed_work(&dc->writeback_rate_update,
>>>                   dc->writeback_rate_update_seconds * HZ);
>
>I check the workqueue code, it seems cancel_delayed_work_sync() does not
>prevent a delayed work to re-arm itself inside worker routine. And in my
>test, around 5 seconds after cancel_delayed_work_sync() called, a NULL
>pointer difference oops happens. 
I think I got how this issue would be occured:
1)work writeback_rate_update is running;
2)cancel_delayed_work_sync() is called, and waiting for work 
  writeback_rate_update to run over, and conntinue to release cche set 
  and cached device resources.
3)In the end of step 1, work writeback_rate_update re-armed again, and
  5s later, the work runs again.

>Cache set memory is freed within 2 seconds
>after __cache_set_unregister() called, so inside
>__update_writeback_rate() struct cache_set is referenced and causes the
>NULL pointer deference bug.
>
So, if it occured like I said above, it is not safty to judged by:
>    struct cache_set *c = dc->disk.c;
>    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>        return;
because cache_set, even bcache device and cache device are all released.                                             
Is that right?

Thanks,
Tang Junhui

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
  2018-01-03 14:03 [PATCH v1 00/10] cache device failure handling improvement Coly Li
@ 2018-01-03 14:03 ` Coly Li
  2018-01-08  7:22   ` Hannes Reinecke
  0 siblings, 1 reply; 10+ messages in thread
From: Coly Li @ 2018-01-03 14:03 UTC (permalink / raw)
  To: linux-bcache; +Cc: linux-block, mlyle, tang.junhui, Coly Li

struct delayed_work writeback_rate_update in struct cache_dev is a delayed
worker to call function update_writeback_rate() in period (the interval is
defined by dc->writeback_rate_update_seconds).

When a metadate I/O error happens on cache device, bcache error handling
routine bch_cache_set_error() will call bch_cache_set_unregister() to
retire whole cache set. On the unregister code path, cached_dev_free()
calls cancel_delayed_work_sync(&dc->writeback_rate_update) to stop this
delayed work.

dc->writeback_rate_update is a special delayed work from others in bcache.
In its routine update_writeback_rate(), this delayed work is re-armed
after a piece of time. That means when cancel_delayed_work_sync() returns,
this delayed work can still be executed after several seconds defined by
dc->writeback_rate_update_seconds.

The problem is, after cancel_delayed_work_sync() returns, the cache set
unregister code path will eventually release memory of struct cache set.
Then the delayed work is scheduled to run, and inside its routine
update_writeback_rate() that already released cache set NULL pointer will
be accessed. Now a NULL pointer deference panic is triggered.

In order to avoid the above problem, this patch checks cache set flags in
delayed work routine update_writeback_rate(). If flag CACHE_SET_STOPPING
is set, this routine will quit without re-arm the delayed work. Then the
NULL pointer deference panic won't happen after cache set is released.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/writeback.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 0789a9e18337..745d9b2a326f 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
 	struct cached_dev *dc = container_of(to_delayed_work(work),
 					     struct cached_dev,
 					     writeback_rate_update);
+	struct cache_set *c = dc->disk.c;
+
+	/* quit directly if cache set is stopping */
+	if (test_bit(CACHE_SET_STOPPING, &c->flags))
+		return;
 
 	down_read(&dc->writeback_lock);
 
@@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
 
 	up_read(&dc->writeback_lock);
 
+	/* do not schedule delayed work if cache set is stopping */
+	if (test_bit(CACHE_SET_STOPPING, &c->flags))
+		return;
+
 	schedule_delayed_work(&dc->writeback_rate_update,
 			      dc->writeback_rate_update_seconds * HZ);
 }
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping
@ 2018-01-03 13:15 tang.junhui
  2018-01-04  3:32 ` Coly Li
  0 siblings, 1 reply; 10+ messages in thread
From: tang.junhui @ 2018-01-03 13:15 UTC (permalink / raw)
  To: colyli; +Cc: mlyle, linux-bcache, linux-block, tang.junhui

From: Tang Junhui <tang.junhui@zte.com.cn>

Hello Coly,

Thanks for this serials.

>struct delayed_work writeback_rate_update in struct cache_dev is a delayed
>worker to call function update_writeback_rate() in period (the interval is
>defined by dc->writeback_rate_update_seconds).
>
>When a metadate I/O error happens on cache device, bcache error handling
>routine bch_cache_set_error() will call bch_cache_set_unregister() to
>retire whole cache set. On the unregister code path, cached_dev_free()
>calls cancel_delayed_work_sync(&dc->writeback_rate_update) to stop this
>delayed work.
>
>dc->writeback_rate_update is a special delayed work from others in bcache.
>In its routine update_writeback_rate(), this delayed work is re-armed
>after a piece of time. That means when cancel_delayed_work_sync() returns,
>this delayed work can still be executed after several seconds defined by
>dc->writeback_rate_update_seconds.
>
>The problem is, after cancel_delayed_work_sync() returns, the cache set
>unregister code path will eventually release memory of struct cache set.
>Then the delayed work is scheduled to run, and inside its routine
>update_writeback_rate() that already released cache set NULL pointer will
>be accessed. Now a NULL pointer deference panic is triggered.
>
As I known that, after calling cancel_delayed_work_sync(), if the work queue
is running, cancel_delayed_work_sync() would return only AFTER the work queue
runs over, else if the work queue does not run yet, cancel_delayed_work_sync()
will cancel the work queue imediately, so I think it is safe in 
update_writeback_rate() to access the cache set resource. Point me out if I
am wrong.

>In order to avoid the above problem, this patch checks cache set flags in
>delayed work routine update_writeback_rate(). If flag CACHE_SET_STOPPING
>is set, this routine will quit without re-arm the delayed work. Then the
>NULL pointer deference panic won't happen after cache set is released.
>
>Signed-off-by: Coly Li <colyli@suse.de>
>---
> drivers/md/bcache/writeback.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
>diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>index 0789a9e18337..745d9b2a326f 100644
>--- a/drivers/md/bcache/writeback.c
>+++ b/drivers/md/bcache/writeback.c
>@@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>     struct cached_dev *dc = container_of(to_delayed_work(work),
>                          struct cached_dev,
>                          writeback_rate_update);
>+    struct cache_set *c = dc->disk.c;
>+
>+    /* quit directly if cache set is stopping */
>+    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>+        return;
> 
>     down_read(&dc->writeback_lock);
> 
>@@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
> 
>     up_read(&dc->writeback_lock);
> 
>+    /* do not schedule delayed work if cache set is stopping */
>+    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>+        return;
>+
>     schedule_delayed_work(&dc->writeback_rate_update,
>                   dc->writeback_rate_update_seconds * HZ);
> }
>-- 
>2.15.1                                                             

Thanks,
Tang Junhui

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-01-08 16:01 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-04 12:41 [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping tang.junhui
2018-01-05  4:01 ` Coly Li
  -- strict thread matches above, loose matches on Subject: below --
2018-01-03 18:54 tang.junhui
2018-01-04  9:05 ` Coly Li
2018-01-03 16:47 tang.junhui
2018-01-04  8:06 ` Coly Li
2018-01-03 14:03 [PATCH v1 00/10] cache device failure handling improvement Coly Li
2018-01-03 14:03 ` [PATCH v1 05/10] bcache: stop dc->writeback_rate_update if cache set is stopping Coly Li
2018-01-08  7:22   ` Hannes Reinecke
2018-01-08 16:01     ` Coly Li
2018-01-03 13:15 tang.junhui
2018-01-04  3:32 ` Coly Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.