linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
@ 2016-11-26  2:06 Wei Fang
  2016-11-28 10:07 ` Jan Kara
  2016-11-29 23:08 ` Andrew Morton
  0 siblings, 2 replies; 12+ messages in thread
From: Wei Fang @ 2016-11-26  2:06 UTC (permalink / raw)
  To: akpm; +Cc: jack, hannes, hch, linux-mm, Wei Fang, stable

->bd_disk is assigned to NULL in __blkdev_put() when no one is holding
the bdev. After that, ->bd_inode still can be touched in the
blockdev_superblock->s_inodes list before the final iput. So iterate_bdevs()
can still get this inode, and start writeback on mapping dirty pages.
->bd_disk will be dereferenced in mapping_cap_writeback_dirty() in this
case, and a NULL dereference crash will be triggered:

Unable to handle kernel NULL pointer dereference at virtual address 00000388
...
[<ffff8000004cb1e4>] blk_get_backing_dev_info+0x1c/0x28
[<ffff8000001c879c>] __filemap_fdatawrite_range+0x54/0x98
[<ffff8000001c8804>] filemap_fdatawrite+0x24/0x2c
[<ffff80000027e7a4>] fdatawrite_one_bdev+0x20/0x28
[<ffff800000288b44>] iterate_bdevs+0xec/0x144
[<ffff80000027eb50>] sys_sync+0x84/0xd0

Since mapping_cap_writeback_dirty() is always return true about
block device inodes, no need to check it if the inode is a block
device inode.

Cc: stable@vger.kernel.org
Signed-off-by: Wei Fang <fangwei1@huawei.com>
---
 mm/filemap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 235021e..d607677 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -334,8 +334,9 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
 		.range_end = end,
 	};
 
-	if (!mapping_cap_writeback_dirty(mapping))
-		return 0;
+	if (!sb_is_blkdev_sb(mapping->host->i_sb))
+		if (!mapping_cap_writeback_dirty(mapping))
+			return 0;
 
 	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
 	ret = do_writepages(mapping, &wbc);
-- 
2.4.11

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-26  2:06 [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk Wei Fang
@ 2016-11-28 10:07 ` Jan Kara
  2016-11-28 15:57   ` Tejun Heo
  2016-11-29  1:58   ` Wei Fang
  2016-11-29 23:08 ` Andrew Morton
  1 sibling, 2 replies; 12+ messages in thread
From: Jan Kara @ 2016-11-28 10:07 UTC (permalink / raw)
  To: Wei Fang; +Cc: akpm, jack, hannes, hch, linux-mm, stable, Jens Axboe, Tejun Heo

On Sat 26-11-16 10:06:22, Wei Fang wrote:
> ->bd_disk is assigned to NULL in __blkdev_put() when no one is holding
> the bdev. After that, ->bd_inode still can be touched in the
> blockdev_superblock->s_inodes list before the final iput. So iterate_bdevs()
> can still get this inode, and start writeback on mapping dirty pages.
> ->bd_disk will be dereferenced in mapping_cap_writeback_dirty() in this
> case, and a NULL dereference crash will be triggered:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000388
> ...
> [<ffff8000004cb1e4>] blk_get_backing_dev_info+0x1c/0x28
> [<ffff8000001c879c>] __filemap_fdatawrite_range+0x54/0x98
> [<ffff8000001c8804>] filemap_fdatawrite+0x24/0x2c
> [<ffff80000027e7a4>] fdatawrite_one_bdev+0x20/0x28
> [<ffff800000288b44>] iterate_bdevs+0xec/0x144
> [<ffff80000027eb50>] sys_sync+0x84/0xd0
> 
> Since mapping_cap_writeback_dirty() is always return true about
> block device inodes, no need to check it if the inode is a block
> device inode.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Wei Fang <fangwei1@huawei.com>

Good catch but I don't like sprinkling checks like this into the writeback
code and furthermore we don't want to call into writeback code when block
device is in the process of being destroyed which is what would happen with
your patch. That is a bug waiting to happen...

As I'm looking into the code, we need a serialization between bdev writeback
and blkdev_put(). That should be doable if we use writeback_single_inode()
for writing bdev inode instead of simple filemap_fdatawrite() and then use
inode_wait_for_writeback() in blkdev_put() but it needs some careful
thought.

Frankly that whole idea of tearing block devices down on last close is a
major headache and keeps biting us. I'm wondering whether it is still worth
it these days...

								Honza

> ---
>  mm/filemap.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 235021e..d607677 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -334,8 +334,9 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>  		.range_end = end,
>  	};
>  
> -	if (!mapping_cap_writeback_dirty(mapping))
> -		return 0;
> +	if (!sb_is_blkdev_sb(mapping->host->i_sb))
> +		if (!mapping_cap_writeback_dirty(mapping))
> +			return 0;
>  
>  	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
>  	ret = do_writepages(mapping, &wbc);
> -- 
> 2.4.11
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-28 10:07 ` Jan Kara
@ 2016-11-28 15:57   ` Tejun Heo
  2016-11-29  9:30     ` Jan Kara
  2016-11-29  1:58   ` Wei Fang
  1 sibling, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2016-11-28 15:57 UTC (permalink / raw)
  To: Jan Kara; +Cc: Wei Fang, akpm, hannes, hch, linux-mm, stable, Jens Axboe

Hello, Jan.

On Mon, Nov 28, 2016 at 11:07:18AM +0100, Jan Kara wrote:
> As I'm looking into the code, we need a serialization between bdev writeback
> and blkdev_put(). That should be doable if we use writeback_single_inode()
> for writing bdev inode instead of simple filemap_fdatawrite() and then use
> inode_wait_for_writeback() in blkdev_put() but it needs some careful
> thought.

It's kinda weird that sync() is ends up accessing bdev's without any
synchronization.  Can't we just make iterate_bdevs() grab bd_mutex and
verify bd_disk isn't NULL before calling into the callback?

> Frankly that whole idea of tearing block devices down on last close is a
> major headache and keeps biting us. I'm wondering whether it is still worth
> it these days...

Yeah, it'd be great if we can follow a more conventional lifetime
pattern here.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-28 10:07 ` Jan Kara
  2016-11-28 15:57   ` Tejun Heo
@ 2016-11-29  1:58   ` Wei Fang
  2016-11-30  9:51     ` Jan Kara
  1 sibling, 1 reply; 12+ messages in thread
From: Wei Fang @ 2016-11-29  1:58 UTC (permalink / raw)
  To: Jan Kara; +Cc: akpm, hannes, hch, linux-mm, stable, Jens Axboe, Tejun Heo

Hi, Jan,

On 2016/11/28 18:07, Jan Kara wrote:
> Good catch but I don't like sprinkling checks like this into the writeback
> code and furthermore we don't want to call into writeback code when block
> device is in the process of being destroyed which is what would happen with
> your patch. That is a bug waiting to happen...

Agreed. Need another way to fix this problem. I looked through the
writeback cgroup code in __filemap_fdatawrite_range(), found if we
turn on CONFIG_CGROUP_WRITEBACK, a new crash will happen.

Thanks,
Wei

> As I'm looking into the code, we need a serialization between bdev writeback
> and blkdev_put(). That should be doable if we use writeback_single_inode()
> for writing bdev inode instead of simple filemap_fdatawrite() and then use
> inode_wait_for_writeback() in blkdev_put() but it needs some careful
> thought.
> 
> Frankly that whole idea of tearing block devices down on last close is a
> major headache and keeps biting us. I'm wondering whether it is still worth
> it these days...
> 
> 								Honza
> 
>> ---
>>  mm/filemap.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 235021e..d607677 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -334,8 +334,9 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>>  		.range_end = end,
>>  	};
>>  
>> -	if (!mapping_cap_writeback_dirty(mapping))
>> -		return 0;
>> +	if (!sb_is_blkdev_sb(mapping->host->i_sb))
>> +		if (!mapping_cap_writeback_dirty(mapping))
>> +			return 0;
>>  
>>  	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
>>  	ret = do_writepages(mapping, &wbc);
>> -- 
>> 2.4.11
>>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-28 15:57   ` Tejun Heo
@ 2016-11-29  9:30     ` Jan Kara
  2016-11-29 16:43       ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2016-11-29  9:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Wei Fang, akpm, hannes, hch, linux-mm, stable, Jens Axboe

On Mon 28-11-16 10:57:18, Tejun Heo wrote:
> Hello, Jan.
> 
> On Mon, Nov 28, 2016 at 11:07:18AM +0100, Jan Kara wrote:
> > As I'm looking into the code, we need a serialization between bdev writeback
> > and blkdev_put(). That should be doable if we use writeback_single_inode()
> > for writing bdev inode instead of simple filemap_fdatawrite() and then use
> > inode_wait_for_writeback() in blkdev_put() but it needs some careful
> > thought.
> 
> It's kinda weird that sync() is ends up accessing bdev's without any
> synchronization.  Can't we just make iterate_bdevs() grab bd_mutex and
> verify bd_disk isn't NULL before calling into the callback?

This reminded me I've already seen something like this and indeed I've
already had a very similar discussion in March -
https://patchwork.kernel.org/patch/8556941/

Holding bd_mutex in iterate_devs() works but still nothing protects from
flusher thread just walking across the block device inode and trying to
write it which would result in the very same oops.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-29  9:30     ` Jan Kara
@ 2016-11-29 16:43       ` Tejun Heo
  2016-11-30  9:50         ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2016-11-29 16:43 UTC (permalink / raw)
  To: Jan Kara; +Cc: Wei Fang, akpm, hannes, hch, linux-mm, stable, Jens Axboe

Hello, Jan.

On Tue, Nov 29, 2016 at 10:30:35AM +0100, Jan Kara wrote:
> > It's kinda weird that sync() is ends up accessing bdev's without any
> > synchronization.  Can't we just make iterate_bdevs() grab bd_mutex and
> > verify bd_disk isn't NULL before calling into the callback?
> 
> This reminded me I've already seen something like this and indeed I've
> already had a very similar discussion in March -
> https://patchwork.kernel.org/patch/8556941/

lol

> Holding bd_mutex in iterate_devs() works but still nothing protects from
> flusher thread just walking across the block device inode and trying to
> write it which would result in the very same oops.

Ah, right.  We aren't implementing either sever or refcnted draining
semantics properly.  I wonder whether we'd be able to retire the inode
synchronously during blkdev_put.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-26  2:06 [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk Wei Fang
  2016-11-28 10:07 ` Jan Kara
@ 2016-11-29 23:08 ` Andrew Morton
  2016-11-30  7:30   ` Jan Kara
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2016-11-29 23:08 UTC (permalink / raw)
  To: Wei Fang; +Cc: jack, hannes, hch, linux-mm, stable, Al Viro

On Sat, 26 Nov 2016 10:06:22 +0800 Wei Fang <fangwei1@huawei.com> wrote:

> ->bd_disk is assigned to NULL in __blkdev_put() when no one is holding
> the bdev. After that, ->bd_inode still can be touched in the
> blockdev_superblock->s_inodes list before the final iput. So iterate_bdevs()
> can still get this inode, and start writeback on mapping dirty pages.
> ->bd_disk will be dereferenced in mapping_cap_writeback_dirty() in this
> case, and a NULL dereference crash will be triggered:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000388
> ...
> [<ffff8000004cb1e4>] blk_get_backing_dev_info+0x1c/0x28
> [<ffff8000001c879c>] __filemap_fdatawrite_range+0x54/0x98
> [<ffff8000001c8804>] filemap_fdatawrite+0x24/0x2c
> [<ffff80000027e7a4>] fdatawrite_one_bdev+0x20/0x28
> [<ffff800000288b44>] iterate_bdevs+0xec/0x144
> [<ffff80000027eb50>] sys_sync+0x84/0xd0
> 
> Since mapping_cap_writeback_dirty() is always return true about
> block device inodes, no need to check it if the inode is a block
> device inode.
> 
> ...
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -334,8 +334,9 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
>  		.range_end = end,
>  	};
>  
> -	if (!mapping_cap_writeback_dirty(mapping))
> -		return 0;
> +	if (!sb_is_blkdev_sb(mapping->host->i_sb))
> +		if (!mapping_cap_writeback_dirty(mapping))
> +			return 0;
>  
>  	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
>  	ret = do_writepages(mapping, &wbc);

This seems wrong to me.  If __blkdev_put() has got so deep into the
release process as to be zeroing out ->bd_disk then the blockdev's
inode shouldn't be visible to iterate_bdevs()?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-29 23:08 ` Andrew Morton
@ 2016-11-30  7:30   ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2016-11-30  7:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Wei Fang, jack, hannes, hch, linux-mm, stable, Al Viro

On Tue 29-11-16 15:08:28, Andrew Morton wrote:
> On Sat, 26 Nov 2016 10:06:22 +0800 Wei Fang <fangwei1@huawei.com> wrote:
> 
> > ->bd_disk is assigned to NULL in __blkdev_put() when no one is holding
> > the bdev. After that, ->bd_inode still can be touched in the
> > blockdev_superblock->s_inodes list before the final iput. So iterate_bdevs()
> > can still get this inode, and start writeback on mapping dirty pages.
> > ->bd_disk will be dereferenced in mapping_cap_writeback_dirty() in this
> > case, and a NULL dereference crash will be triggered:
> > 
> > Unable to handle kernel NULL pointer dereference at virtual address 00000388
> > ...
> > [<ffff8000004cb1e4>] blk_get_backing_dev_info+0x1c/0x28
> > [<ffff8000001c879c>] __filemap_fdatawrite_range+0x54/0x98
> > [<ffff8000001c8804>] filemap_fdatawrite+0x24/0x2c
> > [<ffff80000027e7a4>] fdatawrite_one_bdev+0x20/0x28
> > [<ffff800000288b44>] iterate_bdevs+0xec/0x144
> > [<ffff80000027eb50>] sys_sync+0x84/0xd0
> > 
> > Since mapping_cap_writeback_dirty() is always return true about
> > block device inodes, no need to check it if the inode is a block
> > device inode.
> > 
> > ...
> >
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -334,8 +334,9 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
> >  		.range_end = end,
> >  	};
> >  
> > -	if (!mapping_cap_writeback_dirty(mapping))
> > -		return 0;
> > +	if (!sb_is_blkdev_sb(mapping->host->i_sb))
> > +		if (!mapping_cap_writeback_dirty(mapping))
> > +			return 0;
> >  
> >  	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
> >  	ret = do_writepages(mapping, &wbc);
> 
> This seems wrong to me.  If __blkdev_put() has got so deep into the
> release process as to be zeroing out ->bd_disk then the blockdev's
> inode shouldn't be visible to iterate_bdevs()?

That's the trouble with how block devices currently work. On last close of
the block device, the block device inode is detached from bd_disk and thus
from request_queue & bdi. bd_disk & company gets freed, inode stays (bdev
inode is referenced by inodes representing block device in the filesystem
which are referenced by dentries). This happens asynchronously wrt
iterate_bdevs() and inode_to_bdi() calls in general - any inode_to_bdi()
call on block device inode can oops if it happens to race with
__blkdev_put().  The use of inode_to_bdi() in mapping_cap_writeback_dirty()
from iterate_bdevs() is one such possibility - that is relatively easy to
fix by modifying iterate_bdevs() however it is not so easy to protect in
this way inode_to_bdi() calls in writeback happening periodically from the
flusher work.
								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-29 16:43       ` Tejun Heo
@ 2016-11-30  9:50         ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2016-11-30  9:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Wei Fang, akpm, hannes, hch, linux-mm, stable, Jens Axboe

On Tue 29-11-16 11:43:50, Tejun Heo wrote:
> Hello, Jan.
> 
> On Tue, Nov 29, 2016 at 10:30:35AM +0100, Jan Kara wrote:
> > > It's kinda weird that sync() is ends up accessing bdev's without any
> > > synchronization.  Can't we just make iterate_bdevs() grab bd_mutex and
> > > verify bd_disk isn't NULL before calling into the callback?
> > 
> > This reminded me I've already seen something like this and indeed I've
> > already had a very similar discussion in March -
> > https://patchwork.kernel.org/patch/8556941/
> 
> lol
> 
> > Holding bd_mutex in iterate_devs() works but still nothing protects from
> > flusher thread just walking across the block device inode and trying to
> > write it which would result in the very same oops.
> 
> Ah, right.  We aren't implementing either sever or refcnted draining
> semantics properly.  I wonder whether we'd be able to retire the inode
> synchronously during blkdev_put.

Yeah, I've realized flusher thread is mostly taken care of by the fact that
__blkdev_put() does bdev_write_inode() which waits for I_SYNC to get
cleared and then the inode is clean so writeback code mostly ignores it. It
is fragile but likely it works. So for now I've decided to just push the
patch mentioned above to get at least obvious breakage fixed as playing
with bdev lifetime rules definitely won't be a stable material anyway.

I was also thinking about completely tearing down bdev inode in
__blkdev_put() and it could be doable although hunting all inodes
referencing bdev inode through i_bdev will be tricky. Probably we'll need
i_devices things for block devices back...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-29  1:58   ` Wei Fang
@ 2016-11-30  9:51     ` Jan Kara
  2016-12-01  2:30       ` Wei Fang
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2016-11-30  9:51 UTC (permalink / raw)
  To: Wei Fang
  Cc: Jan Kara, akpm, hannes, hch, linux-mm, stable, Jens Axboe, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 696 bytes --]

On Tue 29-11-16 09:58:31, Wei Fang wrote:
> Hi, Jan,
> 
> On 2016/11/28 18:07, Jan Kara wrote:
> > Good catch but I don't like sprinkling checks like this into the writeback
> > code and furthermore we don't want to call into writeback code when block
> > device is in the process of being destroyed which is what would happen with
> > your patch. That is a bug waiting to happen...
> 
> Agreed. Need another way to fix this problem. I looked through the
> writeback cgroup code in __filemap_fdatawrite_range(), found if we
> turn on CONFIG_CGROUP_WRITEBACK, a new crash will happen.

OK, can you test with attached patch please? Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-block-protect-iterate_bdevs-against-concurrent-close.patch --]
[-- Type: text/x-patch, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-11-30  9:51     ` Jan Kara
@ 2016-12-01  2:30       ` Wei Fang
  2016-12-01  8:18         ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Wei Fang @ 2016-12-01  2:30 UTC (permalink / raw)
  To: Jan Kara; +Cc: akpm, hannes, hch, linux-mm, stable, Jens Axboe, Tejun Heo

Hi, Jan,

On 2016/11/30 17:51, Jan Kara wrote:
> On Tue 29-11-16 09:58:31, Wei Fang wrote:
>> Hi, Jan,
>>
>> On 2016/11/28 18:07, Jan Kara wrote:
>>> Good catch but I don't like sprinkling checks like this into the writeback
>>> code and furthermore we don't want to call into writeback code when block
>>> device is in the process of being destroyed which is what would happen with
>>> your patch. That is a bug waiting to happen...
>>
>> Agreed. Need another way to fix this problem. I looked through the
>> writeback cgroup code in __filemap_fdatawrite_range(), found if we
>> turn on CONFIG_CGROUP_WRITEBACK, a new crash will happen.
> 
> OK, can you test with attached patch please? Thanks!

I've tested this patch with linux-next about 2 hours, and all goes well.
Without this patch, kernel crashes in minutes.

Thanks,
Wei

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk
  2016-12-01  2:30       ` Wei Fang
@ 2016-12-01  8:18         ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2016-12-01  8:18 UTC (permalink / raw)
  To: Wei Fang
  Cc: Jan Kara, akpm, hannes, hch, linux-mm, stable, Jens Axboe, Tejun Heo

On Thu 01-12-16 10:30:05, Wei Fang wrote:
> On 2016/11/30 17:51, Jan Kara wrote:
> > On Tue 29-11-16 09:58:31, Wei Fang wrote:
> >> Hi, Jan,
> >>
> >> On 2016/11/28 18:07, Jan Kara wrote:
> >>> Good catch but I don't like sprinkling checks like this into the writeback
> >>> code and furthermore we don't want to call into writeback code when block
> >>> device is in the process of being destroyed which is what would happen with
> >>> your patch. That is a bug waiting to happen...
> >>
> >> Agreed. Need another way to fix this problem. I looked through the
> >> writeback cgroup code in __filemap_fdatawrite_range(), found if we
> >> turn on CONFIG_CGROUP_WRITEBACK, a new crash will happen.
> > 
> > OK, can you test with attached patch please? Thanks!
> 
> I've tested this patch with linux-next about 2 hours, and all goes well.
> Without this patch, kernel crashes in minutes.

Good. Thanks for testing! I'll send the patch for inclusion.

								hONZA
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-12-01  8:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-26  2:06 [PATCH] mm: Fix a NULL dereference crash while accessing bdev->bd_disk Wei Fang
2016-11-28 10:07 ` Jan Kara
2016-11-28 15:57   ` Tejun Heo
2016-11-29  9:30     ` Jan Kara
2016-11-29 16:43       ` Tejun Heo
2016-11-30  9:50         ` Jan Kara
2016-11-29  1:58   ` Wei Fang
2016-11-30  9:51     ` Jan Kara
2016-12-01  2:30       ` Wei Fang
2016-12-01  8:18         ` Jan Kara
2016-11-29 23:08 ` Andrew Morton
2016-11-30  7:30   ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).