All of lore.kernel.org
 help / color / mirror / Atom feed
* Bug - bcache: "unable to handle kernel NULL pointer dereference" when trying to add a md caching device
@ 2016-03-24 15:19 Sebastian Roesner
  2016-03-25  1:49 ` Eric Wheeler
  0 siblings, 1 reply; 14+ messages in thread
From: Sebastian Roesner @ 2016-03-24 15:19 UTC (permalink / raw)
  To: linux-bcache

Hell,

I experienced a bug after upgrading to a recent kernel (4.5.0).

# echo /dev/md2 > /sys/fs/bcache/register

gives me a stack trace.

I created a bug in the kernel bugtracker but I'm not sure if it has been
noticed by the bcache developers:

<https://bugzilla.kernel.org/show_bug.cgi?id=114871>

Could somebody have a look?

Cheers
Sebastian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Bug - bcache: "unable to handle kernel NULL pointer dereference" when trying to add a md caching device
  2016-03-24 15:19 Bug - bcache: "unable to handle kernel NULL pointer dereference" when trying to add a md caching device Sebastian Roesner
@ 2016-03-25  1:49 ` Eric Wheeler
  2016-03-25  4:18   ` Eric Wheeler
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Wheeler @ 2016-03-25  1:49 UTC (permalink / raw)
  To: Sebastian Roesner; +Cc: linux-bcache

On Thu, 24 Mar 2016, Sebastian Roesner wrote:
> I experienced a bug after upgrading to a recent kernel (4.5.0).
> 
> # echo /dev/md2 > /sys/fs/bcache/register
> 
> gives me a stack trace.
> 
> I created a bug in the kernel bugtracker but I'm not sure if it has been
> noticed by the bcache developers:
> 
> <https://bugzilla.kernel.org/show_bug.cgi?id=114871>

The dereference appears to happen inside of bio_trim which is core block 
code.  In this case, it looks like bio_trim is being passed a null pointer 
to *bio, but by whom?  

Please send this output:

tail /sys/block/md2/queue/discard_*

> Could somebody have a look?

It could be an allocation failure in bcache that gets pushed down the 
stack at registration.  If that is the case, please pull these two fixes 
branches which may have missed 4.5:

git pull https://bitbucket.org/ewheelerinc/linux v4.5-rc6-bcache-fixes
git pull https://bitbucket.org/ewheelerinc/linux v4.5-rc7-bcache-fixes


For list reference, this is the trace:

Mar 17 21:54:20 gropius kernel: [ 1544.712759] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
Mar 17 21:54:20 gropius kernel: [ 1544.712847] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
[...]
Mar 17 21:54:20 gropius kernel: [ 1544.717685]  [<ffffffffa008c3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
Mar 17 21:54:20 gropius kernel: [ 1544.717718]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
Mar 17 21:54:20 gropius kernel: [ 1544.717754]  [<ffffffffa006cb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
Mar 17 21:54:20 gropius kernel: [ 1544.717787]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
Mar 17 21:54:20 gropius kernel: [ 1544.717824]  [<ffffffffa045dc79>] ? prio_io+0x85/0x95 [bcache]
Mar 17 21:54:20 gropius kernel: [ 1544.717858]  [<ffffffffa0460205>] ? register_cache_set+0x355/0x8d0 [bcache]
Mar 17 21:54:20 gropius kernel: [ 1544.717893]  [<ffffffffa046177b>] ? register_bcache+0xffb/0x114b [bcache]

-Eric

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Bug - bcache: "unable to handle kernel NULL pointer dereference" when trying to add a md caching device
  2016-03-25  1:49 ` Eric Wheeler
@ 2016-03-25  4:18   ` Eric Wheeler
  2016-03-25 14:11     ` Sebastian Roesner
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Wheeler @ 2016-03-25  4:18 UTC (permalink / raw)
  To: Sebastian Roesner; +Cc: linux-bcache

On Fri, 25 Mar 2016, Eric Wheeler wrote:
> On Thu, 24 Mar 2016, Sebastian Roesner wrote:
> > I experienced a bug after upgrading to a recent kernel (4.5.0).
> > 
> > # echo /dev/md2 > /sys/fs/bcache/register
> > 
> > gives me a stack trace.
> > 
> > I created a bug in the kernel bugtracker but I'm not sure if it has been
> > noticed by the bcache developers:
> > 
> > <https://bugzilla.kernel.org/show_bug.cgi?id=114871>
> 
> The dereference appears to happen inside of bio_trim which is core block 
> code.  In this case, it looks like bio_trim is being passed a null pointer 
> to *bio, but by whom?  
> 
> Please send this output:
> 
> tail /sys/block/md2/queue/discard_*
> 
> > Could somebody have a look?
> 
> It could be an allocation failure in bcache that gets pushed down the 
> stack at registration.  If that is the case, please pull these two fixes 
> branches which may have missed 4.5:
> 
> git pull https://bitbucket.org/ewheelerinc/linux v4.5-rc6-bcache-fixes
> git pull https://bitbucket.org/ewheelerinc/linux v4.5-rc7-bcache-fixes

Sebastian,

Try this patch and see if it BUGs:

diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index 86a0bb8..f6bbbce 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -24,6 +24,9 @@ struct bio *bch_bbio_alloc(struct cache_set *c)
 	struct bbio *b = mempool_alloc(c->bio_meta, GFP_NOIO);
 	struct bio *bio = &b->bio;
 
+	BUG_ON(!b);
+	BUG_ON(!bio);
+
 	bio_init(bio);
 	bio->bi_flags		|= BIO_POOL_NONE << BIO_POOL_OFFSET;
 	bio->bi_max_vecs	 = bucket_pages(c);

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: Bug - bcache: "unable to handle kernel NULL pointer dereference" when trying to add a md caching device
  2016-03-25  4:18   ` Eric Wheeler
@ 2016-03-25 14:11     ` Sebastian Roesner
  2016-03-25 20:44       ` [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev Eric Wheeler
  0 siblings, 1 reply; 14+ messages in thread
From: Sebastian Roesner @ 2016-03-25 14:11 UTC (permalink / raw)
  To: Eric Wheeler; +Cc: linux-bcache

Hi Eric,

Am 25.03.2016 um 05:18 schrieb Eric Wheeler:
> Please send this output:
>
> tail /sys/block/md2/queue/discard_*

>> git pull https://bitbucket.org/ewheelerinc/linux v4.5-rc6-bcache-fixes
>> git pull https://bitbucket.org/ewheelerinc/linux v4.5-rc7-bcache-fixes
>
> Try this patch and see if it BUGs:
 > [..]

# tail /sys/block/md2/queue/discard_*
==> /sys/block/md2/queue/discard_granularity <==
4096

==> /sys/block/md2/queue/discard_max_bytes <==
2147450880

==> /sys/block/md2/queue/discard_max_hw_bytes <==
2147450880

==> /sys/block/md2/queue/discard_zeroes_data <==
0

root@gropius:/home/sroesner# echo /dev/md2 > /sys/fs/bcache/register
sroesner@gropius:~$

-> it killed my bash.

Trace with patches and BUG_ON patch:

[  172.660142] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000028
[  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
[  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
[  172.660399] Oops: 0000 [#1] SMP
[...]
[  172.664780] Call Trace:
[  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 
[raid1]
[  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
[  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
[  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
[  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
[  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 
[bcache]
[  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 
[bcache]

See http://pastebin.com/6CHtky2x for complete traces.

Sebastian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-25 14:11     ` Sebastian Roesner
@ 2016-03-25 20:44       ` Eric Wheeler
  2016-03-25 21:32         ` Eric Wheeler
  2016-03-26 15:40         ` Ming Lei
  0 siblings, 2 replies; 14+ messages in thread
From: Eric Wheeler @ 2016-03-25 20:44 UTC (permalink / raw)
  To: Sebastian Roesner
  Cc: linux-bcache, linux-raid, mlin, miuvlad, ming.lei, rjones,
	kent.overstreet

[ +cc's, intentional top post ]

Hello linux-raid list, please see below:

A problem where bio_trim() dereferences a NULL pointer via 
bio->bi_iter.bi_size (0+0x28) has been reported at least three times when 
passing DISCARDs into raid1's.  I could be wrong here, but I'm guessing 
the cause of these three reports are the same:

[2014-02-04] https://bugzilla.redhat.com/show_bug.cgi?id=1061339
[2016-01-13] http://www.spinics.net/lists/linux-bcache/msg03335.html
	     https://bugzilla.kernel.org/show_bug.cgi?id=110771
[2016-03-25] http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3607 [this thread]
             https://bugzilla.kernel.org/show_bug.cgi?id=114871

Is there something that needs to be done differently with DISCARD bio's, 
perhaps related to splitting?  We added BUG_ON(!bio) in bcache's 
prio_io() function (see backtrace below) and didn't trip, so we know that 
bcache isn't passing null bio pointers.

Perhaps there is an issue in either of blk_queue_split() or 
md_make_request() even before it gets to raid1_make_request()?

Please add Cc's for anyone who might be able to help solve this problem.  

Thanks you for your help!

--
Eric Wheeler

On Fri, 25 Mar 2016, Sebastian Roesner wrote:

> Hi Eric,
> 
> Am 25.03.2016 um 05:18 schrieb Eric Wheeler:
> > Please send this output:
> >
> > tail /sys/block/md2/queue/discard_*
> 
> # tail /sys/block/md2/queue/discard_*
> ==> /sys/block/md2/queue/discard_granularity <==
> 4096
> 
> ==> /sys/block/md2/queue/discard_max_bytes <==
> 2147450880
> 
> ==> /sys/block/md2/queue/discard_max_hw_bytes <==
> 2147450880
> 
> ==> /sys/block/md2/queue/discard_zeroes_data <==
> 0
> 
> root@gropius:/home/sroesner# echo /dev/md2 > /sys/fs/bcache/register
> sroesner@gropius:~$
> 
> -> it killed my bash.
> 
> Trace with patches and BUG_ON patch:
> 
> [  172.660142] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000028
> [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
> [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
> [  172.660399] Oops: 0000 [#1] SMP
> [...]
> [  172.664780] Call Trace:
> [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
> [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
> [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
> [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
> [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
> [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]
> 
> See http://pastebin.com/6CHtky2x for complete traces.
> 
> Sebastian
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-25 20:44       ` [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev Eric Wheeler
@ 2016-03-25 21:32         ` Eric Wheeler
  2016-03-26 18:38           ` Sebastian Roesner
  2016-03-26 15:40         ` Ming Lei
  1 sibling, 1 reply; 14+ messages in thread
From: Eric Wheeler @ 2016-03-25 21:32 UTC (permalink / raw)
  To: Sebastian Roesner
  Cc: linux-bcache, linux-raid, mlin, miuvlad, ming.lei, rjones,
	kent.overstreet

On Fri, 25 Mar 2016, Eric Wheeler wrote:
> On Fri, 25 Mar 2016, Sebastian Roesner wrote:
> 
> > Hi Eric,
> > 
> > Am 25.03.2016 um 05:18 schrieb Eric Wheeler:
> > > Please send this output:
> > >
> > > tail /sys/block/md2/queue/discard_*
> > 
> > # tail /sys/block/md2/queue/discard_*
> > ==> /sys/block/md2/queue/discard_granularity <==
> > 4096
> > 
> > ==> /sys/block/md2/queue/discard_max_bytes <==
> > 2147450880
> > 
> > ==> /sys/block/md2/queue/discard_max_hw_bytes <==
> > 2147450880
> > 
> > ==> /sys/block/md2/queue/discard_zeroes_data <==
> > 0

Sebastian, 

This could be an upstream problem.  For now you could disable discards.  
One to do this might be to place the cache device under LVM and set 
issue_discards=0 in lvm.conf.  Then register the LV as your cachedev. 
Hopefully this is an initial use case and that you don't have important 
writeback data in the cache such that you can re-create it.  We use LVM to 
allocate our cache volumes so I know it works.

I'm not sure if /sys/block/md2/queue/discard_granularity is writable, but 
if it is, you could `echo 0 > /sys/block/md2/queue/discard_granularity` 
which should disable discard too.

Perhaps others can chime in on turning off discard support for block 
devices if there is a better way, perhaps even something md-specific.


--
Eric Wheeler



> > 
> > root@gropius:/home/sroesner# echo /dev/md2 > /sys/fs/bcache/register
> > sroesner@gropius:~$
> > 
> > -> it killed my bash.
> > 
> > Trace with patches and BUG_ON patch:
> > 
> > [  172.660142] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000028
> > [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
> > [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
> > [  172.660399] Oops: 0000 [#1] SMP
> > [...]
> > [  172.664780] Call Trace:
> > [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
> > [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
> > [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
> > [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> > [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
> > [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
> > [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]
> > 
> > See http://pastebin.com/6CHtky2x for complete traces.
> > 
> > Sebastian
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-25 20:44       ` [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev Eric Wheeler
  2016-03-25 21:32         ` Eric Wheeler
@ 2016-03-26 15:40         ` Ming Lei
  2016-03-26 16:46           ` Sebastian Roesner
  1 sibling, 1 reply; 14+ messages in thread
From: Ming Lei @ 2016-03-26 15:40 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Sebastian Roesner, linux-bcache, linux-raid, Ming Lin,
	Vlad-Cosmin Miu, rjones, Kent Overstreet

[-- Attachment #1: Type: text/plain, Size: 3317 bytes --]

On Sat, Mar 26, 2016 at 4:44 AM, Eric Wheeler <bcache@lists.ewheeler.net> wrote:
> [ +cc's, intentional top post ]
>
> Hello linux-raid list, please see below:
>
> A problem where bio_trim() dereferences a NULL pointer via
> bio->bi_iter.bi_size (0+0x28) has been reported at least three times when
> passing DISCARDs into raid1's.  I could be wrong here, but I'm guessing
> the cause of these three reports are the same:
>
> [2014-02-04] https://bugzilla.redhat.com/show_bug.cgi?id=1061339
> [2016-01-13] http://www.spinics.net/lists/linux-bcache/msg03335.html
>              https://bugzilla.kernel.org/show_bug.cgi?id=110771
> [2016-03-25] http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3607 [this thread]
>              https://bugzilla.kernel.org/show_bug.cgi?id=114871
>
> Is there something that needs to be done differently with DISCARD bio's,
> perhaps related to splitting?  We added BUG_ON(!bio) in bcache's
> prio_io() function (see backtrace below) and didn't trip, so we know that
> bcache isn't passing null bio pointers.
>
> Perhaps there is an issue in either of blk_queue_split() or
> md_make_request() even before it gets to raid1_make_request()?

From the trace, looks the bio returned from bio_clone_mddev() is NULL,
and maybe the source bio is buggy, something like the attached patch
might be helpful, could you apply the attached debug patch, reproduce
and post the log?

thanks,

>
> Please add Cc's for anyone who might be able to help solve this problem.
>
> Thanks you for your help!
>
> --
> Eric Wheeler
>
> On Fri, 25 Mar 2016, Sebastian Roesner wrote:
>
>> Hi Eric,
>>
>> Am 25.03.2016 um 05:18 schrieb Eric Wheeler:
>> > Please send this output:
>> >
>> > tail /sys/block/md2/queue/discard_*
>>
>> # tail /sys/block/md2/queue/discard_*
>> ==> /sys/block/md2/queue/discard_granularity <==
>> 4096
>>
>> ==> /sys/block/md2/queue/discard_max_bytes <==
>> 2147450880
>>
>> ==> /sys/block/md2/queue/discard_max_hw_bytes <==
>> 2147450880
>>
>> ==> /sys/block/md2/queue/discard_zeroes_data <==
>> 0
>>
>> root@gropius:/home/sroesner# echo /dev/md2 > /sys/fs/bcache/register
>> sroesner@gropius:~$
>>
>> -> it killed my bash.
>>
>> Trace with patches and BUG_ON patch:
>>
>> [  172.660142] BUG: unable to handle kernel NULL pointer dereference at
>> 0000000000000028
>> [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
>> [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
>> [  172.660399] Oops: 0000 [#1] SMP
>> [...]
>> [  172.664780] Call Trace:
>> [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
>> [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
>> [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
>> [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
>> [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
>> [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
>> [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]
>>
>> See http://pastebin.com/6CHtky2x for complete traces.
>>
>> Sebastian
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>

[-- Attachment #2: raid1-check-bio.patch --]
[-- Type: text/x-patch, Size: 1400 bytes --]

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 39fb21e..956205d 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1044,6 +1044,27 @@ static void raid1_unplug(struct blk_plug_cb *cb, bool from_schedule)
 	kfree(plug);
 }
 
+static void check_bio(struct bio *bio, bool check)
+{
+	struct bio_vec bvec;
+	struct bvec_iter iter = bio->bi_iter;
+	int i = 0;
+
+	if (!check)
+		return;
+
+	printk("%s: dump bio %p segs:%d f:%x max_bvecs:%d iov:%p\n",
+			__func__, bio, bio_segments(bio), bio->bi_flags,
+			bio->bi_max_vecs, bio->bi_io_vec);
+	printk("\t iter: %u %u %u\n", iter.bi_size, iter.bi_idx,
+	       iter.bi_bvec_done);
+	bio_for_each_segment(bvec, bio, iter) {
+		printk("\t %4d: %u %u %lu\n",
+		       i, bvec.bv_offset, bvec.bv_len,
+		       (unsigned long)page_to_pfn(bvec.bv_page));
+	}
+}
+
 static void raid1_make_request(struct mddev *mddev, struct bio * bio)
 {
 	struct r1conf *conf = mddev->private;
@@ -1157,6 +1178,7 @@ read_again:
 		r1_bio->start_next_window = 0;
 
 		read_bio = bio_clone_mddev(bio, GFP_NOIO, mddev);
+		check_bio(bio, !read_bio);
 		bio_trim(read_bio, r1_bio->sector - bio->bi_iter.bi_sector,
 			 max_sectors);
 
@@ -1338,6 +1360,7 @@ read_again:
 			continue;
 
 		mbio = bio_clone_mddev(bio, GFP_NOIO, mddev);
+		check_bio(bio, !mbio);
 		bio_trim(mbio, r1_bio->sector - bio->bi_iter.bi_sector, max_sectors);
 
 		if (first_clone) {

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-26 15:40         ` Ming Lei
@ 2016-03-26 16:46           ` Sebastian Roesner
  2016-03-28 18:10             ` Shaohua Li
  0 siblings, 1 reply; 14+ messages in thread
From: Sebastian Roesner @ 2016-03-26 16:46 UTC (permalink / raw)
  To: Ming Lei, Eric Wheeler
  Cc: linux-bcache, linux-raid, Ming Lin, Vlad-Cosmin Miu, rjones,
	Kent Overstreet

Hello Ming, Eric,

Am 26.03.2016 um 16:40 schrieb Ming Lei:
>  From the trace, looks the bio returned from bio_clone_mddev() is NULL,
> and maybe the source bio is buggy, something like the attached patch
> might be helpful, could you apply the attached debug patch, reproduce
> and post the log?

I was able to reproduce it on a non-productive system, but only after 
copying the bcache superblocks/partition starts from the original 
system, with new created ones it worked fine.

Full trace and check_bio output can be found here:

http://pastebin.com/ngvGGHBZ

Sebastian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-25 21:32         ` Eric Wheeler
@ 2016-03-26 18:38           ` Sebastian Roesner
  0 siblings, 0 replies; 14+ messages in thread
From: Sebastian Roesner @ 2016-03-26 18:38 UTC (permalink / raw)
  To: Eric Wheeler; +Cc: linux-bcache

Hi Eric,

[other lists removed intentionally]

> This could be an upstream problem.

I tried to reproduce it on a test system with the same kernel with new 
created superblocks, there everything worked fine. Also after setting 
dev.cache.discard to yes.

Then I copied the complete partitions from the productive system and got 
the same bug there.

Long story short: It seems to be caused by some data in the the 
cache-device superblock. So even if it's an upstream problem, it's 
somehow caused by data in my superblock.

It can be found here: https://www.f0o.de/~sroesner/bcache-superblock.bz2

Maybe you can reproduce it, for testing I created a degraded raid on 
only one 500G partition and tried to register it to cause the bug.

Sebastian

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-26 16:46           ` Sebastian Roesner
@ 2016-03-28 18:10             ` Shaohua Li
  2016-03-28 18:38               ` Jeff Moyer
                                 ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Shaohua Li @ 2016-03-28 18:10 UTC (permalink / raw)
  To: Sebastian Roesner
  Cc: Ming Lei, Eric Wheeler, linux-bcache, linux-raid, Ming Lin,
	Vlad-Cosmin Miu, rjones, Kent Overstreet, jmoyer, axboe

On Sat, Mar 26, 2016 at 05:46:16PM +0100, Sebastian Roesner wrote:
> Hello Ming, Eric,
> 
> Am 26.03.2016 um 16:40 schrieb Ming Lei:
> > From the trace, looks the bio returned from bio_clone_mddev() is NULL,
> >and maybe the source bio is buggy, something like the attached patch
> >might be helpful, could you apply the attached debug patch, reproduce
> >and post the log?
> 
> I was able to reproduce it on a non-productive system, but only after
> copying the bcache superblocks/partition starts from the original system,
> with new created ones it worked fine.
> 
> Full trace and check_bio output can be found here:
> 
> http://pastebin.com/ngvGGHBZ

320 bvecs exceeds what bio-clone_set can handle. Could you please try below patch?

commit 92761dad7ff6e1bf25de247e0064dd398e797599
Author: Shaohua Li <shli@fb.com>
Date:   Mon Mar 28 10:54:35 2016 -0700

    block: don't make BLK_DEF_MAX_SECTORS too big
    
    bio_alloc_bioset() allocates bvecs from bvec_slabs which can only allocate
    maximum 256 bvec (eg, 1M for 4k pages). We can't bump BLK_DEF_MAX_SECTORS to
    exceed this value otherwise bio_alloc_bioset will fail.
    
    This fixes commit 30e2bc08b2bb7c069. We probably should make the bvec_slabs
    hold bigger bvecs if bigger bio size is required.
    
    Signed-off-by: Shaohua Li <shli@fb.com>

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7e5d7e0..da64325 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1153,7 +1153,11 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
 enum blk_default_limits {
 	BLK_MAX_SEGMENTS	= 128,
 	BLK_SAFE_MAX_SECTORS	= 255,
-	BLK_DEF_MAX_SECTORS	= 2560,
+	/*
+	 * if you change this, please also change bvec_alloc and BIO_MAX_PAGES.
+	 * Otherwise bio_alloc_bioset will break.
+	 */
+	BLK_DEF_MAX_SECTORS	= BIO_MAX_SECTORS,
 	BLK_MAX_SEGMENT_SIZE	= 65536,
 	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,
 };

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-28 18:10             ` Shaohua Li
@ 2016-03-28 18:38               ` Jeff Moyer
  2016-03-29  1:28               ` Ming Lei
  2016-04-01 18:14               ` Eric Wheeler
  2 siblings, 0 replies; 14+ messages in thread
From: Jeff Moyer @ 2016-03-28 18:38 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Sebastian Roesner, Ming Lei, Eric Wheeler, linux-bcache,
	linux-raid, Ming Lin, Vlad-Cosmin Miu, rjones, Kent Overstreet,
	axboe

Shaohua Li <shli@kernel.org> writes:

> On Sat, Mar 26, 2016 at 05:46:16PM +0100, Sebastian Roesner wrote:
>> Hello Ming, Eric,
>> 
>> Am 26.03.2016 um 16:40 schrieb Ming Lei:
>> > From the trace, looks the bio returned from bio_clone_mddev() is NULL,
>> >and maybe the source bio is buggy, something like the attached patch
>> >might be helpful, could you apply the attached debug patch, reproduce
>> >and post the log?
>> 
>> I was able to reproduce it on a non-productive system, but only after
>> copying the bcache superblocks/partition starts from the original system,
>> with new created ones it worked fine.
>> 
>> Full trace and check_bio output can be found here:
>> 
>> http://pastebin.com/ngvGGHBZ
>
> 320 bvecs exceeds what bio-clone_set can handle. Could you please try
> below patch?

OK, so bvec_alloc won't allow us to allocate more than BIO_MAX_PAGES.

> commit 92761dad7ff6e1bf25de247e0064dd398e797599
> Author: Shaohua Li <shli@fb.com>
> Date:   Mon Mar 28 10:54:35 2016 -0700
>
>     block: don't make BLK_DEF_MAX_SECTORS too big
>     
>     bio_alloc_bioset() allocates bvecs from bvec_slabs which can only allocate
>     maximum 256 bvec (eg, 1M for 4k pages). We can't bump BLK_DEF_MAX_SECTORS to
>     exceed this value otherwise bio_alloc_bioset will fail.
>     
>     This fixes commit 30e2bc08b2bb7c069.

Wrong commit.  You meant this one: d2be537c3ba (block: bump
BLK_DEF_MAX_SECTORS to 2560).

>     We probably should make the bvec_slabs hold bigger bvecs if bigger
>     bio size is required.

This change is okay with me.  So long as it fixes the issue, it should
be marked for stable, too.  Thanks for adding the comment, by the way!

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>


>     Signed-off-by: Shaohua Li <shli@fb.com>
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 7e5d7e0..da64325 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1153,7 +1153,11 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
>  enum blk_default_limits {
>  	BLK_MAX_SEGMENTS	= 128,
>  	BLK_SAFE_MAX_SECTORS	= 255,
> -	BLK_DEF_MAX_SECTORS	= 2560,
> +	/*
> +	 * if you change this, please also change bvec_alloc and BIO_MAX_PAGES.
> +	 * Otherwise bio_alloc_bioset will break.
> +	 */
> +	BLK_DEF_MAX_SECTORS	= BIO_MAX_SECTORS,
>  	BLK_MAX_SEGMENT_SIZE	= 65536,
>  	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,
>  };

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-28 18:10             ` Shaohua Li
  2016-03-28 18:38               ` Jeff Moyer
@ 2016-03-29  1:28               ` Ming Lei
  2016-04-01 18:14               ` Eric Wheeler
  2 siblings, 0 replies; 14+ messages in thread
From: Ming Lei @ 2016-03-29  1:28 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Sebastian Roesner, Eric Wheeler, linux-bcache, linux-raid,
	Ming Lin, Vlad-Cosmin Miu, rjones, Kent Overstreet, Jeff Moyer,
	Jens Axboe

On Tue, Mar 29, 2016 at 2:10 AM, Shaohua Li <shli@kernel.org> wrote:
> On Sat, Mar 26, 2016 at 05:46:16PM +0100, Sebastian Roesner wrote:
>> Hello Ming, Eric,
>>
>> Am 26.03.2016 um 16:40 schrieb Ming Lei:
>> > From the trace, looks the bio returned from bio_clone_mddev() is NULL,
>> >and maybe the source bio is buggy, something like the attached patch
>> >might be helpful, could you apply the attached debug patch, reproduce
>> >and post the log?
>>
>> I was able to reproduce it on a non-productive system, but only after
>> copying the bcache superblocks/partition starts from the original system,
>> with new created ones it worked fine.
>>
>> Full trace and check_bio output can be found here:
>>
>> http://pastebin.com/ngvGGHBZ
>
> 320 bvecs exceeds what bio-clone_set can handle. Could you please try below patch?
>

The source bio is still a cloned bio which includes 320 bvecs, which is weried.
And I guess bio_add_page() isn't used for the initial source bio from bcache.

From the stack trace, looks it is related with prio_io(), which just
sets bio size as
the bucket size, so could you check what the bucket size is about your
bcache device via bcache-super-show or sysfs?

If the bucket size is (320 * 4K), looks we should limit the queue's max
sector as 1MB.

BTW, in my bcache block, the bucket size is 512K, and I can't reproduce
the issue, and it is stored in superblock.

> commit 92761dad7ff6e1bf25de247e0064dd398e797599
> Author: Shaohua Li <shli@fb.com>
> Date:   Mon Mar 28 10:54:35 2016 -0700
>
>     block: don't make BLK_DEF_MAX_SECTORS too big
>
>     bio_alloc_bioset() allocates bvecs from bvec_slabs which can only allocate
>     maximum 256 bvec (eg, 1M for 4k pages). We can't bump BLK_DEF_MAX_SECTORS to
>     exceed this value otherwise bio_alloc_bioset will fail.
>
>     This fixes commit 30e2bc08b2bb7c069. We probably should make the bvec_slabs
>     hold bigger bvecs if bigger bio size is required.
>
>     Signed-off-by: Shaohua Li <shli@fb.com>
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 7e5d7e0..da64325 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1153,7 +1153,11 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
>  enum blk_default_limits {
>         BLK_MAX_SEGMENTS        = 128,
>         BLK_SAFE_MAX_SECTORS    = 255,
> -       BLK_DEF_MAX_SECTORS     = 2560,
> +       /*
> +        * if you change this, please also change bvec_alloc and BIO_MAX_PAGES.
> +        * Otherwise bio_alloc_bioset will break.
> +        */
> +       BLK_DEF_MAX_SECTORS     = BIO_MAX_SECTORS,
>         BLK_MAX_SEGMENT_SIZE    = 65536,
>         BLK_SEG_BOUNDARY_MASK   = 0xFFFFFFFFUL,
>  };

This change should work.

But, when multipage bvecs is ready, I think the above limit can be increased
too. I suggest to use (BIO_MAX_PAGES << (PAGE_SHIFT - 9)) instead of
BIO_MAX_SECTORS, which is removed in my following patch and there is
only one user of this macro:

       http://marc.info/?l=linux-kernel&m=145862727620599&w=2

Thanks,
Ming

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-03-28 18:10             ` Shaohua Li
  2016-03-28 18:38               ` Jeff Moyer
  2016-03-29  1:28               ` Ming Lei
@ 2016-04-01 18:14               ` Eric Wheeler
  2016-04-10 18:01                 ` Sebastian Roesner
  2 siblings, 1 reply; 14+ messages in thread
From: Eric Wheeler @ 2016-04-01 18:14 UTC (permalink / raw)
  To: Sebastian Roesner
  Cc: Shaohua Li, Ming Lei, linux-bcache, linux-raid, Ming Lin,
	Vlad-Cosmin Miu, rjones, Kent Overstreet, jmoyer, axboe

On Mon, 28 Mar 2016, Shaohua Li wrote:
> On Sat, Mar 26, 2016 at 05:46:16PM +0100, Sebastian Roesner wrote:
> > Hello Ming, Eric,
> > 
> > Am 26.03.2016 um 16:40 schrieb Ming Lei:
> > I was able to reproduce it on a non-productive system, but only after
> > copying the bcache superblocks/partition starts from the original system,
> > with new created ones it worked fine.

Hey Sebastian,

Have you had a chance to test the patch below from Shaohua Li?

--
Eric Wheeler


> > 
> > Full trace and check_bio output can be found here:
> > 
> > http://pastebin.com/ngvGGHBZ
> 
> 320 bvecs exceeds what bio-clone_set can handle. Could you please try below patch?
> 
> commit 92761dad7ff6e1bf25de247e0064dd398e797599
> Author: Shaohua Li <shli@fb.com>
> Date:   Mon Mar 28 10:54:35 2016 -0700
> 
>     block: don't make BLK_DEF_MAX_SECTORS too big
>     
>     bio_alloc_bioset() allocates bvecs from bvec_slabs which can only allocate
>     maximum 256 bvec (eg, 1M for 4k pages). We can't bump BLK_DEF_MAX_SECTORS to
>     exceed this value otherwise bio_alloc_bioset will fail.
>     
>     This fixes commit 30e2bc08b2bb7c069. We probably should make the bvec_slabs
>     hold bigger bvecs if bigger bio size is required.
>     
>     Signed-off-by: Shaohua Li <shli@fb.com>
> 
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 7e5d7e0..da64325 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1153,7 +1153,11 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
>  enum blk_default_limits {
>  	BLK_MAX_SEGMENTS	= 128,
>  	BLK_SAFE_MAX_SECTORS	= 255,
> -	BLK_DEF_MAX_SECTORS	= 2560,
> +	/*
> +	 * if you change this, please also change bvec_alloc and BIO_MAX_PAGES.
> +	 * Otherwise bio_alloc_bioset will break.
> +	 */
> +	BLK_DEF_MAX_SECTORS	= BIO_MAX_SECTORS,
>  	BLK_MAX_SEGMENT_SIZE	= 65536,
>  	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,
>  };
> 

	

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev
  2016-04-01 18:14               ` Eric Wheeler
@ 2016-04-10 18:01                 ` Sebastian Roesner
  0 siblings, 0 replies; 14+ messages in thread
From: Sebastian Roesner @ 2016-04-10 18:01 UTC (permalink / raw)
  To: Eric Wheeler, Shaohua Li
  Cc: Ming Lei, linux-bcache, linux-raid, Ming Lin, Vlad-Cosmin Miu,
	rjones, Kent Overstreet, jmoyer, axboe

Hello Eric, Shaohua,

Am 01.04.2016 um 20:14 schrieb Eric Wheeler:
 > Hey Sebastian,
 >
 > Have you had a chance to test the patch below from Shaohua Li?

I just tried the patch and it did not crash anymore.

Sebastian

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-04-10 18:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-24 15:19 Bug - bcache: "unable to handle kernel NULL pointer dereference" when trying to add a md caching device Sebastian Roesner
2016-03-25  1:49 ` Eric Wheeler
2016-03-25  4:18   ` Eric Wheeler
2016-03-25 14:11     ` Sebastian Roesner
2016-03-25 20:44       ` [BUG] NULL pointer in raid1_make_request passed to bio_trim when adding md as bcache caching dev Eric Wheeler
2016-03-25 21:32         ` Eric Wheeler
2016-03-26 18:38           ` Sebastian Roesner
2016-03-26 15:40         ` Ming Lei
2016-03-26 16:46           ` Sebastian Roesner
2016-03-28 18:10             ` Shaohua Li
2016-03-28 18:38               ` Jeff Moyer
2016-03-29  1:28               ` Ming Lei
2016-04-01 18:14               ` Eric Wheeler
2016-04-10 18:01                 ` Sebastian Roesner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.