* hybrid raid1 with trim support
@ 2013-03-12 17:49 Markus
  2013-03-24  9:21 ` Markus
From: Markus @ 2013-03-12 17:49 UTC
  To: lkml

Hello!

I created a hybrid RAID1 from one SSD and one HDD, using write-mostly and write-behind, and put ext4 with discard enabled on it.
This setup worked well for the past few months (the last kernel was 3.7.6). But when I booted 3.8.2, the HDD was dropped from the array with:
> md/raid1:md2: Disk failure on sdb1, disabling device.
> md/raid1:md2: Operation continuing on 1 devices.

After re-adding the drive, md tries to resync, but the HDD is dropped again shortly afterwards. When I remount the filesystem without the discard option, the resync and normal usage work as before.
After remounting it with discard enabled, the HDD is dropped again. So I think discard is the culprit, since the HDD obviously does not support TRIM...
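
One way to confirm that the HDD really lacks discard support is to look at the queue's discard_max_bytes attribute in sysfs, which the kernel reports as 0 for devices without discard. A minimal C sketch, assuming the usual /sys/block/<disk>/queue/ layout; the helper names discard_supported_from_text and disk_supports_discard are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Interpret the contents of /sys/block/<disk>/queue/discard_max_bytes.
 * A value of 0 means the device does not support discard.
 */
static bool discard_supported_from_text(const char *text)
{
	char *end;
	unsigned long long max_bytes;

	assert(text != NULL);
	max_bytes = strtoull(text, &end, 10);
	if (end == text)	/* no digits at all: treat as unsupported */
		return false;
	return max_bytes > 0;
}

/*
 * Read the sysfs attribute for a whole disk such as "sda" or "sdb".
 * Returns 1 if discard is supported, 0 if not, -1 if the attribute
 * is missing or unreadable.
 */
static int disk_supports_discard(const char *disk)
{
	char path[256];
	char buf[64];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/queue/discard_max_bytes", disk);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return discard_supported_from_text(buf) ? 1 : 0;
}
```

In a setup like this one, disk_supports_discard("sdb") would be expected to return 0 for the HDD and 1 for the SSD; -1 only means the attribute could not be read.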

As it worked fine before, I think this is a regression. Or is this an intended change?


Thanks,
Markus


* Re: hybrid raid1 with trim support
  2013-03-12 17:49 hybrid raid1 with trim support Markus
@ 2013-03-24  9:21 ` Markus
  2013-04-27 16:29   ` hybrid raid1 with trim support [REGRESSION] Markus
From: Markus @ 2013-03-24  9:21 UTC
  To: lkml

Hi!

Still the same with 3.8.4 ... anybody?

Best regards,
Markus


* Re: hybrid raid1 with trim support [REGRESSION]
  2013-03-24  9:21 ` Markus
@ 2013-04-27 16:29   ` Markus
  2013-04-28  0:54     ` Shaohua Li
From: Markus @ 2013-04-27 16:29 UTC
  To: lkml; +Cc: Shaohua Li, Jens Axboe

Hi!

I now had time to bisect, starting with 3.7 as good and 3.8 as bad.
The bad commit is 0cfbcafcae8b7364b5fa96c2b26ccde7a3a296a9 [1]:
block: add plug for blkdev_issue_discard

3.8.10 is still bad, but the same kernel with that commit reverted is fine.


I found another report. [2]


Thanks,
Markus

[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0cfbcafcae8b7364b5fa96c2b26ccde7a3a296a9
[2] http://www.spinics.net/lists/raid/msg42758.html


* Re: hybrid raid1 with trim support [REGRESSION]
  2013-04-27 16:29   ` hybrid raid1 with trim support [REGRESSION] Markus
@ 2013-04-28  0:54     ` Shaohua Li
  2013-04-28  1:00       ` Shaohua Li
From: Shaohua Li @ 2013-04-28  0:54 UTC
  To: Markus; +Cc: lkml, Jens Axboe

On Sat, Apr 27, 2013 at 06:29:49PM +0200, Markus wrote:
> Hi!
> 
> Now I had the time to bisect, started with 3.7 as good and 3.8 as bad.
> 0cfbcafcae8b7364b5fa96c2b26ccde7a3a296a9 is the bad commit. [1]
> block: add plug for blkdev_issue_discard
> 
> While 3.8.10 was still bad, the same kernel with the reverted patch applied is fine.

Thanks for the report. Does the patch below work for you?

Thanks,
Shaohua


---
 drivers/md/raid1.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux/drivers/md/raid1.c
===================================================================
--- linux.orig/drivers/md/raid1.c	2013-03-07 14:14:05.950824173 +0800
+++ linux/drivers/md/raid1.c	2013-04-28 08:52:06.761964780 +0800
@@ -981,6 +981,10 @@ static void raid1_unplug(struct blk_plug
 	while (bio) { /* submit pending writes */
 		struct bio *next = bio->bi_next;
 		bio->bi_next = NULL;
+		if (unlikely((bio->bi_rw & REQ_DISCARD) &&
+		    !blk_queue_discard(bdev_get_queue(bio->bi_bdev))))
+			/* Just ignore it */
+			bio_endio(bio, 0);
 		generic_make_request(bio);
 		bio = next;
 	}


* Re: hybrid raid1 with trim support [REGRESSION]
  2013-04-28  0:54     ` Shaohua Li
@ 2013-04-28  1:00       ` Shaohua Li
  2013-04-28  9:40         ` Markus
From: Shaohua Li @ 2013-04-28  1:00 UTC
  To: Markus; +Cc: lkml, Jens Axboe

On Sun, Apr 28, 2013 at 08:54:46AM +0800, Shaohua Li wrote:
> On Sat, Apr 27, 2013 at 06:29:49PM +0200, Markus wrote:
> > Hi!
> > 
> > Now I had the time to bisect, started with 3.7 as good and 3.8 as bad.
> > 0cfbcafcae8b7364b5fa96c2b26ccde7a3a296a9 is the bad commit. [1]
> > block: add plug for blkdev_issue_discard
> > 
> > While 3.8.10 was still bad, the same kernel with the reverted patch applied is fine.
> Thanks for the reporting. Does below patch work for you?

Oops, there was a mistake in that patch (the else branch was missing). It should be this one:

---
 drivers/md/raid1.c  |    7 ++++++-
 drivers/md/raid10.c |    7 ++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

Index: linux/drivers/md/raid1.c
===================================================================
--- linux.orig/drivers/md/raid1.c	2013-03-07 14:14:05.950824173 +0800
+++ linux/drivers/md/raid1.c	2013-04-28 08:57:17.874058434 +0800
@@ -981,7 +981,12 @@ static void raid1_unplug(struct blk_plug
 	while (bio) { /* submit pending writes */
 		struct bio *next = bio->bi_next;
 		bio->bi_next = NULL;
-		generic_make_request(bio);
+		if (unlikely((bio->bi_rw & REQ_DISCARD) &&
+		    !blk_queue_discard(bdev_get_queue(bio->bi_bdev))))
+			/* Just ignore it */
+			bio_endio(bio, 0);
+		else
+			generic_make_request(bio);
 		bio = next;
 	}
 	kfree(plug);
Index: linux/drivers/md/raid10.c
===================================================================
--- linux.orig/drivers/md/raid10.c	2013-03-07 14:14:05.950824173 +0800
+++ linux/drivers/md/raid10.c	2013-04-28 08:57:44.765719067 +0800
@@ -1133,7 +1133,12 @@ static void raid10_unplug(struct blk_plu
 	while (bio) { /* submit pending writes */
 		struct bio *next = bio->bi_next;
 		bio->bi_next = NULL;
-		generic_make_request(bio);
+		if (unlikely((bio->bi_rw & REQ_DISCARD) &&
+		    !blk_queue_discard(bdev_get_queue(bio->bi_bdev))))
+			/* Just ignore it */
+			bio_endio(bio, 0);
+		else
+			generic_make_request(bio);
 		bio = next;
 	}
 	kfree(plug);


* Re: hybrid raid1 with trim support [REGRESSION]
  2013-04-28  1:00       ` Shaohua Li
@ 2013-04-28  9:40         ` Markus
  2013-04-28 10:10           ` Shaohua Li
From: Markus @ 2013-04-28  9:40 UTC
  To: Shaohua Li; +Cc: lkml, Jens Axboe

Hi!

Thanks for your work. The patch works for me on a vanilla 3.8.10; at least
the HDDs are no longer dropped from the array.
So the code now ignores some requests? Why did the disks fall out of the
array before? Are the discards still passed to the SSD?


Thanks,
Markus



* Re: hybrid raid1 with trim support [REGRESSION]
  2013-04-28  9:40         ` Markus
@ 2013-04-28 10:10           ` Shaohua Li
From: Shaohua Li @ 2013-04-28 10:10 UTC
  To: Markus; +Cc: lkml, Jens Axboe

On Sun, Apr 28, 2013 at 11:40:42AM +0200, Markus wrote:
> Hi!
> 
> Thanks for your work. The patch seems to work for me on a vanilla 3.8.10, at 
> least the hdds are no longer dropped from the raid.
> The code now ignores some request? What was the reason the disks fell off the 
> raid? The discards are still passed to the ssd?

Thanks for testing. I'll send the patch to Neil soon.

Yes, discards are still passed to the SSD; we just ignore the request for the hard disk.
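
To illustrate the behaviour described above, here is a small userspace model of the patched unplug loop. It is only a sketch: struct model_dev, struct model_bio and model_unplug are hypothetical stand-ins, not kernel types, and the kernel calls are only mentioned in comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a member device and its request queue. */
struct model_dev {
	bool supports_discard;
	int submitted;		/* requests actually sent to the device */
};

/* Hypothetical stand-in for a pending bio on the plug list. */
struct model_bio {
	struct model_bio *next;
	struct model_dev *dev;
	bool is_discard;
	bool completed;		/* set when ended without being submitted */
};

/*
 * Walk the pending list the way the patched loop does: a discard aimed
 * at a device without discard support is completed with success and
 * never submitted; every other request is submitted normally.
 * Returns the number of requests that were ignored.
 */
static int model_unplug(struct model_bio *bio)
{
	int ignored = 0;

	while (bio) {
		struct model_bio *next = bio->next;

		assert(bio->dev != NULL);
		bio->next = NULL;
		if (bio->is_discard && !bio->dev->supports_discard) {
			bio->completed = true;	/* models bio_endio(bio, 0) */
			ignored++;
		} else {
			bio->dev->submitted++;	/* models generic_make_request(bio) */
		}
		bio = next;
	}
	return ignored;
}
```

With one SSD and one HDD in the model, a discard queued for both members reaches only the SSD, while ordinary writes still reach both.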

Thanks,
Shaohua
 
