From: "James Johnston" <johnstonj.public@codenest.com>
To: 'Eric Wheeler' <bcache@lists.ewheeler.net>,
	'Tim Small' <tim@buttersideup.com>
Cc: 'Kent Overstreet' <kent.overstreet@gmail.com>,
	'Alasdair Kergon' <agk@redhat.com>,
	'Mike Snitzer' <snitzer@redhat.com>,
	linux-bcache@vger.kernel.org, dm-devel@redhat.com,
	dm-crypt@saout.de
Subject: RE: bcache gets stuck flushing writeback cache when used in combination with LUKS/dm-crypt and non-default bucket size
Date: Fri, 20 May 2016 06:59:32 -0000
Message-ID: <02b101d1b265$2bc46fb0$834d4f10$@codenest.com>
In-Reply-To: <alpine.LRH.2.11.1605191618560.22436@mx.ewheeler.net>

> On Mon, 16 May 2016, Tim Small wrote:
> 
> > On 08/05/16 19:39, James Johnston wrote:
> > > I've run into a problem where the bcache writeback cache can't be flushed to
> > > disk when the backing device is a LUKS / dm-crypt device and the cache set has
> > > a non-default bucket size.  Basically, only a few megabytes will be flushed to
> > > disk, and then it gets stuck.  Stuck means that the bcache writeback task
> > > thrashes the disk by constantly reading hundreds of MB/second from the cache set
> > > in an infinite loop, while not actually progressing (dirty_data never decreases
> > > beyond a certain point).
> >
> > > [...]
> >
> > > The situation is basically unrecoverable as far as I can tell: if you attempt
> > > to detach the cache set then the cache set disk gets thrashed extra-hard
> > > forever, and it's impossible to actually get the cache set detached.  The only
> > > solution seems to be to back up the data and destroy the volume...
> >
> > You can boot an older kernel to flush the device without destroying it
> > (I'm guessing that's because older kernels split down the big requests
> > which are failing on the 4.4 kernel).  Once flushed you could put the
> > cache into writethrough mode, or use a smaller bucket size.
> 
> Indeed, can someone test 4.1.y and see if the problem persists with a 2M
> bucket size?  (If someone has already tested 4.1, then apologies as I've
> not yet seen that report.)
> 
> If 4.1 works, then I think a bisect is in order.  Such a bisect would at
> least highlight the problem and might indicate a (hopefully trivial) fix.

To help narrow this down, I tested the following generic pre-compiled mainline kernels
on Ubuntu 15.10:

 * WORKS:  http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.3.6-wily/
 * DOES NOT WORK:  http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-rc1+cod1-wily/

I also tried the default & latest distribution-provided 4.2 kernel.  It worked.
This one also worked:

 * WORKS:  http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.2.8-wily/

So it seems to me that this is a regression introduced somewhere between the
4.3.6 kernel and 4.4-rc1.  That should help save time with bisection...
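
For anyone scripting the good/bad check during a bisect, here is a minimal
sketch of how "stuck" could be detected from a few dirty_data readings taken
some seconds apart.  It is only an illustration: the sysfs path (bcache0), the
helper names, and the assumption that dirty_data is printed with 1024-based
suffixes such as "512.0k" or "1.1G" are mine and were not checked against the
kernels listed above.

```python
import re

# Assumption: dirty_data is human-readable with binary (1024-based)
# suffixes, e.g. "0", "512.0k", "1.1G".
_UNITS = {"": 1, "k": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a human-readable size string into a byte count."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*([kMGT]?)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


def is_stuck(samples, min_progress=0):
    """Return True if dirty data is still non-zero and did not shrink by
    more than min_progress bytes between the first and last sample."""
    sizes = [parse_size(s) for s in samples]
    if len(sizes) < 2:
        raise ValueError("need at least two samples")
    if sizes[-1] == 0:
        return False  # fully flushed, so not stuck
    return sizes[0] - sizes[-1] <= min_progress


def read_dirty_data(path="/sys/block/bcache0/bcache/dirty_data"):
    """Read one sample; the device name bcache0 is an assumption."""
    with open(path) as f:
        return f.read().strip()
```

A bisect step would collect several read_dirty_data() samples while the cache
is flushing and mark the kernel bad if is_stuck() returns True.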

James

Thread overview: 28+ messages
2016-05-08 18:39 bcache gets stuck flushing writeback cache when used in combination with LUKS/dm-crypt and non-default bucket size James Johnston
2016-05-08 18:39 ` [dm-crypt] " James Johnston
2016-05-11  1:38 ` Eric Wheeler
2016-05-11  1:38   ` [dm-crypt] " Eric Wheeler
2016-05-15  9:08   ` Tim Small
2016-05-16 13:02     ` Tim Small
2016-05-16 13:02       ` [dm-crypt] " Tim Small
2016-05-16 13:53       ` Tim Small
2016-05-16 13:53         ` [dm-crypt] " Tim Small
2016-05-19 23:15       ` Eric Wheeler
2016-05-19 23:15         ` [dm-crypt] " Eric Wheeler
2016-05-18 17:01   ` [dm-devel] " James Johnston
2016-05-18 17:01     ` [dm-crypt] " James Johnston
2016-05-16 16:08 ` Tim Small
2016-05-16 16:08   ` [dm-crypt] " Tim Small
2016-05-19 23:22   ` Eric Wheeler
2016-05-19 23:22     ` [dm-crypt] " Eric Wheeler
2016-05-20  6:59     ` James Johnston [this message]
2016-05-20  6:59       ` James Johnston
2016-05-20 21:37       ` 'Eric Wheeler'
2016-05-20 21:37         ` [dm-crypt] " 'Eric Wheeler'
2016-05-22  4:26         ` James Johnston
2016-05-22  4:26           ` [dm-crypt] " James Johnston
2016-05-27 14:47           ` [PATCH] dm-crypt: Fix error with too large bios (was: bcache gets stuck flushing writeback cache when used in combination with LUKS/dm-crypt and non-default bucket size) Mikulas Patocka
2016-05-27 14:47             ` [dm-crypt] " Mikulas Patocka
2016-06-01  4:19             ` James Johnston
2016-06-01  4:19               ` [dm-crypt] " James Johnston
2016-05-20 20:22 ` bcache gets stuck flushing writeback cache when used in combination with LUKS/dm-crypt and non-default bucket size Eric Wheeler