* [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
@ 2017-01-23 13:20 colyli
2017-01-23 14:16 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: colyli @ 2017-01-23 13:20 UTC (permalink / raw)
To: stable; +Cc: Kent Overstreet
Hi stable maintainers,
This patch is from Kent, upstream commit ID is be628be09563.
Olav Reinert <seroton10@gmail.com> reports a kerenl crash from
bcache (boo#1021260) and Oliver Nuekum points out this patch fixes the problem.
I send this patch to stable@kernel.vger.org, hope this patch can be taken care
in stable kernels.
Thanks in advance.
Coly Li
Here I attach the original patch, just FYI.
---
From: Kent Overstreet <kent.overstreet@gmail.com>
Date: Wed, 26 Oct 2016 20:31:17 -0700
Subject: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
---
drivers/md/bcache/bcache.h | 4 ++--
drivers/md/bcache/btree.c | 39 ++++++++++++++++++++-------------------
drivers/md/bcache/btree.h | 3 +--
drivers/md/bcache/request.c | 4 +---
drivers/md/bcache/super.c | 2 ++
5 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 6b420a5..c3ea03c 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -425,7 +425,7 @@ struct cache {
* until a gc finishes - otherwise we could pointlessly burn a ton of
* cpu
*/
- unsigned invalidate_needs_gc:1;
+ unsigned invalidate_needs_gc;
bool discard; /* Get rid of? */
@@ -593,8 +593,8 @@ struct cache_set {
/* Counts how many sectors bio_insert has added to the cache */
atomic_t sectors_to_gc;
+ wait_queue_head_t gc_wait;
- wait_queue_head_t moving_gc_wait;
struct keybuf moving_gc_keys;
/* Number of moving GC bios in flight */
struct semaphore moving_in_flight;
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 6fdd8e2..a43eedd 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1757,32 +1757,34 @@ static void bch_btree_gc(struct cache_set *c)
bch_moving_gc(c);
}
-static int bch_gc_thread(void *arg)
+static bool gc_should_run(struct cache_set *c)
{
- struct cache_set *c = arg;
struct cache *ca;
unsigned i;
- while (1) {
-again:
- bch_btree_gc(c);
+ for_each_cache(ca, c, i)
+ if (ca->invalidate_needs_gc)
+ return true;
- set_current_state(TASK_INTERRUPTIBLE);
- if (kthread_should_stop())
- break;
+ if (atomic_read(&c->sectors_to_gc) < 0)
+ return true;
- mutex_lock(&c->bucket_lock);
+ return false;
+}
- for_each_cache(ca, c, i)
- if (ca->invalidate_needs_gc) {
- mutex_unlock(&c->bucket_lock);
- set_current_state(TASK_RUNNING);
- goto again;
- }
+static int bch_gc_thread(void *arg)
+{
+ struct cache_set *c = arg;
- mutex_unlock(&c->bucket_lock);
+ while (1) {
+ wait_event_interruptible(c->gc_wait,
+ kthread_should_stop() || gc_should_run(c));
- schedule();
+ if (kthread_should_stop())
+ break;
+
+ set_gc_sectors(c);
+ bch_btree_gc(c);
}
return 0;
@@ -1790,11 +1792,10 @@ static int bch_gc_thread(void *arg)
int bch_gc_thread_start(struct cache_set *c)
{
- c->gc_thread = kthread_create(bch_gc_thread, c, "bcache_gc");
+ c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc");
if (IS_ERR(c->gc_thread))
return PTR_ERR(c->gc_thread);
- set_task_state(c->gc_thread, TASK_INTERRUPTIBLE);
return 0;
}
diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h
index 5c391fa..9b80417 100644
--- a/drivers/md/bcache/btree.h
+++ b/drivers/md/bcache/btree.h
@@ -260,8 +260,7 @@ void bch_initial_mark_key(struct cache_set *, int, struct bkey *);
static inline void wake_up_gc(struct cache_set *c)
{
- if (c->gc_thread)
- wake_up_process(c->gc_thread);
+ wake_up(&c->gc_wait);
}
#define MAP_DONE 0
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index f49c541..76d2087 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -196,10 +196,8 @@ static void bch_data_insert_start(struct closure *cl)
struct data_insert_op *op = container_of(cl, struct data_insert_op, cl);
struct bio *bio = op->bio, *n;
- if (atomic_sub_return(bio_sectors(bio), &op->c->sectors_to_gc) < 0) {
- set_gc_sectors(op->c);
+ if (atomic_sub_return(bio_sectors(bio), &op->c->sectors_to_gc) < 0)
wake_up_gc(op->c);
- }
if (op->bypass)
return bch_data_invalidate(cl);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 2fb5bfe..b33dd3b 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1489,6 +1489,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
mutex_init(&c->bucket_lock);
init_waitqueue_head(&c->btree_cache_wait);
init_waitqueue_head(&c->bucket_wait);
+ init_waitqueue_head(&c->gc_wait);
sema_init(&c->uuid_write_mutex, 1);
spin_lock_init(&c->btree_gc_time.lock);
@@ -1548,6 +1549,7 @@ static void run_cache_set(struct cache_set *c)
for_each_cache(ca, c, i)
c->nbuckets += ca->sb.nbuckets;
+ set_gc_sectors(c);
if (CACHE_SYNC(&c->sb)) {
LIST_HEAD(journal);
--
2.6.6
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-01-23 13:20 [PATCH] bcache: Make gc wakeup sane, remove set_task_state() colyli
@ 2017-01-23 14:16 ` Greg KH
2017-01-23 14:45 ` Coly Li
0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2017-01-23 14:16 UTC (permalink / raw)
To: colyli; +Cc: stable, Kent Overstreet
On Mon, Jan 23, 2017 at 09:20:12PM +0800, colyli@suse.de wrote:
> Hi stable maintainers,
>
> This patch is from Kent, upstream commit ID is be628be09563.
> Olav Reinert <seroton10@gmail.com> reports a kerenl crash from
> bcache (boo#1021260) and Oliver Nuekum points out this patch fixes the problem.
"boo"?
> I send this patch to stable@kernel.vger.org, hope this patch can be taken care
> in stable kernels.
>
> Thanks in advance.
>
> Coly Li
>
> Here I attach the original patch, just FYI.
> ---
> From: Kent Overstreet <kent.overstreet@gmail.com>
> Date: Wed, 26 Oct 2016 20:31:17 -0700
> Subject: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
>
> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
No changelog text? Worst short changelog description ever?
This gives me no context of what is going on here. Why does this fix a
bug? What kernel(s) should it be backported to?
I need some more help here please.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-01-23 14:16 ` Greg KH
@ 2017-01-23 14:45 ` Coly Li
2017-01-23 14:54 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: Coly Li @ 2017-01-23 14:45 UTC (permalink / raw)
To: Greg KH; +Cc: stable, Kent Overstreet
On 2017/1/23 下午10:16, Greg KH wrote:
> On Mon, Jan 23, 2017 at 09:20:12PM +0800, colyli@suse.de wrote:
>> Hi stable maintainers,
>>
>> This patch is from Kent, upstream commit ID is be628be09563.
>> Olav Reinert <seroton10@gmail.com> reports a kerenl crash from
>> bcache (boo#1021260) and Oliver Nuekum points out this patch fixes the problem.
>
> "boo"?
>
Hi Greg,
"boo" is abbreviation of bugzilla.opensuse.org, I paste the original bug
report here,
==== start of bug report ==========
I have starting seeing errors like the one quoted below in the system
log. It occurs infrequently, but quite regularly, about 1-3 times a
week, on a server running 24x7.
Around the time it began, I started running a beta version of Leap 42.2,
upgraded from 42.1. Also, I enabled the "discard" option (SSD TRIM) on
the bcache cache about 3-6 months ago. I believe one of those two events
caused the bug to appear.
Not sure what other info is useful, please ask for whatever you need.
Oct 10 00:00:02 blackbox kernel: ------------[ cut here ]------------
Oct 10 00:00:02 blackbox kernel: WARNING: CPU: 4 PID: 1269 at
../kernel/sched/core.c:7891 __might_sleep+0x76/0x80()
Oct 10 00:00:02 blackbox kernel: do not call blocking ops when
!TASK_RUNNING; state=1 set at [<ffffffffa09e2325>]
bch_gc_thread+0x25/0x100 [
Oct 10 00:00:02 blackbox kernel: Modules linked in: vhost_net vhost
macvtap macvlan fuse ebt_arp ebt_ip ebtable_nat ebtable_filter ebtables
Oct 10 00:00:02 blackbox kernel: mxm_wmi
Oct 10 00:00:02 blackbox kernel: bcache aesni_intel raid1
snd_hda_codec_realtek aes_x86_64 lrw snd_hda_codec_generic gf128mul
md_mod glue_h
Oct 10 00:00:02 blackbox kernel:
Oct 10 00:00:02 blackbox kernel: CPU: 4 PID: 1269 Comm: bcache_gc Not
tainted 4.4.21-2-default #1
Oct 10 00:00:02 blackbox kernel: Hardware name: To be filled by O.E.M.
To be filled by O.E.M./M5A99X EVO R2.0, BIOS 2301 01/06/2014
Oct 10 00:00:02 blackbox kernel: 0000000000000000 ffffffff81326967
ffff8800b605be10 ffffffff81a5e431
Oct 10 00:00:02 blackbox kernel: ffffffff8107e7d1 ffffffff81a5f54f
ffff8800b605be60 0000000000000061
Oct 10 00:00:02 blackbox kernel: 0000000000000000
Oct 10 00:00:02 blackbox kernel: 0000000000000000 ffffffff8107e84c
ffffffff81a4ef88
Oct 10 00:00:02 blackbox kernel: Call Trace:
Oct 10 00:00:02 blackbox kernel: [<ffffffff81019e69>] dump_trace+0x59/0x320
Oct 10 00:00:02 blackbox kernel: [<ffffffff8101a22a>]
show_stack_log_lvl+0xfa/0x180
Oct 10 00:00:02 blackbox kernel: [<ffffffff8101afd1>] show_stack+0x21/0x40
Oct 10 00:00:02 blackbox kernel: [<ffffffff81326967>] dump_stack+0x5c/0x85
Oct 10 00:00:02 blackbox kernel: [<ffffffff8107e7d1>]
warn_slowpath_common+0x81/0xb0
Oct 10 00:00:02 blackbox kernel: [<ffffffff8107e84c>]
warn_slowpath_fmt+0x4c/0x50
Oct 10 00:00:02 blackbox kernel: [<ffffffff810a3026>]
__might_sleep+0x76/0x80
Oct 10 00:00:02 blackbox kernel: [<ffffffff81605cac>] mutex_lock+0x1c/0x38
Oct 10 00:00:02 blackbox kernel: [<ffffffffa09e2365>]
bch_gc_thread+0x65/0x100 [bcache]
Oct 10 00:00:02 blackbox kernel: [<ffffffff8109d268>] kthread+0xc8/0xe0
Oct 10 00:00:02 blackbox kernel: [<ffffffff8160828f>]
ret_from_fork+0x3f/0x70
Oct 10 00:00:02 blackbox kernel: DWARF2 unwinder stuck at
ret_from_fork+0x3f/0x70
Oct 10 00:00:02 blackbox kernel:
Oct 10 00:00:02 blackbox kernel: Leftover inexact backtrace:
Oct 10 00:00:02 blackbox kernel: [<ffffffff8109d1a0>] ?
kthread_park+0x50/0x50
Oct 10 00:00:02 blackbox kernel: ---[ end trace c63abcb6c473e79b ]---
==== end of bug report ==========
# journalctl|grep "blocking ops"
Oct 10 00:00:02 blackbox kernel: do not call blocking ops when
!TASK_RUNNING; state=1 set at [<ffffffffa09e2325>]
bch_gc_thread+0x25/0x100 [bcache]
[snip repeated lines]
>> I send this patch to stable@kernel.vger.org, hope this patch can be taken care
>> in stable kernels.
>>
>> Thanks in advance.
>>
>> Coly Li
>>
>> Here I attach the original patch, just FYI.
>> ---
>> From: Kent Overstreet <kent.overstreet@gmail.com>
>> Date: Wed, 26 Oct 2016 20:31:17 -0700
>> Subject: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
>>
>> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
>
> No changelog text? Worst short changelog description ever?
There is no change log from original patch, I am not the author, and it
is in upstream already. So I think I am not the right person to change
its commit log.
This is the first time I encounter this situation, that send a patch to
stable which is not from me. I guess Kent does not notice that this
patch indeed fixes a kernel oops. But it does fix a bug report for Leap
42.2 and SLE12-SP2.
>
> This gives me no context of what is going on here. Why does this fix a
> bug? What kernel(s) should it be backported to?
>
The bug is reported on Linux 4.4 based kernel, so at least all kernels
since Linux 4.4 should have the fix. Maybe Kent can provide more
accurate suggestion.
Thanks.
Coly
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-01-23 14:45 ` Coly Li
@ 2017-01-23 14:54 ` Greg KH
2017-02-20 12:31 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2017-01-23 14:54 UTC (permalink / raw)
To: Coly Li; +Cc: stable, Kent Overstreet
On Mon, Jan 23, 2017 at 10:45:47PM +0800, Coly Li wrote:
> On 2017/1/23 下午10:16, Greg KH wrote:
> > On Mon, Jan 23, 2017 at 09:20:12PM +0800, colyli@suse.de wrote:
> >> Hi stable maintainers,
> >>
> >> This patch is from Kent, upstream commit ID is be628be09563.
> >> Olav Reinert <seroton10@gmail.com> reports a kerenl crash from
> >> bcache (boo#1021260) and Oliver Nuekum points out this patch fixes the problem.
> >
> > "boo"?
> >
>
> Hi Greg,
>
>
> "boo" is abbreviation of bugzilla.opensuse.org, I paste the original bug
> report here,
> ==== start of bug report ==========
> I have starting seeing errors like the one quoted below in the system
> log. It occurs infrequently, but quite regularly, about 1-3 times a
> week, on a server running 24x7.
>
> Around the time it began, I started running a beta version of Leap 42.2,
> upgraded from 42.1. Also, I enabled the "discard" option (SSD TRIM) on
> the bcache cache about 3-6 months ago. I believe one of those two events
> caused the bug to appear.
>
> Not sure what other info is useful, please ask for whatever you need.
>
>
> Oct 10 00:00:02 blackbox kernel: ------------[ cut here ]------------
> Oct 10 00:00:02 blackbox kernel: WARNING: CPU: 4 PID: 1269 at
> ../kernel/sched/core.c:7891 __might_sleep+0x76/0x80()
> Oct 10 00:00:02 blackbox kernel: do not call blocking ops when
> !TASK_RUNNING; state=1 set at [<ffffffffa09e2325>]
> bch_gc_thread+0x25/0x100 [
> Oct 10 00:00:02 blackbox kernel: Modules linked in: vhost_net vhost
> macvtap macvlan fuse ebt_arp ebt_ip ebtable_nat ebtable_filter ebtables
> Oct 10 00:00:02 blackbox kernel: mxm_wmi
> Oct 10 00:00:02 blackbox kernel: bcache aesni_intel raid1
> snd_hda_codec_realtek aes_x86_64 lrw snd_hda_codec_generic gf128mul
> md_mod glue_h
> Oct 10 00:00:02 blackbox kernel:
> Oct 10 00:00:02 blackbox kernel: CPU: 4 PID: 1269 Comm: bcache_gc Not
> tainted 4.4.21-2-default #1
> Oct 10 00:00:02 blackbox kernel: Hardware name: To be filled by O.E.M.
> To be filled by O.E.M./M5A99X EVO R2.0, BIOS 2301 01/06/2014
> Oct 10 00:00:02 blackbox kernel: 0000000000000000 ffffffff81326967
> ffff8800b605be10 ffffffff81a5e431
> Oct 10 00:00:02 blackbox kernel: ffffffff8107e7d1 ffffffff81a5f54f
> ffff8800b605be60 0000000000000061
> Oct 10 00:00:02 blackbox kernel: 0000000000000000
> Oct 10 00:00:02 blackbox kernel: 0000000000000000 ffffffff8107e84c
> ffffffff81a4ef88
> Oct 10 00:00:02 blackbox kernel: Call Trace:
> Oct 10 00:00:02 blackbox kernel: [<ffffffff81019e69>] dump_trace+0x59/0x320
> Oct 10 00:00:02 blackbox kernel: [<ffffffff8101a22a>]
> show_stack_log_lvl+0xfa/0x180
> Oct 10 00:00:02 blackbox kernel: [<ffffffff8101afd1>] show_stack+0x21/0x40
> Oct 10 00:00:02 blackbox kernel: [<ffffffff81326967>] dump_stack+0x5c/0x85
> Oct 10 00:00:02 blackbox kernel: [<ffffffff8107e7d1>]
> warn_slowpath_common+0x81/0xb0
> Oct 10 00:00:02 blackbox kernel: [<ffffffff8107e84c>]
> warn_slowpath_fmt+0x4c/0x50
> Oct 10 00:00:02 blackbox kernel: [<ffffffff810a3026>]
> __might_sleep+0x76/0x80
> Oct 10 00:00:02 blackbox kernel: [<ffffffff81605cac>] mutex_lock+0x1c/0x38
> Oct 10 00:00:02 blackbox kernel: [<ffffffffa09e2365>]
> bch_gc_thread+0x65/0x100 [bcache]
> Oct 10 00:00:02 blackbox kernel: [<ffffffff8109d268>] kthread+0xc8/0xe0
> Oct 10 00:00:02 blackbox kernel: [<ffffffff8160828f>]
> ret_from_fork+0x3f/0x70
> Oct 10 00:00:02 blackbox kernel: DWARF2 unwinder stuck at
> ret_from_fork+0x3f/0x70
> Oct 10 00:00:02 blackbox kernel:
> Oct 10 00:00:02 blackbox kernel: Leftover inexact backtrace:
> Oct 10 00:00:02 blackbox kernel: [<ffffffff8109d1a0>] ?
> kthread_park+0x50/0x50
> Oct 10 00:00:02 blackbox kernel: ---[ end trace c63abcb6c473e79b ]---
> ==== end of bug report ==========
>
>
> # journalctl|grep "blocking ops"
> Oct 10 00:00:02 blackbox kernel: do not call blocking ops when
> !TASK_RUNNING; state=1 set at [<ffffffffa09e2325>]
> bch_gc_thread+0x25/0x100 [bcache]
> [snip repeated lines]
>
>
>
> >> I send this patch to stable@kernel.vger.org, hope this patch can be taken care
> >> in stable kernels.
> >>
> >> Thanks in advance.
> >>
> >> Coly Li
> >>
> >> Here I attach the original patch, just FYI.
> >> ---
> >> From: Kent Overstreet <kent.overstreet@gmail.com>
> >> Date: Wed, 26 Oct 2016 20:31:17 -0700
> >> Subject: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
> >>
> >> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
> >
> > No changelog text? Worst short changelog description ever?
>
> There is no change log from original patch, I am not the author, and it
> is in upstream already. So I think I am not the right person to change
> its commit log.
Oh, I didn't mean to complain to you, my complain was to Kent.
Kent, please go read the section, "The canonical patch format" in the
Documentation/SubmittingPatches file for how to do this properly.
> This is the first time I encounter this situation, that send a patch to
> stable which is not from me. I guess Kent does not notice that this
> patch indeed fixes a kernel oops. But it does fix a bug report for Leap
> 42.2 and SLE12-SP2.
> >
> > This gives me no context of what is going on here. Why does this fix a
> > bug? What kernel(s) should it be backported to?
> >
>
> The bug is reported on Linux 4.4 based kernel, so at least all kernels
> since Linux 4.4 should have the fix. Maybe Kent can provide more
> accurate suggestion.
Kent, any hints?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-01-23 14:54 ` Greg KH
@ 2017-02-20 12:31 ` Greg KH
2017-02-20 13:12 ` Kent Overstreet
0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2017-02-20 12:31 UTC (permalink / raw)
To: Coly Li; +Cc: stable, Kent Overstreet
On Mon, Jan 23, 2017 at 03:54:07PM +0100, Greg KH wrote:
> On Mon, Jan 23, 2017 at 10:45:47PM +0800, Coly Li wrote:
> > On 2017/1/23 下午10:16, Greg KH wrote:
> > > On Mon, Jan 23, 2017 at 09:20:12PM +0800, colyli@suse.de wrote:
> > >> Hi stable maintainers,
> > >>
> > >> This patch is from Kent, upstream commit ID is be628be09563.
> > >> Olav Reinert <seroton10@gmail.com> reports a kerenl crash from
> > >> bcache (boo#1021260) and Oliver Nuekum points out this patch fixes the problem.
> > >
> > > "boo"?
> > >
> >
> > Hi Greg,
> >
> >
> > "boo" is abbreviation of bugzilla.opensuse.org, I paste the original bug
> > report here,
> > ==== start of bug report ==========
> > I have starting seeing errors like the one quoted below in the system
> > log. It occurs infrequently, but quite regularly, about 1-3 times a
> > week, on a server running 24x7.
> >
> > Around the time it began, I started running a beta version of Leap 42.2,
> > upgraded from 42.1. Also, I enabled the "discard" option (SSD TRIM) on
> > the bcache cache about 3-6 months ago. I believe one of those two events
> > caused the bug to appear.
> >
> > Not sure what other info is useful, please ask for whatever you need.
> >
> >
> > Oct 10 00:00:02 blackbox kernel: ------------[ cut here ]------------
> > Oct 10 00:00:02 blackbox kernel: WARNING: CPU: 4 PID: 1269 at
> > ../kernel/sched/core.c:7891 __might_sleep+0x76/0x80()
> > Oct 10 00:00:02 blackbox kernel: do not call blocking ops when
> > !TASK_RUNNING; state=1 set at [<ffffffffa09e2325>]
> > bch_gc_thread+0x25/0x100 [
> > Oct 10 00:00:02 blackbox kernel: Modules linked in: vhost_net vhost
> > macvtap macvlan fuse ebt_arp ebt_ip ebtable_nat ebtable_filter ebtables
> > Oct 10 00:00:02 blackbox kernel: mxm_wmi
> > Oct 10 00:00:02 blackbox kernel: bcache aesni_intel raid1
> > snd_hda_codec_realtek aes_x86_64 lrw snd_hda_codec_generic gf128mul
> > md_mod glue_h
> > Oct 10 00:00:02 blackbox kernel:
> > Oct 10 00:00:02 blackbox kernel: CPU: 4 PID: 1269 Comm: bcache_gc Not
> > tainted 4.4.21-2-default #1
> > Oct 10 00:00:02 blackbox kernel: Hardware name: To be filled by O.E.M.
> > To be filled by O.E.M./M5A99X EVO R2.0, BIOS 2301 01/06/2014
> > Oct 10 00:00:02 blackbox kernel: 0000000000000000 ffffffff81326967
> > ffff8800b605be10 ffffffff81a5e431
> > Oct 10 00:00:02 blackbox kernel: ffffffff8107e7d1 ffffffff81a5f54f
> > ffff8800b605be60 0000000000000061
> > Oct 10 00:00:02 blackbox kernel: 0000000000000000
> > Oct 10 00:00:02 blackbox kernel: 0000000000000000 ffffffff8107e84c
> > ffffffff81a4ef88
> > Oct 10 00:00:02 blackbox kernel: Call Trace:
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff81019e69>] dump_trace+0x59/0x320
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff8101a22a>]
> > show_stack_log_lvl+0xfa/0x180
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff8101afd1>] show_stack+0x21/0x40
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff81326967>] dump_stack+0x5c/0x85
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff8107e7d1>]
> > warn_slowpath_common+0x81/0xb0
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff8107e84c>]
> > warn_slowpath_fmt+0x4c/0x50
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff810a3026>]
> > __might_sleep+0x76/0x80
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff81605cac>] mutex_lock+0x1c/0x38
> > Oct 10 00:00:02 blackbox kernel: [<ffffffffa09e2365>]
> > bch_gc_thread+0x65/0x100 [bcache]
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff8109d268>] kthread+0xc8/0xe0
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff8160828f>]
> > ret_from_fork+0x3f/0x70
> > Oct 10 00:00:02 blackbox kernel: DWARF2 unwinder stuck at
> > ret_from_fork+0x3f/0x70
> > Oct 10 00:00:02 blackbox kernel:
> > Oct 10 00:00:02 blackbox kernel: Leftover inexact backtrace:
> > Oct 10 00:00:02 blackbox kernel: [<ffffffff8109d1a0>] ?
> > kthread_park+0x50/0x50
> > Oct 10 00:00:02 blackbox kernel: ---[ end trace c63abcb6c473e79b ]---
> > ==== end of bug report ==========
> >
> >
> > # journalctl|grep "blocking ops"
> > Oct 10 00:00:02 blackbox kernel: do not call blocking ops when
> > !TASK_RUNNING; state=1 set at [<ffffffffa09e2325>]
> > bch_gc_thread+0x25/0x100 [bcache]
> > [snip repeated lines]
> >
> >
> >
> > >> I send this patch to stable@kernel.vger.org, hope this patch can be taken care
> > >> in stable kernels.
> > >>
> > >> Thanks in advance.
> > >>
> > >> Coly Li
> > >>
> > >> Here I attach the original patch, just FYI.
> > >> ---
> > >> From: Kent Overstreet <kent.overstreet@gmail.com>
> > >> Date: Wed, 26 Oct 2016 20:31:17 -0700
> > >> Subject: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
> > >>
> > >> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
> > >
> > > No changelog text? Worst short changelog description ever?
> >
> > There is no change log from original patch, I am not the author, and it
> > is in upstream already. So I think I am not the right person to change
> > its commit log.
>
> Oh, I didn't mean to complain to you, my complain was to Kent.
>
> Kent, please go read the section, "The canonical patch format" in the
> Documentation/SubmittingPatches file for how to do this properly.
>
> > This is the first time I encounter this situation, that send a patch to
> > stable which is not from me. I guess Kent does not notice that this
> > patch indeed fixes a kernel oops. But it does fix a bug report for Leap
> > 42.2 and SLE12-SP2.
> > >
> > > This gives me no context of what is going on here. Why does this fix a
> > > bug? What kernel(s) should it be backported to?
> > >
> >
> > The bug is reported on Linux 4.4 based kernel, so at least all kernels
> > since Linux 4.4 should have the fix. Maybe Kent can provide more
> > accurate suggestion.
>
> Kent, any hints?
Without a response from the maintainer, I can't apply this...
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-02-20 12:31 ` Greg KH
@ 2017-02-20 13:12 ` Kent Overstreet
2017-02-20 14:06 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2017-02-20 13:12 UTC (permalink / raw)
To: Greg KH; +Cc: Coly Li, stable
On Mon, Feb 20, 2017 at 01:31:37PM +0100, Greg KH wrote:
> On Mon, Jan 23, 2017 at 03:54:07PM +0100, Greg KH wrote:
> > Kent, any hints?
>
> Without a response from the maintainer, I can't apply this...
>
> greg k-h
Sorry I missed this - yes, this patch should be safe to apply and it does fix
that crash. The relevant code hasn't been changed in ages, 4.4 is definitely
fine.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-02-20 13:12 ` Kent Overstreet
@ 2017-02-20 14:06 ` Greg KH
2017-02-20 14:36 ` Coly Li
0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2017-02-20 14:06 UTC (permalink / raw)
To: Kent Overstreet; +Cc: Coly Li, stable
On Mon, Feb 20, 2017 at 04:12:58AM -0900, Kent Overstreet wrote:
> On Mon, Feb 20, 2017 at 01:31:37PM +0100, Greg KH wrote:
> > On Mon, Jan 23, 2017 at 03:54:07PM +0100, Greg KH wrote:
> > > Kent, any hints?
> >
> > Without a response from the maintainer, I can't apply this...
> >
> > greg k-h
>
> Sorry I missed this - yes, this patch should be safe to apply and it does fix
> that crash. The relevant code hasn't been changed in ages, 4.4 is definitely
> fine.
Ok, I've queued this up for 4.9, but for 4.4 it does not apply. Coly,
can you provide a working backport for 4.4-stable?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-02-20 14:06 ` Greg KH
@ 2017-02-20 14:36 ` Coly Li
2017-02-20 15:19 ` Greg KH
0 siblings, 1 reply; 9+ messages in thread
From: Coly Li @ 2017-02-20 14:36 UTC (permalink / raw)
To: Greg KH; +Cc: Kent Overstreet, stable
[-- Attachment #1: Type: text/plain, Size: 932 bytes --]
On 2017/2/20 下午10:06, Greg KH wrote:
> On Mon, Feb 20, 2017 at 04:12:58AM -0900, Kent Overstreet wrote:
>> On Mon, Feb 20, 2017 at 01:31:37PM +0100, Greg KH wrote:
>>> On Mon, Jan 23, 2017 at 03:54:07PM +0100, Greg KH wrote:
>>>> Kent, any hints?
>>>
>>> Without a response from the maintainer, I can't apply this...
>>>
>>> greg k-h
>>
>> Sorry I missed this - yes, this patch should be safe to apply and it does fix
>> that crash. The relevant code hasn't been changed in ages, 4.4 is definitely
>> fine.
>
> Ok, I've queued this up for 4.9, but for 4.4 it does not apply. Coly,
> can you provide a working backport for 4.4-stable?
Greg,
It is because the 'commit 29e6c57cc78e ("bcache: bch_gc_thread() is not
freezable")' remove a "try_to_freeze()" in bch_gc_thread(), which
happens in v4.7.
I just rebase Kent's fix to v4.4 kernel, solve the conflict. Could you
please check and try the attached patch ?
Thanks.
Coly
[-- Attachment #2: 0001-bcache-Make-gc-wakeup-sane-remove-set_task_state.patch --]
[-- Type: text/plain, Size: 4404 bytes --]
Subject: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
---
drivers/md/bcache/bcache.h | 4 ++--
drivers/md/bcache/btree.c | 40 ++++++++++++++++++++--------------------
drivers/md/bcache/btree.h | 3 +--
drivers/md/bcache/request.c | 4 +---
drivers/md/bcache/super.c | 2 ++
5 files changed, 26 insertions(+), 27 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 6b420a5..c3ea03c 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -425,7 +425,7 @@ struct cache {
* until a gc finishes - otherwise we could pointlessly burn a ton of
* cpu
*/
- unsigned invalidate_needs_gc:1;
+ unsigned invalidate_needs_gc;
bool discard; /* Get rid of? */
@@ -593,8 +593,8 @@ struct cache_set {
/* Counts how many sectors bio_insert has added to the cache */
atomic_t sectors_to_gc;
+ wait_queue_head_t gc_wait;
- wait_queue_head_t moving_gc_wait;
struct keybuf moving_gc_keys;
/* Number of moving GC bios in flight */
struct semaphore moving_in_flight;
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 83392f8..b5eccb5 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1761,33 +1761,34 @@ static void bch_btree_gc(struct cache_set *c)
bch_moving_gc(c);
}
-static int bch_gc_thread(void *arg)
+static bool gc_should_run(struct cache_set *c)
{
- struct cache_set *c = arg;
struct cache *ca;
unsigned i;
- while (1) {
-again:
- bch_btree_gc(c);
+ for_each_cache(ca, c, i)
+ if (ca->invalidate_needs_gc)
+ return true;
- set_current_state(TASK_INTERRUPTIBLE);
- if (kthread_should_stop())
- break;
+ if (atomic_read(&c->sectors_to_gc) < 0)
+ return true;
- mutex_lock(&c->bucket_lock);
+ return false;
+}
- for_each_cache(ca, c, i)
- if (ca->invalidate_needs_gc) {
- mutex_unlock(&c->bucket_lock);
- set_current_state(TASK_RUNNING);
- goto again;
- }
+static int bch_gc_thread(void *arg)
+{
+ struct cache_set *c = arg;
- mutex_unlock(&c->bucket_lock);
+ while (1) {
+ wait_event_interruptible(c->gc_wait,
+ kthread_should_stop() || gc_should_run(c));
- try_to_freeze();
- schedule();
+ if (kthread_should_stop())
+ break;
+
+ set_gc_sectors(c);
+ bch_btree_gc(c);
}
return 0;
@@ -1795,11 +1796,10 @@ again:
int bch_gc_thread_start(struct cache_set *c)
{
- c->gc_thread = kthread_create(bch_gc_thread, c, "bcache_gc");
+ c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc");
if (IS_ERR(c->gc_thread))
return PTR_ERR(c->gc_thread);
- set_task_state(c->gc_thread, TASK_INTERRUPTIBLE);
return 0;
}
diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h
index 5c391fa..9b80417 100644
--- a/drivers/md/bcache/btree.h
+++ b/drivers/md/bcache/btree.h
@@ -260,8 +260,7 @@ void bch_initial_mark_key(struct cache_set *, int, struct bkey *);
static inline void wake_up_gc(struct cache_set *c)
{
- if (c->gc_thread)
- wake_up_process(c->gc_thread);
+ wake_up(&c->gc_wait);
}
#define MAP_DONE 0
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 25fa844..2410df1 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -196,10 +196,8 @@ static void bch_data_insert_start(struct closure *cl)
struct data_insert_op *op = container_of(cl, struct data_insert_op, cl);
struct bio *bio = op->bio, *n;
- if (atomic_sub_return(bio_sectors(bio), &op->c->sectors_to_gc) < 0) {
- set_gc_sectors(op->c);
+ if (atomic_sub_return(bio_sectors(bio), &op->c->sectors_to_gc) < 0)
wake_up_gc(op->c);
- }
if (op->bypass)
return bch_data_invalidate(cl);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 679a093..81fef23 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1474,6 +1474,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
mutex_init(&c->bucket_lock);
init_waitqueue_head(&c->btree_cache_wait);
init_waitqueue_head(&c->bucket_wait);
+ init_waitqueue_head(&c->gc_wait);
sema_init(&c->uuid_write_mutex, 1);
spin_lock_init(&c->btree_gc_time.lock);
@@ -1532,6 +1533,7 @@ static void run_cache_set(struct cache_set *c)
for_each_cache(ca, c, i)
c->nbuckets += ca->sb.nbuckets;
+ set_gc_sectors(c);
if (CACHE_SYNC(&c->sb)) {
LIST_HEAD(journal);
--
2.10.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] bcache: Make gc wakeup sane, remove set_task_state()
2017-02-20 14:36 ` Coly Li
@ 2017-02-20 15:19 ` Greg KH
0 siblings, 0 replies; 9+ messages in thread
From: Greg KH @ 2017-02-20 15:19 UTC (permalink / raw)
To: Coly Li; +Cc: Kent Overstreet, stable
On Mon, Feb 20, 2017 at 10:36:13PM +0800, Coly Li wrote:
> On 2017/2/20 下午10:06, Greg KH wrote:
> > On Mon, Feb 20, 2017 at 04:12:58AM -0900, Kent Overstreet wrote:
> >> On Mon, Feb 20, 2017 at 01:31:37PM +0100, Greg KH wrote:
> >>> On Mon, Jan 23, 2017 at 03:54:07PM +0100, Greg KH wrote:
> >>>> Kent, any hints?
> >>>
> >>> Without a response from the maintainer, I can't apply this...
> >>>
> >>> greg k-h
> >>
> >> Sorry I missed this - yes, this patch should be safe to apply and it does fix
> >> that crash. The relevant code hasn't been changed in ages, 4.4 is definitely
> >> fine.
> >
> > Ok, I've queued this up for 4.9, but for 4.4 it does not apply. Coly,
> > can you provide a working backport for 4.4-stable?
>
> Greg,
>
> It is because the 'commit 29e6c57cc78e ("bcache: bch_gc_thread() is not
> freezable")' remove a "try_to_freeze()" in bch_gc_thread(), which
> happens in v4.7.
>
> I just rebase Kent's fix to v4.4 kernel, solve the conflict. Could you
> please check and try the attached patch ?
That worked, thanks!
greg k-h
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-02-20 15:20 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-23 13:20 [PATCH] bcache: Make gc wakeup sane, remove set_task_state() colyli
2017-01-23 14:16 ` Greg KH
2017-01-23 14:45 ` Coly Li
2017-01-23 14:54 ` Greg KH
2017-02-20 12:31 ` Greg KH
2017-02-20 13:12 ` Kent Overstreet
2017-02-20 14:06 ` Greg KH
2017-02-20 14:36 ` Coly Li
2017-02-20 15:19 ` Greg KH
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.