linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@gmail.com>
To: Junhui Tang <tang.junhui.linux@gmail.com>
Cc: colyli@suse.de, linux-bcache@vger.kernel.org,
	linux-block@vger.kernel.org
Subject: Re: [PATCH] bcache: treat stale && dirty keys as bad keys
Date: Tue, 18 Dec 2018 09:01:57 -0500	[thread overview]
Message-ID: <20181218140157.GB7144@kmo-pixel> (raw)
In-Reply-To: <CA+NNRhE8438XuoZ25kdCRjYvZv6nHJU+Yoyg6hacWuptjWWDfA@mail.gmail.com>

On Tue, Dec 18, 2018 at 02:37:14PM +0800, Junhui Tang wrote:
> From 8ca52771f1d3b83ff55dc80ba633901a081963bf Mon Sep 17 00:00:00 2001
> From: Tang Junhui <tang.junhui.linux@gmail.com>
> Date: Tue, 18 Dec 2018 10:38:26 +0800
> Subject: [PATCH] bcache: treat stale && dirty keys as bad keys
> 
> Stale && dirty keys can be produced in the follow way:
> After writeback in write_dirty_finish(), dirty keys k1 will
> replace by clean keys k2
> ==>ret = bch_btree_insert(dc->disk.c, &keys, NULL, &w->key);
> ==>btree_insert_fn(struct btree_op *b_op, struct btree *b)
> ==>static int bch_btree_insert_node(struct btree *b,
>        struct btree_op *op,
>        struct keylist *insert_keys,
>        atomic_t *journal_ref,
> Then two steps:
> A) update k1 to k2 in btree node memory;
>    bch_btree_insert_keys(b, op, insert_keys, replace_key)
> B) Write the bset(contains k2) to cache disk by a 30s delay work
>    bch_btree_leaf_dirty(b, journal_ref).
> But before the 30s delay work write the bset to cache device,
> these things happend:
> A) GC works, and reclaim the bucket k2 point to;
> B) Allocator works, and invalidate the bucket k2 point to,
>    and increase the gen of the bucket, and place it into free_inc
>    fifo;
> C) Until now, the 30s delay work still does not finish work,
>    so in the disk, the key still is k1, it is dirty and stale
>    (its gen is smaller than the gen of the bucket). and then the
>    machine power off suddenly happens;
> D) When the machine power on again, after the btree reconstruction,
>    the stale dirty key appear.

Only prior to journal replay, right? Or did you uncover something more severe?

> In bch_extent_bad(), when expensive_debug_checks is off, it would
> treat the dirty key as good even it is stale keys, and it would
> cause bellow probelms:
> A) In read_dirty() it would cause machine crash:
>    BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
> B) It could be worse when reads hits stale dirty keys, it would
>    read old incorrect data.

Neither of these can happen until after journal replay is finished. Prior to
journal replay we expect to find stale dirty keys - if we find any after journal
replay then it's indicative of a real bug.

> 
> This patch tolerate the existence of these stale && dirty keys,
> and treat them as bad key in bch_extent_bad().
> 
> Signed-off-by: Tang Junhui <tang.junhui.linux@gmail.com>
> ---
>  drivers/md/bcache/extents.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/bcache/extents.c b/drivers/md/bcache/extents.c
> index 1d09674..f9eb03c 100644
> --- a/drivers/md/bcache/extents.c
> +++ b/drivers/md/bcache/extents.c
> @@ -535,6 +535,7 @@ static bool bch_extent_bad(struct btree_keys *bk,
> const struct bkey *k)
>  {
>   struct btree *b = container_of(bk, struct btree, keys);
>   unsigned i, stale;
> + char buf[80];
> 
>   if (!KEY_PTRS(k) ||
>       bch_extent_invalid(bk, k))
> @@ -544,19 +545,20 @@ static bool bch_extent_bad(struct btree_keys
> *bk, const struct bkey *k)
>   if (!ptr_available(b->c, k, i))
>   return true;
> 
> - if (!expensive_debug_checks(b->c) && KEY_DIRTY(k))
> - return false;
> -
>   for (i = 0; i < KEY_PTRS(k); i++) {
>   stale = ptr_stale(b->c, k, i);
> 
> + if (stale && KEY_DIRTY(k)) {
> + bch_extent_to_text(buf, sizeof(buf), k);
> + pr_info("stale dirty pointer, stale %u, key: %s",
> +     stale,
> +     buf);
> + }
> +
>   btree_bug_on(stale > 96, b,
>        "key too stale: %i, need_gc %u",
>        stale, b->c->need_gc);
> 
> - btree_bug_on(stale && KEY_DIRTY(k) && KEY_SIZE(k),
> -      b, "stale dirty pointer");
> -
>   if (stale)
>   return true;
> 
> -- 
> 1.8.3.1

  reply	other threads:[~2018-12-18 14:02 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-18  6:37 [PATCH] bcache: treat stale && dirty keys as bad keys Junhui Tang
2018-12-18 14:01 ` Kent Overstreet [this message]
2018-12-19  1:30   ` Junhui Tang
2018-12-19  1:32   ` Junhui Tang
2018-12-19 11:32     ` Kent Overstreet
2018-12-20  8:40       ` Junhui Tang
2018-12-22 13:02 ` Coly Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181218140157.GB7144@kmo-pixel \
    --to=kent.overstreet@gmail.com \
    --cc=colyli@suse.de \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=tang.junhui.linux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).