From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Hua Rui <huarui.dev@gmail.com>,
	Michael Lyle <mlyle@lyle.org>, Coly Li <colyli@suse.de>,
	Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 3.18 02/26] bcache: recover data from backing when data is clean
Date: Thu,  7 Dec 2017 13:48:15 +0100
Message-ID: <20171207124654.984666061@linuxfoundation.org>
In-Reply-To: <20171207124654.669583826@linuxfoundation.org>

3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Rui Hua <huarui.dev@gmail.com>

commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.

When a read request hits clean data in the cache device, a situation
called a cache read race can occur in bcache (see the comment at the tail
of cache_look_up(); the following explanation is copied from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s->iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere)
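The check described above can be modelled roughly as follows. This is a minimal sketch, not the bcache code: struct bucket_model, struct ptr_model, ptr_is_stale() and read_endio_check() are hypothetical names, and it assumes that staleness is detected by comparing a generation number stored in the pointer with the bucket's current generation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a bucket's generation is bumped when it is reused. */
struct bucket_model {
	uint8_t gen;
};

/* Hypothetical model: a pointer remembers the generation it was created under. */
struct ptr_model {
	const struct bucket_model *bucket;
	uint8_t gen;
};

/* The pointer is stale once the bucket has moved on to another generation. */
static bool ptr_is_stale(const struct ptr_model *p)
{
	return p->gen != p->bucket->gen;
}

/*
 * Completion-time check: a real I/O error is kept as-is; otherwise a
 * stale pointer is reported as -EINTR so the caller can reread from
 * the backing device.
 */
static int read_endio_check(const struct ptr_model *p, int io_error)
{
	if (io_error)
		return io_error;
	return ptr_is_stale(p) ? -EINTR : 0;
}

/* Usage: the bucket is reused while the read is in flight. */
static void demo_read_race(void)
{
	struct bucket_model b = { .gen = 3 };
	struct ptr_model p = { .bucket = &b, .gen = 3 };

	assert(read_endio_check(&p, 0) == 0);
	b.gen++;	/* bucket reused: the pointer is now stale */
	assert(read_endio_check(&p, 0) == -EINTR);
}
```

The point of the sketch is only that the read itself succeeds at the device level; the -EINTR is synthesized at completion time because the data can no longer be trusted.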

It should be noted that a cache read race happens under normal
circumstances, not only when the SSD has failed; it is counted
and shown in /sys/fs/bcache/XXX/internal/cache_read_races.

Without this patch, in writeback mode we never reread from the backing
device when a cache read race happens until the whole cache device is
clean, because the condition
(s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in
cached_dev_read_error(). In this situation s->iop.error (= -EINTR) is
passed up, and the user eventually receives -EINTR when the bio
completes. This is not appropriate and looks weird to the upper-layer
application.

This patch uses s->read_dirty_data to determine whether the read
request hit dirty data in the cache device; it is safe to reread data
from the backing device when the read request hit clean data. This not
only handles cache read races but also recovers data when a read
request to the cache device fails.
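The difference between the two conditions can be illustrated with a small standalone model. This is a sketch under stated assumptions, not kernel code: struct search_model and the helper names are hypothetical, cache_has_dirty stands in for atomic_read(&dc->has_dirty) being non-zero, and the reread from the backing device is assumed to succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state consulted by cached_dev_read_error(). */
struct search_model {
	bool recoverable;	/* models s->recoverable */
	bool read_dirty_data;	/* models s->read_dirty_data */
	bool cache_has_dirty;	/* models atomic_read(&dc->has_dirty) != 0 */
};

/* Old condition: retry only when the whole cache device is clean. */
static bool may_retry_before(const struct search_model *s)
{
	return s->recoverable && !s->cache_has_dirty;
}

/* New condition: retry whenever this request did not hit dirty data. */
static bool may_retry_after(const struct search_model *s)
{
	return s->recoverable && !s->read_dirty_data;
}

/*
 * Error seen by the submitter after a cache read race, assuming the
 * reread from the backing device (if permitted) succeeds.
 */
static int error_seen_by_caller(bool may_retry)
{
	return may_retry ? 0 : -EINTR;
}

/* Usage: writeback mode, cache holds dirty data, but this read hit clean data. */
static void demo_writeback_clean_hit(void)
{
	struct search_model s = {
		.recoverable = true,
		.read_dirty_data = false,
		.cache_has_dirty = true,
	};

	assert(error_seen_by_caller(may_retry_before(&s)) == -EINTR);
	assert(error_seen_by_caller(may_retry_after(&s)) == 0);
}
```

When the request did hit dirty data, both conditions in this model refuse the retry, which matches the intent that stale data must not be returned from the backing device.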

[edited by mlyle to fix up whitespace, commit log title, comment
spelling]

Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Signed-off-by: Hua Rui <huarui.dev@gmail.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/bcache/request.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -705,16 +705,15 @@ static void cached_dev_read_error(struct
 {
 	struct search *s = container_of(cl, struct search, cl);
 	struct bio *bio = &s->bio.bio;
-	struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
 
 	/*
-	 * If cache device is dirty (dc->has_dirty is non-zero), then
-	 * recovery a failed read request from cached device may get a
-	 * stale data back. So read failure recovery is only permitted
-	 * when cache device is clean.
+	 * If read request hit dirty data (s->read_dirty_data is true),
+	 * then recovery a failed read request from cached device may
+	 * get a stale data back. So read failure recovery is only
+	 * permitted when read request hit clean data in cache device,
+	 * or when cache read race happened.
 	 */
-	if (s->recoverable &&
-	    (dc && !atomic_read(&dc->has_dirty))) {
+	if (s->recoverable && !s->read_dirty_data) {
 		/* Retry from the backing device: */
 		trace_bcache_read_retry(s->orig_bio);
 
