From: Greg Kroah-Hartman <greg@kroah.com>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Alex Lyakas <alex@zadarastorage.com>,
	NeilBrown <neilb@suse.de>
Subject: [ 25/48] md/raid1: consider WRITE as successful only if at least one non-Faulty and non-rebuilding drive completed it.
Date: Tue, 18 Jun 2013 09:17:51 -0700
Message-ID: <20130618161729.541053291@linuxfoundation.org>
In-Reply-To: <20130618161725.912524266@linuxfoundation.org>

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Lyakas <alex@zadarastorage.com>

commit 3056e3aec8d8ba61a0710fb78b2d562600aa2ea7 upstream.

Without this fix, the following scenario could happen:

- RAID1 with drives A and B; drive B was freshly added and is rebuilding
- Drive A fails
- A WRITE request arrives at the array. It fails on drive A, so the
r1_bio is marked R1BIO_WriteError, but the rebuilding drive B
completes the write, so the same r1_bio is also marked
R1BIO_Uptodate.
- The r1_bio reaches handle_write_finished(); badblocks are disabled,
and md_error()->error() does nothing because we don't fail the last
drive of a raid1
- raid_end_bio_io() calls call_bio_endio()
- As a result, in call_bio_endio():
        if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
                clear_bit(BIO_UPTODATE, &bio->bi_flags);
this code doesn't clear the BIO_UPTODATE flag, and the whole master
WRITE is reported to the upper layer as successful.

So we return success to the upper layer even though the data was
written only to the rebuilding drive. A later read of that data will
not be served from the rebuilding drive, so the data is lost.
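
The scenario above can be replayed in a small user-space model. This is
only a sketch: the model_* structs and functions are hypothetical
stand-ins invented for illustration, not the kernel's data structures,
and the 'fixed' parameter merely selects between the old and the patched
rule for setting the "uptodate" bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the md structures (not kernel code). */
struct model_rdev {
	bool in_sync;	/* models the In_sync flag: device holds a full copy */
	bool faulty;	/* models the Faulty flag */
};

struct model_r1bio {
	bool uptodate;		/* models R1BIO_Uptodate */
	bool write_error;	/* models R1BIO_WriteError */
};

/* Models one per-device write completion; 'fixed' selects the patched rule. */
static void model_end_write(struct model_r1bio *r1,
			    const struct model_rdev *rdev,
			    bool io_ok, bool fixed)
{
	assert(r1 && rdev);
	if (!io_ok) {
		r1->write_error = true;
		return;
	}
	/* Old rule: any successful write counts. Patched rule: only a
	 * write to an in-sync, non-faulty device counts. */
	if (!fixed || (rdev->in_sync && !rdev->faulty))
		r1->uptodate = true;
}

/* Models the check in call_bio_endio(): success only if uptodate is set. */
static bool model_master_write_ok(const struct model_r1bio *r1)
{
	return r1->uptodate;
}

/* Replays the scenario: A fails the write, rebuilding B completes it. */
static bool model_scenario(bool fixed)
{
	/* A is the last in-sync drive, so it is not marked faulty. */
	struct model_rdev a = { .in_sync = true, .faulty = false };
	/* B is still rebuilding, so it is not in sync. */
	struct model_rdev b = { .in_sync = false, .faulty = false };
	struct model_r1bio r1 = { .uptodate = false, .write_error = false };

	model_end_write(&r1, &a, false, fixed);
	model_end_write(&r1, &b, true, fixed);
	return model_master_write_ok(&r1);
}
```

In this model, model_scenario(false) reports the master write as
successful even though only the rebuilding drive holds the data, while
model_scenario(true) reports it as failed.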

[neilb - applied identical change to raid10 as well]

This bug can result in lost data, so it is suitable for any
-stable kernel.

Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/raid1.c  |   12 +++++++++++-
 drivers/md/raid10.c |   12 +++++++++++-
 2 files changed, 22 insertions(+), 2 deletions(-)

--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -427,7 +427,17 @@ static void raid1_end_write_request(stru
 
 		r1_bio->bios[mirror] = NULL;
 		to_put = bio;
-		set_bit(R1BIO_Uptodate, &r1_bio->state);
+		/*
+		 * Do not set R1BIO_Uptodate if the current device is
+		 * rebuilding or Faulty. This is because we cannot use
+		 * such device for properly reading the data back (we could
+		 * potentially use it, if the current write would have felt
+		 * before rdev->recovery_offset, but for simplicity we don't
+		 * check this here.
+		 */
+		if (test_bit(In_sync, &conf->mirrors[mirror].rdev->flags) &&
+		    !test_bit(Faulty, &conf->mirrors[mirror].rdev->flags))
+			set_bit(R1BIO_Uptodate, &r1_bio->state);
 
 		/* Maybe we can clear some bad blocks. */
 		if (is_badblock(conf->mirrors[mirror].rdev,
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -490,7 +490,17 @@ static void raid10_end_write_request(str
 		sector_t first_bad;
 		int bad_sectors;
 
-		set_bit(R10BIO_Uptodate, &r10_bio->state);
+		/*
+		 * Do not set R10BIO_Uptodate if the current device is
+		 * rebuilding or Faulty. This is because we cannot use
+		 * such device for properly reading the data back (we could
+		 * potentially use it, if the current write would have felt
+		 * before rdev->recovery_offset, but for simplicity we don't
+		 * check this here.
+		 */
+		if (test_bit(In_sync, &rdev->flags) &&
+		    !test_bit(Faulty, &rdev->flags))
+			set_bit(R10BIO_Uptodate, &r10_bio->state);
 
 		/* Maybe we can clear some bad blocks. */
 		if (is_badblock(rdev,
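
The comment added above notes that a write to a rebuilding device could
still count if it lies before rdev->recovery_offset, and that the patch
skips this check for simplicity. A minimal sketch of what such a check
might look like, as a user-space model only: the model_* names are
hypothetical, and the assumption that recovery_offset is the first
sector not yet rebuilt is stated here for illustration rather than taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_sector_t;	/* hypothetical stand-in for sector_t */

/* Hypothetical model of a member device that may still be rebuilding. */
struct model_rebuilding_rdev {
	bool in_sync;			/* models In_sync */
	bool faulty;			/* models Faulty */
	model_sector_t recovery_offset;	/* assumed: first sector not yet rebuilt */
};

/*
 * Returns true if a completed write of nr_sectors starting at sector
 * could later be read back from this device: the device is not faulty
 * and is either fully in sync or the write ends at or before
 * recovery_offset.
 */
static bool model_write_is_readable(const struct model_rebuilding_rdev *rdev,
				    model_sector_t sector,
				    model_sector_t nr_sectors)
{
	assert(rdev);
	if (rdev->faulty)
		return false;
	if (rdev->in_sync)
		return true;
	/* Written this way to avoid overflow in sector + nr_sectors. */
	return nr_sectors <= rdev->recovery_offset &&
	       sector <= rdev->recovery_offset - nr_sectors;
}
```

The patch itself uses the stricter rule (In_sync and not Faulty), which
never counts a write to a rebuilding device, whatever its position
relative to recovery_offset.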


