From mboxrd@z Thu Jan 1 00:00:00 1970
From: NeilBrown
Subject: Re: raid1 repair does not repair errors?
Date: Mon, 3 Feb 2014 15:36:44 +1100
Message-ID: <20140203153644.4c530672@notabene.brown>
References: <52EE3910.3040205@msgid.tls.msk.ru>
	<20140203120431.400a8a1b@notabene.brown>
In-Reply-To: <20140203120431.400a8a1b@notabene.brown>
To: Michael Tokarev
Cc: linux-raid

On Mon, 3 Feb 2014 12:04:31 +1100 NeilBrown wrote:

> On Sun, 02 Feb 2014 16:24:48 +0400 Michael Tokarev wrote:
>
> > Hello.
> >
> > This is a followup to a somewhat old thread --
> > http://thread.gmane.org/gmane.linux.raid/44503
> > -- with the required details.
> >
> > The initial problem was that with a raid1 array on a few drives,
> > one of them having a bad sector, running the `repair' action does
> > not actually fix the error; it looks like the raid1 code does not
> > see the error.
> >
> > This is a production host, so it is very difficult to experiment.
> > When I initially hit this issue there, I tried various ways to fix
> > it; one of them was to remove the bad drive from the array and add
> > it back.  That forced all sectors to be re-written, and the problem
> > went away.
> >
> > Now the same issue has happened again - another drive developed a
> > bad sector, and again the md `repair' action does not fix it.
> >
> > So I added some debugging as requested in the original thread, and
> > re-ran the `repair' action.
> >
> > Here are the changes I added to the 3.10 raid1.c file:
> >
> > --- ../linux-3.10/drivers/md/raid1.c	2014-02-02 16:01:55.003119836 +0400
> > +++ drivers/md/raid1.c	2014-02-02 16:07:37.913808336 +0400
> > @@ -1636,6 +1636,8 @@ static void end_sync_read(struct bio *bi
> >  	 */
> >  	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
> >  		set_bit(R1BIO_Uptodate, &r1_bio->state);
> > +else
> > +printk("end_sync_read: !BIO_UPTODATE\n");
> >
> >  	if (atomic_dec_and_test(&r1_bio->remaining))
> >  		reschedule_retry(r1_bio);
> > @@ -1749,6 +1751,7 @@ static int fix_sync_read_error(struct r1
> >  		 * active, and resync is currently active
> >  		 */
> >  		rdev = conf->mirrors[d].rdev;
> > +printk("fix_sync_read_error: calling sync_page_io(%Li, %Li<<9)\n", (uint64_t)sect, (uint64_t)s);
> >  		if (sync_page_io(rdev, sect, s<<9,
> >  				 bio->bi_io_vec[idx].bv_page,
> >  				 READ, false)) {
> > @@ -1931,10 +1934,12 @@ static void sync_request_write(struct md
> >
> >  	bio = r1_bio->bios[r1_bio->read_disk];
> >
> > -	if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
> > +	if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) {
> >  		/* ouch - failed to read all of that. */
> > +printk("sync_request_write: !R1BIO_Uptodate\n");
> >  		if (!fix_sync_read_error(r1_bio))
> >  			return;
> > +	}
> >
> >  	if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
> >  		if (process_checks(r1_bio) < 0)
> >
> >
> > And here is the whole dmesg result from the repair run:
> >
> > [   74.288227] md: requested-resync of RAID array md1
> > [   74.288719] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
> > [   74.289329] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for requested-resync.
> > [   74.290404] md: using 128k window, over a total of 2096064k.
> > [  179.213754] ata6.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
> > [  179.214500] ata6.00: irq_stat 0x40000008
> > [  179.214909] ata6.00: failed command: READ FPDMA QUEUED
> > [  179.215452] ata6.00: cmd 60/80:00:00:3e:3e/00:00:00:00:00/40 tag 0 ncq 65536 in
> > [  179.215452]          res 41/40:00:23:3e:3e/00:00:00:00:00/40 Emask 0x409 (media error)
> > [  179.217087] ata6.00: status: { DRDY ERR }
> > [  179.217500] ata6.00: error: { UNC }
> > [  179.220185] ata6.00: configured for UDMA/133
> > [  179.220656] sd 5:0:0:0: [sdd] Unhandled sense code
> > [  179.221149] sd 5:0:0:0: [sdd]
> > [  179.221476] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > [  179.222062] sd 5:0:0:0: [sdd]
> > [  179.222398] Sense Key : Medium Error [current] [descriptor]
> > [  179.223034] Descriptor sense data with sense descriptors (in hex):
> > [  179.223704]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
> > [  179.224726]         00 3e 3e 23
> > [  179.225169] sd 5:0:0:0: [sdd]
> > [  179.225494] Add. Sense: Unrecovered read error - auto reallocate failed
> > [  179.226215] sd 5:0:0:0: [sdd] CDB:
> > [  179.226577] Read(10): 28 00 00 3e 3e 00 00 00 80 00
> > [  179.227344] end_request: I/O error, dev sdd, sector 4079139
> > [  179.227926] end_sync_read: !BIO_UPTODATE
> > [  179.228359] ata6: EH complete
> > [  181.926457] md: md1: requested-resync done.
> >
> >
> > So the only one of my added printks that fires is "end_sync_read: !BIO_UPTODATE",
> > and it looks like the rewriting code is never called.
> >
> >
> > This is the root array of a production machine, so I can reboot it only
> > on late evenings or at weekends.  But this time I finally want to fix the
> > bug for real, so I will not try to fix the erroneous drive until we are
> > able to fix the code.  Just one thing: the fixing process might be a bit
> > slow.
> >
> > Thanks,
> >
> > /mjt
>
>
> Hi,
>  thanks for the testing and report.  I see what the problem is now.
>
> We only try to fix a read error when all reads failed, rather than when any
> read fails.
> Most of the time there is only one read, so this makes no difference.
> However for 'check' and 'repair' we read all devices, so the current code will
> only try to repair a read error if *every* device failed, which of course
> would be pointless.
>
> This patch (against v3.10) should fix it.  It only leaves R1BIO_Uptodate set
> if no failure is seen.
>
> NeilBrown
>
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 6e17f8181c4b..ba38ef6c612b 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1633,8 +1633,8 @@ static void end_sync_read(struct bio *bio, int error)
>  	 * or re-read if the read failed.
>  	 * We don't do much here, just schedule handling by raid1d
>  	 */
> -	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
> -		set_bit(R1BIO_Uptodate, &r1_bio->state);
> +	if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
> +		clear_bit(R1BIO_Uptodate, &r1_bio->state);
>
>  	if (atomic_dec_and_test(&r1_bio->remaining))
>  		reschedule_retry(r1_bio);
> @@ -2609,6 +2609,7 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipped, int go_faster)
>  	/* For a user-requested sync, we read all readable devices and do a
>  	 * compare
>  	 */
> +	set_bit(R1BIO_Uptodate, &r1_bio->state);
>  	if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
>  		atomic_set(&r1_bio->remaining, read_targets);
>  		for (i = 0; i < conf->raid_disks * 2 && read_targets; i++) {

Actually I've changed my mind.  That patch won't fix anything.
fix_sync_read_error() is focussed on fixing a read error on ->read_disk,
so we only set uptodate if the read from ->read_disk succeeded.

This patch should do it.

NeilBrown


diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index fd3a2a14b587..0fe5fd469e74 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1733,7 +1733,8 @@ static void end_sync_read(struct bio *bio, int error)
 	 * or re-read if the read failed.
 	 * We don't do much here, just schedule handling by raid1d
 	 */
-	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
+	if (bio == r1_bio->bios[r1_bio->read_disk] &&
+	    test_bit(BIO_UPTODATE, &bio->bi_flags))
 		set_bit(R1BIO_Uptodate, &r1_bio->state);

 	if (atomic_dec_and_test(&r1_bio->remaining))
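
To make the effect concrete, here is a minimal, self-contained userspace
sketch (illustrative only - not kernel code; the helper names and the
three-disk scenario are invented for the example) of the difference between
the old and the patched R1BIO_Uptodate logic during a check/repair pass:

#include <stdbool.h>
#include <stdio.h>

#define NDISKS 3

/* Pre-patch behaviour: any successful sync read leaves the r1_bio marked
 * uptodate, so fix_sync_read_error() only runs if *every* read failed. */
static bool uptodate_old(const bool disk_ok[], int ndisks)
{
	for (int i = 0; i < ndisks; i++)
		if (disk_ok[i])
			return true;
	return false;
}

/* Patched behaviour: only the read on ->read_disk decides, because
 * fix_sync_read_error() repairs exactly that device. */
static bool uptodate_new(const bool disk_ok[], int read_disk)
{
	return disk_ok[read_disk];
}

int main(void)
{
	/* One mirror (think sdd above) hits a media error during `repair';
	 * the other reads succeed.  Assume the bad device is ->read_disk. */
	bool disk_ok[NDISKS] = { true, true, false };
	int read_disk = 2;

	printf("old code enters fix_sync_read_error: %s\n",
	       uptodate_old(disk_ok, NDISKS) ? "no" : "yes");
	printf("patched code enters fix_sync_read_error: %s\n",
	       uptodate_new(disk_ok, read_disk) ? "no" : "yes");
	return 0;
}

Built with any C99 compiler, it reports that the old logic never takes the
repair path when only one mirror fails (the situation in the dmesg above),
while the patched logic does once the failing device is ->read_disk.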