* If I have a single bad sector, how many failed reads should simple dd report? @ 2010-07-08 17:14 Greg Freemyer 2010-07-09 19:04 ` Greg Freemyer 0 siblings, 1 reply; 7+ messages in thread From: Greg Freemyer @ 2010-07-08 17:14 UTC (permalink / raw) To: IDE/ATA development list All, I just ran a test against a IDE drive (/dev/sdb) and slightly older kernel (2.6.32). Similar to: dd if=/dev/sdb of=/dev/null conv=noerror,sync bs=4k With a good drive it ran fine. Then I used hdparm --make-bad-sector to intentionally corrupt a sector on the drive. When I re-ran it, /var/log/messages reported 10 bad logical blocks. And even worse, dd reported 20 bad blocks. I examined the data dd read and it had 80KB of zero'ed out data. So that's 160 sectors worth of data lost because of a single bad sector. At most I was expecting 4KB of zero'ed out data. I haven't started troubleshooting, but I want to know if this is expected behavior due to read-ahead or something. (Is there read-ahead on the raw device, or just if a file system is involved.) I can redo my test with 2.6.34 and get logs if that is a bug. And if not a bug, is there a hdparm command I can issue to eliminate this behaviour. Thanks Greg ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: If I have a single bad sector, how many failed reads should simple dd report? 2010-07-08 17:14 If I have a single bad sector, how many failed reads should simple dd report? Greg Freemyer @ 2010-07-09 19:04 ` Greg Freemyer 2010-07-10 1:19 ` Mark Lord 0 siblings, 1 reply; 7+ messages in thread From: Greg Freemyer @ 2010-07-09 19:04 UTC (permalink / raw) To: IDE/ATA development list, Mark Lord Adding Mark Lord to CC because hdparm may be part of the problem. On Thu, Jul 8, 2010 at 1:14 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote: > All, > > I just ran a test against a IDE drive (/dev/sdb) and slightly older > kernel (2.6.32). > > Similar to: > > dd if=/dev/sdb of=/dev/null conv=noerror,sync bs=4k > > With a good drive it ran fine. > > Then I used hdparm --make-bad-sector to intentionally corrupt a sector > on the drive. > > When I re-ran it, /var/log/messages reported 10 bad logical blocks. > And even worse, dd reported 20 bad blocks. I examined the data dd > read and it had 80KB of zero'ed out data. So that's 160 sectors worth > of data lost because of a single bad sector. At most I was expecting > 4KB of zero'ed out data. > > I haven't started troubleshooting, but I want to know if this is > expected behavior due to read-ahead or something. (Is there > read-ahead on the raw device, or just if a file system is involved.) > > I can redo my test with 2.6.34 and get logs if that is a bug. > > And if not a bug, is there a hdparm command I can issue to eliminate > this behaviour. > > Thanks > Greg I retested with 2.6.25 and see the same behavior, so if its a bug, its not a recently introduced one. But 2.6.25 is showing a lot more media errors than I recall from 2.6.32. Even stranger, I'm now introducing the bad sector as sector 100,000 (ie. about 50 MB from the start of the drive). Then doing a dd to capture just the first 100 MB to a file. ie. dd if=/dev/sdb of=clean.dd conv=noerror,sync bs=4K count=25600 hdparm --make-bad-sector 100000 --yes-i-know-what-i-am-doing /dev/sdb dd if=/dev/sdb of=corrupt.dd conv=noerror,sync bs=4K count=25600 hdparm --write-sector 100000 --yes-i-know-what-i-am-doing /dev/sdb (clearly don't do that to a disk you care about. The data at sector 100,000 is permanently lost.) cmp -bl clean.dd corrupt.dd > delta.log Looking at delta.log, the bytes start disagreeing 4 bytes prior to the sector boundary. How can that be? I guess hdparm --make-bad-sector may be corrupting an extra 4 bytes, but that would surprise me. That seems like a definite bug in either the kernel or hdparm. Mark, do you have any input about either the early start of the corruption, or the very large number of sectors that seem corrupted by making a single sector bad. Greg ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: If I have a single bad sector, how many failed reads should simple dd report? 2010-07-09 19:04 ` Greg Freemyer @ 2010-07-10 1:19 ` Mark Lord 2010-07-10 1:24 ` Mark Lord 0 siblings, 1 reply; 7+ messages in thread From: Mark Lord @ 2010-07-10 1:19 UTC (permalink / raw) To: Greg Freemyer; +Cc: IDE/ATA development list, Mark Lord On 09/07/10 03:04 PM, Greg Freemyer wrote: .. >> When I re-ran it, /var/log/messages reported 10 bad logical blocks. >> And even worse, dd reported 20 bad blocks. I examined the data dd >> read and it had 80KB of zero'ed out data. So that's 160 sectors worth >> of data lost because of a single bad sector. At most I was expecting >> 4KB of zero'ed out data. .. That's just the standard, undesirable result of the current SCSI EH when used with libata for (mainly) desktop computers. I have patches (against older kernels) to fix it, but have yet to get both myself and James B. interested enough simultaneously to actually get the kernel fixed. :) .. > cmp -bl clean.dd corrupt.dd> delta.log > > Looking at delta.log, the bytes start disagreeing 4 bytes prior to the > sector boundary. How can that be? .. Dunno. That could be a different problem. I've seen the odd scattered report of data corruption on ext3/ext4 with large data sets, but don't know what the root causes were. Cheers ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: If I have a single bad sector, how many failed reads should simple dd report? 2010-07-10 1:19 ` Mark Lord @ 2010-07-10 1:24 ` Mark Lord 2010-07-10 14:14 ` James Bottomley 0 siblings, 1 reply; 7+ messages in thread From: Mark Lord @ 2010-07-10 1:24 UTC (permalink / raw) To: Greg Freemyer; +Cc: IDE/ATA development list, Mark Lord [-- Attachment #1: Type: text/plain, Size: 4127 bytes --] On 09/07/10 09:19 PM, Mark Lord wrote: > On 09/07/10 03:04 PM, Greg Freemyer wrote: > .. >>> When I re-ran it, /var/log/messages reported 10 bad logical blocks. >>> And even worse, dd reported 20 bad blocks. I examined the data dd >>> read and it had 80KB of zero'ed out data. So that's 160 sectors worth >>> of data lost because of a single bad sector. At most I was expecting >>> 4KB of zero'ed out data. > .. > > That's just the standard, undesirable result of the current SCSI EH > when used with libata for (mainly) desktop computers. > > I have patches (against older kernels) to fix it, but have yet to > get both myself and James B. interested enough simultaneously to > actually get the kernel fixed. :) .. Here (attached and inline below) are my most recent patches for this. Still outdated, though. These are against the SLES11 2.6.27.19 kernel: -----------------------------snip---------------------------- Stop the SCSI EH from performing tons of retries on unrecoverable medium errors, so that error-handling fails more quickly and we (EMC) avoid unneeded node resets. The ugliness of this patch matches the ugliness of SCSI EH. Does *anyone* actually understand this code completely? Signed-off-by: Mark Lord <mlord@pobox.com> --- old/drivers/scsi/scsi_error.c 2009-06-04 09:46:55.000000000 -0400 +++ linux/drivers/scsi/scsi_error.c 2009-06-04 12:08:48.000000000 -0400 @@ -423,6 +423,52 @@ } } +/* + * The problem with scsi_check_sense(), is that it is (designed to be) + * called only after retries are exhausted. But for MEDIUM_ERRORs (only) + * we don't want any retries here at all. + * + * So this function below is a clone of the necessary parts from scsi_check_sense(), + * to check for unrecoverable MEDIUM_ERRORs when deciding whether to retry or not. + */ +static int scsi_unrecoverable_medium_error(struct scsi_cmnd *scmd) +{ + struct scsi_sense_hdr sshdr; + + if (! scsi_command_normalize_sense(scmd, &sshdr)) + return 0; /* no valid sense data */ + + if (scsi_sense_is_deferred(&sshdr)) + return 0; + + if (sshdr.response_code == 0x70) { + /* fixed format */ + if (scmd->sense_buffer[2] & 0xe0) + return 0; + } else { + /* + * descriptor format: look for "stream commands sense data + * descriptor" (see SSC-3). Assume single sense data + * descriptor. Ignore ILI from SBC-2 READ LONG and WRITE LONG. + */ + if ((sshdr.additional_length > 3) && + (scmd->sense_buffer[8] == 0x4) && + (scmd->sense_buffer[11] & 0xe0)) + return 0; + } + + switch (sshdr.sense_key) { + case MEDIUM_ERROR: + if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */ + sshdr.asc == 0x13 || /* AMNF DATA FIELD */ + sshdr.asc == 0x14) { /* RECORD NOT FOUND */ + //printk(KERN_WARNING "%s: MEDIUM_ERROR\n", __func__); + return 1; + } + } + return 0; +} + /** * scsi_eh_completed_normally - Disposition a eh cmd on return from LLD. * @scmd: SCSI cmd to examine. @@ -1334,6 +1380,8 @@ switch (status_byte(scmd->result)) { case CHECK_CONDITION: + if (scsi_unrecoverable_medium_error(scmd)) + return 1; /* * assume caller has checked sense and determinted * the check condition was retryable. -----------------------------snip---------------------------- On encountering a bad sector, report and skip over it, then continue with the remainder of the request. Otherwise we would fail perfectly good sectors, making a bad situation even worse. Signed-off-by: Mark Lord <mlord@pobox.com> --- old/drivers/scsi/scsi_lib.c 2009-06-04 12:26:52.000000000 -0400 +++ linux/drivers/scsi/scsi_lib.c 2009-06-04 14:40:11.000000000 -0400 @@ -952,6 +952,12 @@ */ if (sense_valid && !sense_deferred) { switch (sshdr.sense_key) { + case MEDIUM_ERROR: + /* Bad sector. Fail it, and then continue the rest of the request. */ + if (this_count && scsi_end_request(cmd, -EIO, cmd->device->sector_size, 1) == NULL) { + cmd->retries = 0; // go around again.. + return; + } case UNIT_ATTENTION: if (cmd->device->removable) { /* Detected disc change. Set a bit -----------------------------snip---------------------------- [-- Attachment #2: 01_scsi_no_retry_medium_errors-sles11.patch --] [-- Type: text/x-diff, Size: 2252 bytes --] sles11: Stop the SCSI EH from performing tons of retries on unrecoverable medium errors, so that error-handling fails more quickly and we (EMC) avoid unneeded node resets. The ugliness of this patch matches the ugliness of SCSI EH. Does *anyone* actually understand this code completely? Signed-off-by: Mark Lord <mlord@pobox.com> --- old/drivers/scsi/scsi_error.c 2009-06-04 09:46:55.000000000 -0400 +++ linux/drivers/scsi/scsi_error.c 2009-06-04 12:08:48.000000000 -0400 @@ -423,6 +423,52 @@ } } +/* + * The problem with scsi_check_sense(), is that it is (designed to be) + * called only after retries are exhausted. But for MEDIUM_ERRORs (only) + * we don't want any retries here at all. + * + * So this function below is a clone of the necessary parts from scsi_check_sense(), + * to check for unrecoverable MEDIUM_ERRORs when deciding whether to retry or not. + */ +static int scsi_unrecoverable_medium_error(struct scsi_cmnd *scmd) +{ + struct scsi_sense_hdr sshdr; + + if (! scsi_command_normalize_sense(scmd, &sshdr)) + return 0; /* no valid sense data */ + + if (scsi_sense_is_deferred(&sshdr)) + return 0; + + if (sshdr.response_code == 0x70) { + /* fixed format */ + if (scmd->sense_buffer[2] & 0xe0) + return 0; + } else { + /* + * descriptor format: look for "stream commands sense data + * descriptor" (see SSC-3). Assume single sense data + * descriptor. Ignore ILI from SBC-2 READ LONG and WRITE LONG. + */ + if ((sshdr.additional_length > 3) && + (scmd->sense_buffer[8] == 0x4) && + (scmd->sense_buffer[11] & 0xe0)) + return 0; + } + + switch (sshdr.sense_key) { + case MEDIUM_ERROR: + if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */ + sshdr.asc == 0x13 || /* AMNF DATA FIELD */ + sshdr.asc == 0x14) { /* RECORD NOT FOUND */ + //printk(KERN_WARNING "%s: MEDIUM_ERROR\n", __func__); + return 1; + } + } + return 0; +} + /** * scsi_eh_completed_normally - Disposition a eh cmd on return from LLD. * @scmd: SCSI cmd to examine. @@ -1334,6 +1380,8 @@ switch (status_byte(scmd->result)) { case CHECK_CONDITION: + if (scsi_unrecoverable_medium_error(scmd)) + return 1; /* * assume caller has checked sense and determinted * the check condition was retryable. [-- Attachment #3: 02_scsi_eh_continue-sles11.patch --] [-- Type: text/x-diff, Size: 825 bytes --] sles11: On encountering a bad sector, report and skip over it, then continue with the remainder of the request. Otherwise we would fail perfectly good sectors, making a bad situation even worse. Signed-off-by: Mark Lord <mlord@pobox.com> --- old/drivers/scsi/scsi_lib.c 2009-06-04 12:26:52.000000000 -0400 +++ linux/drivers/scsi/scsi_lib.c 2009-06-04 14:40:11.000000000 -0400 @@ -952,6 +952,12 @@ */ if (sense_valid && !sense_deferred) { switch (sshdr.sense_key) { + case MEDIUM_ERROR: + /* Bad sector. Fail it, and then continue the rest of the request. */ + if (this_count && scsi_end_request(cmd, -EIO, cmd->device->sector_size, 1) == NULL) { + cmd->retries = 0; // go around again.. + return; + } case UNIT_ATTENTION: if (cmd->device->removable) { /* Detected disc change. Set a bit ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: If I have a single bad sector, how many failed reads should simple dd report? 2010-07-10 1:24 ` Mark Lord @ 2010-07-10 14:14 ` James Bottomley 2010-07-11 12:58 ` Greg Freemyer 2010-07-13 19:07 ` Mark Lord 0 siblings, 2 replies; 7+ messages in thread From: James Bottomley @ 2010-07-10 14:14 UTC (permalink / raw) To: Mark Lord; +Cc: Greg Freemyer, IDE/ATA development list, Mark Lord On Fri, 2010-07-09 at 21:24 -0400, Mark Lord wrote: > On 09/07/10 09:19 PM, Mark Lord wrote: > > On 09/07/10 03:04 PM, Greg Freemyer wrote: > > .. > >>> When I re-ran it, /var/log/messages reported 10 bad logical blocks. > >>> And even worse, dd reported 20 bad blocks. I examined the data dd > >>> read and it had 80KB of zero'ed out data. So that's 160 sectors worth > >>> of data lost because of a single bad sector. At most I was expecting > >>> 4KB of zero'ed out data. > > .. > > > > That's just the standard, undesirable result of the current SCSI EH > > when used with libata for (mainly) desktop computers. > > > > I have patches (against older kernels) to fix it, but have yet to > > get both myself and James B. interested enough simultaneously to > > actually get the kernel fixed. :) > .. > > Here (attached and inline below) are my most recent patches for this. > Still outdated, though. These are against the SLES11 2.6.27.19 kernel: > > -----------------------------snip---------------------------- > > Stop the SCSI EH from performing tons of retries on unrecoverable medium errors, > so that error-handling fails more quickly and we (EMC) avoid unneeded node resets. > > The ugliness of this patch matches the ugliness of SCSI EH. > Does *anyone* actually understand this code completely? > > Signed-off-by: Mark Lord <mlord@pobox.com> > > --- old/drivers/scsi/scsi_error.c 2009-06-04 09:46:55.000000000 -0400 > +++ linux/drivers/scsi/scsi_error.c 2009-06-04 12:08:48.000000000 -0400 > @@ -423,6 +423,52 @@ > } > } > > +/* > + * The problem with scsi_check_sense(), is that it is (designed to be) > + * called only after retries are exhausted. But for MEDIUM_ERRORs (only) > + * we don't want any retries here at all. > + * > + * So this function below is a clone of the necessary parts from scsi_check_sense(), > + * to check for unrecoverable MEDIUM_ERRORs when deciding whether to retry or not. > + */ > +static int scsi_unrecoverable_medium_error(struct scsi_cmnd *scmd) > +{ > + struct scsi_sense_hdr sshdr; > + > + if (! scsi_command_normalize_sense(scmd, &sshdr)) > + return 0; /* no valid sense data */ > + > + if (scsi_sense_is_deferred(&sshdr)) > + return 0; > + > + if (sshdr.response_code == 0x70) { > + /* fixed format */ > + if (scmd->sense_buffer[2] & 0xe0) > + return 0; > + } else { > + /* > + * descriptor format: look for "stream commands sense data > + * descriptor" (see SSC-3). Assume single sense data > + * descriptor. Ignore ILI from SBC-2 READ LONG and WRITE LONG. > + */ > + if ((sshdr.additional_length > 3) && > + (scmd->sense_buffer[8] == 0x4) && > + (scmd->sense_buffer[11] & 0xe0)) > + return 0; > + } > + > + switch (sshdr.sense_key) { > + case MEDIUM_ERROR: > + if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */ > + sshdr.asc == 0x13 || /* AMNF DATA FIELD */ > + sshdr.asc == 0x14) { /* RECORD NOT FOUND */ > + //printk(KERN_WARNING "%s: MEDIUM_ERROR\n", __func__); > + return 1; > + } > + } > + return 0; > +} > + > /** > * scsi_eh_completed_normally - Disposition a eh cmd on return from LLD. > * @scmd: SCSI cmd to examine. > @@ -1334,6 +1380,8 @@ > > switch (status_byte(scmd->result)) { > case CHECK_CONDITION: > + if (scsi_unrecoverable_medium_error(scmd)) > + return 1; > /* > * assume caller has checked sense and determinted > * the check condition was retryable. The check is redundant: scsi_decide_disposition is where we check for retries or error handling. If you look, it already picks out your three cases and returns success for them (meaning pass straight through). in sd_done() MEDIUM_ERROR means complete immediately. > > On encountering a bad sector, report and skip over it, > then continue with the remainder of the request. > Otherwise we would fail perfectly good sectors, > making a bad situation even worse. > > Signed-off-by: Mark Lord <mlord@pobox.com> > > --- old/drivers/scsi/scsi_lib.c 2009-06-04 12:26:52.000000000 -0400 > +++ linux/drivers/scsi/scsi_lib.c 2009-06-04 14:40:11.000000000 -0400 > @@ -952,6 +952,12 @@ > */ > if (sense_valid && !sense_deferred) { > switch (sshdr.sense_key) { > + case MEDIUM_ERROR: > + /* Bad sector. Fail it, and then continue the rest of the request. */ > + if (this_count && scsi_end_request(cmd, -EIO, cmd->device->sector_size, 1) == NULL) { > + cmd->retries = 0; // go around again.. > + return; > + } > case UNIT_ATTENTION: > if (cmd->device->removable) { > /* Detected disc change. Set a bit This one's in the wrong place. Normal MEDIUM_ERRORS complete above this. I also don't think the skip is right. We're supposed to communicate to block what we've done, and all we have to do that with is good bytes. If we skip over a bad sector and try to complete the rest, we've lost the error position. The failure return from Medium error is defined to be the number of bytes we actually wrote, which is everything up to the medium error. James ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: If I have a single bad sector, how many failed reads should simple dd report? 2010-07-10 14:14 ` James Bottomley @ 2010-07-11 12:58 ` Greg Freemyer 2010-07-13 19:07 ` Mark Lord 1 sibling, 0 replies; 7+ messages in thread From: Greg Freemyer @ 2010-07-11 12:58 UTC (permalink / raw) To: James Bottomley; +Cc: Mark Lord, IDE/ATA development list, Mark Lord On Sat, Jul 10, 2010 at 10:14 AM, James Bottomley <James.Bottomley@suse.de> wrote: > On Fri, 2010-07-09 at 21:24 -0400, Mark Lord wrote: >> On 09/07/10 09:19 PM, Mark Lord wrote: >> > On 09/07/10 03:04 PM, Greg Freemyer wrote: >> > .. >> >>> When I re-ran it, /var/log/messages reported 10 bad logical blocks. >> >>> And even worse, dd reported 20 bad blocks. I examined the data dd >> >>> read and it had 80KB of zero'ed out data. So that's 160 sectors worth >> >>> of data lost because of a single bad sector. At most I was expecting >> >>> 4KB of zero'ed out data. >> > .. >> > >> > That's just the standard, undesirable result of the current SCSI EH >> > when used with libata for (mainly) desktop computers. >> > >> > I have patches (against older kernels) to fix it, but have yet to >> > get both myself and James B. interested enough simultaneously to >> > actually get the kernel fixed. :) >> .. >> >> Here (attached and inline below) are my most recent patches for this. >> Still outdated, though. These are against the SLES11 2.6.27.19 kernel: >> >> -----------------------------snip---------------------------- >> >> Stop the SCSI EH from performing tons of retries on unrecoverable medium errors, >> so that error-handling fails more quickly and we (EMC) avoid unneeded node resets. >> >> The ugliness of this patch matches the ugliness of SCSI EH. >> Does *anyone* actually understand this code completely? >> >> Signed-off-by: Mark Lord <mlord@pobox.com> >> >> --- old/drivers/scsi/scsi_error.c 2009-06-04 09:46:55.000000000 -0400 >> +++ linux/drivers/scsi/scsi_error.c 2009-06-04 12:08:48.000000000 -0400 >> @@ -423,6 +423,52 @@ >> } >> } >> >> +/* >> + * The problem with scsi_check_sense(), is that it is (designed to be) >> + * called only after retries are exhausted. But for MEDIUM_ERRORs (only) >> + * we don't want any retries here at all. >> + * >> + * So this function below is a clone of the necessary parts from scsi_check_sense(), >> + * to check for unrecoverable MEDIUM_ERRORs when deciding whether to retry or not. >> + */ >> +static int scsi_unrecoverable_medium_error(struct scsi_cmnd *scmd) >> +{ >> + struct scsi_sense_hdr sshdr; >> + >> + if (! scsi_command_normalize_sense(scmd, &sshdr)) >> + return 0; /* no valid sense data */ >> + >> + if (scsi_sense_is_deferred(&sshdr)) >> + return 0; >> + >> + if (sshdr.response_code == 0x70) { >> + /* fixed format */ >> + if (scmd->sense_buffer[2] & 0xe0) >> + return 0; >> + } else { >> + /* >> + * descriptor format: look for "stream commands sense data >> + * descriptor" (see SSC-3). Assume single sense data >> + * descriptor. Ignore ILI from SBC-2 READ LONG and WRITE LONG. >> + */ >> + if ((sshdr.additional_length > 3) && >> + (scmd->sense_buffer[8] == 0x4) && >> + (scmd->sense_buffer[11] & 0xe0)) >> + return 0; >> + } >> + >> + switch (sshdr.sense_key) { >> + case MEDIUM_ERROR: >> + if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */ >> + sshdr.asc == 0x13 || /* AMNF DATA FIELD */ >> + sshdr.asc == 0x14) { /* RECORD NOT FOUND */ >> + //printk(KERN_WARNING "%s: MEDIUM_ERROR\n", __func__); >> + return 1; >> + } >> + } >> + return 0; >> +} > > >> + >> /** >> * scsi_eh_completed_normally - Disposition a eh cmd on return from LLD. >> * @scmd: SCSI cmd to examine. >> @@ -1334,6 +1380,8 @@ >> >> switch (status_byte(scmd->result)) { >> case CHECK_CONDITION: >> + if (scsi_unrecoverable_medium_error(scmd)) >> + return 1; >> /* >> * assume caller has checked sense and determinted >> * the check condition was retryable. > > The check is redundant: scsi_decide_disposition is where we check for > retries or error handling. If you look, it already picks out your three > cases and returns success for them (meaning pass straight through). in > sd_done() MEDIUM_ERROR means complete immediately. > > >> >> On encountering a bad sector, report and skip over it, >> then continue with the remainder of the request. >> Otherwise we would fail perfectly good sectors, >> making a bad situation even worse. >> >> Signed-off-by: Mark Lord <mlord@pobox.com> >> >> --- old/drivers/scsi/scsi_lib.c 2009-06-04 12:26:52.000000000 -0400 >> +++ linux/drivers/scsi/scsi_lib.c 2009-06-04 14:40:11.000000000 -0400 >> @@ -952,6 +952,12 @@ >> */ >> if (sense_valid && !sense_deferred) { >> switch (sshdr.sense_key) { >> + case MEDIUM_ERROR: >> + /* Bad sector. Fail it, and then continue the rest of the request. */ >> + if (this_count && scsi_end_request(cmd, -EIO, cmd->device->sector_size, 1) == NULL) { >> + cmd->retries = 0; // go around again.. >> + return; >> + } >> case UNIT_ATTENTION: >> if (cmd->device->removable) { >> /* Detected disc change. Set a bit > > This one's in the wrong place. Normal MEDIUM_ERRORS complete above > this. > > I also don't think the skip is right. We're supposed to communicate to > block what we've done, and all we have to do that with is good bytes. > If we skip over a bad sector and try to complete the rest, we've lost > the error position. > > The failure return from Medium error is defined to be the number of > bytes we actually wrote, which is everything up to the medium error. > > James If you guys have patches you want me to test, I'm happy to do so, but I urge the team to work on this simple test case. It may have been this way since day one of libata, but it certainly looks like a collection of bugs to me. Again, assuming /dev/sdb is a test drive without valuable data the test case can be as simple as just: ============== #create a 1000 sectors of random data starting at sector 1000 dd if=/dev/urandom of=/dev/sdb bs=512 seek=1000 count=1000 #make a clean copy of the data, starting at sector 0 and going well past. 4MB total dd if=/dev/sdb of=clean.dd conv=noerror,sync bs=4K count=1000 #corrupt a sector in the middle of the test data hdparm --make-bad-sector 1500 --yes-i-know-what-i-am-doing /dev/sdb #make another copy of the first 4MB dd if=/dev/sdb of=corrupt.dd conv=noerror,sync bs=4K count=1000 #restore the drive sector so it is not permanently lost hdparm --write-sector 1500 --yes-i-know-what-i-am-doing /dev/sdb (I choose sector 1500 this time to make the testing process faster.) cmp -bl clean.dd corrupt.dd > delta.log Review delta.log for disagrees. And /var/log/warn for logical block failures as well as extraneous media errors. ============== I found: 1) Corruption stated several bytes prior to the corrupt sector 2) With dd using 512 byte blocks, 64 sectors worth of corrupt data in corrupt.dd 3) With dd using 4K blocks, 160 sectors worth of corrupt data in corrupt.dd. (I don't know why the difference and I was using a customized version of dd called dcfldd for this test.) 4) a 50% reduction in throughput based on one bad sector. I'm not sure that's a bug, but it is clearly an issue that can be looked at. You need a much bigger dd run to see that impact. Thanks Greg ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: If I have a single bad sector, how many failed reads should simple dd report? 2010-07-10 14:14 ` James Bottomley 2010-07-11 12:58 ` Greg Freemyer @ 2010-07-13 19:07 ` Mark Lord 1 sibling, 0 replies; 7+ messages in thread From: Mark Lord @ 2010-07-13 19:07 UTC (permalink / raw) To: James Bottomley; +Cc: Greg Freemyer, IDE/ATA development list, Mark Lord On 10/07/10 10:14 AM, James Bottomley wrote: .. > The check is redundant: scsi_decide_disposition is where we check for > retries or error handling. If you look, it already picks out your three > cases and returns success for them (meaning pass straight through). in > sd_done() MEDIUM_ERROR means complete immediately. .. > This one's in the wrong place. Normal MEDIUM_ERRORS complete above > this. > > I also don't think the skip is right. We're supposed to communicate to > block what we've done, and all we have to do that with is good bytes. > If we skip over a bad sector and try to complete the rest, we've lost > the error position. .. No, on 2.6.27, those patches work exactly as they should. Failing (to the block layer) the bad sector(s), and successfully reading all others. Try it. It works. But horribly out of date for the newer kernels, which is why it's not in your patch queue from me. :) Cheers ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-07-13 19:07 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-07-08 17:14 If I have a single bad sector, how many failed reads should simple dd report? Greg Freemyer 2010-07-09 19:04 ` Greg Freemyer 2010-07-10 1:19 ` Mark Lord 2010-07-10 1:24 ` Mark Lord 2010-07-10 14:14 ` James Bottomley 2010-07-11 12:58 ` Greg Freemyer 2010-07-13 19:07 ` Mark Lord
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.