* Input/Output error reading from a clean raid
From: Salatiel Filho @ 2017-01-22 14:08 UTC
To: linux-raid

I am trying to recover a few files from my backup. The backup is on
RAID 5 + ext4. There are several files where I get an I/O error. The
RAID appears to be clean and fsck shows no errors. Any ideas what it
could be?

md1 : active raid5 sdd1[0] sdg1[4] sdf1[2] sde1[1]
      3220829184 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 1/8 pages [4KB], 65536KB chunk

[]'s
Salatiel

* Re: Input/Output error reading from a clean raid
From: John Stoffel @ 2017-01-23 0:18 UTC
To: Salatiel Filho; +Cc: linux-raid

Salatiel> I am trying to recover a few files from my backup. The
Salatiel> backup is on a raid 5 + ext4. There are several files where
Salatiel> i get I/O error. The raid appears to be clean and fsck shows
Salatiel> no errors. Any ideas what could it be ?

Salatiel> md1 : active raid5 sdd1[0] sdg1[4] sdf1[2] sde1[1]
Salatiel>       3220829184 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
Salatiel>       bitmap: 1/8 pages [4KB], 65536KB chunk

It would help if you could post the error(s) you're getting, along
with any output from dmesg during that time. Have you done a full
scan of the disk looking for errors? You might just have silent
read errors in your array. So as root do:

    # echo check >>/sys/block/md??/md/sync_action

where md?? is the name of your md array you want to check. You can
get the name from:

    cat /proc/mdstat

and of course it would help to post that info as well if you want more
help.

John

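A minimal sketch of how the result of such a check could be read back, assuming the array is md1 as in the mdstat output above. The `sync_action` and `mismatch_cnt` files are standard md sysfs attributes; the function name `md_scrub_status` and the `SYSFS_ROOT` override are hypothetical, added only so the function can be tried against a scratch directory instead of the live system.

```shell
#!/bin/sh
# Sketch: report the scrub state of one md array from sysfs.
# SYSFS_ROOT is a hypothetical override (defaults to the real /sys/block).
md_scrub_status() {
    md="$1"
    root="${SYSFS_ROOT:-/sys/block}"
    dir="$root/$md/md"
    [ -d "$dir" ] || { echo "no such md array: $md" >&2; return 1; }
    # sync_action is "idle" when no check/resync/recovery is running.
    printf 'sync_action=%s\n' "$(cat "$dir/sync_action")"
    # mismatch_cnt is updated by a "check" pass.
    printf 'mismatch_cnt=%s\n' "$(cat "$dir/mismatch_cnt")"
}

# To start the check itself (as root, on the real sysfs):
#   echo check > /sys/block/md1/md/sync_action
# and follow its progress with:
#   cat /proc/mdstat
```

Calling `md_scrub_status md1` before and after the check shows whether the pass has finished (`sync_action=idle`) and what it counted.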
* Re: Input/Output error reading from a clean raid
From: Salatiel Filho @ 2017-01-23 14:42 UTC
To: John Stoffel; +Cc: linux-raid

The output of the command is:

# dd if=Fotos.zip of=/dev/null
dd: error reading ‘Fotos.zip’: Input/output error
328704+0 records in
328704+0 records out
168296448 bytes (168 MB) copied, 0.127723 s, 1.3 GB/s

or

# cp Fotos.zip /tmp/
cp: error reading ‘Fotos.zip’: Input/output error
cp: failed to extend ‘/tmp/Fotos.zip’: Input/output error

There is nothing in dmesg after running those commands.

[]'s
Salatiel

* Re: Input/Output error reading from a clean raid
From: John Stoffel @ 2017-01-23 16:12 UTC
To: Salatiel Filho; +Cc: John Stoffel, linux-raid

Salatiel> The output of the command is:
Salatiel> # dd if=Fotos.zip of=/dev/null
Salatiel> dd: error reading ‘Fotos.zip’: Input/output error
Salatiel> 328704+0 records in
Salatiel> 328704+0 records out
Salatiel> 168296448 bytes (168 MB) copied, 0.127723 s, 1.3 GB/s

Salatiel> or

Salatiel> # cp Fotos.zip /tmp/
Salatiel> cp: error reading ‘Fotos.zip’: Input/output error
Salatiel> cp: failed to extend ‘/tmp/Fotos.zip’: Input/output error

Can you do an 'unzip -l Fotos.zip' and get anything back? It looks
like the first 168 MB might be OK, so you might get something back.
You might also want to try starting a dd from record 328705 (or even a
couple of records farther) to see if you can get anything else from
there.

In this case, the tool 'ddrescue' might be your answer, since it is
designed to handle errors like this and continue reading past them.
It might, or might not, let you get more of your data back. On
Debian-based systems you should be able to just do:

    apt-get install gddrescue

or just do:

    apt-cache search ddrescue

For Red Hat/Fedora you could do:

    dnf search ddrescue

too.

Did you run the "echo check > ..." command at all? What did it say in
the output of:

    cat /proc/mdstat

when you did this?

Salatiel> There is nothing on dmesg after running those commands;

You might be out of luck. This is one reason why I like A) mirroring
my data and B) saving multiple copies to multiple locations. Storage
is cheap these days. Though I admit I'm not perfect either.

Please get us more information so we can try to help more. Also, have
you unmounted the filesystem and done an 'fsck -y /dev/...' on it as
well? You might want to do a more in-depth check of the filesystem to
see if there's any corruption somewhere.

Also, going to the end of the file, seeking backwards and reading
off blocks might help you recover more of the zip file.

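The record arithmetic behind that suggestion can be sketched as follows. dd's default block size is 512 bytes, so 328704 complete records is exactly the 168296448 bytes dd reported. The helper name `resume_cmd` and the output file name `Fotos.tail` are hypothetical; the helper only prints a command for review and does not run it.

```shell
#!/bin/sh
# dd uses 512-byte blocks by default, so "328704+0 records in"
# means the read failed at this byte offset:
records_ok=328704
block_size=512
fail_offset=$((records_ok * block_size))
echo "read failed at byte $fail_offset"

# Hypothetical helper: print a dd command that resumes N records past the
# last good one (N=1 skips the record that failed).
resume_cmd() {
    skip=$((records_ok + $1))
    echo "dd if=Fotos.zip of=Fotos.tail bs=$block_size skip=$skip conv=noerror"
}
resume_cmd 1

# GNU ddrescue does the skipping and retrying itself and records progress
# in a map file (file names here are illustrative):
#   ddrescue Fotos.zip /tmp/Fotos.zip /tmp/Fotos.map
```

Increasing the argument to `resume_cmd` moves the start point further into the file, which is the "couple of records farther" variant.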
[parent not found: <20170123010334.GA7546@metamorpher.de>]
* Re: Input/Output error reading from a clean raid
From: Salatiel Filho @ 2017-01-23 14:02 UTC
To: linux-raid

Ok, i have run echo check >>/sys/block/md1/md/sync_action, and now
the output of mdadm

mdadm --examine-badblocks /dev/sdd1 /dev/sdg1 /dev/sdf1 /dev/sde1

Bad-blocks on /dev/sdd1:
          1515723072 for 512 sectors
          1515723584 for 512 sectors
          1515724096 for 512 sectors
          1515724608 for 512 sectors
          1515725120 for 512 sectors
          1515725632 for 512 sectors
          1515726144 for 512 sectors
          1515726656 for 512 sectors
          1515727168 for 512 sectors
          1515727680 for 512 sectors
          1515728192 for 512 sectors
          1515728704 for 512 sectors
          1515729216 for 512 sectors
          1515729728 for 512 sectors
          1515730240 for 512 sectors
          1515730752 for 512 sectors
          1515731264 for 512 sectors
          1515731776 for 512 sectors
          1515732288 for 512 sectors
          1515732800 for 512 sectors
          1515733312 for 512 sectors
          1515733824 for 512 sectors
          1515734336 for 512 sectors
          1515734848 for 512 sectors
          1515735360 for 512 sectors
          1515735872 for 512 sectors
          1515736384 for 512 sectors
          1515736896 for 512 sectors
          1515737408 for 512 sectors
          1515737920 for 512 sectors
          1515738432 for 512 sectors
          1515738944 for 512 sectors
          1515739456 for 512 sectors
          1515739968 for 512 sectors
          1515740480 for 512 sectors
          1515740992 for 512 sectors
          1515741504 for 512 sectors
          1515742016 for 192 sectors
          1515743712 for 512 sectors
          1515744224 for 512 sectors
          1515744736 for 512 sectors
          1515745248 for 512 sectors
          1515745760 for 512 sectors
          1515746272 for 512 sectors
          1515746784 for 512 sectors
          1515747296 for 512 sectors
          1515747808 for 512 sectors
          1515748320 for 512 sectors
          1515749072 for 304 sectors
          1515750400 for 512 sectors
          1515750912 for 512 sectors
          1515751424 for 512 sectors
          1515751936 for 512 sectors
          1515752448 for 512 sectors
          1515752960 for 512 sectors
          1515753472 for 512 sectors
          1515753984 for 512 sectors
          1515754496 for 232 sectors
Bad-blocks list is empty in /dev/sdg1
Bad-blocks list is empty in /dev/sdf1
Bad-blocks on /dev/sde1:
          1515723072 for 512 sectors
          1515723584 for 512 sectors
          1515724096 for 512 sectors
          1515724608 for 512 sectors
          1515725120 for 512 sectors
          1515725632 for 512 sectors
          1515726144 for 512 sectors
          1515726656 for 512 sectors
          1515727168 for 512 sectors
          1515727680 for 512 sectors
          1515728192 for 512 sectors
          1515728704 for 512 sectors
          1515729216 for 512 sectors
          1515729728 for 512 sectors
          1515730240 for 512 sectors
          1515730752 for 512 sectors
          1515731264 for 512 sectors
          1515731776 for 512 sectors
          1515732288 for 512 sectors
          1515732800 for 512 sectors
          1515733312 for 512 sectors
          1515733824 for 512 sectors
          1515734336 for 512 sectors
          1515734848 for 512 sectors
          1515735360 for 512 sectors
          1515735872 for 512 sectors
          1515736384 for 512 sectors
          1515736896 for 512 sectors
          1515737408 for 512 sectors
          1515737920 for 512 sectors
          1515738432 for 512 sectors
          1515738944 for 512 sectors
          1515739456 for 512 sectors
          1515739968 for 512 sectors
          1515740480 for 512 sectors
          1515740992 for 512 sectors
          1515741504 for 512 sectors
          1515742016 for 192 sectors
          1515743712 for 512 sectors
          1515744224 for 512 sectors
          1515744736 for 512 sectors
          1515745248 for 512 sectors
          1515745760 for 512 sectors
          1515746272 for 512 sectors
          1515746784 for 512 sectors
          1515747296 for 512 sectors
          1515747808 for 512 sectors
          1515748320 for 512 sectors
          1515749072 for 304 sectors
          1515750400 for 512 sectors
          1515750912 for 512 sectors
          1515751424 for 512 sectors
          1515751936 for 512 sectors
          1515752448 for 512 sectors
          1515752960 for 512 sectors
          1515753472 for 512 sectors
          1515753984 for 512 sectors
          1515754496 for 232 sectors

[]'s
Salatiel

On Sun, Jan 22, 2017 at 10:03 PM, Andreas Klauer
<Andreas.Klauer@metamorpher.de> wrote:
> On Sun, Jan 22, 2017 at 11:08:40AM -0300, Salatiel Filho wrote:
>> Any ideas what could it be ?
>
> mdadm --examine-badblocks
>
> Regards
> Andreas Klauer

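The two non-empty lists in this output are the same entry for entry, which is easier to confirm mechanically than by eye. A minimal sketch, assuming the `--examine-badblocks` text format shown in this message ("Bad-blocks on DEV:" followed by "START for LEN sectors" lines); the function name `shared_badblocks` is hypothetical.

```shell
#!/bin/sh
# Read `mdadm --examine-badblocks` output on stdin and print every
# "start length" entry that is recorded on more than one member device,
# followed by the devices that list it.
shared_badblocks() {
    awk '
        /^Bad-blocks on /           { dev = $3; sub(/:$/, "", dev); next }
        /^Bad-blocks list is empty/ { next }
        $2 == "for" && $4 == "sectors" {
            key = $1 " " $3
            if (!((key, dev) in seen)) {
                seen[key, dev] = 1
                count[key]++
                devs[key] = devs[key] " " dev
            }
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print key devs[key]
        }
    ' | sort -n
}

# Usage (as root):
#   mdadm --examine-badblocks /dev/sdd1 /dev/sdg1 /dev/sdf1 /dev/sde1 | shared_badblocks
```

Entries that appear on only one device are left out, so an empty result means no two members share a bad-block record.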
* Re: Input/Output error reading from a clean raid
From: John Stoffel @ 2017-01-23 17:07 UTC
To: Salatiel Filho; +Cc: linux-raid

>>>>> "Salatiel" == Salatiel Filho <salatiel.filho@gmail.com> writes:

Salatiel> Ok, i have run echo check >>/sys/block/md1/md/sync_action,
Salatiel> and now the output of mdadm mdadm --examine-badblocks
Salatiel> /dev/sdd1 /dev/sdg1 /dev/sdf1 /dev/sde1

Salatiel> Bad-blocks on /dev/sdd1:
Salatiel>           1515723072 for 512 sectors
Salatiel>           1515723584 for 512 sectors
Salatiel>           1515724096 for 512 sectors
Salatiel>           1515724608 for 512 sectors

You have bad disks in your array. First thing off is that I would go
buy replacements and then use 'ddrescue' to copy the data from the old
disks to new disks. Then I'd try to assemble the NEW disks only into
an array, and then I'd fsck the filesystem(s).

You're going to lose data, no doubt about it. You're now in the mode
where you're trying to save as much as you can as quickly as possible.

Personally, I'd be setting up a RAID6 array for your new setup. Then
I would also be setting up weekly checks of the raid array as well.

You're going to lose data no matter what. So get new disks and start
copying what you can.

John

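The weekly check mentioned above could be scripted roughly like this. It is a sketch only: the function name `scrub_idle_arrays`, the `SYSFS_ROOT` override, and the script path in the crontab comment are all hypothetical, and some distributions' mdadm packages already schedule a periodic check of their own, in which case this is not needed.

```shell
#!/bin/sh
# Start a "check" pass on every md array that is currently idle.
# SYSFS_ROOT is a hypothetical override (defaults to the real /sys/block)
# so the loop can be tried against a scratch directory first.
scrub_idle_arrays() {
    root="${SYSFS_ROOT:-/sys/block}"
    for action in "$root"/md*/md/sync_action; do
        [ -e "$action" ] || continue
        # Leave arrays alone while they are resyncing or recovering.
        if [ "$(cat "$action")" = "idle" ]; then
            echo check > "$action"
            echo "started check on $(basename "$(dirname "$(dirname "$action")")")"
        fi
    done
}

# Example crontab entry, Sundays at 01:00 (script path is illustrative):
#   0 1 * * 0  /usr/local/sbin/md-scrub.sh
```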
* Re: Input/Output error reading from a clean raid
From: Wols Lists @ 2017-01-23 17:23 UTC
To: John Stoffel, Salatiel Filho; +Cc: linux-raid

On 23/01/17 17:07, John Stoffel wrote:
> You have bad disks in your array. First thing off is that I would go
> buy replacements and then use 'ddrescue' to copy the data from the old
> disks to new disks. Then I'd try to assemble the NEW disks only into
> an array, and then I'd fsck the filesystem(s).
>
> You're going to lose data, no doubt about it. You're now in the mode
> where you're trying to save as much as you can as quickly as possible.
>
> Personally, I'd be setting up a RAID6 array for your new setup. Then
> I would also be setting up weekly checks of the raid array as well.
>
> You're going to lose data no matter what. So get new disks and start
> copying what you can.

Go read the raid wiki.

https://raid.wiki.kernel.org/index.php/Linux_Raid

Especially replacing a failed drive

https://raid.wiki.kernel.org/index.php/Replacing_a_failed_drive

And please - can you get ddrescue's error log that it mentions and
email me a copy. If you've got some Perl or Python or shell skills,
maybe you could even write that script it mentions (which is described
in a bit more detail in programming projects
https://raid.wiki.kernel.org/index.php/Programming_projects)

Otherwise I'll try and write it - might be a good way of learning
Python :-) but at the moment I think I'm learning by jumping in out of
my depth, so we'll see how far I get :-)

Cheers,
Wol

* Re: Input/Output error reading from a clean raid
From: Andreas Klauer @ 2017-01-23 17:34 UTC
To: Salatiel Filho; +Cc: linux-raid

On Mon, Jan 23, 2017 at 11:02:24AM -0300, Salatiel Filho wrote:
> mdadm mdadm --examine-badblocks /dev/sdd1 /dev/sdg1 /dev/sdf1 /dev/sde1
>
> Bad-blocks on /dev/sdd1:
>           1515723072 for 512 sectors
> Bad-blocks on /dev/sde1:
>           1515723072 for 512 sectors

md believes you have bad blocks in identical places so it won't return
whatever data is in these blocks. Thus you get read errors even if there
is no bad block on the disk itself. Those bad block entries can be caused
by cable or controller flukes, making temporary problems permanent...

Personally I disable the bad block list everywhere.

You can search this list for old messages regarding --examine-badblocks,
this problem came up several times. Clearing the mdadm bad block list is
worth a try. There's an undocumented option, update=force-no-bbl or such.

Regards
Andreas Klauer

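A sketch of how that option might be applied, under stated assumptions: that the spelling is `--update=force-no-bbl`, that it is passed to `mdadm --assemble` after the array has been stopped, and that the members are the four partitions named in this thread. The function name `print_no_bbl_commands` is hypothetical. It deliberately prints the commands instead of running them, since dropping the bad block list discards md's record of which sectors it refused to serve.

```shell
#!/bin/sh
# Print (do not run) the commands to reassemble an array without its
# bad block lists. Review them, unmount the filesystem, then run as root.
# Assumption: the option is spelled --update=force-no-bbl on --assemble.
print_no_bbl_commands() {
    md="$1"; shift
    echo "mdadm --stop /dev/$md"
    echo "mdadm --assemble /dev/$md --update=force-no-bbl $*"
    # Afterwards every member should report an empty list.
    echo "mdadm --examine-badblocks $*"
}

print_no_bbl_commands md1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
```

Checking the installed mdadm's man page or `mdadm --assemble --help` for the exact `--update` values before running anything is the safer order of operations.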
* Re: Input/Output error reading from a clean raid
From: Salatiel Filho @ 2017-01-24 21:15 UTC
To: Andreas Klauer; +Cc: linux-raid

On Mon, Jan 23, 2017 at 2:34 PM, Andreas Klauer
<Andreas.Klauer@metamorpher.de> wrote:
> On Mon, Jan 23, 2017 at 11:02:24AM -0300, Salatiel Filho wrote:
>> mdadm mdadm --examine-badblocks /dev/sdd1 /dev/sdg1 /dev/sdf1 /dev/sde1
>>
>> Bad-blocks on /dev/sdd1:
>>           1515723072 for 512 sectors
>> Bad-blocks on /dev/sde1:
>>           1515723072 for 512 sectors
>
> md believes you have bad blocks in identical places so it won't return
> whatever data is in these blocks. Thus you get read errors even if there
> is no bad block on the disk itself. Those bad block entries can be caused
> by cable or controller flukes, making temporary problems permanent...
>
> Personally I disable the bad block list everywhere.
>
> You can search this list for old messages regarding --examine-badblocks,
> this problem came up several times. Clearing the mdadm bad block list is
> worth a try. There's an undocumented option, update=force-no-bbl or such.
>
> Regards
> Andreas Klauer

Thanks to all of you for the help.
Andreas, the force-no-bbl from mdadm 3.4 did the trick. I was able to
retrieve all files and their md5 sums match, so it is great =)

I really think it is very unlikely that two different disks from two
different brands would have problems at exactly the same block.
I have a question: who populates the badblock list? Is it the check
action sent to /sys/block/md??/md/sync_action, or does each read
error update it?
I think it was maybe some problem with the cable (it is a 4-disk
USB3 bay).

Anyway, thank you very much!

* Re: Input/Output error reading from a clean raid
From: Wols Lists @ 2017-01-24 21:58 UTC
To: Salatiel Filho, Andreas Klauer; +Cc: linux-raid

On 24/01/17 21:15, Salatiel Filho wrote:
> I really think it is very unlikely that two different disks from two
> different brands would have problems at exactly the same block.
> I have a question, who populates the badblock list ? Is the check
> action send to the /sys/block/md??/md/sync_action OR each read error
> updates it ?

I think it's a known problem - nobody seems to know quite why it happens
but when a block is added to the badblocks list it seems to get added to
every device.

Given that modern hard-drives are supposed to relocate bad blocks and
not need a badblock list, I think that's why it's not been found, most
people especially those in the know just tend to disable os-level
badblocks.

Cheers,
Wol

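For keeping an eye on whether the list is filling up again, a running array also exposes each member's list through sysfs. The sketch below assumes the per-member `md/dev-<name>/bad_blocks` attribute with one "start length" pair per line; the function name `badblock_counts` and the `SYSFS_ROOT` override are hypothetical.

```shell
#!/bin/sh
# Print "<member> <number of bad-block entries>" for each member of an
# md array, read from the per-device bad_blocks attribute in sysfs.
# SYSFS_ROOT is a hypothetical override (defaults to the real /sys/block).
badblock_counts() {
    md="$1"
    root="${SYSFS_ROOT:-/sys/block}"
    for f in "$root/$md"/md/dev-*/bad_blocks; do
        [ -e "$f" ] || continue
        member=$(basename "$(dirname "$f")")
        # Count non-empty lines; prints 0 for an empty list.
        entries=$(awk 'NF { n++ } END { print n + 0 }' "$f")
        echo "${member#dev-} $entries"
    done
}

# Usage: badblock_counts md1
```

A count that grows on several members at once, as in this thread, points at the transport (cable, controller, USB bay) more than at the disks themselves.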
* Re: Input/Output error reading from a clean raid
From: John Stoffel @ 2017-01-25 15:54 UTC
To: Salatiel Filho; +Cc: Andreas Klauer, linux-raid

>>>>> "Salatiel" == Salatiel Filho <salatiel.filho@gmail.com> writes:

Salatiel> Thanks all of you for the help.
Salatiel> Andreas, the force-no-bbl from mdadm 3.4 did the trick. I was able to
Salatiel> retrieve all files and their md5 matches, so it is great =)

Great news, glad I could help, wish I had pin-pointed the root cause
better.

john
