* Disk "failed" while doing scrub
@ 2015-07-13  6:26 Dāvis Mosāns
  2015-07-13  8:12 ` Duncan
  2015-08-21  4:16 ` Dāvis Mosāns
  0 siblings, 2 replies; 5+ messages in thread
From: Dāvis Mosāns @ 2015-07-13  6:26 UTC
  To: linux-btrfs

Hello,

Short version: while running a scrub on a 5-disk btrfs filesystem,
/dev/sdd "failed", and there was also an error on another disk
(/dev/sdh).

Because the filesystem still mounts, I assume I should run "btrfs device
delete /dev/sdd /mntpoint" and then restore the damaged files from backup.
Are all affected files listed in the journal? There are messages about
"x callbacks suppressed", so I'm not sure. If they aren't all listed,
how do I get a full list of damaged files?
Also, are there any tools that can recover partial file fragments and
reconstruct a file (with the missing fragments filled with nulls)?
I assume there's no point in running "btrfs check
--check-data-csum", because scrub already checks that?
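
To make the question about listing affected files concrete, here is a
minimal Python sketch that collects (root, inode) pairs from the
"BTRFS: i/o error" lines in the journal excerpt below. The function name
affected_inodes and the regular expression are only an illustration,
derived from the message format seen in this journal. Because of the
"callbacks suppressed" rate limiting, whatever it finds is at best a
partial list.

```python
import re

# Pattern derived from the message format in this journal (an assumption,
# not an official format specification).
IO_ERROR = re.compile(
    r"BTRFS: i/o error at logical (?P<logical>\d+) on dev (?P<dev>\S+), "
    r"sector (?P<sector>\d+), root (?P<root>\d+), inode (?P<inode>\d+), "
    r"offset (?P<offset>\d+), length (?P<length>\d+)"
)

def affected_inodes(journal_text):
    """Return {(root, inode): [file offsets]} for each i/o error message."""
    # The mail client wrapped long lines, so collapse all whitespace first.
    flat = " ".join(journal_text.split())
    found = {}
    for m in IO_ERROR.finditer(flat):
        key = (int(m["root"]), int(m["inode"]))
        found.setdefault(key, []).append(int(m["offset"]))
    return found

sample = """kernel: BTRFS: i/o error at logical 7386235424768 on dev /dev/sdd,
sector 2891849768, root 3034, inode 5633529, offset 11878400, length
4096, links 1 (path: [...])
kernel: BTRFS: i/o error at logical 7386235039744 on dev /dev/sdd,
sector 2891849016, root 3034, inode 5633529, offset 11493376, length
4096, links 1 (path: [...])"""

print(affected_inodes(sample))
# {(3034, 5633529): [11878400, 11493376]}
```

Each inode number could then presumably be mapped back to a path with
"btrfs inspect-internal inode-resolve" on the mounted subvolume, assuming
the relevant trees are still readable.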

from journal:

kernel: drivers/scsi/mvsas/mv_sas.c 1863:Release slot [1] tag[1], task
[ffff88007efb8800]:
kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000002,  slot [1].
kernel: sas: sas_ata_task_done: SAS error 8a
kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
kernel: sas: ata9: end_device-7:2: cmd error handler
kernel: sas: ata7: end_device-7:0: dev error handler
kernel: sas: ata14: end_device-7:7: dev error handler
kernel: ata9.00: exception Emask 0x0 SAct 0x800 SErr 0x0 action 0x0
kernel: ata9.00: failed command: READ FPDMA QUEUED
kernel: ata9.00: cmd 60/00:00:00:3d:a1/04:00:ab:00:00/40 tag 11 ncq 524288 in
                                            res
41/40:00:48:40:a1/00:04:ab:00:00/00 Emask 0x409 (media error) <F>
kernel: ata9.00: status: { DRDY ERR }
kernel: ata9.00: error: { UNC }
kernel: ata9.00: configured for UDMA/133
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00
driverbyte=0x08
kernel: sd 7:0:2:0: [sdd] tag#0 Sense Key : 0x3 [current] [descriptor]
kernel: sd 7:0:2:0: [sdd] tag#0 ASC=0x11 ASCQ=0x4
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 3d 00 00 04 00 00
kernel: blk_update_request: I/O error, dev sdd, sector 2879471688
kernel: ata9: EH complete
kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
kernel: drivers/scsi/mvsas/mv_sas.c 1863:Release slot [1] tag[1], task
[ffff88007efb9a00]:
kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000003,  slot [1].
kernel: sas: sas_ata_task_done: SAS error 8a
kernel: sas: Enter sas_scsi_recover_host busy: 2 failed: 2
kernel: sas: trying to find task 0xffff8801e0cadb00
kernel: sas: sas_scsi_find_task: aborting task 0xffff8801e0cadb00
kernel: sas: sas_scsi_find_task: task 0xffff8801e0cadb00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0xffff8801e0cadb00 is aborted
kernel: sas: ata9: end_device-7:2: cmd error handler
kernel: sas: ata8: end_device-7:1: cmd error handler
kernel: sas: ata7: end_device-7:0: dev error handler
kernel: sas: ata8: end_device-7:1: dev error handler
kernel: ata8.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x6 frozen
kernel: ata8.00: failed command: READ FPDMA QUEUED
kernel: ata8.00: cmd 60/00:00:00:1b:36/04:00:bf:00:00/40 tag 18 ncq 524288 in
                                            res
40/00:08:00:58:11/00:00:a6:00:00/40 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: hard resetting link
kernel: sas: ata9: end_device-7:2: dev error handler
kernel: sas: ata14: end_device-7:7: dev error handler
kernel: ata9: log page 10h reported inactive tag 26
kernel: ata9.00: exception Emask 0x1 SAct 0x400000 SErr 0x0 action 0x6
kernel: ata9.00: failed command: READ FPDMA QUEUED
kernel: ata9.00: cmd 60/08:00:48:40:a1/00:00:ab:00:00/40 tag 22 ncq 4096 in
                                            res
01/04:a8:40:40:a1/00:00:ab:00:00/40 Emask 0x3 (HSM violation)
kernel: ata9.00: status: { ERR }
kernel: ata9.00: error: { ABRT }
kernel: ata9: hard resetting link
kernel: sas: sas_form_port: phy1 belongs to port1 already(1)!
kernel: ata9.00: both IDENTIFYs aborted, assuming NODEV
kernel: ata9.00: revalidation failed (errno=-2)
kernel: drivers/scsi/mvsas/mv_sas.c 1428:mvs_I_T_nexus_reset for device[1]:rc= 0
kernel: ata8.00: configured for UDMA/133
kernel: ata8.00: device reported invalid CHS sector 0
kernel: ata8: EH complete
kernel: ata9: hard resetting link
kernel: ata9.00: both IDENTIFYs aborted, assuming NODEV
kernel: ata9.00: revalidation failed (errno=-2)
kernel: ata9: hard resetting link
kernel: ata9.00: both IDENTIFYs aborted, assuming NODEV
kernel: ata9.00: revalidation failed (errno=-2)
kernel: ata9.00: disabled
kernel: ata9: EH complete
kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04
driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 40 48 00 00 08 00
kernel: blk_update_request: I/O error, dev sdd, sector 2879471688
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04
driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 45 00 00 06 00 00
kernel: BTRFS: unable to fixup (regular) error at logical
7390602616832 on dev /dev/sdd
kernel: BTRFS: unable to fixup (regular) error at logical
7390602891264 on dev /dev/sdd
kernel: scsi_io_completion: 186117 callbacks suppressed
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04
driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x2a 2a 00 00 14 78 c0 00 00 20 00
kernel: blk_update_request: 186156 callbacks suppressed
kernel: blk_update_request: I/O error, dev sdd, sector 1341632
kernel: sd 7:0:2:0: [sdd] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x04
driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#1 CDB: opcode=0x2a 2a 00 00 14 7a 80 00 00 20 00
kernel: blk_update_request: I/O error, dev sdd, sector 2879472896
kernel: BTRFS: i/o error at logical 7386235424768 on dev /dev/sdd,
sector 2891849768, root 3034, inode 5633529, offset 11878400, length
4096, links 1 (path: [...])
kernel: BTRFS: i/o error at logical 7386235039744 on dev /dev/sdd,
sector 2891849016, root 3034, inode 5633529, offset 11493376, length
4096, links 1 (path: [...])
kernel: btrfs_dev_stat_print_on_error: 78908 callbacks suppressed
kernel: BTRFS: bdev /dev/sdd errs: wr 347, rd 1644871, flush 0, corrupt 0, gen 0
kernel: BTRFS: bdev /dev/sdd errs: wr 356, rd 1644871, flush 0, corrupt 0, gen 0
kernel: BTRFS: error (device sdh) in write_all_supers:3454: errno=-5
IO failure (errors while submitting device barriers.)
kernel: BTRFS info (device sdh): forced readonly
kernel: BTRFS warning (device sdh): Skipping commit of aborted transaction.
kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 5 PID: 3756 at fs/btrfs/super.c:260
__btrfs_abort_transaction+0x54/0x130 [btrfs]()
kernel: BTRFS: Transaction aborted (error -5)
kernel: Modules linked in: nf_conntrack_netbios_ns
nf_conntrack_broadcast xt_tcpudp ip6t_rpfilter ip6t_REJECT [...]
kernel:  nvidia(PO) tda8290 tuner aes_x86_64 lrw saa7134
snd_hda_codec_realtek gf128mul edac_core glue_helper [...]
kernel:
kernel: CPU: 5 PID: 3756 Comm: btrfs-transacti Tainted: P           O
  4.0.7-2-ARCH #1
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel:  0000000000000000 000000005f5d9ca7 ffff88006090fc18 ffffffff81574ec3
kernel:  0000000000000000 ffff88006090fc70 ffff88006090fc58 ffffffff81074e7a
kernel:  0000000000000000 ffff8800ce8e6c60 00000000fffffffb ffff8800bbaa4800
kernel: Call Trace:
kernel:  [<ffffffff81574ec3>] dump_stack+0x4c/0x6e
kernel:  [<ffffffff81074e7a>] warn_slowpath_common+0x8a/0xc0
kernel:  [<ffffffff81074f05>] warn_slowpath_fmt+0x55/0x70
kernel:  [<ffffffffa0253bb4>] __btrfs_abort_transaction+0x54/0x130 [btrfs]
kernel:  [<ffffffffa0282ceb>] cleanup_transaction+0x7b/0x300 [btrfs]
kernel:  [<ffffffff810b6ce0>] ? wake_atomic_t_function+0x60/0x60
kernel:  [<ffffffffa0284162>] btrfs_commit_transaction+0x932/0xc10 [btrfs]
kernel:  [<ffffffffa027f3a5>] transaction_kthread+0x1d5/0x240 [btrfs]
kernel:  [<ffffffffa027f1d0>] ? btrfs_cleanup_transaction+0x5a0/0x5a0 [btrfs]
kernel:  [<ffffffff810934b8>] kthread+0xd8/0xf0
kernel:  [<ffffffff810933e0>] ? kthread_worker_fn+0x170/0x170
kernel:  [<ffffffff8157a718>] ret_from_fork+0x58/0x90
kernel:  [<ffffffff810933e0>] ? kthread_worker_fn+0x170/0x170
kernel: ---[ end trace 8ecc49ef203bd88c ]---
kernel: BTRFS: error (device sdh) in cleanup_transaction:1686:
errno=-5 IO failure
kernel: BTRFS info (device sdh): delayed_refs has NO entry
kernel: scrub_handle_errored_block: 92600 callbacks suppressed
kernel: BTRFS: i/o error at logical 7390928568320 on dev /dev/sdd,
sector 2892627456, root 3034, inode 5637106, offset 614400, length
4096, links 1 (path: [...])
kernel: BTRFS: i/o error at logical 7390928175104 on dev /dev/sdd,
sector 2892626688, root 3034, inode 5637106, offset 483328, length
4096, links 1 (path: [...])
kernel: scrub_handle_errored_block: 77404 callbacks suppressed
kernel: BTRFS: unable to fixup (regular) error at logical
7390928568320 on dev /dev/sdd
kernel: BTRFS: unable to fixup (regular) error at logical
7390928175104 on dev /dev/sdd
smartd[723]: Device: /dev/sdd [SAT], not capable of SMART self-check
smartd[723]: Device: /dev/sdd [SAT], failed to read SMART Attribute Data
smartd[723]: Device: /dev/sdd [SAT], Read SMART Self Test Log Failed
smartd[723]: Device: /dev/sdd [SAT], Read Summary SMART Error Log failed
kernel: scsi_io_completion: 8110 callbacks suppressed
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04
driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
kernel: blk_update_request: 8115 callbacks suppressed
kernel: blk_update_request: I/O error, dev sdd, sector 3907028992
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04
driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
kernel: blk_update_request: I/O error, dev sdd, sector 3907028992
kernel: Buffer I/O error on dev sdd, logical block 488378624, async page read
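
The sector numbers in the log above can be cross-checked against the
SCSI CDBs. A minimal Python sketch, assuming the standard 10-byte
READ(10)/WRITE(10) layout (opcode 0x28 or 0x2a, big-endian LBA in bytes
2-5, transfer length in bytes 7-8); decode_rw10 is a hypothetical helper
name:

```python
def decode_rw10(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] not in (0x28, 0x2A):
        raise ValueError("not a READ(10)/WRITE(10) CDB")
    op = "READ(10)" if cdb[0] == 0x28 else "WRITE(10)"
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: number of blocks
    return op, lba, blocks

print(decode_rw10("28 00 ab a1 40 48 00 00 08 00"))
# ('READ(10)', 2879471688, 8)
print(decode_rw10("2a 00 00 14 78 c0 00 00 20 00"))
# ('WRITE(10)', 1341632, 32)
```

The first result matches the "I/O error, dev sdd, sector 2879471688"
line and the second matches "sector 1341632", so the failed commands
include both reads and writes.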


Long story:

I had a Seagate disk which died but was still covered by warranty, so I
got a replacement. The disk they returned wasn't new but repaired.
I haven't used it much, but it seems it won't last long, as it now has
uncorrectable sectors.
When I received it, I ran a full SMART test and checked all sectors;
everything passed and seemed good. But after I copied my data to it
and used it for a while, I found:

smartd[592]: Device: /dev/sdd [SAT], 16 Currently unreadable (pending) sectors
smartd[592]: Device: /dev/sdd [SAT], 16 Offline uncorrectable sectors

Then I ran a scrub:

scrub status for 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
       scrub started at Sun Jul 12 13:36:11 2015 and was aborted after 02:43:21
       total bytes scrubbed: 6.24TiB with 1648151 errors
       error details: read=1648151
       corrected errors: 704, uncorrectable errors: 1647447, unverified errors: 0
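
As a sanity check on these counters, here is a small Python sketch.
It ASSUMES that each scrub error corresponds to one 4096-byte block
(suggested by "length 4096" in the journal messages, but not confirmed
anywhere in this mail); scrub_summary is a hypothetical helper name.

```python
def scrub_summary(read_errors, corrected, uncorrectable, block_size=4096):
    """Check counter consistency and estimate the unreadable data size."""
    # The two outcome counters should add up to the total read errors.
    consistent = corrected + uncorrectable == read_errors
    # ASSUMPTION: one error == one block of block_size bytes.
    unreadable_bytes = uncorrectable * block_size
    return consistent, unreadable_bytes

consistent, unreadable = scrub_summary(1648151, 704, 1647447)
print(consistent, round(unreadable / 2**30, 2))
# True 6.28
```

Under that assumption the uncorrectable errors would cover roughly
6.28 GiB, far more than 16 pending sectors could explain, which would
fit with most errors coming from the drive dropping off the bus rather
than from bad media.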

The scrub caused the drive to become unrecognizable by Linux, and it
seems it also triggered an error on a different disk (/dev/sdh), which
caused the filesystem to go read-only; after that it would not mount:

kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04
driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 00 00 00 80 00 00 08 00
kernel: blk_update_request: I/O error, dev sdd, sector 128
kernel: BTRFS info (device sdh): enabling auto defrag
kernel: BTRFS info (device sdh): disk space caching is enabled
kernel: BTRFS: has skinny extents
kernel: BTRFS: failed to read chunk tree on sdh
mount[17625]: mount: wrong fs type, bad option, bad superblock on /dev/sdh,
mount[17625]: missing codepage or helper program, or other error
mount[17625]: In some cases useful info is found in syslog - try
mount[17625]: dmesg | tail or so.
kernel: BTRFS: open_ctree failed
kernel: sd 7:0:2:0: [sdd] Synchronizing SCSI cache
kernel: sd 7:0:2:0: [sdd] Synchronize Cache(10) failed: Result:
hostbyte=0x04 driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] Stopping disk
kernel: sd 7:0:2:0: [sdd] Start/Stop Unit failed: Result:
hostbyte=0x04 driverbyte=0x00

I pulled out the /dev/sdd drive and plugged it back in:

kernel: mvsas 0000:07:00.0: Phy2 : No sig fis
kernel: sas: phy-7:2 added to port-7:2, phy_mask:0x4 ( 200000000000000)
kernel: sas: DOING DISCOVERY on port 2, pid:16744
kernel: sas: DONE DISCOVERY on port 2, pid:16744, result:0
kernel: sas: Enter sas_scsi_recover_host busy: 0 failed: 0
kernel: ata20.00: ATA-8: ST2000DM001-9YN164, CC9F, max UDMA/133
kernel: ata20.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
kernel: ata20.00: configured for UDMA/133
kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
kernel: scsi 7:0:8:0: Direct-Access     ATA      ST2000DM001-9YN1 CC9F
PQ: 0 ANSI: 5
kernel: sd 7:0:8:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
kernel: sd 7:0:8:0: [sdd] 4096-byte physical blocks
kernel: sd 7:0:8:0: [sdd] Write Protect is off
kernel: sd 7:0:8:0: [sdd] Mode Sense: 00 3a 00 00
kernel: sd 7:0:8:0: [sdd] Write cache: enabled, read cache: enabled,
doesn't support DPO or FUA
kernel: sd 7:0:8:0: [sdd] Attached SCSI disk
smartd[723]: Device: /dev/sdd [SAT], SMART Usage Attribute: 187
Reported_Uncorrect changed from 100 to 98
smartd[723]: Device: /dev/sdd [SAT], previous self-test completed with
error (read test element)
smartd[723]: Device: /dev/sdd [SAT], Self-Test Log error count
increased from 0 to 2
smartd[723]: Device: /dev/sdd [SAT], ATA error count increased from 0 to 2

Everything seemed "ok" again. I ran a short SMART self-test, which
failed for the first time (although the disk's overall SMART status
still says PASSED), then resumed the scrub, and it completed:

scrub status for 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
scrub device /dev/sdc (id 1) history
       scrub resumed at Sun Jul 12 18:07:06 2015 and finished after 04:34:02
       total bytes scrubbed: 2.35TiB with 0 errors
scrub device /dev/sdd (id 2) history
       scrub resumed at Sun Jul 12 18:07:06 2015 and finished after 02:56:23
       total bytes scrubbed: 1.44TiB with 1648151 errors
       error details: read=1648151
       corrected errors: 704, uncorrectable errors: 1647447, unverified errors: 0
scrub device /dev/sde (id 3) history
       scrub started at Sun Jul 12 13:36:11 2015 and finished after 02:35:46
       total bytes scrubbed: 1.43TiB with 0 errors
scrub device /dev/sdg (id 4) history
       scrub started at Sun Jul 12 13:36:11 2015 and finished after 02:40:01
       total bytes scrubbed: 1.44TiB with 0 errors
scrub device /dev/sdh (id 5) history
       scrub started at Sun Jul 12 13:36:11 2015 and finished after 01:14:34
       total bytes scrubbed: 537.82GiB with 0 errors

"btrfs device stats" doesn't show any errors:

[/dev/sdc].write_io_errs   0
[/dev/sdc].read_io_errs    0
[/dev/sdc].flush_io_errs   0
[/dev/sdc].corruption_errs 0
[/dev/sdc].generation_errs 0
[/dev/sdd].write_io_errs   0
[/dev/sdd].read_io_errs    0
[/dev/sdd].flush_io_errs   0
[/dev/sdd].corruption_errs 0
[/dev/sdd].generation_errs 0
[/dev/sde].write_io_errs   0
[/dev/sde].read_io_errs    0
[/dev/sde].flush_io_errs   0
[/dev/sde].corruption_errs 0
[/dev/sde].generation_errs 0
[/dev/sdg].write_io_errs   0
[/dev/sdg].read_io_errs    0
[/dev/sdg].flush_io_errs   0
[/dev/sdg].corruption_errs 0
[/dev/sdg].generation_errs 0
[/dev/sdh].write_io_errs   0
[/dev/sdh].read_io_errs    0
[/dev/sdh].flush_io_errs   0
[/dev/sdh].corruption_errs 0
[/dev/sdh].generation_errs 0


The other disk, /dev/sdh, doesn't show any signs of having gone bad,
so most likely it was the controller's fault when sdd threw errors.
When scrub reports error counts, what exactly counts as an error? A
file fragment?
Also, is there an easy way to locate those unreadable sectors and
rewrite them so the HDD reallocates them?
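
For locating the sectors, a minimal Python sketch of the arithmetic,
assuming the 512-byte logical / 4096-byte physical sector sizes reported
for this drive and taking the failing LBA 2879471688 from the log; the
helper names are hypothetical:

```python
LOGICAL = 512    # logical sector size reported for /dev/sdd
PHYSICAL = 4096  # physical sector size reported for /dev/sdd

def physical_sector_span(lba):
    """Return the first and last logical LBA of the enclosing physical sector."""
    per_phys = PHYSICAL // LOGICAL          # 8 logical sectors per physical one
    first = lba - lba % per_phys
    return first, first + per_phys - 1

def byte_offset(lba):
    """Byte offset of a logical sector from the start of the device."""
    return lba * LOGICAL

print(physical_sector_span(2879471688))
# (2879471688, 2879471695)
print(byte_offset(2879471688))
# 1474289504256
```

The 16 pending sectors may correspond to two such 8-LBA physical
sectors, though that is only a guess. Overwriting a span like this
(for example with hdparm's --write-sector or a dd write at that offset)
destroys whatever was stored there, so it only makes sense once the
affected files are known.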

Thanks :)

Here's the full SMART info for /dev/sdd:

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-9YN164
Serial Number:    W2404VST
LU WWN Device Id: 5 000c50 044a7a68a
Firmware Version: CC9F
User Capacity:    2 000 398 934 016 bytes [2,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Jul 13 07:40:14 2015 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                       was never started.
                                       Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                       the read element of the test failed.
Total time to complete Offline
data collection:                (  592) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                       Auto Offline data collection
on/off support.
                                       Suspend Offline collection upon new
                                       command.
                                       No Offline surface scan supported.
                                       Self-test supported.
                                       Conveyance Self-test supported.
                                       Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                       power-saving mode.
                                       Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                       General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 254) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3081) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
 1 Raw_Read_Error_Rate     POSR--   117   100   006    -    166724616
 3 Spin_Up_Time            PO----   092   092   000    -    0
 4 Start_Stop_Count        -O--CK   100   100   020    -    626
 5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
 7 Seek_Error_Rate         POSR--   060   060   030    -    1306645
 9 Power_On_Hours          -O--CK   097   097   000    -    3154
10 Spin_Retry_Count        PO--C-   100   100   097    -    0
12 Power_Cycle_Count       -O--CK   100   100   020    -    433
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   098   098   000    -    2
188 Command_Timeout         -O--CK   100   099   000    -    4 4 4
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   070   058   045    -    30 (0 1 34 29 0)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    335
193 Load_Cycle_Count        -O--CK   096   096   000    -    9566
194 Temperature_Celsius     -O---K   030   042   000    -    30 (128 0 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    16
198 Offline_Uncorrectable   ----C-   100   100   000    -    16
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
240 Head_Flying_Hours       ------   100   253   000    -    367h+26m+14.504s
241 Total_LBAs_Written      ------   100   253   000    -    38608136381115
242 Total_LBAs_Read         ------   100   253   000    -    7979572945843
                           ||||||_ K auto-keep
                           |||||__ C event count
                           ||||___ R error rate
                           |||____ S speed/performance
                           ||_____ O updated online
                           |______ P prefailure warning
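
To pick the interesting rows out of an attribute table like the one
above, here is a rough Python sketch. The choice of watched attribute
IDs (5, 187, 197, 198) is my own assumption about which counters matter
here, and nonzero_raw is a hypothetical helper; it only handles rows
whose RAW_VALUE is a plain integer.

```python
def nonzero_raw(table_text, watch=(5, 187, 197, 198)):
    """Return {attribute name: raw value} for watched attributes above zero."""
    hits = {}
    for line in table_text.splitlines():
        parts = line.split()
        # Data rows: ID, NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE...
        if len(parts) < 8 or not parts[0].isdigit():
            continue
        attr_id, name, raw = int(parts[0]), parts[1], parts[7]
        if attr_id in watch and raw.isdigit() and int(raw) > 0:
            hits[name] = int(raw)
    return hits

rows = """ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
 5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
187 Reported_Uncorrect      -O--CK   098   098   000    -    2
197 Current_Pending_Sector  -O--C-   100   100   000    -    16
198 Offline_Uncorrectable   ----C-   100   100   000    -    16"""

print(nonzero_raw(rows))
# {'Reported_Uncorrect': 2, 'Current_Pending_Sector': 16, 'Offline_Uncorrectable': 16}
```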

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  SATA NCQ Queued Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS      20  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5067  Device vendor specific log
0xbd       GPL     VS     512  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
Device Error Count: 2
       CR     = Command Register
       FEATR  = Features Register
       COUNT  = Count (was: Sector Count) Register
       LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
       LH     = LBA High (was: Cylinder High) Register    ]   LBA
       LM     = LBA Mid (was: Cylinder Low) Register      ] Register
       LL     = LBA Low (was: Sector Number) Register     ]
       DV     = Device (was: Device/Head) Register
       DC     = Device Control Register
       ER     = Error register
       ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 [1] occurred at disk power-on lifetime: 3139 hours (130 days + 19 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER -- ST COUNT  LBA_48  LH LM LL DV DC
 -- -- -- == -- == == == -- -- -- -- --
 40 -- 51 00 00 00 00 ab a1 40 48 00 00  Error: UNC at LBA = 0xaba14048 = 2879471688

 Commands leading to the command that caused the error were:
 CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
 -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
 60 00 00 00 08 00 00 ab a1 40 48 40 00     02:54:39.784  READ FPDMA QUEUED
 60 00 00 00 08 00 00 ab a1 40 40 40 00     02:54:39.783  READ FPDMA QUEUED
 60 00 00 00 08 00 00 ab a1 40 38 40 00     02:54:39.783  READ FPDMA QUEUED
 60 00 00 00 08 00 00 ab a1 40 30 40 00     02:54:39.782  READ FPDMA QUEUED
 60 00 00 00 08 00 00 ab a1 40 28 40 00     02:54:39.782  READ FPDMA QUEUED

Error 1 [0] occurred at disk power-on lifetime: 3139 hours (130 days + 19 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER -- ST COUNT  LBA_48  LH LM LL DV DC
 -- -- -- == -- == == == -- -- -- -- --
 40 -- 51 00 00 00 00 ab a1 40 48 00 00  Error: UNC at LBA = 0xaba14048 = 2879471688

 Commands leading to the command that caused the error were:
 CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
 -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
 60 00 00 04 00 00 00 ab a0 14 00 40 00     02:54:36.512  READ FPDMA QUEUED
 60 00 00 04 00 00 00 ab a0 10 00 40 00     02:54:36.500  READ FPDMA QUEUED
 60 00 00 04 00 00 00 ab a0 0c 00 40 00     02:54:36.498  READ FPDMA QUEUED
 60 00 00 04 00 00 00 ab a0 08 00 40 00     02:54:36.497  READ FPDMA QUEUED
 60 00 00 04 00 00 00 ab 9f f9 00 40 00     02:54:36.402  READ FPDMA QUEUED

SMART Error Log Version: 1
ATA Error Count: 2
       CR = Command Register [HEX]
       FR = Features Register [HEX]
       SC = Sector Count Register [HEX]
       SN = Sector Number Register [HEX]
       CL = Cylinder Low Register [HEX]
       CH = Cylinder High Register [HEX]
       DH = Device/Head Register [HEX]
       DC = Device Command Register [HEX]
       ER = Error register [HEX]
       ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 3139 hours (130 days + 19 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 60 00 08 ff ff ff 4f 00      02:54:39.784  READ FPDMA QUEUED
 60 00 08 ff ff ff 4f 00      02:54:39.783  READ FPDMA QUEUED
 60 00 08 ff ff ff 4f 00      02:54:39.783  READ FPDMA QUEUED
 60 00 08 ff ff ff 4f 00      02:54:39.782  READ FPDMA QUEUED
 60 00 08 ff ff ff 4f 00      02:54:39.782  READ FPDMA QUEUED

Error 1 occurred at disk power-on lifetime: 3139 hours (130 days + 19 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER ST SC SN CL CH DH
 -- -- -- -- -- -- --
 40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 60 00 00 ff ff ff 4f 00      02:54:36.512  READ FPDMA QUEUED
 60 00 00 ff ff ff 4f 00      02:54:36.500  READ FPDMA QUEUED
 60 00 00 ff ff ff 4f 00      02:54:36.498  READ FPDMA QUEUED
 60 00 00 ff ff ff 4f 00      02:54:36.497  READ FPDMA QUEUED
 60 00 00 ff ff ff 4f 00      02:54:36.402  READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      3139         2879471688
# 2  Short offline       Completed: read failure       90%      3139         2879471688
# 3  Short offline       Completed without error       00%      3049         -
# 4  Conveyance offline  Completed without error       00%      2996         -
# 5  Short offline       Completed without error       00%      2239         -
# 6  Extended offline    Completed without error       00%      2238         -
# 7  Short offline       Completed without error       00%      1550         -
# 8  Short offline       Completed without error       00%      1550         -
# 9  Short offline       Completed without error       00%        69         -
#10  Short offline       Completed without error       00%         9         -

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      3139         2879471688
# 2  Short offline       Completed: read failure       90%      3139         2879471688
# 3  Short offline       Completed without error       00%      3049         -
# 4  Conveyance offline  Completed without error       00%      2996         -
# 5  Short offline       Completed without error       00%      2239         -
# 6  Extended offline    Completed without error       00%      2238         -
# 7  Short offline       Completed without error       00%      1550         -
# 8  Short offline       Completed without error       00%      1550         -
# 9  Short offline       Completed without error       00%        69         -
#10  Short offline       Completed without error       00%         9         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       522 (0x020a)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    30 Celsius
Power Cycle Min/Max Temperature:     29/34 Celsius
Lifetime    Min/Max Temperature:      9/42 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP/SMART Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2            1  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS





SMART info for /dev/sdh

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F3
Device Model:     SAMSUNG HD103SJ
Serial Number:    S246JDWZ113593
LU WWN Device Id: 5 0024e9 002bf43c5
Firmware Version: 1AJ100E4
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Mon Jul 13 07:53:49 2015 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Disabled
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                       was completed without error.
                                       Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                       without error or no self-test has ever
                                       been run.
Total time to complete Offline
data collection:                ( 9420) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                       Auto Offline data collection on/off support.
                                       Suspend Offline collection upon new
                                       command.
                                       Offline surface scan supported.
                                       Self-test supported.
                                       No Conveyance Self-test supported.
                                       Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                       power-saving mode.
                                       Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                       General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 157) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                       SCT Error Recovery Control supported.
                                       SCT Feature Control supported.
                                       SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
 1 Raw_Read_Error_Rate     POSR-K   100   100   051    -    1
 2 Throughput_Performance  -OS--K   055   055   000    -    8621
 3 Spin_Up_Time            PO---K   073   071   025    -    8314
 4 Start_Stop_Count        -O--CK   091   091   000    -    9745
 5 Reallocated_Sector_Ct   PO--CK   252   252   010    -    0
 7 Seek_Error_Rate         -OSR-K   252   252   051    -    0
 8 Seek_Time_Performance   --S--K   252   252   015    -    0
 9 Power_On_Hours          -O--CK   100   100   000    -    20675
10 Spin_Retry_Count        -O--CK   252   252   051    -    0
11 Calibration_Retry_Count -O--CK   252   252   000    -    0
12 Power_Cycle_Count       -O--CK   097   097   000    -    3297
191 G-Sense_Error_Rate      -O---K   100   100   000    -    42
192 Power-Off_Retract_Count -O---K   252   252   000    -    0
194 Temperature_Celsius     -O----   064   043   000    -    32 (Min/Max 4/57)
195 Hardware_ECC_Recovered  -O-RCK   100   100   000    -    0
196 Reallocated_Event_Count -O--CK   252   252   000    -    0
197 Current_Pending_Sector  -O--CK   252   252   000    -    0
198 Offline_Uncorrectable   ----CK   252   252   000    -    0
199 UDMA_CRC_Error_Count    -OS-CK   100   100   000    -    2
200 Multi_Zone_Error_Rate   -O-R-K   100   100   000    -    101
223 Load_Retry_Count        -O--CK   252   252   000    -    0
225 Load_Cycle_Count        -O--CK   100   100   000    -    9897
                           ||||||_ K auto-keep
                           |||||__ C event count
                           ||||___ R error rate
                           |||____ S speed/performance
                           ||_____ O updated online
                           |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      2  Comprehensive SMART error log
0x03       GPL     R/O      2  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      2  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  SATA NCQ Queued Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xbb       GPL     VS       4  Device vendor specific log
0xbc       GPL     VS       2  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (2 sectors)
Device Error Count: 2
       CR     = Command Register
       FEATR  = Features Register
       COUNT  = Count (was: Sector Count) Register
       LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
       LH     = LBA High (was: Cylinder High) Register    ]   LBA
       LM     = LBA Mid (was: Cylinder Low) Register      ] Register
       LL     = LBA Low (was: Sector Number) Register     ]
       DV     = Device (was: Device/Head) Register
       DC     = Device Control Register
       ER     = Error register
       ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 [1] occurred at disk power-on lifetime: 4244 hours (176 days + 20 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER -- ST COUNT  LBA_48  LH LM LL DV DC
 -- -- -- == -- == == == -- -- -- -- --
 84 -- 51 93 e8 00 00 00 00 00 00 e0 00  Error: ICRC, ABRT 37864 sectors at LBA = 0x00000000 = 0

 Commands leading to the command that caused the error were:
 CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
 -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
 35 00 00 01 00 00 00 61 18 92 e8 e0 08     00:00:01.927  WRITE DMA EXT
 25 00 00 01 00 00 00 1b ce e8 60 e0 08     00:00:01.927  READ DMA EXT
 25 00 00 01 00 00 00 1b ce e7 60 e0 08     00:00:01.927  READ DMA EXT
 25 00 00 01 00 00 00 1b ce e6 60 e0 08     00:00:01.927  READ DMA EXT
 25 00 00 01 00 00 00 1b ce e5 60 e0 08     00:00:01.927  READ DMA EXT

Error 1 [0] occurred at disk power-on lifetime: 2234 hours (93 days + 2 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 ER -- ST COUNT  LBA_48  LH LM LL DV DC
 -- -- -- == -- == == == -- -- -- -- --
 84 -- 51 e5 ee 00 00 00 00 00 00 e0 00  Error: ICRC, ABRT 58862 sectors at LBA = 0x00000000 = 0

 Commands leading to the command that caused the error were:
 CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
 -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
 35 00 00 00 06 00 00 00 35 e5 e8 e0 08     00:00:17.173  WRITE DMA EXT
 35 00 00 00 08 00 00 06 d5 77 10 e0 08     00:00:17.173  WRITE DMA EXT
 35 00 00 00 03 00 00 00 82 12 48 e0 08     00:00:17.173  WRITE DMA EXT
 35 00 00 00 07 00 00 06 d5 77 10 e0 08     00:00:17.171  WRITE DMA EXT
 35 00 00 00 03 00 00 00 82 12 48 e0 08     00:00:17.171  WRITE DMA EXT

SMART Error Log Version: 1
No Errors Logged

SMART Extended Self-test Log Version: 1 (2 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     20661         -
# 2  Extended offline    Completed without error       00%     19724         -
# 3  Short offline       Completed without error       00%     19721         -
# 4  Short offline       Aborted by host               90%     19404         -
# 5  Short offline       Completed without error       00%     18910         -
# 6  Short offline       Completed without error       00%     15792         -
# 7  Short offline       Completed without error       00%     15792         -

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     20661         -
# 2  Extended offline    Completed without error       00%     19724         -
# 3  Short offline       Completed without error       00%     19721         -
# 4  Short offline       Aborted by host               90%     19404         -
# 5  Short offline       Completed without error       00%     18910         -
# 6  Short offline       Completed without error       00%     15792         -
# 7  Short offline       Completed without error       00%     15792         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has
ever been run
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Completed [00% left] (0-65535)
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  2
SCT Version (vendor specific):       256 (0x0100)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    32 Celsius
Power Cycle Min/Max Temperature:     24/38 Celsius
Lifetime    Min/Max Temperature:      7/57 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         5 minutes
Temperature Logging Interval:        5 minutes
Min/Max recommended Temperature:     -5/80 Celsius
Min/Max Temperature Limit:           -10/85 Celsius
Temperature History Size (Index):    128 (106)

Index    Estimated Time   Temperature Celsius
107    2015-07-12 21:15    35  ****************
108    2015-07-12 21:20    34  ***************
105    2015-07-13 07:45    33  **************
106    2015-07-13 07:50    32  *************

SCT Error Recovery Control:
          Read: Disabled
         Write: Disabled

Device Statistics (GP/SMART Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x0002  4            0  R_ERR response for data FIS
0x0003  4            0  R_ERR response for device-to-host data FIS
0x0004  4            0  R_ERR response for host-to-device data FIS
0x0005  4            0  R_ERR response for non-data FIS
0x0006  4            0  R_ERR response for device-to-host non-data FIS
0x0007  4            0  R_ERR response for host-to-device non-data FIS
0x0008  4            0  Device-to-host non-data FIS retries
0x0009  4            1  Transition from drive PhyRdy to drive PhyNRdy
0x000a  4            2  Device-to-host register FISes sent due to a COMRESET
0x000b  4            0  CRC errors within host-to-device FIS
0x000d  4            0  Non-CRC errors within host-to-device FIS
0x000f  4            0  R_ERR response for host-to-device data FIS, CRC
0x0010  4            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  4            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  4            0  R_ERR response for host-to-device non-data FIS, non-CRC
0x8e00  4            0  Vendor specific
0x8e01  4            0  Vendor specific
0x8e02  4            0  Vendor specific
0x8e03  4            0  Vendor specific
0x8e04  4            0  Vendor specific
0x8e05  4            0  Vendor specific
0x8e06  4            0  Vendor specific
0x8e07  4            0  Vendor specific
0x8e08  4            0  Vendor specific
0x8e09  4            0  Vendor specific
0x8e0a  4            0  Vendor specific
0x8e0b  4            0  Vendor specific
0x8e0c  4            0  Vendor specific
0x8e0d  4            0  Vendor specific
0x8e0e  4            0  Vendor specific
0x8e0f  4            0  Vendor specific
0x8e10  4            0  Vendor specific
0x8e11  4            0  Vendor specific


* Re: Disk "failed" while doing scrub
  2015-07-13  6:26 Disk "failed" while doing scrub Dāvis Mosāns
@ 2015-07-13  8:12 ` Duncan
  2015-07-14  1:54   ` Dāvis Mosāns
  2015-08-21  4:16 ` Dāvis Mosāns
  1 sibling, 1 reply; 5+ messages in thread
From: Duncan @ 2015-07-13  8:12 UTC (permalink / raw)
  To: linux-btrfs

Dāvis Mosāns posted on Mon, 13 Jul 2015 09:26:05 +0300 as excerpted:

> Short version: while doing scrub on 5 disk btrfs filesystem, /dev/sdd
> "failed" and also had some error on other disk (/dev/sdh)

You say five disk, but nowhere in your post do you mention what raid mode 
you were using, neither do you post btrfs filesystem show and btrfs 
filesystem df, as suggested on the wiki and which list that information.

FWIW, btrfs defaults for a multi-device filesystem are raid1 metadata, 
raid0 data.  If you didn't specify raid level at mkfs time, it's very 
likely that's what you're using.  The scrub results seem to support this 
as if the data had been raid1 or raid10, nearly all the errors should 
have been correctable by pulling from the second copy.  And raid5/6 
should have been able to recover from parity, tho this mode is new enough 
it's still not recommended as the chances of bugs and thus failure to 
work properly are much higher.

So you really should have been using raid1/10 if you wanted device 
failure tolerance, but you didn't say, and if you're using defaults as 
seems reasonably likely, your data was raid0, and thus it's likely many/
most files are either gone or damaged beyond repair.

(As it happens I have a number of btrfs raid1 data/metadata on a pair of 
partitioned ssds, with each btrfs on a corresponding partition on both of 
them, with one of the ssds developing bad sectors and basically slowly 
failing.  But the other member of the raid1 pair is solid and I have 
backups, as well as a spare I can replace the failing one with when I 
decide it's time, so I've been letting the bad one stick around due as 
much as anything to morbid curiosity, watching it slowly fail. So I know 
exactly how scrub on btrfs raid1 behaves in a bad-sector case, pulling 
the copy from the good device to overwrite the bad copy with, triggering 
the device's sector remapping in the process.  Despite all the read 
errors, they've all been correctable, because I'm using raid1 for both 
data and metadata.)

> Because filesystem still mounts, I assume I should do "btrfs device
> delete /dev/sdd /mntpoint" and then restore damaged files from backup.

You can try a replace, but with a failing drive still connected, people 
report mixed results.  It's likely to fail as it can't read certain 
blocks to transfer them to the new device.

With raid1 or better, physically disconnecting the failing device and 
doing a device delete missing can work.  (Replace missing might too, but 
AFAIK it doesn't work with released versions and I'm not sure it's even 
in integration yet, tho there are patches on-list that should make it 
work.)  With raid0/single, you can mount with a missing device if you use 
degraded,ro, but obviously that'll only let you try to copy files off.  
You'll likely not have a lot of luck with raid0, with files missing, but 
a bit more luck with single.

In the likely raid0/single case, your best bet is probably to try 
copying off what you can, and/or restoring from backups.  See the 
discussion below.

> Are all affected files listed in journal? there's messages about "x
> callbacks suppressed" so I'm not sure and if there aren't how to get
> full list of damaged files?

> Also I wonder if there are any tools to recover partial file fragments
> and reconstruct file? (where missing fragments filled with nulls)
> I assume that there's no point in running "btrfs check
> --check-data-csum" because scrub already does check that?

There's no such partial-file with null-fill tools shipped just yet.  
Those files normally simply trigger errors trying to read them, because 
btrfs won't let you at them if the checksum doesn't verify.

There /is/, however, a command that can be used to either regenerate or 
zero-out the checksum tree.  See btrfs check --init-csum-tree.  Current 
versions recalculate the csums, older versions (btrfsck as that was 
before btrfs check) simply zeroed it out.  Then you can read the file 
despite bad checksums, tho you'll still get errors if the block 
physically cannot be read.

There's also btrfs restore, which works on the unmounted filesystem 
without actually writing to it, copying the files it can read to a new 
location, which of course has to be a filesystem with enough room to 
restore the files to, altho it's possible to tell restore to do only 
specific subdirs, for instance.

What I'd recommend depends on how complete and how recent your backup 
is.  If it's complete and recent enough, probably the easiest thing is to 
simply blow away the bad filesystem and start over, recovering from the 
backup to a new filesystem.

If there's files you'd like to get back that weren't backed up or where 
the backup is old, since the filesystem is mountable, I'd probably copy 
everything off it I could.  Then, I'd try restore, letting it restore to 
the same location I had copied to, but NOT using the --overwrite option, 
so it only wrote any files it could restore that the copy wasn't able to 
get you, as they might be slightly older versions.

Then, if you really need more of the files, you can try using btrfs check 
--init-csum-tree as mentioned above, and then try mounting and see if you 
can access more files.  But as these are likely to be somewhat corrupt, 
I'd probably /not/ copy them to the same location as the others.  If you 
have space for two copies, you might duplicate the set of files as you 
were able to recover them with the initial copy and restore, and use the 
same don't-overwrite technique on one of the sets, marking it the 
possibly corrupted version.  Then you can do a diff or rsync dry-run to 
see the differences between the good version and the bad, and examine 
anything spit out by the diff/rsync individually.
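
The don't-overwrite step and the follow-up comparison described above can
be sketched roughly as follows.  This is only an illustration of the idea,
not a tool from btrfs-progs: merge_without_overwrite and its arguments are
hypothetical names, and in practice cp -n, rsync or diff would do the same
job.

```python
import filecmp
import shutil
from pathlib import Path


def merge_without_overwrite(src_root, dst_root):
    """Copy files from src_root into dst_root, skipping any that already exist.

    Returns (added, differing): relative paths copied because dst_root
    lacked them, and relative paths present in both trees whose contents
    differ.  Existing files in dst_root are never modified.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    added, differing = [], []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if dst.exists():
            # Byte-for-byte comparison; only report, never overwrite.
            if not filecmp.cmp(src, dst, shallow=False):
                differing.append(rel)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            added.append(rel)
    return added, differing
```

Calling it with the possibly-corrupt tree as src_root and a duplicate of
the known-good tree as dst_root leaves the good copies untouched, while
the "differing" list names the files worth examining individually.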

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Disk "failed" while doing scrub
  2015-07-13  8:12 ` Duncan
@ 2015-07-14  1:54   ` Dāvis Mosāns
  2015-07-14  6:26     ` Duncan
  0 siblings, 1 reply; 5+ messages in thread
From: Dāvis Mosāns @ 2015-07-14  1:54 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

2015-07-13 11:12 GMT+03:00 Duncan <1i5t5.duncan@cox.net>:
> You say five disk, but nowhere in your post do you mention what raid mode
> you were using, neither do you post btrfs filesystem show and btrfs
> filesystem df, as suggested on the wiki and which list that information.

Sorry, I forgot. I'm running Arch Linux 4.0.7, with btrfs-progs v4.1
Using RAID1 for metadata and single for data, with features
big_metadata, extended_iref, mixed_backref, no_holes, skinny_metadata
and mounted with noatime,compress=zlib,space_cache,autodefrag

Label: 'Data'  uuid: 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
       Total devices 5 FS bytes used 7.16TiB
       devid    1 size 2.73TiB used 2.35TiB path /dev/sdc
       devid    2 size 1.82TiB used 1.44TiB path /dev/sdd
       devid    3 size 1.82TiB used 1.44TiB path /dev/sde
       devid    4 size 1.82TiB used 1.44TiB path /dev/sdg
       devid    5 size 931.51GiB used 539.01GiB path /dev/sdh

Data, single: total=7.15TiB, used=7.15TiB
System, RAID1: total=8.00MiB, used=784.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=16.00GiB, used=14.37GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B


>> Because filesystem still mounts, I assume I should do "btrfs device
>> delete /dev/sdd /mntpoint" and then restore damaged files from backup.
>
> You can try a replace, but with a failing drive still connected, people
> report mixed results.  It's likely to fail as it can't read certain
> blocks to transfer them to the new device.

As I understand it, device delete will copy data from that disk and
distribute it across the rest of the disks, while btrfs replace will copy
to a new disk, which must be at least the size of the disk I'm replacing.
Assuming the other existing disks are good, why would replace be
preferable to delete? Because delete could fail, but replace would not?


> There's no such partial-file with null-fill tools shipped just yet.
> Those files normally simply trigger errors trying to read them, because
> btrfs won't let you at them if the checksum doesn't verify.

From the journal, only 14 files are mentioned as having errors. 13 of
them no longer throw any errors and their SHAs match my backups, so
they're fine.
And btrfs actually does allow copying/reading the one damaged file; I
only get an I/O error when trying to read data from the broken sectors:

kernel: drivers/scsi/mvsas/mv_sas.c 1863:Release slot [0] tag[0], task [ffff88011c8c9900]:
kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000001,  slot [0].
kernel: sas: sas_ata_task_done: SAS error 8a
kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
kernel: sas: ata9: end_device-7:2: cmd error handler
kernel: sas: ata7: end_device-7:0: dev error handler
kernel: sas: ata14: end_device-7:7: dev error handler
kernel: ata9.00: exception Emask 0x0 SAct 0x4000 SErr 0x0 action 0x0
kernel: ata9.00: failed command: READ FPDMA QUEUED
kernel: ata9.00: cmd 60/00:00:00:33:a1/0f:00:ab:00:00/40 tag 14 ncq 1966080 in
                 res 41/40:00:48:40:a1/00:0f:ab:00:00/00 Emask 0x409 (media error) <F>
kernel: ata9.00: status: { DRDY ERR }
kernel: ata9.00: error: { UNC }
kernel: ata9.00: configured for UDMA/133
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
kernel: sd 7:0:2:0: [sdd] tag#0 Sense Key : 0x3 [current] [descriptor]
kernel: sd 7:0:2:0: [sdd] tag#0 ASC=0x11 ASCQ=0x4
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 33 00 00 0f 00 00
kernel: blk_update_request: I/O error, dev sdd, sector 2879471688
kernel: ata9: EH complete
kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
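
As a side note, the failing sector can be read straight out of the "res"
taskfile in the log above. The sketch below assumes the usual libata
layout of that string (status/error:count:LBA low:mid:high, then the
"high order byte" copy of the same five registers, then the device
register); the function name is made up for this example.

```python
def lba_from_ata_taskfile(taskfile: str) -> int:
    """Return the 48-bit LBA encoded in a libata 'cmd' or 'res' string."""
    low, high = taskfile.split("/")[1:3]
    # Assumed field order in each group: feature/error, count, LBA low, mid, high.
    _, _, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    _, _, hob_lbal, hob_lbam, hob_lbah = (int(b, 16) for b in high.split(":"))
    return (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
            | lbah << 16 | lbam << 8 | lbal)


# "res" line from the log: the sector the drive reported as unreadable.
print(lba_from_ata_taskfile("41/40:00:48:40:a1/00:0f:ab:00:00/00"))  # 2879471688
# "cmd" line from the log: the first sector of the 1966080-byte read.
print(hex(lba_from_ata_taskfile("60/00:00:00:33:a1/0f:00:ab:00:00/40")))  # 0xaba13300
```

The first value is the same sector that blk_update_request reports, and
the second matches the LBA bytes (ab a1 33 00) in the CDB line.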


but all other sectors can be copied fine

$ du -m ./damaged_file
6250 ./damaged_file

$ cp ./damaged_file /tmp/
cp: error reading ‘damaged_file’: Input/output error

$ du -m /tmp/damaged_file
4335    /tmp/damaged_file

cp copies the first part of the file correctly, and I verified that the
SHAs of both the start of the file (first 4336M) and the end of the file
(last 1890M) match the backup:

$ head -c 4336M ./damaged_file | sha256sum
e81b20bfa7358c9f5a0ed165bffe43185abc59e35246e52a7be1d43e6b7e040d  -
$ head -c 4337M ./damaged_file | sha256sum
head: error reading ‘./damaged_file’: Input/output error

$ tail -c 1890M ./damaged_file | sha256sum
941568f4b614077858cb8c8dd262bb431bf4c45eca936af728ecffc95619cb60  -
$ tail -c 1891M ./damaged_file  | sha256sum
tail: error reading ‘./damaged_file’: Input/output error
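
The 4336M and 1890M limits follow from where the two unreadable 4096-byte
blocks sit in the file. A small sketch of that arithmetic, using the
offsets from the BTRFS i/o error lines and the file size implied by the
ddrescue map, both shown further down in this message (readable_mib is
just an illustrative helper name):

```python
MIB = 1024 * 1024
BLOCK = 4096                             # "length 4096" in the BTRFS i/o error lines
FILE_SIZE = 0x1108C3000 + 0x76216000     # pos + size of the last ddrescue map entry
BAD_OFFSETS = [4546727936, 4572585984]   # "offset" values from the BTRFS i/o error lines


def readable_mib(file_size, bad_offsets, block=BLOCK):
    """Largest whole-MiB prefix and suffix of the file that avoid every bad block."""
    head = min(bad_offsets) // MIB
    tail = (file_size - (max(bad_offsets) + block)) // MIB
    return head, tail


print(readable_mib(FILE_SIZE, BAD_OFFSETS))  # (4336, 1890)
```

One MiB more in either direction reaches into a bad block, which is why
head -c 4337M and tail -c 1891M fail.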

dd can also copy almost the whole file, but with the noerror option it
leaves the unreadable regions out of the target file rather than filling
them with nulls, so this isn't good for recovery:

$ dd conv=noerror if=damaged_file of=/tmp/damaged_file
dd: error reading ‘damaged_file’: Input/output error
8880328+0 records in
8880328+0 records out
4546727936 bytes (4,5 GB) copied, 69,7282 s, 65,2 MB/s
dd: error reading ‘damaged_file’: Input/output error
8930824+0 records in
8930824+0 records out
4572581888 bytes (4,6 GB) copied, 113,648 s, 40,2 MB/s
12801720+0 records in
12801720+0 records out
6554480640 bytes (6,6 GB) copied, 223,212 s, 29,4 MB/s

$ du -m /tmp/damaged_file
6251     /tmp/damaged_file
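
The null-filled copy asked about earlier in the thread can be sketched in
a few lines of Python. This is only an illustration and no substitute for
a real rescue tool: copy_with_null_fill and the 4096-byte block size are
assumptions made for the example, and it does no retries or smaller
re-reads.

```python
import os


def copy_with_null_fill(src, dst, block=4096):
    """Copy src to dst block by block, writing zeros where a read fails.

    Returns the byte offsets of the blocks that could not be read in full.
    """
    bad = []
    size = os.path.getsize(src)
    with open(src, "rb", buffering=0) as fin, open(dst, "wb") as fout:
        for offset in range(0, size, block):
            want = min(block, size - offset)
            try:
                fin.seek(offset)
                data = fin.read(want)
            except OSError:
                data = b""
            if len(data) < want:
                bad.append(offset)
            # Pad failed or short reads so later data keeps its original offset.
            fout.write(data.ljust(want, b"\0"))
    return bad
```

Because every block is written at its original position, the target ends
up the same size as the source, and the returned offsets say which blocks
were replaced with nulls.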


The best and correct way to recover a file is to use ddrescue:

$ ddrescue ./damaged_file /tmp/damaged_file info.log
rescued:     6554 MB,  errsize:    8192 B,  current rate:        0 B/s
  ipos:     4572 MB,   errors:       2,    average rate:   43407 kB/s
  opos:     4572 MB, run time:    2.51 m,  successful read:      34 s ago
Finished
      pos        size  status
0x00000000  0x10F019000  +
0x10F019000  0x00001000  -
0x10F01A000  0x018A8000  +
0x1108C2000  0x00001000  -
0x1108C3000  0x76216000  +

$ du -m /tmp/damaged_file
6251    /tmp/damaged_file

So basically only about 8K bytes of this file are unrecoverable. Probably
some btrfs-aware tool could be created that would get even more data.
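
For reference, the 8K figure can be checked against the map printed by
ddrescue above. A minimal sketch, assuming only the three-column
"pos size status" lines where "-" marks a bad region (bad_regions is a
name made up for this example):

```python
MAP = """\
0x00000000  0x10F019000  +
0x10F019000  0x00001000  -
0x10F01A000  0x018A8000  +
0x1108C2000  0x00001000  -
0x1108C3000  0x76216000  +
"""


def bad_regions(map_text):
    """Return (offset, size) in bytes for every region marked '-' in the map."""
    regions = []
    for line in map_text.splitlines():
        pos, size, status = line.split()
        if status == "-":
            regions.append((int(pos, 16), int(size, 16)))
    return regions


regions = bad_regions(MAP)
print(regions)                            # [(4546727936, 4096), (4572585984, 4096)]
print(sum(size for _, size in regions))   # 8192
```

The two offsets are the same ones the kernel reports for this file in the
BTRFS i/o error messages.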

> There /is/, however, a command that can be used to either regenerate or
> zero-out the checksum tree.  See btrfs check --init-csum-tree.  Current
> versions recalculate the csums, older versions (btrfsck as that was
> before btrfs check) simply zeroed it out.  Then you can read the file
> despite bad checksums, tho you'll still get errors if the block
> physically cannot be read.
>

It seems you can't specify a path/file for it, and it's quite a
destructive action if you only want to get at the data of one specific
file.

I ran scrub a second time, and this time there aren't that many
uncorrectable errors and there are no csum_errors, so I think
--init-csum-tree is useless here.
Most likely the previous scrub got that many errors because it kept
going for a bit even though the disk didn't respond.

scrub status for 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
       scrub resumed at Mon Jul 13 22:24:43 2015 and finished after 02:47:28
       data_extents_scrubbed: 26357534
       tree_extents_scrubbed: 316780
       data_bytes_scrubbed: 1574584311808
       tree_bytes_scrubbed: 5190123520
       read_errors: 2
       csum_errors: 0
       verify_errors: 0
       no_csum: 89600
       csum_discards: 656214
       super_errors: 0
       malloc_errors: 0
       uncorrectable_errors: 2
       unverified_errors: 0
       corrected_errors: 0
       last_physical: 2590041112576

Device stats now also show I/O errors, which were 0 previously:

[/dev/sdd].write_io_errs   0
[/dev/sdd].read_io_errs    123
[/dev/sdd].flush_io_errs   0
[/dev/sdd].corruption_errs 0
[/dev/sdd].generation_errs 0

These are all the errors that came from the 2nd scrub; only 2 dead sectors:

kernel: BTRFS: i/o error at logical 7358423011328 on dev /dev/sdd, sector 2879471688, root 3034, inode 5619902, offset 4546727936, length 4096, links 1 (path: dir2/damaged_file)
kernel: BTRFS: bdev /dev/sdd errs: wr 0, rd 50, flush 0, corrupt 0, gen 0
kernel: BTRFS: unable to fixup (regular) error at logical 7358423011328 on dev /dev/sdd
kernel: BTRFS: i/o error at logical 7358448869376 on dev /dev/sdd, sector 2879522192, root 3034, inode 5619902, offset 4572585984, length 4096, links 1 (path: dir2/damaged_file)
kernel: BTRFS: bdev /dev/sdd errs: wr 0, rd 51, flush 0, corrupt 0, gen 0
kernel: BTRFS: unable to fixup (regular) error at logical 7358448869376 on dev /dev/sdd


> There's also btrfs restore, which works on the unmounted filesystem
> without actually writing to it, copying the files it can read to a new
> location, which of course has to be a filesystem with enough room to
> restore the files to, altho it's possible to tell restore to do only
> specific subdirs, for instance.
>

I tried restore for that file, but it's not as good as ddrescue because it
stopped on the error even with the --ignore-errors flag, and there seems
to be no option to continue and try more.

$ btrfs restore -i -x -m -v --path-regex "^/dir1(|/dir2(|/damaged_file))$" /dev/sdd ./
Restoring ./dir1
Restoring ./dir1/dir2
Restoring ./dir1/dir2/damaged_file
offset is 258048
offset is 212992
offset is 233472
offset is 217088
offset is 237568
Exhausted mirrors trying to read
Error copying data for ./dir1/dir2/damaged_file
Done searching /dir1/dir2/damaged_file
Done searching /dir1/dir2
Done searching /dir1
Done searching

$ du -m ./dir1/dir2/damaged_file
4296    ./dir1/dir2/damaged_file

You can see that it got only the first part of the file, similar to what a plain cp does.

> What I'd recommend depends on how complete and how recent your backup
> is.  If it's complete and recent enough, probably the easiest thing is to
> simply blow away the bad filesystem and start over, recovering from the
> backup to a new filesystem.

Actually, this time I have 100% complete and up-to-date backups of all
files, so I can freely experiment and practice real-world recovery, which
could be very useful.
So far it seems that if I hadn't had a backup, I would have lost only 8K
bytes.
Why recreate a new filesystem rather than just delete/replace the dying
disk? I will still check that all files are OK, but I don't really see
the need to recreate the filesystem if the files are fine.


By the way, I managed to crash btrfs-progs: I had scrub running with -B,
then Xorg crashed (not related to btrfs) and took down the scrub process.
Then I just resumed the scrub.
I've stripped symbols, so the stack trace is totally useless:

#0  0x0000000000418103 in ?? ()
#1  0x000000000040ee82 in main ()


Also, when I try restore from a different root tree, it crashes (this is
on a 2-disk RAID0):

# btrfs restore -v -t 74579968 /dev/sdk ./
parent transid verify failed on 74579968 wanted 135 found 132
parent transid verify failed on 74579968 wanted 135 found 132
parent transid verify failed on 74579968 wanted 135 found 132
parent transid verify failed on 74579968 wanted 135 found 132
Ignoring transid failure
volumes.c:1554: btrfs_chunk_readonly: Assertion `!ce` failed.
btrfs[0x44c6ce]
btrfs[0x44f426]
btrfs(btrfs_read_block_groups+0x23e)[0x4442de]
btrfs(btrfs_setup_all_roots+0x387)[0x43edd7]
btrfs[0x43f124]
btrfs(open_ctree_fs_info+0x43)[0x43f2b3]
btrfs(cmd_restore+0xb5b)[0x42e77b]
btrfs(main+0x82)[0x40ee82]
/usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f1090778790]
btrfs(_start+0x29)[0x40ef79]


Thanks for your reply :)


* Re: Disk "failed" while doing scrub
  2015-07-14  1:54   ` Dāvis Mosāns
@ 2015-07-14  6:26     ` Duncan
  0 siblings, 0 replies; 5+ messages in thread
From: Duncan @ 2015-07-14  6:26 UTC (permalink / raw)
  To: linux-btrfs

Dāvis Mosāns posted on Tue, 14 Jul 2015 04:54:27 +0300 as excerpted:

> 2015-07-13 11:12 GMT+03:00 Duncan <1i5t5.duncan@cox.net>:
>> You say five disk, but nowhere in your post do you mention what raid
>> mode you were using, neither do you post btrfs filesystem show and
>> btrfs filesystem df, as suggested on the wiki and which list that
>> information.
> 
> Sorry, I forgot. I'm running Arch Linux 4.0.7, with btrfs-progs v4.1
> Using RAID1 for metadata and single for data, with features
> big_metadata, extended_iref, mixed_backref, no_holes, skinny_metadata
> and mounted with noatime,compress=zlib,space_cache,autodefrag

Thanks.  FWIW, pretty similar here, but running gentoo, now with btrfs-
progs v4.1.1 and the mainline 4.2-rc1+ kernel.

BTW, note that space_cache has been the default for quite some time, 
now.  I've never actually manually mounted with space_cache on any of my 
filesystems over several years, now, yet they all report it when I check 
/proc/mounts, etc.  So if you're adding that manually, you can kill that 
option and save the commandline/fstab space. =:^)

> Label: 'Data'  uuid: 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
>        Total devices 5 FS bytes used 7.16TiB
>        devid    1 size 2.73TiB used 2.35TiB path /dev/sdc
>        devid    2 size 1.82TiB used 1.44TiB path /dev/sdd
>        devid    3 size 1.82TiB used 1.44TiB path /dev/sde
>        devid    4 size 1.82TiB used 1.44TiB path /dev/sdg
>        devid    5 size 931.51GiB used 539.01GiB path /dev/sdh
> 
> Data, single: total=7.15TiB, used=7.15TiB
> System, RAID1: total=8.00MiB, used=784.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, RAID1: total=16.00GiB, used=14.37GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B

And note that you can easily and quickly remove those empty single-mode 
system and metadata chunks, which are an artifact of the way mkfs.btrfs 
works, using balance filters.

btrfs balance start -mprofiles=single /mntpoint

... should do it.  They're actually working on mkfs.btrfs patches to fix 
it not to do that, right now.  There are active patch and testing threads 
discussing it.  Hopefully for btrfs-progs v4.2.  (4.1.1 has the patches 
for single-device and prep work for multi-device, according to the 
changelog.)

>>> Because filesystem still mounts, I assume I should do "btrfs device
>>> delete /dev/sdd /mntpoint" and then restore damaged files from backup.
>>
>> You can try a replace, but with a failing drive still connected, people
>> report mixed results.  It's likely to fail as it can't read certain
>> blocks to transfer them to the new device.
> 
> As I understand, device delete will copy data from that disk and
> distribute it across the rest of the disks, while btrfs replace will copy
> to a new disk which must be at least the size of the disk I'm replacing.

Sorry.  You wrote delete, I read replace.  How'd I do that? =:^(

You are absolutely correct.  Delete would be better here.

I guess I had just been reading a thread discussing the problems I 
mentioned with replace, and saw what I expected to see, not what you 
actually wrote.

>> There's no such partial-file with null-fill tools shipped just yet.

> From the journal I have only 14 files mentioned where errors occurred. Now
> 13 of those files don't throw any errors and their SHA hashes match my
> backups, so they're fine.

Good.  I was going on the assumption that the questionable device was in 
much worse shape than that.

> And actually btrfs does allow copying/reading that one damaged file; I only
> get an I/O error when trying to read data from those broken sectors

Good, and good to know.  Thanks. =:^)

> best and correct way to recover a file is using ddrescue

I was just going to mention ddrescue. =:^)

> $ du -m /tmp/damaged_file
> 6251    /tmp/damaged_file
> 
> so basically only about 8 KiB is unrecoverable from this file.
> A tool that knows about btrfs internals could probably recover even more
> data.
> 
>> There /is/, however, a command that can be used to either regenerate or
>> zero-out the checksum tree.  See btrfs check --init-csum-tree.
>>
> It seems you can't specify a path/file for it, and it's quite a destructive
> action if you only care about one specific file.

Yes.  It's whole-filesystem-all-or-nothing, unfortunately. =:^(

> I did scrub second time and this time there aren't that many
> uncorrectable errors and also there's no csum_errors so --init-csum-tree
> is useless here I think.

Agreed.

> Most likely previously scrub got that many errors because it still
> continued for a bit even if disk didn't respond.

Yes.

> scrub status [...]
>        read_errors: 2
>        csum_errors: 0
>        verify_errors: 0
>        no_csum: 89600
>        csum_discards: 656214
>        super_errors: 0
>        malloc_errors: 0
>        uncorrectable_errors: 2
>        unverified_errors: 0
>        corrected_errors: 0
>        last_physical: 2590041112576

OK, that matches up with 8 KiB bad, since blocks are 4 KiB and there's 
two uncorrectable errors.  With the scrub now reporting no further errors 
and the two it does report accounted for, nothing else should be 
affected. =:^)
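
As a quick sanity check of that arithmetic, here is a minimal sketch.  It
assumes each uncorrectable error covers exactly one 4 KiB block, and
bad_bytes is just a hypothetical helper name, not part of btrfs-progs.

```shell
#!/bin/sh
# Assumption: one uncorrectable scrub error == one filesystem block.
# bad_bytes is a hypothetical helper, not a btrfs-progs command.
bad_bytes() {
    errors=$1
    block_size=${2:-4096}   # 4 KiB blocks unless told otherwise
    echo $(( errors * block_size ))
}

bad_bytes 2   # prints 8192, i.e. 8 KiB
```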

> also now, there's i/o errors from device stats which were 0 previously

Good.  It's recording them now.

>> There's also btrfs restore, which works on the unmounted filesystem
>> without actually writing to it, copying the files it can read to a new
>> location, which of course has to be a filesystem with enough room to
>> restore the files to, altho it's possible to tell restore to do only
>> specific subdirs, for instance.
>>
>>
> I tried restore for that file, but it's not as good as ddrescue because
> it stopped on error even with the --ignore-errors flag, and there doesn't
> seem to be an option to continue and try more.

Yes.  Its primary use is when the filesystem can't be mounted and 
backups aren't available or at least aren't current.  The fact that it 
works without writing to the filesystem in question is also nice, as that 
lets people grab the files they can while they know they can, before 
trying potential fixes that might end up making things worse instead of 
better.

Since you could mount, and the questionable device turned out not as bad 
as it first seemed, actually mounting and working with the mounted 
filesystem is the better choice.  I was just throwing restore out as an 
available tool, because again, I thought the iffy device could fail at 
any time, leaving you grasping at straws.

>> What I'd recommend depends on how complete and how recent your backup
>> is.  If it's complete and recent enough, probably the easiest thing is
>> to simply blow away the bad filesystem and start over, recovering from
>> the backup to a new filesystem.
> 
> Actually this time I've 100% complete and up-to-date backups of all
> files so I can freely experiment and try practicing real world recovery
> which could be very useful.

=:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Disk "failed" while doing scrub
  2015-07-13  6:26 Disk "failed" while doing scrub Dāvis Mosāns
  2015-07-13  8:12 ` Duncan
@ 2015-08-21  4:16 ` Dāvis Mosāns
  1 sibling, 0 replies; 5+ messages in thread
From: Dāvis Mosāns @ 2015-08-21  4:16 UTC (permalink / raw)
  To: linux-btrfs

2015-07-13 9:26 GMT+03:00 Dāvis Mosāns <davispuh@gmail.com>:
> also is there some easy way to locate those unreadable sectors and
> rewrite them so the hdd relocates them?
>

Only now noticed that scrub does tell it :)

> kernel: BTRFS: i/o error at logical 7358423011328 on dev /dev/sdd,
> sector 2879471688, root 3034, inode 5619902, offset 4546727936, length
> 4096, links 1 (path: dir2/damaged_file)
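
A minimal sketch for pulling the device, sector and path out of such lines,
assuming each message is on a single line in the journal (it is only
wrapped here by the mail client). btrfs_io_errors is a hypothetical helper
name. Since the kernel rate-limits these messages ("callbacks suppressed"),
the resulting list may be incomplete.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "device sector path" for every BTRFS i/o error line that names a path.
btrfs_io_errors() {
    sed -n 's|.*BTRFS: i/o error at logical [0-9]* on dev \([^,]*\), sector \([0-9]*\),.*(path: \(.*\))$|\1 \2 \3|p'
}

# Demo with the line from this message, joined back into one line.
printf '%s\n' 'kernel: BTRFS: i/o error at logical 7358423011328 on dev /dev/sdd, sector 2879471688, root 3034, inode 5619902, offset 4546727936, length 4096, links 1 (path: dir2/damaged_file)' |
    btrfs_io_errors
# prints: /dev/sdd 2879471688 dir2/damaged_file
```

On a real system the input would come from the kernel log instead, for
example "journalctl -k | btrfs_io_errors | sort -u".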

So for each broken sector I did
$ dd if=/dev/zero of=/dev/sdd seek=359933961 count=1 bs=4096

Note that dd's seek takes a block number in units of bs (4096 bytes in
my case), while the sector number from scrub is in 512-byte units, so
2879471688 / 8 = 359933961
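
The sector-to-block conversion above can be sketched as a small shell
helper. This is only an illustration: sector_to_dd_block is a hypothetical
name, and the 512-byte sector size and 4096-byte bs are the values from my
case, not something to rely on for other disks.

```shell
#!/bin/sh
# Hypothetical helper: convert a sector number (512-byte units by default)
# into the block number that dd's seek= expects for a given bs.
sector_to_dd_block() {
    sector=$1
    sector_size=${2:-512}
    block_size=${3:-4096}
    bytes=$(( sector * sector_size ))
    # dd can only seek to whole blocks, so refuse unaligned offsets.
    if [ $(( bytes % block_size )) -ne 0 ]; then
        echo "sector $sector is not aligned to $block_size-byte blocks" >&2
        return 1
    fi
    echo $(( bytes / block_size ))
}

sector_to_dd_block 2879471688   # prints 359933961
```

Refusing unaligned sectors is deliberate: rounding down would make dd
overwrite a block that starts before the bad sector.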

Now the disk was able to mark those sectors as dead and the self-test
passes; it also doesn't show any uncorrectable sectors anymore

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      3173         -
# 2  Short offline       Completed without error       00%      3169         -
# 3  Short offline       Completed: read failure       90%      3139         2879471688

Then I tried to copy that same file

$ cp damaged_file /tmp/damaged_file
cp: error reading damaged_file: Input/output error

$ ddrescue damaged_file /tmp/damaged_file
GNU ddrescue 1.19
Press Ctrl-C to interrupt
rescued:     6554 MB,  errsize:    8192 B,  current rate:   56082 kB/s
  ipos:     4572 MB,   errors:       2,    average rate:   99310 kB/s
  opos:     4572 MB, run time:    1.10 m,  successful read:       0 s ago
Finished

and the result is the same: cp stops on the first error, but ddrescue is
able to get everything except those 8 KiB. The only difference is that I
now get a csum error instead of an I/O error :)

kernel: BTRFS warning (device sdh): csum failed ino 5619902 off
4546727936 csum 2566472073 expected csum

when running scrub

scrub device /dev/sdd (id 2) done
       scrub started at Thu Jul 17 13:58:06 2015 and finished after 02:48:05
       data_extents_scrubbed: 26349742
       tree_extents_scrubbed: 316806
       data_bytes_scrubbed: 1574102949888
       tree_bytes_scrubbed: 5190549504
       read_errors: 0
       csum_errors: 2
       verify_errors: 0
       no_csum: 89600
       csum_discards: 656179
       super_errors: 0
       malloc_errors: 0
       uncorrectable_errors: 2
       unverified_errors: 0
       corrected_errors: 0
       last_physical: 1579475271680
ERROR: There are uncorrectable errors.


Now to fix the csum errors I could use btrfs check --init-csum-tree, but I
think that's bad, as it would basically force all files to be treated as
valid even if they are corrupted. So I just copied the file from backup,
overwriting the damaged one.

Then, after running scrub again, I can see that there are no errors anymore

scrub status for 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
       scrub started at Fri Jul 17 19:22:45 2015 and finished after 02:47:58
       data_extents_scrubbed: 26347511
       tree_extents_scrubbed: 317192
       data_bytes_scrubbed: 1573973471232
       tree_bytes_scrubbed: 5196873728
       read_errors: 0
       csum_errors: 0
       verify_errors: 0
       no_csum: 89472
       csum_discards: 656152
       super_errors: 0
       malloc_errors: 0
       uncorrectable_errors: 0
       unverified_errors: 0
       corrected_errors: 0
       last_physical: 1580549013504

Next I did
$ btrfs device delete /dev/sdd /mnt/Data

Which completed successfully, only there seems to be a bug: it shows incorrect
unallocated space for the device while the delete is in progress
$ btrfs filesystem usage /mnt/Data

Unallocated:
  /dev/sdc       11.49GiB
  /dev/sdd       16.00EiB   // disk isn't that big...
  /dev/sde       12.02GiB
  /dev/sdg       12.02GiB
  /dev/sdh       11.48GiB
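
One plausible explanation (my assumption, not confirmed anywhere in this
thread) is that the unallocated value briefly goes negative and gets printed
as an unsigned 64-bit number. A tiny sketch of that effect, with
wrapped_as_eib as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical illustration: a byte count that is short by "deficit" bytes,
# stored in an unsigned 64-bit field, wraps to 2^64 - deficit.
# 2^64 bytes is 16 EiB, so any small deficit rounds to 16.00EiB.
wrapped_as_eib() {
    awk -v deficit="$1" 'BEGIN { printf "%.2fEiB\n", (2^64 - deficit) / 2^60 }'
}

wrapped_as_eib 1073741824   # a 1 GiB deficit prints 16.00EiB
```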

Then I tested that disk with badblocks and it didn't find anything so I just
added it back with
$ btrfs device add /dev/sdd /mnt/Data
and balance
$ btrfs balance start /mnt/Data

And just to be completely sure everything is ok

$ btrfs check --check-data-csum /dev/sdc
Checking filesystem on /dev/sdc
UUID: 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 7931796849809 bytes used err is 0
total csum bytes: 7731179932
total tree bytes: 15068594176
total fs tree bytes: 5814714368
total extent tree bytes: 860798976
btree space waste bytes: 1691112689
file data blocks allocated: 7918108438528
referenced 8212185219072


That's all. There wasn't any need to recreate the filesystem from scratch; I
just recovered 1 file from backup, and I even verified all files against the
backup with
rsync --checksum --dry-run
to confirm that everything is indeed correct.
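
For spot-checking a single file instead of the whole tree, a hash
comparison can be sketched like this. It is not what rsync does internally,
just a plain SHA-256 comparison; file_sha256 and same_as_backup are
hypothetical helper names, and the paths in the usage comment are
placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print only the SHA-256 hex digest of one file.
file_sha256() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# Hypothetical helper: succeed only if both files could be read completely
# and hash to the same digest. An unreadable file gives an empty digest,
# which is treated as a mismatch rather than as "equal".
same_as_backup() {
    a=$(file_sha256 "$1")
    b=$(file_sha256 "$2")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}

# Usage (placeholder paths):
#   same_as_backup /mnt/Data/dir2/damaged_file /backup/dir2/damaged_file \
#       && echo "matches backup" || echo "differs or unreadable"
```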

PS. Sorry for such a delayed follow-up.


end of thread, other threads:[~2015-08-21  4:16 UTC | newest]

Thread overview: 5+ messages
2015-07-13  6:26 Disk "failed" while doing scrub Dāvis Mosāns
2015-07-13  8:12 ` Duncan
2015-07-14  1:54   ` Dāvis Mosāns
2015-07-14  6:26     ` Duncan
2015-08-21  4:16 ` Dāvis Mosāns
