* recovering failed raid5
@ 2016-10-27 15:06 Alexander Shenkin
  2016-10-27 16:04 ` Andreas Klauer
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Alexander Shenkin @ 2016-10-27 15:06 UTC (permalink / raw)
  To: linux-raid

Hello all,

A RAID newbie here - apologies in advance for any bonehead questions.
I have four 3TB disks that participate in three raid arrays.  md2 is
failing.  I'm hoping someone here can point me in the right direction
to recover the array and, if possible, avoid data loss.

md2: raid5 mounted on /, via sd[abcd]3
md3: raid10 used as swap, via sd[abcd]4
md0: raid1 mounted on /boot, via sd[abcd]1
sd[abcd]2 are used for bios_grub

My sdb was recently reporting problems.  Instead of second-guessing
those problems, I just got a new disk, replaced it, and added it to
the arrays.  /dev/md3 synced the new device in just fine, but md0 and
md2 were showing it as a spare (S).  When I tried to remove and re-add
sdb3 to md2, sdc3 started acting up, leading to this error message when
adding to the md2 array:

username@machinename:~$ sudo mdadm --manage /dev/md2 --add /dev/sdb3
sudo: unable to open /var/lib/sudo/username/1: No such file or directory
mdadm: /dev/md2 has failed so using --add cannot work and might destroy
mdadm: data on /dev/sdb3.  You should stop the array and re-assemble it.

Following the wiki
(https://raid.wiki.kernel.org/index.php/Linux_Raid), I've included
relevant information below.  My system has marked / as read-only at
this point, so I'm not able to install lsdrv.

The Event counts in the raid5 array (md2) are all quite similar:
         Events : 53547
         Events : 53539
         Events : 53547

Perhaps that means I can try "mdadm --assemble --force /dev/md2
/dev/sd[acd]3"?  I wanted to check with the experts, however, before
moving forward.
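
If it helps to see that spelled out, the rough sequence I have in mind
(run from a rescue/live environment, since md2 holds / - and please
correct me if this is off base) would be:

  mdadm --stop /dev/md2
  mdadm --assemble --force /dev/md2 /dev/sda3 /dev/sdc3 /dev/sdd3

i.e. leaving the new sdb3 out until the array is back up, and only then
re-adding it.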

Thanks,
Allie


table of contents:
1) mdadm.conf
2) smartctl (disabled on drives - can enable once back up.  should I?)
3) mdadm --examine
4) cat /proc/mdstat
5) parted
6) dmesg output is available here: http://pastebin.com/YRz2Lxr1

################## mdadm.conf ###############################

root@machinename:/home/username# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR machinename@shenkin.org

# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.2 UUID=437e4abb:c7ac46f1:ef8b2976:94921060 name=arrayname:0
   spares=2
ARRAY /dev/md/2 metadata=1.2 UUID=6426779d:5a08badf:9958e59e:2ded49d5 name=arrayname:2
ARRAY /dev/md/3 metadata=1.2 UUID=dd78a43e:92699c27:3dc5489d:91d93bb2 name=arrayname:3

# This file was auto-generated on Sat, 12 Dec 2015 09:37:39 +0000
# by mkconf $Id$


########################## smartctl ##########################

note: SMART only enabled after problems started cropping up.

root@machinename:/home/username# smartctl --xall /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-9YN166
Serial Number:    Z1F13FBA
LU WWN Device Id: 5 000c50 04e444ab1
Firmware Version: CC4B
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Oct 26 11:14:12 2016 BST

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Write SCT (Get) XXX Error Recovery Control Command failed: scsi error
aborted command
Wt Cache Reorder: N/A

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  592) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection
on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 335) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   114   099   006    -    81641176
  3 Spin_Up_Time            PO----   092   092   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    32
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   072   060   030    -    17221506
  9 Power_On_Hours          -O--CK   092   092   000    -    7180
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    49
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   100   000    -    0 0 0
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   051   049   045    -    49 (Min/Max 48/50)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    43
193 Load_Cycle_Count        -O--CK   100   100   000    -    70
194 Temperature_Celsius     -O---K   049   051   000    -    49 (0 15 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
240 Head_Flying_Hours       ------   100   253   000    - 1783h+53m+16.383s
241 Total_LBAs_Written      ------   100   253   000    -    837415739719
242 Total_LBAs_Read         ------   100   253   000    -    121956855490474
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS      20  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5067  Device vendor specific log
0xbd       GPL     VS     512  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS

root@machinename:/home/username# smartctl --xall /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1ER166
Serial Number:    Z502SGLG
LU WWN Device Id: 5 000c50 090cfc6d8
Firmware Version: CC26
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Oct 26 11:14:16 2016 BST

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Write SCT (Get) XXX Error Recovery Control Command failed: scsi error
aborted command
Wt Cache Reorder: N/A

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   80) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection
on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 318) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   100   100   006    -    155392
  3 Spin_Up_Time            PO----   100   100   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    1
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  7 Seek_Error_Rate         POSR--   100   253   030    -    0
  9 Power_On_Hours          -O--CK   100   100   000    -    50
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    1
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   100   000    -    0 0 0
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   050   050   045    -    50 (Min/Max 17/50)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    1
193 Load_Cycle_Count        -O--CK   100   100   000    -    1
194 Temperature_Celsius     -O---K   050   050   000    -    50 (0 17 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
240 Head_Flying_Hours       ------   100   253   000    -    50h+33m+15.470s
241 Total_LBAs_Written      ------   100   253   000    -    15979162
242 Total_LBAs_Read         ------   100   253   000    -    19700
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS     129  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5176  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL,SL  VS      10  Device vendor specific log
0xc3       GPL,SL  VS       8  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS

root@machinename:/home/username# smartctl --xall /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-1CH166
Serial Number:    W1F1N909
LU WWN Device Id: 5 000c50 05ce3c3a2
Firmware Version: CC24
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Oct 26 11:14:18 2016 BST

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Write SCT (Get) XXX Error Recovery Control Command failed: scsi error
aborted command
Wt Cache Reorder: N/A

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  592) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection
on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 324) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   107   099   006    -    13439162
  3 Spin_Up_Time            PO----   093   093   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    30
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  7 Seek_Error_Rate         POSR--   060   060   030    -    42962048780
  9 Power_On_Hours          -O--CK   092   092   000    -    7356
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    47
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   098   098   000    -    2
188 Command_Timeout         -O--CK   100   100   000    -    0 0 0
189 High_Fly_Writes         -O-RCK   098   098   000    -    2
190 Airflow_Temperature_Cel -O---K   049   049   045    -    51 (Min/Max 47/51)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    42
193 Load_Cycle_Count        -O--CK   100   100   000    -    89
194 Temperature_Celsius     -O---K   051   051   000    -    51 (0 12 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    8
198 Offline_Uncorrectable   ----C-   100   100   000    -    8
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
240 Head_Flying_Hours       ------   100   253   000    - 1793h+59m+25.776s
241 Total_LBAs_Written      ------   100   253   000    -    5375768743
242 Total_LBAs_Read         ------   100   253   000    -    80575126377
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS     129  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5176  Device vendor specific log
0xbd       GPL     VS     512  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL,SL  VS      10  Device vendor specific log
0xc4       GPL,SL  VS       5  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
Device Error Count: 2
        CR     = Command Register
        FEATR  = Features Register
        COUNT  = Count (was: Sector Count) Register
        LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
        LH     = LBA High (was: Cylinder High) Register    ]   LBA
        LM     = LBA Mid (was: Cylinder Low) Register      ] Register
        LL     = LBA Low (was: Sector Number) Register     ]
        DV     = Device (was: Device/Head) Register
        DC     = Device Control Register
        ER     = Error register
        ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 [1] occurred at disk power-on lifetime: 7306 hours (304 days + 10 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 00 3f ad 38 00 00  Error: UNC at LBA = 0x003fad38 = 4173112

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 00 38 00 00 00 3f af 30 40 00     00:13:13.200  READ FPDMA QUEUED
  60 00 00 00 80 00 00 00 3f ae b0 40 00     00:13:13.200  READ FPDMA QUEUED
  60 00 00 00 80 00 00 00 3f ae 30 40 00     00:13:13.200  READ FPDMA QUEUED
  60 00 00 00 80 00 00 00 3f ad b0 40 00     00:13:13.200  READ FPDMA QUEUED
  60 00 00 04 c0 00 00 00 3f b0 00 40 00     00:13:13.200  READ FPDMA QUEUED

Error 1 [0] occurred at disk power-on lifetime: 7306 hours (304 days + 10 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 00 3f ad 38 00 00  Error: UNC at LBA = 0x003fad38 = 4173112

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 02 a8 00 00 00 3f ac c0 40 00     00:13:09.845  READ FPDMA QUEUED
  60 00 00 04 f0 00 00 00 3f a7 d0 40 00     00:13:09.842  READ FPDMA QUEUED
  60 00 00 00 b8 00 00 00 3f a7 18 40 00     00:13:09.812  READ FPDMA QUEUED
  60 00 00 05 40 00 00 00 3f a1 d8 40 00     00:13:09.812  READ FPDMA QUEUED
  60 00 00 00 08 00 00 00 3f a1 c8 40 00     00:13:09.812  READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS

root@machinename:/home/username# smartctl --xall /dev/sdd
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST3000DM001-9YN166
Serial Number:    S1F0HLY4
LU WWN Device Id: 5 000c50 0513b85c1
Firmware Version: CC9F
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Oct 26 11:14:19 2016 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Write SCT (Get) XXX Error Recovery Control Command failed: scsi error
aborted command
Wt Cache Reorder: N/A

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  575) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection
on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 331) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3081) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   114   099   006    -    76568208
  3 Spin_Up_Time            PO----   092   092   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    37
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    0
  7 Seek_Error_Rate         POSR--   071   060   030    -    13538199
  9 Power_On_Hours          -O--CK   092   092   000    -    7083
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    54
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   053   053   000    -    47
188 Command_Timeout         -O--CK   100   100   000    -    0 0 0
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   058   058   045    -    42 (Min/Max 41/42)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    45
193 Load_Cycle_Count        -O--CK   100   100   000    -    77
194 Temperature_Celsius     -O---K   042   042   000    -    42 (0 14 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   200   000    -    0
240 Head_Flying_Hours       ------   100   253   000    - 1775h+53m+38.823s
241 Total_LBAs_Written      ------   100   253   000    -    828557670290
242 Total_LBAs_Read         ------   100   253   000    -    52298143648302
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS      20  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5067  Device vendor specific log
0xbd       GPL     VS     512  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
Device Error Count: 47 (device log contains only the most recent 20 errors)
        CR     = Command Register
        FEATR  = Features Register
        COUNT  = Count (was: Sector Count) Register
        LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
        LH     = LBA High (was: Cylinder High) Register    ]   LBA
        LM     = LBA Mid (was: Cylinder Low) Register      ] Register
        LL     = LBA Low (was: Sector Number) Register     ]
        DV     = Device (was: Device/Head) Register
        DC     = Device Control Register
        ER     = Error register
        ST     = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 47 [6] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df c2 c0 00 00  Error: UNC at LBA = 0xf4dfc2c0 = 4108305088

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 00 08 00 00 cf 15 5c d0 40 00 21d+03:57:37.019  READ FPDMA QUEUED
  61 00 00 00 08 00 00 cd 15 33 d8 40 00 21d+03:57:37.018  WRITE FPDMA QUEUED
  61 00 00 00 08 00 00 cd 15 33 a8 40 00 21d+03:57:37.018  WRITE FPDMA QUEUED
  60 00 00 00 28 00 00 ae 41 4c a0 40 00 21d+03:57:37.018  READ FPDMA QUEUED
  60 00 00 00 20 00 00 f4 df c4 40 40 00 21d+03:57:37.018  READ FPDMA QUEUED

Error 46 [5] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df c2 c0 00 00  Error: UNC at LBA = 0xf4dfc2c0 = 4108305088

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 00 80 00 00 f4 df b0 a0 40 00 21d+03:57:33.737  READ FPDMA QUEUED
  61 00 00 00 80 00 00 f4 df b0 a0 40 00 21d+03:57:33.736  WRITE FPDMA QUEUED
  60 00 00 02 38 00 00 f4 df c4 60 40 00 21d+03:57:33.736  READ FPDMA QUEUED
  60 00 00 05 40 00 00 f4 df bf 20 40 00 21d+03:57:33.736  READ FPDMA QUEUED
  61 00 00 00 80 00 00 f4 df b0 a0 40 00 21d+03:57:33.734  WRITE FPDMA QUEUED

Error 45 [4] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df b0 a0 00 00  Error: UNC at LBA = 0xf4dfb0a0 = 4108300448

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 03 d0 00 00 f4 df b7 10 40 00 21d+03:57:30.735  READ FPDMA QUEUED
  60 00 00 00 30 00 00 f4 df b4 20 40 00 21d+03:57:30.733  READ FPDMA QUEUED
  60 00 00 00 78 00 00 ae 41 4c 20 40 00 21d+03:57:30.733  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df b3 a0 40 00 21d+03:57:30.732  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df b3 20 40 00 21d+03:57:30.732  READ FPDMA QUEUED

Error 44 [3] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df b0 a0 00 00  Error: UNC at LBA = 0xf4dfb0a0 = 4108300448

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 00 a8 00 00 f4 df b4 50 40 00 21d+03:57:27.426  READ FPDMA QUEUED
  61 00 00 00 50 00 00 cd 15 33 88 40 00 21d+03:57:27.403  WRITE FPDMA QUEUED
  60 00 00 02 18 00 00 f4 df ac f8 40 00 21d+03:57:27.388  READ FPDMA QUEUED
  60 00 00 05 40 00 00 f4 df a7 b8 40 00 21d+03:57:27.387  READ FPDMA QUEUED
  60 00 00 00 58 00 00 f4 df 95 68 40 00 21d+03:57:27.384  READ FPDMA QUEUED

Error 43 [2] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df 95 68 00 00  Error: UNC at LBA = 0xf4df9568 = 4108293480

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 00 80 00 00 f4 df 91 50 40 00 21d+03:57:24.175  READ FPDMA QUEUED
  61 00 00 00 80 00 00 f4 df 91 50 40 00 21d+03:57:24.175  WRITE FPDMA QUEUED
  60 00 00 01 c0 00 00 f4 df 9e 78 40 00 21d+03:57:24.175  READ FPDMA QUEUED
  60 00 00 05 40 00 00 f4 df 99 38 40 00 21d+03:57:24.174  READ FPDMA QUEUED
  61 00 00 00 80 00 00 f4 df 91 50 40 00 21d+03:57:24.173  WRITE FPDMA QUEUED

Error 42 [1] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df 95 68 00 00  Error: UNC at LBA = 0xf4df9568 = 4108293480

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 01 00 00 00 f4 df 94 c0 40 00 21d+03:57:21.120  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df 91 d0 40 00 21d+03:57:21.119  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df 92 50 40 00 21d+03:57:21.119  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df 92 d0 40 00 21d+03:57:21.119  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df 93 50 40 00 21d+03:57:21.119  READ FPDMA QUEUED

Error 41 [0] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df 91 50 00 00  Error: UNC at LBA = 0xf4df9150 = 4108292432

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 03 78 00 00 f4 df 95 c0 40 00 21d+03:57:18.237  READ FPDMA QUEUED
  60 00 00 00 70 00 00 f4 df 94 50 40 00 21d+03:57:18.235  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df 93 d0 40 00 21d+03:57:18.235  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df 93 50 40 00 21d+03:57:18.235  READ FPDMA QUEUED
  60 00 00 00 80 00 00 f4 df 92 d0 40 00 21d+03:57:18.235  READ FPDMA QUEUED

Error 40 [19] occurred at disk power-on lifetime: 4698 hours (195 days + 18 hours)
  When the command that caused the error occurred, the device was
active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 f4 df 91 50 00 00  Error: UNC at LBA = 0xf4df9150 = 4108292432

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
  60 00 00 00 08 00 00 ca 32 4a 70 40 00 21d+03:57:15.237  READ FPDMA QUEUED
  60 00 00 00 08 00 00 ca 32 4a 40 40 00 21d+03:57:15.237  READ FPDMA QUEUED
  60 00 00 00 20 00 00 ca 31 9e a0 40 00 21d+03:57:15.237  READ FPDMA QUEUED
  60 00 00 01 00 00 00 f4 df 94 c0 40 00 21d+03:57:15.236  READ FPDMA QUEUED
  60 00 00 05 40 00 00 f4 df 8f 80 40 00 21d+03:57:15.236  READ FPDMA QUEUED

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS

root@machinename:/home/username#


################### mdadm --examine ###########################

/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 437e4abb:c7ac46f1:ef8b2976:94921060
           Name : arrayname:0
  Creation Time : Mon Dec  7 08:31:31 2015
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 1950656 (1905.26 MiB 1997.47 MB)
  Used Dev Size : 3901312 (1905.26 MiB 1997.47 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 9cb1890b:ad675b3b:7517467f:0780ec8e

    Update Time : Mon Oct 24 09:19:37 2016
       Checksum : 65eadceb - correct
         Events : 92


   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing)
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 437e4abb:c7ac46f1:ef8b2976:94921060
           Name : arrayname:0
  Creation Time : Mon Dec  7 08:31:31 2015
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 1950656 (1905.26 MiB 1997.47 MB)
  Used Dev Size : 3901312 (1905.26 MiB 1997.47 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 7acd8648:d34b69f3:f85b524d:dedb6b19

    Update Time : Mon Oct 24 08:53:41 2016
       Checksum : b3817738 - correct
         Events : 91


   Device Role : spare
   Array State : AA ('A' == active, '.' == missing)
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 437e4abb:c7ac46f1:ef8b2976:94921060
           Name : arrayname:0
  Creation Time : Mon Dec  7 08:31:31 2015
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 1950656 (1905.26 MiB 1997.47 MB)
  Used Dev Size : 3901312 (1905.26 MiB 1997.47 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f162eae5:19f8926b:f5bb6a2a:8adbbefd

    Update Time : Mon Oct 24 08:53:41 2016
       Checksum : 8a7a189d - correct
         Events : 91


   Device Role : spare
   Array State : AA ('A' == active, '.' == missing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 437e4abb:c7ac46f1:ef8b2976:94921060
           Name : arrayname:0
  Creation Time : Mon Dec  7 08:31:31 2015
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3901440 (1905.32 MiB 1997.54 MB)
     Array Size : 1950656 (1905.26 MiB 1997.47 MB)
  Used Dev Size : 3901312 (1905.26 MiB 1997.47 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 6b10139e:ab5da9d0:665b17ee:daf63719

    Update Time : Mon Oct 24 09:19:37 2016
       Checksum : 86deec80 - correct
         Events : 92


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing)
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6426779d:5a08badf:9958e59e:2ded49d5
           Name : arrayname:2
  Creation Time : Mon Dec  7 08:32:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 8760565248 (8354.73 GiB 8970.82 GB)
  Used Dev Size : 5840376832 (2784.91 GiB 2990.27 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : c6136963:6c04bbd8:436bda87:2ad19433

    Update Time : Mon Oct 24 09:02:52 2016
       Checksum : 2ec5936f - correct
         Events : 53547

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : A..A ('A' == active, '.' == missing)
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6426779d:5a08badf:9958e59e:2ded49d5
           Name : arrayname:2
  Creation Time : Mon Dec  7 08:32:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 8760565248 (8354.73 GiB 8970.82 GB)
  Used Dev Size : 5840376832 (2784.91 GiB 2990.27 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 657c6955:1cdcfdaf:eb6c2aed:f5a4ed1f

    Update Time : Mon Oct 24 08:53:57 2016
       Checksum : 49afa71b - correct
         Events : 53539

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6426779d:5a08badf:9958e59e:2ded49d5
           Name : arrayname:2
  Creation Time : Mon Dec  7 08:32:42 2015
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 5840377856 (2784.91 GiB 2990.27 GB)
     Array Size : 8760565248 (8354.73 GiB 8970.82 GB)
  Used Dev Size : 5840376832 (2784.91 GiB 2990.27 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1f7d54bf:e8b8a81e:898d9255:d2683cd7

    Update Time : Mon Oct 24 09:02:52 2016
       Checksum : 41fe73ed - correct
         Events : 53547

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : A..A ('A' == active, '.' == missing)
/dev/sda4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dd78a43e:92699c27:3dc5489d:91d93bb2
           Name : arrayname:3
  Creation Time : Mon Dec  7 08:33:17 2015
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 15976448 (7.62 GiB 8.18 GB)
     Array Size : 15975424 (15.24 GiB 16.36 GB)
  Used Dev Size : 15975424 (7.62 GiB 8.18 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : cba9cc59:55cccd59:aecb8e98:4de1814a

    Update Time : Mon Oct 24 08:56:30 2016
       Checksum : dbe91e70 - correct
         Events : 75

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdb4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dd78a43e:92699c27:3dc5489d:91d93bb2
           Name : arrayname:3
  Creation Time : Mon Dec  7 08:33:17 2015
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 15976448 (7.62 GiB 8.18 GB)
     Array Size : 15975424 (15.24 GiB 16.36 GB)
  Used Dev Size : 15975424 (7.62 GiB 8.18 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f3c3c943:303bd216:bb42b1aa:5ac65a19

    Update Time : Mon Oct 24 08:56:30 2016
       Checksum : 63e60391 - correct
         Events : 75

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdc4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dd78a43e:92699c27:3dc5489d:91d93bb2
           Name : arrayname:3
  Creation Time : Mon Dec  7 08:33:17 2015
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 15976448 (7.62 GiB 8.18 GB)
     Array Size : 15975424 (15.24 GiB 16.36 GB)
  Used Dev Size : 15975424 (7.62 GiB 8.18 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1fece6ad:ac89b95f:e861553e:d507b3d6

    Update Time : Mon Oct 24 08:56:30 2016
       Checksum : 67e6dae0 - correct
         Events : 75

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdd4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dd78a43e:92699c27:3dc5489d:91d93bb2
           Name : arrayname:3
  Creation Time : Mon Dec  7 08:33:17 2015
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 15976448 (7.62 GiB 8.18 GB)
     Array Size : 15975424 (15.24 GiB 16.36 GB)
  Used Dev Size : 15975424 (7.62 GiB 8.18 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 33cae133:63d86892:7011553c:2068b2cc

    Update Time : Mon Oct 24 08:56:30 2016
       Checksum : 1490177f - correct
         Events : 75

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing)


############ /proc/mdstat ############################################

root@machinename:/home/username# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid10 sdb4[4] sdd4[3] sda4[0] sdc4[2]
      15975424 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

md2 : active raid5 sda3[0] sdc3[2](F) sdd3[3]
      8760565248 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2] [U__U]

md0 : active raid1 sdb1[4](S) sdc1[2](S) sda1[0] sdd1[3]
      1950656 blocks super 1.2 [2/2] [UU]

unused devices: <none>


################ PARTITIONS ##################

root@machinename:/home/username# cat sd.parted
BYT;
/dev/sda:3001GB:scsi:512:4096:gpt:ATA ST3000DM001-9YN1;
1:1049kB:2000MB:1999MB::boot:raid;
2:2000MB:2001MB:1049kB::grubbios:bios_grub;
3:2001MB:2992GB:2990GB:ext4:main:raid;
4:2992GB:3001GB:8184MB:linux-swap(v1):swap:raid;

BYT;
/dev/sdb:3001GB:scsi:512:4096:gpt:ATA ST3000DM001-1CH1;
1:1049kB:2000MB:1999MB::boot:raid;
2:2000MB:2001MB:1049kB::grubbios:bios_grub;
3:2001MB:2992GB:2990GB::main:raid;
4:2992GB:3001GB:8184MB::swap:raid;

BYT;
/dev/sdc:3001GB:scsi:512:4096:gpt:ATA ST3000DM001-1CH1;
1:1049kB:2000MB:1999MB::boot:raid;
2:2000MB:2001MB:1049kB::grubbios:bios_grub;
3:2001MB:2992GB:2990GB::main:raid;
4:2992GB:3001GB:8184MB::swap:raid;

BYT;
/dev/sdd:3001GB:scsi:512:4096:gpt:ATA ST3000DM001-9YN1;
1:1049kB:2000MB:1999MB::boot:raid;
2:2000MB:2001MB:1049kB::grubbios:bios_grub;
3:2001MB:2992GB:2990GB::main:raid;
4:2992GB:3001GB:8184MB::swap:raid;

BYT;
/dev/md0:1997MB:md:512:4096:loop:Linux Software RAID Array;
1:0.00B:1997MB:1997MB:ext4::;

BYT;
/dev/md2:8971GB:md:512:4096:loop:Linux Software RAID Array;
1:0.00B:8971GB:8971GB:ext4::;

BYT;
/dev/md3:16.4GB:md:512:4096:loop:Linux Software RAID Array;
1:0.00B:16.4GB:16.4GB:linux-swap(v1)::;

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-27 15:06 recovering failed raid5 Alexander Shenkin
@ 2016-10-27 16:04 ` Andreas Klauer
  2016-10-28 12:22   ` Alexander Shenkin
  2016-10-27 16:26 ` Roman Mamedov
  2016-10-27 20:34 ` Robin Hill
  2 siblings, 1 reply; 29+ messages in thread
From: Andreas Klauer @ 2016-10-27 16:04 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid

On Thu, Oct 27, 2016 at 04:06:14PM +0100, Alexander Shenkin wrote:
> md2: raid5 mounted on /, via sd[abcd]3

Two failed disks...

> md0: raid1 mounted on /boot, via sd[abcd]1

Actually only two disks are active in that one; the other two are spares.
It hardly matters for /boot, but you could grow it to a 4-disk raid1.
Spares are not useful.
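
Something like this should do it (a sketch - assuming md0 and that the
two spares are already attached, as your mdstat shows):

  mdadm --grow /dev/md0 --raid-devices=4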

> My sdb was recently reporting problems.  Instead of second guessing
> those problems, I just got a new disk, replaced it, and added it to
> the arrays.

Replacing right away is the right thing to do.
Unfortunately it seems you have another disk that is broken too.

> 2) smartctl (disabled on drives - can enable once back up.  should I?)
> note: SMART only enabled after problems started cropping up.

But... why? Why disable smart? And if you do, is it a surprise that you 
only notice disk failures when it's already too late?

You should enable SMART, and not only that: also run regular selftests, 
have smartd running, and have it send you mail when something happens. 
Same with raid checks - they are at least something, but they won't 
tell you how many reallocated sectors your drive has.
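
As a sketch, a single smartd.conf line along these lines covers attribute
monitoring, a daily short and weekly long selftest, and mail (the schedule
and address here are only placeholders, adjust to taste):

  DEVICESCAN -a -o on -S on -s (S/../.././02|L/../../6/03) -m you@example.org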

> root@machinename:/home/username# smartctl --xall /dev/sda

Looks fine but never ran a selftest.

> root@machinename:/home/username# smartctl --xall /dev/sdb

Looks new. (New drives need selftests too.)

> root@machinename:/home/username# smartctl --xall /dev/sdc
> smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Seagate Barracuda 7200.14 (AF)
> Device Model:     ST3000DM001-1CH166
> Serial Number:    W1F1N909
>
> 197 Current_Pending_Sector  -O--C-   100   100   000    -    8
> 198 Offline_Uncorrectable   ----C-   100   100   000    -    8

This one is faulty and probably the reason why your resync failed.
You have no redundancy left, so an option here would be to get a 
new drive and ddrescue it over.

That's exactly the kind of thing you should be notified instantly 
about via mail. And it should be discovered when running selftests. 
Without a full surface scan of the media, the disk itself won't know.

> ==> WARNING: A firmware update for this drive may be available,
> see the following Seagate web pages:
> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
> http://knowledge.seagate.com/articles/en_US/FAQ/223651en

About this, *shrug*
I don't have these drives, you might want to check that out.
But it probably won't fix bad sectors.

> root@machinename:/home/username# smartctl --xall /dev/sdd

Some strange things in the error log here, but old.
Still, same as for all others - selftest.

> ################### mdadm --examine ###########################
> 
> /dev/sda1:
>      Raid Level : raid1
>    Raid Devices : 2

A RAID 1 with two drives, could be four.

> /dev/sdb1:
> /dev/sdc1:

So these would also have data instead of being spare.

> /dev/sda3:
>      Raid Level : raid5
>    Raid Devices : 4
> 
>     Update Time : Mon Oct 24 09:02:52 2016
>          Events : 53547
> 
>    Device Role : Active device 0
>    Array State : A..A ('A' == active, '.' == missing)

RAID-5 with two failed disks.

> /dev/sdc3:
>      Raid Level : raid5
>    Raid Devices : 4
> 
>     Update Time : Mon Oct 24 08:53:57 2016
>          Events : 53539
> 
>    Device Role : Active device 2
>    Array State : AAAA ('A' == active, '.' == missing)

This one failed at 08:53.

> ############ /proc/mdstat ############################################
> 
> md2 : active raid5 sda3[0] sdc3[2](F) sdd3[3]
>       8760565248 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2]
> [U__U]

[U__U] refers to device roles as in [0123], 
so device roles 0 and 3 are okay, 1 and 2 are missing.

> md0 : active raid1 sdb1[4](S) sdc1[2](S) sda1[0] sdd1[3]
>       1950656 blocks super 1.2 [2/2] [UU]

Those two spares again, could be [UUUU] instead.

tl;dr
stop it all,
ddrescue /dev/sdc to your new disk,
try your luck with --assemble --force (not using /dev/sdc!),
get yet another new disk, add, sync, cross fingers.
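
In commands that's roughly the following, run from a rescue/live system
since md2 is your root (sdX stands for the ddrescue target, sdY for the
extra new disk - the names will differ on your machine):

  mdadm --stop /dev/md2
  ddrescue -d -f /dev/sdc /dev/sdX /root/sdc.map
  mdadm --assemble --force /dev/md2 /dev/sda3 /dev/sdX3 /dev/sdd3
  mdadm --manage /dev/md2 --add /dev/sdY3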

There's also mdadm --replace instead of --remove, --add; 
that sometimes helps if there are only a few bad sectors 
on each disk. If the disk you already removed hadn't 
already been kicked from the array by the time you replaced it, 
that might have avoided this problem.
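
For reference, that looks something like this, with sde3 standing in
for a partition on the replacement disk:

  mdadm /dev/md2 --add /dev/sde3
  mdadm /dev/md2 --replace /dev/sdc3 --with /dev/sde3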

But good disk monitoring and testing is even more important.

Regards
Andreas Klauer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-27 15:06 recovering failed raid5 Alexander Shenkin
  2016-10-27 16:04 ` Andreas Klauer
@ 2016-10-27 16:26 ` Roman Mamedov
  2016-10-27 20:34 ` Robin Hill
  2 siblings, 0 replies; 29+ messages in thread
From: Roman Mamedov @ 2016-10-27 16:26 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid

On Thu, 27 Oct 2016 16:06:14 +0100
Alexander Shenkin <al@shenkin.org> wrote:

> === START OF INFORMATION SECTION ===
> Model Family:     Seagate Barracuda 7200.14 (AF)
> Device Model:     ST3000DM001-9YN166

That's the horror drive of doom with 30% failure rates within a couple of
years https://www.backblaze.com/blog/3tb-hard-drive-failure/

It's even got its own Wikipedia article by now:
https://en.wikipedia.org/wiki/ST3000DM001

This Russian article dissects what actually causes the failures -- poor
dust-proofing of the platter area: https://habrahabr.ru/post/251941/

I hope you didn't seriously go out and buy one more of that same model to
replace the failed one.

-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-27 15:06 recovering failed raid5 Alexander Shenkin
  2016-10-27 16:04 ` Andreas Klauer
  2016-10-27 16:26 ` Roman Mamedov
@ 2016-10-27 20:34 ` Robin Hill
  2 siblings, 0 replies; 29+ messages in thread
From: Robin Hill @ 2016-10-27 20:34 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid

On Thu Oct 27, 2016 at 04:06:14PM +0100, Alexander Shenkin wrote:

> Hello all,
> 
> A RAID newbie here - apologies in advance for any bonehead questions.
> I have 4 3TB disks that participate in 3 raid arrays.  md2 is failing.
> I'm hoping someone here might be able to give me pointers for the
> right direction to take to avoid data loss, if possible, and recover
> the array.
> 
<- snip ->
> SCT Error Recovery Control command not supported
> SCT Error Recovery Control command not supported
> SCT Error Recovery Control command not supported
> SCT Error Recovery Control command not supported

Others have already chimed in about the recovery process. I just wanted
to make sure you were aware that none of your drives support
TLER/SCTERC. See https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
for details on why this is an issue and how to work around it. You'll
need to do this before attempting any sort of recovery or you're likely
to run into further problems.
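
Since none of your drives accept SCTERC, the usual workaround from that
page is to raise the kernel's command timeout on every member disk, on
every boot, e.g.:

  for t in /sys/block/sd[a-d]/device/timeout; do echo 180 > $t; done

(On drives that do support it, smartctl -l scterc,70,70 /dev/sdX is the
other half of the fix.)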

Cheers,
    Robin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-27 16:04 ` Andreas Klauer
@ 2016-10-28 12:22   ` Alexander Shenkin
  2016-10-28 13:33     ` Andreas Klauer
                       ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Alexander Shenkin @ 2016-10-28 12:22 UTC (permalink / raw)
  To: linux-raid; +Cc: Andreas Klauer, rm, robin

Thanks Andreas, much appreciated.  Your points about selftests and SMART 
are well taken, and I'll implement them once I get this back up.  I'll 
buy yet another new drive - a non-drive-from-hell this time (yes Roman, I 
did buy the same damn drive again.  I'll try to return it, thanks for the 
heads up...) and follow your instructions below.

One remaining question: is sdc definitely toast?  Or, is it possible 
that the Timeout Mismatch (as mentioned by Robin Hill; thanks Robin) is 
flagging the drive as failed, when something else is at play and perhaps 
the drive is actually fine?

To everyone: sorry for the multiple posts.  Was having majordomo issues...

On 10/27/2016 5:04 PM, Andreas Klauer wrote:
> On Thu, Oct 27, 2016 at 04:06:14PM +0100, Alexander Shenkin wrote:
>> md2: raid5 mounted on /, via sd[abcd]3
>
> Two failed disks...
>
>> md0: raid1 mounted on /boot, via sd[abcd]1
>
> Actually only two disks active in that one, the other two are spares.
> It hardly matters for /boot, but you could grow it to a 4 disk raid1.
> Spares are not useful.
>
>> My sdb was recently reporting problems.  Instead of second guessing
>> those problems, I just got a new disk, replaced it, and added it to
>> the arrays.
>
> Replacing right away is the right thing to do.
> Unfortunately it seems you have another disk that is broke too.
>
>> 2) smartctl (disabled on drives - can enable once back up.  should I?)
>> note: SMART only enabled after problems started cropping up.
>
> But... why? Why disable smart? And if you do, is it a surprise that you
> only notice disk failures when it's already too late?

Yeah, I asked myself that same question.  There was probably some reason 
I did it, but I don't remember what it was.  I'll keep SMART enabled from 
now on...

> You should enable smart, and not only that, also run regular selftests,
> and have smartd running, and have it send you mail when something happens.
> Same with raid checks, raid checks are at least something but it won't
> tell you about how many reallocated sectors your drive has.

will do

>> root@machinename:/home/username# smartctl --xall /dev/sda
>
> Looks fine but never ran a selftest.
>
>> root@machinename:/home/username# smartctl --xall /dev/sdb
>
> Looks new. (New drives need selftests too.)
>
>> root@machinename:/home/username# smartctl --xall /dev/sdc
>> smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
>> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Model Family:     Seagate Barracuda 7200.14 (AF)
>> Device Model:     ST3000DM001-1CH166
>> Serial Number:    W1F1N909
>>
>> 197 Current_Pending_Sector  -O--C-   100   100   000    -    8
>> 198 Offline_Uncorrectable   ----C-   100   100   000    -    8
>
> This one is faulty and probably the reason why your resync failed.
> You have no redundancy left, so an option here would be to get a
> new drive and ddrescue it over.
>
> That's exactly the kind of thing you should be notified instantly
> about via mail. And it should be discovered when running selftests.
> Without full surface scan of the media, the disk itself won't know.
>
>> ==> WARNING: A firmware update for this drive may be available,
>> see the following Seagate web pages:
>> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
>> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
>
> About this, *shrug*
> I don't have these drives, you might want to check that out.
> But it probably won't fix bad sectors.
>
>> root@machinename:/home/username# smartctl --xall /dev/sdd
>
> Some strange things in the error log here, but old.
> Still, same as for all others - selftest.
>
>> ################### mdadm --examine ###########################
>>
>> /dev/sda1:
>>      Raid Level : raid1
>>    Raid Devices : 2
>
> A RAID 1 with two drives, could be four.
>
>> /dev/sdb1:
>> /dev/sdc1:
>
> So these would also have data instead of being spare.
>
>> /dev/sda3:
>>      Raid Level : raid5
>>    Raid Devices : 4
>>
>>     Update Time : Mon Oct 24 09:02:52 2016
>>          Events : 53547
>>
>>    Device Role : Active device 0
>>    Array State : A..A ('A' == active, '.' == missing)
>
> RAID-5 with two failed disks.
>
>> /dev/sdc3:
>>      Raid Level : raid5
>>    Raid Devices : 4
>>
>>     Update Time : Mon Oct 24 08:53:57 2016
>>          Events : 53539
>>
>>    Device Role : Active device 2
>>    Array State : AAAA ('A' == active, '.' == missing)
>
> This one failed, 8:53.
>
>> ############ /proc/mdstat ############################################
>>
>> md2 : active raid5 sda3[0] sdc3[2](F) sdd3[3]
>>       8760565248 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2]
>> [U__U]
>
> [U__U] refers to device roles as in [0123],
> so device role 0 and 3 is okay, 1 and 2 missing.
>
>> md0 : active raid1 sdb1[4](S) sdc1[2](S) sda1[0] sdd1[3]
>>       1950656 blocks super 1.2 [2/2] [UU]
>
> Those two spares again, could be [UUUU] instead.
>
> tl;dr
> stop it all,
> ddrescue /dev/sdc to your new disk,
> try your luck with --assemble --force (not using /dev/sdc!),
> get yet another new disk, add, sync, cross fingers.
>
> There's also mdadm --replace instead of --remove, --add,
> that sometimes helps if there's only a few bad sectors
> on each disk. If the disk you already removed wasn't
> already kicked from the array by the time you replaced,
> maybe it would have avoided this problem.
>
> But good disk monitoring and testing is even more important.

Thanks a bunch, Andreas.  I'll monitor and test from now on...

> Regards
> Andreas Klauer


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 12:22   ` Alexander Shenkin
@ 2016-10-28 13:33     ` Andreas Klauer
  2016-10-28 21:16       ` Phil Turmel
  2016-10-29 10:29       ` Roman Mamedov
  2016-10-28 13:36     ` Robin Hill
  2016-10-31 16:31     ` Wols Lists
  2 siblings, 2 replies; 29+ messages in thread
From: Andreas Klauer @ 2016-10-28 13:33 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid

On Fri, Oct 28, 2016 at 01:22:31PM +0100, Alexander Shenkin wrote:
> One remaining question: is sdc definitely toast?

In my opinion a drive is toast starting from the very first reallocated/ 
pending/uncorrectable sector; your drive has several of those, and that's 
only the ones the drive already knows about - there may be more.

> Or, is it possible that the Timeout Mismatch (as mentioned by Robin Hill; 
> thanks Robin) is flagging the drive as failed, when something else is at 
> play and perhaps the drive is actually fine?

I don't believe in timeout mismatches, either. The timeouts are generous. 
Waiting for a disk to wake from standby is not a problem, and that takes 
ages already. If a disk gets stuck even longer in error correction limbo 
and it gets kicked because of it - IMHO that's the right call.

A disk that is unable to read its data, a disk that refuses to write data, 
a disk that needs help from the RAID layer to correct its errors, 
should be kicked because it's not able to pull its own weight.

You need drives that work without errors, without outside help, because 
during a rebuild, when the RAID is already degraded, there won't be any 
outside help. Either the disks work or your RAID is dead.

RAID redundancy is supposed to allow disks to be replaced (mdadm --replace).
If you use it instead to keep fixing errors on other disks, there isn't 
any real redundancy left. In a RAID, if one of your disks has errors, 
you get rid of it as soon as possible.

Whether your RAID failed because of timeouts or not - it's not important. 
It failed because you didn't notice broken disks in time, and you had two. 
Testing, monitoring, and actually acting on the first error - that is 
what's important. 

People have different opinions on this. Someone might argue.
It's up to you what risks to take.

Regards
Andreas Klauer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 12:22   ` Alexander Shenkin
  2016-10-28 13:33     ` Andreas Klauer
@ 2016-10-28 13:36     ` Robin Hill
  2016-10-31 10:44       ` Alexander Shenkin
                         ` (2 more replies)
  2016-10-31 16:31     ` Wols Lists
  2 siblings, 3 replies; 29+ messages in thread
From: Robin Hill @ 2016-10-28 13:36 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid, Andreas Klauer, rm, robin

On Fri Oct 28, 2016 at 01:22:31PM +0100, Alexander Shenkin wrote:

> Thanks Andreas, much appreciated.  Your points about selftests and smart 
> are well taken, and i'll implement them once i get this back up.  I'll 
> buy yet another new, non drive-from-hell (yes Roman, I did buy the same 
> damn drive again.  Will try to return it, thanks for the heads up...) 
> and follow your instructions below.
> 
> One remaining question: is sdc definitely toast?  Or, is it possible 
> that the Timeout Mismatch (as mentioned by Robin Hill; thanks Robin) is 
> flagging the drive as failed, when something else is at play and perhaps 
> the drive is actually fine?
> 
It's not definitely toast, no (but this is unrelated to the Timeout
mismatches). It has some pending reallocations, which means the drive
was unable to read from some blocks - if a write to the blocks fails
then one of the spare blocks will be reallocated instead, but a write
will often succeed and the pending reallocation will just be cleared.

Unfortunately, reconstruction of the array depends on this data being
readable, so the fact the drive isn't toast doesn't necessarily help.
I'd suggest replicating (using ddrescue) that drive to the new one (when
it arrives) as a first step. It's possible ddrescue will manage to read
the data (it'll make several attempts, so can sometimes read data that
fails initially), otherwise you'll end up with some missing data
(possibly corrupt files, possibly corrupt filesystem metadata, possibly
just a bit of extra noise in an audio/video file). Once that's done, you
can do a proper check on sdc (e.g. a badblocks read/write test), which
will either lead to the sectors actually being reallocated, or to clearing
the pending reallocations. Unless you get a lot more reallocated sectors
than are currently pending, you can put the drive back into use if you
like (bearing in mind the reputation of these drives and weighing the
replacement cost against the value of your data).
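
A full destructive read/write pass would be something along the lines of

  badblocks -b 4096 -wsv /dev/sdc

but note that it wipes the drive, so only run it once the ddrescue copy
is safely elsewhere.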

If you run a regular check (scrub) on the array, these sorts of issues would
be picked up and repaired automatically (the read errors will trigger
rewrites and either reallocate blocks, clear the pending reallocations,
or fail the drive). Otherwise they're liable to come back to bite you
when you're trying to recover from a different failure.

Timeout Mismatches will lead to drives being failed from an otherwise
healthy array - a read failure on the drive can't be corrected as the
drive is still busy trying when the write request goes through, so the
drive gets kicked out of the array. You didn't say what the issue was
with your original sdb, but if it wasn't a definite fault then it may
have been affected by a timeout mismatch.

Cheers,
    Robin

> To everyone: sorry for the multiple posts.  Was having majordomo issues...
> 
> On 10/27/2016 5:04 PM, Andreas Klauer wrote:
> > On Thu, Oct 27, 2016 at 04:06:14PM +0100, Alexander Shenkin wrote:
> >> md2: raid5 mounted on /, via sd[abcd]3
> >
> > Two failed disks...
> >
> >> md0: raid1 mounted on /boot, via sd[abcd]1
> >
> > Actually only two disks active in that one, the other two are spares.
> > It hardly matters for /boot, but you could grow it to a 4 disk raid1.
> > Spares are not useful.
> >
> >> My sdb was recently reporting problems.  Instead of second guessing
> >> those problems, I just got a new disk, replaced it, and added it to
> >> the arrays.
> >
> > Replacing right away is the right thing to do.
> > Unfortunately it seems you have another disk that is broke too.
> >
> >> 2) smartctl (disabled on drives - can enable once back up.  should I?)
> >> note: SMART only enabled after problems started cropping up.
> >
> > But... why? Why disable smart? And if you do, is it a surprise that you
> > only notice disk failures when it's already too late?
> 
> yeah, i asked myself that same question.  there was probably some reason 
> I did, but i don't remember what it was.  i'll keep smart enabled from 
> now on...
> 
> > You should enable smart, and not only that, also run regular selftests,
> > and have smartd running, and have it send you mail when something happens.
> > Same with raid checks, raid checks are at least something but it won't
> > tell you about how many reallocated sectors your drive has.
> 
> will do
> 
> >> root@machinename:/home/username# smartctl --xall /dev/sda
> >
> > Looks fine but never ran a selftest.
> >
> >> root@machinename:/home/username# smartctl --xall /dev/sdb
> >
> > Looks new. (New drives need selftests too.)
> >
> >> root@machinename:/home/username# smartctl --xall /dev/sdc
> >> smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
> >> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
> >>
> >> === START OF INFORMATION SECTION ===
> >> Model Family:     Seagate Barracuda 7200.14 (AF)
> >> Device Model:     ST3000DM001-1CH166
> >> Serial Number:    W1F1N909
> >>
> >> 197 Current_Pending_Sector  -O--C-   100   100   000    -    8
> >> 198 Offline_Uncorrectable   ----C-   100   100   000    -    8
> >
> > This one is faulty and probably the reason why your resync failed.
> > You have no redundancy left, so an option here would be to get a
> > new drive and ddrescue it over.
> >
> > That's exactly the kind of thing you should be notified instantly
> > about via mail. And it should be discovered when running selftests.
> > Without full surface scan of the media, the disk itself won't know.
> >
> >> ==> WARNING: A firmware update for this drive may be available,
> >> see the following Seagate web pages:
> >> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
> >> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
> >
> > About this, *shrug*
> > I don't have these drives, you might want to check that out.
> > But it probably won't fix bad sectors.
> >
> >> root@machinename:/home/username# smartctl --xall /dev/sdd
> >
> > Some strange things in the error log here, but old.
> > Still, same as for all others - selftest.
> >
> >> ################### mdadm --examine ###########################
> >>
> >> /dev/sda1:
> >>      Raid Level : raid1
> >>    Raid Devices : 2
> >
> > A RAID 1 with two drives, could be four.
> >
> >> /dev/sdb1:
> >> /dev/sdc1:
> >
> > So these would also have data instead of being spare.
> >
> >> /dev/sda3:
> >>      Raid Level : raid5
> >>    Raid Devices : 4
> >>
> >>     Update Time : Mon Oct 24 09:02:52 2016
> >>          Events : 53547
> >>
> >>    Device Role : Active device 0
> >>    Array State : A..A ('A' == active, '.' == missing)
> >
> > RAID-5 with two failed disks.
> >
> >> /dev/sdc3:
> >>      Raid Level : raid5
> >>    Raid Devices : 4
> >>
> >>     Update Time : Mon Oct 24 08:53:57 2016
> >>          Events : 53539
> >>
> >>    Device Role : Active device 2
> >>    Array State : AAAA ('A' == active, '.' == missing)
> >
> > This one failed, 8:53.
> >
> >> ############ /proc/mdstat ############################################
> >>
> >> md2 : active raid5 sda3[0] sdc3[2](F) sdd3[3]
> >>       8760565248 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2]
> >> [U__U]
> >
> > [U__U] refers to device roles as in [0123],
> > so device role 0 and 3 is okay, 1 and 2 missing.
> >
> >> md0 : active raid1 sdb1[4](S) sdc1[2](S) sda1[0] sdd1[3]
> >>       1950656 blocks super 1.2 [2/2] [UU]
> >
> > Those two spares again, could be [UUUU] instead.
> >
> > tl;dr
> > stop it all,
> > ddrescue /dev/sdc to your new disk,
> > try your luck with --assemble --force (not using /dev/sdc!),
> > get yet another new disk, add, sync, cross fingers.
> >
> > There's also mdadm --replace instead of --remove, --add,
> > that sometimes helps if there's only a few bad sectors
> > on each disk. If the disk you already removed wasn't
> > already kicked from the array by the time you replaced,
> > maybe it would have avoided this problem.
> >
> > But good disk monitoring and testing is even more important.
> 
> thanks a bunch, Andreas.  I'll monitor and test from now on...
> 
> > Regards
> > Andreas Klauer
> 

-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 13:33     ` Andreas Klauer
@ 2016-10-28 21:16       ` Phil Turmel
  2016-10-28 23:45         ` Andreas Klauer
  2016-10-29 10:29       ` Roman Mamedov
  1 sibling, 1 reply; 29+ messages in thread
From: Phil Turmel @ 2016-10-28 21:16 UTC (permalink / raw)
  To: Andreas Klauer, Alexander Shenkin; +Cc: linux-raid

Good afternoon Alexander,

On 10/28/2016 09:33 AM, Andreas Klauer wrote:
> On Fri, Oct 28, 2016 at 01:22:31PM +0100, Alexander Shenkin wrote:
>> One remaining question: is sdc definitely toast?
> 
> In my opinion a drive is toast starting from the very first reallocated/ 
> pending/uncorrectable sector, your drive has several of those and that's 
> only the ones the drive already knows about - there may be more.

Actual vs. pending reallocations are very different things.  Andreas'
approach is rather expensive in practice, as manufacturers of
consumer-grade drives specify an unrecoverable read error rate of less
than 1 per 10^14 bits read.  That's only 12.5TB.  A moderately used
media server will encounter many of these in a four-to-five-year life
span.  Few at first, then more as the drive ages.  If you insist on
replacing drives at the first "pending" reallocation, expect to purchase
many more drives than everyone else.

Enterprise drives work the same way, BTW, just with a spec of 1 per
10^15 bits read.  Since enterprise drives are typically in constant
heavy use, a similar count in a normal lifespan is expected.

>> Or, is it possible that the Timeout Mismatch (as mentioned by Robin Hill; 
>> thanks Robin) is flagging the drive as failed, when something else is at 
>> play and perhaps the drive is actually fine?

Pending reallocations are often just glitches that are gone after the
sector is rewritten.  If your drives have an error timeout that is
shorter than the OS device driver timeout, a raid array will silently
fix these errors for you and you'll never notice.  If your array is
lightly used, a weekly or monthly "check" scrub will help flush them out
in a timely fashion.
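
Kicking off a check by hand is just (md2 as the example array; Debian
and Ubuntu normally ship a monthly cron job that does this already):

  echo check > /sys/block/md2/md/sync_action

and you can watch its progress in /proc/mdstat.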

If you have a green or desktop drive with a long error timeout (greater
than the 30-second default Linux driver timeout), your array will crash
when your drives age just enough to pop up their first UREs.  Please read
the list archives linked in the wiki to help you understand how and why
this happens.

> I don't believe in timeout mismatches, either. The timeouts are generous. 
> Waiting for a disk to wake from standby is not a problem, and that takes 
> ages already. If a disk gets stuck even longer in error correction limbo 
> and it gets kicked because of it - IMHO that's the right call.

Alex, I strongly recommend you ignore Andreas' advice on this one topic.
Use the work-arounds for the drives you have, and buy friendlier drives
as age and capacity needs demand.  { If your livelihood or marriage
depends on the security of the contents of your array, buy enterprise
drives and verify your backup system... }

[trim /]

> Your RAID did not fail because of timeouts or not. It's not important. 
> It failed because you didn't notice broken disks in time and you had two. 
> Testing, monitoring, actually acting on the first error, is important. 

Andreas is flat-out wrong on this.  If you had the work-arounds in
place on your array, your pending errors would have been silently fixed
and your array would almost certainly never have failed.  With or
without SMART enabled.

Not that I recommend running without the SMART features -- you will
still want to know when your drives have real problems.

Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 21:16       ` Phil Turmel
@ 2016-10-28 23:45         ` Andreas Klauer
  2016-10-29  2:52           ` Edward Kuns
                             ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Andreas Klauer @ 2016-10-28 23:45 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Alexander Shenkin, linux-raid

On Fri, Oct 28, 2016 at 05:16:27PM -0400, Phil Turmel wrote:
> Andreas' approach is rather expensive in practice

Not really. Currently all of my disks are out of their warranty period. 
Whenever I bring this up the first thing I hear is that I'm just 
not noticing these errors that are happening all the time... oh well.

I run SMART selftests daily (select,cont), and I run mdadm checks and look 
at mismatch_cnt afterwards (always 0 thus far). Not sure what else to 
do... I haven't gone as far as patching the kernel to be more verbose. 
There's only so much you can do.
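
Concretely that's one line per disk and per array, e.g. from cron
(sdX/mdX as placeholders):

  smartctl -t select,cont /dev/sdX      # next chunk of the selective selftest
  cat /sys/block/mdX/md/mismatch_cnt    # after an mdadm check has finished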

I'm mainly using cheap WD Green drives. I don't like enterprise drives - 
there's nothing that makes them more reliable, and in home use, where 
they twiddle their thumbs most of the time, what's the point of it all? 
Expensive drives are more likely to turn you into a penny-pincher 
when replacement would be the right thing to do...

> manufacturers of consumer-grade drives specify an error rate of less
> than 1 per 10^14 bits read.  That's only 12.5TB.

Yes, according to that math you get stuff like that:

    http://www.zdnet.com/article/why-raid-5-stops-working-in-2009/

Or perhaps that just isn't how failures happen.

    https://www.high-rely.com/blog/why-raid-5-stops-working-in-2009-not/

I'm sure there are better links on the topic.

If there actually was one failure for every 12.5TB, this technology 
would be unusable. It's a LOT more reliable than that, thankfully. 
So no, I don't replace my disks every 12.5TB. That'd be ridiculous.

Maybe you didn't mean it this way.

> Pending relocations are often just glitches that are gone after the
> sector is rewritten.

That's the other opinion I was referring to.

There's no way to tell what caused sectors to become unreadable. 
Is it just a glitch in the matrix, never to happen again once fixed? 
Or is it a serious issue, likely to recur or get even worse? 
Who knows? It's not like you can open the drive and check. 

> a weekly or monthly "check" scrub will help flush them out 
> in a timely fashion.

Our advice is not that different. You recommend regular checks. 
I recommend regular checks.

I just don't believe in the "it will magically fix itself and 
never happen again" kind of story. It's a trust issue; I just 
can't bring myself to trust disks that have already lost data 
once. Elsewhere, people add checksums to filesystems because 
they worry about single bit flips, not entire sectors gone... 
how come one is completely fine but not the other? 
(I'm not worried about bit flips, either.)

I see this timeout thing as a fad; it's brought up in every 
other thread about raid failures on this list, regardless of 
how little or no indication there was that timeouts were 
related in any way at all to the failure in question.

You'd think timeouts would solve all problems. They probably don't. 
In some exceedingly rare cases, they might not even matter at all.

> Andreas' is flat-out wrong on this.

I say his raid failed due to not running checks, and 
running checks is something you recommend too. 
There is some common ground there, however tiny.

> Not that I recommend running without the SMART features

That's the general gist I get from reading your posts, though.

> -- you will still want to know when your drives have real problems.

What's a real problem then, when pending sectors and read failures 
in selftest are not real enough?

Some arbitrarily chosen number of errors...

Disks just go bad. You can make up whatever reasons not to replace them, 
but whether your RAID will survive it seems like a gamble to me. 
Backups are a failsafe. I like the safe part; I try to avoid the fail.

Everyone has to find their own approach to things.

Regards
Andreas Klauer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 23:45         ` Andreas Klauer
@ 2016-10-29  2:52           ` Edward Kuns
  2016-10-29  2:53           ` Phil Turmel
  2016-10-29  8:46           ` Mikael Abrahamsson
  2 siblings, 0 replies; 29+ messages in thread
From: Edward Kuns @ 2016-10-29  2:52 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: Phil Turmel, Alexander Shenkin, Linux-RAID

On Fri, Oct 28, 2016 at 6:45 PM, Andreas Klauer
<Andreas.Klauer@metamorpher.de> wrote:
> You'd think timeouts would solve all problems. They probably don't.
> In some exceedingly rare cases, they might not even matter at all.

Speaking as someone who has experienced this problem: no-one is saying
that having correct timeouts fixes *all* problems.  Obviously, drives
fail.  However, it's very clear that having mismatched timeouts can cause
a single-sector failure to escalate to the whole drive being kicked from
the array, exposing you to a much bigger risk of data loss if anything
else at all goes wrong while you have no redundancy.

Right, timeouts won't matter all the time.  They only matter when
mismatched and when you hit a condition that causes the OS to give up
before the drive does.  Is that a good reason to look the other way
and not even check to see if you are exposed to risk by having
mismatched timeouts?  Would you tell someone to go boating without a
life jacket because most of the time they might not matter at all?

         Eddie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 23:45         ` Andreas Klauer
  2016-10-29  2:52           ` Edward Kuns
@ 2016-10-29  2:53           ` Phil Turmel
  2016-10-29  8:46           ` Mikael Abrahamsson
  2 siblings, 0 replies; 29+ messages in thread
From: Phil Turmel @ 2016-10-29  2:53 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: Alexander Shenkin, linux-raid

Sigh.

On 10/28/2016 07:45 PM, Andreas Klauer wrote:

> Everyone has to find their own approach to things.

Humanity advances by learning from other people's mistakes.  If we all
had to learn from scratch the technology we use every day, we'd still be
living in huts.

Please read this archived mail, and if you can, the whole thread:

http://marc.info/?l=linux-raid&m=135811522817345&w=1

Then read the rest of the archived links in the wiki here:

https://raid.wiki.kernel.org/index.php/Timeout_Mismatch

Maybe you'll get it.  Maybe you won't.  I just don't want innocent
bystanders to take your undisputed commentary as gospel.

Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 23:45         ` Andreas Klauer
  2016-10-29  2:52           ` Edward Kuns
  2016-10-29  2:53           ` Phil Turmel
@ 2016-10-29  8:46           ` Mikael Abrahamsson
  2 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2016-10-29  8:46 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: linux-raid

On Sat, 29 Oct 2016, Andreas Klauer wrote:

> You'd think timeouts would solve all problems. They probably don't. In 
> some exceedingly rare cases, they might not even matter at all.

My reasoning regarding timeouts, especially for home arrays, is the 
following:

Turning up the timeouts to 180 means your worst-case scenario is that your 
array will have a 180-second-long "hiccup" in delivering data.

This can be really bad in an enterprise environment, but in a home 
environment it's merely an inconvenience. It happens very rarely, 
and it stops your drive from being spuriously kicked out when there is a 
read error - and being kicked out can lead to much worse things 
happening.

So for regular use there is very little downside to setting the timeouts to 
180 seconds, there are substantial upsides, and I recommend that everybody 
with non-enterprise drives do so.

I wish the kernel default would be changed to 180, because I see the 
current default timeout setting cause people more problems than it helps.
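
Until then it has to be re-applied on every boot; one way is a udev rule
(an untested sketch, adjust the device match to your setup):

  # /etc/udev/rules.d/60-sd-timeout.rules
  ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{device/timeout}="180"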

This is of course just one piece of a larger puzzle, but it's one that 
it's important to get right.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 13:33     ` Andreas Klauer
  2016-10-28 21:16       ` Phil Turmel
@ 2016-10-29 10:29       ` Roman Mamedov
  2016-10-29 12:02         ` Andreas Klauer
  1 sibling, 1 reply; 29+ messages in thread
From: Roman Mamedov @ 2016-10-29 10:29 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: linux-raid

On Fri, 28 Oct 2016 15:33:04 +0200
Andreas Klauer <Andreas.Klauer@metamorpher.de> wrote:

> On Fri, Oct 28, 2016 at 01:22:31PM +0100, Alexander Shenkin wrote:
> > One remaining question: is sdc definitely toast?
> 
> In my opinion a drive is toast starting from the very first reallocated/ 
> pending/uncorrectable sector, your drive has several of those and that's 
> only the ones the drive already knows about - there may be more.

I'd say you are overly cautious on this. Yes, there are drives for which one
reallocated sector is a sign of the coming avalanche of them, but then there
are also ones (e.g. my Hitachi 2TB) which work for years, over that period
develop 3-5-7 reallocated sectors, and THAT'S IT - they just continue to work.
And if there's an unreadable sector on rebuild because a drive found its 8th
bad sector after 3 more years of perfect operation, that's not a problem
either, because the setup they run in is RAID6. (Not to compensate for this,
but I wouldn't be running an 8-10 drive RAID5 in any case.)

-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-29 10:29       ` Roman Mamedov
@ 2016-10-29 12:02         ` Andreas Klauer
  2016-10-30 16:18           ` Phil Turmel
  0 siblings, 1 reply; 29+ messages in thread
From: Andreas Klauer @ 2016-10-29 12:02 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-raid

On Sat, Oct 29, 2016 at 03:29:51PM +0500, Roman Mamedov wrote:
> And if there's an unreadable sector on rebuild as a drive found its 8th bad
> sector after 3 more years of perfect operation, that's not a problem either,
> because the setup they run in, is RAID6.

But if such disks are acceptable to run in a RAID, and you advertise it 
as such, you have to expect to see RAIDs where every single disk has a 
dozen reallocated sectors and a history of read errors to go with it.
Is that still fine? Do you expect to be lucky every time?

RAID-6 is not magic, either. Sooner or later, it will fail, too. 

Keep ignoring errors in RAID-5 and you'll see double failure.
Keep ignoring errors in RAID-6 long enough and you'll see triple failure.
All disks fail and many of them do silently, undetected if untested.

If you rented a server in a datacenter, thus entitled to working hardware, 
would you create a ticket on read failure or not?

Regards
Andreas Klauer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-29 12:02         ` Andreas Klauer
@ 2016-10-30 16:18           ` Phil Turmel
  0 siblings, 0 replies; 29+ messages in thread
From: Phil Turmel @ 2016-10-30 16:18 UTC (permalink / raw)
  To: Andreas Klauer, Roman Mamedov; +Cc: linux-raid

On 10/29/2016 08:02 AM, Andreas Klauer wrote:

> If you rented a server in a datacenter, thus entitled to working hardware, 
> would you create a ticket on read failure or not?

If the rate of read errors is within the manufacturer's specs, it *is*
"working hardware", by definition.  I would expect that if you did file
such a ticket without a pattern of multiple read errors outside the
hardware spec, it would be rejected.  And if you made a nuisance of
yourself in such a professional environment when you are so clearly
wrong, I would expect you to find your server rental contract cancelled.

Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 13:36     ` Robin Hill
@ 2016-10-31 10:44       ` Alexander Shenkin
  2016-10-31 11:09         ` Andreas Klauer
                           ` (2 more replies)
  2016-10-31 16:28       ` Wols Lists
  2016-11-16  9:04       ` Alexander Shenkin
  2 siblings, 3 replies; 29+ messages in thread
From: Alexander Shenkin @ 2016-10-31 10:44 UTC (permalink / raw)
  To: linux-raid, Andreas Klauer, rm, robin

Thanks to everyone for their input.  I need to get a new, non-horrible 
3TB drive to ddrescue to.  The question: can I get a different 3TB drive 
(e.g. Toshiba P300 3TB, 
https://www.amazon.co.uk/Toshiba-P300-7200RPM-SATA-Drive/dp/B0151KM6F0), 
or do nominal 3TB drives differ slightly enough in size to cause me 
headaches when adding the new one into the array?  If the latter is the 
case, then perhaps I need to aim for a 4TB drive replacement...

Thanks,
Allie

On 10/28/2016 2:36 PM, Robin Hill wrote:
> On Fri Oct 28, 2016 at 01:22:31PM +0100, Alexander Shenkin wrote:
>
>> Thanks Andreas, much appreciated.  Your points about selftests and smart
>> are well taken, and i'll implement them once i get this back up.  I'll
>> buy yet another new, non drive-from-hell (yes Roman, I did buy the same
>> damn drive again.  Will try to return it, thanks for the heads up...)
>> and follow your instructions below.
>>
>> One remaining question: is sdc definitely toast?  Or, is it possible
>> that the Timeout Mismatch (as mentioned by Robin Hill; thanks Robin) is
>> flagging the drive as failed, when something else is at play and perhaps
>> the drive is actually fine?
>>
> It's not definitely toast, no (but this is unrelated to the Timeout
> mismatches). It has some pending reallocations, which means the drive
> was unable to read from some blocks - if a write to the blocks fails
> then one of the spare blocks will be reallocated instead, but a write
> will often succeed and the pending reallocation will just be cleared.
>
> Unfortunately, reconstruction of the array depends on this data being
> readable, so the fact the drive isn't toast doesn't necessarily help.
> I'd suggest replicating (using ddrescue) that drive to the new one (when
> it arrives) as a first step. It's possible ddrescue will manage to read
> the data (it'll make several attempts, so can sometimes read data that
> fails initially), otherwise you'll end up with some missing data
> (possibly corrupt files, possibly corrupt filesystem metadata, possibly
> just a bit of extra noise in an audio/video file). Once that's done, you
> can do a proper check on sdc (e.g. a badblocks read/write test), which
> will either lead to sector actually being reallocated, or to clearing
> the pending reallocations. Unless you get a lot more reallocated sectors
> than are currently pending, you can put the drive back into use if you
> like (bearing in mind the reputation of these drives and weighing the
> replacement cost against the value of your data).
>
> If you run a regular selftest on the array, these sort of issues would
> be picked up and repaired automatically (the read errors will trigger
> rewrites and either reallocate blocks, clear the pending reallocations,
> or fail the drive). Otherwise they're liable to come back to bite you
> when you're trying to recover from a different failure.
>
> Timeout Mismatches will lead to drives being failed from an otherwise
> healthy array - a read failure on the drive can't be corrected as the
> drive is still busy trying when the write request goes through, so the
> drive gets kicked out of the array. You didn't say what the issue was
> with your original sdb, but if it wasn't a definite fault then it may
> have been affected by a timeout mismatch.
>
> Cheers,
>     Robin
>
>> To everyone: sorry for the multiple posts.  Was having majordomo issues...
>>
>> On 10/27/2016 5:04 PM, Andreas Klauer wrote:
>>> On Thu, Oct 27, 2016 at 04:06:14PM +0100, Alexander Shenkin wrote:
>>>> md2: raid5 mounted on /, via sd[abcd]3
>>>
>>> Two failed disks...
>>>
>>>> md0: raid1 mounted on /boot, via sd[abcd]1
>>>
>>> Actually only two disks active in that one, the other two are spares.
>>> It hardly matters for /boot, but you could grow it to a 4 disk raid1.
>>> Spares are not useful.
>>>
>>>> My sdb was recently reporting problems.  Instead of second guessing
>>>> those problems, I just got a new disk, replaced it, and added it to
>>>> the arrays.
>>>
>>> Replacing right away is the right thing to do.
>>> Unfortunately it seems you have another disk that is broke too.
>>>
>>>> 2) smartctl (disabled on drives - can enable once back up.  should I?)
>>>> note: SMART only enabled after problems started cropping up.
>>>
>>> But... why? Why disable smart? And if you do, is it a surprise that you
>>> only notice disk failures when it's already too late?
>>
>> yeah, i asked myself that same question.  there was probably some reason
>> I did, but i don't remember what it was.  i'll keep smart enabled from
>> now on...
>>
>>> You should enable smart, and not only that, also run regular selftests,
>>> and have smartd running, and have it send you mail when something happens.
>>> Same with raid checks, raid checks are at least something but it won't
>>> tell you about how many reallocated sectors your drive has.
>>
>> will do
>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sda
>>>
>>> Looks fine but never ran a selftest.
>>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sdb
>>>
>>> Looks new. (New drives need selftests too.)
>>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sdc
>>>> smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
>>>> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>
>>>> === START OF INFORMATION SECTION ===
>>>> Model Family:     Seagate Barracuda 7200.14 (AF)
>>>> Device Model:     ST3000DM001-1CH166
>>>> Serial Number:    W1F1N909
>>>>
>>>> 197 Current_Pending_Sector  -O--C-   100   100   000    -    8
>>>> 198 Offline_Uncorrectable   ----C-   100   100   000    -    8
>>>
>>> This one is faulty and probably the reason why your resync failed.
>>> You have no redundancy left, so an option here would be to get a
>>> new drive and ddrescue it over.
>>>
>>> That's exactly the kind of thing you should be notified instantly
>>> about via mail. And it should be discovered when running selftests.
>>> Without full surface scan of the media, the disk itself won't know.
>>>
>>>> ==> WARNING: A firmware update for this drive may be available,
>>>> see the following Seagate web pages:
>>>> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
>>>> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
>>>
>>> About this, *shrug*
>>> I don't have these drives, you might want to check that out.
>>> But it probably won't fix bad sectors.
>>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sdd
>>>
>>> Some strange things in the error log here, but old.
>>> Still, same as for all others - selftest.
>>>
>>>> ################### mdadm --examine ###########################
>>>>
>>>> /dev/sda1:
>>>>      Raid Level : raid1
>>>>    Raid Devices : 2
>>>
>>> A RAID 1 with two drives, could be four.
>>>
>>>> /dev/sdb1:
>>>> /dev/sdc1:
>>>
>>> So these would also have data instead of being spare.
>>>
>>>> /dev/sda3:
>>>>      Raid Level : raid5
>>>>    Raid Devices : 4
>>>>
>>>>     Update Time : Mon Oct 24 09:02:52 2016
>>>>          Events : 53547
>>>>
>>>>    Device Role : Active device 0
>>>>    Array State : A..A ('A' == active, '.' == missing)
>>>
>>> RAID-5 with two failed disks.
>>>
>>>> /dev/sdc3:
>>>>      Raid Level : raid5
>>>>    Raid Devices : 4
>>>>
>>>>     Update Time : Mon Oct 24 08:53:57 2016
>>>>          Events : 53539
>>>>
>>>>    Device Role : Active device 2
>>>>    Array State : AAAA ('A' == active, '.' == missing)
>>>
>>> This one failed, 8:53.
>>>
>>>> ############ /proc/mdstat ############################################
>>>>
>>>> md2 : active raid5 sda3[0] sdc3[2](F) sdd3[3]
>>>>       8760565248 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2]
>>>> [U__U]
>>>
>>> [U__U] refers to device roles as in [0123],
>>> so device role 0 and 3 is okay, 1 and 2 missing.
>>>
>>>> md0 : active raid1 sdb1[4](S) sdc1[2](S) sda1[0] sdd1[3]
>>>>       1950656 blocks super 1.2 [2/2] [UU]
>>>
>>> Those two spares again, could be [UUUU] instead.
>>>
>>> tl;dr
>>> stop it all,
>>> ddrescue /dev/sdc to your new disk,
>>> try your luck with --assemble --force (not using /dev/sdc!),
>>> get yet another new disk, add, sync, cross fingers.
>>>
>>> There's also mdadm --replace instead of --remove, --add,
>>> that sometimes helps if there's only a few bad sectors
>>> on each disk. If the disk you already removed wasn't
>>> already kicked from the array by the time you replaced,
>>> maybe it would have avoided this problem.
>>>
>>> But good disk monitoring and testing is even more important.
>>
>> thanks a bunch, Andreas.  I'll monitor and test from now on...
>>
>>> Regards
>>> Andreas Klauer
>>
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-31 10:44       ` Alexander Shenkin
@ 2016-10-31 11:09         ` Andreas Klauer
  2016-10-31 15:19         ` Robin Hill
  2016-10-31 16:26         ` Wols Lists
  2 siblings, 0 replies; 29+ messages in thread
From: Andreas Klauer @ 2016-10-31 11:09 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid

On Mon, Oct 31, 2016 at 10:44:38AM +0000, Alexander Shenkin wrote:
> or are sizes of 3TB slightly different enough for that to cause me 
> headaches when adding it back into the array?

This is usually not much to worry about; worst case, you tell mdadm to 
shrink the size a little. But I don't see how it applies in your case. 
The partition you're interested in is not 3TB, and there is a swap 
partition you could skip if there's really no other way...

You can ddrescue partitions too, doesn't have to be the whole disk.
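
E.g. just the raid member, with a map file so it can be resumed (sdX3
being wherever the target partition ends up on your rescue system):

  ddrescue -d -f /dev/sdc3 /dev/sdX3 /root/sdc3.map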

You posted parted output earlier; parted uses stupid units by default, 
so I usually prefer 'parted /dev/disk unit s print free'.

Regards
Andreas Klauer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-31 10:44       ` Alexander Shenkin
  2016-10-31 11:09         ` Andreas Klauer
@ 2016-10-31 15:19         ` Robin Hill
  2016-10-31 16:26         ` Wols Lists
  2 siblings, 0 replies; 29+ messages in thread
From: Robin Hill @ 2016-10-31 15:19 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid, Andreas Klauer, rm, robin

On Mon Oct 31, 2016 at 10:44:38AM +0000, Alexander Shenkin wrote:

> Thanks to everyone for their input.  I need to get a new, non-horrible 
> 3TB drive to ddrescue to.  The question: can I get a different 3TB drive 
> (e.g. Toshiba P300 3TB, 
> https://www.amazon.co.uk/Toshiba-P300-7200RPM-SATA-Drive/dp/B0151KM6F0), 
> or are sizes of 3TB slightly different enough for that to cause me 
> headaches when adding it back into the array?  If the latter is the 
> case, then perhaps I need to aim for a 4TB drive replacement...
> 
> Thanks,
> Allie
> 

Any 3TB drive should be exactly the same size. Somewhere around the 1TB
drive size, manufacturers stopped using slightly different sizes and
standardised across the industry.

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-31 10:44       ` Alexander Shenkin
  2016-10-31 11:09         ` Andreas Klauer
  2016-10-31 15:19         ` Robin Hill
@ 2016-10-31 16:26         ` Wols Lists
  2 siblings, 0 replies; 29+ messages in thread
From: Wols Lists @ 2016-10-31 16:26 UTC (permalink / raw)
  To: Alexander Shenkin, linux-raid, Andreas Klauer, rm, robin

On 31/10/16 10:44, Alexander Shenkin wrote:
> Thanks to everyone for their input.  I need to get a new, non-horrible
> 3TB drive to ddrescue to.  The question: can I get a different 3TB drive
> (e.g. Toshiba P300 3TB,
> https://www.amazon.co.uk/Toshiba-P300-7200RPM-SATA-Drive/dp/B0151KM6F0),
> or are sizes of 3TB slightly different enough for that to cause me
> headaches when adding it back into the array?  If the latter is the
> case, then perhaps I need to aim for a 4TB drive replacement...

Can't speak for that drive, but I recently got myself a laptop 2TB
Toshiba drive. Was pleasantly surprised to discover it supported SCT/ERC
(and they make a 3TB 2.5" drive too, just don't try sticking it in a
laptop as it won't fit :-)

Cheers,
Wol

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 13:36     ` Robin Hill
  2016-10-31 10:44       ` Alexander Shenkin
@ 2016-10-31 16:28       ` Wols Lists
  2016-11-16  9:04       ` Alexander Shenkin
  2 siblings, 0 replies; 29+ messages in thread
From: Wols Lists @ 2016-10-31 16:28 UTC (permalink / raw)
  To: Alexander Shenkin, linux-raid, Andreas Klauer, rm, robin

On 28/10/16 14:36, Robin Hill wrote:
> Unfortunately, reconstruction of the array depends on this data being
> readable, so the fact the drive isn't toast doesn't necessarily help.
> I'd suggest replicating (using ddrescue) that drive to the new one (when
> it arrives) as a first step. It's possible ddrescue will manage to read
> the data (it'll make several attempts, so can sometimes read data that
> fails initially), otherwise you'll end up with some missing data
> (possibly corrupt files, possibly corrupt filesystem metadata, possibly
> just a bit of extra noise in an audio/video file). Once that's done, you
> can do a proper check on sdc (e.g. a badblocks read/write test), which
> will either lead to sector actually being reallocated, or to clearing
> the pending reallocations. Unless you get a lot more reallocated sectors
> than are currently pending, you can put the drive back into use if you
> like (bearing in mind the reputation of these drives and weighing the
> replacement cost against the value of your data).

Read the linux raid wiki - the page about programming projects at the
bottom.

If ddrescue fails to do a complete, okay copy, then maybe you or someone
near you has the smarts to do that little project. Then you can stick
your newly copied drive back knowing that the raid at least has a chance
of reconstructing your data without error.

Cheers,
Wol

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 12:22   ` Alexander Shenkin
  2016-10-28 13:33     ` Andreas Klauer
  2016-10-28 13:36     ` Robin Hill
@ 2016-10-31 16:31     ` Wols Lists
  2 siblings, 0 replies; 29+ messages in thread
From: Wols Lists @ 2016-10-31 16:31 UTC (permalink / raw)
  To: Alexander Shenkin, linux-raid; +Cc: Andreas Klauer, rm, robin

On 28/10/16 13:22, Alexander Shenkin wrote:
>> But... why? Why disable smart? And if you do, is it a surprise that you
>> only notice disk failures when it's already too late?
> 
> yeah, i asked myself that same question.  there was probably some reason
> I did, but i don't remember what it was.  i'll keep smart enabled from
> now on...

I bet he didn't disable SMART - bear in mind I also have two 3TB
Barracudas ... and they lose their settings at power-off/on.

If he didn't set the boot process to explicitly turn SMART on, EVERY
BOOT, then by default it's off.
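
A one-liner in the boot scripts (rc.local or similar) is enough to cover
that, e.g.:

  for d in /dev/sd[a-d]; do smartctl -s on "$d"; done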

Cheers,
Wol

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: recovering failed raid5
  2016-10-28 13:36     ` Robin Hill
  2016-10-31 10:44       ` Alexander Shenkin
  2016-10-31 16:28       ` Wols Lists
@ 2016-11-16  9:04       ` Alexander Shenkin
  2016-11-16 11:14         ` Andreas Klauer
  2016-11-16 15:35         ` Wols Lists
  2 siblings, 2 replies; 29+ messages in thread
From: Alexander Shenkin @ 2016-11-16  9:04 UTC (permalink / raw)
  To: linux-raid, Andreas Klauer, rm, robin

Hello all,

As a quick reminder, my sdb failed in a 4-disk RAID5, and then sdc 
failed when trying to replace sdb.  I'm now trying to recover sdc with 
ddrescue.

After much back and forth, I've finally got ddrescue running to 
replicate my apparently-faulty sdc.  I'm ddrescue'ing from a Seagate 3TB 
to a Toshiba 3TB drive, and I'm getting a 'No space left on device' 
error.  Any thoughts?

One further question: should I also try to ddrescue my original failed 
sdb in the hopes that anything lost on sdc would be covered by the 
recovered sdb?

Logs below... (in the output below, /dev/sdb is the original failed Seagate 
that was sdc in the array, and /dev/sdc is the new, bare Toshiba drive).

Thanks,
Allie


username@Ubuntu-VirtualBox:~$ sudo ddrescue -d -f -r3 /dev/sdb /dev/sdc 
~/rescue.logfile
[sudo] password for username:
GNU ddrescue 1.19
Press Ctrl-C to interrupt
rescued:     3000 GB,  errsize:   65536 B,  current rate:   55640 kB/s
    ipos:     3000 GB,   errors:       1,    average rate:   83070 kB/s
    opos:     3000 GB, run time:   10.03 h,  successful read:       0 s ago
Copying non-tried blocks... Pass 1 (forwards)
ddrescue: Write error: No space left on device



username@Ubuntu-VirtualBox:~$ lsblk -o name,label,size,fstype,model
NAME   LABEL           SIZE FSTYPE            MODEL
sda                      8G                   VBOX HARDDISK
├─sda1                   6G ext4
├─sda2                   1K
└─sda5                   2G swap
sdb                    2.7T                   2105
├─sdb1 arrayname:0  1.9G linux_raid_member
├─sdb2                   1M
├─sdb3 arrayname:2  2.7T linux_raid_member
└─sdb4 arrayname:3  7.6G linux_raid_member
sdc                    2.7T                   Expan



username@Ubuntu-VirtualBox:~$ fdisk -l

[...]

Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 4B356AFA-8F48-4227-86F0-329565146D7A

Device          Start        End    Sectors  Size Type
/dev/sdb1        2048    3905535    3903488  1.9G Linux RAID
/dev/sdb2     3905536    3907583       2048    1M BIOS boot
/dev/sdb3     3907584 5844547583 5840640000  2.7T Linux RAID
/dev/sdb4  5844547584 5860532223   15984640  7.6G Linux RAID


Disk /dev/sdc: 2.7 TiB, 3000592977920 bytes, 732566645 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start        End    Sectors Size Id Type
/dev/sdc1           1 4294967295 4294967295  16T ee GPT



username@Ubuntu-VirtualBox:~$ cat rescue.logfile
# Rescue Logfile. Created by GNU ddrescue version 1.19
# Command line: ddrescue -d -f -r3 /dev/sdb /dev/sdc 
/home/username/rescue.logfile
# Start time:   2016-11-15 13:54:24
# Current time: 2016-11-15 23:56:25
# Copying non-tried blocks... Pass 1 (forwards)
# current_pos  current_status
0x2BAA1470000     ?
#      pos        size  status
0x00000000  0x7F5A0000  +
0x7F5A0000  0x00010000  *
0x7F5B0000  0x00010000  ?
0x7F5C0000  0x2BA21EB0000  +


On 10/28/2016 2:36 PM, Robin Hill wrote:
> On Fri Oct 28, 2016 at 01:22:31PM +0100, Alexander Shenkin wrote:
>
>> Thanks Andreas, much appreciated.  Your points about selftests and smart
>> are well taken, and i'll implement them once i get this back up.  I'll
>> buy yet another new, non drive-from-hell (yes Roman, I did buy the same
>> damn drive again.  Will try to return it, thanks for the heads up...)
>> and follow your instructions below.
>>
>> One remaining question: is sdc definitely toast?  Or, is it possible
>> that the Timeout Mismatch (as mentioned by Robin Hill; thanks Robin) is
>> flagging the drive as failed, when something else is at play and perhaps
>> the drive is actually fine?
>>
> It's not definitely toast, no (but this is unrelated to the Timeout
> mismatches). It has some pending reallocations, which means the drive
> was unable to read from some blocks - if a write to the blocks fails
> then one of the spare blocks will be reallocated instead, but a write
> will often succeed and the pending reallocation will just be cleared.
>
> Unfortunately, reconstruction of the array depends on this data being
> readable, so the fact the drive isn't toast doesn't necessarily help.
> I'd suggest replicating (using ddrescue) that drive to the new one (when
> it arrives) as a first step. It's possible ddrescue will manage to read
> the data (it'll make several attempts, so can sometimes read data that
> fails initially), otherwise you'll end up with some missing data
> (possibly corrupt files, possibly corrupt filesystem metadata, possibly
> just a bit of extra noise in an audio/video file). Once that's done, you
> can do a proper check on sdc (e.g. a badblocks read/write test), which
> will either lead to sector actually being reallocated, or to clearing
> the pending reallocations. Unless you get a lot more reallocated sectors
> than are currently pending, you can put the drive back into use if you
> like (bearing in mind the reputation of these drives and weighing the
> replacement cost against the value of your data).
>
> If you run a regular selftest on the array, these sort of issues would
> be picked up and repaired automatically (the read errors will trigger
> rewrites and either reallocate blocks, clear the pending reallocations,
> or fail the drive). Otherwise they're liable to come back to bite you
> when you're trying to recover from a different failure.
>
> Timeout Mismatches will lead to drives being failed from an otherwise
> healthy array - a read failure on the drive can't be corrected as the
> drive is still busy trying when the write request goes through, so the
> drive gets kicked out of the array. You didn't say what the issue was
> with your original sdb, but if it wasn't a definite fault then it may
> have been affected by a timeout mismatch.
>
> Cheers,
>     Robin
>
>> To everyone: sorry for the multiple posts.  Was having majordomo issues...
>>
>> On 10/27/2016 5:04 PM, Andreas Klauer wrote:
>>> On Thu, Oct 27, 2016 at 04:06:14PM +0100, Alexander Shenkin wrote:
>>>> md2: raid5 mounted on /, via sd[abcd]3
>>>
>>> Two failed disks...
>>>
>>>> md0: raid1 mounted on /boot, via sd[abcd]1
>>>
>>> Actually only two disks active in that one, the other two are spares.
>>> It hardly matters for /boot, but you could grow it to a 4 disk raid1.
>>> Spares are not useful.
>>>
>>>> My sdb was recently reporting problems.  Instead of second guessing
>>>> those problems, I just got a new disk, replaced it, and added it to
>>>> the arrays.
>>>
>>> Replacing right away is the right thing to do.
>>> Unfortunately it seems you have another disk that is broke too.
>>>
>>>> 2) smartctl (disabled on drives - can enable once back up.  should I?)
>>>> note: SMART only enabled after problems started cropping up.
>>>
>>> But... why? Why disable smart? And if you do, is it a surprise that you
>>> only notice disk failures when it's already too late?
>>
>> yeah, i asked myself that same question.  there was probably some reason
>> I did, but i don't remember what it was.  i'll keep smart enabled from
>> now on...
>>
>>> You should enable smart, and not only that, also run regular selftests,
>>> and have smartd running, and have it send you mail when something happens.
>>> Same with raid checks, raid checks are at least something but it won't
>>> tell you about how many reallocated sectors your drive has.
>>
>> will do
>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sda
>>>
>>> Looks fine but never ran a selftest.
>>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sdb
>>>
>>> Looks new. (New drives need selftests too.)
>>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sdc
>>>> smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.19.0-39-generic] (local build)
>>>> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
>>>>
>>>> === START OF INFORMATION SECTION ===
>>>> Model Family:     Seagate Barracuda 7200.14 (AF)
>>>> Device Model:     ST3000DM001-1CH166
>>>> Serial Number:    W1F1N909
>>>>
>>>> 197 Current_Pending_Sector  -O--C-   100   100   000    -    8
>>>> 198 Offline_Uncorrectable   ----C-   100   100   000    -    8
>>>
>>> This one is faulty and probably the reason why your resync failed.
>>> You have no redundancy left, so an option here would be to get a
>>> new drive and ddrescue it over.
>>>
>>> That's exactly the kind of thing you should be notified instantly
>>> about via mail. And it should be discovered when running selftests.
>>> Without full surface scan of the media, the disk itself won't know.
>>>
>>>> ==> WARNING: A firmware update for this drive may be available,
>>>> see the following Seagate web pages:
>>>> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
>>>> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
>>>
>>> About this, *shrug*
>>> I don't have these drives, you might want to check that out.
>>> But it probably won't fix bad sectors.
>>>
>>>> root@machinename:/home/username# smartctl --xall /dev/sdd
>>>
>>> Some strange things in the error log here, but old.
>>> Still, same as for all others - selftest.
>>>
>>>> ################### mdadm --examine ###########################
>>>>
>>>> /dev/sda1:
>>>>      Raid Level : raid1
>>>>    Raid Devices : 2
>>>
>>> A RAID 1 with two drives, could be four.
>>>
>>>> /dev/sdb1:
>>>> /dev/sdc1:
>>>
>>> So these would also have data instead of being spare.
>>>
>>>> /dev/sda3:
>>>>      Raid Level : raid5
>>>>    Raid Devices : 4
>>>>
>>>>     Update Time : Mon Oct 24 09:02:52 2016
>>>>          Events : 53547
>>>>
>>>>    Device Role : Active device 0
>>>>    Array State : A..A ('A' == active, '.' == missing)
>>>
>>> RAID-5 with two failed disks.
>>>
>>>> /dev/sdc3:
>>>>      Raid Level : raid5
>>>>    Raid Devices : 4
>>>>
>>>>     Update Time : Mon Oct 24 08:53:57 2016
>>>>          Events : 53539
>>>>
>>>>    Device Role : Active device 2
>>>>    Array State : AAAA ('A' == active, '.' == missing)
>>>
>>> This one failed, 8:53.
>>>
>>>> ############ /proc/mdstat ############################################
>>>>
>>>> md2 : active raid5 sda3[0] sdc3[2](F) sdd3[3]
>>>>       8760565248 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2]
>>>> [U__U]
>>>
>>> [U__U] refers to device roles as in [0123],
>>> so device role 0 and 3 is okay, 1 and 2 missing.
>>>
>>>> md0 : active raid1 sdb1[4](S) sdc1[2](S) sda1[0] sdd1[3]
>>>>       1950656 blocks super 1.2 [2/2] [UU]
>>>
>>> Those two spares again, could be [UUUU] instead.
>>>
>>> tl;dr
>>> stop it all,
>>> ddrescue /dev/sdc to your new disk,
>>> try your luck with --assemble --force (not using /dev/sdc!),
>>> get yet another new disk, add, sync, cross fingers.
>>>
>>> There's also mdadm --replace instead of --remove, --add,
>>> that sometimes helps if there's only a few bad sectors
>>> on each disk. If the disk you already removed wasn't
>>> already kicked from the array by the time you replaced,
>>> maybe it would have avoided this problem.
>>>
>>> But good disk monitoring and testing is even more important.
>>
>> thanks a bunch, Andreas.  I'll monitor and test from now on...
>>
>>> Regards
>>> Andreas Klauer
>>
>


* Re: recovering failed raid5
  2016-11-16  9:04       ` Alexander Shenkin
@ 2016-11-16 11:14         ` Andreas Klauer
  2016-11-16 13:27           ` Alexander Shenkin
  2016-11-16 15:35         ` Wols Lists
  1 sibling, 1 reply; 29+ messages in thread
From: Andreas Klauer @ 2016-11-16 11:14 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid

On Wed, Nov 16, 2016 at 09:04:29AM +0000, Alexander Shenkin wrote:
> I'm getting a 'No space left on device error'.  Any thoughts?

It's smaller by 4096 bytes, that's probably not a problem. 
ddrescue seems to have failed to copy 128K of data, 
but that's probably not a big problem either.

Your problem is something else:

> Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
> Sector size (logical/physical): 512 bytes / 512 bytes
> Disklabel type: gpt

> Disk /dev/sdc: 2.7 TiB, 3000592977920 bytes, 732566645 sectors
> Sector size (logical/physical): 4096 bytes / 4096 bytes
> Disklabel type: dos

The physical sector size is different.

Unfortunately the GPT partition scheme still depends on the sector size and is 
inherently incompatible when dd(rescue)'d to a drive with a different sector size.

In theory it would be possible to ignore this, i.e. interpret GPT 
correctly on a 4K sector drive even if it was created for a 512b drive, 
or vice versa, but Linux is quite strict about standards in this case. 
(If Linux was smarter it would work in Linux but fail for Windows...)

Anyway, you'll have to fixer-upper your GPT partition tables to 4K. 
gdisk has an expert -> recovery section that might be able to do so 
automagically, or you could just manually recreate with the correct 
_byte_ offsets (sector offset will be different).
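
For example - a sketch only, assuming the copy is sitting on /dev/sdc as in
your fdisk listing, and taking each 512-byte start/end sector from that
listing divided by 8 (512 -> 4096 bytes per sector); check every number
against your own output before writing anything:

sgdisk --zap-all /dev/sdc                           # drop the broken 512b-sector GPT
sgdisk -n 1:256:488191       -t 1:fd00 /dev/sdc     # was 2048..3905535
sgdisk -n 2:488192:488447    -t 2:ef02 /dev/sdc     # BIOS boot, was 3905536..3907583
sgdisk -n 3:488448:730568447 -t 3:fd00 /dev/sdc     # was 3907584..5844547583
sgdisk -n 4:730568448:732566527 -t 4:fd00 /dev/sdc  # was 5844547584..5860532223

(Add -a 8 as well if sgdisk tries to adjust those start sectors.) Only the
partition table is rewritten; the md superblocks and data inside the
partitions stay where ddrescue put them, as long as the byte offsets match.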

Your partitions are all MiB aligned so there are no alignment issues.

Once the partition table is fixed you should be able to proceed normally.

Regards
Andreas Klauer


* Re: recovering failed raid5
  2016-11-16 11:14         ` Andreas Klauer
@ 2016-11-16 13:27           ` Alexander Shenkin
  2016-11-16 13:59             ` Andreas Klauer
  0 siblings, 1 reply; 29+ messages in thread
From: Alexander Shenkin @ 2016-11-16 13:27 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: linux-raid

Thanks Andreas - replies below.

On 11/16/2016 11:14 AM, Andreas Klauer wrote:
> On Wed, Nov 16, 2016 at 09:04:29AM +0000, Alexander Shenkin wrote:
>> I'm getting a 'No space left on device error'.  Any thoughts?
>
> It's smaller by 4096 bytes, that's probably not a problem.
> ddrescue seems to have failed to copy 128K of data,
> but that's probably not a big problem either.

Regarding the failed copy, where do you see the 128k?  I see that there 
was an errsize of 65536 bytes, but I'm not sure how to tell whether ddrescue 
ever managed to read that data... I suspect ddrescue ran out of disk 
space before it was able to retry those unreadable bytes...

Won't both issues be problematic (i.e. unread data + out of space)?  I 
believe the last drive partition is the one that participates in the 
array that gets mounted as "/"... so I can't really just throw that one 
away...

https://i.imgur.com/SMdCo12.png

>
> Your problem is something else:
>
>> Disk /dev/sdb: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> Disklabel type: gpt
>
>> Disk /dev/sdc: 2.7 TiB, 3000592977920 bytes, 732566645 sectors
>> Sector size (logical/physical): 4096 bytes / 4096 bytes
>> Disklabel type: dos
>
> The physical sector size is different.
>
> Unfortunately the GPT partition scheme still depends on the sector size and is
> inherently incompatible when dd(rescue)'d to a drive with a different sector size.
>
> In theory it would be possible to ignore this, i.e. interpret GPT
> correctly on a 4K sector drive even if it was created for a 512b drive,
> or vice versa, but Linux is quite strict about standards in this case.
> (If Linux was smarter it would work in Linux but fail for Windows...)
>
> Anyway, you'll have to fixer-upper your GPT partition tables to 4K.
> gdisk has an expert -> recovery section that might be able to do so
> automagically, or you could just manually recreate with the correct
> _byte_ offsets (sector offset will be different).

So, just trying to understand the issue here...  The original (failed) 
drive had 512-byte sectors...  Does that mean that ddrescue has copied 
the 512-byte-sector partition table to my 4k drive, and hence I just need 
to fix the partition table on the new 4k drive?

>
> Your partitions are all MiB aligned so there are no alignment issues.
>
> Once the partition table is fixed you should be able to proceed normally.
>
> Regards
> Andreas Klauer
>


* Re: recovering failed raid5
  2016-11-16 13:27           ` Alexander Shenkin
@ 2016-11-16 13:59             ` Andreas Klauer
  0 siblings, 0 replies; 29+ messages in thread
From: Andreas Klauer @ 2016-11-16 13:59 UTC (permalink / raw)
  To: Alexander Shenkin; +Cc: linux-raid

On Wed, Nov 16, 2016 at 01:27:55PM +0000, Alexander Shenkin wrote:
> Does that mean that ddrescue has copied the 512 partition tables 
> to my 4k drive, and hence I just need to fix the partition table 
> on the new 4k drive?

Just so.

And the missing 4096 bytes don't matter, you had enough unpartitioned space.
It doesn't affect your data at all.

As for the 128K, I counted all the non-'+' blocks; if it's only 64K then even better.
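
If you want to double-check it yourself, a quick sketch - this lists every
block in the mapfile that is not marked '+' (rescued); positions and sizes
are hex byte values:

grep -v '^#' ~/rescue.logfile | awk 'NF == 3 && $3 != "+"'

And if your ddrescue package installed the ddrescuelog tool,
'ddrescuelog -t ~/rescue.logfile' prints the rescued/bad totals directly.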

Regards
Andreas Klauer


* Re: recovering failed raid5
  2016-11-16  9:04       ` Alexander Shenkin
  2016-11-16 11:14         ` Andreas Klauer
@ 2016-11-16 15:35         ` Wols Lists
  2016-11-16 15:50           ` Alexander Shenkin
  1 sibling, 1 reply; 29+ messages in thread
From: Wols Lists @ 2016-11-16 15:35 UTC (permalink / raw)
  To: Alexander Shenkin, linux-raid, rm, robin

On 16/11/16 09:04, Alexander Shenkin wrote:
> Hello all,
> 
> As a quick reminder, my sdb failed in a 4-disk RAID5, and then sdc
> failed when trying to replace sdb.  I'm now trying to recover sdc with
> ddrescue.
> 
> After much back and forth, I've finally got ddrescue running to
> replicate my apparently-faulty sdc.  I'm ddrescue'ing from a seagate 3TB
> to a toshiba 3TB drive, and I'm getting a 'No space left on device
> error'.  Any thoughts?
> 
> One further question: should I also try to ddrescue my original failed
> sdb in the hopes that anything lost on sdc would be covered by the
> recovered sdb?

Depends how badly out of sync the event counts are. However, I note that
your ddrescue copy appeared to run without any errors (apart from
falling off the end of the drive :-) ?

In which case, you haven't lost anything on sdc. Which is why the wiki
says don't mount your array writeable while you're trying to recover it
- you're not going to muck up your data and have user-space provoke
further errors.

If the array barfs while it's rebuilding, it's hopefully just a
transient - do another assemble with --force to get it back again.

Once you've got the array properly back up again :-

1) make sure that the timeout script is run EVERY BOOT to fix the kernel
defaults for your remaining barracudas (see the sketch after this list).

2) make sure smarts are enabled EVERY BOOT because barracudas forget
their settings on power-off.

3) You've now got a spare drive. If a smart self-check comes back pretty
clean and it looks like a transient problem not a dud drive, then put it
back in and convert the array to raid 6.

4) MONITOR MONITOR MONITOR
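
Something like this, run from rc.local or a small boot-time unit, is roughly
what the wiki's timeout script boils down to (a sketch - adjust the device
list, and it needs root):

for dev in /dev/sd[a-d]; do
    smartctl -q errorsonly -s on "$dev"           # re-enable SMART (point 2)
    if smartctl -l scterc,70,70 "$dev" >/dev/null 2>&1; then
        :   # drive accepted a 7-second error-recovery (ERC) limit
    else
        echo 180 > /sys/block/${dev##*/}/device/timeout   # desktop drive: long kernel timeout
    fi
done

Drives that take SCT ERC give up on a bad sector after 7 seconds, well inside
the kernel's timeout; drives that don't (the desktop Barracudas) get a
180-second kernel timeout instead, so md isn't tricked into kicking a drive
that's merely retrying.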

You've seen the comments elsewhere about the 3TB barracudas? Barracudas
in general aren't bad drives, but the 3TB model has a reputation for
dying early and quickly. You can then plan to replace the drives at your
leisure, knowing that provided you catch any failure, you've still got
redundancy with one dead drive in a raid-6. Even better, get another
Toshiba and go raid-6+spare. And don't say you haven't got enough sata
ports - an add-in card is about £20 :-)

Cheers,
Wol


* Re: recovering failed raid5
  2016-11-16 15:35         ` Wols Lists
@ 2016-11-16 15:50           ` Alexander Shenkin
  2016-11-16 16:38             ` Wols Lists
  0 siblings, 1 reply; 29+ messages in thread
From: Alexander Shenkin @ 2016-11-16 15:50 UTC (permalink / raw)
  To: Wols Lists, linux-raid, rm, robin



On 11/16/2016 3:35 PM, Wols Lists wrote:
> On 16/11/16 09:04, Alexander Shenkin wrote:
>> Hello all,
>>
>> As a quick reminder, my sdb failed in a 4-disk RAID5, and then sdc
>> failed when trying to replace sdb.  I'm now trying to recover sdc with
>> ddrescue.
>>
>> After much back and forth, I've finally got ddrescue running to
>> replicate my apparently-faulty sdc.  I'm ddrescue'ing from a seagate 3TB
>> to a toshiba 3TB drive, and I'm getting a 'No space left on device
>> error'.  Any thoughts?
>>
>> One further question: should I also try to ddrescue my original failed
>> sdb in the hopes that anything lost on sdc would be covered by the
>> recovered sdb?
>
> Depends how badly out of sync the event counts are. However, I note that
> your ddrescue copy appeared to run without any errors (apart from
> falling off the end of the drive :-) ?

Thanks Wol.

From my newbie reading, it looked like there was one 65 kB error... but 
I'm not sure how to tell if it got read properly by ddrescue in the end 
- any tips?  I don't see any "retrying bad sectors" (-) lines in the 
logfile below...

username@Ubuntu-VirtualBox:~$ sudo ddrescue -d -f -r3 /dev/sdb /dev/sdc 
~/rescue.logfile
[sudo] password for username:
GNU ddrescue 1.19
Press Ctrl-C to interrupt
rescued:     3000 GB,  errsize:   65536 B,  current rate:   55640 kB/s
    ipos:     3000 GB,   errors:       1,    average rate:   83070 kB/s
    opos:     3000 GB, run time:   10.03 h,  successful read:       0 s ago
Copying non-tried blocks... Pass 1 (forwards)
ddrescue: Write error: No space left on device

# Rescue Logfile. Created by GNU ddrescue version 1.19
# Command line: ddrescue -d -f -r3 /dev/sdb /dev/sdc 
/home/username/rescue.logfile
# Start time:   2016-11-15 13:54:24
# Current time: 2016-11-15 23:56:25
# Copying non-tried blocks... Pass 1 (forwards)
# current_pos  current_status
0x2BAA1470000     ?
#      pos        size  status
0x00000000  0x7F5A0000  +
0x7F5A0000  0x00010000  *
0x7F5B0000  0x00010000  ?
0x7F5C0000  0x2BA21EB0000  +
0x2BAA1470000  0x00006000  ?

>
> In which case, you haven't lost anything on sdc. Which is why the wiki
> says don't mount your array writeable while you're trying to recover it
> - you're not going to muck up your data and have user-space provoke
> further errors.

Gotcha - I'm doing this with the removed drives on a different (virtual) 
machine.  Seemed like the arrays were getting mounted read-only by 
default when the disks were having issues...

>
> If the array barfs while it's rebuilding, it's hopefully just a
> transient, and do another assemble with --force to get it back again.

So, I guess I put the copied drive back in as sdc, and a new blank drive 
as sdb, add sdb, and just let it rebuild from there?  Or do I issue 
this command as appropriate?

mdadm --force --assemble /dev/mdN /dev/sd[XYZ]1

>
> Once you've got the array properly back up again :-
>
> 1) make sure that the timeout script is run EVERY BOOT to fix the kernel
> defaults for your remaining barracudas.
>
> 2) make sure smarts are enabled EVERY BOOT because barracudas forget
> their settings on power-off.
>
> 3) You've now got a spare drive. If a smart self-check comes back pretty
> clean and it looks like a transient problem not a dud drive, then put it
> back in and convert the array to raid 6.
>
> 4) MONITOR MONITOR MONITOR
>
> You've seen the comments elsewhere about the 3TB barracudas? Barracudas
> in general aren't bad drives, but the 3TB model has a reputation for
> dying early and quickly. You can then plan to replace the drives at your
> leisure, knowing that provided you catch any failure, you've still got
> redundancy with one dead drive in a raid-6. Even better, get another
> Toshiba and go raid-6+spare. And don't say you haven't got enough sata
> ports - an add-in card is about £20 :-)
>
> Cheers,
> Wol
>


* Re: recovering failed raid5
  2016-11-16 15:50           ` Alexander Shenkin
@ 2016-11-16 16:38             ` Wols Lists
  2017-01-05 12:08               ` Alexander Shenkin
  0 siblings, 1 reply; 29+ messages in thread
From: Wols Lists @ 2016-11-16 16:38 UTC (permalink / raw)
  To: Alexander Shenkin, linux-raid, rm, robin

On 16/11/16 15:50, Alexander Shenkin wrote:
> 
> 
> On 11/16/2016 3:35 PM, Wols Lists wrote:
>> On 16/11/16 09:04, Alexander Shenkin wrote:
>>> Hello all,
>>>
>>> As a quick reminder, my sdb failed in a 4-disk RAID5, and then sdc
>>> failed when trying to replace sdb.  I'm now trying to recover sdc with
>>> ddrescue.
>>>
>>> After much back and forth, I've finally got ddrescue running to
>>> replicate my apparently-faulty sdc.  I'm ddrescue'ing from a seagate 3TB
>>> to a toshiba 3TB drive, and I'm getting a 'No space left on device
>>> error'.  Any thoughts?
>>>
>>> One further question: should I also try to ddrescue my original failed
>>> sdb in the hopes that anything lost on sdc would be covered by the
>>> recovered sdb?
>>
>> Depends how badly out of sync the event counts are. However, I note that
>> your ddrescue copy appeared to run without any errors (apart from
>> falling off the end of the drive :-) ?
> 
> Thanks Wol.
> 
> From my newbie reading, it looked like there was on 65kb error... but
> i'm not sure how to tell if it got read properly by ddrescue in the end
> - any tips?  I don't see any "retrying bad sectors" (-) lines in the
> logfile below...
> 
> username@Ubuntu-VirtualBox:~$ sudo ddrescue -d -f -r3 /dev/sdb /dev/sdc
> ~/rescue.logfile
> [sudo] password for username:
> GNU ddrescue 1.19
> Press Ctrl-C to interrupt
> rescued:     3000 GB,  errsize:   65536 B,  current rate:   55640 kB/s
>    ipos:     3000 GB,   errors:       1,    average rate:   83070 kB/s
>    opos:     3000 GB, run time:   10.03 h,  successful read:       0 s ago
> Copying non-tried blocks... Pass 1 (forwards)
> ddrescue: Write error: No space left on device
> 
> # Rescue Logfile. Created by GNU ddrescue version 1.19
> # Command line: ddrescue -d -f -r3 /dev/sdb /dev/sdc
> /home/username/rescue.logfile
> # Start time:   2016-11-15 13:54:24
> # Current time: 2016-11-15 23:56:25
> # Copying non-tried blocks... Pass 1 (forwards)
> # current_pos  current_status
> 0x2BAA1470000     ?
> #      pos        size  status
> 0x00000000  0x7F5A0000  +
> 0x7F5A0000  0x00010000  *
> 0x7F5B0000  0x00010000  ?
> 0x7F5C0000  0x2BA21EB0000  +
> 0x2BAA1470000  0x00006000  ?
> 
>>
>> In which case, you haven't lost anything on sdc. Which is why the wiki
>> says don't mount your array writeable while you're trying to recover it
>> - you're not going to muck up your data and have user-space provoke
>> further errors.
> 
> gotcha - i'm doing this with removed drives on a different (virtual)
> machine.  Seemed like the arrays were getting mounted read-only by
> default when the disks were having issues...
> 
>>
>> If the array barfs while it's rebuilding, it's hopefully just a
>> transient, and do another assemble with --force to get it back again.
> 
> so, i guess i put the copied drive back in as sdc, and a new blank drive
> as sdb, add sdb, and just let it rebuild from there?  Or, do I issue
> this command as appropriate?
> 
> mdadm --force --assemble /dev/mdN /dev/sd[XYZ]1

Let me get my thoughts straight - cross check what I'm writing but ...

sda and sdd have never failed. sdc is the new drive you've ddrescue'd onto.

So in order to get a working array, you need to do
"mdadm --assemble --force /dev/mdN /dev/sd[adc]n"
This will give you a working, degraded array, which unfortunately
probably has a little bit of corruption - whatever you were writing when
the array first failed will not have been saved properly. You've
basically recovered the array with the two drives that are okay, and a
copy of the drive that failed most recently.

IFF the smarts report that your two failed drives are okay, then you can
add them back in. I'm hoping it was just the timeout problem - with
Barracudas that's quite likely.

MAKE SURE that you've run the timeout script on all the Barracudas, or
the array is simply going to crash again.

WIPE THE SUPERBLOCKS on the old drives. I'm not sure what the mdadm
command is, but we're adding them back in as new drives.

mdadm --add /dev/mdN /dev/old-b /dev/old-c

This will think they are two new drives and will rebuild on to one of
them. You can then convert the array to raid 6 and it will rebuild on to
the other one.

Once you've got back to a fully-working raid-5, you can do a fsck on the
filesystem(s) to find the corruption.

Lastly, if you can get another Toshiba drive, add that in as a spare.

This will leave you with a 6-drive raid-6 - 3xdata, 2xparity, 1xspare.

If the smarts report that any of your barracudas have a load of errors,
it's not worth faffing about with them. Bin them and replace them.

Going back to an earlier point of yours - DO NOT try to force re-add the
first drive that failed back into the array. The mismatch in event count
will mean loads of corruption.
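
Spelled out, the sequence above looks roughly like this - a sketch with the
device names as placeholders, so check every one against mdadm --examine
before running anything:

mdadm --assemble --force /dev/md2 /dev/sda3 /dev/sdc3 /dev/sdd3   # degraded, 3 of 4
mdadm --zero-superblock /dev/sdb3          # the superblock wipe mentioned above
mdadm --manage /dev/md2 --add /dev/sdb3    # rebuild back to a full 4-disk raid5
# after that resync finishes, with a fifth disk partitioned and in place:
mdadm --zero-superblock /dev/sde3
mdadm --manage /dev/md2 --add /dev/sde3
mdadm --grow /dev/md2 --level=6 --raid-devices=5   # may ask for a --backup-file

Here /dev/sdb3 stands for the partition on whichever old-but-healthy drive
goes back in first, and /dev/sde3 for the second one; --zero-superblock is
the usual way to make md treat them as brand-new members.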

Cheers,
Wol
> 
>>
>> Once you've got the array properly back up again :-
>>
>> 1) make sure that the timeout script is run EVERY BOOT to fix the kernel
>> defaults for your remaining barracudas.
>>
>> 2) make sure smarts are enabled EVERY BOOT because barracudas forget
>> their settings on power-off.
>>
>> 3) You've now got a spare drive. If a smart self-check comes back pretty
>> clean and it looks like a transient problem not a dud drive, then put it
>> back in and convert the array to raid 6.
>>
>> 4) MONITOR MONITOR MONITOR
>>
>> You've seen the comments elsewhere about the 3TB barracudas? Barracudas
>> in general aren't bad drives, but the 3TB model has a reputation for
>> dying early and quickly. You can then plan to replace the drives at your
>> leisure, knowing that provided you catch any failure, you've still got
>> redundancy with one dead drive in a raid-6. Even better, get another
>> Toshiba and go raid-6+spare. And don't say you haven't got enough sata
>> ports - an add-in card is about £20 :-)
>>
>> Cheers,
>> Wol
>>
> 



* Re: recovering failed raid5
  2016-11-16 16:38             ` Wols Lists
@ 2017-01-05 12:08               ` Alexander Shenkin
  0 siblings, 0 replies; 29+ messages in thread
From: Alexander Shenkin @ 2017-01-05 12:08 UTC (permalink / raw)
  To: Wols Lists, linux-raid, rm, robin

Hi again all,

I've finally gotten new disks and copies ready, and have a small 
operational question.  But first, just a reminder, as this thread is a 
bit old.

My sdb went down in a 4-disk RAID5 array.  After adding a new sdb and 
rebuilding, sdc went down.  I ddrescue'd sdc to a new drive (previous 
attempts were marred by errors when using a USB enclosure; all finally 
went well when using direct motherboard SATA interface - just one 4096 
byte sector couldn't be read).  So now I have: sda (good), sdc 
(ddrescued), and sdd (good).  I have copied the partition table, and 
randomized the IDs, to a new drive and connected it to the sdb SATA 
interface on the motherboard.  All of this was done using the system 
rescue cd on a USB drive (https://www.system-rescue-cd.org/).
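
(For reference, sgdisk can do that copy-and-randomize step in two commands -
a sketch with placeholder device names, and only valid when both drives use
the same logical sector size:

sgdisk --replicate=/dev/sdX /dev/sdY      # copy sdY's partition table onto sdX
sgdisk --randomize-guids /dev/sdX         # give sdX its own disk/partition GUIDs

where sdY is a healthy array member and sdX is the blank drive going into the
sdb slot.)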

Now the question is: how do I actually get the system up to a state 
where I can run "mdadm --assemble --force /dev/mdN /dev/sd[adc]n" as 
suggested by Wol below?  The system won't boot from the HDDs since there 
are only 2 working members of the RAID apparently (I guess it must have 
removed sdc previously?  not sure).  And trying to run mdadm from the 
system rescue cd OS says that the md config isn't there (or something to 
that effect).  (note: I do have the timeout script running on the USB OS).
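
Concretely, is something like the following - run from the rescue
environment, with no mdadm.conf at all - the right idea?  Just a sketch; the
device names assume the md2 members still show up there as sda3, sdc3 and sdd3:

mdadm --examine /dev/sd[acd]3     # sanity-check the event counts first
mdadm --assemble --force /dev/md2 /dev/sda3 /dev/sdc3 /dev/sdd3
cat /proc/mdstat                  # should show md2 up but degraded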

Should I somehow recreate the md config on the OS on the USB drive?  Or 
something else?  Thanks again all!

Best,
Allie

On 11/16/2016 4:38 PM, Wols Lists wrote:
> On 16/11/16 15:50, Alexander Shenkin wrote:
>>
>>
>> On 11/16/2016 3:35 PM, Wols Lists wrote:
>>> On 16/11/16 09:04, Alexander Shenkin wrote:
>>>> Hello all,
>>>>
>>>> As a quick reminder, my sdb failed in a 4-disk RAID5, and then sdc
>>>> failed when trying to replace sdb.  I'm now trying to recover sdc with
>>>> ddrescue.
>>>>
>>>> After much back and forth, I've finally got ddrescue running to
>>>> replicate my apparently-faulty sdc.  I'm ddrescue'ing from a seagate 3TB
>>>> to a toshiba 3TB drive, and I'm getting a 'No space left on device
>>>> error'.  Any thoughts?
>>>>
>>>> One further question: should I also try to ddrescue my original failed
>>>> sdb in the hopes that anything lost on sdc would be covered by the
>>>> recovered sdb?
>>>
>>> Depends how badly out of sync the event counts are. However, I note that
>>> your ddrescue copy appeared to run without any errors (apart from
>>> falling off the end of the drive :-) ?
>>
>> Thanks Wol.
>>
>> From my newbie reading, it looked like there was on 65kb error... but
>> i'm not sure how to tell if it got read properly by ddrescue in the end
>> - any tips?  I don't see any "retrying bad sectors" (-) lines in the
>> logfile below...
>>
>> username@Ubuntu-VirtualBox:~$ sudo ddrescue -d -f -r3 /dev/sdb /dev/sdc
>> ~/rescue.logfile
>> [sudo] password for username:
>> GNU ddrescue 1.19
>> Press Ctrl-C to interrupt
>> rescued:     3000 GB,  errsize:   65536 B,  current rate:   55640 kB/s
>>    ipos:     3000 GB,   errors:       1,    average rate:   83070 kB/s
>>    opos:     3000 GB, run time:   10.03 h,  successful read:       0 s ago
>> Copying non-tried blocks... Pass 1 (forwards)
>> ddrescue: Write error: No space left on device
>>
>> # Rescue Logfile. Created by GNU ddrescue version 1.19
>> # Command line: ddrescue -d -f -r3 /dev/sdb /dev/sdc
>> /home/username/rescue.logfile
>> # Start time:   2016-11-15 13:54:24
>> # Current time: 2016-11-15 23:56:25
>> # Copying non-tried blocks... Pass 1 (forwards)
>> # current_pos  current_status
>> 0x2BAA1470000     ?
>> #      pos        size  status
>> 0x00000000  0x7F5A0000  +
>> 0x7F5A0000  0x00010000  *
>> 0x7F5B0000  0x00010000  ?
>> 0x7F5C0000  0x2BA21EB0000  +
>> 0x2BAA1470000  0x00006000  ?
>>
>>>
>>> In which case, you haven't lost anything on sdc. Which is why the wiki
>>> says don't mount your array writeable while you're trying to recover it
>>> - you're not going to muck up your data and have user-space provoke
>>> further errors.
>>
>> gotcha - i'm doing this with removed drives on a different (virtual)
>> machine.  Seemed like the arrays were getting mounted read-only by
>> default when the disks were having issues...
>>
>>>
>>> If the array barfs while it's rebuilding, it's hopefully just a
>>> transient, and do another assemble with --force to get it back again.
>>
>> so, i guess i put the copied drive back in as sdc, and a new blank drive
>> as sdb, add sdb, and just let it rebuild from there?  Or, do I issue
>> this command as appropriate?
>>
>> mdadm --force --assemble /dev/mdN /dev/sd[XYZ]1
>
> Let me get my thoughts straight - cross check what I'm writing but ...
>
> sda and sdd have never failed. sdc is the new drive you've ddrescue'd onto.
>
> So in order to get a working array, you need to do
> "mdadm --assemble --force /dev/mdN /dev/sd[adc]n"
> This will give you a working, degraded array, which unfortunately
> probably has a little bit of corruption - whatever you were writing when
> the array first failed will not have been saved properly. You've
> basically recovered the array with the two drives that are okay, and a
> copy of the drive that failed most recently.
>
> IFF the smarts report that your two failed drives are okay, then you can
> add them back in. I'm hoping it was just the timeout problem - with
> Barracudas that's quite likely.
>
> MAKE SURE that you've run the timeout script on all the Barracudas, or
> the array is simply going to crash again.
>
> WIPE THE SUPERBLOCKS on the old drives. I'm not sure what the mdadm
> command is, but we're adding them back in as new drives.
>
> mdadm --add /dev/mdN /dev/old-b /dev/old-c
>
> This will think they are two new drives and will rebuild on to one of
> them. You can then convert the array to raid 6 and it will rebuild on to
> the other one.
>
> Once you've got back to a fully-working raid-5, you can do a fsck on the
> filesystem(s) to find the corruption.
>
> Lastly, if you can get another Toshiba drive, add that in as a spare.
>
> This will leave you with a 6-drive raid-6 - 3xdata, 2xparity, 1xspare.
>
> If the smarts report that any of your barracudas have a load of errors,
> it's not worth faffing about with them. Bin them and replace them.
>
> Going back to an earlier point of yours - DO NOT try to force re-add the
> first drive that failed back into the array. The mismatch in event count
> will mean loads of corruption.
>
> Cheers,
> Wol
>>
>>>
>>> Once you've got the array properly back up again :-
>>>
>>> 1) make sure that the timeout script is run EVERY BOOT to fix the kernel
>>> defaults for your remaining barracudas.
>>>
>>> 2) make sure smarts are enabled EVERY BOOT because barracudas forget
>>> their settings on power-off.
>>>
>>> 3) You've now got a spare drive. If a smart self-check comes back pretty
>>> clean and it looks like a transient problem not a dud drive, then put it
>>> back in and convert the array to raid 6.
>>>
>>> 4) MONITOR MONITOR MONITOR
>>>
>>> You've seen the comments elsewhere about the 3TB barracudas? Barracudas
>>> in general aren't bad drives, but the 3TB model has a reputation for
>>> dying early and quickly. You can then plan to replace the drives at your
>>> leisure, knowing that provided you catch any failure, you've still got
>>> redundancy with one dead drive in a raid-6. Even better, get another
>>> Toshiba and go raid-6+spare. And don't say you haven't got enough sata
>>> ports - an add-in card is about £20 :-)
>>>
>>> Cheers,
>>> Wol
>>>
>>
>



Thread overview: 29+ messages
2016-10-27 15:06 recovering failed raid5 Alexander Shenkin
2016-10-27 16:04 ` Andreas Klauer
2016-10-28 12:22   ` Alexander Shenkin
2016-10-28 13:33     ` Andreas Klauer
2016-10-28 21:16       ` Phil Turmel
2016-10-28 23:45         ` Andreas Klauer
2016-10-29  2:52           ` Edward Kuns
2016-10-29  2:53           ` Phil Turmel
2016-10-29  8:46           ` Mikael Abrahamsson
2016-10-29 10:29       ` Roman Mamedov
2016-10-29 12:02         ` Andreas Klauer
2016-10-30 16:18           ` Phil Turmel
2016-10-28 13:36     ` Robin Hill
2016-10-31 10:44       ` Alexander Shenkin
2016-10-31 11:09         ` Andreas Klauer
2016-10-31 15:19         ` Robin Hill
2016-10-31 16:26         ` Wols Lists
2016-10-31 16:28       ` Wols Lists
2016-11-16  9:04       ` Alexander Shenkin
2016-11-16 11:14         ` Andreas Klauer
2016-11-16 13:27           ` Alexander Shenkin
2016-11-16 13:59             ` Andreas Klauer
2016-11-16 15:35         ` Wols Lists
2016-11-16 15:50           ` Alexander Shenkin
2016-11-16 16:38             ` Wols Lists
2017-01-05 12:08               ` Alexander Shenkin
2016-10-31 16:31     ` Wols Lists
2016-10-27 16:26 ` Roman Mamedov
2016-10-27 20:34 ` Robin Hill
