* RAID5 up, but one drive removed, one says spare building, what now?
From: Jun-Kai Teoh @ 2017-11-10  3:09 UTC (permalink / raw)
  To: Linux-RAID

Hi all,

I managed to get my RAID array back up and the content looks like it's
still there, but it's not resyncing or reshaping, and my parity drive
was removed (I removed it myself while trying to get the array back up).

So what should I do now? I'm afraid of doing anything else at this point.

/dev/md126:
        Version : 1.2
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
     Array Size : 23441323008 (22355.39 GiB 24003.91 GB)
  Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
   Raid Devices : 8
  Total Devices : 7
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu Nov  9 18:57:18 2017
          State : clean, FAILED
 Active Devices : 6
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

  Delta Devices : 1, (7->8)

           Name : livingrm-server:2  (local to host livingrm-server)
           UUID : f7333d4f:8300969d:55148d64:93c8afc8
         Events : 650582

    Number   Major   Minor   RaidDevice State
       0       8      112        0      active sync   /dev/sdh
       1       8       48        1      active sync   /dev/sdd
       7       8       64        2      spare rebuilding   /dev/sde
       3       8       96        3      active sync   /dev/sdg
       4       8       32        4      active sync   /dev/sdc
       5       8       80        5      active sync   /dev/sdf
       6       8       16        6      active sync   /dev/sdb
      14       0        0       14      removed


* Re: RAID5 up, but one drive removed, one says spare building, what now?
From: Wols Lists @ 2017-11-10 11:31 UTC (permalink / raw)
  To: Jun-Kai Teoh, Linux-RAID, Phil Turmel

On 10/11/17 03:09, Jun-Kai Teoh wrote:
> Hi all,
> 
> I managed to get my RAID array back up and the content looks like it's
> still there, but it's not resyncing or reshaping, and my parity drive
> was removed (I removed it myself while trying to get the array back up).
> 
> So what should I do now? I'm afraid of doing anything else at this point.
> 
> /dev/md126:
>         Version : 1.2
>   Creation Time : Thu Jun 30 07:57:36 2016
>      Raid Level : raid5
>      Array Size : 23441323008 (22355.39 GiB 24003.91 GB)
>   Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
>    Raid Devices : 8
>   Total Devices : 7
>     Persistence : Superblock is persistent
> 
>   Intent Bitmap : Internal
> 
>     Update Time : Thu Nov  9 18:57:18 2017
>           State : clean, FAILED
>  Active Devices : 6
> Working Devices : 7
>  Failed Devices : 0
>   Spare Devices : 1
> 
>          Layout : left-symmetric
>      Chunk Size : 512K
> 
>   Delta Devices : 1, (7->8)
> 
>            Name : livingrm-server:2  (local to host livingrm-server)
>            UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Events : 650582
> 
>     Number   Major   Minor   RaidDevice State
>        0       8      112        0      active sync   /dev/sdh
>        1       8       48        1      active sync   /dev/sdd
>        7       8       64        2      spare rebuilding   /dev/sde
>        3       8       96        3      active sync   /dev/sdg
>        4       8       32        4      active sync   /dev/sdc
>        5       8       80        5      active sync   /dev/sdf
>        6       8       16        6      active sync   /dev/sdb
>       14       0        0       14      removed

Okay. I was hoping someone else would chime in, but I'd say this looks
quite promising. You have seven drives of eight, so you have no redundancy :-(

You say your data is still there - does that mean you've mounted it, and
it looks okay?

sde is rebuilding, which means the array is sorting itself out.

You need that eighth drive. If an fsck says you have no (or almost no)
filesystem corruption, and you have a known-good drive, add it in. The
array will then sort itself out.

I would NOT recommend mounting it read-write until it comes back and
says "eight drives of eight working".

Cheers,
Wol



* Re: RAID5 up, but one drive removed, one says spare building, what now?
From: Phil Turmel @ 2017-11-10 13:01 UTC (permalink / raw)
  To: Wols Lists, Jun-Kai Teoh, Linux-RAID

Hi Jun-Kai,

On 11/10/2017 06:31 AM, Wols Lists wrote:
> On 10/11/17 03:09, Jun-Kai Teoh wrote:
>> Hi all,
>>
>> I managed to get my RAID array back up and the content looks like it's
>> still there, but it's not resyncing or reshaping, and my parity drive
>> was removed (I removed it myself while trying to get the array back up).

If you can see your content (mounted read-only, I hope), back up
everything, in order from most critical to least critical.
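
For example, a minimal rsync sketch (the mount points are placeholders
for your actual source and backup locations):

  # most critical data first...
  rsync -aH /mnt/md126/critical/ /mnt/backup/critical/
  # ...then work down the priority list
  rsync -aH /mnt/md126/other/ /mnt/backup/other/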

>> So what should I do now? I'm afraid of doing anything else at this point.

Well, you need to provide more information - at least the mdadm -E
reports for all of the member devices, including the "parity" device
you removed. (Parity is spread among all devices in a normal raid5
layout, so you may be suffering from a misunderstanding.)
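
For example (a sketch - adjust the glob so it covers every member, and
include the removed drive once it's reattached):

  for d in /dev/sd[b-h]; do echo "== $d =="; mdadm -E "$d"; done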

>> /dev/md126:
>>         Version : 1.2
>>   Creation Time : Thu Jun 30 07:57:36 2016
>>      Raid Level : raid5
>>      Array Size : 23441323008 (22355.39 GiB 24003.91 GB)
>>   Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
>>    Raid Devices : 8
>>   Total Devices : 7
>>     Persistence : Superblock is persistent
>>
>>   Intent Bitmap : Internal
>>
>>     Update Time : Thu Nov  9 18:57:18 2017
>>           State : clean, FAILED
>>  Active Devices : 6
>> Working Devices : 7
>>  Failed Devices : 0
>>   Spare Devices : 1
>>
>>          Layout : left-symmetric
>>      Chunk Size : 512K
>>
>>   Delta Devices : 1, (7->8)
                        ^^^^^^

Somewhere in your efforts, you must have used mdadm --grow.  That was
a mistake, and it is the reason for my suggestion above.

Please review your bash history and reconstruct what steps you took in
your efforts to revive your array.  An extract of relevant dmesg text
from that time period may be helpful, too.
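
For example (a rough sketch - exact log locations vary by distro):

  # the mdadm commands you ran, in order
  history | grep mdadm

  # kernel messages from the md subsystem around the failure
  dmesg | grep -i -e md126 -e raid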

It would help if you summarized how you got into this jam in the first
place.

[trim /]

> Okay. I was hoping someone else would chime in, but I'd say this looks
> well promising. You have seven drives of eight so you have no redundancy :-(
> 
> You say your data is still there - does that mean you've mounted it, and
> it looks okay?
> 
> sde is rebuilding, which means the array is sorting itself out.

It's not progressing because of the interrupted reshape.

> You need that eighth drive. If a fsck says you have no (or almost no)
> filesystem corruption, and you have a known-good drive, add it in. The
> array will then sort itself out.

No, backups are first.

> I would NOT recommend mounting it read-write until it comes back and
> says "eight drives of eight working".

Concur.

Phil


* Re: RAID5 up, but one drive removed, one says spare building, what now?
From: Mark Knecht @ 2017-11-10 13:13 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Wols Lists, Jun-Kai Teoh, Linux-RAID

On Fri, Nov 10, 2017 at 6:01 AM, Phil Turmel <philip@turmel.org> wrote:
> Hi Jun-Kai,
>
> [trim /]
>
> Well, you need to provide more information - at least the mdadm -E
> reports for all of the member devices, including the "parity" device
> you removed. (Parity is spread among all devices in a normal raid5
> layout, so you may be suffering from a misunderstanding.)
>

Phil,
   Please reference the thread entitled

"Raid 5 array down/missing - went through wiki steps"

All the mdadm -E info and much more was reported there a week
ago. Wols and I helped get him that far but neither of us was confident
in giving instructions to get him further.

Cheers,
Mark


* Re: RAID5 up, but one drive removed, one says spare building, what now?
From: Jun-Kai Teoh @ 2017-11-10 17:25 UTC (permalink / raw)
  To: Mark Knecht; +Cc: Phil Turmel, Wols Lists, Linux-RAID

I was reshaping/growing my array when it died. The computer came back
up, but the array wouldn't assemble.

I looked around and found some commands for assembling the array
without the one drive that, as I hadn't realized, had been kicked out
some time ago (I used --assemble --force --verbose).

I kept getting the error that there were 6 drives plus 1 rebuilding -
not enough to get the array up. So I unplugged the one drive that had
the incorrect superblock and ran "mdadm -A -R".

It ran and looked like it was resuming the reshape, but it got really
slow - an estimated 7 years to finish.

I killed the process, rebooted the machine, and it just
auto-reassembled the array and automounted it on boot.

I'm in the process of backing up right now.

And yes, the drive/array is mounted.

mdadm version 3.3, September 2013

Below is the updated mdadm --examine output, followed by smartctl
output (note that the sd* names have changed since the earlier post,
likely because I unplugged that drive):

/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 2df0b319:bdb18eee:27b318ec:da55d53d

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Nov 10 09:17:35 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : d2b7f433 - correct
         Events : 651498

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 6
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f1a790a9:98e01257:d9ab257d:95c8f1fc

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Nov 10 09:17:35 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 1b42463e - correct
         Events : 651498

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 6f44eba1:29d246a4:c5e8312e:bac00a7b

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Nov 10 09:17:35 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 8ff6095a - correct
         Events : 651498

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x47
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
Recovery Offset : 11264816 sectors
          State : clean
    Device UUID : 109b7a2f:0529794c:2cf95cc1:d6c0bd6b

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Nov 10 09:17:35 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 4a41aa92 - correct
         Events : 651498

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : deb04cea:d6530966:6c70ca90:bebb143e

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Nov 10 09:17:35 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : bfbc7a25 - correct
         Events : 651498

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 15d85573:6e78f040:8c028ef3:d1301f9d

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Nov 10 09:17:35 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : e57acd25 - correct
         Events : 651498

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdh:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 64d5961e:230e558c:3748b561:a7c6ab8c

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Nov 10 09:17:35 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 39d43ba7 - correct
         Events : 651498

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)

------

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-98-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E2VKP3SV
LU WWN Device Id: 5 0014ee 2620754c8
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Fri Nov 10 09:24:03 2017 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (51120) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 512) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   191   183   021    Pre-fail  Always       -       7441
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       182
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13243
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       182
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       113
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       26737
194 Temperature_Celsius     0x0022   118   101   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-98-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4JNHEC6
LU WWN Device Id: 5 0014ee 2614842c3
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Fri Nov 10 09:24:14 2017 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (53940) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 539) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   189   180   021    Pre-fail  Always       -       7516
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       214
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       18057
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       214
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       128
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       33162
194 Temperature_Celsius     0x0022   117   092   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-98-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E2ZYE8AN
LU WWN Device Id: 5 0014ee 20bf26ce9
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Fri Nov 10 09:24:17 2017 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (54960) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 550) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   186   178   021    Pre-fail  Always       -       7658
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       214
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       18060
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       214
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       121
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       33798
194 Temperature_Celsius     0x0022   117   098   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-98-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K7PZ7R6Z
LU WWN Device Id: 5 0014ee 26410129e
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Fri Nov 10 09:24:20 2017 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (46440) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 492) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x303d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       5
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       320
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       5
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       2
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       61
194 Temperature_Celsius     0x0022   118   114   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-98-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4JNHKL9
LU WWN Device Id: 5 0014ee 2b69e186b
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Fri Nov 10 09:24:25 2017 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (55260) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 552) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   187   180   021    Pre-fail  Always       -       7616
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       675
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   074   074   000    Old_age   Always       -       19242
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       263
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       163
193 Load_Cycle_Count        0x0032   188   188   000    Old_age   Always       -       36231
194 Temperature_Celsius     0x0022   116   091   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-98-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4K2JPZT
LU WWN Device Id: 5 0014ee 2b510b64f
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Fri Nov 10 09:24:28 2017 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (52320) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 523) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   189   179   021    Pre-fail  Always       -       7541
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2626
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26956
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       312
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       185
193 Load_Cycle_Count        0x0032   188   188   000    Old_age   Always       -       38926
194 Temperature_Celsius     0x0022   117   092   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   192   000    Old_age   Always       -       14
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-98-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4VTX9TP
LU WWN Device Id: 5 0014ee 261485dc4
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Fri Nov 10 09:24:31 2017 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (54780) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 548) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   189   181   021    Pre-fail  Always       -       7516
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       18057
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       122
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       33611
194 Temperature_Celsius     0x0022   118   092   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



* Re: RAID5 up, but one drive removed, one says spare building, what now?
From: Phil Turmel @ 2017-11-10 18:01 UTC (permalink / raw)
  To: Jun-Kai Teoh, Mark Knecht; +Cc: Wols Lists, Linux-RAID

Hi Jun-Kai,

{Convention on kernel.org is to trim replies and avoid top-posting.}

On 11/10/2017 12:25 PM, Jun-Kai Teoh wrote:
> [trim /]
> 
> I'm in the process of backing up right now.
> 
> And yes, the drive/array is mounted.

Ok.  I reviewed the other thread.  Consider creating a new array from
scratch after you complete your backups, using consistent partitioning
across all devices.  I would put LVM on top and leave part of it
unallocated (for emergencies), but that's just my preference.
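
A rough sketch of that layout (all names are placeholders, and --create
destroys existing data, so only do this after backups are complete and
verified):

  # one partition per drive, then a fresh 8-device array
  mdadm --create /dev/md0 --level=5 --raid-devices=8 /dev/sd[b-i]1

  # LVM on top, leaving ~10% of the volume group unallocated
  pvcreate /dev/md0
  vgcreate vg_raid /dev/md0
  lvcreate -l 90%FREE -n lv_data vg_raid
  mkfs.ext4 /dev/vg_raid/lv_data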

Phil

* Re: RAID5 up, but one drive removed, one says spare building, what now?
From: Jun-Kai Teoh @ 2017-11-10 18:15 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Mark Knecht, Wols Lists, Linux-RAID


> On Nov 10, 2017, at 10:01 AM, Phil Turmel <philip@turmel.org> wrote:
> 
> Hi Jun-Kai,
> 
> {Convention on kernel.org is to trim replies and avoid top-posting.}
> 
> [trim /]
> 
> Ok.  I reviewed the other thread.  Consider creating a new array from
> scratch after you complete your backups, using consistent partitioning
> across all devices.  I would put LVM on top and leave part of it
> unallocated (for emergencies), but that's just my preference.
> 
> Phil


My apologies all! I did not know this.

I'm backing up as much as I can, but I don't think I'll be able to back up everything. I can back up very little of the data, so I'm trying to prioritize carefully right now.

All devices are just 4TB drives - I didn't partition them separately for the array. Or am I misunderstanding what you mean? (That happens a lot; all of you are far more knowledgeable than I am, and I only understand some of the terms sometimes.)




* Re: RAID5 up, but one drive removed, one says spare building, what now?
From: Phil Turmel @ 2017-11-10 19:04 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Mark Knecht, Wols Lists, Linux-RAID

On 11/10/2017 01:15 PM, Jun-Kai Teoh wrote:

> [trim /]

> I'm backing up as much as I can, but I don't think I'll be able to 
> back up everything. I can back up very little of the data, so I'm 
> trying to prioritize carefully right now.

Ah, ok.  Well, after you get the important stuff backed up, you can
add more devices as spares and try to let MD continue its grow and
rebuild operations.  It might end up completing.

But don't add any more complete devices to the array -- use a partition
that starts at 1MB and covers the rest of the device (default for most
partition tools nowadays).
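
For example (a sketch - sdX stands in for the new disk's real name;
triple-check the device before writing a new label):

  # GPT label with a single partition from 1MiB to the end of the disk
  parted -s /dev/sdX mklabel gpt mkpart primary 1MiB 100%

  # add the partition, not the whole disk, as a spare
  mdadm /dev/md126 --add /dev/sdX1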

I recommend that when/if your array is stable again, you add one more
spare (with partition) and then use mdadm's --replace operation to move
complete-device members to the new member.  When each is done, take the
newly freed device and partition it and do the next.  When you have no
more complete-device members and have the last freed device partitioned,
consider converting to raid6 with that spare.

All of that will take many days, but can be done safely while running
the system.
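
One replace cycle might look like this (a sketch - device names are
hypothetical, and mdadm may want a --backup-file for the level change,
depending on version):

  # migrate a whole-disk member onto the freshly partitioned spare
  mdadm /dev/md126 --replace /dev/sdb --with /dev/sdi1

  # after the last cycle, convert to raid6 using the remaining spare
  # (9 = the 8-device raid5 plus that spare; adjust to your final layout)
  mdadm --grow /dev/md126 --level=6 --raid-devices=9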

> All devices are just 4TB drives - I didn't partition them separately
> for the array. Or am I misunderstanding what you mean? (That happens a
> lot; all of you are far more knowledgeable than I am, and I only
> understand some of the terms sometimes.)

MD raid doesn't care whether you use whole devices or partitions.
But other system utilities can get confused, and can even write a
new, unwanted partition table.  (I recall seeing that with a NAS
product a while back.)  So, the 1MB lost is good insurance against
stupid tools.

Phil


* Re: RAID5 up, but one drive removed, one says spare building, what now?
  2017-11-10 19:04             ` Phil Turmel
@ 2017-11-10 19:22               ` Jun-Kai Teoh
  2017-11-10 21:58                 ` Phil Turmel
  0 siblings, 1 reply; 12+ messages in thread
From: Jun-Kai Teoh @ 2017-11-10 19:22 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Mark Knecht, Wols Lists, Linux-RAID

> On Nov 10, 2017, at 11:04 AM, Phil Turmel <philip@turmel.org> wrote:
> 
> On 11/10/2017 01:15 PM, Jun-Kai Teoh wrote:
> 
>> I'm backing up as much as I can, but I don't think I'll be able to 
>> back up everything. I can back up very little of the data, so I'm 
>> trying to prioritize carefully right now.
> 
> Ah, ok.  Well, after you get the important stuff backed up, you can
> add more devices as spares and try to let MD continue its grow and
> rebuild operations.  It might end up completing.

Again, pardon me if I sound super ignorant; I just want to do all of this right.

Do I just plug in another blank 4TB and run "mdadm -A /dev/md126 /dev/sd[newdrive]"?

Or leave it as it is, and just ask it to continue growing with "mdadm -A -R", and *then* add a blank drive into the raid but don't ask it to grow?

> But don't add any more complete devices to the array -- use a partition
> that starts at 1MB and covers the rest of the device (default for most
> partition tools nowadays).

Where can I read about how to do this properly, and what tool should I use? I tried googling it but I'm pretty lost.

> I recommend that when/if your array is stable again, you add one more
> spare (with partition) and then use mdadm's --replace operation to move
> complete-device members to the new member.  When each is done, take the
> newly freed device and partition it and do the next.  When you have no
> more complete-device members and have the last freed device partitioned,
> consider converting to raid6 with that spare.

I think I understand what you mean here, but it sounds like I need to figure out the top two parts first, and that'll take me a while.

I'll probably have questions again later, haha.

I really appreciate all the support and patience from this list.

Y'all lifesavers. 


* Re: RAID5 up, but one drive removed, one says spare building, what now?
  2017-11-10 19:22               ` Jun-Kai Teoh
@ 2017-11-10 21:58                 ` Phil Turmel
  2017-11-10 23:17                   ` Wols Lists
  0 siblings, 1 reply; 12+ messages in thread
From: Phil Turmel @ 2017-11-10 21:58 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Mark Knecht, Wols Lists, Linux-RAID

On 11/10/2017 02:22 PM, Jun-Kai Teoh wrote:
>> On Nov 10, 2017, at 11:04 AM, Phil Turmel <philip@turmel.org>
>> wrote:
>> 
>> On 11/10/2017 01:15 PM, Jun-Kai Teoh wrote:
>> 
>>> I'm backing up as much as I can, but I don't think I'll be able
>>> to back up everything. I can back up very little of the data, so
>>> I'm trying to prioritize carefully right now.
>> 
>> Ah, ok.  Well, after you get the important stuff backed up, you
>> can add more devices as spares and try to let MD continue its grow
>> and rebuild operations.  It might end up completing.
> 
> Again, pardon me if I sound super ignorant; I just want to do all of
> this right.
> 
> Do I just plug in another blank 4TB and run "mdadm -A /dev/md126
> /dev/sd[newdrive]"?
> 
> Or leave it as it is, and just ask it to continue growing with "mdadm
> -A -R", and *then* add a blank drive into the raid but don't ask it
> to grow?

Neither....  With your array mounted and running as it is right now, but
after all backups are done, use:

    mdadm --add /dev/mdXXX /dev/sdX1

with the new device's *partition*.

MD should recognize the opportunity to proceed and will resume
rebuilding and growing.
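
You can watch the progress with the usual tools:

    cat /proc/mdstat              # progress and ETA for recovery/reshape
    mdadm --detail /dev/md126     # per-member state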

>> But don't add any more complete devices to the array -- use a
>> partition that starts at 1MB and covers the rest of the device
>> (default for most partition tools nowadays).
> 
> Where can I read about how to do this properly, and what tool should
> I use? I tried googling it but I'm pretty lost.

parted or gdisk or a very modern version of fdisk will let you partition
your new drives.  They will default to a "GPT" disk label and will
generally place the first partition you create in the "right" place -- a
multiple of 4K, typically 1MB.  When you save the new partition table
(aka disk label), the #1 partition will show up in your list of devices.
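
For example -- a sketch only, where /dev/sdX is a placeholder, so
check the real name with lsblk before writing anything:

    parted -s /dev/sdX mklabel gpt mkpart primary 1MiB 100%
    parted /dev/sdX unit MiB print    # verify: partition 1 starts at 1.00MiB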

>> I recommend that when/if your array is stable again, you add one
>> more spare (with partition) and then use mdadm's --replace
>> operation to move complete-device members to the new member.  When
>> each is done, take the newly freed device and partition it and do
>> the next.  When you have no more complete-device members and have
>> the last freed device partitioned, consider converting to raid6
>> with that spare.
> 
> I think I understand what you mean here, but it sounds like I need to
> figure out the top two parts first, and that'll take me a while.

Same partitioning procedure for each free disk.

> I'll probably have questions again later, haha.

Sure, no problem.

> I really appreciate all the support and patience from this list.
> 
> Y'all lifesavers.

You're welcome.

Phil


* Re: RAID5 up, but one drive removed, one says spare building, what now?
  2017-11-10 21:58                 ` Phil Turmel
@ 2017-11-10 23:17                   ` Wols Lists
  2017-11-10 23:22                     ` Wols Lists
  0 siblings, 1 reply; 12+ messages in thread
From: Wols Lists @ 2017-11-10 23:17 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Linux-RAID

On 10/11/17 21:58, Phil Turmel wrote:
>>> But don't add any more complete devices to the array -- use a
>>> partition that starts at 1MB and covers the rest of the device
>>> (default for most partition tools nowadays).
>>
>> Where can I read about how to do this properly, and what tool should
>> I use? I tried googling it but I'm pretty lost.
>
> parted or gdisk or a very modern version of fdisk will let you partition
> your new drives.  They will default to a "GPT" disk label and will
> generally place the first partition you create in the "right" place -- a
> multiple of 4K, typically 1MB.  When you save the new partition table
> (aka disk label), the #1 partition will show up in your list of devices.
>
>>> I recommend that when/if your array is stable again, you add one
>>> more spare (with partition) and then use mdadm's --replace
>>> operation to move complete-device members to the new member.  When
>>> each is done, take the newly freed device and partition it and do
>>> the next.  When you have no more complete-device members and have
>>> the last freed device partitioned, consider converting to raid6
>>> with that spare.
>>
>> I think I understand what you mean here, but it sounds like I need to
>> figure out the top two parts first, and that'll take me a while.

If you take the wiki slowly, you'll hopefully grasp what's going on.

https://raid.wiki.kernel.org/index.php/Linux_Raid#Overview

Especially the bits about setting up a new system or converting an
existing system. Okay, they're not exactly what you're doing, but if
you work your way through and get what's happening there straight in
your head, you'll then be able to apply it to your setup.

Note that they're written from my perspective as a gentoo user, so I
have to do everything from scratch - boot into a live/rescue CD, and
then configure and set up the system from there.

I know it doesn't cover LVM - that coverage is very patchy at the
moment - but once I get my new machine past the BIOS screen, I'll be
setting it up with LVM, and all of that should appear on the wiki.

(My new system at the moment isn't even POSTing :-( I'm not used to
troubleshooting builds - I've usually been lucky - so I hope
everything will be working soon :-)

Cheers,
Wol


* Re: RAID5 up, but one drive removed, one says spare building, what now?
  2017-11-10 23:17                   ` Wols Lists
@ 2017-11-10 23:22                     ` Wols Lists
  0 siblings, 0 replies; 12+ messages in thread
From: Wols Lists @ 2017-11-10 23:22 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Linux-RAID

On 10/11/17 23:17, Wols Lists wrote:
> If you take the wiki slowly, you'll hopefully grasp what's going on.
> 
> https://raid.wiki.kernel.org/index.php/Linux_Raid#Overview
> 
> Especially the bits about setting up a new system or converting an
> existing system. Okay, they're not exactly what you're doing, but if
> you work your way through and get what's happening there straight in
> your head, you'll then be able to apply it to your setup.
> 
> Note that they're written from my perspective as a gentoo user, so I
> have to do everything from scratch - boot into a live/rescue CD, and
> then configure/setup the system from there.

Just checked those two pages (should have done that before I hit send
:-) and they're not quite as appropriate as I thought. Still worth
reading, though.

But that's given me an idea for more detail in the wiki, so this should
lead to further improvement :-)

Cheers,
Wol

