linux-lvm.redhat.com archive mirror
* [linux-lvm] badblocks
@ 2023-05-20 22:29 graeme vetterlein
  2023-05-22  7:22 ` Roberto Fastec
  2023-05-22 13:49 ` Phillip Susi
  0 siblings, 2 replies; 5+ messages in thread
From: graeme vetterlein @ 2023-05-20 22:29 UTC (permalink / raw)
  To: linux-lvm



This is my 1st post to this group.

I'm not aware of anywhere I can search for "known issues" or similar 
...short of opening each archive  and searching it.

(happy to be corrected on this)


I have a desktop Linux box (Debian Sid) with a clutch of disks in it (4 
or 5)  and have mostly defined each disk as a volume group.


Now a key point is *some of the disks are NAS grade disks.* This means 
they do NOT reallocate bad sectors silently. They report IO errors and 
leave it to the OS (i.e. the old-fashioned way of doing this)


One of the disks (over 10 years old) had started reporting errors. I ran 
fsck with -cc but it never found the issue. In the end I bought a new 
disk and did:

  * *vgextend* -- to add the new disk
  * *pvmove*   -- move volumes off "bad" disk
  * *vgreduce* -- remove bad disk from group
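For anyone repeating this, the three steps sketch out as a script like the following (the VG name and device paths are placeholders, and the `run` wrapper only echoes, so nothing executes by accident):

```shell
#!/bin/sh
# Dry-run sketch of swapping a failing disk out of a VG with pure LVM
# commands.  VG name and device paths are placeholders.
VG=myvg
OLD=/dev/sda1      # failing PV
NEW=/dev/sdd1      # replacement PV

run() { echo "+ $*"; }      # dry-run; change to  run() { "$@"; }  to execute

run pvcreate "$NEW"         # label the new partition as a PV
run vgextend "$VG" "$NEW"   # add it to the existing VG
run pvmove "$OLD"           # migrate every extent off the failing PV
run vgreduce "$VG" "$OLD"   # remove the emptied PV from the VG
```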

While the pvmove is running I had plenty of time to think....


The filesystem in the LV on the "bad disk" is now moving to the "new 
disk"  ...had the fsck managed to mark bad blocks, these would be in 
terms of the LV in which it sat and would be meaningless on  the new disk.
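(For reference, the fsck run in question looks roughly like this; the LV path is a placeholder and the wrapper only echoes. The point above is that the resulting bad-block list lives inside the filesystem, so it is LV-relative:)

```shell
#!/bin/sh
# Sketch of the badblocks scan driven through e2fsck.  The LV path is a
# placeholder; the wrapper only echoes so nothing is executed here.
LV=/dev/myvg/data
run() { echo "+ $*"; }

run e2fsck -cc "$LV"    # non-destructive read-write badblocks test (run unmounted!)
run dumpe2fs -b "$LV"   # list the blocks currently in the bad-block inode
```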


Then *the penny dropped!* The only component that has a view on the real 
physical disk is lvm2 and in particular the PV ...so if anybody can 
mark(and avoid) badblocks it's the PV...so I should be looking for 
something akin to the -cc option of fsck , applied to a PV command?


Googling around I see lots of ill-informed comments by people who've 
never seen a real "bad block" and assume all modern disks solve this for 
you.


--


Graeme





_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


* Re: [linux-lvm] badblocks
  2023-05-20 22:29 [linux-lvm] badblocks graeme vetterlein
@ 2023-05-22  7:22 ` Roberto Fastec
  2023-05-22 13:49 ` Phillip Susi
  1 sibling, 0 replies; 5+ messages in thread
From: Roberto Fastec @ 2023-05-22  7:22 UTC (permalink / raw)
  To: LVM general discussion and development



Wrong approach.

Bad drives, being that old, are going to die fast, and cloning them in
software will kill them even faster.

You should send the drive to a data recovery lab where they have Atola or
PC-3000 devices and will clone the drive in hardware.

The best solution I know of is https://www.RecuperoDati299euro.it where
they offer the cloning service on its own (other labs will charge you for
a full recovery).

Hard drives should be used for the time the manufacturer offers as
warranty, plus six months. Beyond that you are running on a "tyre" over 5
years old: you are not saving money, you are risking your life, since the
tyre's rubber has almost completely dried out.

Check SMART: the drive will likely show not only reallocation events but
also pending sectors, which is the worse situation.
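(Those two counters can be pulled out of `smartctl -A` output like so; the sample text below merely stands in for real smartctl output:)

```shell
#!/bin/sh
# Extract the reallocated and pending sector counts from smartctl output.
# In real use:  smartctl -A /dev/sda | awk ...
# The sample string below stands in for real smartctl output.
sample='  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age Always       -       8'

realloc=$(printf '%s\n' "$sample" | awk '/Reallocated_Sector_Ct/ {print $NF}')
pending=$(printf '%s\n' "$sample" | awk '/Current_Pending_Sector/ {print $NF}')
echo "reallocated=$realloc pending=$pending"
```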

Kind regards
R.

On 22 May 2023 at 08:59, graeme vetterlein <graeme.lvm@vetterlein.com> wrote:
>This is my 1st post to this group.
>
>I'm not aware of anywhere I can search for "known issues" or similar 
>...short of opening each archive  and searching it.
>
>(happy to be corrected on this)
>
>
>I have a desktop Linux box (Debian Sid) with a clutch of disks in it (4
>
>or 5)  and have mostly defined each disk as a volume group.
>
>
>Now a key point is *some of the disks are NAS grade disks.* This means 
>they do NOT reallocate bad sectors silently. They report IO errors and 
>leave it to the OS (i.e. the old-fashioned way of doing this)
>
>
>One of the disks (over 10 years old) had started reporting errors. I ran
>fsck with -cc but it never found the issue. In the end I bought a new 
>disk and did:
>
>  * *vgextend* -- to add the new disk
>  * *pvmove*   -- move volumes off "bad" disk
>  * *vgreduce* -- remove bad disk from group
>
>While the pvmove is running I had plenty of time to think....
>
>
>The filesystem in the LV on the "bad disk" is now moving to the "new 
>disk"  ...had the fsck managed to mark bad blocks, these would be in 
>terms of the LV in which it sat and would be meaningless on  the new
>disk.
>
>
>Then *the penny dropped!* The only component that has a view on the
>real 
>physical disk is lvm2 and in particular the PV ...so if anybody can 
>mark(and avoid) badblocks it's the PV...so I should be looking for 
>something akin to the -cc option of fsck , applied to a PV command?
>
>
>Googling around I see lots of ill-informed comments by people who've 
>never seen a real "bad block" and assume all modern disks solve this
>for you.
>
>
>--
>
>
>Graeme
>
>
>
>




* Re: [linux-lvm] badblocks
  2023-05-20 22:29 [linux-lvm] badblocks graeme vetterlein
  2023-05-22  7:22 ` Roberto Fastec
@ 2023-05-22 13:49 ` Phillip Susi
  2023-05-22 15:36   ` graeme vetterlein
  1 sibling, 1 reply; 5+ messages in thread
From: Phillip Susi @ 2023-05-22 13:49 UTC (permalink / raw)
  To: graeme vetterlein; +Cc: linux-lvm


graeme vetterlein <graeme.lvm@vetterlein.com> writes:

> I have a desktop Linux box (Debian Sid) with a clutch of disks in it
> (4 or 5)  and have mostly defined each disk as a volume group.

Why?  The purpose of a VG is to hold multiple PVs.

> Now a key point is *some of the disks are NAS grade disks.* This means
> they do NOT reallocate bad sectors silently. They report IO errors and 
> leave it to the OS (i.e. the old fashion way of doing this)

Are you sure that you are not confusing the ERC feature here?  That lets
the drive give up in a reasonable amount of time and report the (read)
error rather than keep trying.  Most often there is nothing wrong with
the disk physically and writing to the sector will succeed.  If it
doesn't, then the drive will remap it.
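(A sketch of that experiment, assuming the kernel log has reported the failing LBA; DEV and LBA are placeholders, the wrapper only echoes, and the dd write destroys that one sector's contents:)

```shell
#!/bin/sh
# Probe a suspect sector, then overwrite it so the drive can remap it if
# the medium really is bad.  DEV and LBA are placeholders from dmesg.
DEV=/dev/sdX
LBA=123456789
SECTOR=512                    # confirm with: blockdev --getss "$DEV"
OFFSET=$((LBA * SECTOR))      # byte offset, useful for mapping to a file
echo "byte offset: $OFFSET"

run() { echo "+ $*"; }        # dry-run wrapper; swap for "$@" to execute

run hdparm --read-sector "$LBA" "$DEV"   # does the read error reproduce?
run dd if=/dev/zero of="$DEV" bs="$SECTOR" seek="$LBA" count=1 oflag=direct
```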

> Then *the penny dropped!* The only component that has a view on the
> real physical disk is lvm2 and in particular the PV ...so if anybody
> can mark(and avoid) badblocks it's the PV...so I should be looking for 
> something akin to the -cc option of fsck , applied to a PV command?

Theoretically yes, you can create a mapping to relocate the block
elsewhere, but there are no tools that I am aware of to help with this.
Again, are you sure there is actually something wrong with the disk?
Try writing to the "bad block" and see what happens.  Every time I have
done this in the last 20 years, the problem just goes away.  Usually
without even triggering a reallocation.  This is because the data just
got a little scrambled even though there is nothing wrong with the
medium.

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


* Re: [linux-lvm] badblocks
  2023-05-22 13:49 ` Phillip Susi
@ 2023-05-22 15:36   ` graeme vetterlein
  2023-05-22 16:01     ` Phillip Susi
  0 siblings, 1 reply; 5+ messages in thread
From: graeme vetterlein @ 2023-05-22 15:36 UTC (permalink / raw)
  To: Phillip Susi; +Cc: linux-lvm

 > graeme vetterlein <graeme.lvm@vetterlein.com> writes:
 >
 > > I have a desktop Linux box (Debian Sid) with a clutch of disks in it
 > > (4 or 5)  and have mostly defined each disk as a volume group.
 >
 > Why?  The purpose of a VG is to hold multiple PVs.

One or more, I believe :-) My main interest is in being able to move and
resize logical volumes, less in moving things between physical volumes
(but see later). BTW, one SATA port is wired to a "dock", so physical
disks come and go.

I was able to "replace" 1 broken 2TB and 1 working 2TB drive with a new
4TB drive without any filesystem creation, copying etc, just using LVM
commands:

lvm> pvdisplay
..
   --- Physical volume ---
   PV Name               /dev/sdc1
   VG Name               SAMSUNG_2TB
..
   --- Physical volume ---
   PV Name               /dev/sdb5
   VG Name               real-vg
   --- Physical volume ---
   PV Name               /dev/sda1
   VG Name               TOSHIBA_2TB
   "/dev/sdd1" is a new physical volume of "<3.64 TiB"
   --- NEW Physical volume ---
   PV Name               /dev/sdd1
   VG Name

(sda is "bad" disk, sdd is "new" disk)
This is what I did...

    vgextend TOSHIBA_2TB /dev/sdd1    -- adds /dev/sdd1 into the existing VG
    pvmove /dev/sda1                  -- moves everything off sda1
    vgreduce TOSHIBA_2TB /dev/sda1    -- take sda1 out of the VG
    vgcfgbackup

    vgrename TOSHIBA_2TB BARRACUDA_4TB
    vgcfgbackup
..
    umount /dev/mapper/SAMSUNG_2TB-data
    umount /dev/mapper/SAMSUNG_2TB-vimage
    lvchange -an SAMSUNG_2TB/data
    lvchange -an SAMSUNG_2TB/vimage
    vgmerge BARRACUDA_4TB SAMSUNG_2TB   -- I believe this puts everything into BARRACUDA_4TB (oddly, right to left)
    pvmove /dev/sdc1                    -- moves everything off sdc1
    vgreduce BARRACUDA_4TB /dev/sdc1    -- nothing should be left in sdc1, so drop it from the group

.. no need for renames, just update fstab

This was without taking the system down. I didn't fancy trying to do 
that with fdisk(1) and dd(1)  :-)

 > > Now a key point is *some of the disks are NAS grade disks.* This means
 > > they do NOT reallocate bad sectors silently. They report IO errors and
 > > leave it to the OS (i.e. the old fashion way of doing this)
 >
 > Are you sure that you are not confusing the ERC feature here? That lets
 > the drive give up in a reasonable amount of time and report the (read)
 > error rather than keep trying.  Most often there is nothing wrong with
 > the disk physically and writing to the sector will succeed.  If it
 > doesn't, then the drive will remap it.

Reasonably sure. The disk is over 10 years old (SMART shows > 19,000
hours in use; it's powered on a few hours a day) and it has only started
getting errors since around 19,000 hours. I get pop-up warnings almost
every day now. SMART shows it has NO reallocated sectors.


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail Always       -       0
  2 Throughput_Performance  0x0005   140   140   054    Pre-fail Offline      -       69
  3 Spin_Up_Time            0x0007   127   127   024    Pre-fail Always       -       296 (Average 299)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age  Always       -       3554
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail Always       -       0
  8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail Offline      -       33
  9 Power_On_Hours          0x0012   098   098   000    Old_age  Always       -       19080
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age  Always       -       2959
192 Power-Off_Retract_Count 0x0032   097   097   000    Old_age  Always       -       3616
193 Load_Cycle_Count        0x0012   097   097   000    Old_age  Always       -       3616
194 Temperature_Celsius     0x0002   250   250   000    Old_age  Always       -       24 (Min/Max 13/45)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age  Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age  Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age  Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age  Always       -       57


Now, I know it's possible that these CRC errors are e.g. 'cable related',
but I've swapped the cable and moved SATA ports to no effect. In the end
I decided 10 years was enough and bought a new drive.

Notwithstanding this, it's still the case that it's perfectly possible
that I could, and will, get real permanent errors on a disk. In fact I
gather SSDs can suffer failure modes that make a large group of sectors
inaccessible (bad).

In the past a filesystem acted as if it was dealing with the real physical
sectors on disk (because it was!) and so could simply map these to an inode
holding bad blocks. Now, however, only the PV really has any knowledge of the
physical sectors, so it needs to do any such mapping. Consider the situation:

   A VG has 5 physical disks, and 1 disk has a single bad block on it. If I
   resize and move around LVs and filesystems, that single bad block is going
   to crop up in various filesystems, causing issues all over the place.

Now, this particular instance of the problem is only of academic
interest (the drive is replaced). However, I have several (QNAP) NAS
boxes with many NAS-grade drives in them. Due to annoying bugs in QTS I
plan to reinstall these with Debian. I'm thinking I'll probably use LVM2
and RAID striping (so I will have VGs with many PVs in them :-) )

 > > Then *the penny dropped!* The only component that has a view on the
 > > real physical disk is lvm2 and in particular the PV ...so if anybody
 > > can mark(and avoid) badblocks it's the PV...so I should be looking for
 > > something akin to the -cc option of fsck , applied to a PV command?
 >
 > Theoretically yes, you can create a mapping to relocate the block
 > elsewhere,
Any hints? lvm2 commands? I can RTFM but a pointing finger would help.

 > but there are no tools that I am aware of to help with this.
 > Again, are you sure there is actually something wrong with the disk?
 > Try writing to the "bad block" and see what happens.  Every time I have
 > done this in the last 20 years, the problem just goes away. Usually
 > without even triggering a reallocation.  This is because the data just
 > got a little scrambled even though there is nothing wrong with the
 > medium.

I've certainly met real bad blocks, though possibly not in the last 20
years. When a disk cost more than I earned in a year, it was worth the
effort to remap :-)



* Re: [linux-lvm] badblocks
  2023-05-22 15:36   ` graeme vetterlein
@ 2023-05-22 16:01     ` Phillip Susi
  0 siblings, 0 replies; 5+ messages in thread
From: Phillip Susi @ 2023-05-22 16:01 UTC (permalink / raw)
  To: graeme vetterlein; +Cc: linux-lvm


graeme vetterlein <graeme.lvm@vetterlein.com> writes:

> I was able to "replace" 1 broken 2TB 1 working 2TB drive with a new
> 4TB drive
> without any filesystem creation, copying etc, just using LVM commands:

>     umount /dev/mapper/SAMSUNG_2TB-data
>     umount /dev/mapper/SAMSUNG_2TB-vimage
>     lvchange -an  SAMSUNG_2TB/data
>     lvchange -an  SAMSUNG_2TB/vimage
>     vgmerge  BARRACUDA_4TB SAMSUNG_2TB     -- I believe this puts
> everything into BARRACUDA_4TB (oddly right to left)
>     pvmove /dev/sdc1             -- Moves everything off sdc1
>     vgreduce BARRACUDA_4TB /dev/sdc1    -- Nothing should be in sdc1,
> so drop it from the group

If they were in the same vg, you wouldn't even have to umount the
filesystem.

> 19,000 hours, I get a popup warnings almost every day now. Smart shows
> it has NO
> reallocated sectors.

What "pop-up warning"?  What does smartctl -H say about the drive?  I
don't see anything below that indicates there is anything wrong with
the drive at all.

> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail Always       -       0
>   2 Throughput_Performance  0x0005   140   140   054    Pre-fail Offline      -       69
>   3 Spin_Up_Time            0x0007   127   127   024    Pre-fail Always       -       296 (Average 299)
>   4 Start_Stop_Count        0x0012   100   100   000    Old_age  Always       -       3554
>   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail Always       -       0
>   7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail Always       -       0
>   8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail Offline      -       33
>   9 Power_On_Hours          0x0012   098   098   000    Old_age  Always       -       19080
>  10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age  Always       -       2959
> 192 Power-Off_Retract_Count 0x0032   097   097   000    Old_age  Always       -       3616
> 193 Load_Cycle_Count        0x0012   097   097   000    Old_age  Always       -       3616
> 194 Temperature_Celsius     0x0002   250   250   000    Old_age  Always       -       24 (Min/Max 13/45)
> 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age  Always       -       0
> 197 Current_Pending_Sector  0x0022   100   100   000    Old_age  Always       -       0
> 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age  Offline      -       0
> 199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age  Always       -       57
>
>
> Now, I know it's possible that these CRC errors are e.g. 'cable related' but
> I've swapped the cable and moved SATA ports to no effect. In the end I
> decided
> 10 years was enough and bought a new drive.

Yes, those are just errors going over the SATA link.  They would have
been retried and you never noticed.  The question is whether you see any
errors in dmesg or your kernel logs, or reported from badblocks.  Or you
might ask smartctl to have the drive run its own internal test with
smartctl -t long.  The advantage of this over badblocks is that it
doesn't have to waste resources actually sending the data to the CPU
just to test if it can read the disk.  You can then check the drive's
log with smartctl -l selftest to see if it found any errors.
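(In command form; /dev/sda is an example path and the wrapper only echoes:)

```shell
#!/bin/sh
# Run the drive's internal long self-test, then read back its log.
# /dev/sda is an example device; the wrapper only echoes (dry run).
DEV=/dev/sda
run() { echo "+ $*"; }

run smartctl -t long "$DEV"       # starts the test inside the drive; prints an ETA
# ...wait for the estimated duration, then:
run smartctl -l selftest "$DEV"   # shows pass/fail and the first-error LBA, if any
```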

> Any hints? lvm2 commands? I can RTFM but a pointing finger would help.

You can bypass LVM and directly manipulate the device mapper with the
dmsetup command.  Doing this, you can do various other things such as
insert a fake "bad sector" for testing purposes, but you will have to
set up a script in your initramfs to configure the table on every boot.
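For illustration only, such a remapping table might look like the sketch below. It assumes a partition of 2,000,000 sectors with a single bad sector at LBA 1,000,000, and sacrifices the last physical sector (1,999,999) as the spare; all numbers are made up. Each line is `<logical start> <length> linear <device> <physical start>`:

```
0       1000000 linear /dev/sda1 0
1000000 1       linear /dev/sda1 1999999
1000001 999998  linear /dev/sda1 1000001
```

Sectors before the bad one map 1:1, the bad sector is replaced by the spare, and the rest maps 1:1 again; the exposed device is one sector shorter than the partition because the last sector is reserved as the spare. It would be loaded with something like `dmsetup create remapped < remap.table`, and as noted above it has to be reloaded on every boot, e.g. from the initramfs.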

> Debian. I'm thinking I'll probably use LVM2 and raid striping (so I
> will have VG
> with many PV in them :-) )

That's one way to go, but if you are currently keeping them as separate
filesystems, you might be interested in looking up snapraid.  It lets
you create parity to be able to recover from file or disk loss like
raid5/6, but not in real time.  It's handy for several disks that
contain files that are rarely written to such as media files.  You can
keep your media disks in standby mode except for the one disk that
contains the file you want to play right now, rather than having to wake
all of the disks up as with raid5/6.  You can also keep your parity
disk(s) offline and drop them in an eSATA dock just to update the parity
when you do modify the files on the data disks.
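For the curious, a minimal snapraid.conf along those lines might look like this sketch (mount points and disk names are placeholders):

```
# snapraid.conf sketch -- all paths are placeholders
parity /mnt/parity1/snapraid.parity

# content (metadata) files; keep copies on several disks
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content
content /mnt/disk2/snapraid.content

# data disks
data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/
```

`snapraid sync` then updates the parity after files change, and `snapraid fix` rebuilds from parity after a loss.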



end of thread, other threads:[~2023-05-23  6:47 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
2023-05-20 22:29 [linux-lvm] badblocks graeme vetterlein
2023-05-22  7:22 ` Roberto Fastec
2023-05-22 13:49 ` Phillip Susi
2023-05-22 15:36   ` graeme vetterlein
2023-05-22 16:01     ` Phillip Susi
