* Two uncorrectable errors across RAID1 at same logical block?
@ 2014-10-06 17:06 Rich Rauenzahn
  2014-10-07  2:05 ` Liu Bo
  0 siblings, 1 reply; 9+ messages in thread
From: Rich Rauenzahn @ 2014-10-06 17:06 UTC (permalink / raw)
  To: linux-btrfs

This fs is across two ssd drives.  Am I interpreting this right that the
same logical block is corrupt on both drives?  That seems odd.

How do I map it to a filename?

$ sudo dmesg --clear

$ dmesg

$ sudo btrfs scrub start -B /
scrub done for
35f0ce3f-0902-47a3-8ad8-86179d1f3e3a
        scrub started at Mon Oct  6 09:51:01 2014 and finished after 325 seconds
        total bytes scrubbed: 114.89GiB with 2 errors
        error details: csum=2
        corrected errors: 0, uncorrectable errors: 2, unverified errors: 0

$ dmesg
 btrfs: bdev /dev/sdg3 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
 btrfs: unable to fixup (regular) error at logical 58464632832 on dev /dev/sdg3
 btrfs: bdev /dev/sdf3 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
 btrfs: unable to fixup (regular) error at logical 58464632832 on dev /dev/sdf3

$ uname -a
Linux hostname 3.10.0-123.8.1.el7.x86_64 #1 SMP Mon Sep 22 19:06:58
UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ btrfs --version
Btrfs v3.12

$ btrfs fi show
Btrfs v3.12

$ btrfs fi df /
Data, RAID1: total=107.07GiB, used=55.47GiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=4.00GiB, used=2.04GiB
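
A minimal sketch, not part of the original report, of how the dmesg lines above could be checked mechanically. It pulls the logical address and device out of each "unable to fixup" line and groups the devices by address. The regular expression and the name group_fixup_errors are assumptions based only on the log format shown in this message.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   btrfs: unable to fixup (regular) error at logical 58464632832 on dev /dev/sdg3
FIXUP_RE = re.compile(r"unable to fixup \(\w+\) error at logical (\d+) on dev (\S+)")


def group_fixup_errors(dmesg_text):
    """Map each logical address to the sorted list of devices reporting it."""
    by_logical = defaultdict(set)
    for line in dmesg_text.splitlines():
        match = FIXUP_RE.search(line)
        if match:
            by_logical[int(match.group(1))].add(match.group(2))
    return {addr: sorted(devs) for addr, devs in by_logical.items()}


sample = """\
 btrfs: bdev /dev/sdg3 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
 btrfs: unable to fixup (regular) error at logical 58464632832 on dev /dev/sdg3
 btrfs: bdev /dev/sdf3 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
 btrfs: unable to fixup (regular) error at logical 58464632832 on dev /dev/sdf3
"""

errors = group_fixup_errors(sample)
for addr, devs in errors.items():
    # An address listed with both RAID1 devices means neither copy passed its checksum.
    print(addr, devs)
```

With the log above, the single address 58464632832 is reported by both /dev/sdf3 and /dev/sdg3, which is why scrub had no good copy to repair from.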


* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-06 17:06 Two uncorrectable errors across RAID1 at same logical block? Rich Rauenzahn
@ 2014-10-07  2:05 ` Liu Bo
  2014-10-07  2:18   ` Rich Rauenzahn
  0 siblings, 1 reply; 9+ messages in thread
From: Liu Bo @ 2014-10-07  2:05 UTC (permalink / raw)
  To: Rich Rauenzahn; +Cc: linux-btrfs

On Mon, Oct 06, 2014 at 10:06:52AM -0700, Rich Rauenzahn wrote:
> This fs is across two ssd drives.  Am I interpreting this right that the
> same logical block is corrupt on both drives?  That seems odd.
> 

Yes, both copies are corrupt somehow.

> How do I map it to a filename?

You may try "btrfs inspect-internal logical-resolve 58464632832 /your_btrfs_mnt"

thanks,
-liubo
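
For anyone scripting the lookup suggested above, here is a hedged sketch of a small wrapper. The function names logical_resolve_argv and logical_resolve are hypothetical. The only btrfs-progs features assumed are the "inspect-internal logical-resolve" subcommand and its -v flag, both of which appear in this thread. Actually running it needs root and a mounted btrfs filesystem, so the example only prints the command line.

```python
import subprocess


def logical_resolve_argv(logical, mount_point, verbose=False, btrfs="btrfs"):
    """Build the argument list for `btrfs inspect-internal logical-resolve`."""
    argv = [btrfs, "inspect-internal", "logical-resolve"]
    if verbose:
        argv.append("-v")
    argv += [str(logical), mount_point]
    return argv


def logical_resolve(logical, mount_point, **kwargs):
    """Run the command (needs root) and return its non-empty output lines."""
    result = subprocess.run(
        logical_resolve_argv(logical, mount_point, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


# Only build and show the command here; nothing is executed.
argv = logical_resolve_argv(58464632832, "/")
print(" ".join(argv))
```

An empty list from logical_resolve() would correspond to the "no output" case discussed later in the thread.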



* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-07  2:05 ` Liu Bo
@ 2014-10-07  2:18   ` Rich Rauenzahn
  2014-10-08 14:20     ` Liu Bo
  0 siblings, 1 reply; 9+ messages in thread
From: Rich Rauenzahn @ 2014-10-07  2:18 UTC (permalink / raw)
  To: bo.li.liu, Rich Rauenzahn; +Cc: linux-btrfs

On 10/6/2014 7:05 PM, Liu Bo wrote:
> btrfs inspect-internal logical-resolve 58464632832

$  sudo btrfs inspect-internal logical-resolve 58464632832  /

...no output?




* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-07  2:18   ` Rich Rauenzahn
@ 2014-10-08 14:20     ` Liu Bo
  2014-10-08 16:13       ` Rich Rauenzahn
  0 siblings, 1 reply; 9+ messages in thread
From: Liu Bo @ 2014-10-08 14:20 UTC (permalink / raw)
  To: Rich Rauenzahn; +Cc: Rich Rauenzahn, linux-btrfs

On Mon, Oct 06, 2014 at 07:18:06PM -0700, Rich Rauenzahn wrote:
> On 10/6/2014 7:05 PM, Liu Bo wrote:
> >btrfs inspect-internal logical-resolve 58464632832
> 
> $  sudo btrfs inspect-internal logical-resolve 58464632832  /
> 
> ...no output?
> 
> 

Hmm...have you tried the latest btrfs-progs?

You can pull it or get a tarball from

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git

thanks,
-liubo


* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-08 14:20     ` Liu Bo
@ 2014-10-08 16:13       ` Rich Rauenzahn
  2014-10-09  7:13         ` Liu Bo
  0 siblings, 1 reply; 9+ messages in thread
From: Rich Rauenzahn @ 2014-10-08 16:13 UTC (permalink / raw)
  To: bo.li.liu; +Cc: Rich Rauenzahn, linux-btrfs

On 10/8/2014 7:20 AM, Liu Bo wrote:
> On Mon, Oct 06, 2014 at 07:18:06PM -0700, Rich Rauenzahn wrote:
>> On 10/6/2014 7:05 PM, Liu Bo wrote:
>>> btrfs inspect-internal logical-resolve 58464632832
>> $  sudo btrfs inspect-internal logical-resolve 58464632832  /
>>
>> ...no output?
>>
>>
> Hmm...have you tried the latest btrfs-progs?
>
> You can pull it or get a tar ball from
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
>
> thanks,
> -liubo
>

Still no output:

   $ sudo ./btrfs inspect-internal logical-resolve 58464632832  /

Could it be a deleted file?




* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-08 16:13       ` Rich Rauenzahn
@ 2014-10-09  7:13         ` Liu Bo
  2014-10-09 16:58           ` Rich Rauenzahn
  0 siblings, 1 reply; 9+ messages in thread
From: Liu Bo @ 2014-10-09  7:13 UTC (permalink / raw)
  To: Rich Rauenzahn; +Cc: Rich Rauenzahn, linux-btrfs

On Wed, Oct 08, 2014 at 09:13:58AM -0700, Rich Rauenzahn wrote:
> On 10/8/2014 7:20 AM, Liu Bo wrote:
> >On Mon, Oct 06, 2014 at 07:18:06PM -0700, Rich Rauenzahn wrote:
> >>On 10/6/2014 7:05 PM, Liu Bo wrote:
> >>>btrfs inspect-internal logical-resolve 58464632832
> >>$  sudo btrfs inspect-internal logical-resolve 58464632832  /
> >>
> >>...no output?
> >>
> >>
> >Hmm...have you tried the latest btrfs-progs?
> >
> >You can pull it or get a tar ball from
> >
> >git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
> >
> >thanks,
> >-liubo
> >
> 
> Still no output:
> 
>   $ sudo ./btrfs inspect-internal logical-resolve 58464632832  /
> 
> Could it be a deleted file?

No idea.

Would you please try it with the verbose option?
"sudo ./btrfs inspect-internal logical-resolve -v 58464632832  /"

thanks,
-liubo


* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-09  7:13         ` Liu Bo
@ 2014-10-09 16:58           ` Rich Rauenzahn
  2014-10-11  8:52             ` Liu Bo
  0 siblings, 1 reply; 9+ messages in thread
From: Rich Rauenzahn @ 2014-10-09 16:58 UTC (permalink / raw)
  To: bo.li.liu; +Cc: Rich Rauenzahn, linux-btrfs

On 10/9/2014 12:13 AM, Liu Bo wrote:
> sudo ./btrfs inspect-internal logical-resolve -v 58464632832  /

$ sudo ./btrfs inspect-internal logical-resolve -v 58464632832  /
ioctl ret=0, total_size=4096, bytes_left=4080, bytes_missing=0, cnt=0, 
missed=0

I also tried -P and -s 100000000 ....

Also did this:

$ sudo ./btrfs-map-logical  -l 58464632832   -o /tmp/58464632832 /dev/sdf3
mirror 1 logical 58464632832 physical 1536393216 device /dev/sdg3
mirror 2 logical 58464632832 physical 58464632832 device /dev/sdf3

I also looked at the dumped 4k block.  strings(1) shows nothing useful
(only: +V0T") and file(1) doesn't recognize it as anything in particular.

Weird.
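
One further check that could be scripted from the btrfs-map-logical output above is whether the two mirrors hold the same bytes at their physical offsets. This is a rough sketch under assumptions: the line format is taken only from the two lines shown, the helper names are hypothetical, and reading the devices needs root. What an identical or differing result would mean is not settled anywhere in this thread.

```python
import re

MIRROR_RE = re.compile(r"mirror (\d+) logical (\d+) physical (\d+) device (\S+)")


def parse_mirrors(output):
    """Return a list of (mirror, logical, physical, device) tuples."""
    return [
        (int(m.group(1)), int(m.group(2)), int(m.group(3)), m.group(4))
        for m in MIRROR_RE.finditer(output)
    ]


def read_block(path, offset, length=4096):
    """Read `length` bytes at byte `offset` from a device or image file."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)


def copies_identical(mirrors, length=4096):
    """True if every mirror holds the same bytes (reading devices needs root)."""
    blocks = {read_block(dev, phys, length) for _, _, phys, dev in mirrors}
    return len(blocks) == 1


sample = """\
mirror 1 logical 58464632832 physical 1536393216 device /dev/sdg3
mirror 2 logical 58464632832 physical 58464632832 device /dev/sdf3
"""
mirrors = parse_mirrors(sample)
for mirror in mirrors:
    print(mirror)
```

Calling copies_identical(mirrors) as root would then compare the 4k block on /dev/sdg3 with the one on /dev/sdf3.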

I have one other clue which I think is irrelevant.  I had another error 
on a different drive/different fs and it turned out to be the vmem file 
for a virtual machine under vmware workstation.  I deleted the file 
since it was just the memory image and the error went away.  It was easy 
to map the bad block to the file from dmesg and the inode.   I may have 
also created a vm at some point on this drive we're looking at now and 
then moved it.  So I think that information is not relevant... but maybe 
you've seen this before.


* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-09 16:58           ` Rich Rauenzahn
@ 2014-10-11  8:52             ` Liu Bo
  2014-10-11 15:11               ` Rich Rauenzahn
  0 siblings, 1 reply; 9+ messages in thread
From: Liu Bo @ 2014-10-11  8:52 UTC (permalink / raw)
  To: Rich Rauenzahn; +Cc: Rich Rauenzahn, linux-btrfs

On Thu, Oct 09, 2014 at 09:58:03AM -0700, Rich Rauenzahn wrote:
> On 10/9/2014 12:13 AM, Liu Bo wrote:
> >sudo ./btrfs inspect-internal logical-resolve -v 58464632832  /
> 
> $ sudo ./btrfs inspect-internal logical-resolve -v 58464632832  /
> ioctl ret=0, total_size=4096, bytes_left=4080, bytes_missing=0,
> cnt=0, missed=0

Hi Rich,

So cnt=0 is the reason we got no output. To be honest, though, I don't know
exactly why 'cnt' is 0. Perhaps it's due to an old version of btrfs, since
the 'btrfs inspect-internal' command depends on an ioctl which may have bugs
in old btrfs.
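
To make the numbers in that verbose line concrete, the sketch below parses it and works out how much of the result buffer was used: 4096 - 4080 = 16 bytes. That would be consistent with the buffer holding only a fixed header and no entries, although the 16-byte header size is an assumption about the ioctl's result layout and is not stated in this thread. The function name parse_ioctl_line is hypothetical.

```python
import re


def parse_ioctl_line(text):
    """Parse 'key=value' pairs from the verbose logical-resolve output into ints."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(-?\d+)", text)}


# The verbose output from the thread, joined across the line wrap.
verbose = "ioctl ret=0, total_size=4096, bytes_left=4080, bytes_missing=0, cnt=0, missed=0"

fields = parse_ioctl_line(verbose)
used = fields["total_size"] - fields["bytes_left"]
print("bytes used in result buffer:", used)
print("references returned:", fields["cnt"])
```

Since ret=0 and bytes_missing=0, the call itself succeeded and the buffer was not too small; it simply came back with no references.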

> 
> I also tried -P and -s 100000000 ....
> 
> Also did this:
> 
> $ sudo ./btrfs-map-logical  -l 58464632832   -o /tmp/58464632832 /dev/sdf3
> mirror 1 logical 58464632832 physical 1536393216 device /dev/sdg3
> mirror 2 logical 58464632832 physical 58464632832 device /dev/sdf3
> 
> And looked at the 4k block.  strings doesn't show anything useful: +V0T"
> File doesn't recognize it as anything particular.
> 
> Weird.

One last resort is to use 'btrfs-debug-tree' to dump the metadata in
human-readable form and grep for 58464632832 to find the inode info,
but this may take a while depending on the size of your partition.

thanks,
-liubo



* Re: Two uncorrectable errors across RAID1 at same logical block?
  2014-10-11  8:52             ` Liu Bo
@ 2014-10-11 15:11               ` Rich Rauenzahn
  0 siblings, 0 replies; 9+ messages in thread
From: Rich Rauenzahn @ 2014-10-11 15:11 UTC (permalink / raw)
  To: bo.li.liu; +Cc: Rich Rauenzahn, linux-btrfs

$ sudo ./btrfs-debug-tree  /dev/sdf3 | grep 58464632832
                namelen 11 datalen 0 name: 58464632832
                namelen 11 datalen 0 name: 58464632832
                inode ref index 266598 namelen 11 name: 58464632832

Is that a real inode number? I don't think it is ... Wait -- I think I'm
picking up a file in /tmp with that name, which is the block I dumped.
I renamed it.

$ sudo ./btrfs-debug-tree  /dev/sdf3 | grep 58464632832

Now it returns nothing...
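
One possible reason an exact-match grep finds nothing is that it only matches an extent that starts at that address; a block in the middle of a larger extent would be missed. The sketch below does a range check instead. It assumes the dump prints file extents as "extent data disk byte <start> nr <length>", a format that is not shown anywhere in this thread, and the sample lines are invented purely to exercise the check. The name extents_containing is hypothetical.

```python
import re

# Assumed line format for file extent items; not shown in the thread.
EXTENT_RE = re.compile(r"extent data disk byte (\d+) nr (\d+)")


def extents_containing(dump_lines, logical):
    """Yield (start, length) for each extent whose byte range covers `logical`."""
    for line in dump_lines:
        match = EXTENT_RE.search(line)
        if match:
            start, length = int(match.group(1)), int(match.group(2))
            if start <= logical < start + length:
                yield start, length


# Hypothetical dump lines, invented purely to exercise the range check.
sample = [
    "\t\textent data disk byte 58464501760 nr 262144",
    "\t\textent data disk byte 58464763904 nr 4096",
    "\t\tnamelen 11 datalen 0 name: 58464632832",
]
hits = list(extents_containing(sample, 58464632832))
print(hits)
```

Because only extent lines are matched, a file that merely has the address as its name (the false hit described above) is ignored. The same function accepts any iterable of lines, so it could be fed the piped output of btrfs-debug-tree.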


