* raid5 crash on system which PAGE_SIZE is 64KB
@ 2021-03-15 13:44 Xiao Ni
2021-03-16 9:20 ` Yufen Yu
0 siblings, 1 reply; 6+ messages in thread
From: Xiao Ni @ 2021-03-15 13:44 UTC (permalink / raw)
To: yuyufen, song, linux-raid, Nigel Croxon
Cc: Heinz Mauelshagen, kent.overstreet
Hi all
We encountered a raid5 crash problem on a POWER system where PAGE_SIZE is
64KB. I can reproduce this problem 100% of the time with the latest upstream
kernel.
The steps are:
mdadm -CR /dev/md0 -l5 -n3 /dev/sda1 /dev/sdc1 /dev/sdd1
mkfs.xfs /dev/md0 -f
mount /dev/md0 /mnt/test
The error message is:
mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
We can see error message in dmesg:
[ 6455.761545] XFS (md0): Metadata CRC error detected at xfs_agf_read_verify+0x118/0x160 [xfs], xfs_agf block 0x2105c008
[ 6455.761570] XFS (md0): Unmount and run xfs_repair
[ 6455.761575] XFS (md0): First 128 bytes of corrupted metadata buffer:
[ 6455.761581] 00000000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00 ................
[ 6455.761586] 00000010: 00 00 00 00 00 00 03 c0 00 00 00 01 00 00 00 00 ................
[ 6455.761590] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 6455.761594] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 6455.761598] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 6455.761601] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 6455.761605] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 6455.761609] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 6455.761662] XFS (md0): metadata I/O error in "xfs_read_agf+0xb4/0x1a0 [xfs]" at daddr 0x2105c008 len 8 error 74
[ 6455.761673] XFS (md0): Error -117 recovering leftover CoW allocations.
[ 6455.761685] XFS (md0): Corruption of in-memory data detected. Shutting down filesystem
[ 6455.761690] XFS (md0): Please unmount the filesystem and rectify the problem(s)
This problem doesn't happen when creating the raid device with
--assume-clean, so the corruption only occurs when resync and normal
I/O writes happen at the same time.
I tried reverting the patch set "Save memory for stripe_head buffer" and the
problem goes away. I'm looking into this, but I haven't found the root cause.
Could you have a look?
By the way, there is a place that I can't understand. Is it a bug? Should we
do it this way:
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d57a5b..4a5e8ae 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1479,7 +1479,7 @@ static struct page **to_addr_page(struct raid5_percpu *percpu, int i)
 static addr_conv_t *to_addr_conv(struct stripe_head *sh,
                                  struct raid5_percpu *percpu, int i)
 {
-        return (void *) (to_addr_page(percpu, i) + sh->disks + 2);
+        return (void *) (to_addr_page(percpu, i) + sizeof(struct page*)*(sh->disks + 2));
 }

@@ -1488,7 +1488,7 @@ static addr_conv_t *to_addr_conv(struct stripe_head *sh,
 static unsigned int *
 to_addr_offs(struct stripe_head *sh, struct raid5_percpu *percpu)
 {
-        return (unsigned int *) (to_addr_conv(sh, percpu, 0) + sh->disks + 2);
+        return (unsigned int *) (to_addr_conv(sh, percpu, 0) + sizeof(addr_conv_t)*(sh->disks + 2));
 }
This is introduced by commit b330e6a49d (md: convert to kvmalloc)
Regards
Xiao
* Re: raid5 crash on system which PAGE_SIZE is 64KB
2021-03-15 13:44 raid5 crash on system which PAGE_SIZE is 64KB Xiao Ni
@ 2021-03-16 9:20 ` Yufen Yu
2021-03-22 17:28 ` Song Liu
0 siblings, 1 reply; 6+ messages in thread
From: Yufen Yu @ 2021-03-16 9:20 UTC (permalink / raw)
To: Xiao Ni, song, linux-raid, Nigel Croxon
Cc: Heinz Mauelshagen, kent.overstreet
On 2021/3/15 21:44, Xiao Ni wrote:
> Hi all
>
> We encounter one raid5 crash problem on POWER system which PAGE_SIZE is 64KB.
> I can reproduce this problem 100%. This problem can be reproduced with latest upstream kernel.
>
> The steps are:
> mdadm -CR /dev/md0 -l5 -n3 /dev/sda1 /dev/sdc1 /dev/sdd1
> mkfs.xfs /dev/md0 -f
> mount /dev/md0 /mnt/test
>
> The error message is:
> mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
>
> We can see error message in dmesg:
> [ 6455.761545] XFS (md0): Metadata CRC error detected at xfs_agf_read_verify+0x118/0x160 [xfs], xfs_agf block 0x2105c008
> [ 6455.761570] XFS (md0): Unmount and run xfs_repair
> [ 6455.761575] XFS (md0): First 128 bytes of corrupted metadata buffer:
> [ 6455.761581] 00000000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00 ................
> [ 6455.761586] 00000010: 00 00 00 00 00 00 03 c0 00 00 00 01 00 00 00 00 ................
> [ 6455.761590] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 6455.761594] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 6455.761598] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 6455.761601] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 6455.761605] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 6455.761609] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> [ 6455.761662] XFS (md0): metadata I/O error in "xfs_read_agf+0xb4/0x1a0 [xfs]" at daddr 0x2105c008 len 8 error 74
> [ 6455.761673] XFS (md0): Error -117 recovering leftover CoW allocations.
> [ 6455.761685] XFS (md0): Corruption of in-memory data detected. Shutting down filesystem
> [ 6455.761690] XFS (md0): Please unmount the filesystem and rectify the problem(s)
>
> This problem doesn't happen when creating raid device with --assume-clean. So the crash only happens when sync and normal
> I/O write at the same time.
>
> I tried to revert the patch set "Save memory for stripe_head buffer" and the problem can be fixed. I'm looking at this problem,
> but I haven't found the root cause. Could you have a look?
Thanks for reporting this bug. Please give me some time to debug it;
my time is very limited recently.
Thanks,
Yufen
>
> By the way, there is a place that I can't understand. Is it a bug? Should we do in this way:
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 5d57a5b..4a5e8ae 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -1479,7 +1479,7 @@ static struct page **to_addr_page(struct raid5_percpu *percpu, int i)
> static addr_conv_t *to_addr_conv(struct stripe_head *sh,
> struct raid5_percpu *percpu, int i)
> {
> - return (void *) (to_addr_page(percpu, i) + sh->disks + 2);
> + return (void *) (to_addr_page(percpu, i) + sizeof(struct page*)*(sh->disks + 2));
> }
>
> /*
> @@ -1488,7 +1488,7 @@ static addr_conv_t *to_addr_conv(struct stripe_head *sh,
> static unsigned int *
> to_addr_offs(struct stripe_head *sh, struct raid5_percpu *percpu)
> {
> - return (unsigned int *) (to_addr_conv(sh, percpu, 0) + sh->disks + 2);
> + return (unsigned int *) (to_addr_conv(sh, percpu, 0) + sizeof(addr_conv_t)*(sh->disks + 2));
> }
>
> This is introduced by commit b330e6a49d (md: convert to kvmalloc)
>
> Regards
> Xiao
>
>
>
>
> .
* Re: raid5 crash on system which PAGE_SIZE is 64KB
2021-03-16 9:20 ` Yufen Yu
@ 2021-03-22 17:28 ` Song Liu
2021-03-23 5:04 ` Xiao Ni
2021-03-23 7:41 ` Yufen Yu
0 siblings, 2 replies; 6+ messages in thread
From: Song Liu @ 2021-03-22 17:28 UTC (permalink / raw)
To: Yufen Yu
Cc: Xiao Ni, linux-raid, Nigel Croxon, Heinz Mauelshagen, kent.overstreet
On Tue, Mar 16, 2021 at 2:20 AM Yufen Yu <yuyufen@huawei.com> wrote:
>
>
>
> On 2021/3/15 21:44, Xiao Ni wrote:
> > Hi all
> >
> > We encounter one raid5 crash problem on POWER system which PAGE_SIZE is 64KB.
> > I can reproduce this problem 100%. This problem can be reproduced with latest upstream kernel.
> >
> > The steps are:
> > mdadm -CR /dev/md0 -l5 -n3 /dev/sda1 /dev/sdc1 /dev/sdd1
> > mkfs.xfs /dev/md0 -f
> > mount /dev/md0 /mnt/test
> >
> > The error message is:
> > mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
> >
> > We can see error message in dmesg:
> > [ 6455.761545] XFS (md0): Metadata CRC error detected at xfs_agf_read_verify+0x118/0x160 [xfs], xfs_agf block 0x2105c008
> > [ 6455.761570] XFS (md0): Unmount and run xfs_repair
> > [ 6455.761575] XFS (md0): First 128 bytes of corrupted metadata buffer:
> > [ 6455.761581] 00000000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00 ................
> > [ 6455.761586] 00000010: 00 00 00 00 00 00 03 c0 00 00 00 01 00 00 00 00 ................
> > [ 6455.761590] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > [ 6455.761594] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > [ 6455.761598] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > [ 6455.761601] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > [ 6455.761605] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > [ 6455.761609] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > [ 6455.761662] XFS (md0): metadata I/O error in "xfs_read_agf+0xb4/0x1a0 [xfs]" at daddr 0x2105c008 len 8 error 74
> > [ 6455.761673] XFS (md0): Error -117 recovering leftover CoW allocations.
> > [ 6455.761685] XFS (md0): Corruption of in-memory data detected. Shutting down filesystem
> > [ 6455.761690] XFS (md0): Please unmount the filesystem and rectify the problem(s)
> >
> > This problem doesn't happen when creating raid device with --assume-clean. So the crash only happens when sync and normal
> > I/O write at the same time.
> >
> > I tried to revert the patch set "Save memory for stripe_head buffer" and the problem can be fixed. I'm looking at this problem,
> > but I haven't found the root cause. Could you have a look?
>
> Thanks for reporting this bug. Please give me some times to debug it,
> recently time is very limited for me.
>
> Thanks,
> Yufen
Hi Yufen,
Have you got time to look into this?
>
> >
> > By the way, there is a place that I can't understand. Is it a bug? Should we do in this way:
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 5d57a5b..4a5e8ae 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -1479,7 +1479,7 @@ static struct page **to_addr_page(struct raid5_percpu *percpu, int i)
> > static addr_conv_t *to_addr_conv(struct stripe_head *sh,
> > struct raid5_percpu *percpu, int i)
> > {
> > - return (void *) (to_addr_page(percpu, i) + sh->disks + 2);
> > + return (void *) (to_addr_page(percpu, i) + sizeof(struct page*)*(sh->disks + 2));
I guess we don't need this change. to_addr_page() returns "struct page **",
which should have the same size as "struct page *", no?
Thanks,
Song
* Re: raid5 crash on system which PAGE_SIZE is 64KB
2021-03-22 17:28 ` Song Liu
@ 2021-03-23 5:04 ` Xiao Ni
2021-03-23 7:41 ` Yufen Yu
1 sibling, 0 replies; 6+ messages in thread
From: Xiao Ni @ 2021-03-23 5:04 UTC (permalink / raw)
To: Song Liu, Yufen Yu
Cc: linux-raid, Nigel Croxon, Heinz Mauelshagen, kent.overstreet
On 03/23/2021 01:28 AM, Song Liu wrote:
> On Tue, Mar 16, 2021 at 2:20 AM Yufen Yu <yuyufen@huawei.com> wrote:
>>
>>
>> On 2021/3/15 21:44, Xiao Ni wrote:
>>> Hi all
>>>
>>> We encounter one raid5 crash problem on POWER system which PAGE_SIZE is 64KB.
>>> I can reproduce this problem 100%. This problem can be reproduced with latest upstream kernel.
>>>
>>> The steps are:
>>> mdadm -CR /dev/md0 -l5 -n3 /dev/sda1 /dev/sdc1 /dev/sdd1
>>> mkfs.xfs /dev/md0 -f
>>> mount /dev/md0 /mnt/test
>>>
>>> The error message is:
>>> mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
>>>
>>> We can see error message in dmesg:
>>> [ 6455.761545] XFS (md0): Metadata CRC error detected at xfs_agf_read_verify+0x118/0x160 [xfs], xfs_agf block 0x2105c008
>>> [ 6455.761570] XFS (md0): Unmount and run xfs_repair
>>> [ 6455.761575] XFS (md0): First 128 bytes of corrupted metadata buffer:
>>> [ 6455.761581] 00000000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00 ................
>>> [ 6455.761586] 00000010: 00 00 00 00 00 00 03 c0 00 00 00 01 00 00 00 00 ................
>>> [ 6455.761590] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761594] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761598] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761601] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761605] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761609] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761662] XFS (md0): metadata I/O error in "xfs_read_agf+0xb4/0x1a0 [xfs]" at daddr 0x2105c008 len 8 error 74
>>> [ 6455.761673] XFS (md0): Error -117 recovering leftover CoW allocations.
>>> [ 6455.761685] XFS (md0): Corruption of in-memory data detected. Shutting down filesystem
>>> [ 6455.761690] XFS (md0): Please unmount the filesystem and rectify the problem(s)
>>>
>>> This problem doesn't happen when creating raid device with --assume-clean. So the crash only happens when sync and normal
>>> I/O write at the same time.
>>>
>>> I tried to revert the patch set "Save memory for stripe_head buffer" and the problem can be fixed. I'm looking at this problem,
>>> but I haven't found the root cause. Could you have a look?
>> Thanks for reporting this bug. Please give me some times to debug it,
>> recently time is very limited for me.
>>
>> Thanks,
>> Yufen
> Hi Yufen,
>
> Have you got time to look into this?
>
>>> By the way, there is a place that I can't understand. Is it a bug? Should we do in this way:
>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>>> index 5d57a5b..4a5e8ae 100644
>>> --- a/drivers/md/raid5.c
>>> +++ b/drivers/md/raid5.c
>>> @@ -1479,7 +1479,7 @@ static struct page **to_addr_page(struct raid5_percpu *percpu, int i)
>>> static addr_conv_t *to_addr_conv(struct stripe_head *sh,
>>> struct raid5_percpu *percpu, int i)
>>> {
>>> - return (void *) (to_addr_page(percpu, i) + sh->disks + 2);
>>> + return (void *) (to_addr_page(percpu, i) + sizeof(struct page*)*(sh->disks + 2));
> I guess we don't need this change. to_add_page() returns "struct page **", which
> should have same size of "struct page*", no?
You are right, we don't need this change. I'm looking at this problem too
and will report back once I find new hints.
Regards
Xiao
* Re: raid5 crash on system which PAGE_SIZE is 64KB
2021-03-22 17:28 ` Song Liu
2021-03-23 5:04 ` Xiao Ni
@ 2021-03-23 7:41 ` Yufen Yu
2021-03-24 8:02 ` Xiao Ni
1 sibling, 1 reply; 6+ messages in thread
From: Yufen Yu @ 2021-03-23 7:41 UTC (permalink / raw)
To: Song Liu
Cc: Xiao Ni, linux-raid, Nigel Croxon, Heinz Mauelshagen, kent.overstreet
hi
On 2021/3/23 1:28, Song Liu wrote:
> On Tue, Mar 16, 2021 at 2:20 AM Yufen Yu <yuyufen@huawei.com> wrote:
>>
>>
>>
>> On 2021/3/15 21:44, Xiao Ni wrote:
>>> Hi all
>>>
>>> We encounter one raid5 crash problem on POWER system which PAGE_SIZE is 64KB.
>>> I can reproduce this problem 100%. This problem can be reproduced with latest upstream kernel.
>>>
>>> The steps are:
>>> mdadm -CR /dev/md0 -l5 -n3 /dev/sda1 /dev/sdc1 /dev/sdd1
>>> mkfs.xfs /dev/md0 -f
>>> mount /dev/md0 /mnt/test
>>>
>>> The error message is:
>>> mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
>>>
>>> We can see error message in dmesg:
>>> [ 6455.761545] XFS (md0): Metadata CRC error detected at xfs_agf_read_verify+0x118/0x160 [xfs], xfs_agf block 0x2105c008
>>> [ 6455.761570] XFS (md0): Unmount and run xfs_repair
>>> [ 6455.761575] XFS (md0): First 128 bytes of corrupted metadata buffer:
>>> [ 6455.761581] 00000000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00 ................
>>> [ 6455.761586] 00000010: 00 00 00 00 00 00 03 c0 00 00 00 01 00 00 00 00 ................
>>> [ 6455.761590] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761594] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761598] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761601] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761605] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761609] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
>>> [ 6455.761662] XFS (md0): metadata I/O error in "xfs_read_agf+0xb4/0x1a0 [xfs]" at daddr 0x2105c008 len 8 error 74
>>> [ 6455.761673] XFS (md0): Error -117 recovering leftover CoW allocations.
>>> [ 6455.761685] XFS (md0): Corruption of in-memory data detected. Shutting down filesystem
>>> [ 6455.761690] XFS (md0): Please unmount the filesystem and rectify the problem(s)
>>>
>>> This problem doesn't happen when creating raid device with --assume-clean. So the crash only happens when sync and normal
>>> I/O write at the same time.
>>>
>>> I tried to revert the patch set "Save memory for stripe_head buffer" and the problem can be fixed. I'm looking at this problem,
>>> but I haven't found the root cause. Could you have a look?
>>
>> Thanks for reporting this bug. Please give me some times to debug it,
>> recently time is very limited for me.
>>
>> Thanks,
>> Yufen
>
> Hi Yufen,
>
> Have you got time to look into this?
>
I can also reproduce this problem on my qemu vm system, with 3 10G disks.
But there is no problem when I change the mkfs.xfs option 'agcount' (the
default value is 16 on my system). For example, with agcount=15 the
filesystem mounts fine:
mkfs.xfs -d agcount=15 -f /dev/md0
mount /dev/md0 /mnt/test
In addition, I tried writing a 128MB file to /dev/md0 and reading it back
during md resync; the md5sums match:
dd if=randfile of=/dev/md0 bs=1M count=128 oflag=direct seek=10240
dd if=/dev/md0 of=out.randfile bs=1M count=128 oflag=direct skip=10240
BTW, I found that mkfs.xfs has some options related to raid devices, such as
sunit, su, swidth and sw. I guess this problem may be caused by data
alignment, but I have no idea how it happens. More time may be needed.
Thanks
Yufen
* Re: raid5 crash on system which PAGE_SIZE is 64KB
2021-03-23 7:41 ` Yufen Yu
@ 2021-03-24 8:02 ` Xiao Ni
0 siblings, 0 replies; 6+ messages in thread
From: Xiao Ni @ 2021-03-24 8:02 UTC (permalink / raw)
To: Yufen Yu, Song Liu
Cc: linux-raid, Nigel Croxon, Heinz Mauelshagen, kent.overstreet
>>
>
> I can also reproduce this problem on my qemu vm system, with 3 10G disks.
> But, there is no problem when I change mkfs.xfs option 'agcount' (default
> value is 16 for my system). For example, if I set agcount=15, there is no
> problem when mount xfs, likely:
>
> mkfs.xfs -d agcount=15 -f /dev/md0
> mount /dev/md0 /mnt/test
Hi Yufen
I tested with agcount=15; the problem still exists in my environment too.
Test1:
[root@ibm-p8-11 ~]# mdadm -CR /dev/md0 -l5 -n3 /dev/sd[b-d]1 --size=20G
[root@ibm-p8-11 ~]# mkfs.xfs /dev/md0 -f
meta-data=/dev/md0 isize=512 agcount=16, agsize=655232 blks
...
[root@ibm-p8-11 ~]# mount /dev/md0 /mnt/test
mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
Test2:
[root@ibm-p8-11 ~]# mkfs.xfs /dev/md0 -f -d agcount=15
Warning: AG size is a multiple of stripe width. This can cause performance
problems by aligning all AGs on the same disk. To avoid this, run mkfs with
an AG size that is one stripe unit smaller or larger, for example 699008.
meta-data=/dev/md0 isize=512 agcount=15, agsize=699136 blks
...
[root@ibm-p8-11 ~]# mount /dev/md0 /mnt/test
mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.
>
> In addition, I try to write a 128MB file to /dev/md0 and then read it out
> during md resync, they are same by checking md5sum, likely:
>
> dd if=randfile of=/dev/md0 bs=1M count=128 oflag=direct seek=10240
> dd if=/dev/md0 of=out.randfile bs=1M count=128 oflag=direct skip=10240
>
> BTW, I found mkfs.xfs have some options related to raid device, such as
> sunit, su, swidth, sw. I guess this problem may be caused by data
> alignment.
> But, I have no idea how it happen. More time may needed.
The problem doesn't happen if mkfs runs without a resync in progress. Is it
possible that resync and mkfs write to the same page?
Regards
Xiao
end of thread, other threads:[~2021-03-24 8:03 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-15 13:44 raid5 crash on system which PAGE_SIZE is 64KB Xiao Ni
2021-03-16 9:20 ` Yufen Yu
2021-03-22 17:28 ` Song Liu
2021-03-23 5:04 ` Xiao Ni
2021-03-23 7:41 ` Yufen Yu
2021-03-24 8:02 ` Xiao Ni