On 2019/6/28 10:26 AM, Anand Jain wrote:
> At mkfs.btrfs time the device id and stripe index get reversed, as
> shown in [1]. This patch keeps them in order at mkfs.btrfs time,
> which makes debugging easier.
>
> Before:
> Stripe 0 is on devid 2; Stripe 1 is on devid 1
>
> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13 "FIRST_CHUNK_TREE CHUNK_ITEM"
>     item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
>         length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>         io_align 65536 io_width 65536 sector_size 4096
>         num_stripes 2 sub_stripes 0
>             stripe 0 devid 2 offset 1048576
>             dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>             stripe 1 devid 1 offset 22020096
>             dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>     item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
>         length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>         io_align 65536 io_width 65536 sector_size 4096
>         num_stripes 2 sub_stripes 0
>             stripe 0 devid 2 offset 9437184
>             dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>             stripe 1 devid 1 offset 30408704
>             dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>     item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
>         length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>         io_align 65536 io_width 65536 sector_size 4096
>         num_stripes 2 sub_stripes 0
>             stripe 0 devid 2 offset 277872640
>             dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>             stripe 1 devid 1 offset 298844160
>             dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>
> After:
> Stripe 0 is on devid 1; Stripe 1 is on devid 2
>
> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13 "FIRST_CHUNK_TREE CHUNK_ITEM"
> /dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
> /dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
>     item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
>         length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>         io_align 65536 io_width 65536 sector_size 4096
>         num_stripes 2 sub_stripes 0
>             stripe 0 devid 1 offset 22020096
>             dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>             stripe 1 devid 2 offset 1048576
>             dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>     item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
>         length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>         io_align 65536 io_width 65536 sector_size 4096
>         num_stripes 2 sub_stripes 0
>             stripe 0 devid 1 offset 30408704
>             dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>             stripe 1 devid 2 offset 9437184
>             dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>     item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
>         length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>         io_align 65536 io_width 65536 sector_size 4096
>         num_stripes 2 sub_stripes 0
>             stripe 0 devid 1 offset 298844160
>             dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>             stripe 1 devid 2 offset 277872640
>             dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>
> Signed-off-by: Anand Jain

Reviewed-by: Qu Wenruo

But please also check the comment inlined below.

> ---
>  volumes.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/volumes.c b/volumes.c
> index 79d1d6a07fb7..8c8b17e814b8 100644
> --- a/volumes.c
> +++ b/volumes.c
> @@ -1109,7 +1109,7 @@ again:
>  			return ret;
>  		cur = cur->next;
>  		if (avail >= min_free) {
> -			list_move_tail(&device->dev_list, &private_devs);
> +			list_move(&device->dev_list, &private_devs);

This is OK, since the current btrfs-progs chunk allocator doesn't
follow the kernel behavior of sorting devices by their unallocated
space, so the device order is completely devid based.
But please keep in mind that if we're going to unify the chunk
allocator behavior of the kernel and btrfs-progs, this behavior will
change. The initial temporary chunk is always allocated on devid 1,
which reduces that device's unallocated space and thus its priority in
the chunk allocator, making the devid sequence less reliable.

Thanks,
Qu

>  			index++;
>  			if (type & BTRFS_BLOCK_GROUP_DUP)
>  				index++;
> @@ -1166,7 +1166,7 @@ again:
>  		/* loop over this device again if we're doing a dup group */
>  		if (!(type & BTRFS_BLOCK_GROUP_DUP) ||
>  		    (index == num_stripes - 1))
> -			list_move_tail(&device->dev_list, dev_list);
> +			list_move(&device->dev_list, dev_list);
>
>  		ret = btrfs_alloc_dev_extent(trans, device, key.offset,
>  			     calc_size, &dev_offset);
>
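
For readers unfamiliar with the list helpers, the effect of swapping list_move_tail() for list_move() can be sketched in isolation. This is only an illustration, not the btrfs-progs code: the list implementation is a minimal re-creation of the kernel-style helpers, struct device is a hypothetical stand-in for struct btrfs_device, and the starting order [devid 2, devid 1] is an assumption chosen to match the "Before" output, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on kernel-style list.h. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry between two known neighbours. */
static void list_link(struct list_head *entry, struct list_head *prev,
		      struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

static void list_unlink(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Unlink entry and re-insert it right after head (at the front). */
static void list_move(struct list_head *entry, struct list_head *head)
{
	list_unlink(entry);
	list_link(entry, head, head->next);
}

/* Unlink entry and re-insert it right before head (at the back). */
static void list_move_tail(struct list_head *entry, struct list_head *head)
{
	list_unlink(entry);
	list_link(entry, head->prev, head);
}

/* Hypothetical stand-in for struct btrfs_device. */
struct device {
	int devid;
	struct list_head dev_list;
};

/*
 * Collect both devices into a private list, similar in shape to the first
 * hunk, and return the devid at the front of that list, i.e. the device
 * that would back stripe 0.
 * ASSUMPTION: the source list starts out as [devid 2, devid 1].
 */
static int stripe0_devid(int use_tail)
{
	struct list_head devs, private_devs, *cur;
	struct device d1 = { .devid = 1 }, d2 = { .devid = 2 };

	list_init(&devs);
	list_init(&private_devs);
	list_link(&d2.dev_list, devs.prev, &devs);	/* devs: [2] */
	list_link(&d1.dev_list, devs.prev, &devs);	/* devs: [2, 1] */

	cur = devs.next;
	while (cur != &devs) {
		struct list_head *entry = cur;

		cur = cur->next;	/* advance before moving the entry */
		if (use_tail)
			list_move_tail(entry, &private_devs);
		else
			list_move(entry, &private_devs);
	}

	assert(private_devs.next != &private_devs);
	return ((struct device *)((char *)private_devs.next -
		offsetof(struct device, dev_list)))->devid;
}
```

Under that assumed starting order, stripe0_devid(1) models the old list_move_tail() behavior and keeps devid 2 at the front, while stripe0_devid(0) models the patched list_move() behavior and puts devid 1 at the front. Moving to the tail preserves the source order; moving to the head reverses it.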