linux-btrfs.vger.kernel.org archive mirror
* slow single -> raid1 conversion (heavy write to original LVM volume)
@ 2020-01-14 13:16 jn
  2020-01-14 17:18 ` Chris Murphy
  2020-01-15  3:26 ` Qu Wenruo
  0 siblings, 2 replies; 11+ messages in thread
From: jn @ 2020-01-14 13:16 UTC (permalink / raw)
  To: linux-btrfs

Hello,

I am experiencing a very slow conversion from a single-disk BTRFS
filesystem to raid1 (a new disk was added).

What I have done:

I added a new disk to a nearly full (ca. 85%) BTRFS filesystem on an
LVM volume, with the intention of converting it to raid1:

btrfs balance start -dconvert raid1 -mconvert raid1 /data/

> Jan 10 08:14:04 sopa kernel: [155893.485617] BTRFS info (device dm-0):
> disk added /dev/sdb3
> Jan 10 08:15:06 sopa kernel: [155955.958561] BTRFS info (device dm-0):
> relocating block group 2078923554816 flags data
> Jan 10 08:15:07 sopa kernel: [155956.991293] BTRFS info (device dm-0):
> relocating block group 2077849812992 flags data
> Jan 10 08:15:10 sopa kernel: [155960.357846] BTRFS info (device dm-0):
> relocating block group 2076776071168 flags data
> Jan 10 08:15:13 sopa kernel: [155962.772534] BTRFS info (device dm-0):
> relocating block group 2075702329344 flags data
> Jan 10 08:15:14 sopa kernel: [155964.195237] BTRFS info (device dm-0):
> relocating block group 2074628587520 flags data
> Jan 10 08:15:45 sopa kernel: [155994.546695] BTRFS info (device dm-0):
> relocating block group 2062817427456 flags data
> Jan 10 08:15:52 sopa kernel: [156001.952247] BTRFS info (device dm-0):
> relocating block group 2059596201984 flags data
> Jan 10 08:15:58 sopa kernel: [156007.787071] BTRFS info (device dm-0):
> relocating block group 2057448718336 flags data
> Jan 10 08:16:00 sopa kernel: [156010.094565] BTRFS info (device dm-0):
> relocating block group 2056374976512 flags data
> Jan 10 08:16:06 sopa kernel: [156015.585343] BTRFS info (device dm-0):
> relocating block group 2054227492864 flags data
> Jan 10 08:16:12 sopa kernel: [156022.305629] BTRFS info (device dm-0):
> relocating block group 2051006267392 flags data
> Jan 10 08:16:23 sopa kernel: [156033.373144] BTRFS info (device dm-0):
> found 75 extents
> Jan 10 08:16:29 sopa kernel: [156038.666672] BTRFS info (device dm-0):
> found 75 extents
> Jan 10 08:16:36 sopa kernel: [156045.909270] BTRFS info (device dm-0):
> found 75 extents
> Jan 10 08:16:42 sopa kernel: [156052.292789] BTRFS info (device dm-0):
> found 75 extents
> Jan 10 08:16:46 sopa kernel: [156055.643452] BTRFS info (device dm-0):
> found 75 extents
> Jan 10 08:16:54 sopa kernel: [156063.608344] BTRFS info (device dm-0):
> found 75 extents
After 6 hours of processing with 0% progress reported by balance
status, I decided to cancel it, free up some more space, and rerun the
balance with some filters:

btrfs balance cancel /data

> Jan 10 14:38:11 sopa kernel: [178941.189217] BTRFS info (device dm-0):
> found 68 extents
> Jan 10 14:38:14 sopa kernel: [178943.619787] BTRFS info (device dm-0):
> found 68 extents
> Jan 10 14:38:20 sopa kernel: [178950.275334] BTRFS info (device dm-0):
> found 68 extents
> Jan 10 14:38:24 sopa kernel: [178954.018770] INFO: task btrfs:30196
> blocked for more than 845 seconds.
> Jan 10 14:38:24 sopa kernel: [178954.018844] 
> btrfs_cancel_balance+0xf8/0x170 [btrfs]
> Jan 10 14:38:24 sopa kernel: [178954.018878] 
> btrfs_ioctl+0x13af/0x20d0 [btrfs]
> Jan 10 14:38:28 sopa kernel: [178957.999108] BTRFS info (device dm-0):
> found 68 extents
> Jan 10 14:38:29 sopa kernel: [178958.837674] BTRFS info (device dm-0):
> found 68 extents
> Jan 10 14:38:30 sopa kernel: [178959.835118] BTRFS info (device dm-0):
> found 68 extents
> Jan 10 14:38:31 sopa kernel: [178960.915305] BTRFS info (device dm-0):
> found 68 extents
> Jan 10 14:40:25 sopa kernel: [179074.851376] 
> btrfs_cancel_balance+0xf8/0x170 [btrfs]
> Jan 10 14:40:25 sopa kernel: [179074.851408] 
> btrfs_ioctl+0x13af/0x20d0 [btrfs]

Now, nearly 4 days later (and after some data was deleted), both the
balance start and balance cancel processes are still running, and the
system reports:

> root@sopa:/var/log# btrfs balance status /data/
> Balance on '/data/' is running, cancel requested
> 0 out of about 1900 chunks balanced (29 considered), 100% left

> root@sopa:~# uname -a
> Linux sopa 5.4.8-050408-generic #202001041436 SMP Sat Jan 4 19:40:55
> UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> root@sopa:~#   btrfs --version
> btrfs-progs v4.15.1
>
> root@sopa:~#   btrfs fi show
> Label: 'SOPADATA'  uuid: 37b8a62c-68e8-44e4-a3b2-eb572385c3e8
>     Total devices 2 FS bytes used 1.04TiB
>     devid    1 size 1.86TiB used 1.86TiB path /dev/mapper/sopa-data
>     devid    2 size 1.86TiB used 0.00B path /dev/sdb3
>
> root@sopa:~# btrfs subvolume list /data
> ID 1021 gen 7564583 top level 5 path nfs
> ID 1022 gen 7564590 top level 5 path motion

> root@sopa:~#   btrfs fi df /data
> Data, single: total=1.84TiB, used=1.04TiB
> System, DUP: total=8.00MiB, used=224.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=6.50GiB, used=2.99GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
Is it normal that it has written nearly 5 TB of data to the original
disk?

> root@sopa:/var/log# ps ax | grep balance
> 16014 ?        D    21114928:30 btrfs balance start -dconvert raid1
> -mconvert raid1 /data/
> 30196 ?        D      0:00 btrfs balance cancel /data

> root@sopa:/var/log# cat /proc/16014/io | grep bytes
> read_bytes: 1150357504
> write_bytes: 5812039966720
> root@sopa:/sys/block# cat  /sys/block/sdb/sdb3/stat
>      404        0    39352      956  4999199     1016 40001720
> 71701953        0 14622628 67496136        0        0        0        0

> [520398.089952] btrfs(16014): WRITE block 131072 on sdb3 (8 sectors)
> [520398.089975] btrfs(16014): WRITE block 536870912 on sdb3 (8 sectors)
> [520398.089995] btrfs(16014): WRITE block 128 on dm-0 (8 sectors)
> [520398.090021] btrfs(16014): WRITE block 131072 on dm-0 (8 sectors)
> [520398.090040] btrfs(16014): WRITE block 536870912 on dm-0 (8 sectors)
> [520398.154382] btrfs(16014): WRITE block 14629168 on dm-0 (512 sectors)
> [520398.155017] btrfs(16014): WRITE block 17748832 on dm-0 (512 sectors)
> [520398.155545] btrfs(16014): WRITE block 17909352 on dm-0 (512 sectors)
> [520398.156091] btrfs(16014): WRITE block 20534680 on dm-0 (512 sectors)
>
regards

jn




* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-14 13:16 slow single -> raid1 conversion (heavy write to original LVM volume) jn
@ 2020-01-14 17:18 ` Chris Murphy
  2020-01-14 17:41   ` jakub nantl
  2020-01-15  3:26 ` Qu Wenruo
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2020-01-14 17:18 UTC (permalink / raw)
  To: jn; +Cc: Btrfs BTRFS

On Tue, Jan 14, 2020 at 6:27 AM jn <jn@forever.cz> wrote:
>
> Hello,
>
> I am experiencing very slow conversion from single disk BTRFS to raid1
> balanced (new disk was added):
>
> what I have done:
>
> I have added new disk to nearly full (cca 85%) BTRFS filesystem on LVM
> volume with intention to convert it into raid1:
>
> btrfs balance start -dconvert raid1 -mconvert raid1 /data/
>
> > Jan 10 08:14:04 sopa kernel: [155893.485617] BTRFS info (device dm-0):
> > disk added /dev/sdb3
> > Jan 10 08:15:06 sopa kernel: [155955.958561] BTRFS info (device dm-0):
> > relocating block group 2078923554816 flags data
> > Jan 10 08:15:07 sopa kernel: [155956.991293] BTRFS info (device dm-0):
> > relocating block group 2077849812992 flags data
> > Jan 10 08:15:10 sopa kernel: [155960.357846] BTRFS info (device dm-0):
> > relocating block group 2076776071168 flags data
> > Jan 10 08:15:13 sopa kernel: [155962.772534] BTRFS info (device dm-0):
> > relocating block group 2075702329344 flags data
> > Jan 10 08:15:14 sopa kernel: [155964.195237] BTRFS info (device dm-0):
> > relocating block group 2074628587520 flags data
> > Jan 10 08:15:45 sopa kernel: [155994.546695] BTRFS info (device dm-0):
> > relocating block group 2062817427456 flags data
> > Jan 10 08:15:52 sopa kernel: [156001.952247] BTRFS info (device dm-0):
> > relocating block group 2059596201984 flags data
> > Jan 10 08:15:58 sopa kernel: [156007.787071] BTRFS info (device dm-0):
> > relocating block group 2057448718336 flags data
> > Jan 10 08:16:00 sopa kernel: [156010.094565] BTRFS info (device dm-0):
> > relocating block group 2056374976512 flags data
> > Jan 10 08:16:06 sopa kernel: [156015.585343] BTRFS info (device dm-0):
> > relocating block group 2054227492864 flags data
> > Jan 10 08:16:12 sopa kernel: [156022.305629] BTRFS info (device dm-0):
> > relocating block group 2051006267392 flags data
> > Jan 10 08:16:23 sopa kernel: [156033.373144] BTRFS info (device dm-0):
> > found 75 extents
> > Jan 10 08:16:29 sopa kernel: [156038.666672] BTRFS info (device dm-0):
> > found 75 extents
> > Jan 10 08:16:36 sopa kernel: [156045.909270] BTRFS info (device dm-0):
> > found 75 extents
> > Jan 10 08:16:42 sopa kernel: [156052.292789] BTRFS info (device dm-0):
> > found 75 extents
> > Jan 10 08:16:46 sopa kernel: [156055.643452] BTRFS info (device dm-0):
> > found 75 extents
> > Jan 10 08:16:54 sopa kernel: [156063.608344] BTRFS info (device dm-0):
> > found 75 extents
> after 6hours of processing with 0% progress reported by balance status,
> I decided to cancel it to empty more space and rerun balance with some
> filters:
>
> btrfs balance cancel /data
>
> > Jan 10 14:38:11 sopa kernel: [178941.189217] BTRFS info (device dm-0):
> > found 68 extents
> > Jan 10 14:38:14 sopa kernel: [178943.619787] BTRFS info (device dm-0):
> > found 68 extents
> > Jan 10 14:38:20 sopa kernel: [178950.275334] BTRFS info (device dm-0):
> > found 68 extents
> > Jan 10 14:38:24 sopa kernel: [178954.018770] INFO: task btrfs:30196
> > blocked for more than 845 seconds.

Often a developer will want to see sysrq+w output following a blocked
task message like this.
https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html

It's a bit different depending on the distribution, but typically it's:

echo 1 >/proc/sys/kernel/sysrq   ##this enables sysrq
echo w > /proc/sysrq-trigger     ##this is the trigger

It'll dump to the kernel log; sometimes there's more output than the
kernel message buffer can hold, so I personally use 'journalctl -k -o
short-monotonic > journal.txt' to capture all kernel messages in a
file and put that up somewhere. Such files are sometimes too big as
attachments for the list, so I sometimes do both.



> > Jan 10 14:38:24 sopa kernel: [178954.018844]
> > btrfs_cancel_balance+0xf8/0x170 [btrfs]
> > Jan 10 14:38:24 sopa kernel: [178954.018878]
> > btrfs_ioctl+0x13af/0x20d0 [btrfs]
> > Jan 10 14:38:28 sopa kernel: [178957.999108] BTRFS info (device dm-0):
> > found 68 extents
> > Jan 10 14:38:29 sopa kernel: [178958.837674] BTRFS info (device dm-0):
> > found 68 extents
> > Jan 10 14:38:30 sopa kernel: [178959.835118] BTRFS info (device dm-0):
> > found 68 extents
> > Jan 10 14:38:31 sopa kernel: [178960.915305] BTRFS info (device dm-0):
> > found 68 extents
> > Jan 10 14:40:25 sopa kernel: [179074.851376]
> > btrfs_cancel_balance+0xf8/0x170 [btrfs]
> > Jan 10 14:40:25 sopa kernel: [179074.851408]
> > btrfs_ioctl+0x13af/0x20d0 [btrfs]
>
> now nearly 4 days later (and after some data deleted) both balance start
> and balance cancel processes are still running and system reports:

It's clearly stuck on something, not sure what. sysrq+w should help
make it clear.

Before forcing a reboot, make sure backups are current while the file
system is still working. Balance is copy-on-write, so in theory it can
be interrupted safely. But I'd do what you can to minimize the changes
happening to the file system immediately prior to forcing the reboot:
close out of everything, stop the GUI if you can and move to a VT, do
a sync, and use 'reboot -f' instead of the friendlier 'reboot', because
a systemd reboot stops a bunch of services and all of that causes
journal writes. It's really better to just force the reboot and cross
your fingers.

But yeah, if you can, wait until you hear back from a dev. That's better.
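
If it does come to a forced reboot, a minimal sketch of the sequence
(a sketch only; the point is the ordering, not the exact commands):

sync         ## flush whatever can still be flushed while the fs responds
reboot -f    ## force it, skipping the usual systemd shutdown jobs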


> > root@sopa:~# uname -a
> > Linux sopa 5.4.8-050408-generic #202001041436 SMP Sat Jan 4 19:40:55
> > UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> >
> > root@sopa:~#   btrfs --version
> > btrfs-progs v4.15.1

It's probably not related, but this is quite old compared to the kernel.

There is a known bug in 5.4 that, off hand, I don't think is causing
the problem you're seeing, but to avoid possibly strange space-usage
reporting you can optionally use the metadata_ratio=1 mount option.
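
For example, assuming /data is the affected mount point, something like
this should apply it on the live filesystem:

mount -o remount,metadata_ratio=1 /data   ## allocate a metadata chunk after every data chunk

and the option can also be added to the fstab entry so it persists
across reboots.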



> >
> > root@sopa:~#   btrfs fi show
> > Label: 'SOPADATA'  uuid: 37b8a62c-68e8-44e4-a3b2-eb572385c3e8
> >     Total devices 2 FS bytes used 1.04TiB
> >     devid    1 size 1.86TiB used 1.86TiB path /dev/mapper/sopa-data
> >     devid    2 size 1.86TiB used 0.00B path /dev/sdb3
> >
> > root@sopa:~# btrfs subvolume list /data
> > ID 1021 gen 7564583 top level 5 path nfs
> > ID 1022 gen 7564590 top level 5 path motion
>
> > root@sopa:~#   btrfs fi df /data
> > Data, single: total=1.84TiB, used=1.04TiB
> > System, DUP: total=8.00MiB, used=224.00KiB
> > System, single: total=4.00MiB, used=0.00B
> > Metadata, DUP: total=6.50GiB, used=2.99GiB
> > Metadata, single: total=8.00MiB, used=0.00B
> > GlobalReserve, single: total=512.00MiB, used=0.00B
> >
> is it normal that  it have written nearly 5TB of data to the original
> disk ??:


What I do see is that there isn't a single raid1 block group yet. That
is strange. For sure it's stuck.



>
> > root@sopa:/var/log# ps ax | grep balance
> > 16014 ?        D    21114928:30 btrfs balance start -dconvert raid1
> > -mconvert raid1 /data/
> > 30196 ?        D      0:00 btrfs balance cancel /data
>
> > root@sopa:/var/log# cat /proc/16014/io | grep bytes
> > read_bytes: 1150357504
> > write_bytes: 5812039966720

No idea.


> > root@sopa:/sys/block# cat  /sys/block/sdb/sdb3/stat
> >      404        0    39352      956  4999199     1016 40001720
> > 71701953        0 14622628 67496136        0        0        0        0
>
> > [520398.089952] btrfs(16014): WRITE block 131072 on sdb3 (8 sectors)
> > [520398.089975] btrfs(16014): WRITE block 536870912 on sdb3 (8 sectors)
> > [520398.089995] btrfs(16014): WRITE block 128 on dm-0 (8 sectors)
> > [520398.090021] btrfs(16014): WRITE block 131072 on dm-0 (8 sectors)
> > [520398.090040] btrfs(16014): WRITE block 536870912 on dm-0 (8 sectors)
> > [520398.154382] btrfs(16014): WRITE block 14629168 on dm-0 (512 sectors)
> > [520398.155017] btrfs(16014): WRITE block 17748832 on dm-0 (512 sectors)
> > [520398.155545] btrfs(16014): WRITE block 17909352 on dm-0 (512 sectors)
> > [520398.156091] btrfs(16014): WRITE block 20534680 on dm-0 (512 sectors)
> >
> regards
>
> jn
>
>


It's really confused; I can't tell why from the available information.

-- 
Chris Murphy


* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-14 17:18 ` Chris Murphy
@ 2020-01-14 17:41   ` jakub nantl
  2020-01-14 20:09     ` Chris Murphy
  0 siblings, 1 reply; 11+ messages in thread
From: jakub nantl @ 2020-01-14 17:41 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

Hello,

Thanks for the reply. Here is the call trace; no need to reboot it, so
I am waiting :)

[538847.101197] sysrq: Show Blocked State
[538847.101206]   task                        PC stack   pid father
[538847.101321] btrfs           D    0 16014      1 0x00004004
[538847.101324] Call Trace:
[538847.101335]  __schedule+0x2e3/0x740
[538847.101339]  ? __switch_to_asm+0x40/0x70
[538847.101342]  ? __switch_to_asm+0x34/0x70
[538847.101345]  schedule+0x42/0xb0
[538847.101348]  schedule_timeout+0x203/0x2f0
[538847.101351]  ? __schedule+0x2eb/0x740
[538847.101355]  io_schedule_timeout+0x1e/0x50
[538847.101358]  wait_for_completion_io+0xb1/0x120
[538847.101363]  ? wake_up_q+0x70/0x70
[538847.101401]  write_all_supers+0x896/0x960 [btrfs]
[538847.101426]  btrfs_commit_transaction+0x6ea/0x960 [btrfs]
[538847.101456]  prepare_to_merge+0x210/0x250 [btrfs]
[538847.101484]  relocate_block_group+0x36b/0x5f0 [btrfs]
[538847.101512]  btrfs_relocate_block_group+0x15e/0x300 [btrfs]
[538847.101539]  btrfs_relocate_chunk+0x2a/0x90 [btrfs]
[538847.101566]  __btrfs_balance+0x409/0xa50 [btrfs]
[538847.101593]  btrfs_balance+0x3ae/0x530 [btrfs]
[538847.101621]  btrfs_ioctl_balance+0x2c1/0x380 [btrfs]
[538847.101648]  btrfs_ioctl+0x836/0x20d0 [btrfs]
[538847.101652]  ? do_anonymous_page+0x2e6/0x650
[538847.101656]  ? __handle_mm_fault+0x760/0x7a0
[538847.101662]  do_vfs_ioctl+0x407/0x670
[538847.101664]  ? do_vfs_ioctl+0x407/0x670
[538847.101669]  ? do_user_addr_fault+0x216/0x450
[538847.101672]  ksys_ioctl+0x67/0x90
[538847.101675]  __x64_sys_ioctl+0x1a/0x20
[538847.101680]  do_syscall_64+0x57/0x190
[538847.101683]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[538847.101687] RIP: 0033:0x7f3cb04c85d7
[538847.101695] Code: Bad RIP value.
[538847.101697] RSP: 002b:00007ffcd4e5fe88 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[538847.101701] RAX: ffffffffffffffda RBX: 0000000000000001 RCX:
00007f3cb04c85d7
[538847.101704] RDX: 00007ffcd4e5ff18 RSI: 00000000c4009420 RDI:
0000000000000003
[538847.101707] RBP: 00007ffcd4e5ff18 R08: 0000000000000078 R09:
0000000000000000
[538847.101710] R10: 0000559f27675010 R11: 0000000000000246 R12:
0000000000000003
[538847.101713] R13: 00007ffcd4e62734 R14: 0000000000000001 R15:
0000000000000000
[538847.101718] btrfs           D    0 30196      1 0x00000004
[538847.101720] Call Trace:
[538847.101724]  __schedule+0x2e3/0x740
[538847.101727]  schedule+0x42/0xb0
[538847.101753]  btrfs_cancel_balance+0xf8/0x170 [btrfs]
[538847.101759]  ? wait_woken+0x80/0x80
[538847.101786]  btrfs_ioctl+0x13af/0x20d0 [btrfs]
[538847.101789]  ? do_anonymous_page+0x2e6/0x650
[538847.101793]  ? __handle_mm_fault+0x760/0x7a0
[538847.101797]  do_vfs_ioctl+0x407/0x670
[538847.101800]  ? do_vfs_ioctl+0x407/0x670
[538847.101803]  ? do_user_addr_fault+0x216/0x450
[538847.101806]  ksys_ioctl+0x67/0x90
[538847.101809]  __x64_sys_ioctl+0x1a/0x20
[538847.101813]  do_syscall_64+0x57/0x190
[538847.101856]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[538847.101859] RIP: 0033:0x7fa33680c5d7
[538847.101864] Code: Bad RIP value.
[538847.101873] RSP: 002b:00007ffdbe2b9c58 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[538847.101888] RAX: ffffffffffffffda RBX: 0000000000000003 RCX:
00007fa33680c5d7
[538847.101897] RDX: 0000000000000002 RSI: 0000000040049421 RDI:
0000000000000003
[538847.101908] RBP: 00007ffdbe2ba1d8 R08: 0000000000000078 R09:
0000000000000000
[538847.101918] R10: 00005604500f4010 R11: 0000000000000246 R12:
00007ffdbe2ba735
[538847.101928] R13: 00007ffdbe2ba1c0 R14: 0000000000000000 R15:
0000000000000000




On 14. 01. 20 18:18, Chris Murphy wrote:
> On Tue, Jan 14, 2020 at 6:27 AM jn <jn@forever.cz> wrote:
>> Hello,
>>
>> I am experiencing very slow conversion from single disk BTRFS to raid1
>> balanced (new disk was added):
>>
>> what I have done:
>>
>> I have added new disk to nearly full (cca 85%) BTRFS filesystem on LVM
>> volume with intention to convert it into raid1:
>>
>> btrfs balance start -dconvert raid1 -mconvert raid1 /data/
>>
>>> Jan 10 08:14:04 sopa kernel: [155893.485617] BTRFS info (device dm-0):
>>> disk added /dev/sdb3
>>> Jan 10 08:15:06 sopa kernel: [155955.958561] BTRFS info (device dm-0):
>>> relocating block group 2078923554816 flags data
>>> Jan 10 08:15:07 sopa kernel: [155956.991293] BTRFS info (device dm-0):
>>> relocating block group 2077849812992 flags data
>>> Jan 10 08:15:10 sopa kernel: [155960.357846] BTRFS info (device dm-0):
>>> relocating block group 2076776071168 flags data
>>> Jan 10 08:15:13 sopa kernel: [155962.772534] BTRFS info (device dm-0):
>>> relocating block group 2075702329344 flags data
>>> Jan 10 08:15:14 sopa kernel: [155964.195237] BTRFS info (device dm-0):
>>> relocating block group 2074628587520 flags data
>>> Jan 10 08:15:45 sopa kernel: [155994.546695] BTRFS info (device dm-0):
>>> relocating block group 2062817427456 flags data
>>> Jan 10 08:15:52 sopa kernel: [156001.952247] BTRFS info (device dm-0):
>>> relocating block group 2059596201984 flags data
>>> Jan 10 08:15:58 sopa kernel: [156007.787071] BTRFS info (device dm-0):
>>> relocating block group 2057448718336 flags data
>>> Jan 10 08:16:00 sopa kernel: [156010.094565] BTRFS info (device dm-0):
>>> relocating block group 2056374976512 flags data
>>> Jan 10 08:16:06 sopa kernel: [156015.585343] BTRFS info (device dm-0):
>>> relocating block group 2054227492864 flags data
>>> Jan 10 08:16:12 sopa kernel: [156022.305629] BTRFS info (device dm-0):
>>> relocating block group 2051006267392 flags data
>>> Jan 10 08:16:23 sopa kernel: [156033.373144] BTRFS info (device dm-0):
>>> found 75 extents
>>> Jan 10 08:16:29 sopa kernel: [156038.666672] BTRFS info (device dm-0):
>>> found 75 extents
>>> Jan 10 08:16:36 sopa kernel: [156045.909270] BTRFS info (device dm-0):
>>> found 75 extents
>>> Jan 10 08:16:42 sopa kernel: [156052.292789] BTRFS info (device dm-0):
>>> found 75 extents
>>> Jan 10 08:16:46 sopa kernel: [156055.643452] BTRFS info (device dm-0):
>>> found 75 extents
>>> Jan 10 08:16:54 sopa kernel: [156063.608344] BTRFS info (device dm-0):
>>> found 75 extents
>> after 6hours of processing with 0% progress reported by balance status,
>> I decided to cancel it to empty more space and rerun balance with some
>> filters:
>>
>> btrfs balance cancel /data
>>
>>> Jan 10 14:38:11 sopa kernel: [178941.189217] BTRFS info (device dm-0):
>>> found 68 extents
>>> Jan 10 14:38:14 sopa kernel: [178943.619787] BTRFS info (device dm-0):
>>> found 68 extents
>>> Jan 10 14:38:20 sopa kernel: [178950.275334] BTRFS info (device dm-0):
>>> found 68 extents
>>> Jan 10 14:38:24 sopa kernel: [178954.018770] INFO: task btrfs:30196
>>> blocked for more than 845 seconds.
> Often a developer will want to see sysrq+w output following a blocked
> task message like this.
> https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
>
> It's a bit different depending on the distribution but typical is:
>
> echo 1 >/proc/sys/kernel/sysrq   ##this enables sysrq
> echo w > /proc/sysrq-trigger     ##this is the trigger
>
> It'll dump to kernel messages; sometimes there's more than the kernel
> message buffer can hold; I personally use 'journalctl -k -o
> short-monotonic > journal.txt'  to capture all kernel messages in a
> file, and put that up somewhere. They're sometimes too big as
> attachments for the list, so I sometimes do both.
>
>
>
>>> Jan 10 14:38:24 sopa kernel: [178954.018844]
>>> btrfs_cancel_balance+0xf8/0x170 [btrfs]
>>> Jan 10 14:38:24 sopa kernel: [178954.018878]
>>> btrfs_ioctl+0x13af/0x20d0 [btrfs]
>>> Jan 10 14:38:28 sopa kernel: [178957.999108] BTRFS info (device dm-0):
>>> found 68 extents
>>> Jan 10 14:38:29 sopa kernel: [178958.837674] BTRFS info (device dm-0):
>>> found 68 extents
>>> Jan 10 14:38:30 sopa kernel: [178959.835118] BTRFS info (device dm-0):
>>> found 68 extents
>>> Jan 10 14:38:31 sopa kernel: [178960.915305] BTRFS info (device dm-0):
>>> found 68 extents
>>> Jan 10 14:40:25 sopa kernel: [179074.851376]
>>> btrfs_cancel_balance+0xf8/0x170 [btrfs]
>>> Jan 10 14:40:25 sopa kernel: [179074.851408]
>>> btrfs_ioctl+0x13af/0x20d0 [btrfs]
>> now nearly 4 days later (and after some data deleted) both balance start
>> and balance cancel processes are still running and system reports:
> It's clearly stuck on something, not sure what. sysrq+w should help
> make it clear.
>
> Before forcing a reboot; make sure backups are current while the file
> system is still working. Balance is copy on write, so in theory it can
> be interrupted safely. But I'd do what you can to minimize the changes
> happening to the file system immediately prior to forcing the reboot:
> close out of everything, stop the GUI if you can and move to a VT, do
> a sync, and do 'reboot -f' instead of the friendlier 'reboot' because
> a systemd reboot causes a bunch of services to get quit and all of
> that does journal writes. It's really better to just force the reboot
> and cross your fingers.
>
> But yeah, if you can wait until hearing back from a dev. That's better.
>
>
>>> root@sopa:~# uname -a
>>> Linux sopa 5.4.8-050408-generic #202001041436 SMP Sat Jan 4 19:40:55
>>> UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> root@sopa:~#   btrfs --version
>>> btrfs-progs v4.15.1
> It's probably not related, but this is quite old compared to the kernel.
>
> There is a known bug in 5.4 that off hand I don't think is causing the
> problem you're seeing, but to avoid possibly strange space usage
> reporting you can optionally use metadata_ratio=1 mount option.
>
>
>
>>> root@sopa:~#   btrfs fi show
>>> Label: 'SOPADATA'  uuid: 37b8a62c-68e8-44e4-a3b2-eb572385c3e8
>>>     Total devices 2 FS bytes used 1.04TiB
>>>     devid    1 size 1.86TiB used 1.86TiB path /dev/mapper/sopa-data
>>>     devid    2 size 1.86TiB used 0.00B path /dev/sdb3
>>>
>>> root@sopa:~# btrfs subvolume list /data
>>> ID 1021 gen 7564583 top level 5 path nfs
>>> ID 1022 gen 7564590 top level 5 path motion
>>> root@sopa:~#   btrfs fi df /data
>>> Data, single: total=1.84TiB, used=1.04TiB
>>> System, DUP: total=8.00MiB, used=224.00KiB
>>> System, single: total=4.00MiB, used=0.00B
>>> Metadata, DUP: total=6.50GiB, used=2.99GiB
>>> Metadata, single: total=8.00MiB, used=0.00B
>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>>
>> is it normal that  it have written nearly 5TB of data to the original
>> disk ??:
>
> What I do see is there's not yet a single raid1 block group. That is
> strange. For sure it's stuck.
>
>
>
>>> root@sopa:/var/log# ps ax | grep balance
>>> 16014 ?        D    21114928:30 btrfs balance start -dconvert raid1
>>> -mconvert raid1 /data/
>>> 30196 ?        D      0:00 btrfs balance cancel /data
>>> root@sopa:/var/log# cat /proc/16014/io | grep bytes
>>> read_bytes: 1150357504
>>> write_bytes: 5812039966720
> No idea.
>
>
>>> root@sopa:/sys/block# cat  /sys/block/sdb/sdb3/stat
>>>      404        0    39352      956  4999199     1016 40001720
>>> 71701953        0 14622628 67496136        0        0        0        0
>>> [520398.089952] btrfs(16014): WRITE block 131072 on sdb3 (8 sectors)
>>> [520398.089975] btrfs(16014): WRITE block 536870912 on sdb3 (8 sectors)
>>> [520398.089995] btrfs(16014): WRITE block 128 on dm-0 (8 sectors)
>>> [520398.090021] btrfs(16014): WRITE block 131072 on dm-0 (8 sectors)
>>> [520398.090040] btrfs(16014): WRITE block 536870912 on dm-0 (8 sectors)
>>> [520398.154382] btrfs(16014): WRITE block 14629168 on dm-0 (512 sectors)
>>> [520398.155017] btrfs(16014): WRITE block 17748832 on dm-0 (512 sectors)
>>> [520398.155545] btrfs(16014): WRITE block 17909352 on dm-0 (512 sectors)
>>> [520398.156091] btrfs(16014): WRITE block 20534680 on dm-0 (512 sectors)
>>>
>> regards
>>
>> jn
>>
>>
>
> It's really confused, I'm not sure why from the available information.
>



* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-14 17:41   ` jakub nantl
@ 2020-01-14 20:09     ` Chris Murphy
  2020-01-14 20:53       ` jn
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2020-01-14 20:09 UTC (permalink / raw)
  To: jakub nantl; +Cc: Btrfs BTRFS, Qu Wenruo

On Tue, Jan 14, 2020 at 10:41 AM jakub nantl <jn@forever.cz> wrote:
>
> hello,
>
> thank for reply, here is the call trace, no need to reboot it, so I am
> waiting :)
>
> [538847.101197] sysrq: Show Blocked State
> [538847.101206]   task                        PC stack   pid father
> [538847.101321] btrfs           D    0 16014      1 0x00004004
> [538847.101324] Call Trace:
> [538847.101335]  __schedule+0x2e3/0x740
> [538847.101339]  ? __switch_to_asm+0x40/0x70
> [538847.101342]  ? __switch_to_asm+0x34/0x70
> [538847.101345]  schedule+0x42/0xb0
> [538847.101348]  schedule_timeout+0x203/0x2f0
> [538847.101351]  ? __schedule+0x2eb/0x740
> [538847.101355]  io_schedule_timeout+0x1e/0x50
> [538847.101358]  wait_for_completion_io+0xb1/0x120
> [538847.101363]  ? wake_up_q+0x70/0x70
> [538847.101401]  write_all_supers+0x896/0x960 [btrfs]
> [538847.101426]  btrfs_commit_transaction+0x6ea/0x960 [btrfs]
> [538847.101456]  prepare_to_merge+0x210/0x250 [btrfs]
> [538847.101484]  relocate_block_group+0x36b/0x5f0 [btrfs]
> [538847.101512]  btrfs_relocate_block_group+0x15e/0x300 [btrfs]
> [538847.101539]  btrfs_relocate_chunk+0x2a/0x90 [btrfs]
> [538847.101566]  __btrfs_balance+0x409/0xa50 [btrfs]
> [538847.101593]  btrfs_balance+0x3ae/0x530 [btrfs]
> [538847.101621]  btrfs_ioctl_balance+0x2c1/0x380 [btrfs]
> [538847.101648]  btrfs_ioctl+0x836/0x20d0 [btrfs]
> [538847.101652]  ? do_anonymous_page+0x2e6/0x650
> [538847.101656]  ? __handle_mm_fault+0x760/0x7a0
> [538847.101662]  do_vfs_ioctl+0x407/0x670
> [538847.101664]  ? do_vfs_ioctl+0x407/0x670
> [538847.101669]  ? do_user_addr_fault+0x216/0x450
> [538847.101672]  ksys_ioctl+0x67/0x90
> [538847.101675]  __x64_sys_ioctl+0x1a/0x20
> [538847.101680]  do_syscall_64+0x57/0x190
> [538847.101683]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [538847.101687] RIP: 0033:0x7f3cb04c85d7
> [538847.101695] Code: Bad RIP value.
> [538847.101697] RSP: 002b:00007ffcd4e5fe88 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [538847.101701] RAX: ffffffffffffffda RBX: 0000000000000001 RCX:
> 00007f3cb04c85d7
> [538847.101704] RDX: 00007ffcd4e5ff18 RSI: 00000000c4009420 RDI:
> 0000000000000003
> [538847.101707] RBP: 00007ffcd4e5ff18 R08: 0000000000000078 R09:
> 0000000000000000
> [538847.101710] R10: 0000559f27675010 R11: 0000000000000246 R12:
> 0000000000000003
> [538847.101713] R13: 00007ffcd4e62734 R14: 0000000000000001 R15:
> 0000000000000000
> [538847.101718] btrfs           D    0 30196      1 0x00000004
> [538847.101720] Call Trace:
> [538847.101724]  __schedule+0x2e3/0x740
> [538847.101727]  schedule+0x42/0xb0
> [538847.101753]  btrfs_cancel_balance+0xf8/0x170 [btrfs]
> [538847.101759]  ? wait_woken+0x80/0x80
> [538847.101786]  btrfs_ioctl+0x13af/0x20d0 [btrfs]
> [538847.101789]  ? do_anonymous_page+0x2e6/0x650
> [538847.101793]  ? __handle_mm_fault+0x760/0x7a0
> [538847.101797]  do_vfs_ioctl+0x407/0x670
> [538847.101800]  ? do_vfs_ioctl+0x407/0x670
> [538847.101803]  ? do_user_addr_fault+0x216/0x450
> [538847.101806]  ksys_ioctl+0x67/0x90
> [538847.101809]  __x64_sys_ioctl+0x1a/0x20
> [538847.101813]  do_syscall_64+0x57/0x190
> [538847.101856]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [538847.101859] RIP: 0033:0x7fa33680c5d7
> [538847.101864] Code: Bad RIP value.
> [538847.101873] RSP: 002b:00007ffdbe2b9c58 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [538847.101888] RAX: ffffffffffffffda RBX: 0000000000000003 RCX:
> 00007fa33680c5d7
> [538847.101897] RDX: 0000000000000002 RSI: 0000000040049421 RDI:
> 0000000000000003
> [538847.101908] RBP: 00007ffdbe2ba1d8 R08: 0000000000000078 R09:
> 0000000000000000
> [538847.101918] R10: 00005604500f4010 R11: 0000000000000246 R12:
> 00007ffdbe2ba735
> [538847.101928] R13: 00007ffdbe2ba1c0 R14: 0000000000000000 R15:
> 0000000000000000
>

I think it got clipped. And also the MUA is wrapping it and making it
hard to read. I suggest 'journalctl -k -o short-monotonic' because
what started the problem might actually be much earlier, and there's
no way to know that without the entire thing. Put that up on Dropbox
or a pastebin or Google Drive or equivalent. And hopefully a dev will
be able to figure out why it's hung up. All I can tell from the above
is that it's hung up on cancelling, which doesn't say much.

__handle_mm_fault is suspicious. On second thought, I suggest doing
sysrq+t. Then output journalctl -k and post that. It'll have the
complete dmesg, the sysrq+w, and +t. That for sure won't post to the
list, it'll be too long, and the way MUAs wrap it, it's hard to read.
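
Roughly, assuming sysrq is still enabled from the earlier step, that
would be something like:

echo t > /proc/sysrq-trigger                     ## dump the state of every task
journalctl -k -o short-monotonic > journal.txt   ## full kernel log incl. the +w and +t dumps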

-- 
Chris Murphy


* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-14 20:09     ` Chris Murphy
@ 2020-01-14 20:53       ` jn
  2020-01-15  2:23         ` Chris Murphy
  2020-01-15  3:15         ` Remi Gauvin
  0 siblings, 2 replies; 11+ messages in thread
From: jn @ 2020-01-14 20:53 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, Qu Wenruo


Hello Chris,

Here it is: http://www.uschovna.cz/en/zasilka/TWGCD2G39WBH9G8N-XFE
(since I have limited journald space, the beginning of the events was
already overwritten, so it's assembled from syslog files)

jn


current device & fs usage:

root@sopa:~# btrfs dev usa /data
/dev/mapper/sopa-data, ID: 1
   Device size:             1.86TiB
   Device slack:              0.00B
   Data,single:             1.84TiB
   Metadata,single:         8.00MiB
   Metadata,DUP:           13.00GiB
   System,single:           4.00MiB
   System,DUP:             16.00MiB
   Unallocated:               0.00B

/dev/sdb3, ID: 2
   Device size:             1.86TiB
   Device slack:              0.00B
   Unallocated:             1.86TiB

root@sopa:~# btrfs fi usa /data
Overall:
    Device size:           3.71TiB
    Device allocated:           1.86TiB
    Device unallocated:           1.86TiB
    Device missing:             0.00B
    Used:               1.05TiB
    Free (estimated):           2.65TiB    (min: 1.73TiB)
    Data ratio:                  1.00
    Metadata ratio:              2.00
    Global reserve:         512.00MiB    (used: 0.00B)

Data,single: Size:1.84TiB, Used:1.04TiB
   /dev/mapper/sopa-data       1.84TiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/mapper/sopa-data       8.00MiB

Metadata,DUP: Size:6.50GiB, Used:3.00GiB
   /dev/mapper/sopa-data      13.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/mapper/sopa-data       4.00MiB

System,DUP: Size:8.00MiB, Used:224.00KiB
   /dev/mapper/sopa-data      16.00MiB

Unallocated:
   /dev/mapper/sopa-data         0.00B
   /dev/sdb3       1.86TiB


On 14. 01. 20 21:09, Chris Murphy wrote:
> I think it got clipped. And also the MUA is wrapping it and making it
> hard to read. I suggest 'journalctl -k -o short-monotonic' because
> what started the problem might actually be much earlier and there's no
> way to know that without the entire thing. Put that up in a dropbox or
> pastebin or google drive or equivalent. And hopefully a dev will be
> able to figure out why it's hung up. All I can tell from the above is
> that it's hung up on cancelling, which doesn't say much.
>
> _handle_mm_fault is suspicious. On second thought, I suggest doing
> sysrq+t. And then output journalctl -k, and post that. It'll have the
> complete dmesg, the sysrq+w, and +t. That for sure won't post to the
> list, it'll be too long, and the way MUA's wrap it, it's hard to read.
>
> -- Chris Murphy




* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-14 20:53       ` jn
@ 2020-01-15  2:23         ` Chris Murphy
  2020-01-15 11:28           ` jakub nantl
  2020-01-15  3:15         ` Remi Gauvin
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2020-01-15  2:23 UTC (permalink / raw)
  To: jn; +Cc: Chris Murphy, Btrfs BTRFS, Qu Wenruo

Jan  8 12:28:54 sopa kernel: [   13.552491] BTRFS info (device dm-0):
disk space caching is enabled
Jan  8 12:28:54 sopa kernel: [   13.803006] BTRFS info (device dm-0):
bdev /dev/mapper/sopa-data errs: wr 420, rd 44, flush 0, corrupt 0,
gen 0

Any idea what these are and if they're recent or old? This is the
original device with all the data and there are read and write errors.
Write errors mean writes have been dropped. It's usually an indicator
of a device problem.
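
Those are Btrfs's persistent per-device error counters; they can be
inspected at any time with:

btrfs device stats /data   ## write/read/flush/corruption/generation error counts

They only ever grow until explicitly reset ('btrfs device stats -z'),
so they may well be left over from an old incident.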

Jan  8 12:59:40 sopa kernel: [  229.560937] Alternate GPT is invalid,
using primary GPT.
Jan  8 12:59:40 sopa kernel: [  229.560954]  sdb: sdb1 sdb2 sdb3

This suggests something overwrote the backup GPT on /dev/sdb. This
probably isn't related to the problem you're having, because it comes
before you added /dev/sdb3 to the Btrfs volume, and it doesn't look
like Btrfs had written anything to /dev/sdb3 at that point. But
whatever did this should be tracked down, and in the meantime the
backup GPT should be fixed. Most anything will fix it these days; I
tend to use gdisk.
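
Roughly, and please double-check what it reports before writing
anything, something like:

gdisk /dev/sdb   ## it should warn about the damaged backup GPT and load the primary;
                 ## writing with 'w' then rewrites both the primary and backup tables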


Jan 10 08:16:12 sopa kernel: [156022.305629] BTRFS info (device dm-0):
relocating block group 2051006267392 flags data
Jan 10 08:16:23 sopa kernel: [156033.373144] BTRFS info (device dm-0):
found 75 extents


The last line repeats hundreds of times, over the course of hours.
Even if it's highly fragmented, I wouldn't expect it to take this long
to read 75 extents. The differential is reasonable in some cases if
these are small 4K files, but in others it's suspicious:

Jan 10 12:10:14 sopa kernel: [170063.613418] BTRFS info (device dm-0):
found 75 extents
Jan 10 12:10:44 sopa kernel: [170093.813300] BTRFS info (device dm-0):
found 75 extents

But no errors, for many hours.

I can't really make much out of the task list, whether Btrfs is the
cause or a victim.


---
Chris Murphy


* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-14 20:53       ` jn
  2020-01-15  2:23         ` Chris Murphy
@ 2020-01-15  3:15         ` Remi Gauvin
  1 sibling, 0 replies; 11+ messages in thread
From: Remi Gauvin @ 2020-01-15  3:15 UTC (permalink / raw)
  To: jn; +Cc: Btrfs BTRFS, Qu Wenruo



On 2020-01-14 3:53 p.m., jn wrote:

> 
> Unallocated:
>    /dev/mapper/sopa-data         0.00B
>    /dev/sdb3       1.86TiB
> 
> 


With only 2 devices, and one of them completely out of unallocated
space, I would expect it is impossible to allocate any mirrored block
groups (the system needs unallocated space on 2 devices). If it's
possible, you might have to do some filtered regular balances (i.e. no
convert) to get some unallocated space back on sopa-data, as sketched
below. If there are large chunks of data you can copy off and delete,
that might help as well (completely empty allocated space should be
de-allocated).
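
A minimal sketch, with the usage filter value only as a starting point
to adjust:

btrfs balance start -dusage=20 /data   ## only rewrite data chunks that are <=20% full
btrfs fi usage /data                   ## check whether sopa-data regained unallocated space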


I believe Chris also already pointed out some concerning errors that
might hint at unhealthy underlying hardware as well.





* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-14 13:16 slow single -> raid1 conversion (heavy write to original LVM volume) jn
  2020-01-14 17:18 ` Chris Murphy
@ 2020-01-15  3:26 ` Qu Wenruo
  2020-04-14  7:09   ` balance canceling was: " jakub nantl
  1 sibling, 1 reply; 11+ messages in thread
From: Qu Wenruo @ 2020-01-15  3:26 UTC (permalink / raw)
  To: jn, linux-btrfs





On 2020/1/14 9:16 PM, jn wrote:
> Hello,
> 
> I am experiencing very slow conversion from single disk BTRFS to raid1
> balanced (new disk was added):
> 
> what I have done:
> 
> I have added new disk to nearly full (cca 85%) BTRFS filesystem on LVM
> volume with intention to convert it into raid1:
> 
> btrfs balance start -dconvert raid1 -mconvert raid1 /data/
> 
>> Jan 10 08:14:04 sopa kernel: [155893.485617] BTRFS info (device dm-0):
>> disk added /dev/sdb3
>> Jan 10 08:15:06 sopa kernel: [155955.958561] BTRFS info (device dm-0):
>> relocating block group 2078923554816 flags data
>> Jan 10 08:15:07 sopa kernel: [155956.991293] BTRFS info (device dm-0):
>> relocating block group 2077849812992 flags data
>> Jan 10 08:15:10 sopa kernel: [155960.357846] BTRFS info (device dm-0):
>> relocating block group 2076776071168 flags data
>> Jan 10 08:15:13 sopa kernel: [155962.772534] BTRFS info (device dm-0):
>> relocating block group 2075702329344 flags data
>> Jan 10 08:15:14 sopa kernel: [155964.195237] BTRFS info (device dm-0):
>> relocating block group 2074628587520 flags data
>> Jan 10 08:15:45 sopa kernel: [155994.546695] BTRFS info (device dm-0):
>> relocating block group 2062817427456 flags data
>> Jan 10 08:15:52 sopa kernel: [156001.952247] BTRFS info (device dm-0):
>> relocating block group 2059596201984 flags data
>> Jan 10 08:15:58 sopa kernel: [156007.787071] BTRFS info (device dm-0):
>> relocating block group 2057448718336 flags data
>> Jan 10 08:16:00 sopa kernel: [156010.094565] BTRFS info (device dm-0):
>> relocating block group 2056374976512 flags data
>> Jan 10 08:16:06 sopa kernel: [156015.585343] BTRFS info (device dm-0):
>> relocating block group 2054227492864 flags data
>> Jan 10 08:16:12 sopa kernel: [156022.305629] BTRFS info (device dm-0):
>> relocating block group 2051006267392 flags data
>> Jan 10 08:16:23 sopa kernel: [156033.373144] BTRFS info (device dm-0):
>> found 75 extents
>> Jan 10 08:16:29 sopa kernel: [156038.666672] BTRFS info (device dm-0):
>> found 75 extents
>> Jan 10 08:16:36 sopa kernel: [156045.909270] BTRFS info (device dm-0):
>> found 75 extents
>> Jan 10 08:16:42 sopa kernel: [156052.292789] BTRFS info (device dm-0):
>> found 75 extents
>> Jan 10 08:16:46 sopa kernel: [156055.643452] BTRFS info (device dm-0):
>> found 75 extents
>> Jan 10 08:16:54 sopa kernel: [156063.608344] BTRFS info (device dm-0):
>> found 75 extents

This repeating output looks like a bug in recent kernels.

And due to another bug in balance canceling, we can't cancel it.

For the unable-to-cancel part, there are patches to address it, and
they should land upstream in the v5.6 release.

For the repeating-line bug, I am still investigating. The ETA would be
5.7 if we're lucky.

Thanks,
Qu

> after 6hours of processing with 0% progress reported by balance status,
> I decided to cancel it to empty more space and rerun balance with some
> filters:
> 
> btrfs balance cancel /data
> 
>> Jan 10 14:38:11 sopa kernel: [178941.189217] BTRFS info (device dm-0):
>> found 68 extents
>> Jan 10 14:38:14 sopa kernel: [178943.619787] BTRFS info (device dm-0):
>> found 68 extents
>> Jan 10 14:38:20 sopa kernel: [178950.275334] BTRFS info (device dm-0):
>> found 68 extents
>> Jan 10 14:38:24 sopa kernel: [178954.018770] INFO: task btrfs:30196
>> blocked for more than 845 seconds.
>> Jan 10 14:38:24 sopa kernel: [178954.018844] 
>> btrfs_cancel_balance+0xf8/0x170 [btrfs]
>> Jan 10 14:38:24 sopa kernel: [178954.018878] 
>> btrfs_ioctl+0x13af/0x20d0 [btrfs]
>> Jan 10 14:38:28 sopa kernel: [178957.999108] BTRFS info (device dm-0):
>> found 68 extents
>> Jan 10 14:38:29 sopa kernel: [178958.837674] BTRFS info (device dm-0):
>> found 68 extents
>> Jan 10 14:38:30 sopa kernel: [178959.835118] BTRFS info (device dm-0):
>> found 68 extents
>> Jan 10 14:38:31 sopa kernel: [178960.915305] BTRFS info (device dm-0):
>> found 68 extents
>> Jan 10 14:40:25 sopa kernel: [179074.851376] 
>> btrfs_cancel_balance+0xf8/0x170 [btrfs]
>> Jan 10 14:40:25 sopa kernel: [179074.851408] 
>> btrfs_ioctl+0x13af/0x20d0 [btrfs]
> 
> now nearly 4 days later (and after some data deleted) both balance start
> and balance cancel processes are still running and system reports:
> 
>> root@sopa:/var/log# btrfs balance status /data/
>> Balance on '/data/' is running, cancel requested
>> 0 out of about 1900 chunks balanced (29 considered), 100% left
> 
>> root@sopa:~# uname -a
>> Linux sopa 5.4.8-050408-generic #202001041436 SMP Sat Jan 4 19:40:55
>> UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>>
>> root@sopa:~#   btrfs --version
>> btrfs-progs v4.15.1
>>
>> root@sopa:~#   btrfs fi show
>> Label: 'SOPADATA'  uuid: 37b8a62c-68e8-44e4-a3b2-eb572385c3e8
>>     Total devices 2 FS bytes used 1.04TiB
>>     devid    1 size 1.86TiB used 1.86TiB path /dev/mapper/sopa-data
>>     devid    2 size 1.86TiB used 0.00B path /dev/sdb3
>>
>> root@sopa:~# btrfs subvolume list /data
>> ID 1021 gen 7564583 top level 5 path nfs
>> ID 1022 gen 7564590 top level 5 path motion
> 
>> root@sopa:~#   btrfs fi df /data
>> Data, single: total=1.84TiB, used=1.04TiB
>> System, DUP: total=8.00MiB, used=224.00KiB
>> System, single: total=4.00MiB, used=0.00B
>> Metadata, DUP: total=6.50GiB, used=2.99GiB
>> Metadata, single: total=8.00MiB, used=0.00B
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
> is it normal that  it have written nearly 5TB of data to the original
> disk ??:
> 
>> root@sopa:/var/log# ps ax | grep balance
>> 16014 ?        D    21114928:30 btrfs balance start -dconvert raid1
>> -mconvert raid1 /data/
>> 30196 ?        D      0:00 btrfs balance cancel /data
> 
>> root@sopa:/var/log# cat /proc/16014/io | grep bytes
>> read_bytes: 1150357504
>> write_bytes: 5812039966720
>> root@sopa:/sys/block# cat  /sys/block/sdb/sdb3/stat
>>      404        0    39352      956  4999199     1016 40001720
>> 71701953        0 14622628 67496136        0        0        0        0
> 
>> [520398.089952] btrfs(16014): WRITE block 131072 on sdb3 (8 sectors)
>> [520398.089975] btrfs(16014): WRITE block 536870912 on sdb3 (8 sectors)
>> [520398.089995] btrfs(16014): WRITE block 128 on dm-0 (8 sectors)
>> [520398.090021] btrfs(16014): WRITE block 131072 on dm-0 (8 sectors)
>> [520398.090040] btrfs(16014): WRITE block 536870912 on dm-0 (8 sectors)
>> [520398.154382] btrfs(16014): WRITE block 14629168 on dm-0 (512 sectors)
>> [520398.155017] btrfs(16014): WRITE block 17748832 on dm-0 (512 sectors)
>> [520398.155545] btrfs(16014): WRITE block 17909352 on dm-0 (512 sectors)
>> [520398.156091] btrfs(16014): WRITE block 20534680 on dm-0 (512 sectors)
>>
> regards
> 
> jn
> 
> 




* Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-15  2:23         ` Chris Murphy
@ 2020-01-15 11:28           ` jakub nantl
  0 siblings, 0 replies; 11+ messages in thread
From: jakub nantl @ 2020-01-15 11:28 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, Qu Wenruo

On 15. 01. 20 3:23, Chris Murphy wrote:
> Jan  8 12:28:54 sopa kernel: [   13.552491] BTRFS info (device dm-0):
> disk space caching is enabled
> Jan  8 12:28:54 sopa kernel: [   13.803006] BTRFS info (device dm-0):
> bdev /dev/mapper/sopa-data errs: wr 420, rd 44, flush 0, corrupt 0,
> gen 0

These errors must be old; there are no signs of a HW problem with the
original disk. See the smartctl output: https://pastebin.com/PgM84cGC

> Any idea what these are and if they're recent or old? This is the
> original device with all the data and there are read and write errors.
> Write errors mean writes have been dropped. It's usually an indicator
> of a device problem.
>
> Jan  8 12:59:40 sopa kernel: [  229.560937] Alternate GPT is invalid,
> using primary GPT.
> Jan  8 12:59:40 sopa kernel: [  229.560954]  sdb: sdb1 sdb2 sdb3

Not a problem; I copied the MBR to the new disk (with dd) and fixed it
immediately with an fdisk write...

Is it still worth waiting for something to happen, or should I reboot?

jn



* balance canceling was: Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-01-15  3:26 ` Qu Wenruo
@ 2020-04-14  7:09   ` jakub nantl
  2020-04-14  9:07     ` Qu Wenruo
  0 siblings, 1 reply; 11+ messages in thread
From: jakub nantl @ 2020-04-14  7:09 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On 15. 01. 20 4:26,  Qu Wenruo wrote:
> And due to another bug in balance canceling, we can't cancel it.
>
> For the unable to cancel part, there are patches to address it, and
> would get upstream in v5.6 release.

Hello,

Looks like balance canceling is still not 100% fixed in 5.6.x:


# btrfs balance status /data/
Balance on '/data/' is running, cancel requested
0 out of about 16 chunks balanced (9 considered), 100% left


Apr 13 23:30:52 sopa kernel: [ 6983.625318] BTRFS info (device dm-0):
balance: start -dusage=100,limit=16
Apr 13 23:30:52 sopa kernel: [ 6983.627286] BTRFS info (device dm-0):
relocating block group 5572244013056 flags data|raid1
Apr 13 23:31:05 sopa kernel: [ 6996.237814] BTRFS info (device dm-0):
relocating block group 5569073119232 flags data|raid1
Apr 13 23:31:40 sopa kernel: [ 7032.178175] BTRFS info (device dm-0):
found 17 extents, stage: move data extents
Apr 13 23:31:46 sopa kernel: [ 7037.711119] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers
Apr 13 23:31:49 sopa kernel: [ 7040.767052] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers
Apr 13 23:32:00 sopa kernel: [ 7051.885977] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers
..

..

Apr 14 06:26:06 sopa kernel: [31897.468487] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers
Apr 14 06:26:08 sopa kernel: [31900.034563] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers
Apr 14 06:26:10 sopa kernel: [31901.719655] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers
Apr 14 06:26:12 sopa kernel: [31903.334506] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers
Apr 14 06:26:12 sopa kernel: [31903.856791] BTRFS info (device dm-0):
found 17 extents, stage: update data pointers


Linux sopa 5.6.4-050604-generic #202004131234 SMP Mon Apr 13 12:36:46
UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

# btrfs fi us /data/
Overall:
    Device size:           3.71TiB
    Device allocated:           3.71TiB
    Device unallocated:           2.00MiB
    Device missing:             0.00B
    Used:               3.30TiB
    Free (estimated):         209.16GiB    (min: 209.16GiB)
    Data ratio:                  2.00
    Metadata ratio:              2.00
    Global reserve:         512.00MiB    (used: 0.00B)

Data,RAID1: Size:1.85TiB, Used:1.64TiB
   /dev/mapper/sopa-data       1.85TiB
   /dev/sdb3       1.85TiB

Metadata,RAID1: Size:6.11GiB, Used:4.77GiB
   /dev/mapper/sopa-data       6.11GiB
   /dev/sdb3       6.11GiB

System,RAID1: Size:32.00MiB, Used:304.00KiB
   /dev/mapper/sopa-data      32.00MiB
   /dev/sdb3      32.00MiB

Unallocated:
   /dev/mapper/sopa-data       1.00MiB
   /dev/sdb3       1.00MiB

# btrfs fi show /data/
Label: 'SOPADATA'  uuid: 37b8a62c-68e8-44e4-a3b2-eb572385c3e8
    Total devices 2 FS bytes used 1.65TiB
    devid    1 size 1.86TiB used 1.85TiB path /dev/mapper/sopa-data
    devid    2 size 1.86TiB used 1.85TiB path /dev/sdb3

jn



* Re: balance canceling was: Re: slow single -> raid1 conversion (heavy write to original LVM volume)
  2020-04-14  7:09   ` balance canceling was: " jakub nantl
@ 2020-04-14  9:07     ` Qu Wenruo
  0 siblings, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2020-04-14  9:07 UTC (permalink / raw)
  To: jakub nantl, linux-btrfs





On 2020/4/14 3:09 PM, jakub nantl wrote:
> On 15. 01. 20 4:26,  Qu Wenruo wrote:
>> And due to another bug in balance canceling, we can't cancel it.
>>
>> For the unable to cancel part, there are patches to address it, and
>> would get upstream in v5.6 release.
> 
> Hello,
> 
> looks like balance canceling is still not 100% in 5.6.x:
> 
> 
> # btrfs balance status /data/
> Balance on '/data/' is running, cancel requested
> 0 out of about 16 chunks balanced (9 considered), 100% left
> 
> 
> Apr 13 23:30:52 sopa kernel: [ 6983.625318] BTRFS info (device dm-0):
> balance: start -dusage=100,limit=16
> Apr 13 23:30:52 sopa kernel: [ 6983.627286] BTRFS info (device dm-0):
> relocating block group 5572244013056 flags data|raid1
> Apr 13 23:31:05 sopa kernel: [ 6996.237814] BTRFS info (device dm-0):
> relocating block group 5569073119232 flags data|raid1
> Apr 13 23:31:40 sopa kernel: [ 7032.178175] BTRFS info (device dm-0):
> found 17 extents, stage: move data extents
> Apr 13 23:31:46 sopa kernel: [ 7037.711119] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> Apr 13 23:31:49 sopa kernel: [ 7040.767052] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> Apr 13 23:32:00 sopa kernel: [ 7051.885977] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> ..
> 
> ..
> 
> Apr 14 06:26:06 sopa kernel: [31897.468487] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> Apr 14 06:26:08 sopa kernel: [31900.034563] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> Apr 14 06:26:10 sopa kernel: [31901.719655] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> Apr 14 06:26:12 sopa kernel: [31903.334506] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> Apr 14 06:26:12 sopa kernel: [31903.856791] BTRFS info (device dm-0):
> found 17 extents, stage: update data pointers
> 

Thanks for the report; this means one of my original patches is still
needed.

That patch failed to pass review because I couldn't explain under which
circumstances it could happen. Now it has been proven that we still
need it.

Let me check under which condition this happened.

Thanks,
Qu
> 
> Linux sopa 5.6.4-050604-generic #202004131234 SMP Mon Apr 13 12:36:46
> UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> 
> # btrfs fi us /data/
> Overall:
>     Device size:           3.71TiB
>     Device allocated:           3.71TiB
>     Device unallocated:           2.00MiB
>     Device missing:             0.00B
>     Used:               3.30TiB
>     Free (estimated):         209.16GiB    (min: 209.16GiB)
>     Data ratio:                  2.00
>     Metadata ratio:              2.00
>     Global reserve:         512.00MiB    (used: 0.00B)
> 
> Data,RAID1: Size:1.85TiB, Used:1.64TiB
>    /dev/mapper/sopa-data       1.85TiB
>    /dev/sdb3       1.85TiB
> 
> Metadata,RAID1: Size:6.11GiB, Used:4.77GiB
>    /dev/mapper/sopa-data       6.11GiB
>    /dev/sdb3       6.11GiB
> 
> System,RAID1: Size:32.00MiB, Used:304.00KiB
>    /dev/mapper/sopa-data      32.00MiB
>    /dev/sdb3      32.00MiB
> 
> Unallocated:
>    /dev/mapper/sopa-data       1.00MiB
>    /dev/sdb3       1.00MiB
> 
> # btrfs fi show /data/
> Label: 'SOPADATA'  uuid: 37b8a62c-68e8-44e4-a3b2-eb572385c3e8
>     Total devices 2 FS bytes used 1.65TiB
>     devid    1 size 1.86TiB used 1.85TiB path /dev/mapper/sopa-data
>     devid    2 size 1.86TiB used 1.85TiB path /dev/sdb3
> 
> jn
> 




Thread overview: 11+ messages
2020-01-14 13:16 slow single -> raid1 conversion (heavy write to original LVM volume) jn
2020-01-14 17:18 ` Chris Murphy
2020-01-14 17:41   ` jakub nantl
2020-01-14 20:09     ` Chris Murphy
2020-01-14 20:53       ` jn
2020-01-15  2:23         ` Chris Murphy
2020-01-15 11:28           ` jakub nantl
2020-01-15  3:15         ` Remi Gauvin
2020-01-15  3:26 ` Qu Wenruo
2020-04-14  7:09   ` balance canceling was: " jakub nantl
2020-04-14  9:07     ` Qu Wenruo
