* how long should "btrfs device delete missing ..." take?
From: Tomasz Chmielewski @ 2014-09-11 15:22 UTC (permalink / raw)
To: linux-btrfs
After a disk died and was replaced, "btrfs device delete missing" is
taking more than 10 days on an otherwise idle server:
# btrfs fi show /home
Label: none uuid: 84d087aa-3a32-46da-844f-a233237cf04f
Total devices 3 FS bytes used 362.44GiB
devid 2 size 1.71TiB used 365.03GiB path /dev/sdb4
devid 3 size 1.71TiB used 58.00GiB path /dev/sda4
*** Some devices missing
Btrfs v3.16
So far, it has copied 58 GB out of 365 GB - and it took 10 days. At this
speed, the whole operation will take 2-3 months (assuming that the only
healthy disk doesn't die in the meantime).
Is this expected time for btrfs RAID-1?
There are no errors in dmesg/smart, performance of both disks is fine:
# hdparm -t /dev/sda /dev/sdb
/dev/sda:
Timing buffered disk reads: 442 MB in 3.01 seconds = 146.99 MB/sec
/dev/sdb:
Timing buffered disk reads: 402 MB in 3.39 seconds = 118.47 MB/sec
# btrfs fi df /home
Data, RAID1: total=352.00GiB, used=351.02GiB
System, RAID1: total=32.00MiB, used=96.00KiB
Metadata, RAID1: total=13.00GiB, used=11.38GiB
unknown, single: total=512.00MiB, used=67.05MiB
# btrfs sub list /home | wc -l
260
# uptime
17:21:53 up 10 days, 6:01, 2 users, load average: 3.22, 3.53, 3.55
I've tried running this on the latest 3.16.x kernel earlier, but since
the progress was so slow, rebooted after about a week to see if the
latest RC will be any faster.
--
Tomasz Chmielewski
http://www.sslrack.com
* Re: how long should "btrfs device delete missing ..." take?
From: Duncan @ 2014-09-11 19:31 UTC (permalink / raw)
To: linux-btrfs
Tomasz Chmielewski posted on Thu, 11 Sep 2014 17:22:15 +0200 as excerpted:
> After a disk died and was replaced, "btrfs device delete missing" is
> taking more than 10 days on an otherwise idle server:
>
> # btrfs fi show /home
> Label: none uuid: 84d087aa-3a32-46da-844f-a233237cf04f
> Total devices 3 FS bytes used 362.44GiB
> devid 2 size 1.71TiB used 365.03GiB path /dev/sdb4
> devid 3 size 1.71TiB used 58.00GiB path /dev/sda4
> *** Some devices missing
>
> Btrfs v3.16
>
> So far, it has copied 58 GB out of 365 GB - and it took 10 days. At this
> speed, the whole operation will take 2-3 months (assuming that the only
> healthy disk doesn't die in the meantime).
> Is this expected time for btrfs RAID-1?
Device delete definitely takes time. For the sub-TiB usage shown above,
10 days for 58 GiB out of 365 does seem excessive, but there are extreme
cases where it isn't entirely out of line. See below.
> There are no errors in dmesg/smart, performance of both disks is fine:
> # btrfs sub list /home | wc -l
> 260
> I've tried running this on the latest 3.16.x kernel earlier, but since
> the progress was so slow, rebooted after about a week to see if the
> latest RC will be any faster.
The good thing is that once a block group is copied over, it should be
fine and won't need to be re-copied if the process is stopped over a
reboot and restarted on a new kernel, etc.
The bad thing is that if I'm interpreting your report correctly, that
likely means 7+10=17 days for that 58 gig. =:^(
Questions:
* Presumably most of those 260 subvolumes are snapshots, correct?
What was your snapshotting schedule and did you have old snapshot
cleanup-deletion scheduled as well?
* Do you run with autodefrag or was the system otherwise regularly
defragged?
* Do you have large (GiB plus) database or virtual machine image files
on that filesystem? If so, had you properly set the NOCOW file
attribute (chattr +C) on them and were they on dedicated subvolumes?
200+ snapshots is somewhat high and could be part of the issue, tho
it's nothing like the extremes (thousands) we've seen posted in the
past. Were it me, I'd have tried deleting as many as possible before
the device delete missing, in order to simplify the process and
eliminate as much "extra" data as possible.
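Something along these lines is what I have in mind -- the snapshot
paths are purely illustrative, substitute whatever naming your
snapshotting tooling actually uses:

# btrfs subvolume list /home
# for snap in /home/.snapshots/home.2014*; do btrfs subvolume delete "$snap"; done

(Deleted subvolumes are cleaned up asynchronously by btrfs-cleaner, so
give it a while to chew through them before restarting the device
delete.)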
The real issue is going to be fragmentation, on spinning-rust drives.
Run filefrag on some of your gig-plus files that get written to
frequently (vm-images and database files are the classic cases) and
see how many extents are reported. (Tho note that filefrag doesn't
understand btrfs compression and won't be accurate in that case, and
also that the btrfs data chunk size of 1 GiB caps the extent size, so
an unfragmented multi-gig file will typically report about two extents
more than its number of whole gigs: one finishing off the current
chunk, N whole-gig extents, and one for the file tail.)

The nocow file attribute (which must be set while the file is
zero-sized to be effective, see discussion elsewhere) can help with
that, but snapshotting a nocow file that's being actively rewritten
more or less defeats the purpose of nocow, since the snapshot locks
the existing data in place and the first rewrite of each block must
then be cowed anyway. Putting those files on dedicated subvolumes and
then not snapshotting those subvolumes is a workaround.
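As a concrete example of both checks (the file and directory names
here are only placeholders, and again, +C only takes effect for files
created after it is set):

# filefrag /home/vm/disk0.img
# mkdir /home/vm-nocow
# chattr +C /home/vm-nocow
# cp --reflink=never /home/vm/disk0.img /home/vm-nocow/disk0.img
# lsattr /home/vm-nocow/disk0.img

filefrag prints the extent count; the copy is created as a new
zero-length file inside the +C directory so it inherits nocow, and
lsattr should then show the C attribute on it. But as above, keep such
files out of snapshotted subvolumes or the nocow is largely defeated.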
I wouldn't try defragging now, but it might be worthwhile to stop the
device delete (rebooting to do so since I don't think there's a cancel)
and delete as many snapshots as possible. That should help matters.
Additionally, if you have recent backups of highly fragmented files
such as the VM-images and DBs I mentioned, you might consider simply
deleting them, thus eliminating that fragment processing from the
device delete. I don't know that making a backup now would go much
faster than the device delete, however, so I don't know whether to
recommend that or not.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: how long should "btrfs device delete missing ..." take?
From: Chris Murphy @ 2014-09-11 21:25 UTC (permalink / raw)
To: Btrfs BTRFS
On Sep 11, 2014, at 1:31 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>
> I wouldn't try defragging now, but it might be worthwhile to stop the
> device delete (rebooting to do so since I don't think there's a cancel)
'btrfs replace cancel' does exist, although I haven't tried it.
Something isn't right though, because it's clearly neither reading nor writing at anywhere close to 1/2 the drive read throughput. I'm curious what 'iotop -d30 -o' shows (during the replace, before cancel), which should be pretty consistent by averaging 30 seconds worth of io. And then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet there's a lot of nothing going on, with occasional spikes, rather than a constant trickle.
And then the question is to find out what btrfs is thinking about while nothing is reading or writing. Even though it's not 5000+ snapshots, I wonder if the balance code (and hence btrfs replace) makes extensive use of fiemap that's causing this to go catatonic. http://comments.gmane.org/gmane.comp.file-systems.btrfs/35724
Chris Murphy
* Re: how long should "btrfs device delete missing ..." take?
From: Tomasz Chmielewski @ 2014-09-11 23:06 UTC (permalink / raw)
To: linux-btrfs
>> After a disk died and was replaced, "btrfs device delete missing" is
>> taking more than 10 days on an otherwise idle server:
>
> Something isn't right though, because it's clearly neither reading nor
> writing at anywhere close to 1/2 the drive read throughput. I'm curious
> what 'iotop -d30 -o' shows (during the replace, before cancel), which
> should be pretty consistent by averaging 30 seconds worth of io. And
> then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet
> there's a lot of nothing going on, with occasional spikes, rather than
> a constant trickle.
That's more or less what I'm seeing with both. The numbers will go up or
down slightly, but it's counted in kilobytes per second:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 545.82 B/s
  TID  PRIO  USER     DISK READ   DISK WRITE  SWAPIN     IO>    COMMAND
  940  be/3  root      0.00 B/s   136.46 B/s  0.00 %  0.10 %  [jbd2/md2-8]
 4714  be/4  root      0.00 B/s   329.94 K/s  0.00 %  0.00 %  [btrfs-transacti]
25534  be/4  root      0.00 B/s   402.97 K/s  0.00 %  0.00 %  [kworker/u16:0]
The bottleneck may be here - one CPU core is mostly 100% busy (kworker).
Not sure what it's really busy with though:
  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%    TIME+  Command
25546 root       20   0     0     0     0 R 93.0  0.0  18:22.94 kworker/u16:7
14473 root       20   0     0     0     0 S  5.0  0.0  25:00.14 kworker/0:0
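One way to peek at what that kworker is actually busy with (assuming
perf is installed; the PID is the one from the htop output above, and
/proc/<pid>/stack needs root) would be something like:

# cat /proc/25546/stack
# perf top -p 25546

For now, here is what SysRq "show blocked state" reports: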
[912979.063432] SysRq : Show Blocked State
[912979.063485] task PC stack pid father
[912979.063545] btrfs           D ffff88083fa515c0     0  4793   4622 0x00000000
[912979.063601]  ffff88061a29b878 0000000000000086 0000000000000000 ffff88003683e040
[912979.063701]  00000000000115c0 0000000000004000 ffff880813e30000 ffff88003683e040
[912979.063800]  ffff88061a29b7e8 ffffffff8105d8e9 ffff88083fa40000 ffff88083fa115c0
[912979.063899] Call Trace:
[912979.063951]  [<ffffffff8105d8e9>] ? enqueue_task_fair+0x3e5/0x44f
[912979.064006]  [<ffffffff81053484>] ? resched_curr+0x47/0x57
[912979.064058]  [<ffffffff81053aed>] ? check_preempt_curr+0x3e/0x6d
[912979.064111]  [<ffffffff81053b2e>] ? ttwu_do_wakeup+0x12/0x7f
[912979.064164]  [<ffffffff81053c3c>] ? ttwu_do_activate.constprop.74+0x57/0x5c
[912979.064220]  [<ffffffff813acc1e>] schedule+0x65/0x67
[912979.064272]  [<ffffffff813aed0c>] schedule_timeout+0x26/0x198
[912979.064324]  [<ffffffff8105639d>] ? wake_up_process+0x31/0x35
[912979.064378]  [<ffffffff81049baf>] ? wake_up_worker+0x1f/0x21
[912979.064431]  [<ffffffff81049df6>] ? insert_work+0x87/0x94
[912979.064493]  [<ffffffffa02d524b>] ? free_block_list+0x1f/0x34 [btrfs]
[912979.064548]  [<ffffffff813ad443>] wait_for_common+0x10d/0x13e
[912979.064600]  [<ffffffff8105635d>] ? try_to_wake_up+0x251/0x251
[912979.064653]  [<ffffffff813ad48c>] wait_for_completion+0x18/0x1a
[912979.064710]  [<ffffffffa0283a01>] btrfs_async_run_delayed_refs+0xc1/0xe4 [btrfs]
[912979.064814]  [<ffffffffa02983c5>] __btrfs_end_transaction+0x2bb/0x2e1 [btrfs]
[912979.064916]  [<ffffffffa02983f9>] btrfs_end_transaction_throttle+0xe/0x10 [btrfs]
[912979.065020]  [<ffffffffa02d973d>] relocate_block_group+0x2ad/0x4de [btrfs]
[912979.065079]  [<ffffffffa02d9ac6>] btrfs_relocate_block_group+0x158/0x278 [btrfs]
[912979.065184]  [<ffffffffa02b66f0>] btrfs_relocate_chunk.isra.62+0x58/0x5f7 [btrfs]
[912979.065286]  [<ffffffffa02c58d7>] ? btrfs_set_lock_blocking_rw+0x68/0x95 [btrfs]
[912979.065387]  [<ffffffffa0276b04>] ? btrfs_set_path_blocking+0x23/0x54 [btrfs]
[912979.065486]  [<ffffffffa027b517>] ? btrfs_search_slot+0x7bc/0x816 [btrfs]
[912979.065546]  [<ffffffffa02b2bd5>] ? free_extent_buffer+0x6f/0x7c [btrfs]
[912979.065605]  [<ffffffffa02b89e9>] btrfs_shrink_device+0x23c/0x3a5 [btrfs]
[912979.065679]  [<ffffffffa02bb2c7>] btrfs_rm_device+0x2a1/0x759 [btrfs]
[912979.065747]  [<ffffffffa02c3ab3>] btrfs_ioctl+0xa52/0x227f [btrfs]
[912979.065811]  [<ffffffff81107182>] ? putname+0x23/0x2c
[912979.065863]  [<ffffffff8110b3cb>] ? user_path_at_empty+0x60/0x90
[912979.065918]  [<ffffffff81173b1a>] ? avc_has_perm+0x2e/0xf7
[912979.065978]  [<ffffffff810d7ad5>] ? __vm_enough_memory+0x25/0x13c
[912979.066032]  [<ffffffff8110d3c1>] do_vfs_ioctl+0x3f2/0x43c
[912979.066084]  [<ffffffff811026fd>] ? vfs_stat+0x16/0x18
[912979.066136]  [<ffffffff8110d459>] SyS_ioctl+0x4e/0x7d
[912979.066188]  [<ffffffff81030a71>] ? do_page_fault+0xc/0xf
[912979.066240]  [<ffffffff813afd92>] system_call_fastpath+0x16/0x1b
[912979.066296] Sched Debug Version: v0.11, 3.17.0-rc3 #1
[912979.066347] ktime :
913460840.666210
[912979.066401] sched_clk :
912979066.295474
[912979.066454] cpu_clk :
912979066.295485
[912979.066507] jiffies : 4386283381
[912979.066560] sched_clock_stable() : 1
[912979.066610]
[912979.066656] sysctl_sched
[912979.066703] .sysctl_sched_latency : 24.000000
[912979.066756] .sysctl_sched_min_granularity : 3.000000
[912979.066809] .sysctl_sched_wakeup_granularity : 4.000000
[912979.066861] .sysctl_sched_child_runs_first : 0
[912979.066913] .sysctl_sched_features : 11899
[912979.068196] .sysctl_sched_tunable_scaling : 1
(logaritmic)
[912979.068248]
[912979.068294] cpu#0, 3411.379 MHz
[912979.068342] .nr_running : 0
[912979.068393] .load : 0
[912979.068443] .nr_switches : 374480347
[912979.068495] .nr_load_updates : 56296080
[912979.068546] .nr_uninterruptible : -3469678
[912979.068597] .next_balance : 4386.283363
[912979.068649] .curr->pid : 0
[912979.068699] .clock : 912979065.783028
[912979.068751] .cpu_load[0] : 0
[912979.068802] .cpu_load[1] : 0
[912979.068852] .cpu_load[2] : 0
[912979.068902] .cpu_load[3] : 0
[912979.068953] .cpu_load[4] : 0
[912979.069004] .yld_count : 1231295
[912979.069054] .sched_count : 375930185
[912979.069106] .sched_goidle : 150592952
[912979.069157] .avg_idle : 875618
[912979.069208] .max_idle_balance_cost : 500000
[912979.069259] .ttwu_count : 252809952
[912979.069311] .ttwu_local : 170251968
[912979.069363]
[912979.069363] cfs_rq[0]:/autogroup-6
[912979.069456] .exec_clock : 4133517.038969
[912979.069509] .MIN_vruntime : 0.000001
[912979.069560] .min_vruntime : 3861183.891419
[912979.069611] .max_vruntime : 0.000001
[912979.069662] .spread : 0.000000
[912979.069713] .spread0 : -117017929.493938
[912979.069765] .nr_spread_over : 122
[912979.069815] .nr_running : 0
[912979.069865] .load : 0
[912979.069916] .runnable_load_avg : 0
[912979.069966] .blocked_load_avg : 0
[912979.070015] .tg_load_contrib : 0
[912979.070066] .tg_runnable_contrib : 2
[912979.070115] .tg_load_avg : 3
[912979.070166] .tg->runnable_avg : 5
[912979.070216] .tg->cfs_bandwidth.timer_active: 0
[912979.070266] .throttled : 0
[912979.070316] .throttle_count : 0
[912979.070366] .se->exec_start : 912979065.782511
[912979.070418] .se->vruntime : 120879101.503042
[912979.070470] .se->sum_exec_runtime : 4133527.988053
[912979.070522] .se->statistics.wait_start : 0.000000
[912979.070573] .se->statistics.sleep_start : 0.000000
[912979.070624] .se->statistics.block_start : 0.000000
[912979.070675] .se->statistics.sleep_max : 0.000000
[912979.070726] .se->statistics.block_max : 0.000000
[912979.070777] .se->statistics.exec_max : 10.022184
[912979.070828] .se->statistics.slice_max : 11.802172
[912979.070879] .se->statistics.wait_max : 64.131091
[912979.070930] .se->statistics.wait_sum : 61306.855909
[912979.070981] .se->statistics.wait_count : 15944650
[912979.071032] .se->load.weight : 2
[912979.071082] .se->avg.runnable_avg_sum : 111
[912979.071133] .se->avg.runnable_avg_period : 47099
[912979.071183] .se->avg.load_avg_contrib : 0
[912979.071233] .se->avg.decay_count : 870684687
[912979.071285]
[912979.071285] cfs_rq[0]:/autogroup-2671
[912979.071379] .exec_clock : 5.865939
[912979.071430] .MIN_vruntime : 0.000001
[912979.071481] .min_vruntime : 4.817363
[912979.071533] .max_vruntime : 0.000001
[912979.071583] .spread : 0.000000
[912979.071635] .spread0 : -120879108.567994
[912979.071686] .nr_spread_over : 0
[912979.071737] .nr_running : 0
[912979.071787] .load : 0
[912979.071837] .runnable_load_avg : 0
[912979.071888] .blocked_load_avg : 0
[912979.071937] .tg_load_contrib : 0
[912979.071987] .tg_runnable_contrib : 0
[912979.072037] .tg_load_avg : 0
[912979.072087] .tg->runnable_avg : 0
[912979.072138] .tg->cfs_bandwidth.timer_active: 0
[912979.072188] .throttled : 0
[912979.072238] .throttle_count : 0
[912979.072289] .se->exec_start : 912979063.302772
[912979.072342] .se->vruntime : 120879101.418762
[912979.072393] .se->sum_exec_runtime : 5.865939
[912979.072445] .se->statistics.wait_start : 0.000000
[912979.072496] .se->statistics.sleep_start : 0.000000
[912979.072547] .se->statistics.block_start : 0.000000
[912979.072598] .se->statistics.sleep_max : 0.000000
[912979.072649] .se->statistics.block_max : 0.000000
[912979.072700] .se->statistics.exec_max : 1.418038
[912979.072751] .se->statistics.slice_max : 0.039648
[912979.072802] .se->statistics.wait_max : 0.011396
[912979.072853] .se->statistics.wait_sum : 0.011396
[912979.072904] .se->statistics.wait_count : 87
[912979.072955] .se->load.weight : 2
[912979.073005] .se->avg.runnable_avg_sum : 29
[912979.073056] .se->avg.runnable_avg_period : 47385
[912979.073107] .se->avg.load_avg_contrib : 0
[912979.073157] .se->avg.decay_count : 870684685
[912979.073209]
[912979.073209] cfs_rq[0]:/
[912979.073301] .exec_clock : 123499351.282840
[912979.073352] .MIN_vruntime : 0.000001
[912979.073404] .min_vruntime : 120879113.385357
[912979.073456] .max_vruntime : 0.000001
[912979.073508] .spread : 0.000000
[912979.073559] .spread0 : 0.000000
[912979.073610] .nr_spread_over : 899
[912979.073661] .nr_running : 0
[912979.073711] .load : 0
[912979.073762] .runnable_load_avg : 0
[912979.073812] .blocked_load_avg : 0
[912979.073861] .tg_load_contrib : 0
[912979.073912] .tg_runnable_contrib : 3
[912979.073963] .tg_load_avg : 1112
[912979.074013] .tg->runnable_avg : 2090
[912979.074064] .tg->cfs_bandwidth.timer_active: 0
[912979.074114] .throttled : 0
[912979.074164] .throttle_count : 0
[912979.074215] .avg->runnable_avg_sum : 147
[912979.074265] .avg->runnable_avg_period : 47875
[912979.074317]
[912979.074317] rt_rq[0]:/
[912979.074409] .rt_nr_running : 0
[912979.074460] .rt_throttled : 0
[912979.074511] .rt_time : 0.008767
[912979.074562] .rt_runtime : 950.000000
[912979.074614]
[912979.074614] runnable tasks:
[912979.074614] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.074614]
----------------------------------------------------------------------------------------------------------
[912979.074908] init 1 1442.976599 182764 120
1442.976599 4970.908057 912971562.079844 /autogroup-2
[912979.075015] ksoftirqd/0 3 120879101.450246 2773774 120
120879101.450246 40699.948228 912767567.756265 /
[912979.075120] kworker/0:0H 5 704.624192 5 100
704.624192 0.046383 1080.126254 /
[912979.075225] rcu_bh 8 117.338636 2 120
117.338636 0.000907 0.000736 /
[912979.075329] migration/0 9 0.000000 79441 0
0.000000 94289.738036 0.000939 /
[912979.075435] watchdog/0 10 -11.784505 228375 0
-11.784505 18626.398234 93.358787 /
[912979.075540] fsnotify_mark 573 1251.442326 15 120
1251.442326 0.277180 33700.437424 /
[912979.075662] kworker/0:1 600 117631274.916693 481 120
117631274.916693 3.528776 885071044.709135 /
[912979.075767] scsi_eh_0 855 703.799083 18 120
703.799083 0.937602 352.420033 /
[912979.075871] scsi_eh_1 857 703.864038 18 120
703.864038 1.040986 349.446698 /
[912979.075977] kworker/0:1H 879 120879082.863400 81445041 100
120879082.863400 2958135.905303 909589328.375092 /
[912979.076081] md2_raid1 925 120879064.591392 18785880 120
120879064.591392 1385509.864935 911497676.823212 /
[912979.076187] jbd2/md2-8 940 120879042.325286 3868883 120
120879042.325286 116631.787281 912832775.733876 /
[912979.076292] rsyslogd 3079 3861172.009104 335 120
3861172.009104 19.617385 912940034.911841 /autogroup-6
[912979.076398] vnstatd 3055 6435.362083 184313 120
6435.362083 16205.294171 912920653.881480 /autogroup-7
[912979.076504] atd 3159 0.797141 255 120
0.797141 4.675411 910319644.880519 /autogroup-8
[912979.076610] mdadm 3165 26.238387 983 120
26.238387 124.630846 912209267.355015 /autogroup-9
[912979.076717] nginx 3428 8742.095210 1353920 120
8742.095210 16279.858318 912922182.809013 /autogroup-14
[912979.076823] nginx 3429 8715.099678 1807456 120
8715.099678 21823.289825 911255010.187828 /autogroup-14
[912979.076929] dovecot 4225 464.029513 1819 120
464.029513 95.516146 910857010.523253 /autogroup-17
[912979.077036] getty 4380 0.037161 18 120
0.037161 1.226701 106.505480 /autogroup-22
[912979.077143] btrfs-endio-wri 4704 21203401.069282 6 100
21203401.069282 0.188994 141336214.234067 /
[912979.077248] kworker/0:0 14473 120879101.539884 7149580 120
120879101.539884 1503790.895268 208487953.565214 /
[912979.077353] pickup 25685 889.860671 16 120
889.860671 2.904432 660392.168074 /autogroup-19
[912979.078689] sshd 25689 4.817363 117 120
4.817363 11.371198 539229.489832 /autogroup-2671
[912979.078796] iotop 25901 242.641616 350 120
242.641616 1127.641305 429237.471383 /autogroup-2674
[912979.078902] htop 25910 201.989567 237 120
201.989567 784.683871 163646.060924 /autogroup-2676
[912979.079009] kworker/u16:6 25921 120879083.401250 697 120
120879083.401250 2993.947814 126051.426518 /
[912979.079114] kworker/u17:5 25968 120838671.063663 12 100
120838671.063663 0.460692 3.298837 /
[912979.079221]
[912979.079275] cpu#1, 3411.379 MHz
[912979.079332] .nr_running : 2
[912979.079384] .load : 2048
[912979.079438] .nr_switches : 169743127
[912979.079491] .nr_load_updates : 54005477
[912979.079545] .nr_uninterruptible : 1661854
[912979.079605] .next_balance : 4386.283389
[912979.079657] .curr->pid : 25691
[912979.079716] .clock : 912979079.220144
[912979.079768] .cpu_load[0] : 283
[912979.079819] .cpu_load[1] : 153
[912979.079870] .cpu_load[2] : 80
[912979.079920] .cpu_load[3] : 41
[912979.079971] .cpu_load[4] : 21
[912979.080022] .yld_count : 815801
[912979.080072] .sched_count : 171240785
[912979.080124] .sched_goidle : 55313040
[912979.080175] .avg_idle : 1000000
[912979.080226] .max_idle_balance_cost : 500000
[912979.080278] .ttwu_count : 75199620
[912979.080329] .ttwu_local : 47229794
[912979.080380]
[912979.080380] cfs_rq[1]:/autogroup-2672
[912979.080474] .exec_clock : 68.766961
[912979.080525] .MIN_vruntime : 0.000001
[912979.080577] .min_vruntime : 101.530915
[912979.080629] .max_vruntime : 0.000001
[912979.080680] .spread : 0.000000
[912979.080731] .spread0 : -120879011.854442
[912979.080783] .nr_spread_over : 0
[912979.080834] .nr_running : 1
[912979.080884] .load : 1024
[912979.080935] .runnable_load_avg : 280
[912979.080985] .blocked_load_avg : 0
[912979.081036] .tg_load_contrib : 280
[912979.081086] .tg_runnable_contrib : 285
[912979.081137] .tg_load_avg : 280
[912979.081187] .tg->runnable_avg : 285
[912979.081238] .tg->cfs_bandwidth.timer_active: 0
[912979.081287] .throttled : 0
[912979.081338] .throttle_count : 0
[912979.081388] .se->exec_start : 912979079.220144
[912979.081440] .se->vruntime : 275145564.677538
[912979.081492] .se->sum_exec_runtime : 68.766961
[912979.081543] .se->statistics.wait_start : 0.000000
[912979.081594] .se->statistics.sleep_start : 0.000000
[912979.081644] .se->statistics.block_start : 0.000000
[912979.081695] .se->statistics.sleep_max : 0.000000
[912979.081746] .se->statistics.block_max : 0.000000
[912979.081798] .se->statistics.exec_max : 13.558964
[912979.081848] .se->statistics.slice_max : 2.985393
[912979.081899] .se->statistics.wait_max : 0.009879
[912979.081950] .se->statistics.wait_sum : 0.026885
[912979.082001] .se->statistics.wait_count : 69
[912979.082052] .se->load.weight : 1024
[912979.082102] .se->avg.runnable_avg_sum : 13365
[912979.082153] .se->avg.runnable_avg_period : 47980
[912979.082204] .se->avg.load_avg_contrib : 283
[912979.082255] .se->avg.decay_count : 0
[912979.082305]
[912979.082305] cfs_rq[1]:/
[912979.082397] .exec_clock : 286402909.854441
[912979.082449] .MIN_vruntime : 275145546.211368
[912979.082502] .min_vruntime : 275145557.563743
[912979.082554] .max_vruntime : 275145546.211368
[912979.082606] .spread : 0.000000
[912979.082658] .spread0 : 154266444.178386
[912979.082710] .nr_spread_over : 812
[912979.082760] .nr_running : 2
[912979.082811] .load : 2048
[912979.082861] .runnable_load_avg : 283
[912979.082912] .blocked_load_avg : 0
[912979.082962] .tg_load_contrib : 283
[912979.083013] .tg_runnable_contrib : 289
[912979.083063] .tg_load_avg : 1356
[912979.083114] .tg->runnable_avg : 2342
[912979.083164] .tg->cfs_bandwidth.timer_active: 0
[912979.083214] .throttled : 0
[912979.083265] .throttle_count : 0
[912979.083315] .avg->runnable_avg_sum : 13118
[912979.083366] .avg->runnable_avg_period : 46332
[912979.083417]
[912979.083417] rt_rq[1]:/
[912979.083509] .rt_nr_running : 0
[912979.083559] .rt_throttled : 0
[912979.083609] .rt_time : 0.000000
[912979.083660] .rt_runtime : 950.000000
[912979.083712]
[912979.083712] runnable tasks:
[912979.083712] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.083712]
----------------------------------------------------------------------------------------------------------
[912979.084006] watchdog/1 11 -4.044370 3 120
-4.044370 0.005127 0.000948 /
[912979.084111] migration/1 12 0.000000 64289 0
0.000000 81551.633973 0.000880 /
[912979.084215] ksoftirqd/1 13 275145545.590468 272453 120
275145545.590468 9049.618468 912823795.945262 /
[912979.084320] kworker/1:0 14 268487677.088594 25 120
268487677.088594 0.073843 890158629.032624 /
[912979.084424] kworker/1:0H 15 292.784984 6 100
292.784984 0.034371 1174.761064 /
[912979.084529] khungtaskd 201 275144402.934404 7614 120
275144402.934404 416.952311 912957932.380625 /
[912979.084633] kswapd0 502 274008133.542214 11379 120
274008133.542214 4497.950184 909317147.417035 /
[912979.084737] deferwq 727 23.191246 2 100
23.191246 0.011226 0.004570 /
[912979.084842] kworker/1:1H 897 275143578.574135 1248547 100
275143578.574135 10327.849008 912864614.443779 /
[912979.084946] udevd 1127 29.410078 216 120
29.410078 37.086597 1475991.787847 /autogroup-4
[912979.085053] ntpd 3336 5469.037929 916907 120
5469.037929 23741.992404 912915353.001973 /autogroup-12
[912979.085158] sshd 3449 0.161283 103 120
0.161283 8.844398 912476711.092472 /autogroup-15
[912979.085265] getty 4383 0.612958 96 120
0.612958 2.038719 107.995012 /autogroup-24
[912979.085371] bash 4622 2762.602192 159 120
2762.602192 48.778186 2482145.460574 /autogroup-35
[912979.085477] btrfs 4793 544711.533345 2433072 120
544711.533345 2311403.122680 906591091.757311 /autogroup-35
[912979.085584] kworker/1:2 19065 275145546.211368 5366925 120
275145546.211368 2459424.518362 122606255.191968 /
[912979.085721] kworker/u16:1 25373 275115349.906651 110532 120
275115349.906651 959695.014681 6225615.573687 /
[912979.085826] kworker/u16:7 25546 275145546.776778 135645 120
275145546.776778 1192109.886086 2457115.723296 /
[912979.085931] R bash 25691 101.530915 107 120
101.530915 60.081376 535001.680171 /autogroup-2672
[912979.086038] sshd 25743 2.112753 200 120
2.112753 22.093931 492201.992771 /autogroup-2673
[912979.086144] kworker/u16:3 25919 275119113.595959 1711 120
275119113.595959 9954.427931 57170.637468 /
[912979.086249] kworker/u16:4 25920 275119113.620983 949 120
275119113.620983 4193.701841 62074.304968 /
[912979.086353] kworker/u16:9 25923 275129475.908640 924 120
275129475.908640 4824.625594 74152.344320 /
[912979.086458] kworker/u16:10 25924 275119113.342743 1017 120
275119113.342743 6528.918601 50465.467883 /
[912979.086563] kworker/u17:2 25965 275111274.543188 16 100
275111274.543188 0.865367 14.115527 /
[912979.086670]
[912979.086724] cpu#2, 3411.379 MHz
[912979.086782] .nr_running : 1
[912979.086834] .load : 1024
[912979.086885] .nr_switches : 161087963
[912979.086945] .nr_load_updates : 52060426
[912979.086997] .nr_uninterruptible : 1030559
[912979.087056] .next_balance : 4386.283391
[912979.087107] .curr->pid : 25922
[912979.087158] .clock : 912979085.701520
[912979.087210] .cpu_load[0] : 1023
[912979.087261] .cpu_load[1] : 1023
[912979.087311] .cpu_load[2] : 1023
[912979.087361] .cpu_load[3] : 1023
[912979.087412] .cpu_load[4] : 1023
[912979.087462] .yld_count : 652418
[912979.087513] .sched_count : 162161509
[912979.087564] .sched_goidle : 52314275
[912979.087615] .avg_idle : 612215
[912979.087666] .max_idle_balance_cost : 500000
[912979.087717] .ttwu_count : 72879058
[912979.087768] .ttwu_local : 44844744
[912979.087819]
[912979.087819] cfs_rq[2]:/
[912979.087911] .exec_clock : 271247931.510377
[912979.089193] .MIN_vruntime : 0.000001
[912979.089244] .min_vruntime : 261281757.496336
[912979.089297] .max_vruntime : 0.000001
[912979.089348] .spread : 0.000000
[912979.089399] .spread0 : 140402644.110979
[912979.089451] .nr_spread_over : 787
[912979.089502] .nr_running : 1
[912979.089552] .load : 1024
[912979.089603] .runnable_load_avg : 1023
[912979.089654] .blocked_load_avg : 39
[912979.089704] .tg_load_contrib : 1058
[912979.089754] .tg_runnable_contrib : 1011
[912979.089804] .tg_load_avg : 1451
[912979.089854] .tg->runnable_avg : 2456
[912979.089905] .tg->cfs_bandwidth.timer_active: 0
[912979.089955] .throttled : 0
[912979.090006] .throttle_count : 0
[912979.090056] .avg->runnable_avg_sum : 47775
[912979.090107] .avg->runnable_avg_period : 47775
[912979.090158]
[912979.090158] rt_rq[2]:/
[912979.090250] .rt_nr_running : 0
[912979.090301] .rt_throttled : 0
[912979.090351] .rt_time : 0.000000
[912979.090402] .rt_runtime : 950.000000
[912979.090454]
[912979.090454] runnable tasks:
[912979.090454] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.090454]
----------------------------------------------------------------------------------------------------------
[912979.090748] kthreadd 2 261237406.066326 4827 120
261237406.066326 119.911688 912871657.395410 /
[912979.090853] watchdog/2 16 -7.044760 3 120
-7.044760 0.004873 0.001867 /
[912979.090957] migration/2 17 0.000000 62927 0
0.000000 87331.886093 0.000889 /
[912979.091062] ksoftirqd/2 18 261279894.213366 232738 120
261279894.213366 9205.862670 912836969.487694 /
[912979.091166] kworker/2:0H 20 1186.583000 6 100
1186.583000 0.049105 36720.236432 /
[912979.091272] khubd 352 735.961340 62 120
735.961340 3.794366 34873.341958 /
[912979.091377] scsi_eh_5 865 294.741950 3 120
294.741950 0.094279 1.195837 /
[912979.091481] udevd 1253 36.383350 1 120
36.383350 0.349195 0.000000 /autogroup-4
[912979.091588] kworker/2:1H 2466 261154579.822841 1217898 100
261154579.822841 7698.405912 912438827.428521 /
[912979.091692] rsyslogd 3078 1361568.583283 8082 120
1361568.583283 278.723375 912918782.388604 /autogroup-6
[912979.091798] cron 3318 527.499209 17807 120
527.499209 864.457775 912917777.953019 /autogroup-11
[912979.091905] nginx 3427 7530.670910 11 120
7530.670910 1.438863 845802616.809370 /autogroup-14
[912979.092010] getty 4381 0.550444 101 120
0.550444 1.894524 108.260507 /autogroup-23
[912979.092117] qmgr 4385 662.119562 5737 120
662.119562 563.016676 912836471.747917 /autogroup-19
[912979.092223] btrfs-cleaner 4713 261276427.726190 88520 120
261276427.726190 46175.312340 911393872.179404 /
[912979.092328] kworker/2:1 24696 261281745.576376 902624 120
261281745.576376 401393.704640 22415088.310903 /
[912979.092432] kworker/u16:5 25545 261237415.871889 119775 120
261237415.871889 1060653.766316 2499879.159564 /
[912979.092537] kworker/2:0 25737 261143674.398829 12 120
261143674.398829 0.075679 90.512937 /
[912979.092642] bash 25745 1.774182 57 120
1.774182 27.044681 59433.860152 /autogroup-2674
[912979.092749] bash 25797 125.213121 104 120
125.213121 31.328467 296070.107601 /autogroup-2676
[912979.092856] R kworker/u16:8 25922 261281757.496336 7833 120
261281757.496336 66888.389010 62803.947413 /
[912979.092961] kworker/u17:1 25936 261237406.051287 8 100
261237406.051287 0.276398 22805.965907 /
[912979.093066] kworker/u16:17 25967 261237421.924073 24 120
261237421.924073 15.955485 2814.028396 /
[912979.093171]
[912979.093217] cpu#3, 3411.379 MHz
[912979.093265] .nr_running : 0
[912979.093315] .load : 0
[912979.093365] .nr_switches : 155857077
[912979.093416] .nr_load_updates : 50842415
[912979.093467] .nr_uninterruptible : 572868
[912979.093518] .next_balance : 4386.283384
[912979.093569] .curr->pid : 0
[912979.093619] .clock : 912979085.631719
[912979.093671] .cpu_load[0] : 0
[912979.093721] .cpu_load[1] : 0
[912979.093771] .cpu_load[2] : 0
[912979.093821] .cpu_load[3] : 0
[912979.093872] .cpu_load[4] : 0
[912979.093922] .yld_count : 539647
[912979.093973] .sched_count : 156675888
[912979.094025] .sched_goidle : 50861107
[912979.094076] .avg_idle : 1000000
[912979.094127] .max_idle_balance_cost : 500000
[912979.094178] .ttwu_count : 70168767
[912979.094229] .ttwu_local : 43110297
[912979.094281]
[912979.094281] cfs_rq[3]:/
[912979.094373] .exec_clock : 258813890.199969
[912979.094424] .MIN_vruntime : 0.000001
[912979.094475] .min_vruntime : 249777836.394830
[912979.094528] .max_vruntime : 0.000001
[912979.094579] .spread : 0.000000
[912979.094630] .spread0 : 128898723.009473
[912979.094683] .nr_spread_over : 710
[912979.094734] .nr_running : 0
[912979.094784] .load : 0
[912979.094834] .runnable_load_avg : 0
[912979.094885] .blocked_load_avg : 0
[912979.094935] .tg_load_contrib : 0
[912979.094985] .tg_runnable_contrib : 0
[912979.095035] .tg_load_avg : 1451
[912979.095086] .tg->runnable_avg : 2456
[912979.095136] .tg->cfs_bandwidth.timer_active: 0
[912979.095186] .throttled : 0
[912979.095237] .throttle_count : 0
[912979.095287] .avg->runnable_avg_sum : 12
[912979.095338] .avg->runnable_avg_period : 47841
[912979.095389]
[912979.095389] rt_rq[3]:/
[912979.095481] .rt_nr_running : 0
[912979.095532] .rt_throttled : 0
[912979.095582] .rt_time : 0.000000
[912979.095649] .rt_runtime : 950.000000
[912979.095710]
[912979.095710] runnable tasks:
[912979.095710] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.095710]
----------------------------------------------------------------------------------------------------------
[912979.096024] watchdog/3 21 -7.044474 3 120
-7.044474 0.005278 0.001572 /
[912979.096128] migration/3 22 0.000000 62875 0
0.000000 86643.021266 0.000894 /
[912979.096233] ksoftirqd/3 23 249777505.585429 211305 120
249777505.585429 9262.490697 912840221.020644 /
[912979.096337] kworker/3:0 24 242884588.306110 20 120
242884588.306110 0.043153 884589608.629792 /
[912979.096442] kworker/3:0H 25 1375.522741 6 100
1375.522741 0.031575 39029.180311 /
[912979.096547] kworker/3:1 606 249777825.422551 36311384 120
249777825.422551 15997163.195058 896809394.721102 /
[912979.096653] kworker/3:1H 3166 249768145.314170 1231490 100
249768145.314170 5499.103364 912882138.510763 /
[912979.096757] nginx 3430 8357.302718 887796 120
8357.302718 10878.157162 912927715.123733 /autogroup-14
[912979.096864] nginx 3431 8357.281768 1441416 120
8357.281768 18119.184373 912920263.466853 /autogroup-14
[912979.096969] log 4278 1009.514780 721 120
1009.514780 32.210457 910856714.130981 /autogroup-17
[912979.097076] getty 4379 0.744904 62 120
0.744904 2.024537 107.666970 /autogroup-21
[912979.097182] tmux 4621 30898.295871 2720146 120
30898.295871 125970.617268 911355335.261179 /autogroup-34
[912979.097289] btrfs-transacti 4714 249768155.237773 62080256 120
249768155.237773 491918009.273985 384893945.745894 /
[912979.097394] kworker/u17:0 25132 249726067.531953 131 100
249726067.531953 4.623987 14288907.161099 /
[912979.097498] ssl-params 25668 1021.291893 4 120
1021.291893 1.071507 0.162468 /autogroup-17
[912979.097606] kworker/u17:3 25966 249726077.748339 6 100
249726077.748339 0.166575 14.012508 /
[912979.097713]
[912979.097767] cpu#4, 3411.379 MHz
[912979.097823] .nr_running : 0
[912979.097875] .load : 0
[912979.097927] .nr_switches : 97989748
[912979.097988] .nr_load_updates : 29437738
[912979.098039] .nr_uninterruptible : 99645
[912979.098090] .next_balance : 4386.283321
[912979.098150] .curr->pid : 0
[912979.098200] .clock : 912979095.633281
[912979.098252] .cpu_load[0] : 0
[912979.098302] .cpu_load[1] : 0
[912979.098352] .cpu_load[2] : 0
[912979.098403] .cpu_load[3] : 0
[912979.098453] .cpu_load[4] : 0
[912979.099726] .yld_count : 289502
[912979.099776] .sched_count : 99329529
[912979.099827] .sched_goidle : 24596751
[912979.099878] .avg_idle : 1000000
[912979.099929] .max_idle_balance_cost : 500000
[912979.099979] .ttwu_count : 40527541
[912979.100030] .ttwu_local : 28095271
[912979.100081]
[912979.100081] cfs_rq[4]:/
[912979.100172] .exec_clock : 240550243.715826
[912979.100225] .MIN_vruntime : 0.000001
[912979.100276] .min_vruntime : 232547480.727422
[912979.100328] .max_vruntime : 0.000001
[912979.100379] .spread : 0.000000
[912979.100431] .spread0 : 111668367.342065
[912979.100483] .nr_spread_over : 680
[912979.100533] .nr_running : 0
[912979.100583] .load : 0
[912979.100634] .runnable_load_avg : 0
[912979.100684] .blocked_load_avg : 0
[912979.100734] .tg_load_contrib : 0
[912979.100784] .tg_runnable_contrib : 0
[912979.100834] .tg_load_avg : 1591
[912979.100884] .tg->runnable_avg : 2592
[912979.100935] .tg->cfs_bandwidth.timer_active: 0
[912979.100985] .throttled : 0
[912979.101035] .throttle_count : 0
[912979.101086] .avg->runnable_avg_sum : 6
[912979.101136] .avg->runnable_avg_period : 47942
[912979.101186]
[912979.101186] rt_rq[4]:/
[912979.101279] .rt_nr_running : 0
[912979.101329] .rt_throttled : 0
[912979.101379] .rt_time : 0.000000
[912979.101431] .rt_runtime : 950.000000
[912979.101483]
[912979.101483] runnable tasks:
[912979.101483] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.101483]
----------------------------------------------------------------------------------------------------------
[912979.101777] rcu_sched 7 232547476.981614 39735742 120
232547476.981614 1073301.745321 911867300.274149 /
[912979.101881] watchdog/4 26 -10.043788 3 120
-10.043788 0.005889 0.001724 /
[912979.101986] migration/4 27 0.000000 52897 0
0.000000 30928.519290 0.000864 /
[912979.102090] ksoftirqd/4 28 232547379.187231 177607 120
232547379.187231 1064.877952 912828854.891450 /
[912979.102195] kworker/4:0 29 225129625.190891 21 120
225129625.190891 0.052462 885018589.851740 /
[912979.102300] kworker/4:0H 30 401.555147 6 100
401.555147 0.023250 1113.250079 /
[912979.102405] writeback 202 10.955626 2 100
10.955626 0.004714 0.001304 /
[912979.102510] crypto 205 22.958809 2 100
22.958809 0.003514 0.001582 /
[912979.102615] ata_sff 853 221.595354 2 100
221.595354 0.054486 0.007891 /
[912979.102720] scsi_eh_3 861 341.343202 3 120
341.343202 0.081789 1.417215 /
[912979.102824] kworker/4:2 895 232547469.098718 25902640 120
232547469.098718 14994809.177138 897903166.791145 /
[912979.102929] kworker/4:1H 896 232524737.232736 641223 100
232524737.232736 7788.068600 912787058.060253 /
[912979.103033] bioset 907 461.767329 2 100
461.767329 0.024292 0.004808 /
[912979.103138] md0_raid1 908 232423972.410927 68708 120
232423972.410927 14264.555610 912231051.031317 /
[912979.103243] bioset 916 493.916287 2 100
493.916287 0.008234 0.006637 /
[912979.103348] bioset 924 530.083314 2 100
530.083314 0.033173 0.005301 /
[912979.103453] ext4-rsv-conver 941 590.416989 2 100
590.416989 0.032333 0.008235 /
[912979.103557] rsyslogd 3044 708180.065334 36 120
708180.065334 3.020832 845805199.558834 /autogroup-6
[912979.103666] kworker/u16:11 25959 232528438.791148 129 120
232528438.791148 25.448642 10350.709475 /
[912979.103770] kworker/u16:12 25960 232528444.421842 13 120
232528444.421842 25.248681 6835.941648 /
[912979.103875]
[912979.103921] cpu#5, 3411.379 MHz
[912979.103969] .nr_running : 0
[912979.104019] .load : 0
[912979.104069] .nr_switches : 60686752
[912979.104120] .nr_load_updates : 16590839
[912979.104171] .nr_uninterruptible : 35116
[912979.104221] .next_balance : 4386.283371
[912979.104273] .curr->pid : 0
[912979.104323] .clock : 912978955.699686
[912979.104375] .cpu_load[0] : 0
[912979.104425] .cpu_load[1] : 0
[912979.104476] .cpu_load[2] : 0
[912979.104526] .cpu_load[3] : 0
[912979.104577] .cpu_load[4] : 0
[912979.104627] .yld_count : 180533
[912979.104678] .sched_count : 61115461
[912979.104729] .sched_goidle : 18413446
[912979.104780] .avg_idle : 1000000
[912979.104831] .max_idle_balance_cost : 500000
[912979.104883] .ttwu_count : 26014119
[912979.104934] .ttwu_local : 14558862
[912979.104985]
[912979.104985] cfs_rq[5]:/
[912979.105078] .exec_clock : 112549009.355412
[912979.105129] .MIN_vruntime : 0.000001
[912979.105180] .min_vruntime : 112340424.333297
[912979.105233] .max_vruntime : 0.000001
[912979.105284] .spread : 0.000000
[912979.105335] .spread0 : -8538689.052060
[912979.105388] .nr_spread_over : 776
[912979.105438] .nr_running : 0
[912979.105489] .load : 0
[912979.105539] .runnable_load_avg : 0
[912979.105590] .blocked_load_avg : 0
[912979.105665] .tg_load_contrib : 0
[912979.105715] .tg_runnable_contrib : 0
[912979.105765] .tg_load_avg : 1566
[912979.105816] .tg->runnable_avg : 2588
[912979.105867] .tg->cfs_bandwidth.timer_active: 0
[912979.105917] .throttled : 0
[912979.105967] .throttle_count : 0
[912979.106018] .avg->runnable_avg_sum : 9
[912979.106068] .avg->runnable_avg_period : 48296
[912979.106120]
[912979.106120] rt_rq[5]:/
[912979.106212] .rt_nr_running : 0
[912979.106262] .rt_throttled : 0
[912979.106312] .rt_time : 0.000000
[912979.106364] .rt_runtime : 950.000000
[912979.106416]
[912979.106416] runnable tasks:
[912979.106416] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.106416]
----------------------------------------------------------------------------------------------------------
[912979.106711] watchdog/5 31 -10.044212 3 120
-10.044212 0.005558 0.002261 /
[912979.106815] migration/5 32 0.000000 39981 0
0.000000 22420.347279 0.000857 /
[912979.106920] ksoftirqd/5 33 112340386.830138 88288 120
112340386.830138 902.891990 912915985.970305 /
[912979.107024] kworker/5:0 34 66161108.436113 17 120
66161108.436113 0.040609 531545833.648619 /
[912979.107129] kworker/5:0H 35 2637.765099 6 100
2637.765099 0.016356 34887.843068 /
[912979.107234] bioset 206 10.954992 2 100
10.954992 0.004215 0.001385 /
[912979.107338] kblockd 207 22.958046 2 100
22.958046 0.003486 0.001405 /
[912979.107443] md 358 46.967085 2 100
46.967085 0.004262 0.001718 /
[912979.107549] md1_raid1 917 112224651.731228 2576 120
112224651.731228 438.643648 912246492.124230 /
[912979.107653] kworker/5:1H 1535 112279235.381603 705637 100
112279235.381603 2860.139708 912488051.526814 /
[912979.107757] kjournald 2221 2802.174368 2 120
2802.174368 0.172461 0.003504 /
[912979.107862] bioset 2227 2814.208072 2 100
2814.208072 0.035120 0.024287 /
[912979.107966] acpid 3231 0.818794 84 120
0.818794 3.306390 846335772.890227 /autogroup-10
[912979.108073] ipv6_addrconf 3363 3115.100877 2 100
3115.100877 0.025822 0.024520 /
[912979.108178] dbus-daemon 3401 0.953565 1 120
0.953565 0.195728 0.000000 /autogroup-13
[912979.108284] getty 4382 0.887602 105 120
0.887602 1.971108 107.901308 /autogroup-25
[912979.108392] kworker/5:2 4904 112340412.387203 5220912 120
112340412.387203 2883754.936751 378528258.548013 /
[912979.108497] kworker/u16:0 25534 112334884.843224 46521 120
112334884.843224 407619.457750 3292541.847410 /
[912979.108602] sshd 25795 15.048918 272 120
15.048918 34.315217 463191.101842 /autogroup-2675
[912979.108709] kworker/u16:2 25918 112334885.079192 252 120
112334885.079192 16.946254 67813.385939 /
[912979.108815] kworker/u16:15 25963 112333963.192420 419 120
112333963.192420 14.102201 27660.580866 /
[912979.108920] kworker/u16:16 25964 112340412.451440 3169 120
112340412.451440 26688.738693 76979.038506 /
[912979.110257]
[912979.110311] cpu#6, 3411.379 MHz
[912979.110367] .nr_running : 0
[912979.110419] .load : 0
[912979.110468] .nr_switches : 60049743
[912979.110530] .nr_load_updates : 16356167
[912979.110581] .nr_uninterruptible : 35160
[912979.110632] .next_balance : 4386.283021
[912979.110692] .curr->pid : 0
[912979.110742] .clock : 912978955.696449
[912979.110794] .cpu_load[0] : 0
[912979.110844] .cpu_load[1] : 0
[912979.110893] .cpu_load[2] : 0
[912979.110944] .cpu_load[3] : 0
[912979.110994] .cpu_load[4] : 0
[912979.111045] .yld_count : 182103
[912979.111095] .sched_count : 60472214
[912979.111147] .sched_goidle : 18472620
[912979.111198] .avg_idle : 1000000
[912979.111249] .max_idle_balance_cost : 500000
[912979.111300] .ttwu_count : 28463751
[912979.111351] .ttwu_local : 14148770
[912979.111402]
[912979.111402] cfs_rq[6]:/
[912979.111495] .exec_clock : 108531033.446718
[912979.111547] .MIN_vruntime : 0.000001
[912979.111598] .min_vruntime : 108832402.721814
[912979.111649] .max_vruntime : 0.000001
[912979.111700] .spread : 0.000000
[912979.111752] .spread0 : -12046710.663543
[912979.111804] .nr_spread_over : 738
[912979.111854] .nr_running : 0
[912979.111904] .load : 0
[912979.111954] .runnable_load_avg : 0
[912979.112004] .blocked_load_avg : 0
[912979.112054] .tg_load_contrib : 0
[912979.112105] .tg_runnable_contrib : 1023
[912979.112156] .tg_load_avg : 1704
[912979.112207] .tg->runnable_avg : 2710
[912979.112258] .tg->cfs_bandwidth.timer_active: 0
[912979.112308] .throttled : 0
[912979.112358] .throttle_count : 0
[912979.112408] .avg->runnable_avg_sum : 48032
[912979.112458] .avg->runnable_avg_period : 48032
[912979.112509]
[912979.112509] rt_rq[6]:/
[912979.112601] .rt_nr_running : 0
[912979.112650] .rt_throttled : 0
[912979.112701] .rt_time : 0.000000
[912979.112752] .rt_runtime : 950.000000
[912979.112803]
[912979.112803] runnable tasks:
[912979.112803] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.112803]
----------------------------------------------------------------------------------------------------------
[912979.113096] watchdog/6 36 -10.044693 3 120
-10.044693 0.005081 0.001609 /
[912979.113201] migration/6 37 0.000000 37276 0
0.000000 20532.517090 0.000899 /
[912979.113305] ksoftirqd/6 38 108832139.243786 84205 120
108832139.243786 828.593684 912904581.835227 /
[912979.113409] kworker/6:0 39 75613767.546629 16 120
75613767.546629 0.039713 643748961.557535 /
[912979.113513] kworker/6:0H 40 3739.808388 6 100
3739.808388 0.060525 40424.938273 /
[912979.113618] acpi_thermal_pm 616 1868.134944 2 100
1868.134944 0.006186 0.002757 /
[912979.113723] udevd 1252 323.595660 4 120
323.595660 0.967985 1475586.237158 /autogroup-4
[912979.113830] kworker/6:1H 4101 108794319.824306 698571 100
108794319.824306 2632.092232 912466706.055504 /
[912979.113934] anvil 4277 873.900669 544 120
873.900669 23.579194 910855769.729803 /autogroup-17
[912979.114040] getty 4378 2.434546 104 120
2.434546 4.057877 106.857510 /autogroup-20
[912979.114147] kworker/6:2 10896 108832390.870370 3705038 120
108832390.870370 2015140.511741 267196695.854675 /
[912979.114251] kworker/u16:13 25961 108823829.294127 201 120
108823829.294127 25.493814 6828.575308 /
[912979.114356]
[912979.114402] cpu#7, 3411.379 MHz
[912979.114451] .nr_running : 0
[912979.114501] .load : 0
[912979.114551] .nr_switches : 59555676
[912979.114602] .nr_load_updates : 15987201
[912979.114653] .nr_uninterruptible : 34477
[912979.114703] .next_balance : 4386.283385
[912979.114755] .curr->pid : 0
[912979.114805] .clock : 912979110.628700
[912979.114856] .cpu_load[0] : 0
[912979.114907] .cpu_load[1] : 0
[912979.114957] .cpu_load[2] : 0
[912979.115007] .cpu_load[3] : 0
[912979.115058] .cpu_load[4] : 0
[912979.115108] .yld_count : 177334
[912979.115159] .sched_count : 59962825
[912979.115211] .sched_goidle : 18826005
[912979.115262] .avg_idle : 1000000
[912979.115313] .max_idle_balance_cost : 500000
[912979.115364] .ttwu_count : 28048259
[912979.115415] .ttwu_local : 13666465
[912979.115467]
[912979.115467] cfs_rq[7]:/autogroup-6
[912979.115560] .exec_clock : 365777.289449
[912979.115622] .MIN_vruntime : 0.000001
[912979.115690] .min_vruntime : 356931.628448
[912979.115742] .max_vruntime : 0.000001
[912979.115793] .spread : 0.000000
[912979.115844] .spread0 : -120522181.756909
[912979.115896] .nr_spread_over : 201
[912979.115946] .nr_running : 0
[912979.115997] .load : 0
[912979.116047] .runnable_load_avg : 0
[912979.116097] .blocked_load_avg : 12
[912979.116148] .tg_load_contrib : 12
[912979.116198] .tg_runnable_contrib : 18
[912979.116247] .tg_load_avg : 21
[912979.116298] .tg->runnable_avg : 32
[912979.116348] .tg->cfs_bandwidth.timer_active: 0
[912979.116398] .throttled : 0
[912979.116448] .throttle_count : 0
[912979.116499] .se->exec_start : 912979110.627684
[912979.116551] .se->vruntime : 103316251.044128
[912979.116603] .se->sum_exec_runtime : 365782.801926
[912979.116655] .se->statistics.wait_start : 0.000000
[912979.116706] .se->statistics.sleep_start : 0.000000
[912979.116757] .se->statistics.block_start : 0.000000
[912979.116807] .se->statistics.sleep_max : 0.000000
[912979.116858] .se->statistics.block_max : 0.000000
[912979.116910] .se->statistics.exec_max : 15.287497
[912979.116961] .se->statistics.slice_max : 9.991647
[912979.117012] .se->statistics.wait_max : 20.496483
[912979.117063] .se->statistics.wait_sum : 3694.519174
[912979.117114] .se->statistics.wait_count : 818307
[912979.117165] .se->load.weight : 2
[912979.117215] .se->avg.runnable_avg_sum : 864
[912979.117266] .se->avg.runnable_avg_period : 46751
[912979.117317] .se->avg.load_avg_contrib : 15
[912979.117367] .se->avg.decay_count : 870684730
[912979.117418]
[912979.117418] cfs_rq[7]:/
[912979.117511] .exec_clock : 102605411.107675
[912979.117562] .MIN_vruntime : 0.000001
[912979.117614] .min_vruntime : 103316261.560904
[912979.117666] .max_vruntime : 0.000001
[912979.117717] .spread : 0.000000
[912979.117769] .spread0 : -17562851.824453
[912979.117821] .nr_spread_over : 834
[912979.117871] .nr_running : 0
[912979.117921] .load : 0
[912979.117972] .runnable_load_avg : 0
[912979.118022] .blocked_load_avg : 0
[912979.118072] .tg_load_contrib : 0
[912979.118122] .tg_runnable_contrib : 19
[912979.118173] .tg_load_avg : 1679
[912979.118224] .tg->runnable_avg : 2706
[912979.118274] .tg->cfs_bandwidth.timer_active: 0
[912979.118325] .throttled : 0
[912979.118375] .throttle_count : 0
[912979.118425] .avg->runnable_avg_sum : 897
[912979.118476] .avg->runnable_avg_period : 47291
[912979.118527]
[912979.118527] rt_rq[7]:/
[912979.118620] .rt_nr_running : 0
[912979.118670] .rt_throttled : 0
[912979.118721] .rt_time : 0.000000
[912979.118772] .rt_runtime : 950.000000
[912979.118824]
[912979.118824] runnable tasks:
[912979.118824] task PID tree-key switches prio
exec-runtime sum-exec sum-sleep
[912979.118824]
----------------------------------------------------------------------------------------------------------
[912979.119119] watchdog/7 41 -10.045045 3 120
-10.045045 0.004808 0.001532 /
[912979.119224] migration/7 42 0.000000 36792 0
0.000000 20126.155679 0.000854 /
[912979.119329] ksoftirqd/7 43 103316249.568830 81327 120
103316249.568830 774.603306 912926058.496427 /
[912979.120664] kworker/7:0 44 47090310.248653 16 120
47090310.248653 0.128903 434429936.382521 /
[912979.120769] kworker/7:0H 45 4851.525691 6 100
4851.525691 0.029924 34165.166196 /
[912979.120874] khelper 46 10.956929 2 100
10.956929 0.006162 0.001989 /
[912979.120979] netns 47 22.960187 2 100
22.960187 0.003588 0.001842 /
[912979.121084] irq/28-mei_me 663 0.000000 14 49
0.000000 0.467508 0.000000 /
[912979.121189] kpsmoused 682 3645.306968 2 100
3645.306968 0.033012 0.007425 /
[912979.121293] scsi_tmf_0 856 4451.954811 2 100
4451.954811 0.009155 0.003823 /
[912979.121398] scsi_tmf_1 858 4463.965463 2 100
4463.965463 0.013995 0.004815 /
[912979.121502] scsi_eh_2 859 4524.055744 3 120
4524.055744 0.038579 1.530789 /
[912979.121607] scsi_tmf_2 860 4487.982184 2 100
4487.982184 0.008967 0.005525 /
[912979.121712] scsi_tmf_3 862 4499.993379 2 100
4499.993379 0.011948 0.005102 /
[912979.121817] scsi_eh_4 863 4524.057233 3 120
4524.057233 0.041540 1.325310 /
[912979.121922] scsi_tmf_4 864 4524.015332 2 100
4524.015332 0.012859 0.004181 /
[912979.122026] scsi_tmf_5 866 4536.025438 2 100
4536.025438 0.011072 0.006043 /
[912979.122131] kworker/7:1H 1280 103286251.400108 688398 100
103286251.400108 2454.691791 912481485.268755 /
[912979.122236] rs:main Q:Reg 3077 356931.628448 7847 120
356931.628448 859.764447 912939232.144327 /autogroup-6
[912979.122343] master 4354 554.129516 43341 120
554.129516 2028.044789 912879488.890427 /autogroup-19
[912979.122449] btrfs-worker 4691 5989.004416 2 100
5989.004416 0.041977 0.004016 /
[912979.122553] btrfs-worker-hi 4693 6013.105480 2 100
6013.105480 0.052131 0.002841 /
[912979.122658] btrfs-delalloc 4694 6025.150939 2 100
6025.150939 0.046807 0.002861 /
[912979.122764] btrfs-flush_del 4695 6037.193893 2 100
6037.193893 0.044335 0.003252 /
[912979.122868] btrfs-cache 4696 6049.237247 2 100
6049.237247 0.044729 0.002826 /
[912979.122972] btrfs-submit 4697 6061.280488 2 100
6061.280488 0.044660 0.002709 /
[912979.123077] btrfs-fixup 4698 6073.289654 2 100
6073.289654 0.009914 0.002789 /
[912979.123181] btrfs-endio 4699 14914387.229554 4 100
14914387.229554 0.040802 141335502.714150 /
[912979.123286] btrfs-endio-met 4700 6097.310484 2 100
6097.310484 0.012027 0.003048 /
[912979.123390] btrfs-endio-met 4701 6109.352065 2 100
6109.352065 0.042989 0.002727 /
[912979.123495] btrfs-endio-rai 4702 6121.394067 2 100
6121.394067 0.043385 0.002729 /
[912979.123600] btrfs-rmw 4703 6133.415559 2 100
6133.415559 0.022201 0.002938 /
[912979.123705] btrfs-freespace 4705 6157.446031 2 100
6157.446031 0.009791 0.003402 /
[912979.123810] btrfs-delayed-m 4706 6169.454602 2 100
6169.454602 0.009254 0.003302 /
[912979.123914] btrfs-readahead 4707 6181.495529 2 100
6181.495529 0.041628 0.003029 /
[912979.124019] btrfs-qgroup-re 4708 6193.519464 2 100
6193.519464 0.024648 0.002956 /
[912979.124124] btrfs-extent-re 4709 6205.526825 2 100
6205.526825 0.008097 0.002968 /
[912979.124229] kworker/7:1 31378 103316249.746160 6369777 120
103316249.746160 3405409.629733 475117858.774212 /
[912979.124334] kworker/u17:4 25396 103303273.647273 79 100
103303273.647273 2.725773 7167326.179379 /
[912979.124438] kworker/u16:14 25962 103307736.104678 1653 120
103307736.104678 51.500169 34247.616388 /
[912979.124545]
--
Tomasz Chmielewski
http://www.sslrack.com
* Re: how long should "btrfs device delete missing ..." take?
From: Duncan @ 2014-09-11 23:51 UTC (permalink / raw)
To: linux-btrfs
Chris Murphy posted on Thu, 11 Sep 2014 15:25:51 -0600 as excerpted:
> On Sep 11, 2014, at 1:31 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>>
>> I wouldn't try defragging now, but it might be worthwhile to stop the
>> device delete (rebooting to do so since I don't think there's a cancel)
>
> 'btrfs replace cancel' does exist, although I haven't tried it.
Btrfs replace cancel exists, yes, but does it work for btrfs device
delete, which is what the OP used?
> Something isn't right though, because it's clearly neither reading nor
> writing at anywhere close to 1/2 the drive read throughput. I'm curious
> what 'iotop -d30 -o' shows (during the replace, before cancel), which
> should be pretty consistent by averaging 30 seconds worth of io. And
> then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet
> there's a lot of nothing going on, with occasional spikes, rather than a
> constant trickle.
>
> And then the question is to find out what btrfs is thinking about while
> nothing is reading or writing. Even though it's not 5000+ snapshots, I
> wonder if the balance code (and hence btrfs replace) makes extensive use
> of fiemap that's causing this to go catatonic.
> http://comments.gmane.org/gmane.comp.file-systems.btrfs/35724
Not sure (some of that stuff's beyond me), but one thing we /do/ know is
that btrfs has so far been focused mostly on features and debugging, not
on optimization beyond the worst-cases, which themselves remain a big
enough problem, tho it's slowly getting better.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: how long should "btrfs device delete missing ..." take?
From: Chris Murphy @ 2014-09-12 2:10 UTC (permalink / raw)
To: Btrfs BTRFS
On Sep 11, 2014, at 5:51 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Chris Murphy posted on Thu, 11 Sep 2014 15:25:51 -0600 as excerpted:
>
>> On Sep 11, 2014, at 1:31 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>>>
>>> I wouldn't try defragging now, but it might be worthwhile to stop the
>>> device delete (rebooting to do so since I don't think there's a cancel)
>>
>> 'btrfs replace cancel' does exist, although I haven't tried it.
>
> Btrfs replace cancel exists, yes, but does it work for btrfs device
> delete, which is what the OP used?
Oops, right! I'm not sure what can do this safely.
And then, if I think about just creating a new fs and using btrfs send/receive, the snapshots need to be ro first. So if there's any uncertainty about safely canceling the 'device delete', those ro snapshots need to be taken first, in case only an ro mount is possible afterwards. There's also some uncertainty about how long creating 260+ ro snapshots will take (it should be fast, but still) and how much worse that makes the current situation. But it's probably worth the risk to take the snapshots and just wait a while before trying something like umount, or sysrq+s followed by sysrq+u.
>
>> Something isn't right though, because it's clearly neither reading nor
>> writing at anywhere close to 1/2 the drive read throughput. I'm curious
>> what 'iotop -d30 -o' shows (during the replace, before cancel), which
>> should be pretty consistent by averaging 30 seconds worth of io. And
>> then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet
>> there's a lot of nothing going on, with occasional spikes, rather than a
>> constant trickle.
>>
>> And then the question is to find out what btrfs is thinking about while
>> nothing is reading or writing. Even though it's not 5000+ snapshots, I
>> wonder if the balance code (and hence btrfs replace) makes extensive use
>> of fiemap that's causing this to go catatonic.
>> http://comments.gmane.org/gmane.comp.file-systems.btrfs/35724
>
> Not sure (some of that stuff's beyond me), but one thing we /do/ know is
> that btrfs has so far been focused mostly on features and debugging, not
> on optimization beyond the worst-cases, which themselves remain a big
> enough problem, tho it's slowly getting better.
Sure. But what's the next step? Given that 260+ snapshots might mean well more than 350GB of data, depending on how deduplicated the fs is, it would still probably be faster to rsync this to a pile of drives in linear/concat+XFS than to wait a month (?) for the device delete to finish.
Alternatively, script some way to create 260+ ro snapshots to btrfs send/receive to a new btrfs volume; and turn it into a raid1 later.
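A rough, untested sketch of such a script, assuming /home is the top-level mount, the subvolumes sit directly under it, and the new volume is mounted at /mnt/new (all of those are assumptions, adjust to taste):
for p in $(btrfs subvolume list -o /home | awk '{print $NF}'); do
    # take a read-only snapshot of each subvolume; send requires ro
    btrfs subvolume snapshot -r "/home/$p" "/home/$p.ro"
    # stream it to the new filesystem
    btrfs send "/home/$p.ro" | btrfs receive /mnt/new/
done
(Subvolume names with spaces would break the word splitting, and any existing ro snapshots would get re-snapshotted, so it's a starting point rather than a finished tool.)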
I'm curious whether a sysrq+s followed by sysrq+u might leave the filesystem in a state where it could still be rw mountable. But I'm skeptical of anything interrupting the device delete before being fully prepared for the fs to be toast for rw mounts. If only an ro mount is possible, any chance of creating ro snapshots is out.
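For reference, if the keyboard combos aren't convenient, the same sysrq requests can be sent from a shell, assuming sysrq is enabled in /proc/sys/kernel/sysrq:
# echo s > /proc/sysrq-trigger
# echo u > /proc/sysrq-trigger
The first is the emergency sync, the second remounts all filesystems read-only.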
Chris Murphy
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: how long should "btrfs device delete missing ..." take?
2014-09-12 2:10 ` Chris Murphy
@ 2014-09-12 5:19 ` Russell Coker
2014-09-12 5:33 ` Duncan
2014-09-12 6:27 ` Chris Murphy
2014-09-12 5:41 ` Duncan
2014-09-12 5:59 ` Duncan
2 siblings, 2 replies; 11+ messages in thread
From: Russell Coker @ 2014-09-12 5:19 UTC (permalink / raw)
To: Chris Murphy, Btrfs BTRFS
It would be nice if a file system mounted ro counted as ro snapshots for btrfs send.
When a file system is so messed up it can't be mounted rw it should be regarded as ro for all operations.
--
Sent from my Samsung Galaxy Note 2 with K-9 Mail.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: how long should "btrfs device delete missing ..." take?
2014-09-12 5:19 ` Russell Coker
@ 2014-09-12 5:33 ` Duncan
2014-09-12 6:27 ` Chris Murphy
1 sibling, 0 replies; 11+ messages in thread
From: Duncan @ 2014-09-12 5:33 UTC (permalink / raw)
To: linux-btrfs
Russell Coker posted on Fri, 12 Sep 2014 15:19:04 +1000 as excerpted:
> It would be nice if a file system mounted ro counted as ro snapshots for
> btrfs send.
>
> When a file system is so messed up it can't be mounted rw it should be
> regarded as ro for all operations.
Indeed, and that has been suggested before, but unfortunately it's not
current behavior.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: how long should "btrfs device delete missing ..." take?
2014-09-12 2:10 ` Chris Murphy
2014-09-12 5:19 ` Russell Coker
@ 2014-09-12 5:41 ` Duncan
2014-09-12 5:59 ` Duncan
2 siblings, 0 replies; 11+ messages in thread
From: Duncan @ 2014-09-12 5:41 UTC (permalink / raw)
To: linux-btrfs
Chris Murphy posted on Thu, 11 Sep 2014 20:10:26 -0600 as excerpted:
> And then when I think about just creating a new fs, using btrfs
> send/receive, the snapshots need to be ro first.
FWIW, at this point I'd forget about send/receive and create the backup
(assuming one doesn't exist already) using more normal methods, at least
if an initial send/receive hasn't been done yet, since it would be
copying off all the data anyway. Perhaps mount selected snapshots and
back them up too (after the current state is backed up), but throw away
most of the snapshots.
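By "more normal methods" I mean something as simple as an rsync to
whatever spare storage is at hand, e.g. (a sketch only, the target path
is made up, and I'd only run it once the device delete has been
stopped):
# rsync -aHAX --numeric-ids /home/ /mnt/backup/home/
-a is the usual archive mode, -H preserves hard links, and -A/-X keep
ACLs and xattrs; a plain cp -a would do in a pinch as well.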
Of course if there's an existing relatively current sent/received base to
build on, and no indication that send/receive is broken, definitely try
that first as the amount of data to sync in that case should be MUCH
lower, but if not...
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: how long should "btrfs device delete missing ..." take?
2014-09-12 2:10 ` Chris Murphy
2014-09-12 5:19 ` Russell Coker
2014-09-12 5:41 ` Duncan
@ 2014-09-12 5:59 ` Duncan
2 siblings, 0 replies; 11+ messages in thread
From: Duncan @ 2014-09-12 5:59 UTC (permalink / raw)
To: linux-btrfs
Chris Murphy posted on Thu, 11 Sep 2014 20:10:26 -0600 as excerpted:
> Sure. But what's the next step? Given 260+ snapshots might mean well
> more than 350GB of data, depending on how deduplicated the fs is, it
> still probably would be faster to rsync this to a pile of drives in
> linear/concat+XFS than wait a month (?) for device delete to finish.
That was what I was getting at in my other just-finished short reply. It
may be time to give up on the btrfs-specific solutions for the moment and
go with tried and tested traditional solutions (tho I'd definitely *NOT*
try rsync or the like with the delete still going; we know from other
reports that rsync places its own stresses on btrfs, and one major
stressor at a time, the delete-triggered rebalance in this case, is bad
enough).
> Alternatively, script some way to create 260+ ro snapshots to btrfs
> send/receive to a new btrfs volume; and turn it into a raid1 later.
No confirmation yet but I strongly suspect most of those subs are
snapshots. Assuming that's the case, it's very likely most of them can
simply be eliminated as I originally suggested, a process that /should/
be fast, decomplexifying the situation dramatically.
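Eliminating them is just a matter of listing and deleting, e.g. (the
snapshot path here is hypothetical; obviously keep whatever is still
wanted):
# btrfs subvolume list /home
# btrfs subvolume delete /home/snapshots/home.20140801
The delete command itself returns quickly; the space is reclaimed in
the background by the cleaner thread.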
> I'm curious if a sysrq+s followed by sysrq+u might leave the filessystem
> in a state where it could still be rw mountable. But I'm skeptical of
> anything interrupting the device delete before being fully prepared for
> the fs to be toast for rw mount. If only ro mount is possible, any
> chance of creating ro snapshots is out.
In theory, that is, barring bugs, interrupting the delete with normal
shutdown to the extent possible, then sysrq+s, sysrq+u, should not be a
problem. The delete is basically a balance, going chunk by chunk, and
either the chunk has been duplicated to the new device or it hasn't. In
either case, the existing chunk on the remaining old device shouldn't be
affected.
So rebooting in that way in order to stop the delete temporarily
/should/ have no bad effects. Of course, that's barring bugs. Btrfs is
still not fully stabilized, and bugs do happen, so anything's possible.
But I'd consider it safe enough to try here, certainly so if I had
backups, as is still STRONGLY recommended for btrfs at this point, much
more so than the routine sysadmin "if it's not backed up by definition
it's not valuable to you" rule.
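FWIW, progress across such a stop and restart can be watched with the
same command the OP already posted:
# btrfs fi show /home
The 'used' figure for the new device (devid 3 in the OP's output)
should only ever grow, one chunk (typically 1GiB for data) at a time.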
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: how long should "btrfs device delete missing ..." take?
2014-09-12 5:19 ` Russell Coker
2014-09-12 5:33 ` Duncan
@ 2014-09-12 6:27 ` Chris Murphy
1 sibling, 0 replies; 11+ messages in thread
From: Chris Murphy @ 2014-09-12 6:27 UTC (permalink / raw)
To: Btrfs BTRFS
On Sep 11, 2014, at 11:19 PM, Russell Coker <russell@coker.com.au> wrote:
> It would be nice if a file system mounted ro counted as ro snapshots for btrfs send.
>
> When a file system is so messed up it can't be mounted rw it should be regarded as ro for all operations.
Yes it's come up before, and there's a question whether mount -o ro is reliably ro enough for this. Maybe a force option?
But then another wishlist item is a recursive btrfs send to go along with the above. I might want all of the snapshots, or all of the ones under two particular subvolumes, etc. And the recursive ro snapshot and recursive send could even be combined into a btrfs rescue option that would work even if the volume is mounted read-only.
Chris Murphy
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2014-09-12 6:27 UTC | newest]
Thread overview: 11+ messages
2014-09-11 15:22 how long should "btrfs device delete missing ..." take? Tomasz Chmielewski
2014-09-11 19:31 ` Duncan
2014-09-11 21:25 ` Chris Murphy
2014-09-11 23:51 ` Duncan
2014-09-12 2:10 ` Chris Murphy
2014-09-12 5:19 ` Russell Coker
2014-09-12 5:33 ` Duncan
2014-09-12 6:27 ` Chris Murphy
2014-09-12 5:41 ` Duncan
2014-09-12 5:59 ` Duncan
2014-09-11 23:06 ` Tomasz Chmielewski