* how long should "btrfs device delete missing ..."  take?
@ 2014-09-11 15:22 Tomasz Chmielewski
  2014-09-11 19:31 ` Duncan
  2014-09-11 23:06 ` Tomasz Chmielewski
  0 siblings, 2 replies; 11+ messages in thread
From: Tomasz Chmielewski @ 2014-09-11 15:22 UTC (permalink / raw)
  To: linux-btrfs

After a disk died and was replaced, "btrfs device delete missing" is 
taking more than 10 days on an otherwise idle server:

# btrfs fi show /home
Label: none  uuid: 84d087aa-3a32-46da-844f-a233237cf04f
         Total devices 3 FS bytes used 362.44GiB
         devid    2 size 1.71TiB used 365.03GiB path /dev/sdb4
         devid    3 size 1.71TiB used 58.00GiB path /dev/sda4
         *** Some devices missing

Btrfs v3.16



So far, it has copied 58 GB out of 365 GB - and it took 10 days. At this 
speed, the whole operation will take 2-3 months (assuming that the only 
healthy disk doesn't die in the meantime).
Is this the expected time for btrfs RAID-1?
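(For reference: 365 GB - 58 GB = 307 GB still to go, and 58 GB per 10 
days is roughly 5.8 GB/day, i.e. another ~53 days - about two months in 
total.)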

There are no errors in dmesg/smart, and performance of both disks is fine:

# hdparm -t /dev/sda /dev/sdb

/dev/sda:
  Timing buffered disk reads: 442 MB in  3.01 seconds = 146.99 MB/sec

/dev/sdb:
  Timing buffered disk reads: 402 MB in  3.39 seconds = 118.47 MB/sec


# btrfs fi df /home
Data, RAID1: total=352.00GiB, used=351.02GiB
System, RAID1: total=32.00MiB, used=96.00KiB
Metadata, RAID1: total=13.00GiB, used=11.38GiB
unknown, single: total=512.00MiB, used=67.05MiB

# btrfs sub list /home | wc -l
260

# uptime
  17:21:53 up 10 days,  6:01,  2 users,  load average: 3.22, 3.53, 3.55


I had tried running this on the latest 3.16.x kernel earlier, but since 
the progress was so slow, I rebooted after about a week to see if the 
latest RC would be any faster.


-- 
Tomasz Chmielewski
http://www.sslrack.com





* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-11 15:22 how long should "btrfs device delete missing ..." take? Tomasz Chmielewski
@ 2014-09-11 19:31 ` Duncan
  2014-09-11 21:25   ` Chris Murphy
  2014-09-11 23:06 ` Tomasz Chmielewski
  1 sibling, 1 reply; 11+ messages in thread
From: Duncan @ 2014-09-11 19:31 UTC (permalink / raw)
  To: linux-btrfs

Tomasz Chmielewski posted on Thu, 11 Sep 2014 17:22:15 +0200 as excerpted:

> After a disk died and was replaced, "btrfs device delete missing" is 
> taking more than 10 days on an otherwise idle server:
> 
> # btrfs fi show /home
> Label: none  uuid: 84d087aa-3a32-46da-844f-a233237cf04f
>          Total devices 3 FS bytes used 362.44GiB
>          devid    2 size 1.71TiB used 365.03GiB path /dev/sdb4
>          devid    3 size 1.71TiB used 58.00GiB path /dev/sda4
>          *** Some devices missing
> 
> Btrfs v3.16
> 
> So far, it has copied 58 GB out of 365 GB - and it took 10 days. At this 
> speed, the whole operation will take 2-3 months (assuming that the only 
> healthy disk doesn't die in the meantime).
> Is this expected time for btrfs RAID-1?

Device delete definitely takes time.  For the sub-TiB data usage shown
above, 10 days for 58 GiB out of 365 does seem excessive, but there are
extreme cases where it isn't entirely out of line.  See below.

> There are no errors in dmesg/smart, performance of both disks is fine:

> # btrfs sub list /home | wc -l
> 260

> I've tried running this on the latest 3.16.x kernel earlier, but since 
> the progress was so slow, rebooted after about a week to see if the 
> latest RC will be any faster.

The good thing is that once a block group is copied over, it should be
fine and won't need to be re-copied if the process is stopped over a
reboot and restarted on a new kernel, etc.

The bad thing is that if I'm interpreting your report correctly, that
likely means 7+10=17 days for that 58 gig. =:^(
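
If you want to keep an eye on it, the relocation progress can be
watched from outside with something like the following (assuming the
filesystem is mounted at /home as in your post):

# watch -n 60 btrfs fi show /home

The "used" figure on devid 3 should keep creeping upward as block
groups get relocated onto it.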

Questions:

* Presumably most of those 260 subvolumes are snapshots, correct?
What was your snapshotting schedule and did you have old snapshot
cleanup-deletion scheduled as well?

* Do you run with autodefrag or was the system otherwise regularly
defragged?

* Do you have large (GiB plus) database or virtual machine image files
on that filesystem?  If so, had you properly set the NOCOW file
attribute (chattr +C) on them and were they on dedicated subvolumes?


200+ snapshots is somewhat high and could be part of the issue, tho
it's nothing like the extremes (thousands) we've seen posted in the
past.  Were it me, I'd have tried deleting as many as possible before
the device delete missing, in order to simplify the process and
eliminate as much "extra" data as possible.

The real issue is going to be fragmentation on spinning-rust drives.
Run filefrag on some of your gig-plus files that get written to
frequently (vm-images and database files are the classic cases) and
see how many extents are reported.  (Tho note that filefrag doesn't
understand btrfs compression and won't be accurate in that case, and
also that the btrfs data chunk size of 1 GiB is the maximum extent
size, so a multi-gig file will typically show a couple of extents
more than its size in gigs: one filling the current chunk,
N whole-gig chunks, and the file tail.)  The nocow file attribute
(which must be set while the file is zero-sized to be effective, see
discussion elsewhere) can help with that, but snapshotting a nocow
file that is being actively rewritten more or less defeats the
purpose of nocow, since the snapshot locks the existing data in
place and the first rewrite to a block must then be cowed anyway.
Putting those files on dedicated subvolumes and then not
snapshotting those subvolumes is a workaround.
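
For example (the paths here are only illustrations, substitute whatever
large, frequently-rewritten files you actually have):

# filefrag /home/vm/some-image.qcow2
# lsattr /home/vm/some-image.qcow2

And since nocow only takes effect before the file has data, one common
way to arrange it is to set the attribute on an empty directory, so
files created inside it inherit it:

# mkdir /home/vm-nocow
# chattr +C /home/vm-nocow
# lsattr -d /home/vm-nocow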


I wouldn't try defragging now, but it might be worthwhile to stop the
device delete (rebooting to do so since I don't think there's a cancel)
and delete as many snapshots as possible.  That should help matters.
Additionally, if you have recent backups of highly fragmented files
such as the VM-images and DBs I mentioned, you might consider simply
deleting them, thus eliminating that fragment processing from the
device delete.  I don't know that making a backup now would go much
faster than the device delete, however, so I don't know whether to
recommend that or not.
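
Something along these lines, where the snapshot paths are of course
just examples of whatever "btrfs sub list /home" actually shows:

# btrfs subvolume list /home
# btrfs subvolume delete /home/.snapshots/2014-08-01
# btrfs subvolume delete /home/.snapshots/2014-08-02

and so on.  Note that the space only comes back once btrfs-cleaner has
finished processing the deleted snapshots in the background.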

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-11 19:31 ` Duncan
@ 2014-09-11 21:25   ` Chris Murphy
  2014-09-11 23:51     ` Duncan
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2014-09-11 21:25 UTC (permalink / raw)
  To: Btrfs BTRFS


On Sep 11, 2014, at 1:31 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> 
> I wouldn't try defragging now, but it might be worthwhile to stop the
> device delete (rebooting to do so since I don't think there's a cancel)

'btrfs replace cancel' does exist, although I haven't tried it.

Something isn't right though, because it's clearly neither reading nor writing at anywhere close to 1/2 the drive read throughput. I'm curious what 'iotop -d30 -o' shows (during the replace, before cancel), which should be pretty consistent by averaging 30 seconds worth of io. And then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet there's a lot of nothing going on, with occasional spikes, rather than a constant trickle. 

And then the question is to find out what btrfs is thinking about while nothing is reading or writing. Even though it's not 5000+ snapshots, I wonder if the balance code (and hence btrfs replace) makes extensive use of fiemap that's causing this to go catatonic. http://comments.gmane.org/gmane.comp.file-systems.btrfs/35724

Chris Murphy



* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-11 15:22 how long should "btrfs device delete missing ..." take? Tomasz Chmielewski
  2014-09-11 19:31 ` Duncan
@ 2014-09-11 23:06 ` Tomasz Chmielewski
  1 sibling, 0 replies; 11+ messages in thread
From: Tomasz Chmielewski @ 2014-09-11 23:06 UTC (permalink / raw)
  To: linux-btrfs

>> After a disk died and was replaced, "btrfs device delete missing" is
>> taking more than 10 days on an otherwise idle server:
> 
> Something isn't right though, because it's clearly neither reading nor
> writing at anywhere close to 1/2 the drive read throughput. I'm curious
> what 'iotop -d30 -o' shows (during the replace, before cancel), which
> should be pretty consistent by averaging 30 seconds worth of io. And
> then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet
> there's a lot of nothing going on, with occasional spikes, rather than
> a constant trickle.

That's more or less what I'm seeing with both. The numbers will go up or 
down slightly, but it's counted in kilobytes per second:

Total DISK READ:       0.00 B/s | Total DISK WRITE:     545.82 B/s
   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
   940 be/3 root        0.00 B/s  136.46 B/s  0.00 %  0.10 % [jbd2/md2-8]
  4714 be/4 root        0.00 B/s  329.94 K/s  0.00 %  0.00 % [btrfs-transacti]
25534 be/4 root        0.00 B/s  402.97 K/s  0.00 %  0.00 % [kworker/u16:0]


The bottleneck may be here - one CPU core (a kworker) is mostly 100% 
busy. Not sure what it's really busy with, though:

   PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
25546 root       20   0     0     0     0 R 93.0  0.0 18:22.94 kworker/u16:7
14473 root       20   0     0     0     0 S  5.0  0.0 25:00.14 kworker/0:0
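
One way to get a rough idea what that kworker is actually doing (the
PID here is just the one from the htop output above) is to sample its
kernel stack a few times:

# cat /proc/25546/stack

or to run "perf top" and watch which kernel functions the time goes to.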


[912979.063432] SysRq : Show Blocked State
[912979.063485]   task                        PC stack   pid father
[912979.063545] btrfs           D ffff88083fa515c0     0  4793   4622 0x00000000
[912979.063601]  ffff88061a29b878 0000000000000086 0000000000000000 ffff88003683e040
[912979.063701]  00000000000115c0 0000000000004000 ffff880813e30000 ffff88003683e040
[912979.063800]  ffff88061a29b7e8 ffffffff8105d8e9 ffff88083fa40000 ffff88083fa115c0
[912979.063899] Call Trace:
[912979.063951]  [<ffffffff8105d8e9>] ? enqueue_task_fair+0x3e5/0x44f
[912979.064006]  [<ffffffff81053484>] ? resched_curr+0x47/0x57
[912979.064058]  [<ffffffff81053aed>] ? check_preempt_curr+0x3e/0x6d
[912979.064111]  [<ffffffff81053b2e>] ? ttwu_do_wakeup+0x12/0x7f
[912979.064164]  [<ffffffff81053c3c>] ? ttwu_do_activate.constprop.74+0x57/0x5c
[912979.064220]  [<ffffffff813acc1e>] schedule+0x65/0x67
[912979.064272]  [<ffffffff813aed0c>] schedule_timeout+0x26/0x198
[912979.064324]  [<ffffffff8105639d>] ? wake_up_process+0x31/0x35
[912979.064378]  [<ffffffff81049baf>] ? wake_up_worker+0x1f/0x21
[912979.064431]  [<ffffffff81049df6>] ? insert_work+0x87/0x94
[912979.064493]  [<ffffffffa02d524b>] ? free_block_list+0x1f/0x34 [btrfs]
[912979.064548]  [<ffffffff813ad443>] wait_for_common+0x10d/0x13e
[912979.064600]  [<ffffffff8105635d>] ? try_to_wake_up+0x251/0x251
[912979.064653]  [<ffffffff813ad48c>] wait_for_completion+0x18/0x1a
[912979.064710]  [<ffffffffa0283a01>] btrfs_async_run_delayed_refs+0xc1/0xe4 [btrfs]
[912979.064814]  [<ffffffffa02983c5>] __btrfs_end_transaction+0x2bb/0x2e1 [btrfs]
[912979.064916]  [<ffffffffa02983f9>] btrfs_end_transaction_throttle+0xe/0x10 [btrfs]
[912979.065020]  [<ffffffffa02d973d>] relocate_block_group+0x2ad/0x4de [btrfs]
[912979.065079]  [<ffffffffa02d9ac6>] btrfs_relocate_block_group+0x158/0x278 [btrfs]
[912979.065184]  [<ffffffffa02b66f0>] btrfs_relocate_chunk.isra.62+0x58/0x5f7 [btrfs]
[912979.065286]  [<ffffffffa02c58d7>] ? btrfs_set_lock_blocking_rw+0x68/0x95 [btrfs]
[912979.065387]  [<ffffffffa0276b04>] ? btrfs_set_path_blocking+0x23/0x54 [btrfs]
[912979.065486]  [<ffffffffa027b517>] ? btrfs_search_slot+0x7bc/0x816 [btrfs]
[912979.065546]  [<ffffffffa02b2bd5>] ? free_extent_buffer+0x6f/0x7c [btrfs]
[912979.065605]  [<ffffffffa02b89e9>] btrfs_shrink_device+0x23c/0x3a5 [btrfs]
[912979.065679]  [<ffffffffa02bb2c7>] btrfs_rm_device+0x2a1/0x759 [btrfs]
[912979.065747]  [<ffffffffa02c3ab3>] btrfs_ioctl+0xa52/0x227f [btrfs]
[912979.065811]  [<ffffffff81107182>] ? putname+0x23/0x2c
[912979.065863]  [<ffffffff8110b3cb>] ? user_path_at_empty+0x60/0x90
[912979.065918]  [<ffffffff81173b1a>] ? avc_has_perm+0x2e/0xf7
[912979.065978]  [<ffffffff810d7ad5>] ? __vm_enough_memory+0x25/0x13c
[912979.066032]  [<ffffffff8110d3c1>] do_vfs_ioctl+0x3f2/0x43c
[912979.066084]  [<ffffffff811026fd>] ? vfs_stat+0x16/0x18
[912979.066136]  [<ffffffff8110d459>] SyS_ioctl+0x4e/0x7d
[912979.066188]  [<ffffffff81030a71>] ? do_page_fault+0xc/0xf
[912979.066240]  [<ffffffff813afd92>] system_call_fastpath+0x16/0x1b
[912979.066296] Sched Debug Version: v0.11, 3.17.0-rc3 #1
[912979.066347] ktime                                   : 
913460840.666210
[912979.066401] sched_clk                               : 
912979066.295474
[912979.066454] cpu_clk                                 : 
912979066.295485
[912979.066507] jiffies                                 : 4386283381
[912979.066560] sched_clock_stable()                    : 1
[912979.066610]
[912979.066656] sysctl_sched
[912979.066703]   .sysctl_sched_latency                    : 24.000000
[912979.066756]   .sysctl_sched_min_granularity            : 3.000000
[912979.066809]   .sysctl_sched_wakeup_granularity         : 4.000000
[912979.066861]   .sysctl_sched_child_runs_first           : 0
[912979.066913]   .sysctl_sched_features                   : 11899
[912979.068196]   .sysctl_sched_tunable_scaling            : 1 
(logaritmic)
[912979.068248]
[912979.068294] cpu#0, 3411.379 MHz
[912979.068342]   .nr_running                    : 0
[912979.068393]   .load                          : 0
[912979.068443]   .nr_switches                   : 374480347
[912979.068495]   .nr_load_updates               : 56296080
[912979.068546]   .nr_uninterruptible            : -3469678
[912979.068597]   .next_balance                  : 4386.283363
[912979.068649]   .curr->pid                     : 0
[912979.068699]   .clock                         : 912979065.783028
[912979.068751]   .cpu_load[0]                   : 0
[912979.068802]   .cpu_load[1]                   : 0
[912979.068852]   .cpu_load[2]                   : 0
[912979.068902]   .cpu_load[3]                   : 0
[912979.068953]   .cpu_load[4]                   : 0
[912979.069004]   .yld_count                     : 1231295
[912979.069054]   .sched_count                   : 375930185
[912979.069106]   .sched_goidle                  : 150592952
[912979.069157]   .avg_idle                      : 875618
[912979.069208]   .max_idle_balance_cost         : 500000
[912979.069259]   .ttwu_count                    : 252809952
[912979.069311]   .ttwu_local                    : 170251968
[912979.069363]
[912979.069363] cfs_rq[0]:/autogroup-6
[912979.069456]   .exec_clock                    : 4133517.038969
[912979.069509]   .MIN_vruntime                  : 0.000001
[912979.069560]   .min_vruntime                  : 3861183.891419
[912979.069611]   .max_vruntime                  : 0.000001
[912979.069662]   .spread                        : 0.000000
[912979.069713]   .spread0                       : -117017929.493938
[912979.069765]   .nr_spread_over                : 122
[912979.069815]   .nr_running                    : 0
[912979.069865]   .load                          : 0
[912979.069916]   .runnable_load_avg             : 0
[912979.069966]   .blocked_load_avg              : 0
[912979.070015]   .tg_load_contrib               : 0
[912979.070066]   .tg_runnable_contrib           : 2
[912979.070115]   .tg_load_avg                   : 3
[912979.070166]   .tg->runnable_avg              : 5
[912979.070216]   .tg->cfs_bandwidth.timer_active: 0
[912979.070266]   .throttled                     : 0
[912979.070316]   .throttle_count                : 0
[912979.070366]   .se->exec_start                : 912979065.782511
[912979.070418]   .se->vruntime                  : 120879101.503042
[912979.070470]   .se->sum_exec_runtime          : 4133527.988053
[912979.070522]   .se->statistics.wait_start     : 0.000000
[912979.070573]   .se->statistics.sleep_start    : 0.000000
[912979.070624]   .se->statistics.block_start    : 0.000000
[912979.070675]   .se->statistics.sleep_max      : 0.000000
[912979.070726]   .se->statistics.block_max      : 0.000000
[912979.070777]   .se->statistics.exec_max       : 10.022184
[912979.070828]   .se->statistics.slice_max      : 11.802172
[912979.070879]   .se->statistics.wait_max       : 64.131091
[912979.070930]   .se->statistics.wait_sum       : 61306.855909
[912979.070981]   .se->statistics.wait_count     : 15944650
[912979.071032]   .se->load.weight               : 2
[912979.071082]   .se->avg.runnable_avg_sum      : 111
[912979.071133]   .se->avg.runnable_avg_period   : 47099
[912979.071183]   .se->avg.load_avg_contrib      : 0
[912979.071233]   .se->avg.decay_count           : 870684687
[912979.071285]
[912979.071285] cfs_rq[0]:/autogroup-2671
[912979.071379]   .exec_clock                    : 5.865939
[912979.071430]   .MIN_vruntime                  : 0.000001
[912979.071481]   .min_vruntime                  : 4.817363
[912979.071533]   .max_vruntime                  : 0.000001
[912979.071583]   .spread                        : 0.000000
[912979.071635]   .spread0                       : -120879108.567994
[912979.071686]   .nr_spread_over                : 0
[912979.071737]   .nr_running                    : 0
[912979.071787]   .load                          : 0
[912979.071837]   .runnable_load_avg             : 0
[912979.071888]   .blocked_load_avg              : 0
[912979.071937]   .tg_load_contrib               : 0
[912979.071987]   .tg_runnable_contrib           : 0
[912979.072037]   .tg_load_avg                   : 0
[912979.072087]   .tg->runnable_avg              : 0
[912979.072138]   .tg->cfs_bandwidth.timer_active: 0
[912979.072188]   .throttled                     : 0
[912979.072238]   .throttle_count                : 0
[912979.072289]   .se->exec_start                : 912979063.302772
[912979.072342]   .se->vruntime                  : 120879101.418762
[912979.072393]   .se->sum_exec_runtime          : 5.865939
[912979.072445]   .se->statistics.wait_start     : 0.000000
[912979.072496]   .se->statistics.sleep_start    : 0.000000
[912979.072547]   .se->statistics.block_start    : 0.000000
[912979.072598]   .se->statistics.sleep_max      : 0.000000
[912979.072649]   .se->statistics.block_max      : 0.000000
[912979.072700]   .se->statistics.exec_max       : 1.418038
[912979.072751]   .se->statistics.slice_max      : 0.039648
[912979.072802]   .se->statistics.wait_max       : 0.011396
[912979.072853]   .se->statistics.wait_sum       : 0.011396
[912979.072904]   .se->statistics.wait_count     : 87
[912979.072955]   .se->load.weight               : 2
[912979.073005]   .se->avg.runnable_avg_sum      : 29
[912979.073056]   .se->avg.runnable_avg_period   : 47385
[912979.073107]   .se->avg.load_avg_contrib      : 0
[912979.073157]   .se->avg.decay_count           : 870684685
[912979.073209]
[912979.073209] cfs_rq[0]:/
[912979.073301]   .exec_clock                    : 123499351.282840
[912979.073352]   .MIN_vruntime                  : 0.000001
[912979.073404]   .min_vruntime                  : 120879113.385357
[912979.073456]   .max_vruntime                  : 0.000001
[912979.073508]   .spread                        : 0.000000
[912979.073559]   .spread0                       : 0.000000
[912979.073610]   .nr_spread_over                : 899
[912979.073661]   .nr_running                    : 0
[912979.073711]   .load                          : 0
[912979.073762]   .runnable_load_avg             : 0
[912979.073812]   .blocked_load_avg              : 0
[912979.073861]   .tg_load_contrib               : 0
[912979.073912]   .tg_runnable_contrib           : 3
[912979.073963]   .tg_load_avg                   : 1112
[912979.074013]   .tg->runnable_avg              : 2090
[912979.074064]   .tg->cfs_bandwidth.timer_active: 0
[912979.074114]   .throttled                     : 0
[912979.074164]   .throttle_count                : 0
[912979.074215]   .avg->runnable_avg_sum         : 147
[912979.074265]   .avg->runnable_avg_period      : 47875
[912979.074317]
[912979.074317] rt_rq[0]:/
[912979.074409]   .rt_nr_running                 : 0
[912979.074460]   .rt_throttled                  : 0
[912979.074511]   .rt_time                       : 0.008767
[912979.074562]   .rt_runtime                    : 950.000000
[912979.074614]
[912979.074614] runnable tasks:
[912979.074614]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.074614] 
----------------------------------------------------------------------------------------------------------
[912979.074908]             init     1      1442.976599    182764   120  
     1442.976599      4970.908057 912971562.079844 /autogroup-2
[912979.075015]      ksoftirqd/0     3 120879101.450246   2773774   120 
120879101.450246     40699.948228 912767567.756265 /
[912979.075120]     kworker/0:0H     5       704.624192         5   100  
      704.624192         0.046383      1080.126254 /
[912979.075225]           rcu_bh     8       117.338636         2   120  
      117.338636         0.000907         0.000736 /
[912979.075329]      migration/0     9         0.000000     79441     0  
        0.000000     94289.738036         0.000939 /
[912979.075435]       watchdog/0    10       -11.784505    228375     0  
      -11.784505     18626.398234        93.358787 /
[912979.075540]    fsnotify_mark   573      1251.442326        15   120  
     1251.442326         0.277180     33700.437424 /
[912979.075662]      kworker/0:1   600 117631274.916693       481   120 
117631274.916693         3.528776 885071044.709135 /
[912979.075767]        scsi_eh_0   855       703.799083        18   120  
      703.799083         0.937602       352.420033 /
[912979.075871]        scsi_eh_1   857       703.864038        18   120  
      703.864038         1.040986       349.446698 /
[912979.075977]     kworker/0:1H   879 120879082.863400  81445041   100 
120879082.863400   2958135.905303 909589328.375092 /
[912979.076081]        md2_raid1   925 120879064.591392  18785880   120 
120879064.591392   1385509.864935 911497676.823212 /
[912979.076187]       jbd2/md2-8   940 120879042.325286   3868883   120 
120879042.325286    116631.787281 912832775.733876 /
[912979.076292]         rsyslogd  3079   3861172.009104       335   120  
  3861172.009104        19.617385 912940034.911841 /autogroup-6
[912979.076398]          vnstatd  3055      6435.362083    184313   120  
     6435.362083     16205.294171 912920653.881480 /autogroup-7
[912979.076504]              atd  3159         0.797141       255   120  
        0.797141         4.675411 910319644.880519 /autogroup-8
[912979.076610]            mdadm  3165        26.238387       983   120  
       26.238387       124.630846 912209267.355015 /autogroup-9
[912979.076717]            nginx  3428      8742.095210   1353920   120  
     8742.095210     16279.858318 912922182.809013 /autogroup-14
[912979.076823]            nginx  3429      8715.099678   1807456   120  
     8715.099678     21823.289825 911255010.187828 /autogroup-14
[912979.076929]          dovecot  4225       464.029513      1819   120  
      464.029513        95.516146 910857010.523253 /autogroup-17
[912979.077036]            getty  4380         0.037161        18   120  
        0.037161         1.226701       106.505480 /autogroup-22
[912979.077143]  btrfs-endio-wri  4704  21203401.069282         6   100  
21203401.069282         0.188994 141336214.234067 /
[912979.077248]      kworker/0:0 14473 120879101.539884   7149580   120 
120879101.539884   1503790.895268 208487953.565214 /
[912979.077353]           pickup 25685       889.860671        16   120  
      889.860671         2.904432    660392.168074 /autogroup-19
[912979.078689]             sshd 25689         4.817363       117   120  
        4.817363        11.371198    539229.489832 /autogroup-2671
[912979.078796]            iotop 25901       242.641616       350   120  
      242.641616      1127.641305    429237.471383 /autogroup-2674
[912979.078902]             htop 25910       201.989567       237   120  
      201.989567       784.683871    163646.060924 /autogroup-2676
[912979.079009]    kworker/u16:6 25921 120879083.401250       697   120 
120879083.401250      2993.947814    126051.426518 /
[912979.079114]    kworker/u17:5 25968 120838671.063663        12   100 
120838671.063663         0.460692         3.298837 /
[912979.079221]
[912979.079275] cpu#1, 3411.379 MHz
[912979.079332]   .nr_running                    : 2
[912979.079384]   .load                          : 2048
[912979.079438]   .nr_switches                   : 169743127
[912979.079491]   .nr_load_updates               : 54005477
[912979.079545]   .nr_uninterruptible            : 1661854
[912979.079605]   .next_balance                  : 4386.283389
[912979.079657]   .curr->pid                     : 25691
[912979.079716]   .clock                         : 912979079.220144
[912979.079768]   .cpu_load[0]                   : 283
[912979.079819]   .cpu_load[1]                   : 153
[912979.079870]   .cpu_load[2]                   : 80
[912979.079920]   .cpu_load[3]                   : 41
[912979.079971]   .cpu_load[4]                   : 21
[912979.080022]   .yld_count                     : 815801
[912979.080072]   .sched_count                   : 171240785
[912979.080124]   .sched_goidle                  : 55313040
[912979.080175]   .avg_idle                      : 1000000
[912979.080226]   .max_idle_balance_cost         : 500000
[912979.080278]   .ttwu_count                    : 75199620
[912979.080329]   .ttwu_local                    : 47229794
[912979.080380]
[912979.080380] cfs_rq[1]:/autogroup-2672
[912979.080474]   .exec_clock                    : 68.766961
[912979.080525]   .MIN_vruntime                  : 0.000001
[912979.080577]   .min_vruntime                  : 101.530915
[912979.080629]   .max_vruntime                  : 0.000001
[912979.080680]   .spread                        : 0.000000
[912979.080731]   .spread0                       : -120879011.854442
[912979.080783]   .nr_spread_over                : 0
[912979.080834]   .nr_running                    : 1
[912979.080884]   .load                          : 1024
[912979.080935]   .runnable_load_avg             : 280
[912979.080985]   .blocked_load_avg              : 0
[912979.081036]   .tg_load_contrib               : 280
[912979.081086]   .tg_runnable_contrib           : 285
[912979.081137]   .tg_load_avg                   : 280
[912979.081187]   .tg->runnable_avg              : 285
[912979.081238]   .tg->cfs_bandwidth.timer_active: 0
[912979.081287]   .throttled                     : 0
[912979.081338]   .throttle_count                : 0
[912979.081388]   .se->exec_start                : 912979079.220144
[912979.081440]   .se->vruntime                  : 275145564.677538
[912979.081492]   .se->sum_exec_runtime          : 68.766961
[912979.081543]   .se->statistics.wait_start     : 0.000000
[912979.081594]   .se->statistics.sleep_start    : 0.000000
[912979.081644]   .se->statistics.block_start    : 0.000000
[912979.081695]   .se->statistics.sleep_max      : 0.000000
[912979.081746]   .se->statistics.block_max      : 0.000000
[912979.081798]   .se->statistics.exec_max       : 13.558964
[912979.081848]   .se->statistics.slice_max      : 2.985393
[912979.081899]   .se->statistics.wait_max       : 0.009879
[912979.081950]   .se->statistics.wait_sum       : 0.026885
[912979.082001]   .se->statistics.wait_count     : 69
[912979.082052]   .se->load.weight               : 1024
[912979.082102]   .se->avg.runnable_avg_sum      : 13365
[912979.082153]   .se->avg.runnable_avg_period   : 47980
[912979.082204]   .se->avg.load_avg_contrib      : 283
[912979.082255]   .se->avg.decay_count           : 0
[912979.082305]
[912979.082305] cfs_rq[1]:/
[912979.082397]   .exec_clock                    : 286402909.854441
[912979.082449]   .MIN_vruntime                  : 275145546.211368
[912979.082502]   .min_vruntime                  : 275145557.563743
[912979.082554]   .max_vruntime                  : 275145546.211368
[912979.082606]   .spread                        : 0.000000
[912979.082658]   .spread0                       : 154266444.178386
[912979.082710]   .nr_spread_over                : 812
[912979.082760]   .nr_running                    : 2
[912979.082811]   .load                          : 2048
[912979.082861]   .runnable_load_avg             : 283
[912979.082912]   .blocked_load_avg              : 0
[912979.082962]   .tg_load_contrib               : 283
[912979.083013]   .tg_runnable_contrib           : 289
[912979.083063]   .tg_load_avg                   : 1356
[912979.083114]   .tg->runnable_avg              : 2342
[912979.083164]   .tg->cfs_bandwidth.timer_active: 0
[912979.083214]   .throttled                     : 0
[912979.083265]   .throttle_count                : 0
[912979.083315]   .avg->runnable_avg_sum         : 13118
[912979.083366]   .avg->runnable_avg_period      : 46332
[912979.083417]
[912979.083417] rt_rq[1]:/
[912979.083509]   .rt_nr_running                 : 0
[912979.083559]   .rt_throttled                  : 0
[912979.083609]   .rt_time                       : 0.000000
[912979.083660]   .rt_runtime                    : 950.000000
[912979.083712]
[912979.083712] runnable tasks:
[912979.083712]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.083712] 
----------------------------------------------------------------------------------------------------------
[912979.084006]       watchdog/1    11        -4.044370         3   120  
       -4.044370         0.005127         0.000948 /
[912979.084111]      migration/1    12         0.000000     64289     0  
        0.000000     81551.633973         0.000880 /
[912979.084215]      ksoftirqd/1    13 275145545.590468    272453   120 
275145545.590468      9049.618468 912823795.945262 /
[912979.084320]      kworker/1:0    14 268487677.088594        25   120 
268487677.088594         0.073843 890158629.032624 /
[912979.084424]     kworker/1:0H    15       292.784984         6   100  
      292.784984         0.034371      1174.761064 /
[912979.084529]       khungtaskd   201 275144402.934404      7614   120 
275144402.934404       416.952311 912957932.380625 /
[912979.084633]          kswapd0   502 274008133.542214     11379   120 
274008133.542214      4497.950184 909317147.417035 /
[912979.084737]          deferwq   727        23.191246         2   100  
       23.191246         0.011226         0.004570 /
[912979.084842]     kworker/1:1H   897 275143578.574135   1248547   100 
275143578.574135     10327.849008 912864614.443779 /
[912979.084946]            udevd  1127        29.410078       216   120  
       29.410078        37.086597   1475991.787847 /autogroup-4
[912979.085053]             ntpd  3336      5469.037929    916907   120  
     5469.037929     23741.992404 912915353.001973 /autogroup-12
[912979.085158]             sshd  3449         0.161283       103   120  
        0.161283         8.844398 912476711.092472 /autogroup-15
[912979.085265]            getty  4383         0.612958        96   120  
        0.612958         2.038719       107.995012 /autogroup-24
[912979.085371]             bash  4622      2762.602192       159   120  
     2762.602192        48.778186   2482145.460574 /autogroup-35
[912979.085477]            btrfs  4793    544711.533345   2433072   120  
   544711.533345   2311403.122680 906591091.757311 /autogroup-35
[912979.085584]      kworker/1:2 19065 275145546.211368   5366925   120 
275145546.211368   2459424.518362 122606255.191968 /
[912979.085721]    kworker/u16:1 25373 275115349.906651    110532   120 
275115349.906651    959695.014681   6225615.573687 /
[912979.085826]    kworker/u16:7 25546 275145546.776778    135645   120 
275145546.776778   1192109.886086   2457115.723296 /
[912979.085931] R           bash 25691       101.530915       107   120  
      101.530915        60.081376    535001.680171 /autogroup-2672
[912979.086038]             sshd 25743         2.112753       200   120  
        2.112753        22.093931    492201.992771 /autogroup-2673
[912979.086144]    kworker/u16:3 25919 275119113.595959      1711   120 
275119113.595959      9954.427931     57170.637468 /
[912979.086249]    kworker/u16:4 25920 275119113.620983       949   120 
275119113.620983      4193.701841     62074.304968 /
[912979.086353]    kworker/u16:9 25923 275129475.908640       924   120 
275129475.908640      4824.625594     74152.344320 /
[912979.086458]   kworker/u16:10 25924 275119113.342743      1017   120 
275119113.342743      6528.918601     50465.467883 /
[912979.086563]    kworker/u17:2 25965 275111274.543188        16   100 
275111274.543188         0.865367        14.115527 /
[912979.086670]
[912979.086724] cpu#2, 3411.379 MHz
[912979.086782]   .nr_running                    : 1
[912979.086834]   .load                          : 1024
[912979.086885]   .nr_switches                   : 161087963
[912979.086945]   .nr_load_updates               : 52060426
[912979.086997]   .nr_uninterruptible            : 1030559
[912979.087056]   .next_balance                  : 4386.283391
[912979.087107]   .curr->pid                     : 25922
[912979.087158]   .clock                         : 912979085.701520
[912979.087210]   .cpu_load[0]                   : 1023
[912979.087261]   .cpu_load[1]                   : 1023
[912979.087311]   .cpu_load[2]                   : 1023
[912979.087361]   .cpu_load[3]                   : 1023
[912979.087412]   .cpu_load[4]                   : 1023
[912979.087462]   .yld_count                     : 652418
[912979.087513]   .sched_count                   : 162161509
[912979.087564]   .sched_goidle                  : 52314275
[912979.087615]   .avg_idle                      : 612215
[912979.087666]   .max_idle_balance_cost         : 500000
[912979.087717]   .ttwu_count                    : 72879058
[912979.087768]   .ttwu_local                    : 44844744
[912979.087819]
[912979.087819] cfs_rq[2]:/
[912979.087911]   .exec_clock                    : 271247931.510377
[912979.089193]   .MIN_vruntime                  : 0.000001
[912979.089244]   .min_vruntime                  : 261281757.496336
[912979.089297]   .max_vruntime                  : 0.000001
[912979.089348]   .spread                        : 0.000000
[912979.089399]   .spread0                       : 140402644.110979
[912979.089451]   .nr_spread_over                : 787
[912979.089502]   .nr_running                    : 1
[912979.089552]   .load                          : 1024
[912979.089603]   .runnable_load_avg             : 1023
[912979.089654]   .blocked_load_avg              : 39
[912979.089704]   .tg_load_contrib               : 1058
[912979.089754]   .tg_runnable_contrib           : 1011
[912979.089804]   .tg_load_avg                   : 1451
[912979.089854]   .tg->runnable_avg              : 2456
[912979.089905]   .tg->cfs_bandwidth.timer_active: 0
[912979.089955]   .throttled                     : 0
[912979.090006]   .throttle_count                : 0
[912979.090056]   .avg->runnable_avg_sum         : 47775
[912979.090107]   .avg->runnable_avg_period      : 47775
[912979.090158]
[912979.090158] rt_rq[2]:/
[912979.090250]   .rt_nr_running                 : 0
[912979.090301]   .rt_throttled                  : 0
[912979.090351]   .rt_time                       : 0.000000
[912979.090402]   .rt_runtime                    : 950.000000
[912979.090454]
[912979.090454] runnable tasks:
[912979.090454]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.090454] 
----------------------------------------------------------------------------------------------------------
[912979.090748]         kthreadd     2 261237406.066326      4827   120 
261237406.066326       119.911688 912871657.395410 /
[912979.090853]       watchdog/2    16        -7.044760         3   120  
       -7.044760         0.004873         0.001867 /
[912979.090957]      migration/2    17         0.000000     62927     0  
        0.000000     87331.886093         0.000889 /
[912979.091062]      ksoftirqd/2    18 261279894.213366    232738   120 
261279894.213366      9205.862670 912836969.487694 /
[912979.091166]     kworker/2:0H    20      1186.583000         6   100  
     1186.583000         0.049105     36720.236432 /
[912979.091272]            khubd   352       735.961340        62   120  
      735.961340         3.794366     34873.341958 /
[912979.091377]        scsi_eh_5   865       294.741950         3   120  
      294.741950         0.094279         1.195837 /
[912979.091481]            udevd  1253        36.383350         1   120  
       36.383350         0.349195         0.000000 /autogroup-4
[912979.091588]     kworker/2:1H  2466 261154579.822841   1217898   100 
261154579.822841      7698.405912 912438827.428521 /
[912979.091692]         rsyslogd  3078   1361568.583283      8082   120  
  1361568.583283       278.723375 912918782.388604 /autogroup-6
[912979.091798]             cron  3318       527.499209     17807   120  
      527.499209       864.457775 912917777.953019 /autogroup-11
[912979.091905]            nginx  3427      7530.670910        11   120  
     7530.670910         1.438863 845802616.809370 /autogroup-14
[912979.092010]            getty  4381         0.550444       101   120  
        0.550444         1.894524       108.260507 /autogroup-23
[912979.092117]             qmgr  4385       662.119562      5737   120  
      662.119562       563.016676 912836471.747917 /autogroup-19
[912979.092223]    btrfs-cleaner  4713 261276427.726190     88520   120 
261276427.726190     46175.312340 911393872.179404 /
[912979.092328]      kworker/2:1 24696 261281745.576376    902624   120 
261281745.576376    401393.704640  22415088.310903 /
[912979.092432]    kworker/u16:5 25545 261237415.871889    119775   120 
261237415.871889   1060653.766316   2499879.159564 /
[912979.092537]      kworker/2:0 25737 261143674.398829        12   120 
261143674.398829         0.075679        90.512937 /
[912979.092642]             bash 25745         1.774182        57   120  
        1.774182        27.044681     59433.860152 /autogroup-2674
[912979.092749]             bash 25797       125.213121       104   120  
      125.213121        31.328467    296070.107601 /autogroup-2676
[912979.092856] R  kworker/u16:8 25922 261281757.496336      7833   120 
261281757.496336     66888.389010     62803.947413 /
[912979.092961]    kworker/u17:1 25936 261237406.051287         8   100 
261237406.051287         0.276398     22805.965907 /
[912979.093066]   kworker/u16:17 25967 261237421.924073        24   120 
261237421.924073        15.955485      2814.028396 /
[912979.093171]
[912979.093217] cpu#3, 3411.379 MHz
[912979.093265]   .nr_running                    : 0
[912979.093315]   .load                          : 0
[912979.093365]   .nr_switches                   : 155857077
[912979.093416]   .nr_load_updates               : 50842415
[912979.093467]   .nr_uninterruptible            : 572868
[912979.093518]   .next_balance                  : 4386.283384
[912979.093569]   .curr->pid                     : 0
[912979.093619]   .clock                         : 912979085.631719
[912979.093671]   .cpu_load[0]                   : 0
[912979.093721]   .cpu_load[1]                   : 0
[912979.093771]   .cpu_load[2]                   : 0
[912979.093821]   .cpu_load[3]                   : 0
[912979.093872]   .cpu_load[4]                   : 0
[912979.093922]   .yld_count                     : 539647
[912979.093973]   .sched_count                   : 156675888
[912979.094025]   .sched_goidle                  : 50861107
[912979.094076]   .avg_idle                      : 1000000
[912979.094127]   .max_idle_balance_cost         : 500000
[912979.094178]   .ttwu_count                    : 70168767
[912979.094229]   .ttwu_local                    : 43110297
[912979.094281]
[912979.094281] cfs_rq[3]:/
[912979.094373]   .exec_clock                    : 258813890.199969
[912979.094424]   .MIN_vruntime                  : 0.000001
[912979.094475]   .min_vruntime                  : 249777836.394830
[912979.094528]   .max_vruntime                  : 0.000001
[912979.094579]   .spread                        : 0.000000
[912979.094630]   .spread0                       : 128898723.009473
[912979.094683]   .nr_spread_over                : 710
[912979.094734]   .nr_running                    : 0
[912979.094784]   .load                          : 0
[912979.094834]   .runnable_load_avg             : 0
[912979.094885]   .blocked_load_avg              : 0
[912979.094935]   .tg_load_contrib               : 0
[912979.094985]   .tg_runnable_contrib           : 0
[912979.095035]   .tg_load_avg                   : 1451
[912979.095086]   .tg->runnable_avg              : 2456
[912979.095136]   .tg->cfs_bandwidth.timer_active: 0
[912979.095186]   .throttled                     : 0
[912979.095237]   .throttle_count                : 0
[912979.095287]   .avg->runnable_avg_sum         : 12
[912979.095338]   .avg->runnable_avg_period      : 47841
[912979.095389]
[912979.095389] rt_rq[3]:/
[912979.095481]   .rt_nr_running                 : 0
[912979.095532]   .rt_throttled                  : 0
[912979.095582]   .rt_time                       : 0.000000
[912979.095649]   .rt_runtime                    : 950.000000
[912979.095710]
[912979.095710] runnable tasks:
[912979.095710]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.095710] 
----------------------------------------------------------------------------------------------------------
[912979.096024]       watchdog/3    21        -7.044474         3   120  
       -7.044474         0.005278         0.001572 /
[912979.096128]      migration/3    22         0.000000     62875     0  
        0.000000     86643.021266         0.000894 /
[912979.096233]      ksoftirqd/3    23 249777505.585429    211305   120 
249777505.585429      9262.490697 912840221.020644 /
[912979.096337]      kworker/3:0    24 242884588.306110        20   120 
242884588.306110         0.043153 884589608.629792 /
[912979.096442]     kworker/3:0H    25      1375.522741         6   100  
     1375.522741         0.031575     39029.180311 /
[912979.096547]      kworker/3:1   606 249777825.422551  36311384   120 
249777825.422551  15997163.195058 896809394.721102 /
[912979.096653]     kworker/3:1H  3166 249768145.314170   1231490   100 
249768145.314170      5499.103364 912882138.510763 /
[912979.096757]            nginx  3430      8357.302718    887796   120  
     8357.302718     10878.157162 912927715.123733 /autogroup-14
[912979.096864]            nginx  3431      8357.281768   1441416   120  
     8357.281768     18119.184373 912920263.466853 /autogroup-14
[912979.096969]              log  4278      1009.514780       721   120  
     1009.514780        32.210457 910856714.130981 /autogroup-17
[912979.097076]            getty  4379         0.744904        62   120  
        0.744904         2.024537       107.666970 /autogroup-21
[912979.097182]             tmux  4621     30898.295871   2720146   120  
    30898.295871    125970.617268 911355335.261179 /autogroup-34
[912979.097289]  btrfs-transacti  4714 249768155.237773  62080256   120 
249768155.237773 491918009.273985 384893945.745894 /
[912979.097394]    kworker/u17:0 25132 249726067.531953       131   100 
249726067.531953         4.623987  14288907.161099 /
[912979.097498]       ssl-params 25668      1021.291893         4   120  
     1021.291893         1.071507         0.162468 /autogroup-17
[912979.097606]    kworker/u17:3 25966 249726077.748339         6   100 
249726077.748339         0.166575        14.012508 /
[912979.097713]
[912979.097767] cpu#4, 3411.379 MHz
[912979.097823]   .nr_running                    : 0
[912979.097875]   .load                          : 0
[912979.097927]   .nr_switches                   : 97989748
[912979.097988]   .nr_load_updates               : 29437738
[912979.098039]   .nr_uninterruptible            : 99645
[912979.098090]   .next_balance                  : 4386.283321
[912979.098150]   .curr->pid                     : 0
[912979.098200]   .clock                         : 912979095.633281
[912979.098252]   .cpu_load[0]                   : 0
[912979.098302]   .cpu_load[1]                   : 0
[912979.098352]   .cpu_load[2]                   : 0
[912979.098403]   .cpu_load[3]                   : 0
[912979.098453]   .cpu_load[4]                   : 0
[912979.099726]   .yld_count                     : 289502
[912979.099776]   .sched_count                   : 99329529
[912979.099827]   .sched_goidle                  : 24596751
[912979.099878]   .avg_idle                      : 1000000
[912979.099929]   .max_idle_balance_cost         : 500000
[912979.099979]   .ttwu_count                    : 40527541
[912979.100030]   .ttwu_local                    : 28095271
[912979.100081]
[912979.100081] cfs_rq[4]:/
[912979.100172]   .exec_clock                    : 240550243.715826
[912979.100225]   .MIN_vruntime                  : 0.000001
[912979.100276]   .min_vruntime                  : 232547480.727422
[912979.100328]   .max_vruntime                  : 0.000001
[912979.100379]   .spread                        : 0.000000
[912979.100431]   .spread0                       : 111668367.342065
[912979.100483]   .nr_spread_over                : 680
[912979.100533]   .nr_running                    : 0
[912979.100583]   .load                          : 0
[912979.100634]   .runnable_load_avg             : 0
[912979.100684]   .blocked_load_avg              : 0
[912979.100734]   .tg_load_contrib               : 0
[912979.100784]   .tg_runnable_contrib           : 0
[912979.100834]   .tg_load_avg                   : 1591
[912979.100884]   .tg->runnable_avg              : 2592
[912979.100935]   .tg->cfs_bandwidth.timer_active: 0
[912979.100985]   .throttled                     : 0
[912979.101035]   .throttle_count                : 0
[912979.101086]   .avg->runnable_avg_sum         : 6
[912979.101136]   .avg->runnable_avg_period      : 47942
[912979.101186]
[912979.101186] rt_rq[4]:/
[912979.101279]   .rt_nr_running                 : 0
[912979.101329]   .rt_throttled                  : 0
[912979.101379]   .rt_time                       : 0.000000
[912979.101431]   .rt_runtime                    : 950.000000
[912979.101483]
[912979.101483] runnable tasks:
[912979.101483]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.101483] 
----------------------------------------------------------------------------------------------------------
[912979.101777]        rcu_sched     7 232547476.981614  39735742   120 
232547476.981614   1073301.745321 911867300.274149 /
[912979.101881]       watchdog/4    26       -10.043788         3   120  
      -10.043788         0.005889         0.001724 /
[912979.101986]      migration/4    27         0.000000     52897     0  
        0.000000     30928.519290         0.000864 /
[912979.102090]      ksoftirqd/4    28 232547379.187231    177607   120 
232547379.187231      1064.877952 912828854.891450 /
[912979.102195]      kworker/4:0    29 225129625.190891        21   120 
225129625.190891         0.052462 885018589.851740 /
[912979.102300]     kworker/4:0H    30       401.555147         6   100  
      401.555147         0.023250      1113.250079 /
[912979.102405]        writeback   202        10.955626         2   100  
       10.955626         0.004714         0.001304 /
[912979.102510]           crypto   205        22.958809         2   100  
       22.958809         0.003514         0.001582 /
[912979.102615]          ata_sff   853       221.595354         2   100  
      221.595354         0.054486         0.007891 /
[912979.102720]        scsi_eh_3   861       341.343202         3   120  
      341.343202         0.081789         1.417215 /
[912979.102824]      kworker/4:2   895 232547469.098718  25902640   120 
232547469.098718  14994809.177138 897903166.791145 /
[912979.102929]     kworker/4:1H   896 232524737.232736    641223   100 
232524737.232736      7788.068600 912787058.060253 /
[912979.103033]           bioset   907       461.767329         2   100  
      461.767329         0.024292         0.004808 /
[912979.103138]        md0_raid1   908 232423972.410927     68708   120 
232423972.410927     14264.555610 912231051.031317 /
[912979.103243]           bioset   916       493.916287         2   100  
      493.916287         0.008234         0.006637 /
[912979.103348]           bioset   924       530.083314         2   100  
      530.083314         0.033173         0.005301 /
[912979.103453]  ext4-rsv-conver   941       590.416989         2   100  
      590.416989         0.032333         0.008235 /
[912979.103557]         rsyslogd  3044    708180.065334        36   120  
   708180.065334         3.020832 845805199.558834 /autogroup-6
[912979.103666]   kworker/u16:11 25959 232528438.791148       129   120 
232528438.791148        25.448642     10350.709475 /
[912979.103770]   kworker/u16:12 25960 232528444.421842        13   120 
232528444.421842        25.248681      6835.941648 /
[912979.103875]
[912979.103921] cpu#5, 3411.379 MHz
[912979.103969]   .nr_running                    : 0
[912979.104019]   .load                          : 0
[912979.104069]   .nr_switches                   : 60686752
[912979.104120]   .nr_load_updates               : 16590839
[912979.104171]   .nr_uninterruptible            : 35116
[912979.104221]   .next_balance                  : 4386.283371
[912979.104273]   .curr->pid                     : 0
[912979.104323]   .clock                         : 912978955.699686
[912979.104375]   .cpu_load[0]                   : 0
[912979.104425]   .cpu_load[1]                   : 0
[912979.104476]   .cpu_load[2]                   : 0
[912979.104526]   .cpu_load[3]                   : 0
[912979.104577]   .cpu_load[4]                   : 0
[912979.104627]   .yld_count                     : 180533
[912979.104678]   .sched_count                   : 61115461
[912979.104729]   .sched_goidle                  : 18413446
[912979.104780]   .avg_idle                      : 1000000
[912979.104831]   .max_idle_balance_cost         : 500000
[912979.104883]   .ttwu_count                    : 26014119
[912979.104934]   .ttwu_local                    : 14558862
[912979.104985]
[912979.104985] cfs_rq[5]:/
[912979.105078]   .exec_clock                    : 112549009.355412
[912979.105129]   .MIN_vruntime                  : 0.000001
[912979.105180]   .min_vruntime                  : 112340424.333297
[912979.105233]   .max_vruntime                  : 0.000001
[912979.105284]   .spread                        : 0.000000
[912979.105335]   .spread0                       : -8538689.052060
[912979.105388]   .nr_spread_over                : 776
[912979.105438]   .nr_running                    : 0
[912979.105489]   .load                          : 0
[912979.105539]   .runnable_load_avg             : 0
[912979.105590]   .blocked_load_avg              : 0
[912979.105665]   .tg_load_contrib               : 0
[912979.105715]   .tg_runnable_contrib           : 0
[912979.105765]   .tg_load_avg                   : 1566
[912979.105816]   .tg->runnable_avg              : 2588
[912979.105867]   .tg->cfs_bandwidth.timer_active: 0
[912979.105917]   .throttled                     : 0
[912979.105967]   .throttle_count                : 0
[912979.106018]   .avg->runnable_avg_sum         : 9
[912979.106068]   .avg->runnable_avg_period      : 48296
[912979.106120]
[912979.106120] rt_rq[5]:/
[912979.106212]   .rt_nr_running                 : 0
[912979.106262]   .rt_throttled                  : 0
[912979.106312]   .rt_time                       : 0.000000
[912979.106364]   .rt_runtime                    : 950.000000
[912979.106416]
[912979.106416] runnable tasks:
[912979.106416]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.106416] 
----------------------------------------------------------------------------------------------------------
[912979.106711]       watchdog/5    31       -10.044212         3   120  
      -10.044212         0.005558         0.002261 /
[912979.106815]      migration/5    32         0.000000     39981     0  
        0.000000     22420.347279         0.000857 /
[912979.106920]      ksoftirqd/5    33 112340386.830138     88288   120 
112340386.830138       902.891990 912915985.970305 /
[912979.107024]      kworker/5:0    34  66161108.436113        17   120  
66161108.436113         0.040609 531545833.648619 /
[912979.107129]     kworker/5:0H    35      2637.765099         6   100  
     2637.765099         0.016356     34887.843068 /
[912979.107234]           bioset   206        10.954992         2   100  
       10.954992         0.004215         0.001385 /
[912979.107338]          kblockd   207        22.958046         2   100  
       22.958046         0.003486         0.001405 /
[912979.107443]               md   358        46.967085         2   100  
       46.967085         0.004262         0.001718 /
[912979.107549]        md1_raid1   917 112224651.731228      2576   120 
112224651.731228       438.643648 912246492.124230 /
[912979.107653]     kworker/5:1H  1535 112279235.381603    705637   100 
112279235.381603      2860.139708 912488051.526814 /
[912979.107757]        kjournald  2221      2802.174368         2   120  
     2802.174368         0.172461         0.003504 /
[912979.107862]           bioset  2227      2814.208072         2   100  
     2814.208072         0.035120         0.024287 /
[912979.107966]            acpid  3231         0.818794        84   120  
        0.818794         3.306390 846335772.890227 /autogroup-10
[912979.108073]    ipv6_addrconf  3363      3115.100877         2   100  
     3115.100877         0.025822         0.024520 /
[912979.108178]      dbus-daemon  3401         0.953565         1   120  
        0.953565         0.195728         0.000000 /autogroup-13
[912979.108284]            getty  4382         0.887602       105   120  
        0.887602         1.971108       107.901308 /autogroup-25
[912979.108392]      kworker/5:2  4904 112340412.387203   5220912   120 
112340412.387203   2883754.936751 378528258.548013 /
[912979.108497]    kworker/u16:0 25534 112334884.843224     46521   120 
112334884.843224    407619.457750   3292541.847410 /
[912979.108602]             sshd 25795        15.048918       272   120  
       15.048918        34.315217    463191.101842 /autogroup-2675
[912979.108709]    kworker/u16:2 25918 112334885.079192       252   120 
112334885.079192        16.946254     67813.385939 /
[912979.108815]   kworker/u16:15 25963 112333963.192420       419   120 
112333963.192420        14.102201     27660.580866 /
[912979.108920]   kworker/u16:16 25964 112340412.451440      3169   120 
112340412.451440     26688.738693     76979.038506 /
[912979.110257]
[912979.110311] cpu#6, 3411.379 MHz
[912979.110367]   .nr_running                    : 0
[912979.110419]   .load                          : 0
[912979.110468]   .nr_switches                   : 60049743
[912979.110530]   .nr_load_updates               : 16356167
[912979.110581]   .nr_uninterruptible            : 35160
[912979.110632]   .next_balance                  : 4386.283021
[912979.110692]   .curr->pid                     : 0
[912979.110742]   .clock                         : 912978955.696449
[912979.110794]   .cpu_load[0]                   : 0
[912979.110844]   .cpu_load[1]                   : 0
[912979.110893]   .cpu_load[2]                   : 0
[912979.110944]   .cpu_load[3]                   : 0
[912979.110994]   .cpu_load[4]                   : 0
[912979.111045]   .yld_count                     : 182103
[912979.111095]   .sched_count                   : 60472214
[912979.111147]   .sched_goidle                  : 18472620
[912979.111198]   .avg_idle                      : 1000000
[912979.111249]   .max_idle_balance_cost         : 500000
[912979.111300]   .ttwu_count                    : 28463751
[912979.111351]   .ttwu_local                    : 14148770
[912979.111402]
[912979.111402] cfs_rq[6]:/
[912979.111495]   .exec_clock                    : 108531033.446718
[912979.111547]   .MIN_vruntime                  : 0.000001
[912979.111598]   .min_vruntime                  : 108832402.721814
[912979.111649]   .max_vruntime                  : 0.000001
[912979.111700]   .spread                        : 0.000000
[912979.111752]   .spread0                       : -12046710.663543
[912979.111804]   .nr_spread_over                : 738
[912979.111854]   .nr_running                    : 0
[912979.111904]   .load                          : 0
[912979.111954]   .runnable_load_avg             : 0
[912979.112004]   .blocked_load_avg              : 0
[912979.112054]   .tg_load_contrib               : 0
[912979.112105]   .tg_runnable_contrib           : 1023
[912979.112156]   .tg_load_avg                   : 1704
[912979.112207]   .tg->runnable_avg              : 2710
[912979.112258]   .tg->cfs_bandwidth.timer_active: 0
[912979.112308]   .throttled                     : 0
[912979.112358]   .throttle_count                : 0
[912979.112408]   .avg->runnable_avg_sum         : 48032
[912979.112458]   .avg->runnable_avg_period      : 48032
[912979.112509]
[912979.112509] rt_rq[6]:/
[912979.112601]   .rt_nr_running                 : 0
[912979.112650]   .rt_throttled                  : 0
[912979.112701]   .rt_time                       : 0.000000
[912979.112752]   .rt_runtime                    : 950.000000
[912979.112803]
[912979.112803] runnable tasks:
[912979.112803]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.112803] 
----------------------------------------------------------------------------------------------------------
[912979.113096]       watchdog/6    36       -10.044693         3   120  
      -10.044693         0.005081         0.001609 /
[912979.113201]      migration/6    37         0.000000     37276     0  
        0.000000     20532.517090         0.000899 /
[912979.113305]      ksoftirqd/6    38 108832139.243786     84205   120 
108832139.243786       828.593684 912904581.835227 /
[912979.113409]      kworker/6:0    39  75613767.546629        16   120  
75613767.546629         0.039713 643748961.557535 /
[912979.113513]     kworker/6:0H    40      3739.808388         6   100  
     3739.808388         0.060525     40424.938273 /
[912979.113618]  acpi_thermal_pm   616      1868.134944         2   100  
     1868.134944         0.006186         0.002757 /
[912979.113723]            udevd  1252       323.595660         4   120  
      323.595660         0.967985   1475586.237158 /autogroup-4
[912979.113830]     kworker/6:1H  4101 108794319.824306    698571   100 
108794319.824306      2632.092232 912466706.055504 /
[912979.113934]            anvil  4277       873.900669       544   120  
      873.900669        23.579194 910855769.729803 /autogroup-17
[912979.114040]            getty  4378         2.434546       104   120  
        2.434546         4.057877       106.857510 /autogroup-20
[912979.114147]      kworker/6:2 10896 108832390.870370   3705038   120 
108832390.870370   2015140.511741 267196695.854675 /
[912979.114251]   kworker/u16:13 25961 108823829.294127       201   120 
108823829.294127        25.493814      6828.575308 /
[912979.114356]
[912979.114402] cpu#7, 3411.379 MHz
[912979.114451]   .nr_running                    : 0
[912979.114501]   .load                          : 0
[912979.114551]   .nr_switches                   : 59555676
[912979.114602]   .nr_load_updates               : 15987201
[912979.114653]   .nr_uninterruptible            : 34477
[912979.114703]   .next_balance                  : 4386.283385
[912979.114755]   .curr->pid                     : 0
[912979.114805]   .clock                         : 912979110.628700
[912979.114856]   .cpu_load[0]                   : 0
[912979.114907]   .cpu_load[1]                   : 0
[912979.114957]   .cpu_load[2]                   : 0
[912979.115007]   .cpu_load[3]                   : 0
[912979.115058]   .cpu_load[4]                   : 0
[912979.115108]   .yld_count                     : 177334
[912979.115159]   .sched_count                   : 59962825
[912979.115211]   .sched_goidle                  : 18826005
[912979.115262]   .avg_idle                      : 1000000
[912979.115313]   .max_idle_balance_cost         : 500000
[912979.115364]   .ttwu_count                    : 28048259
[912979.115415]   .ttwu_local                    : 13666465
[912979.115467]
[912979.115467] cfs_rq[7]:/autogroup-6
[912979.115560]   .exec_clock                    : 365777.289449
[912979.115622]   .MIN_vruntime                  : 0.000001
[912979.115690]   .min_vruntime                  : 356931.628448
[912979.115742]   .max_vruntime                  : 0.000001
[912979.115793]   .spread                        : 0.000000
[912979.115844]   .spread0                       : -120522181.756909
[912979.115896]   .nr_spread_over                : 201
[912979.115946]   .nr_running                    : 0
[912979.115997]   .load                          : 0
[912979.116047]   .runnable_load_avg             : 0
[912979.116097]   .blocked_load_avg              : 12
[912979.116148]   .tg_load_contrib               : 12
[912979.116198]   .tg_runnable_contrib           : 18
[912979.116247]   .tg_load_avg                   : 21
[912979.116298]   .tg->runnable_avg              : 32
[912979.116348]   .tg->cfs_bandwidth.timer_active: 0
[912979.116398]   .throttled                     : 0
[912979.116448]   .throttle_count                : 0
[912979.116499]   .se->exec_start                : 912979110.627684
[912979.116551]   .se->vruntime                  : 103316251.044128
[912979.116603]   .se->sum_exec_runtime          : 365782.801926
[912979.116655]   .se->statistics.wait_start     : 0.000000
[912979.116706]   .se->statistics.sleep_start    : 0.000000
[912979.116757]   .se->statistics.block_start    : 0.000000
[912979.116807]   .se->statistics.sleep_max      : 0.000000
[912979.116858]   .se->statistics.block_max      : 0.000000
[912979.116910]   .se->statistics.exec_max       : 15.287497
[912979.116961]   .se->statistics.slice_max      : 9.991647
[912979.117012]   .se->statistics.wait_max       : 20.496483
[912979.117063]   .se->statistics.wait_sum       : 3694.519174
[912979.117114]   .se->statistics.wait_count     : 818307
[912979.117165]   .se->load.weight               : 2
[912979.117215]   .se->avg.runnable_avg_sum      : 864
[912979.117266]   .se->avg.runnable_avg_period   : 46751
[912979.117317]   .se->avg.load_avg_contrib      : 15
[912979.117367]   .se->avg.decay_count           : 870684730
[912979.117418]
[912979.117418] cfs_rq[7]:/
[912979.117511]   .exec_clock                    : 102605411.107675
[912979.117562]   .MIN_vruntime                  : 0.000001
[912979.117614]   .min_vruntime                  : 103316261.560904
[912979.117666]   .max_vruntime                  : 0.000001
[912979.117717]   .spread                        : 0.000000
[912979.117769]   .spread0                       : -17562851.824453
[912979.117821]   .nr_spread_over                : 834
[912979.117871]   .nr_running                    : 0
[912979.117921]   .load                          : 0
[912979.117972]   .runnable_load_avg             : 0
[912979.118022]   .blocked_load_avg              : 0
[912979.118072]   .tg_load_contrib               : 0
[912979.118122]   .tg_runnable_contrib           : 19
[912979.118173]   .tg_load_avg                   : 1679
[912979.118224]   .tg->runnable_avg              : 2706
[912979.118274]   .tg->cfs_bandwidth.timer_active: 0
[912979.118325]   .throttled                     : 0
[912979.118375]   .throttle_count                : 0
[912979.118425]   .avg->runnable_avg_sum         : 897
[912979.118476]   .avg->runnable_avg_period      : 47291
[912979.118527]
[912979.118527] rt_rq[7]:/
[912979.118620]   .rt_nr_running                 : 0
[912979.118670]   .rt_throttled                  : 0
[912979.118721]   .rt_time                       : 0.000000
[912979.118772]   .rt_runtime                    : 950.000000
[912979.118824]
[912979.118824] runnable tasks:
[912979.118824]             task   PID         tree-key  switches  prio  
    exec-runtime         sum-exec        sum-sleep
[912979.118824] 
----------------------------------------------------------------------------------------------------------
[912979.119119]       watchdog/7    41       -10.045045         3   120  
      -10.045045         0.004808         0.001532 /
[912979.119224]      migration/7    42         0.000000     36792     0  
        0.000000     20126.155679         0.000854 /
[912979.119329]      ksoftirqd/7    43 103316249.568830     81327   120 
103316249.568830       774.603306 912926058.496427 /
[912979.120664]      kworker/7:0    44  47090310.248653        16   120  
47090310.248653         0.128903 434429936.382521 /
[912979.120769]     kworker/7:0H    45      4851.525691         6   100  
     4851.525691         0.029924     34165.166196 /
[912979.120874]          khelper    46        10.956929         2   100  
       10.956929         0.006162         0.001989 /
[912979.120979]            netns    47        22.960187         2   100  
       22.960187         0.003588         0.001842 /
[912979.121084]    irq/28-mei_me   663         0.000000        14    49  
        0.000000         0.467508         0.000000 /
[912979.121189]        kpsmoused   682      3645.306968         2   100  
     3645.306968         0.033012         0.007425 /
[912979.121293]       scsi_tmf_0   856      4451.954811         2   100  
     4451.954811         0.009155         0.003823 /
[912979.121398]       scsi_tmf_1   858      4463.965463         2   100  
     4463.965463         0.013995         0.004815 /
[912979.121502]        scsi_eh_2   859      4524.055744         3   120  
     4524.055744         0.038579         1.530789 /
[912979.121607]       scsi_tmf_2   860      4487.982184         2   100  
     4487.982184         0.008967         0.005525 /
[912979.121712]       scsi_tmf_3   862      4499.993379         2   100  
     4499.993379         0.011948         0.005102 /
[912979.121817]        scsi_eh_4   863      4524.057233         3   120  
     4524.057233         0.041540         1.325310 /
[912979.121922]       scsi_tmf_4   864      4524.015332         2   100  
     4524.015332         0.012859         0.004181 /
[912979.122026]       scsi_tmf_5   866      4536.025438         2   100  
     4536.025438         0.011072         0.006043 /
[912979.122131]     kworker/7:1H  1280 103286251.400108    688398   100 
103286251.400108      2454.691791 912481485.268755 /
[912979.122236]    rs:main Q:Reg  3077    356931.628448      7847   120  
   356931.628448       859.764447 912939232.144327 /autogroup-6
[912979.122343]           master  4354       554.129516     43341   120  
      554.129516      2028.044789 912879488.890427 /autogroup-19
[912979.122449]     btrfs-worker  4691      5989.004416         2   100  
     5989.004416         0.041977         0.004016 /
[912979.122553]  btrfs-worker-hi  4693      6013.105480         2   100  
     6013.105480         0.052131         0.002841 /
[912979.122658]   btrfs-delalloc  4694      6025.150939         2   100  
     6025.150939         0.046807         0.002861 /
[912979.122764]  btrfs-flush_del  4695      6037.193893         2   100  
     6037.193893         0.044335         0.003252 /
[912979.122868]      btrfs-cache  4696      6049.237247         2   100  
     6049.237247         0.044729         0.002826 /
[912979.122972]     btrfs-submit  4697      6061.280488         2   100  
     6061.280488         0.044660         0.002709 /
[912979.123077]      btrfs-fixup  4698      6073.289654         2   100  
     6073.289654         0.009914         0.002789 /
[912979.123181]      btrfs-endio  4699  14914387.229554         4   100  
14914387.229554         0.040802 141335502.714150 /
[912979.123286]  btrfs-endio-met  4700      6097.310484         2   100  
     6097.310484         0.012027         0.003048 /
[912979.123390]  btrfs-endio-met  4701      6109.352065         2   100  
     6109.352065         0.042989         0.002727 /
[912979.123495]  btrfs-endio-rai  4702      6121.394067         2   100  
     6121.394067         0.043385         0.002729 /
[912979.123600]        btrfs-rmw  4703      6133.415559         2   100  
     6133.415559         0.022201         0.002938 /
[912979.123705]  btrfs-freespace  4705      6157.446031         2   100  
     6157.446031         0.009791         0.003402 /
[912979.123810]  btrfs-delayed-m  4706      6169.454602         2   100  
     6169.454602         0.009254         0.003302 /
[912979.123914]  btrfs-readahead  4707      6181.495529         2   100  
     6181.495529         0.041628         0.003029 /
[912979.124019]  btrfs-qgroup-re  4708      6193.519464         2   100  
     6193.519464         0.024648         0.002956 /
[912979.124124]  btrfs-extent-re  4709      6205.526825         2   100  
     6205.526825         0.008097         0.002968 /
[912979.124229]      kworker/7:1 31378 103316249.746160   6369777   120 
103316249.746160   3405409.629733 475117858.774212 /
[912979.124334]    kworker/u17:4 25396 103303273.647273        79   100 
103303273.647273         2.725773   7167326.179379 /
[912979.124438]   kworker/u16:14 25962 103307736.104678      1653   120 
103307736.104678        51.500169     34247.616388 /
[912979.124545]



-- 
Tomasz Chmielewski
http://www.sslrack.com


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-11 21:25   ` Chris Murphy
@ 2014-09-11 23:51     ` Duncan
  2014-09-12  2:10       ` Chris Murphy
  0 siblings, 1 reply; 11+ messages in thread
From: Duncan @ 2014-09-11 23:51 UTC (permalink / raw)
  To: linux-btrfs

Chris Murphy posted on Thu, 11 Sep 2014 15:25:51 -0600 as excerpted:

> On Sep 11, 2014, at 1:31 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>> 
>> I wouldn't try defragging now, but it might be worthwhile to stop the
>> device delete (rebooting to do so since I don't think there's a cancel)
> 
> 'btrfs replace cancel' does exist, although I haven't tried it.

Btrfs replace cancel exists, yes, but does it work for btrfs device 
delete, which is what the OP used?

> Something isn't right though, because it's clearly neither reading nor
> writing at anywhere close to 1/2 the drive read throughput. I'm curious
> what 'iotop -d30 -o' shows (during the replace, before cancel), which
> should be pretty consistent by averaging 30 seconds worth of io. And
> then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet
> there's a lot of nothing going on, with occasional spikes, rather than a
> constant trickle.
> 
> And then the question is to find out what btrfs is thinking about while
> nothing is reading or writing. Even though it's not 5000+ snapshots, I
> wonder if the balance code (and hence btrfs replace) makes extensive use
> of fiemap that's causing this to go catatonic.
> http://comments.gmane.org/gmane.comp.file-systems.btrfs/35724

Not sure (some of that stuff's beyond me), but one thing we /do/ know is 
that btrfs has so far been focused mostly on features and debugging, not 
on optimization beyond the worst-cases, which themselves remain a big 
enough problem, tho it's slowly getting better.


-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-11 23:51     ` Duncan
@ 2014-09-12  2:10       ` Chris Murphy
  2014-09-12  5:19         ` Russell Coker
                           ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Chris Murphy @ 2014-09-12  2:10 UTC (permalink / raw)
  To: Btrfs BTRFS


On Sep 11, 2014, at 5:51 PM, Duncan <1i5t5.duncan@cox.net> wrote:

> Chris Murphy posted on Thu, 11 Sep 2014 15:25:51 -0600 as excerpted:
> 
>> On Sep 11, 2014, at 1:31 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>>> 
>>> I wouldn't try defragging now, but it might be worthwhile to stop the
>>> device delete (rebooting to do so since I don't think there's a cancel)
>> 
>> 'btrfs replace cancel' does exist, although I haven't tried it.
> 
> Btrfs replace cancel exists, yes, but does it work for btrfs device 
> delete, which is what the OP used?

Oops, right! I'm not sure what can do this safely.

And then when I think about just creating a new fs and using btrfs send/receive, the snapshots need to be read-only first. So if there's any uncertainty about safely canceling the 'device delete', those ro snapshots need to be taken first, in case only an ro mount is possible afterwards. There's also some uncertainty about how long creating 260+ ro snapshots will take (it should be fast, but who knows) and how much worse that makes the current situation. Still, it's probably worth the risk to take the snapshots and just wait a while before trying something like umount, or sysrq+s followed by sysrq+u.
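
Per subvolume that would be something along these lines (the paths here are just placeholders, and I haven't tried this on a volume mid-delete):

# btrfs subvolume snapshot -r /home/some-subvol /home/some-subvol.ro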



> 
>> Something isn't right though, because it's clearly neither reading nor
>> writing at anywhere close to 1/2 the drive read throughput. I'm curious
>> what 'iotop -d30 -o' shows (during the replace, before cancel), which
>> should be pretty consistent by averaging 30 seconds worth of io. And
>> then try 'iotop -d3 -o' and see if there are spikes. I'm willing to bet
>> there's a lot of nothing going on, with occasional spikes, rather than a
>> constant trickle.
>> 
>> And then the question is to find out what btrfs is thinking about while
>> nothing is reading or writing. Even though it's not 5000+ snapshots, I
>> wonder if the balance code (and hence btrfs replace) makes extensive use
>> of fiemap that's causing this to go catatonic.
>> http://comments.gmane.org/gmane.comp.file-systems.btrfs/35724
> 
> Not sure (some of that stuff's beyond me), but one thing we /do/ know is 
> that btrfs has so far been focused mostly on features and debugging, not 
> on optimization beyond the worst-cases, which themselves remain a big 
> enough problem, tho it's slowly getting better.

Sure. But what's the next step? Given that 260+ snapshots might mean well more than 350GB of data, depending on how deduplicated the fs is, it would still probably be faster to rsync this to a pile of drives in linear/concat+XFS than to wait a month (?) for the device delete to finish.
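
Roughly, and assuming the pile of drives is already assembled and mounted at /mnt/backup (the mountpoint is a placeholder and the flags are only a starting point):

# rsync -aHAX --numeric-ids /home/ /mnt/backup/home/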

Alternatively, script some way to create 260+ ro snapshots and btrfs send/receive them to a new btrfs volume; then turn it into raid1 later.
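
A very rough sketch of that, assuming the new volume is mounted at /mnt/new and the subvolumes sit directly under /home (the names are made up; the real ones would come from 'btrfs subvolume list'):

for sub in subvol-a subvol-b; do
    # read-only snapshot is required before send
    btrfs subvolume snapshot -r /home/$sub /home/$sub.ro
    btrfs send /home/$sub.ro | btrfs receive /mnt/new/
done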

I'm curious if a sysrq+s followed by sysrq+u might leave the filesystem in a state where it could still be rw mountable. But I'd be skeptical of anything interrupting the device delete unless fully prepared for the fs to end up toast for rw mounting. If only an ro mount is possible, any chance of creating ro snapshots is out.


Chris Murphy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-12  2:10       ` Chris Murphy
@ 2014-09-12  5:19         ` Russell Coker
  2014-09-12  5:33           ` Duncan
  2014-09-12  6:27           ` Chris Murphy
  2014-09-12  5:41         ` Duncan
  2014-09-12  5:59         ` Duncan
  2 siblings, 2 replies; 11+ messages in thread
From: Russell Coker @ 2014-09-12  5:19 UTC (permalink / raw)
  To: Chris Murphy, Btrfs BTRFS

It would be nice if a file system mounted ro counted as ro snapshots for btrfs send.

When a file system is so messed up it can't be mounted rw it should be regarded as ro for all operations.
-- 
Sent from my Samsung Galaxy Note 2 with K-9 Mail.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-12  5:19         ` Russell Coker
@ 2014-09-12  5:33           ` Duncan
  2014-09-12  6:27           ` Chris Murphy
  1 sibling, 0 replies; 11+ messages in thread
From: Duncan @ 2014-09-12  5:33 UTC (permalink / raw)
  To: linux-btrfs

Russell Coker posted on Fri, 12 Sep 2014 15:19:04 +1000 as excerpted:

> It would be nice if a file system mounted ro counted as ro snapshots for
> btrfs send.
> 
> When a file system is so messed up it can't be mounted rw it should be
> regarded as ro for all operations.

Indeed, and that has been suggested before, but unfortunately it's not 
current behavior.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-12  2:10       ` Chris Murphy
  2014-09-12  5:19         ` Russell Coker
@ 2014-09-12  5:41         ` Duncan
  2014-09-12  5:59         ` Duncan
  2 siblings, 0 replies; 11+ messages in thread
From: Duncan @ 2014-09-12  5:41 UTC (permalink / raw)
  To: linux-btrfs

Chris Murphy posted on Thu, 11 Sep 2014 20:10:26 -0600 as excerpted:

> And then when I think about just creating a new fs, using btrfs
> send/receive, the snapshots need to be ro first.

FWIW, at this point I'd forget about send/receive and create the backup 
(assuming one doesn't exist already) using more conventional methods, at 
least if no initial send/receive has been done yet, since it would be 
copying off all the data anyway.  Perhaps mount selected snapshots and 
back them up too (after the current state is backed up), but throw away 
most of the snapshots.

Of course if there's an existing relatively current sent/received base to 
build on, and no indication that send/receive is broken, definitely try 
that first as the amount of data to sync in that case should be MUCH 
lower, but if not...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-12  2:10       ` Chris Murphy
  2014-09-12  5:19         ` Russell Coker
  2014-09-12  5:41         ` Duncan
@ 2014-09-12  5:59         ` Duncan
  2 siblings, 0 replies; 11+ messages in thread
From: Duncan @ 2014-09-12  5:59 UTC (permalink / raw)
  To: linux-btrfs

Chris Murphy posted on Thu, 11 Sep 2014 20:10:26 -0600 as excerpted:

> Sure. But what's the next step? Given 260+ snapshots might mean well
> more than 350GB of data, depending on how deduplicated the fs is, it
> still probably would be faster to rsync this to a pile of drives in
> linear/concat+XFS than wait a month (?) for device delete to finish.

That was what I was getting at in my other just-finished short reply.  It 
may be time to give up on the btrfs-specific solutions for the moment and 
go with tried and tested traditional solutions (tho I'd definitely *NOT* 
try rsync or the like with the delete still going; we know from other 
reports that rsync places its own stresses on btrfs, and one major 
stressor at a time, the delete-triggered rebalance, is bad enough).

> Alternatively, script some way to create 260+ ro snapshots to btrfs
> send/receive to a new btrfs volume; and turn it into a raid1 later.

No confirmation yet but I strongly suspect most of those subs are 
snapshots.  Assuming that's the case, it's very likely most of them can 
simply be eliminated as I originally suggested, a process that /should/ 
be fast, decomplexifying the situation dramatically.
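
For the record, dropping one is a single command per snapshot (the path 
here is made up):

# btrfs subvolume delete /home/snapshots/2014-09-01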

> I'm curious if a sysrq+s followed by sysrq+u might leave the filessystem
> in a state where it could still be rw mountable. But I'm skeptical of
> anything interrupting the device delete before being fully prepared for
> the fs to be toast for rw mount. If only ro mount is possible, any
> chance of creating ro snapshots is out.

In theory, barring bugs, interrupting the delete (a normal shutdown to 
the extent possible, then sysrq+s, then sysrq+u) should not be a 
problem.  The delete is basically a balance, going chunk by chunk, and 
either a given chunk has been duplicated to the new device or it 
hasn't.  In either case, the existing chunk on the remaining old device 
shouldn't be affected.
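
If the magic sysrq interface is enabled (see /proc/sys/kernel/sysrq), the 
same sequence can also be triggered from a shell, tho this is just the 
generic kernel mechanism, nothing btrfs-specific:

# echo s > /proc/sysrq-trigger    # emergency sync
# echo u > /proc/sysrq-trigger    # emergency remount read-only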

So rebooting in that way in order to stop the delete temporarily 
/should/ have no bad effects.  Of course, that's barring bugs.  Btrfs is 
still not fully stabilized, and bugs do happen, so anything's possible.  
But I'd consider it safe enough to try here, certainly if I had backups, 
which are still STRONGLY recommended for btrfs at this point, much more 
so than under the routine sysadmin "if it's not backed up, by definition 
it's not valuable to you" rule.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: how long should "btrfs device delete missing ..."  take?
  2014-09-12  5:19         ` Russell Coker
  2014-09-12  5:33           ` Duncan
@ 2014-09-12  6:27           ` Chris Murphy
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Murphy @ 2014-09-12  6:27 UTC (permalink / raw)
  To: Btrfs BTRFS


On Sep 11, 2014, at 11:19 PM, Russell Coker <russell@coker.com.au> wrote:

> It would be nice if a file system mounted ro counted as ro snapshots for btrfs send.
> 
> When a file system is so messed up it can't be mounted rw it should be regarded as ro for all operations.

Yes, it's come up before, and there's a question of whether mount -o ro is reliably ro enough for this. Maybe a force option?

But then another wishlist item is a recursive btrfs send to go along with the above. I might want them all, or I might want all of the ones under two particular subvolumes, etc. One could even combine the recursive ro snapshot and recursive send into a btrfs rescue option that would work even when the volume is mounted read-only.


Chris Murphy

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-09-12  6:27 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-11 15:22 how long should "btrfs device delete missing ..." take? Tomasz Chmielewski
2014-09-11 19:31 ` Duncan
2014-09-11 21:25   ` Chris Murphy
2014-09-11 23:51     ` Duncan
2014-09-12  2:10       ` Chris Murphy
2014-09-12  5:19         ` Russell Coker
2014-09-12  5:33           ` Duncan
2014-09-12  6:27           ` Chris Murphy
2014-09-12  5:41         ` Duncan
2014-09-12  5:59         ` Duncan
2014-09-11 23:06 ` Tomasz Chmielewski
