* Abysmal Performance
@ 2011-06-20 21:51 Henning Rohlfs
  2011-06-21  0:12 ` Josef Bacik
                   ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Henning Rohlfs @ 2011-06-20 21:51 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3052 bytes --]

 Hello,

 I've migrated my system to btrfs (raid1) a few months ago. Since then 
 the performance has been pretty bad, but recently it's gotten 
 unbearable: a simple sync called while the system is idle can take 20 up 
 to 60 seconds. Creating or deleting files often has several seconds 
 latency, too.

 One curious - but maybe unrelated - observation is that even though I'm 
 using a raid1 btrfs setup, the hdds are often being written to 
 sequentially. One hard-drive sees some write activity and after it 
 subsides, the other drive sees some activity. (See attached 
 sequential-writes.txt.)

 - 64bit gentoo with vanilla 2.6.39 kernel
 - lzo compression enabled
 - 2x WD1000FYPS (1TB WD hdds)
 - Athlon x2 2.2GHz with 8GB RAM
 - space_cache was enabled, but it seemed to make the problem worse. 
 It's no longer in the mount options.

 Any help is appreciated. Thanks,
 Henning




 server ~ # sync; time sync
 real	0m28.869s
 user	0m0.000s
 sys	0m5.750s



 server ~ # uname -a
 Linux server 2.6.39 #3 SMP Sat May 28 17:25:31 CEST 2011 x86_64 AMD 
 Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux



 server ~ # mount | grep btrfs
 /dev/sdb2 on / type btrfs (rw,noatime,compress=lzo,noacl)
 /dev/sda2 on /mnt/pool type btrfs (rw,noatime,subvolid=0,compress=lzo)
 /dev/sda2 on /usr/portage type btrfs 
 (rw,noatime,subvol=newportage,compress=lzo)
 /dev/sda2 on /home type btrfs (rw,noatime,subvol=home,compress=lzo)
 /dev/sda2 on /home/mythtv type btrfs 
 (rw,noatime,subvol=mythtv,compress=lzo)



 server ~ # btrfs fi show
 Label: none  uuid: 7676eb78-e411-4505-ac51-ccd12aa5a6b6
 	Total devices 2 FS bytes used 281.58GB
 	devid    1 size 931.28GB used 898.26GB path /dev/sda2
 	devid    3 size 931.27GB used 898.26GB path /dev/sdb2

 Btrfs v0.19-35-g1b444cd-dirty



 server ~ # btrfs fi df /
 Data, RAID1: total=875.00GB, used=279.30GB
 System, RAID1: total=8.00MB, used=140.00KB
 System: total=4.00MB, used=0.00
 Metadata, RAID1: total=23.25GB, used=2.28GB



 bonnie++

 Version  1.96       ------Sequential Output------ --Sequential Input- 
 --Random-
 Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- 
 --Seeks--
 Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  
 /sec %CP
 server          16G   147  90 76321  18 31787  16  1370  71 64812  14  
 27.0  66
 Latency             66485us    7581ms    4455ms   25011us     695ms     
 959ms
 Version  1.96       ------Sequential Create------ --------Random 
 Create--------
 server              -Create-- --Read--- -Delete-- -Create-- --Read--- 
 -Delete--
               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  
 /sec %CP
                  16   238  51 +++++ +++   219  51   284  52 +++++ +++   
 390  57
 Latency              1914ms     524us    3461ms    1141ms      39us    
 1308ms
 1.96,1.96,server,1,1308618030,16G,,147,90,76321,18,31787,16,1370,71,64812,14,27.0,66,16,,,,,238,51,+++++,+++,219,51,284,52,+++++,+++,390,57,66485us,7581ms,4455ms,25011us,695ms,959ms,1914ms,524us,3461ms,1141ms,39us,1308ms

[-- Attachment #2: sequential-writes.txt --]
[-- Type: text/plain, Size: 2015 bytes --]

server ~ # iostat -m 5

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.00    0.00   45.20    5.20    0.00   46.60

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0.00         0.00         0.00          0          0
sdb              15.20         0.06         0.00          0          0
md0               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.30    0.00   37.46   42.36    0.00   15.88

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              45.00         0.00         0.38          0          1
sdb             467.60         0.02         2.06          0         10
md0               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.31    0.00   19.34   58.82    0.00   17.54

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               8.80         0.00         0.04          0          0
sdb             649.80         0.02         2.67          0         13
md0               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.20    0.00   63.24   31.97    0.00    1.60

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda             585.80         0.00         2.36          0         11
sdb              20.80         0.08         0.16          0          0
md0               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.30    0.00   42.60   39.30    0.00   14.80

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda             514.20         0.00         2.29          0         11
sdb              59.20         0.10         0.17          0          0
md0               0.00         0.00         0.00          0          0



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-20 21:51 Abysmal Performance Henning Rohlfs
@ 2011-06-21  0:12 ` Josef Bacik
  2011-06-21  7:10   ` Henning Rohlfs
  2011-06-21  8:00 ` Sander
  2011-06-21 15:24 ` Calvin Walton
  2 siblings, 1 reply; 37+ messages in thread
From: Josef Bacik @ 2011-06-21  0:12 UTC (permalink / raw)
  To: Henning Rohlfs; +Cc: linux-btrfs

On 06/20/2011 05:51 PM, Henning Rohlfs wrote:
> Hello,
>
> I've migrated my system to btrfs (raid1) a few months ago. Since then
> the performance has been pretty bad, but recently it's gotten
> unbearable: a simple sync called while the system is idle can take 20 up
> to 60 seconds. Creating or deleting files often has several seconds
> latency, too.
>
> One curious - but maybe unrelated - observation is that even though I'm
> using a raid1 btrfs setup, the hdds are often being written to
> sequentially. One hard-drive sees some write activity and after it
> subsides, the other drive sees some activity. (See attached
> sequential-writes.txt.)
>

Can you do sysrq+w while this is happening so we can see who is doing 
the writing?  Thanks,

Josef
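
[Not part of the original mail: for readers unfamiliar with the request,
a blocked-task dump can be captured from a root shell roughly like this:

  # echo w > /proc/sysrq-trigger    # dump tasks stuck in uninterruptible sleep
  # dmesg | tail                    # the traces appear in the kernel log

(or alt-sysrq-w on the console). Output of this kind is what Henning
attaches in the reply below.]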

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-21  0:12 ` Josef Bacik
@ 2011-06-21  7:10   ` Henning Rohlfs
  0 siblings, 0 replies; 37+ messages in thread
From: Henning Rohlfs @ 2011-06-21  7:10 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1021 bytes --]

 On Mon, 20 Jun 2011 20:12:16 -0400, Josef Bacik wrote:
> On 06/20/2011 05:51 PM, Henning Rohlfs wrote:
>> Hello,
>>
>> I've migrated my system to btrfs (raid1) a few months ago. Since 
>> then
>> the performance has been pretty bad, but recently it's gotten
>> unbearable: a simple sync called while the system is idle can take 
>> 20 up
>> to 60 seconds. Creating or deleting files often has several seconds
>> latency, too.
>>
>> One curious - but maybe unrelated - observation is that even though 
>> I'm
>> using a raid1 btrfs setup, the hdds are often being written to
>> sequentially. One hard-drive sees some write activity and after it
>> subsides, the other drive sees some activity. (See attached
>> sequential-writes.txt.)
>>
>
> Can you do sysrq+w while this is happening so we can see who is doing
> the writing?  Thanks,
>
> Josef

 When I call sync, it starts with several seconds of 100% (one core) cpu 
 usage by sync itself. Afterwards btrfs-submit-0 and sync are blocked. 
 sysrq+w output is attached.

[-- Attachment #2: blocked-sync.txt --]
[-- Type: text/plain, Size: 17019 bytes --]

SysRq : Show Blocked State
  task                        PC stack   pid father
btrfs-submit-0  D ffff88021c8bbf00     0   851      2 0x00000000
 ffff880245b07b20 0000000000000046 0000000000000048 ffff880245b06010
 ffff880246ce2b80 0000000000011340 ffff880245b07fd8 0000000000004000
 ffff880245b07fd8 0000000000011340 ffff88021f2d95c0 ffff880246ce2b80
Call Trace:
 [<ffffffff81333f7a>] ? put_device+0x12/0x14
 [<ffffffff81343af6>] ? scsi_request_fn+0x341/0x40d
 [<ffffffff812a4643>] ? __blk_run_queue+0x16/0x18
 [<ffffffff814fdc2f>] io_schedule+0x51/0x66
 [<ffffffff812a6bd9>] get_request_wait+0xa1/0x12f
 [<ffffffff81050c85>] ? wake_up_bit+0x25/0x25
 [<ffffffff812a3585>] ? elv_merge+0x99/0xa9
 [<ffffffff812a6dee>] __make_request+0x187/0x27f
 [<ffffffff812a540f>] generic_make_request+0x229/0x2a4
 [<ffffffff812a5553>] submit_bio+0xc9/0xe8
 [<ffffffff81267b24>] run_scheduled_bios+0x296/0x415
 [<ffffffff81044692>] ? del_timer+0x83/0x83
 [<ffffffff81267cb3>] pending_bios_fn+0x10/0x12
 [<ffffffff8126d591>] worker_loop+0x189/0x4b7
 [<ffffffff8126d408>] ? btrfs_queue_worker+0x263/0x263
 [<ffffffff8126d408>] ? btrfs_queue_worker+0x263/0x263
 [<ffffffff810508bc>] kthread+0x7d/0x85
 [<ffffffff81500c94>] kernel_thread_helper+0x4/0x10
 [<ffffffff8105083f>] ? kthread_worker_fn+0x13a/0x13a
 [<ffffffff81500c90>] ? gs_change+0xb/0xb
sync            D 0000000102884001     0 22091  19401 0x00000000
 ffff88019f0edc98 0000000000000086 ffff880100000000 ffff88019f0ec010
 ffff880227808000 0000000000011340 ffff88019f0edfd8 0000000000004000
 ffff88019f0edfd8 0000000000011340 ffffffff8181f020 ffff880227808000
Call Trace:
 [<ffffffff810b189b>] ? add_partial+0x1b/0x64
 [<ffffffff810b36d6>] ? kmem_cache_free+0x8e/0x93
 [<ffffffff812623ad>] ? free_extent_state+0x43/0x47
 [<ffffffff81086c2b>] ? __lock_page+0x68/0x68
 [<ffffffff814fdc2f>] io_schedule+0x51/0x66
 [<ffffffff81086c34>] sleep_on_page+0x9/0xd
 [<ffffffff814fe32c>] __wait_on_bit+0x43/0x76
 [<ffffffff81086df3>] wait_on_page_bit+0x6d/0x74
 [<ffffffff81050cb9>] ? autoremove_wake_function+0x34/0x34
 [<ffffffff81086b0e>] ? find_get_page+0x19/0x66
 [<ffffffff81247d60>] btrfs_wait_marked_extents+0xeb/0x124
 [<ffffffff81247ee6>] btrfs_write_and_wait_marked_extents+0x2a/0x3c
 [<ffffffff81247f3a>] btrfs_write_and_wait_transaction+0x42/0x44
 [<ffffffff81248676>] btrfs_commit_transaction+0x53c/0x650
 [<ffffffff81050c85>] ? wake_up_bit+0x25/0x25
 [<ffffffff810db775>] ? __sync_filesystem+0x75/0x75
 [<ffffffff8122a33a>] btrfs_sync_fs+0x66/0x6b
 [<ffffffff810db761>] __sync_filesystem+0x61/0x75
 [<ffffffff810db786>] sync_one_sb+0x11/0x13
 [<ffffffff810bc89c>] iterate_supers+0x67/0xbd
 [<ffffffff810db7c8>] sys_sync+0x40/0x57
 [<ffffffff814fff7b>] system_call_fastpath+0x16/0x1b
Sched Debug Version: v0.10, 2.6.39 #3
ktime                                   : 141912370.788052
sched_clk                               : 141831001.710073
cpu_clk                                 : 141912370.788875
jiffies                                 : 4337451011
sched_clock_stable                      : 0

sysctl_sched
  .sysctl_sched_latency                    : 12.000000
  .sysctl_sched_min_granularity            : 1.500000
  .sysctl_sched_wakeup_granularity         : 2.000000
  .sysctl_sched_child_runs_first           : 0
  .sysctl_sched_features                   : 7279
  .sysctl_sched_tunable_scaling            : 1 (logaritmic)

cpu#0, 2009.138 MHz
  .nr_running                    : 0
  .load                          : 0
  .nr_switches                   : 145331671
  .nr_load_updates               : 14210439
  .nr_uninterruptible            : 0
  .next_balance                  : 4337.451012
  .curr->pid                     : 0
  .clock                         : 141912370.762325
  .cpu_load[0]                   : 0
  .cpu_load[1]                   : 0
  .cpu_load[2]                   : 14
  .cpu_load[3]                   : 87
  .cpu_load[4]                   : 139
  .yld_count                     : 20406719
  .sched_switch                  : 0
  .sched_count                   : 167876926
  .sched_goidle                  : 51210495
  .avg_idle                      : 835252
  .ttwu_count                    : 78324179
  .ttwu_local                    : 51824984
  .bkl_count                     : 0

cfs_rq[0]:/autogroup-49
  .exec_clock                    : 422553.128760
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 321772.360035
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -44285292.572040
  .nr_spread_over                : 229
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 167.025784
  .load_period                   : 6.022728
  .load_contrib                  : 27
  .load_tg                       : 27
  .se->exec_start                : 141912368.437919
  .se->vruntime                  : 44607059.310813
  .se->sum_exec_runtime          : 422561.882460
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 51.388441
  .se->statistics.slice_max      : 25.949119
  .se->statistics.wait_max       : 65.392792
  .se->statistics.wait_sum       : 156059.730060
  .se->statistics.wait_count     : 4204371
  .se->load.weight               : 2

cfs_rq[0]:/autogroup-40
  .exec_clock                    : 605168.668669
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 605161.611298
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -44001903.381029
  .nr_spread_over                : 3
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 39.999996
  .load_period                   : 5.926694
  .load_contrib                  : 6
  .load_tg                       : 180
  .se->exec_start                : 141912331.697676
  .se->vruntime                  : 44607058.841296
  .se->sum_exec_runtime          : 605171.006071
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 41.311985
  .se->statistics.slice_max      : 64.763488
  .se->statistics.wait_max       : 277.272092
  .se->statistics.wait_sum       : 85941.008888
  .se->statistics.wait_count     : 3857989
  .se->load.weight               : 2

cfs_rq[0]:/autogroup-6606
  .exec_clock                    : 390924.295583
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 390385.678251
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -44216679.314076
  .nr_spread_over                : 21
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 36.783281
  .load_period                   : 6.071262
  .load_contrib                  : 6
  .load_tg                       : 13
  .se->exec_start                : 141912362.779247
  .se->vruntime                  : 44607059.020711
  .se->sum_exec_runtime          : 390925.810660
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 6.906599
  .se->statistics.slice_max      : 57.161522
  .se->statistics.wait_max       : 32.987270
  .se->statistics.wait_sum       : 55137.427273
  .se->statistics.wait_count     : 294878
  .se->load.weight               : 2

cfs_rq[0]:/autogroup-67
  .exec_clock                    : 13569548.916161
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 7304002.576262
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -37303062.475986
  .nr_spread_over                : 2495
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 149.399974
  .load_period                   : 6.421008
  .load_contrib                  : 23
  .load_tg                       : 49
  .se->exec_start                : 141912370.118364
  .se->vruntime                  : 44607060.001860
  .se->sum_exec_runtime          : 13569663.320444
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 51.770648
  .se->statistics.slice_max      : 66.102982
  .se->statistics.wait_max       : 148.774325
  .se->statistics.wait_sum       : 2325737.897221
  .se->statistics.wait_count     : 57199515
  .se->load.weight               : 2

cfs_rq[0]:/
  .exec_clock                    : 42518959.323607
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 44607065.052248
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : 0.000000
  .nr_spread_over                : 7056
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 0.000000
  .load_period                   : 0.000000
  .load_contrib                  : 0
  .load_tg                       : 0

rt_rq[0]:
  .rt_nr_running                 : 0
  .rt_throttled                  : 0
  .rt_time                       : 0.000000
  .rt_runtime                    : 950.000000

runnable tasks:
            task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
----------------------------------------------------------------------------------------------------------

cpu#1, 2009.138 MHz
  .nr_running                    : 1
  .load                          : 1017
  .nr_switches                   : 134029296
  .nr_load_updates               : 15663195
  .nr_uninterruptible            : 1
  .next_balance                  : 4337.451017
  .curr->pid                     : 0
  .clock                         : 141912370.751092
  .cpu_load[0]                   : 0
  .cpu_load[1]                   : 8
  .cpu_load[2]                   : 65
  .cpu_load[3]                   : 131
  .cpu_load[4]                   : 142
  .yld_count                     : 45556538
  .sched_switch                  : 0
  .sched_count                   : 190136659
  .sched_goidle                  : 45498985
  .avg_idle                      : 956897
  .ttwu_count                    : 73382490
  .ttwu_local                    : 60102512
  .bkl_count                     : 0

cfs_rq[1]:/autogroup-40
  .exec_clock                    : 590188.619254
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 590194.777093
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -44016870.275155
  .nr_spread_over                : 0
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 1279.999872
  .load_period                   : 7.354108
  .load_contrib                  : 174
  .load_tg                       : 180
  .se->exec_start                : 141912354.700492
  .se->vruntime                  : 47687190.448099
  .se->sum_exec_runtime          : 590189.718382
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 37.702102
  .se->statistics.slice_max      : 67.857220
  .se->statistics.wait_max       : 450.337317
  .se->statistics.wait_sum       : 107254.369568
  .se->statistics.wait_count     : 3731680
  .se->load.weight               : 2

cfs_rq[1]:/autogroup-8532
  .exec_clock                    : 13132.462638
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 14920.315501
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -44592144.736747
  .nr_spread_over                : 979
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 185.203956
  .load_period                   : 7.531595
  .load_contrib                  : 24
  .load_tg                       : 24
  .se->exec_start                : 141912336.944095
  .se->vruntime                  : 47687195.840843
  .se->sum_exec_runtime          : 13132.517825
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 5.525199
  .se->statistics.slice_max      : 15.355378
  .se->statistics.wait_max       : 22.093189
  .se->statistics.wait_sum       : 1246.745792
  .se->statistics.wait_count     : 10884
  .se->load.weight               : 2

cfs_rq[1]:/autogroup-6606
  .exec_clock                    : 220454.371902
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 220154.461555
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -44386916.681120
  .nr_spread_over                : 38
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 919.191514
  .load_period                   : 9.421373
  .load_contrib                  : 7
  .load_tg                       : 16
  .se->exec_start                : 141912325.856540
  .se->vruntime                  : 47687190.843682
  .se->sum_exec_runtime          : 220455.300850
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 11.798679
  .se->statistics.slice_max      : 216.375331
  .se->statistics.wait_max       : 29.371565
  .se->statistics.wait_sum       : 53833.944831
  .se->statistics.wait_count     : 243307
  .se->load.weight               : 2

cfs_rq[1]:/autogroup-67
  .exec_clock                    : 16088979.991476
  .MIN_vruntime                  : 0.000001
  .min_vruntime                  : 8686788.465245
  .max_vruntime                  : 0.000001
  .spread                        : 0.000000
  .spread0                       : -35920282.677430
  .nr_spread_over                : 2599
  .nr_running                    : 0
  .load                          : 0
  .load_avg                      : 192.884281
  .load_period                   : 7.306848
  .load_contrib                  : 26
  .load_tg                       : 47
  .se->exec_start                : 141912370.095909
  .se->vruntime                  : 47687190.909224
  .se->sum_exec_runtime          : 16089050.165780
  .se->statistics.wait_start     : 0.000000
  .se->statistics.sleep_start    : 0.000000
  .se->statistics.block_start    : 0.000000
  .se->statistics.sleep_max      : 0.000000
  .se->statistics.block_max      : 0.000000
  .se->statistics.exec_max       : 191.854446
  .se->statistics.slice_max      : 246.895824
  .se->statistics.wait_max       : 172.819165
  .se->statistics.wait_sum       : 2727601.602106
  .se->statistics.wait_count     : 53656471
  .se->load.weight               : 2

cfs_rq[1]:/
  .exec_clock                    : 47837439.429324
  .MIN_vruntime                  : 47687190.843682
  .min_vruntime                  : 47687196.843682
  .max_vruntime                  : 47687190.843682
  .spread                        : 0.000000
  .spread0                       : 3080125.701007
  .nr_spread_over                : 7511
  .nr_running                    : 1
  .load                          : 1018
  .load_avg                      : 0.000000
  .load_period                   : 0.000000
  .load_contrib                  : 0
  .load_tg                       : 0

rt_rq[1]:
  .rt_nr_running                 : 0
  .rt_throttled                  : 0
  .rt_time                       : 0.000000
  .rt_runtime                    : 950.000000

runnable tasks:
            task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
----------------------------------------------------------------------------------------------------------
               X  2530    590194.777093   7559895   120    590194.777093   1195228.445892 140477131.913151 /autogroup-40


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-20 21:51 Abysmal Performance Henning Rohlfs
  2011-06-21  0:12 ` Josef Bacik
@ 2011-06-21  8:00 ` Sander
  2011-06-21  9:26   ` Henning Rohlfs
  2011-06-21 15:24 ` Calvin Walton
  2 siblings, 1 reply; 37+ messages in thread
From: Sander @ 2011-06-21  8:00 UTC (permalink / raw)
  To: Henning Rohlfs; +Cc: linux-btrfs

Henning Rohlfs wrote (ao):
> - space_cache was enabled, but it seemed to make the problem worse.
> It's no longer in the mount options.

space_cache is a one time mount option which enabled space_cache. Not
supplying it anymore as a mount option has no effect (dmesg | grep btrfs).
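
[A sketch of the one-time semantics described above, not from the
original mail; device and mount point are taken from the earlier report
purely for illustration:

  # mount -o space_cache /dev/sda2 /mnt/pool   # first mount writes the cache to disk
  # umount /mnt/pool
  # mount /dev/sda2 /mnt/pool                  # later mounts keep using it automatically

Whether the cache is in use can be checked in the kernel log, e.g.
dmesg | grep btrfs, as Sander notes.]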

	Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-21  8:00 ` Sander
@ 2011-06-21  9:26   ` Henning Rohlfs
  2011-06-21 15:18     ` Josef Bacik
  0 siblings, 1 reply; 37+ messages in thread
From: Henning Rohlfs @ 2011-06-21  9:26 UTC (permalink / raw)
  To: linux-btrfs

 On Tue, 21 Jun 2011 10:00:59 +0200, Sander wrote:
> Henning Rohlfs wrote (ao):
>> - space_cache was enabled, but it seemed to make the problem worse.
>> It's no longer in the mount options.
>
> space_cache is a one time mount option which enabled space_cache. Not
> supplying it anymore as a mount option has no effect (dmesg | grep 
> btrfs).

 I'm sure that after the first reboot after removing the flag from the 
 mount options, the system was faster for a while. That must have been a 
 coincidence (or just an error on my part).

 Anyway, I rebooted with clear_cache as mount option and there was no 
 improvement either.


 Thanks for pointing this out,
 Henning

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-21  9:26   ` Henning Rohlfs
@ 2011-06-21 15:18     ` Josef Bacik
  2011-06-21 16:55       ` Henning Rohlfs
  0 siblings, 1 reply; 37+ messages in thread
From: Josef Bacik @ 2011-06-21 15:18 UTC (permalink / raw)
  To: Henning Rohlfs; +Cc: linux-btrfs

On 06/21/2011 05:26 AM, Henning Rohlfs wrote:
> On Tue, 21 Jun 2011 10:00:59 +0200, Sander wrote:
>> Henning Rohlfs wrote (ao):
>>> - space_cache was enabled, but it seemed to make the problem worse.
>>> It's no longer in the mount options.
>>
>> space_cache is a one time mount option which enabled space_cache. Not
>> supplying it anymore as a mount option has no effect (dmesg | grep
>> btrfs).
> 
> I'm sure that after the first reboot after removing the flag from the
> mount options, the system was faster for a while. That must have been a
> coincidence (or just an error on my part).
> 

No, the space cache will make your system faster _after_ having been
enabled once.  The reason for this is because we have to build the cache
the slow way at first, and then after that we can do it the fast way.
What is probably happening is your box is slowing down trying to build
this cache.  Don't mount with clear_cache unless there is a bug in your
cache.  Let it do its thing and stuff will get faster.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-20 21:51 Abysmal Performance Henning Rohlfs
  2011-06-21  0:12 ` Josef Bacik
  2011-06-21  8:00 ` Sander
@ 2011-06-21 15:24 ` Calvin Walton
  2011-06-22 14:15   ` Henning Rohlfs
  2 siblings, 1 reply; 37+ messages in thread
From: Calvin Walton @ 2011-06-21 15:24 UTC (permalink / raw)
  To: Henning Rohlfs; +Cc: linux-btrfs

On Mon, 2011-06-20 at 23:51 +0200, Henning Rohlfs wrote:
> Hello,
> 
>  I've migrated my system to btrfs (raid1) a few months ago. Since then 
>  the performance has been pretty bad, but recently it's gotten 
>  unbearable: a simple sync called while the system is idle can take 20 up 
>  to 60 seconds. Creating or deleting files often has several seconds 
>  latency, too.

I think I've been seeing a fairly similar, or possibly the same? issue
as well. It looks like it's actually a regression introduced in 2.6.39 -
if I switch back to a 2.6.38 kernel, my latency issues magically go
away! (I'm curious: does using the older 2.6.38.x kernel help with
anyone else that's seeing the issue?)

Some hardware/configuration details:
btrfs on a single disc (Seagate Momentus XT hybrid), lzo compression and
space cache enabled. Some snapshots in use.

I notice that in latencytop I'm seeing a lot of lines with (cropped)
traces like

sleep_on_page wait_on_page_bit read_extent_buffer_ 13.3 msec          0.5 %

showing up that I didn't see with the 2.6.38 kernel. I occasionally see
latencies as bad as 20-30 seconds on operations like fsync or
synchronous writes.
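
[A general aside, not from the original mail: latencytop is normally run
as root with no arguments and needs CONFIG_LATENCYTOP=y in the kernel:

  # latencytop    # interactive view of where tasks spend time blocked

which is how traces like the sleep_on_page line above are gathered.]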

I think I can reproduce the issue well enough to bisect it, so I might
give that a try. It'll be slow going, though.

-- 
Calvin Walton <calvin.walton@kepstin.ca>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-21 15:18     ` Josef Bacik
@ 2011-06-21 16:55       ` Henning Rohlfs
  0 siblings, 0 replies; 37+ messages in thread
From: Henning Rohlfs @ 2011-06-21 16:55 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Henning Rohlfs, linux-btrfs

 On Tue, 21 Jun 2011 11:18:30 -0400, Josef Bacik wrote:
> On 06/21/2011 05:26 AM, Henning Rohlfs wrote:
>> On Tue, 21 Jun 2011 10:00:59 +0200, Sander wrote:
>>> Henning Rohlfs wrote (ao):
>>>> - space_cache was enabled, but it seemed to make the problem 
>>>> worse.
>>>> It's no longer in the mount options.
>>>
>>> space_cache is a one time mount option which enabled space_cache. 
>>> Not
>>> supplying it anymore as a mount option has no effect (dmesg | grep
>>> btrfs).
>>
>> I'm sure that after the first reboot after removing the flag from 
>> the
>> mount options, the system was faster for a while. That must have 
>> been a
>> coincidence (or just an error on my part).
>>
>
> No, the space cache will make your system faster _after_ having been
> enabled once. The reason for this is because we have to build the 
> cache
> the slow way at first, and then after that we can do it the fast way.
> What is probably happening is your box is slowing down trying to 
> build
> this cache.  Don't mount with clear_cache unless there is a bug in 
> your
> cache.  Let it do its thing and stuff will get faster.

 I'm just reporting what I experienced. I had space_cache in the mount 
 options while the problem developed and removed it when the system got 
 too slow. After the next reboot the system was responsive for a short 
 time (an hour maybe - which seems to have been unrelated to the mount 
 option though from what you described). Now there's no difference 
 whatsoever between no options, space_cache and clear_cache.

 To sum it up: I only played with the clear_cache option because the 
 system got too slow in the first place. I don't see how the problem can 
 be related to this option if changing it makes no difference.

 Thanks,
 Henning

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-21 15:24 ` Calvin Walton
@ 2011-06-22 14:15   ` Henning Rohlfs
  2011-06-22 15:39     ` Josef Bacik
  0 siblings, 1 reply; 37+ messages in thread
From: Henning Rohlfs @ 2011-06-22 14:15 UTC (permalink / raw)
  To: linux-btrfs

 On Tue, 21 Jun 2011 11:24:11 -0400, Calvin Walton wrote:
> On Mon, 2011-06-20 at 23:51 +0200, Henning Rohlfs wrote:
>> Hello,
>>
>>  I've migrated my system to btrfs (raid1) a few months ago. Since
>> then
>>  the performance has been pretty bad, but recently it's gotten
>>  unbearable: a simple sync called while the system is idle can take
>> 20 up
>>  to 60 seconds. Creating or deleting files often has several seconds
>>  latency, too.
>
> I think I've been seeing a fairly similar, or possibly the same?
> issue
> as well. It looks like it's actually a regression introduced in
> 2.6.39 -
> if I switch back to a 2.6.38 kernel, my latency issues magically go
> away! (I'm curious: does using the older 2.6.38.x kernel help with
> anyone else that's seeing the issue?)
>
> Some hardware/configuration details:
> btrfs on a single disc (Seagate Momentus XT hybrid), lzo compression
> and
> space cache enabled. Some snapshots in use.
>
> I notice that in latencytop I'm seeing a lot of lines with (cropped)
> traces like
>
> sleep_on_page wait_on_page_bit read_extent_buffer_ 13.3 msec          0.5 %
>
> showing up that I didn't see with the 2.6.38 kernel. I occasionally
> see
> latencies as bad as 20-30 seconds on operations like fsync or
> synchronous writes.
>
> I think I can reproduce the issue well enough to bisect it, so I
> might
> give that a try. It'll be slow going, though.

 You are right. This seems to be a regression in the .39 kernel. I
 tested with 2.6.38.2 just now and the performance is back to normal.

 Thanks,
 Henning




 server ~ # uname -a
 Linux server 2.6.38.2 #1 SMP Thu Apr 14 13:05:35 CEST 2011 x86_64 AMD
 Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux

 server ~ # sync; time sync
 real	0m0.144s
 user	0m0.000s
 sys	0m0.020s

 server ~ # bonnie++ -d tmp -u 0:0
 Version  1.96       ------Sequential Output------ --Sequential Input- 
 --Random-
 Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- 
 --Seeks--
 Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  
 /sec %CP
 server          16G   147  97 279933  56 73245  34  1258  78 102379  23 
 177.3  50
 Latency               423ms     103ms     645ms     163ms     404ms    
 264ms
 Version  1.96       ------Sequential Create------ --------Random 
 Create--------
 server              -Create-- --Read--- -Delete-- -Create-- --Read--- 
 -Delete--
               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  
 /sec %CP
                  16  3784  28 +++++ +++  8519  60 13694  59 +++++ +++ 
 11710  76
 Latency               127ms    1024us   18718us   15958us     119us    
 2459us
 1.96,1.96,server,1,1308745595,16G,,147,97,279933,56,73245,34,1258,78,102379,23,177.3,50,16,,,,,3784,28,+++++,+++,8519,60,13694,59,+++++,+++,11710,76,423ms,103ms,645ms,163ms,404ms,264ms,127ms,1024us,18718us,15958us,119us,2459us


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-22 14:15   ` Henning Rohlfs
@ 2011-06-22 15:39     ` Josef Bacik
  2011-06-22 15:57       ` Calvin Walton
  0 siblings, 1 reply; 37+ messages in thread
From: Josef Bacik @ 2011-06-22 15:39 UTC (permalink / raw)
  To: Henning Rohlfs; +Cc: linux-btrfs

On 06/22/2011 10:15 AM, Henning Rohlfs wrote:
> On Tue, 21 Jun 2011 11:24:11 -0400, Calvin Walton wrote:
>> On Mon, 2011-06-20 at 23:51 +0200, Henning Rohlfs wrote:
>>> Hello,
>>>
>>>  I've migrated my system to btrfs (raid1) a few months ago. Since then
>>>  the performance has been pretty bad, but recently it's gotten
>>>  unbearable: a simple sync called while the system is idle can take
>>> 20 up
>>>  to 60 seconds. Creating or deleting files often has several seconds
>>>  latency, too.
>>
>> I think I've been seeing a fairly similar, or possibly the same? issue
>> as well. It looks like it's actually a regression introduced in 2.6.39 -
>> if I switch back to a 2.6.38 kernel, my latency issues magically go
>> away! (I'm curious: does using the older 2.6.38.x kernel help with
>> anyone else that's seeing the issue?)
>>
>> Some hardware/configuration details:
>> btrfs on a single disc (Seagate Momentus XT hybrid), lzo compression and
>> space cache enabled. Some snapshots in use.
>>
>> I notice that in latencytop I'm seeing a lot of lines with (cropped)
>> traces like
>>
>> sleep_on_page wait_on_page_bit read_extent_buffer_ 13.3 msec          0.5 %
>>
>> showing up that I didn't see with the 2.6.38 kernel. I occasionally see
>> latencies as bad as 20-30 seconds on operations like fsync or
>> synchronous writes.
>>
>> I think I can reproduce the issue well enough to bisect it, so I might
>> give that a try. It'll be slow going, though.
> 
> You are right. This seems to be a regression in the .39 kernel. I tested
> with 2.6.38.2 just now and the performance is back to normal.

Would you mind bisecting?  You can make it go faster by doing

git bisect start fs/

that way only changes in fs are used.  Thanks,

Josef
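
[A sketch of the suggested bisect, not from the original mail; the tag
names assume the mainline kernel tree:

  $ git bisect start fs/       # only commits touching fs/, as suggested above
  $ git bisect bad v2.6.39
  $ git bisect good v2.6.38
  ... build and boot each commit git picks, repeat the sync test,
      then mark it with "git bisect good" or "git bisect bad" ...

When the offending commit is found, git bisect reset returns to the
original branch.]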

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-22 15:39     ` Josef Bacik
@ 2011-06-22 15:57       ` Calvin Walton
  2011-06-22 15:58         ` Josef Bacik
  0 siblings, 1 reply; 37+ messages in thread
From: Calvin Walton @ 2011-06-22 15:57 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Henning Rohlfs, linux-btrfs

On Wed, 2011-06-22 at 11:39 -0400, Josef Bacik wrote:
> On 06/22/2011 10:15 AM, Henning Rohlfs wrote:
> > On Tue, 21 Jun 2011 11:24:11 -0400, Calvin Walton wrote:
> >> On Mon, 2011-06-20 at 23:51 +0200, Henning Rohlfs wrote:
> >>> Hello,
> >>>
> >>>  I've migrated my system to btrfs (raid1) a few months ago. Since then
> >>>  the performance has been pretty bad, but recently it's gotten
> >>>  unbearable: a simple sync called while the system is idle can take
> >>> 20 up
> >>>  to 60 seconds. Creating or deleting files often has several seconds
> >>>  latency, too.
> >>
> >> I think I've been seeing a fairly similar, or possibly the same? issue
> >> as well. It looks like it's actually a regression introduced in 2.6.39 -
> >> if I switch back to a 2.6.38 kernel, my latency issues magically go
> >> away! (I'm curious: does using the older 2.6.38.x kernel help with
> >> anyone else that's seeing the issue?)

> >> I think I can reproduce the issue well enough to bisect it, so I might
> >> give that a try. It'll be slow going, though.
> > 
> > You are right. This seems to be a regression in the .39 kernel. I tested
> > with 2.6.38.2 just now and the performance is back to normal.
> 
> Would you mind bisecting?

Just before I was going to try bisecting, I tried the 3.0-rc4 kernel out
of curiosity. And it seems to be quite a bit better; at the very least,
I'm not seeing gui applications stalling for ~10 seconds when doing
things like opening or writing files. Latencytop is reporting fsync()
latencies staying pretty steady in the range of under 300ms, with
occasional outliers at up to 2s, and it's not getting worse with time.

I'll still look into doing a bisect between 2.6.38 and 2.6.39, I'm
curious what went wrong.

-- 
Calvin Walton <calvin.walton@kepstin.ca>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Abysmal Performance
  2011-06-22 15:57       ` Calvin Walton
@ 2011-06-22 15:58         ` Josef Bacik
  0 siblings, 0 replies; 37+ messages in thread
From: Josef Bacik @ 2011-06-22 15:58 UTC (permalink / raw)
  To: Calvin Walton; +Cc: Henning Rohlfs, linux-btrfs

On 06/22/2011 11:57 AM, Calvin Walton wrote:
> On Wed, 2011-06-22 at 11:39 -0400, Josef Bacik wrote:
>> On 06/22/2011 10:15 AM, Henning Rohlfs wrote:
>>> On Tue, 21 Jun 2011 11:24:11 -0400, Calvin Walton wrote:
>>>> On Mon, 2011-06-20 at 23:51 +0200, Henning Rohlfs wrote:
>>>>> Hello,
>>>>>
>>>>>  I've migrated my system to btrfs (raid1) a few months ago. Since then
>>>>>  the performance has been pretty bad, but recently it's gotten
>>>>>  unbearable: a simple sync called while the system is idle can take
>>>>> 20 up
>>>>>  to 60 seconds. Creating or deleting files often has several seconds
>>>>>  latency, too.
>>>>
>>>> I think I've been seeing a fairly similar, or possibly the same? issue
>>>> as well. It looks like it's actually a regression introduced in 2.6.39 -
>>>> if I switch back to a 2.6.38 kernel, my latency issues magically go
>>>> away! (I'm curious: does using the older 2.6.38.x kernel help with
>>>> anyone else that's seeing the issue?)
> 
>>>> I think I can reproduce the issue well enough to bisect it, so I might
>>>> give that a try. It'll be slow going, though.
>>>
>>> You are right. This seems to be a regression in the .39 kernel. I tested
>>> with 2.6.38.2 just now and the performance is back to normal.
>>
>> Would you mind bisecting?
> 
> Just before I was going to try bisecting, I tried the 3.0-rc4 kernel out
> of curiosity. And it seems to be quite a bit better; at the very least,
> I'm not seeing gui applications stalling for ~10 seconds when doing
> things like opening or writing files. Latencytop is reporting fsync()
> latencies staying pretty steady in the range of under 300ms, with
> occasional outliers at up to 2s, and it's not getting worse with time.
> 
> I'll still look into doing a bisect between 2.6.38 and 2.6.39, I'm
> curious what went wrong.
> 

Yeah that makes two of us :).  There were some other plugging changes
that went into 38, so maybe bisect all of the kernel, not just fs/
just in case it was those and not us.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 15:42                     ` Mitch Harder
@ 2011-05-03 16:51                       ` Chris Mason
  0 siblings, 0 replies; 37+ messages in thread
From: Chris Mason @ 2011-05-03 16:51 UTC (permalink / raw)
  To: Mitch Harder; +Cc: Daniel J Blueman, Bernhard Schmidt, linux-btrfs

Excerpts from Mitch Harder's message of 2011-05-03 11:42:56 -0400:
> On Tue, May 3, 2011 at 9:41 AM, Daniel J Blueman
> <daniel.blueman@gmail.com> wrote:
> >
> > It does seem the case generally; on 2.6.39-rc5, writing to a fresh
> > filesystem using rsync with BTRFS compression enabled, 128KB extents
> > seem very common [1] (filefrag inconsistency noted).
> >
> > Defragmenting with compression gives a nice linear extent [2]. It
> > looks like it'll be a good win to prevent extents being split at
> > writeout for the read case on rotational media.
> >
> 
> Yes, 128KB extents are hardcoded in Btrfs right now.
> 
> There are two reasons cited in the comments for this:
> 
> (1)  Ease the RAM required when spreading compression across several CPUs.
> (2)  Make sure the amount of IO required to do a random read is
> reasonably small.
> 
> For about 4 months, I've been playing locally with 2 patches to
> increase the extent size to 512KB.
> 
> I haven't noticed any issues running with these patches.  However, I
> only have a Core2duo with 2 CPUs, so I'm probably not running into
> issues that someone with more CPUs might encounter.
> 
> I'll submit these patches to the list as an RFC so more people can at
> least see where this is done.  But with my limited hardware, I can't
> assert this change is the best for everyone.

The problem is just that any random read into the file will require
reading the full 512KB in order to decompress the extent.  And you need
to make sure you have enough room in ram to represent the decompressed
bytes in order to find the pages you care about.

The alternative is to keep the smaller compressed extent size and make
the allocator work harder to find contiguous 128KB extents to store all
of the file bytes.  This will work out much better ;)
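
[A rough illustration of the trade-off described above, not from the
original mail: a single 4KB random read from a compressed extent first
has to read and decompress the whole extent, so with the current 128KB
limit that is at most 128KB of I/O and decompressed data in RAM per
read, while a 512KB limit would multiply both by four for that one read.
That is the cost the smaller extent size plus a smarter allocator
avoids.]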

-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 14:41                   ` Daniel J Blueman
@ 2011-05-03 15:42                     ` Mitch Harder
  2011-05-03 16:51                       ` Chris Mason
  0 siblings, 1 reply; 37+ messages in thread
From: Mitch Harder @ 2011-05-03 15:42 UTC (permalink / raw)
  To: Daniel J Blueman; +Cc: Chris Mason, Bernhard Schmidt, linux-btrfs

On Tue, May 3, 2011 at 9:41 AM, Daniel J Blueman
<daniel.blueman@gmail.com> wrote:
>
> It does seem the case generally; on 2.6.39-rc5, writing to a fresh
> filesystem using rsync with BTRFS compression enabled, 128KB extents
> seem very common [1] (filefrag inconsistency noted).
>
> Defragmenting with compression gives a nice linear extent [2]. It
> looks like it'll be a good win to prevent extents being split at
> writeout for the read case on rotational media.
>

Yes, 128KB extents are hardcoded in Btrfs right now.

There are two reasons cited in the comments for this:

(1)  Ease the RAM required when spreading compression across several CPUs.
(2)  Make sure the amount of IO required to do a random read is
reasonably small.

For about 4 months, I've been playing locally with 2 patches to
increase the extent size to 512KB.

I haven't noticed any issues running with these patches.  However, I
only have a Core2duo with 2 CPUs, so I'm probably not running into
issues that someone with more CPUs might encounter.

I'll submit these patches to the list as an RFC so more people can at
least see where this is done.  But with my limited hardware, I can't
assert this change is the best for everyone.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 14:54             ` Daniel J Blueman
@ 2011-05-03 15:10               ` Bernhard Schmidt
  0 siblings, 0 replies; 37+ messages in thread
From: Bernhard Schmidt @ 2011-05-03 15:10 UTC (permalink / raw)
  To: Daniel J Blueman; +Cc: Chris Mason, linux-btrfs

On 03.05.2011 16:54, Daniel J Blueman wrote:

Hi,

>>> The way the defrag ioctl works is that it schedules things for defrag
>>> but doesn't force out the IO immediately unless you use -f.
>>>
>>> So, to test the result of the defrag, you need to either wait a bit or
>>> run sync.
>>
>> Did so, no change. See my reply to cwillu for the data.
> 
> Can you try with the compression option enabled? Eg:
> 
> # filefrag foo.dat
> foo.dat: 11 extents found
> 
> # find . -xdev -type f -print0 | xargs -0 btrfs filesystem defragment -c
> 
> # filefrag foo.dat
> foo.dat: 1 extent found
> 
> Seems to work fine on 2.6.39-rc5; I mounted with '-o
> compress,clear_cache' though.

Maybe I was expecting too much. I tried it on a file with 72 extents and
was expecting the count to go down to 1 (or very few). This
does not seem to happen with this particular file. I just tested another
file (with 193 extents) and it was reduced to 5 (defragged with -f, but
without -c). Still mounted with compress=lzo.

However, the 72-extent file is not getting any better, no matter which
flags I tried. No big problem at the moment; I've found an older (Ubuntu
Maverick) based system with a rotating disk that had something like 50000
extents for a single file. I expect defragging that will increase
performance quite nicely :-)

Bernhard

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 11:30           ` Bernhard Schmidt
  2011-05-03 11:36             ` Chris Mason
@ 2011-05-03 14:54             ` Daniel J Blueman
  2011-05-03 15:10               ` Bernhard Schmidt
  1 sibling, 1 reply; 37+ messages in thread
From: Daniel J Blueman @ 2011-05-03 14:54 UTC (permalink / raw)
  To: Bernhard Schmidt; +Cc: Chris Mason, linux-btrfs

On 3 May 2011 19:30, Bernhard Schmidt <berni@birkenwald.de> wrote:
[]
>> The way the defrag ioctl works is that it schedules things for defrag
>> but doesn't force out the IO immediately unless you use -f.
>>
>> So, to test the result of the defrag, you need to either wait a bit or
>> run sync.
>
> Did so, no change. See my reply to cwillu for the data.

Can you try with the compression option enabled? Eg:

# filefrag foo.dat
foo.dat: 11 extents found

# find . -xdev -type f -print0 | xargs -0 btrfs filesystem defragment -c

# filefrag foo.dat
foo.dat: 1 extent found

Seems to work fine on 2.6.39-rc5; I mounted with '-o
compress,clear_cache' though.

Daniel
-- 
Daniel J Blueman

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 12:52                 ` Chris Mason
  2011-05-03 13:03                   ` Bernhard Schmidt
@ 2011-05-03 14:41                   ` Daniel J Blueman
  2011-05-03 15:42                     ` Mitch Harder
  1 sibling, 1 reply; 37+ messages in thread
From: Daniel J Blueman @ 2011-05-03 14:41 UTC (permalink / raw)
  To: Chris Mason; +Cc: Bernhard Schmidt, linux-btrfs

On 3 May 2011 20:52, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from Bernhard Schmidt's message of 2011-05-03 07:43:04 -0400:
>> Hi,
>>
>> > Using compression is not a problem, but in order to reduce the maximum
>> > amount of ram we need to uncompress an extent, we enforce a max size on
>> > the extent.  So you'll tend to have more extents, but they should be
>> > close together on disk.
>> >
>> > Could you please do a filefrag -v on the file?  Lets see how bad it
>> > really is.
>>
>>
>> root@schleppi:~# filefrag -v
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
>> Filesystem type is: 9123683e
>> File size of /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1 is
>> 9361528 (2286 blocks, blocksize 4096)
>>  ext logical physical expected length flags
>>    0       0  4542111              32
>>    1      32  4542134  4542142     32
>>    2      64  4573263  4542165     32
>
> Ok, looks like we could be doing a little better job when compression is
> on to build out a bigger extent.  This shouldn't be causing trouble on
> an ssd at all but on your rotating disk it'll be slightly slower.

It does seem the case generally; on 2.6.39-rc5, writing to a fresh
filesystem using rsync with BTRFS compression enabled, 128KB extents
seem very common [1] (filefrag inconsistency noted).

Defragmenting with compression gives a nice linear extent [2]. It
looks like it'll be a good win to prevent extents being split at
writeout for the read case on rotational media.

Daniel

--- [1]

# filefrag sdhci.ppm
sdhci.ppm: 173 extents found

# filefrag -v sdhci.ppm
Filesystem type is: 9123683e
File size of sdhci.ppm is 36838232 (8994 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  7410681              32
   1      32  7564365  7410712     32
   2      64  7875283  7564396     32
   3      96  8197111  7875314     32
   4     128  8459255  8197142     32
   5     160 10072355  8459286     32
   6     192 10556407 10072386     32
   7     224 11088956 10556438     32
   8     256  9435820 11088987     32
   9     288  9769977  9435851     32
  10     320 10032121  9770008     32
  11     352 11342839 10032152     32
  12     384 12129270 11342870     32
  13     416 12653556 12129301     32
  14     448 12915698 12653587     32
  15     480 12915705 12915729     32
  16     512 13439988 12915736     32
  17     544 13702133 13440019     32
  18     576 14163816 13702164     32
  19     608 14163823 14163847     32
  20     640 14163830 14163854     32
  21     672 14163837 14163861     32
  22     704 14163843 14163868     32
  23     736 14163849 14163874     32
  24     768 14163859 14163880     32
  25     800 14163869 14163890     32
  26     832 14163879 14163900     32
  27     864 14163889 14163910     32
  28     896 14163899 14163920     32
  29     928 14163909 14163930     32
  30     960 14163919 14163940     32
  31     992 14163929 14163950     32
  32    1024 14163939 14163960     32
  33    1056 14163949 14163970     32
  34    1088 14163958 14163980     32
  35    1120 14163967 14163989     32
  36    1152 14163976 14163998     32
  37    1184 14163985 14164007     32
  38    1216 14163994 14164016     32
  39    1248 14164003 14164025     32
  40    1280 14164012 14164034     32
  41    1312 14164021 14164043     32
  42    1344 14164031 14164052     32
  43    1376 14164041 14164062     32
  44    1408 14164050 14164072     32
  45    1440 14164060 14164081     32
  46    1472 14164070 14164091     32
  47    1504 14164080 14164101     32
  48    1536 14164090 14164111     32
  49    1568 14164100 14164121     32
  50    1600 14164109 14164131     32
  51    1632 14164119 14164140     32
  52    1664 14164129 14164150     32
  53    1696 14164139 14164160     32
  54    1728 14164148 14164170     32
  55    1760 14164157 14164179     32
  56    1792 14164166 14164188     32
  57    1824 14164176 14164197     32
  58    1856 14164186 14164207     32
  59    1888 14164196 14164217     32
  60    1920 14164205 14164227     32
  61    1952 14164214 14164236     32
  62    1984 14164223 14164245     32
  63    2016 14164233 14164254     32
  64    2048 14164243 14164264     32
  65    2080 14164252 14164274     32
  66    2112 14164262 14164283     32
  67    2144 14164272 14164293     32
  68    2176 14164282 14164303     32
  69    2208 14164292 14164313     32
  70    2240 14164302 14164323     32
  71    2272 14164311 14164333     32
  72    2304 14164321 14164342     32
  73    2336 14164331 14164352     32
  74    2368 14164340 14164362     32
  75    2400 14164350 14164371     32
  76    2432 14164360 14164381     32
  77    2464 14164369 14164391     32
  78    2496 14164379 14164400     32
  79    2528 14164389 14164410     32
  80    2560 14164398 14164420     32
  81    2592 14164407 14164429     32
  82    2624 14164416 14164438     32
  83    2656 14164425 14164447     32
  84    2688 14164434 14164456     32
  85    2720 14164443 14164465     32
  86    2752 14164452 14164474     32
  87    2784 14164462 14164483     32
  88    2816 14164472 14164493     32
  89    2848 14164481 14164503     32
  90    2880 14164490 14164512     32
  91    2912 14164499 14164521     32
  92    2944 14164508 14164530     32
  93    2976 14164517 14164539     32
  94    3008 14164526 14164548     32
  95    3040 14164535 14164557     32
  96    3072 14164545 14164566     32
  97    3104 14164554 14164576     32
  98    3136 14164563 14164585     32
  99    3168 14164573 14164594     32
 100    3200 14164582 14164604     32
 101    3232 14164591 14164613     32
 102    3264 14164600 14164622     32
 103    3296 14164609 14164631     32
 104    3328 14164618 14164640     32
 105    3360 14164628 14164649     32
 106    3392 14164637 14164659     32
 107    3424 14164646 14164668     32
 108    3456 14164656 14164677     32
 109    3488 14164665 14164687     32
 110    3520 14164674 14164696     32
 111    3552 14164684 14164705     32
 112    3584 14164693 14164715     32
 113    3616 14164702 14164724     32
 114    3648 14164711 14164733     32
 115    3680 14164720 14164742     32
 116    3712 14164729 14164751     32
 117    3744 14164739 14164760     32
 118    3776 14164748 14164770     32
 119    3808 14164757 14164779     32
 120    3840 14164766 14164788     32
 121    3872 14164775 14164797     32
 122    3904 14164784 14164806     32
 123    3936 14164793 14164815     32
 124    3968 14164802 14164824     32
 125    4000 14164811 14164833     32
 126    4032 14164820 14164842     32
 127    4064 14164829 14164851     32
 128    4096 14164838 14164860     32
 129    4128 14164848 14164869     32
 130    4160 14164857 14164879     32
 131    4192 14164866 14164888     32
 132    4224 14164876 14164897     32
 133    4256 14164885 14164907     32
 134    4288 14164895 14164916     32
 135    4320 14164905 14164926     32
 136    4352 14164915 14164936     32
 137    4384 14164925 14164946     32
 138    4416 14164935 14164956     32
 139    4448 14164945 14164966     32
 140    4480 14164955 14164976     32
 141    4512 14164965 14164986     32
 142    4544 14164975 14164996     32
 143    4576 14164986 14165006     32
 144    4608 14164997 14165017     32
 145    4640 14165007 14165028     32
 146    4672 14165018 14165038     32
 147    4704 14165029 14165049     32
 148    4736 14165039 14165060     32
 149    4768 14165049 14165070    224
 150    4992 14165273              32
 151    5024 14165283 14165304     32
 152    5056 14165294 14165314     32
 153    5088 14165305 14165325     32
 154    5120 14165315 14165336     32
 155    5152 14165326 14165346     32
 156    5184 14165337 14165357     32
 157    5216 14165347 14165368     32
 158    5248 14165358 14165378     32
 159    5280 14165369 14165389    224
 160    5504 14165593              32
 161    5536 14165603 14165624     32
 162    5568 14165613 14165634     32
 163    5600 14165623 14165644     32
 164    5632 14165633 14165654     32
 165    5664 14165643 14165664     32
 166    5696 14165653 14165674     32
 167    5728 14165663 14165684    288
 168    6016 14165951              32
 169    6048 14165960 14165982     32
 170    6080 14165969 14165991     32
 171    6112 14165978 14166000     32
 172    6144 14165988 14166009   2850 eof
sdhci.ppm: 170 extents found

--- [2]

# btrfs filesystem defragment -c sdhci.ppm

# filefrag -v sdhci.ppm
Filesystem type is: 9123683e
File size of sdhci.ppm is 36838232 (8994 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0 233041476            8994 eof
sdhci.ppm: 1 extent found
--
Daniel J Blueman
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 13:03                   ` Bernhard Schmidt
@ 2011-05-03 13:41                     ` Mitch Harder
  0 siblings, 0 replies; 37+ messages in thread
From: Mitch Harder @ 2011-05-03 13:41 UTC (permalink / raw)
  To: Bernhard Schmidt; +Cc: Chris Mason, linux-btrfs

On Tue, May 3, 2011 at 8:03 AM, Bernhard Schmidt <berni@birkenwald.de> wrote:
> Hi,
>
>> Ok, looks like we could be doing a little better job when compression is
>> on to build out a bigger extent.  This shouldn't be causing trouble on
>> an ssd at all but on your rotating disk it'll be slightly slower.
>>
>> Still most of these extents are somewhat close together, this is roughly
>> what mount -o ssd (which is enabled automatically when we detect a
>> non-rotating drive) would try for.
>>
>> The problematic files are going to have thousands of extents, this file
>> should be fine.
>
> Thanks, I'll check on my system with rotating disks at home when I get back.
>
>

I'd also be curious to see if mounting with "-o compress-force=lzo"
affects anything.

As I recall, the compress-force option was added because performance
could suffer when "-o compress" tries to decide on its own whether the
data is worth compressing.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 12:52                 ` Chris Mason
@ 2011-05-03 13:03                   ` Bernhard Schmidt
  2011-05-03 13:41                     ` Mitch Harder
  2011-05-03 14:41                   ` Daniel J Blueman
  1 sibling, 1 reply; 37+ messages in thread
From: Bernhard Schmidt @ 2011-05-03 13:03 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

Hi,

> Ok, looks like we could be doing a little better job when compression is
> on to build out a bigger extent.  This shouldn't be causing trouble on
> an ssd at all but on your rotating disk it'll be slightly slower.
> 
> Still most of these extents are somewhat close together, this is roughly
> what mount -o ssd (which is enabled automatically when we detect a
> non-rotating drive) would try for.
> 
> The problematic files are going to have thousands of extents, this file
> should be fine.

Thanks, I'll check on my system with rotating disks at home when I get back.

Bernhard

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 11:43               ` Bernhard Schmidt
@ 2011-05-03 12:52                 ` Chris Mason
  2011-05-03 13:03                   ` Bernhard Schmidt
  2011-05-03 14:41                   ` Daniel J Blueman
  0 siblings, 2 replies; 37+ messages in thread
From: Chris Mason @ 2011-05-03 12:52 UTC (permalink / raw)
  To: Bernhard Schmidt; +Cc: linux-btrfs

Excerpts from Bernhard Schmidt's message of 2011-05-03 07:43:04 -0400:
> Hi,
> 
> > Using compression is not a problem, but in order to reduce the maximum
> > amount of ram we need to uncompress an extent, we enforce a max size on
> > the extent.  So you'll tend to have more extents, but they should be
> > close together on disk.
> > 
> > Could you please do a filefrag -v on the file?  Lets see how bad it
> > really is.
> 
> 
> root@schleppi:~# filefrag -v
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> Filesystem type is: 9123683e
> File size of /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1 is
> 9361528 (2286 blocks, blocksize 4096)
>  ext logical physical expected length flags
>    0       0  4542111              32
>    1      32  4542134  4542142     32
>    2      64  4573263  4542165     32

Ok, looks like we could be doing a little better job when compression is
on to build out a bigger extent.  This shouldn't be causing trouble on
an ssd at all but on your rotating disk it'll be slightly slower.

Still most of these extents are somewhat close together, this is roughly
what mount -o ssd (which is enabled automatically when we detect a
non-rotating drive) would try for.

The problematic files are going to have thousands of extents, this file
should be fine.

-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 11:36             ` Chris Mason
@ 2011-05-03 11:43               ` Bernhard Schmidt
  2011-05-03 12:52                 ` Chris Mason
  0 siblings, 1 reply; 37+ messages in thread
From: Bernhard Schmidt @ 2011-05-03 11:43 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

Hi,

> Using compression is not a problem, but in order to reduce the maximum
> amount of ram we need to uncompress an extent, we enforce a max size on
> the extent.  So you'll tend to have more extents, but they should be
> close together on disk.
> 
> Could you please do a filefrag -v on the file?  Lets see how bad it
> really is.


root@schleppi:~# filefrag -v
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
Filesystem type is: 9123683e
File size of /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1 is
9361528 (2286 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  4542111              32
   1      32  4542134  4542142     32
   2      64  4573263  4542165     32
   3      96  4573285  4573294     32
   4     128  4579639  4573316     32
   5     160  4579664  4579670     32
   6     192  4581178  4579695     32
   7     224  4579811  4581209     32
   8     256  4579836  4579842     32
   9     288  4579861  4579867     32
  10     320  4579884  4579892     32
  11     352  4580698  4579915     32
  12     384  4580720  4580729     32
  13     416  4580746  4580751     32
  14     448  4580768  4580777     32
  15     480  4580793  4580799     32
  16     512  4580819  4580824     32
  17     544  4581238  4580850     32
  18     576  4600396  4581269     32
  19     608  4600422  4600427     32
  20     640  4600447  4600453     32
  21     672  4600472  4600478     32
  22     704  4600498  4600503     32
  23     736  4600523  4600529     32
  24     768  4601483  4600554     32
  25     800  4601509  4601514     32
  26     832  4601534  4601540     32
  27     864  4601558  4601565     32
  28     896  4601583  4601589     32
  29     928  4601608  4601614     32
  30     960  4618420  4601639     32
  31     992  4618443  4618451     32
  32    1024  4541221  4618474     32
  33    1056  4618463  4541252     32
  34    1088  4618485  4618494     32
  35    1120  4618505  4618516     32
  36    1152  4579536  4618536     32
  37    1184  4579688  4579567     32
  38    1216  4579740  4579719     32
  39    1248  4618526  4579771     32
  40    1280  4618544  4618557     32
  41    1312  4618563  4618575     32
  42    1344  4618583  4618594     32
  43    1376  4618605  4618614     32
  44    1408  4618626  4618636     32
  45    1440  4618652  4618657     32
  46    1472  4618677  4618683     32
  47    1504  4618703  4618708     32
  48    1536  4618728  4618734     32
  49    1568  4618754  4618759     32
  50    1600  4618774  4618785     32
  51    1632  4618782  4618805     32
  52    1664  4561195  4618813     32
  53    1696  4600548  4561226     32
  54    1728  4618793  4600579     32
  55    1760  4618807  4618824     32
  56    1792  4539912  4618838     32
  57    1824  4542619  4539943     32
  58    1856  4556887  4542650     32
  59    1888  4601632  4556918     32
  60    1920  4558150  4601663     32
  61    1952  4561224  4558181     32
  62    1984  4618816  4561255     32
  63    2016  4618835  4618847     32
  64    2048  4618861  4618866     32
  65    2080  4618881  4618892     32
  66    2112  4618901  4618912     32
  67    2144  4618917  4618932     32
  68    2176  4539241  4618948     32
  69    2208  4539915  4539272     32
  70    2240  4539985  4539946     32
  71    2272  4540096  4540016     14 eof
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found

Bernhard

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 11:30           ` Bernhard Schmidt
@ 2011-05-03 11:36             ` Chris Mason
  2011-05-03 11:43               ` Bernhard Schmidt
  2011-05-03 14:54             ` Daniel J Blueman
  1 sibling, 1 reply; 37+ messages in thread
From: Chris Mason @ 2011-05-03 11:36 UTC (permalink / raw)
  To: Bernhard Schmidt; +Cc: linux-btrfs

Excerpts from Bernhard Schmidt's message of 2011-05-03 07:30:36 -0400:
> Am 03.05.2011 13:08, schrieb Chris Mason:
> 
> >> defragging btrfs does not seem to work for me. I have run the filefrag
> >> command over the whole fs and (manually) tried to defrag a few heavily
> >> fragmented files, but I don't get it to work (it still has the same
> >> number of extents and they are horrendously uncorrelated)
> >>
> >> root@schleppi:~# filefrag
> >> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> >> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
> >> root@schleppi:~# btrfs filesystem defrag
> >> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> >> root@schleppi:~# filefrag
> >> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> >> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
> >>
> >> I'm using Ubuntu Natty (2.6.38.4) and tried both btrfs-tools from Natty
> >> (201006xx) and from Debian experimental (git from 20101101). Both show
> >> the same symptoms. I don't think fragmentation is bad on this box (due
> >> to having an SSD), but my system at home is getting dog slow and I'd
> >> like to try that when I come home end of the week.
> > 
> > Do you have compression on?
> 
> Yes. lzo to be exact.
> 
> > The way the defrag ioctl works is that it schedules things for defrag
> > but doesn't force out the IO immediately unless you use -f.
> > 
> > So, to test the result of the defrag, you need to either wait a bit or
> > run sync.
> 
> Did so, no change. See my reply to cwillu for the data.
> 
> I usually mount my / without any compression option and did "mount -o
> remount,compress=lzo /" before. I cannot reboot at the moment and I did
> not find any option to disable compression again. There does not seem to
> be a "nocompress" or "compress=[none|off]" option. Is this correct?

Using compression is not a problem, but in order to reduce the maximum
amount of ram we need to uncompress an extent, we enforce a max size on
the extent.  So you'll tend to have more extents, but they should be
close together on disk.

Could you please do a filefrag -v on the file?  Lets see how bad it
really is.

-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 11:08         ` Chris Mason
@ 2011-05-03 11:30           ` Bernhard Schmidt
  2011-05-03 11:36             ` Chris Mason
  2011-05-03 14:54             ` Daniel J Blueman
  0 siblings, 2 replies; 37+ messages in thread
From: Bernhard Schmidt @ 2011-05-03 11:30 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

Am 03.05.2011 13:08, schrieb Chris Mason:

>> defragging btrfs does not seem to work for me. I have run the filefrag
>> command over the whole fs and (manually) tried to defrag a few heavily
>> fragmented files, but I don't get it to work (it still has the same
>> number of extents and they are horrendously uncorrelated)
>>
>> root@schleppi:~# filefrag
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
>> root@schleppi:~# btrfs filesystem defrag
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
>> root@schleppi:~# filefrag
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
>>
>> I'm using Ubuntu Natty (2.6.38.4) and tried both btrfs-tools from Natty
>> (201006xx) and from Debian experimental (git from 20101101). Both show
>> the same symptoms. I don't think fragmentation is bad on this box (due
>> to having an SSD), but my system at home is getting dog slow and I'd
>> like to try that when I come home end of the week.
> 
> Do you have compression on?

Yes. lzo to be exact.

> The way the defrag ioctl works is that it schedules things for defrag
> but doesn't force out the IO immediately unless you use -f.
> 
> So, to test the result of the defrag, you need to either wait a bit or
> run sync.

Did so, no change. See my reply to cwillu for the data.

I usually mount my / without any compression option and did "mount -o
remount,compress=lzo /" before. I cannot reboot at the moment and I did
not find any option to disable compression again. There does not seem to
be a "nocompress" or "compress=[none|off]" option. Is this correct?

Bernhard

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 11:00         ` cwillu
@ 2011-05-03 11:26           ` Bernhard Schmidt
  0 siblings, 0 replies; 37+ messages in thread
From: Bernhard Schmidt @ 2011-05-03 11:26 UTC (permalink / raw)
  To: cwillu; +Cc: linux-btrfs

Am 03.05.2011 13:00, schrieb cwillu:

Hi,

>> defragging btrfs does not seem to work for me. I have run the filefrag
>> command over the whole fs and (manually) tried to defrag a few heavily
>> fragmented files, but I don't get it to work (it still has the same
>> number of extents and they are horrendously uncorrelated)
>>
>> root@schleppi:~# filefrag
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
>> root@schleppi:~# btrfs filesystem defrag
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
>> root@schleppi:~# filefrag
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
>> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
>>
>> I'm using Ubuntu Natty (2.6.38.4) and tried both btrfs-tools from Natty
>> (201006xx) and from Debian experimental (git from 20101101). Both show
>> the same symptoms. I don't think fragmentation is bad on this box (due
>> to having an SSD), but my system at home is getting dog slow and I'd
>> like to try that when I come home end of the week.
> 
> You're not using compression on that filesystem are you?  If so, be
> aware that the number of extents isn't going to change after
> defragmentation, although you should find that the _locations_ of
> those extents are contiguous.

Actually I do run compression, but the locations are not contiguous
afterwards (even after btrfs filesystem sync / or using -f):

root@schleppi:~# btrfs filesystem defrag -f
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
root@schleppi:~# btrfs filesystem sync /
FSSync '/'
root@schleppi:~# filefrag -v
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
Filesystem type is: 9123683e
File size of /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1 is
9361528 (2286 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  4542111              32
   1      32  4542134  4542142     32
   2      64  4573263  4542165     32
   3      96  4573285  4573294     32
   4     128  4579639  4573316     32
   5     160  4579664  4579670     32
   6     192  4581178  4579695     32
   7     224  4579811  4581209     32
   8     256  4579836  4579842     32
   9     288  4579861  4579867     32
  10     320  4579884  4579892     32
  11     352  4580698  4579915     32
  12     384  4580720  4580729     32
  13     416  4580746  4580751     32
  14     448  4580768  4580777     32
  15     480  4580793  4580799     32
  16     512  4580819  4580824     32
  17     544  4581238  4580850     32
  18     576  4600396  4581269     32
  19     608  4600422  4600427     32
  20     640  4600447  4600453     32
  21     672  4600472  4600478     32
  22     704  4600498  4600503     32
  23     736  4600523  4600529     32
  24     768  4601483  4600554     32
  25     800  4601509  4601514     32
  26     832  4601534  4601540     32
  27     864  4601558  4601565     32
  28     896  4601583  4601589     32
  29     928  4601608  4601614     32
  30     960  4618420  4601639     32
  31     992  4618443  4618451     32
  32    1024  4541221  4618474     32
  33    1056  4618463  4541252     32
  34    1088  4618485  4618494     32
  35    1120  4618505  4618516     32
  36    1152  4579536  4618536     32
  37    1184  4579688  4579567     32
  38    1216  4579740  4579719     32
  39    1248  4618526  4579771     32
  40    1280  4618544  4618557     32
  41    1312  4618563  4618575     32
  42    1344  4618583  4618594     32
  43    1376  4618605  4618614     32
  44    1408  4618626  4618636     32
  45    1440  4618652  4618657     32
  46    1472  4618677  4618683     32
  47    1504  4618703  4618708     32
  48    1536  4618728  4618734     32
  49    1568  4618754  4618759     32
  50    1600  4618774  4618785     32
  51    1632  4618782  4618805     32
  52    1664  4561195  4618813     32
  53    1696  4600548  4561226     32
  54    1728  4618793  4600579     32
  55    1760  4618807  4618824     32
  56    1792  4539912  4618838     32
  57    1824  4542619  4539943     32
  58    1856  4556887  4542650     32
  59    1888  4601632  4556918     32
  60    1920  4558150  4601663     32
  61    1952  4561224  4558181     32
  62    1984  4618816  4561255     32
  63    2016  4618835  4618847     32
  64    2048  4618861  4618866     32
  65    2080  4618881  4618892     32
  66    2112  4618901  4618912     32
  67    2144  4618917  4618932     32
  68    2176  4539241  4618948     32
  69    2208  4539915  4539272     32
  70    2240  4539985  4539946     32
  71    2272  4540096  4540016     14 eof
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found

Best Regards,
Bernhard


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 10:33       ` Bernhard Schmidt
  2011-05-03 11:00         ` cwillu
@ 2011-05-03 11:08         ` Chris Mason
  2011-05-03 11:30           ` Bernhard Schmidt
  1 sibling, 1 reply; 37+ messages in thread
From: Chris Mason @ 2011-05-03 11:08 UTC (permalink / raw)
  To: Bernhard Schmidt; +Cc: linux-btrfs

Excerpts from Bernhard Schmidt's message of 2011-05-03 06:33:25 -0400:
> Peter Stuge <peter@stuge.se> wrote:
> 
> Hey,
> 
> defragging btrfs does not seem to work for me. I have run the filefrag
> command over the whole fs and (manually) tried to defrag a few heavily
> fragmented files, but I don't get it to work (it still has the same
> number of extents and they are horrendously uncorrelated)
> 
> root@schleppi:~# filefrag
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
> root@schleppi:~# btrfs filesystem defrag
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> root@schleppi:~# filefrag
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
> 
> I'm using Ubuntu Natty (2.6.38.4) and tried both btrfs-tools from Natty
> (201006xx) and from Debian experimental (git from 20101101). Both show
> the same symptoms. I don't think fragmentation is bad on this box (due
> to having an SSD), but my system at home is getting dog slow and I'd
> like to try that when I come home end of the week.

Do you have compression on?

The way the defrag ioctl works is that it schedules things for defrag
but doesn't force out the IO immediately unless you use -f.

So, to test the result of the defrag, you need to either wait a bit or
run sync.
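
As a rough illustration (the file path is hypothetical):

  btrfs filesystem defragment -f /path/to/file   # -f pushes the IO out immediately
  sync                                           # otherwise wait for normal writeback
  filefrag -v /path/to/file                      # then re-check the extent layout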

-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-30 22:33         ` John Wyzer
  2011-05-03 11:05           ` Chris Mason
@ 2011-05-03 11:06           ` Chris Mason
  1 sibling, 0 replies; 37+ messages in thread
From: Chris Mason @ 2011-05-03 11:06 UTC (permalink / raw)
  To: John Wyzer; +Cc: linux-btrfs

Excerpts from John Wyzer's message of 2011-04-30 18:33:20 -0400:
> Excerpts from Mitch Harder's message of Sun May 01 00:16:53 +0200 2011:
> > > Hmm.
> > > Tried it and it gives me about 500000 lines of
> > >
> > > FIBMAP: Invalid argument
> > >
> > > and then:
> > >
> > > large_file: 1 extent found
> > >
> > > Is that the way it is supposed to work?
> > > Just asking because this was part of a vmware disk image. Both the virtual
> > > machine and the rest of the host system are almost unusable once the VM is
> > > started (even more unusable than without vmware :-D )
> > 
> > No.  It sounds like the filefrag command is getting confused in the
> > virtual environment.
> 
> Misunderstanding :-) I tried filefrag _on_ a vmware disk image, not  inside a
> virtual machine.
> The whole btrfs story here is on a real machine.

Older filefrag uses fibmap, which we don't support (we use fiemap instead).
If you update your e2fsprogs you should get a newer filefrag.
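
A quick way to check what you have (the package query is for Debian-style
systems, adjust for your distro, and the file path is a placeholder):

  dpkg -s e2fsprogs | grep -i '^version'
  filefrag -v /path/to/large_file

A fiemap-capable filefrag prints the per-extent table instead of a stream of
"FIBMAP: Invalid argument" errors.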

-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-30 22:33         ` John Wyzer
@ 2011-05-03 11:05           ` Chris Mason
  2011-05-03 11:06           ` Chris Mason
  1 sibling, 0 replies; 37+ messages in thread
From: Chris Mason @ 2011-05-03 11:05 UTC (permalink / raw)
  To: John Wyzer; +Cc: linux-btrfs

Excerpts from John Wyzer's message of 2011-04-30 18:33:20 -0400:
> Excerpts from Mitch Harder's message of Sun May 01 00:16:53 +0200 2011:
> > > Hmm.
> > > Tried it and it gives me about 500000 lines of
> > >
> > > FIBMAP: Invalid argument
> > >
> > > and then:
> > >
> > > large_file: 1 extent found
> > >
> > > Is that the way it is supposed to work?
> > > Just asking because this was part of a vmware disk image. Both the virtual
> > > machine and the rest of the host system are almost unusable once the VM is
> > > started (even more unusable than without vmware :-D )
> > 
> > No.  It sounds like the filefrag command is getting confused in the
> > virtual environment.
> 
> Misunderstanding :-) I tried filefrag _on_ a vmware disk image, not  inside a
> virtual machine.
> The whole btrfs story here is on a real machine.

The most important files to defrag are going to be your internal firefox
files (I think in .mozilla), your sup database (in .sup) and your vmware
images.  I would definitely suggest that you try to narrow down if
vmware is making the machine seem much slower after the defrag is done.

It might make sense to run the nocow ioctl on your vmware images; they
are probably triggering lots of seeks.

You'll notice the machine is much slower after a reboot; this is because
we have to do a lot of IO to recache the extent allocation tree.  If you
pull from the btrfs-unstable tree, you can use mount -o space_cache,
which cuts down on the reading after a reboot dramatically.
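
A rough sketch of both suggestions (paths are placeholders; Chris refers to
the nocow ioctl, and chattr +C is used here only as a stand-in that affects
files created after the flag is set; space_cache needs a kernel that
supports it):

  chattr +C /path/to/vmware-images        # new images in this directory skip copy-on-write
  mount -o remount,space_cache /          # or add space_cache to the normal mount options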

If none of this works, we'll look at the files that you're fsyncing.
That seems to be the bulk of your latencies.

-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-05-03 10:33       ` Bernhard Schmidt
@ 2011-05-03 11:00         ` cwillu
  2011-05-03 11:26           ` Bernhard Schmidt
  2011-05-03 11:08         ` Chris Mason
  1 sibling, 1 reply; 37+ messages in thread
From: cwillu @ 2011-05-03 11:00 UTC (permalink / raw)
  To: Bernhard Schmidt; +Cc: linux-btrfs

On Tue, May 3, 2011 at 4:33 AM, Bernhard Schmidt <berni@birkenwald.de> wrote:
> Peter Stuge <peter@stuge.se> wrote:
>
> Hey,
>
> defragging btrfs does not seem to work for me. I have run the filefrag
> command over the whole fs and (manually) tried to defrag a few heavily
> fragmented files, but I don't get it to work (it still has the same
> number of extents and they are horrendously uncorrelated)
>
> root@schleppi:~# filefrag
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
> root@schleppi:~# btrfs filesystem defrag
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> root@schleppi:~# filefrag
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
> /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
>
> I'm using Ubuntu Natty (2.6.38.4) and tried both btrfs-tools from Natty
> (201006xx) and from Debian experimental (git from 20101101). Both show
> the same symptoms. I don't think fragmentation is bad on this box (due
> to having an SSD), but my system at home is getting dog slow and I'd
> like to try that when I come home end of the week.

You're not using compression on that filesystem are you?  If so, be
aware that the number of extents isn't going to change after
defragmentation, although you should find that the _locations_ of
those extents are contiguous.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-30 23:55     ` Peter Stuge
@ 2011-05-03 10:33       ` Bernhard Schmidt
  2011-05-03 11:00         ` cwillu
  2011-05-03 11:08         ` Chris Mason
  0 siblings, 2 replies; 37+ messages in thread
From: Bernhard Schmidt @ 2011-05-03 10:33 UTC (permalink / raw)
  To: linux-btrfs

Peter Stuge <peter@stuge.se> wrote:

Hey,

defragging btrfs does not seem to work for me. I have run the filefrag
command over the whole fs and (manually) tried to defrag a few heavily
fragmented files, but I don't get it to work (it still has the same
number of extents and they are horrendously uncorrelated)

root@schleppi:~# filefrag
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found
root@schleppi:~# btrfs filesystem defrag
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
root@schleppi:~# filefrag
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.4/cc1: 72 extents found

I'm using Ubuntu Natty (2.6.38.4) and tried both btrfs-tools from Natty
(201006xx) and from Debian experimental (git from 20101101). Both show
the same symptoms. I don't think fragmentation is bad on this box (due
to having an SSD), but my system at home is getting dog slow and I'd
like to try that when I come home end of the week.

Best Regards,
Bernhard


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-30 17:33   ` Mitch Harder
  2011-04-30 20:40     ` John Wyzer
@ 2011-04-30 23:55     ` Peter Stuge
  2011-05-03 10:33       ` Bernhard Schmidt
  1 sibling, 1 reply; 37+ messages in thread
From: Peter Stuge @ 2011-04-30 23:55 UTC (permalink / raw)
  To: linux-btrfs

Mitch Harder wrote:
> To defragment your entire volume, you'll need a command like:
> 
> # for file in $(find <PATH/TO/BTRFS/VOL/> -type f); do btrfs
> filesystem defragment ${file}; done

Suggest:

find /path/to/btrfs/vol -type f -exec btrfs filesystem defragment '{}' ';'


> If you just want to see your fragmentation you can use the 'filefrag'
> program from e2fsprogs:
> 
> # for file in $(find <PATH/TO/BTRFS/VOL/> -type f); do filefrag
> ${file}; done | sort -n -k 2 | less

find /path/to/btrfs/vol -type f -exec filefrag '{}' ';'


If either command can take multiple filenames as parameters, then e.g.:

find /path/to/btrfs/vol -type f -execdir filefrag '{}' +

(significantly better because not a fork per file)


//Peter

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-30 22:16       ` Mitch Harder
@ 2011-04-30 22:33         ` John Wyzer
  2011-05-03 11:05           ` Chris Mason
  2011-05-03 11:06           ` Chris Mason
  0 siblings, 2 replies; 37+ messages in thread
From: John Wyzer @ 2011-04-30 22:33 UTC (permalink / raw)
  To: linux-btrfs

Excerpts from Mitch Harder's message of Sun May 01 00:16:53 +0200 2011:
> > Hmm.
> > Tried it and it gives me about 500000 lines of
> >
> > FIBMAP: Invalid argument
> >
> > and then:
> >
> > large_file: 1 extent found
> >
> > Is that the way it is supposed to work?
> > Just asking because this was part of a vmware disk image. Both the virtual
> > machine and the rest of the host system are almost unusable once the VM is
> > started (even more unusable than without vmware :-D )
> 
> No.  It sounds like the filefrag command is getting confused in the
> virtual environment.

Misunderstanding :-) I tried filefrag _on_ a vmware disk image, not  inside a
virtual machine.
The whole btrfs story here is on a real machine.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-30 20:40     ` John Wyzer
@ 2011-04-30 22:16       ` Mitch Harder
  2011-04-30 22:33         ` John Wyzer
  0 siblings, 1 reply; 37+ messages in thread
From: Mitch Harder @ 2011-04-30 22:16 UTC (permalink / raw)
  To: John Wyzer; +Cc: linux-btrfs

On Sat, Apr 30, 2011 at 3:40 PM, John Wyzer <john.wyzer@gmx.de> wrote:
>> If you just want to see your fragmentation you can use the 'filefrag'
>> program from e2fsprogs:
>>
>> # for file in $(find <PATH/TO/BTRFS/VOL/> -type f); do filefrag
>> ${file}; done | sort -n -k 2 | less
>
>
> Hmm.
> Tried it and it gives me about 500000 lines of
>
> FIBMAP: Invalid argument
>
> and then:
>
> large_file: 1 extent found
>
> Is that the way it is supposed to work?
> Just asking because this was part of a vmware disk image. Both the virtual
> machine and the rest of the host system are almost unusable once the VM ist
> started (even more unusable than without vmware :-D )

No.  It sounds like the filefrag command is getting confused in the
virtual environment.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
       [not found]           ` <1304146193-sup-2200@localhost>
@ 2011-04-30 20:51             ` John Wyzer
  0 siblings, 0 replies; 37+ messages in thread
From: John Wyzer @ 2011-04-30 20:51 UTC (permalink / raw)
  To: linux-btrfs


Resending this to the list since I did not notice that I had only been replying to
Chris Mason for the last few emails...

Excerpts from John Wyzer's message of Sat Apr 30 08:57:06 +0200 2011:
> Excerpts from Chris Mason's message of Fri Apr 29 22:48:58 +0200 2011:
> > > 
> > > http://bayimg.com/NahClAadn
> > > http://bayimg.com/NahcnaADn
> > > http://bayimg.com/NAhCoAAdN
> > > http://bayimg.com/PahCaaAdN
> > 
> > Ok, you have three processes that may be causing trouble.  We probably
> > just need to defragment the files related to these three and life will
> > be good again.
> > 
> > 1) Firefox.  firefox has a bunch of little databases that are going to
> > fragment badly as we cow. 
> > 
> > 2) sup.  Is this the sup email client?
> > 
> > 3) vmware.  Are you hosting vmware virtual images on btrfs too?
> 
> @2: yes, the sup mail client, polling for messages.
> @3: yes, for some tasks I use vmware
> 
> But those were just the active applications at the time of taking the screenshot.
> I have the same thing with opera or during snapshot deletion or practically
> anything that involves some disk access.
> I'll probably get all atimes for files on my system, sort and defragment the
> files in order of importance...
> We'll see how btrfs behaves afterwards...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-30 17:33   ` Mitch Harder
@ 2011-04-30 20:40     ` John Wyzer
  2011-04-30 22:16       ` Mitch Harder
  2011-04-30 23:55     ` Peter Stuge
  1 sibling, 1 reply; 37+ messages in thread
From: John Wyzer @ 2011-04-30 20:40 UTC (permalink / raw)
  To: linux-btrfs

Excerpts from Mitch Harder's message of Sat Apr 30 19:33:16 +0200 2011:
> Also, please note that 'btrfs filesystem defragment -v /' will
> defragment the directory structure, but not the files.
[...]
> To defragment your entire volume, you'll need a command like:
> 
> # for file in $(find <PATH/TO/BTRFS/VOL/> -type f); do btrfs
> filesystem defragment ${file}; done

Thanks, I'm doing something like that at the moment (sorted the whole system
according to atimes and mtimes and started defragmenting in order of recent
access...)

However at this speed this will never end.

I'm willing to let it run some more nights however to see whether there will be
an effect in the end.

By the way: does it make a difference to run defrag on one file at a time or on
more?
At the moment I'm doing 100 files/directories  per btrfs call...
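
(That batching is roughly equivalent to something like the following, the path
being a placeholder:

  find /path/to/btrfs/vol -type f -print0 | xargs -0 -n 100 btrfs filesystem defragment

i.e. one btrfs invocation per 100 files rather than one per file.)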

> If you just want to see your fragmentation you can use the 'filefrag'
> program from e2fsprogs:
> 
> # for file in $(find <PATH/TO/BTRFS/VOL/> -type f); do filefrag
> ${file}; done | sort -n -k 2 | less


Hmm.
Tried it and it gives me about 500000 lines of

FIBMAP: Invalid argument 

and then:

large_file: 1 extent found

Is that the way it is supposed to work?
Just asking because this was part of a vmware disk image. Both the virtual
machine and the rest of the host system are almost unusable once the VM is
started (even more unusable than without vmware :-D )

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-29 15:01 ` Chris Mason
@ 2011-04-30 17:33   ` Mitch Harder
  2011-04-30 20:40     ` John Wyzer
  2011-04-30 23:55     ` Peter Stuge
       [not found]   ` <1304100271-sup-4177@localhost>
  1 sibling, 2 replies; 37+ messages in thread
From: Mitch Harder @ 2011-04-30 17:33 UTC (permalink / raw)
  To: Chris Mason; +Cc: John Wyzer, linux-btrfs

On Fri, Apr 29, 2011 at 10:01 AM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from John Wyzer's message of 2011-04-29 10:46:08 -0400:
>> Currently on
>> commit 7cf96da3ec7ca225acf4f284b0e904a1f5f98821
>> Author: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
>> Date:   Mon Apr 25 19:43:53 2011 -0400
>> =A0 =A0 Btrfs: cleanup error handling in inode.c
>>
>> merged into 2.6.38.4
>>
>> I'm on a btrfs filesystem that has been used for some time. Let's say nine
>> months. Very recently I noticed performance getting worse and worse.
>> Most of the time it feels as if the system is just busy with iowait.
>> Write and read performance during random access is mostly around 2MB/s,
>> sometimes 1MB/s or slower. It's better for big files which can be read with about
>> 6-9MB/s. The disk is a reasonably recent SATA disk (WDC_WD3200BEVT) so 30MB/s
>> or 40MB/s linear reading should not be a problem.
>>
>> rootfs                291G  242G   35G  88% /
>>
>> I tried  btrfs filesystem defragment -v / but did not notice any improvement
>> after that.
>>
>> Is this a known phenomenon? :-)
>>
>
> Sounds like you're hitting fragmentation, which we can confirm with
> latencytop.  Please run latencytop while you're seeing poor performance
> and take a look at where you're spending most of your time.
>
> -chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Also, please note that 'btrfs filesystem defragment -v /' will
defragment the directory structure, but not the files.

See:
https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#Defragmenting_a_directory_doesn.27t_work

To defragment your entire volume, you'll need a command like:

# for file in $(find <PATH/TO/BTRFS/VOL/> -type f); do btrfs
filesystem defragment ${file}; done

There's also a similar command in the FAQ referenced above.

If you just want to see your fragmentation you can use the 'filefrag'
program from e2fsprogs:

# for file in $(find <PATH/TO/BTRFS/VOL/> -type f); do filefrag
${file}; done | sort -n -k 2 | less
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: abysmal performance
  2011-04-29 14:46 abysmal performance John Wyzer
@ 2011-04-29 15:01 ` Chris Mason
  2011-04-30 17:33   ` Mitch Harder
       [not found]   ` <1304100271-sup-4177@localhost>
  0 siblings, 2 replies; 37+ messages in thread
From: Chris Mason @ 2011-04-29 15:01 UTC (permalink / raw)
  To: John Wyzer; +Cc: linux-btrfs

Excerpts from John Wyzer's message of 2011-04-29 10:46:08 -0400:
> Currently on 
> commit 7cf96da3ec7ca225acf4f284b0e904a1f5f98821
> Author: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
> Date:   Mon Apr 25 19:43:53 2011 -0400
>     Btrfs: cleanup error handling in inode.c
> 
> merged into 2.6.38.4
> 
> I'm on a btrfs filesystem that has been used for some time. Let's say nine 
> months. Very recently I noticed performance getting worse and worse.
> Most of the time it feels as if the system is just busy with iowait.
> Write and read performance during random access is mostly around 2MB/s,
> sometimes 1MB/s or slower. It's better for big files which can be read with about
> 6-9MB/s. The disk is a reasonably recent SATA disk (WDC_WD3200BEVT) so 30MB/s
> or 40MB/s linear reading should not be a problem.
> 
> rootfs                291G  242G   35G  88% /
> 
> I tried  btrfs filesystem defragment -v / but did not notice any improvement 
> after that.
> 
> Is this a known phenomenon? :-)
> 

Sounds like you're hitting fragmentation, which we can confirm with
latencytop.  Please run latencytop while you're seeing poor performance
and take a look at where you're spending most of your time.

-chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* abysmal performance
@ 2011-04-29 14:46 John Wyzer
  2011-04-29 15:01 ` Chris Mason
  0 siblings, 1 reply; 37+ messages in thread
From: John Wyzer @ 2011-04-29 14:46 UTC (permalink / raw)
  To: linux-btrfs

Currently on 
commit 7cf96da3ec7ca225acf4f284b0e904a1f5f98821
Author: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Date:   Mon Apr 25 19:43:53 2011 -0400
    Btrfs: cleanup error handling in inode.c

merged into 2.6.38.4

I'm on a btrfs filesystem that has been used for some time. Let's say nine 
months. Very recently I noticed performance getting worse and worse.
Most of the time it feels as if the system is just busy with iowait.
Write and read performance during random access is mostly around 2MB/s,
sometimes 1MB/s or slower. It's better for big files which can be read with about
6-9MB/s. The disk is a reasonably recent SATA disk (WDC_WD3200BEVT) so 30MB/s
or 40MB/s linear reading should not be a problem.

rootfs                291G  242G   35G  88% /

I tried  btrfs filesystem defragment -v / but did not notice any improvement 
after that.

Is this a known phenomenon? :-)


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2011-06-22 15:58 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-20 21:51 Abysmal Performance Henning Rohlfs
2011-06-21  0:12 ` Josef Bacik
2011-06-21  7:10   ` Henning Rohlfs
2011-06-21  8:00 ` Sander
2011-06-21  9:26   ` Henning Rohlfs
2011-06-21 15:18     ` Josef Bacik
2011-06-21 16:55       ` Henning Rohlfs
2011-06-21 15:24 ` Calvin Walton
2011-06-22 14:15   ` Henning Rohlfs
2011-06-22 15:39     ` Josef Bacik
2011-06-22 15:57       ` Calvin Walton
2011-06-22 15:58         ` Josef Bacik
  -- strict thread matches above, loose matches on Subject: below --
2011-04-29 14:46 abysmal performance John Wyzer
2011-04-29 15:01 ` Chris Mason
2011-04-30 17:33   ` Mitch Harder
2011-04-30 20:40     ` John Wyzer
2011-04-30 22:16       ` Mitch Harder
2011-04-30 22:33         ` John Wyzer
2011-05-03 11:05           ` Chris Mason
2011-05-03 11:06           ` Chris Mason
2011-04-30 23:55     ` Peter Stuge
2011-05-03 10:33       ` Bernhard Schmidt
2011-05-03 11:00         ` cwillu
2011-05-03 11:26           ` Bernhard Schmidt
2011-05-03 11:08         ` Chris Mason
2011-05-03 11:30           ` Bernhard Schmidt
2011-05-03 11:36             ` Chris Mason
2011-05-03 11:43               ` Bernhard Schmidt
2011-05-03 12:52                 ` Chris Mason
2011-05-03 13:03                   ` Bernhard Schmidt
2011-05-03 13:41                     ` Mitch Harder
2011-05-03 14:41                   ` Daniel J Blueman
2011-05-03 15:42                     ` Mitch Harder
2011-05-03 16:51                       ` Chris Mason
2011-05-03 14:54             ` Daniel J Blueman
2011-05-03 15:10               ` Bernhard Schmidt
     [not found]   ` <1304100271-sup-4177@localhost>
     [not found]     ` <1304100862-sup-1493@think>
     [not found]       ` <1304107977-sup-3815@localhost>
     [not found]         ` <1304110058-sup-7292@think>
     [not found]           ` <1304146193-sup-2200@localhost>
2011-04-30 20:51             ` John Wyzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.