linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Poor interactive performance with I/O loads with fsync()ing
@ 2010-03-16 15:31 Ben Gamari
  2010-03-17  1:24 ` tytso
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-16 15:31 UTC (permalink / raw)
  To: linux-kernel; +Cc: Olly Betts, martin f krafft

Hey all,

Recently I started using the Xapian-based notmuch mail client for everyday
use.  One of the things I was quite surprised by after the switch was the
incredible hit in interactive performance that is observed during database
updates. Things are particularly bad during runs of 'notmuch new,' which scans
the file system looking for new messages and adds them to the database.
Specifically, the worst of the performance hit appears to occur when the
database is being updated.

During these periods, even small chunks of I/O can become minute-long ordeals.
It is common for latencytop to show 30 second long latencies for page faults
and writing pages.  Interactive performance is absolutely abysmal, with other
unrelated processes feeling horrible latencies, causing media players,
editors, and even terminals to grind to a halt.

Despite the system being clearly I/O bound, iostat shows pitiful disk
throughput (700kByte/second read, 300 kByte/second write). Certainly this poor
performance can, at least to some degree, be attributable to the fact that
Xapian uses fdatasync() to ensure data consistency. That being said, it seems
like Xapian's page usage causes horrible thrashing, hence the performance hit
on unrelated processes. Moreover, the hit on unrelated processes is so bad
that I would almost suspect that swap I/O is being serialized by fsync() as
well, despite being on a separate swap partition beyond the control of the
filesystem.

Xapian, however, is far from the first time I have seen this sort of
performance cliff. Rsync, which also uses fsync(), can also trigger this sort
of thrashing during system backups, as can rdiff. slocate's updatedb
absolutely kills interactive performance as well.

Issues similar to this have been widely reported[1-5] in the past, and despite
many attempts[5-8] within both I/O and memory managements subsystems to fix
it, the problem certainly remains. I have tried reducing swappiness from 60 to
40, with some small improvement and it has been reported[20] that these sorts
of symptoms can be negated through use of memory control groups to prevent
interactive process pages from being evicted.

I would really like to see this issue finally fixed. I have tried
several[2][3] times to organize the known data about this bug, although in all
cases discussion has stopped with claims of insufficient data (which is fair,
admittedly, it's a very difficult issue to tackle). However, I do think that
_something_ has to be done to alleviate the thrashing and poor interactive
performance that these work-loads cause.

Thanks,

- Ben


[1] http://bugzilla.kernel.org/show_bug.cgi?id=5900
[2] http://bugzilla.kernel.org/show_bug.cgi?id=7372
[3] http://bugzilla.kernel.org/show_bug.cgi?id=12309
[4] http://lkml.org/lkml/2009/4/28/24
[5] http://lkml.org/lkml/2009/3/26/72
[6] http://notmuchmail.org/pipermail/notmuch/2010/001868.html

[10] http://lkml.org/lkml/2009/5/16/225
[11] http://lkml.org/lkml/2007/7/21/219
[12] http://lwn.net/Articles/328363/
[13] http://lkml.org/lkml/2009/4/6/114

[20] http://lkml.org/lkml/2009/4/28/68


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-16 15:31 Poor interactive performance with I/O loads with fsync()ing Ben Gamari
@ 2010-03-17  1:24 ` tytso
  2010-03-17  3:18   ` Ben Gamari
  2010-03-17  4:53 ` Nick Piggin
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 32+ messages in thread
From: tytso @ 2010-03-17  1:24 UTC (permalink / raw)
  To: Ben Gamari; +Cc: linux-kernel, Olly Betts, martin f krafft

On Tue, Mar 16, 2010 at 08:31:12AM -0700, Ben Gamari wrote:
> Hey all,
> 
> Recently I started using the Xapian-based notmuch mail client for everyday
> use.  One of the things I was quite surprised by after the switch was the
> incredible hit in interactive performance that is observed during database
> updates. Things are particularly bad during runs of 'notmuch new,' which scans
> the file system looking for new messages and adds them to the database.
> Specifically, the worst of the performance hit appears to occur when the
> database is being updated.

What kernel version are you using; what distribution and what version
of that distro are you running; what file system are you using and
what if any mount options are you using?  And what kind of hard drives
do you have?

I'm going to assume you're running into the standard ext3
"data=ordered" entagled writes problem.  There are solutions, such as
switching to using ext4, mounting with data=writeback mode, but they
have various shortcomings.

A number of improvements have been made in ext3 and ext4 since some of
the discussions you quoted, but since you didn't tell us what
distribution version and/or what kernel version you are using, we
can't tell you are using those newer improvements yet.

      	       	   	       	     - Ted

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  1:24 ` tytso
@ 2010-03-17  3:18   ` Ben Gamari
  2010-03-17  3:30     ` tytso
  0 siblings, 1 reply; 32+ messages in thread
From: Ben Gamari @ 2010-03-17  3:18 UTC (permalink / raw)
  To: tytso; +Cc: linux-kernel, Olly Betts, martin f krafft

Sorry about the lack of any useful information in my initial email.
I clearly didn't read it before sending.

On Tue, 16 Mar 2010 21:24:39 -0400, tytso@mit.edu wrote:
> What kernel version are you using; what distribution and what version
> of that distro are you running; what file system are you using and
> what if any mount options are you using?  And what kind of hard drives
> do you have?

While this problem has been around for some time, my current configuration
is the following:

  Kernel 2.6.32 (although also reproducible with kernels at least as early as 2.6.28)
  Filesystem: Now Btrfs (was ext4 less than a week ago), default mount options
  Hard drive: Seagate Momentus 7200.4 (ST9500420AS)
  Distribution: Ubuntu 9.10 (Karmic)

> 
> I'm going to assume you're running into the standard ext3
> "data=ordered" entagled writes problem.  There are solutions, such as
> switching to using ext4, mounting with data=writeback mode, but they
> have various shortcomings.
> 
Unfortunately several people have continued to encounter unacceptable
latency, even with ext4 and data=writeback.

> A number of improvements have been made in ext3 and ext4 since some of
> the discussions you quoted, but since you didn't tell us what
> distribution version and/or what kernel version you are using, we
> can't tell you are using those newer improvements yet.
>
Sorry about that. I should know better by now.

- Ben

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  3:18   ` Ben Gamari
@ 2010-03-17  3:30     ` tytso
  2010-03-17  4:31       ` Ben Gamari
  0 siblings, 1 reply; 32+ messages in thread
From: tytso @ 2010-03-17  3:30 UTC (permalink / raw)
  To: Ben Gamari; +Cc: linux-kernel, Olly Betts, martin f krafft

On Tue, Mar 16, 2010 at 08:18:09PM -0700, Ben Gamari wrote:
> Sorry about the lack of any useful information in my initial email.
> I clearly didn't read it before sending.
> 
> On Tue, 16 Mar 2010 21:24:39 -0400, tytso@mit.edu wrote:
> > What kernel version are you using; what distribution and what version
> > of that distro are you running; what file system are you using and
> > what if any mount options are you using?  And what kind of hard drives
> > do you have?
> 
> While this problem has been around for some time, my current configuration
> is the following:
> 
>   Kernel 2.6.32 (although also reproducible with kernels at least as early as 2.6.28)
>   Filesystem: Now Btrfs (was ext4 less than a week ago), default mount options
>   Hard drive: Seagate Momentus 7200.4 (ST9500420AS)
>   Distribution: Ubuntu 9.10 (Karmic)

.... so did switching to Btrfs solve your latency issues, or are you
still having problems?

						- Ted

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  3:30     ` tytso
@ 2010-03-17  4:31       ` Ben Gamari
  2010-03-26  3:16         ` Ben Gamari
  0 siblings, 1 reply; 32+ messages in thread
From: Ben Gamari @ 2010-03-17  4:31 UTC (permalink / raw)
  To: tytso; +Cc: linux-kernel, Olly Betts, martin f krafft

On Tue, 16 Mar 2010 23:30:10 -0400, tytso@mit.edu wrote:
> .... so did switching to Btrfs solve your latency issues, or are you
> still having problems?

Still having troubles although I'm now running 2.6.34-rc1 and things seem
mildly better. I'll try doing a backup tonight and report back.

- Ben

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-16 15:31 Poor interactive performance with I/O loads with fsync()ing Ben Gamari
  2010-03-17  1:24 ` tytso
@ 2010-03-17  4:53 ` Nick Piggin
  2010-03-17  9:37   ` Ingo Molnar
  2010-03-26  3:28   ` Ben Gamari
  2010-03-23 19:51 ` Jesper Krogh
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 32+ messages in thread
From: Nick Piggin @ 2010-03-17  4:53 UTC (permalink / raw)
  To: Ben Gamari, tytso; +Cc: linux-kernel, Olly Betts, martin f krafft

Hi,

On Tue, Mar 16, 2010 at 08:31:12AM -0700, Ben Gamari wrote:
> Hey all,
> 
> Recently I started using the Xapian-based notmuch mail client for everyday
> use.  One of the things I was quite surprised by after the switch was the
> incredible hit in interactive performance that is observed during database
> updates. Things are particularly bad during runs of 'notmuch new,' which scans
> the file system looking for new messages and adds them to the database.
> Specifically, the worst of the performance hit appears to occur when the
> database is being updated.
> 
> During these periods, even small chunks of I/O can become minute-long ordeals.
> It is common for latencytop to show 30 second long latencies for page faults
> and writing pages.  Interactive performance is absolutely abysmal, with other
> unrelated processes feeling horrible latencies, causing media players,
> editors, and even terminals to grind to a halt.
> 
> Despite the system being clearly I/O bound, iostat shows pitiful disk
> throughput (700kByte/second read, 300 kByte/second write). Certainly this poor
> performance can, at least to some degree, be attributable to the fact that
> Xapian uses fdatasync() to ensure data consistency. That being said, it seems
> like Xapian's page usage causes horrible thrashing, hence the performance hit
> on unrelated processes.

Where are the unrelated processes waiting? Can you get a sample of
several backtraces?  (/proc/<pid>/stack should do it)


> Moreover, the hit on unrelated processes is so bad
> that I would almost suspect that swap I/O is being serialized by fsync() as
> well, despite being on a separate swap partition beyond the control of the
> filesystem.

It shouldn't be, until it reaches the bio layer. If it is on the same
block device, it will still fight for access. It could also be blocking
on dirty data thresholds, or page reclaim though -- writeback and
reclaim could easily be getting slowed down by the fsync activity.

Swapping tends to cause fairly nasty disk access patterns, combined with
fsync it could be pretty unavoidable.

> 
> Xapian, however, is far from the first time I have seen this sort of
> performance cliff. Rsync, which also uses fsync(), can also trigger this sort
> of thrashing during system backups, as can rdiff. slocate's updatedb
> absolutely kills interactive performance as well.
> 
> Issues similar to this have been widely reported[1-5] in the past, and despite
> many attempts[5-8] within both I/O and memory managements subsystems to fix
> it, the problem certainly remains. I have tried reducing swappiness from 60 to
> 40, with some small improvement and it has been reported[20] that these sorts
> of symptoms can be negated through use of memory control groups to prevent
> interactive process pages from being evicted.

So the workload is causing quite a lot of swapping as well? How much
pagecache do you have? It could be that you have too much pagecache and
it is pushing out anonymous memory too easily, or you might have too
little pagecache causing suboptimal writeout patterns (possibly writeout
from page reclaim rather than asynchronous dirty page cleaner threads,
which can really hurt).

Thanks,
Nick


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  4:53 ` Nick Piggin
@ 2010-03-17  9:37   ` Ingo Molnar
  2010-03-26  3:31     ` Ben Gamari
  2010-04-09 15:21     ` Ben Gamari
  2010-03-26  3:28   ` Ben Gamari
  1 sibling, 2 replies; 32+ messages in thread
From: Ingo Molnar @ 2010-03-17  9:37 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Ben Gamari, tytso, linux-kernel, Olly Betts, martin f krafft


* Nick Piggin <npiggin@suse.de> wrote:

> Hi,
> 
> On Tue, Mar 16, 2010 at 08:31:12AM -0700, Ben Gamari wrote:
> > Hey all,
> > 
> > Recently I started using the Xapian-based notmuch mail client for everyday
> > use.  One of the things I was quite surprised by after the switch was the
> > incredible hit in interactive performance that is observed during database
> > updates. Things are particularly bad during runs of 'notmuch new,' which scans
> > the file system looking for new messages and adds them to the database.
> > Specifically, the worst of the performance hit appears to occur when the
> > database is being updated.
> > 
> > During these periods, even small chunks of I/O can become minute-long ordeals.
> > It is common for latencytop to show 30 second long latencies for page faults
> > and writing pages.  Interactive performance is absolutely abysmal, with other
> > unrelated processes feeling horrible latencies, causing media players,
> > editors, and even terminals to grind to a halt.
> > 
> > Despite the system being clearly I/O bound, iostat shows pitiful disk
> > throughput (700kByte/second read, 300 kByte/second write). Certainly this poor
> > performance can, at least to some degree, be attributable to the fact that
> > Xapian uses fdatasync() to ensure data consistency. That being said, it seems
> > like Xapian's page usage causes horrible thrashing, hence the performance hit
> > on unrelated processes.
> 
> Where are the unrelated processes waiting? Can you get a sample of several 
> backtraces?  (/proc/<pid>/stack should do it)

A call-graph profile will show the precise reason for IO latencies, and their 
relatively likelihood.

It's really simple to do it with a recent kernel. Firstly, enable 
CONFIG_BLK_DEV_IO_TRACE=y, CONFIG_EVENT_PROFILE=y:

  Kernel performance events and counters (PERF_EVENTS) [Y/?] y
    Tracepoint profiling sources (EVENT_PROFILE) [Y/n/?] y
    Support for tracing block IO actions (BLK_DEV_IO_TRACE) [N/y/?] y

(boot into this kernel)

Then build perf via:

  cd tools/perf/
  make -j install

and then capture 10 seconds of the DB workload:

  perf record -f -g -a -e block:block_rq_issue -c 1 sleep 10

  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.251 MB perf.data (~10977 samples) ]

and look at the call-graph output:

  perf report

# Samples: 5
#
# Overhead          Command      Shared Object  Symbol
# ........  ...............  .................  ......
#
    80.00%        kjournald  [kernel.kallsyms]  [k] perf_trace_block_rq_issue
                  |
                  --- perf_trace_block_rq_issue
                      scsi_request_fn
                     |          
                     |--50.00%-- __blk_run_queue
                     |          cfq_insert_request
                     |          elv_insert
                     |          __elv_add_request
                     |          __make_request
                     |          generic_make_request
                     |          submit_bio
                     |          submit_bh
                     |          sync_dirty_buffer
                     |          journal_commit_transaction
                     |          kjournald
                     |          kthread
                     |          kernel_thread_helper
                     |          
                      --50.00%-- __generic_unplug_device
                                generic_unplug_device
                                blk_unplug
                                blk_backing_dev_unplug
                                sync_buffer
                                __wait_on_bit
                                out_of_line_wait_on_bit
                                __wait_on_buffer
                                wait_on_buffer
                                journal_commit_transaction
                                kjournald
                                kthread
                                kernel_thread_helper

    20.00%               as  [kernel.kallsyms]  [k] perf_trace_block_rq_issue
                         |
                         --- perf_trace_block_rq_issue
                             scsi_request_fn
                             __generic_unplug_device
                             generic_unplug_device
                             blk_unplug
                             blk_backing_dev_unplug
                             page_cache_async_readahead
                             generic_file_aio_read
                             do_sync_read
                             vfs_read
                             sys_read
                             system_call_fastpath
                             0x39f8ad4930


This (very simple) example had 80% of the IO in kjournald and 20% of it in 
'as'. The precise call-paths of IO issues are visible.

For general scheduler context-switch events you can use:

  perf record -f -g -a -e context-switches -c 1 sleep 10

see 'perf list' for all events.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-16 15:31 Poor interactive performance with I/O loads with fsync()ing Ben Gamari
  2010-03-17  1:24 ` tytso
  2010-03-17  4:53 ` Nick Piggin
@ 2010-03-23 19:51 ` Jesper Krogh
  2010-03-26  3:13 ` Ben Gamari
  2010-03-28  1:20 ` Ben Gamari
  4 siblings, 0 replies; 32+ messages in thread
From: Jesper Krogh @ 2010-03-23 19:51 UTC (permalink / raw)
  To: Ben Gamari, linux-kernel

Ben Gamari wrote:
> Hey all,
> 
> Recently I started using the Xapian-based notmuch mail client for everyday
> use.  One of the things I was quite surprised by after the switch was the
> incredible hit in interactive performance that is observed during database
> updates. Things are particularly bad during runs of 'notmuch new,' which scans
> the file system looking for new messages and adds them to the database.
> Specifically, the worst of the performance hit appears to occur when the
> database is being updated.

I would suggest that you include a 2.6.31 kernel in your testing. I have
seen something that seems like "huge" stalls in 2.6.32 but I havent been
able to "dig into it" to find more.

In 2.6.32 I have seen IO-wait numbers around 80% on a 16 core machine
with 128GB of memory and load-numbers over 120 under workloads that
didn't make 2.6.31 sweat at all.

Filesystems are a mixture of ext3 and ext4 (so it could be the barriers)?

-- 
Jesper

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-16 15:31 Poor interactive performance with I/O loads with fsync()ing Ben Gamari
                   ` (2 preceding siblings ...)
  2010-03-23 19:51 ` Jesper Krogh
@ 2010-03-26  3:13 ` Ben Gamari
  2010-03-28  1:20 ` Ben Gamari
  4 siblings, 0 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-26  3:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: Olly Betts, martin f krafft

On Tue, 16 Mar 2010 08:31:12 -0700 (PDT), Ben Gamari <bgamari.foss@gmail.com> wrote:
> Hey all,

I apologize for my extreme tardiness in replying to your responses. I was
hoping to have more time during Spring break in dealing with this issue than I
did (as always). Nevertheless, I'll hopefully be able to keep up with things at
this point. Specific replies will follow.

- Ben


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  4:31       ` Ben Gamari
@ 2010-03-26  3:16         ` Ben Gamari
  0 siblings, 0 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-26  3:16 UTC (permalink / raw)
  To: tytso; +Cc: linux-kernel, Olly Betts, martin f krafft

On Tue, 16 Mar 2010 21:31:03 -0700 (PDT), Ben Gamari <bgamari.foss@gmail.com> wrote:
> On Tue, 16 Mar 2010 23:30:10 -0400, tytso@mit.edu wrote:
> > .... so did switching to Btrfs solve your latency issues, or are you
> > still having problems?
> 
> Still having troubles although I'm now running 2.6.34-rc1 and things seem
> mildly better. I'll try doing a backup tonight and report back.
> 
I stand by my assertion that 2.6.34 does seem better in some regards. While
there certainly are still latency issues, it's now less often that heavy I/O
spills over into over processes' interactive performance. That being said,
earlier this evening Tracker and notmuch were both indexing and I saw several
events of tens of seconds of latency.

- Ben

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  4:53 ` Nick Piggin
  2010-03-17  9:37   ` Ingo Molnar
@ 2010-03-26  3:28   ` Ben Gamari
  1 sibling, 0 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-26  3:28 UTC (permalink / raw)
  To: Nick Piggin, tytso; +Cc: linux-kernel, Olly Betts, martin f krafft

On Wed, 17 Mar 2010 15:53:50 +1100, Nick Piggin <npiggin@suse.de> wrote:
> Where are the unrelated processes waiting? Can you get a sample of
> several backtraces?  (/proc/<pid>/stack should do it)
> 
I wish. One of the incredibly frustrating characteristics of this issue is the
difficulty in measuring it. By the time processes begin blocking, it's already
far too late to open a terminal and cat to a file. By the time the terminal has
opened, tens of seconds have passed and things have started to return to normal.

> 
> > Moreover, the hit on unrelated processes is so bad
> > that I would almost suspect that swap I/O is being serialized by fsync() as
> > well, despite being on a separate swap partition beyond the control of the
> > filesystem.
> 
> It shouldn't be, until it reaches the bio layer. If it is on the same
> block device, it will still fight for access. It could also be blocking
> on dirty data thresholds, or page reclaim though -- writeback and
> reclaim could easily be getting slowed down by the fsync activity.
> 
Hmm, this sounds interesting. Is there a way to monitor writeback throughput.

> Swapping tends to cause fairly nasty disk access patterns, combined with
> fsync it could be pretty unavoidable.
> 
This is definitely a possibility. However, it seems to me like swapping should
be at least mildly favored over other I/O by the I/O scheduler. That being
said, I can certainly see how it would be difficult to implement such a
heuristic in a fair way so as not to block out standard filesystem access
during a thrashing spree.

> > 
> > Xapian, however, is far from the first time I have seen this sort of
> > performance cliff. Rsync, which also uses fsync(), can also trigger this sort
> > of thrashing during system backups, as can rdiff. slocate's updatedb
> > absolutely kills interactive performance as well.
> > 
> > Issues similar to this have been widely reported[1-5] in the past, and despite
> > many attempts[5-8] within both I/O and memory managements subsystems to fix
> > it, the problem certainly remains. I have tried reducing swappiness from 60 to
> > 40, with some small improvement and it has been reported[20] that these sorts
> > of symptoms can be negated through use of memory control groups to prevent
> > interactive process pages from being evicted.
> 
> So the workload is causing quite a lot of swapping as well? How much
> pagecache do you have? It could be that you have too much pagecache and
> it is pushing out anonymous memory too easily, or you might have too
> little pagecache causing suboptimal writeout patterns (possibly writeout
> from page reclaim rather than asynchronous dirty page cleaner threads,
> which can really hurt).
> 
As far as I can tell, the workload should fit in memory without a problem. This
machine has 4 gigabytes of memory, of which currently 2.8GB is page cache.
Seems high perhaps? I've included meminfo below. I can completely see how
overly-aggressive page-cache would result in this sort of behavior.

- Ben


MemTotal:        4048068 kB
MemFree:           47232 kB
Buffers:              48 kB
Cached:          2774648 kB
SwapCached:         1148 kB
Active:          2353572 kB
Inactive:        1355980 kB
Active(anon):    1343176 kB
Inactive(anon):   342644 kB
Active(file):    1010396 kB
Inactive(file):  1013336 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4883756 kB
SwapFree:        4882532 kB
Dirty:             24736 kB
Writeback:             0 kB
AnonPages:        933820 kB
Mapped:            88840 kB
Shmem:            750948 kB
Slab:             150752 kB
SReclaimable:     121404 kB
SUnreclaim:        29348 kB
KernelStack:        2672 kB
PageTables:        31312 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6907788 kB
Committed_AS:    2773672 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      364080 kB
VmallocChunk:   34359299100 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        8552 kB
DirectMap2M:     4175872 kB


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  9:37   ` Ingo Molnar
@ 2010-03-26  3:31     ` Ben Gamari
  2010-04-09 15:21     ` Ben Gamari
  1 sibling, 0 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-26  3:31 UTC (permalink / raw)
  To: Ingo Molnar, Nick Piggin; +Cc: tytso, linux-kernel, Olly Betts, martin f krafft

On Wed, 17 Mar 2010 10:37:04 +0100, Ingo Molnar <mingo@elte.hu> wrote:
> 
> A call-graph profile will show the precise reason for IO latencies, and their 
> relatively likelihood.
> 
Once I get home I'll try to reproduce the issue and get a call graph. Thanks!

- Ben

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-16 15:31 Poor interactive performance with I/O loads with fsync()ing Ben Gamari
                   ` (3 preceding siblings ...)
  2010-03-26  3:13 ` Ben Gamari
@ 2010-03-28  1:20 ` Ben Gamari
  2010-03-28  1:29   ` Ben Gamari
  2010-03-28  3:42   ` Arjan van de Ven
  4 siblings, 2 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-28  1:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: tytso, npiggin, mingo, Ruald Andreae, Jens Axboe, Olly Betts,
	martin f krafft

Hey all,

I have posted another profile[1] from an incident yesterday. As you can see,
both swapper and init (strange?) show up prominently in the profile. Moreover,
most processes seem to be in blk_peek_request a disturbingly large percentage
of the time. Both of these profiles were taken with 2.6.34-rc kernels.

Anyone have any ideas on how to proceed? Is more profile data necessary? Are
the existing profiles at all useful? Thanks,

- Ben

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-28  1:20 ` Ben Gamari
@ 2010-03-28  1:29   ` Ben Gamari
  2010-03-28  3:42   ` Arjan van de Ven
  1 sibling, 0 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-28  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: tytso, npiggin, mingo, Ruald Andreae, Jens Axboe, Olly Betts,
	martin f krafft

On Sat, 27 Mar 2010 18:20:37 -0700 (PDT), Ben Gamari <bgamari.foss@gmail.com> wrote:
> Hey all,
> 
> I have posted another profile[1] from an incident yesterday. As you can see,
> both swapper and init (strange?) show up prominently in the profile. Moreover,
> most processes seem to be in blk_peek_request a disturbingly large percentage
> of the time. Both of these profiles were taken with 2.6.34-rc kernels.
> 

Apparently my initial email announcing my first set of profiles never made it
out. Sorry for the confusion. I've included it below.


From: Ben Gamari <bgamari.foss@gmail.com>
Subject: Re: Poor interactive performance with I/O loads with fsync()ing
To: Ingo Molnar <mingo@elte.hu>, Nick Piggin <npiggin@suse.de>
Cc: tytso@mit.edu, linux-kernel@vger.kernel.org, Olly Betts
	<olly@survex.com>, martin f krafft <madduck@madduck.net>
Bcc: bgamari@gmail.com
In-Reply-To: <20100317093704.GA17146@elte.hu>
References: <4b9fa440.12135e0a.7fc8.ffffe745@mx.google.com>
	<20100317045350.GA2869@laptop> <20100317093704.GA17146@elte.hu>
On Wed, 17 Mar 2010 10:37:04 +0100, Ingo Molnar <mingo@elte.hu> wrote:
> A call-graph profile will show the precise reason for IO latencies, and their 
> relatively likelihood.

Well, here is something for now. I'm not sure how valid the reproduction
workload is (git pull, rsync, and 'notmuch new' all running at once), but I
certainly did produce a few stalls and swapper is highest on the profile.
This was on 2.6.34-rc2. I've included part of the profile below, although more
complete set of data is available at [1].

Thanks,

- Ben


[1] http://mw0.mooo.com/~ben/latency-2010-03-25-a/


# Samples: 25295
#
# Overhead          Command      Shared Object  Symbol
# ........  ...............  .................  ......
#
    24.50%          swapper  [kernel.kallsyms]  [k] blk_peek_request
                    |
                    --- blk_peek_request
                        scsi_request_fn
                        __blk_run_queue
                       |          
                       |--98.32%-- blk_run_queue
                       |          scsi_run_queue
                       |          scsi_next_command
                       |          scsi_io_completion
                       |          scsi_finish_command
                       |          scsi_softirq_done
                       |          blk_done_softirq
                       |          __do_softirq
                       |          call_softirq
                       |          do_softirq
                       |          irq_exit
                       |          |          
                       |          |--99.56%-- do_IRQ
                       |          |          ret_from_intr
                       |          |          |          
                       |          |          |--98.02%-- cpuidle_idle_call
                       |          |          |          cpu_idle
                       |          |          |          rest_init
                       |          |          |          start_kernel
                       |          |          |          x86_64_start_reservations
                       |          |          |          x86_64_start_kernel
                       |          |          |          
                       |          |          |--0.91%-- clockevents_notify
                       |          |          |          lapic_timer_state_broadcast
                       |          |          |          |          
                       |          |          |          |--83.64%-- acpi_idle_enter_bm
                       |          |          |          |          cpuidle_idle_call
                       |          |          |          |          cpu_idle
                       |          |          |          |          rest_init
                       |          |          |          |          start_kernel
                       |          |          |          |          x86_64_start_reservations
                       |          |          |          |          x86_64_start_kernel
                       |          |          |          |          
                       |          |          |           --16.36%-- acpi_idle_enter_simple
                       |          |          |                     cpuidle_idle_call
                       |          |          |                     cpu_idle
                       |          |          |                     rest_init
                       |          |          |                     start_kernel
                       |          |          |                     x86_64_start_reservations
                       |          |          |                     x86_64_start_kernel
                       |          |          |          
                       |          |          |--0.81%-- cpu_idle
                       |          |          |          rest_init
                       |          |          |          start_kernel
                       |          |          |          x86_64_start_reservations
                       |          |          |          x86_64_start_kernel
                       |          |           --0.26%-- [...]
                       |           --0.44%-- [...]
                       |          
                        --1.68%-- elv_completed_request
                                  __blk_put_request
                                  blk_finish_request
                                  blk_end_bidi_request
                                  blk_end_request
                                  scsi_io_completion
                                  scsi_finish_command
                                  scsi_softirq_done
                                  blk_done_softirq
                                  __do_softirq
                                  call_softirq
                                  do_softirq
                                  irq_exit
                                  do_IRQ
                                  ret_from_intr
                                  |          
                                  |--96.15%-- cpuidle_idle_call
                                  |          cpu_idle
                                  |          rest_init
                                  |          start_kernel
                                  |          x86_64_start_reservations
                                  |          x86_64_start_kernel
                                  |          
                                  |--1.92%-- cpu_idle
                                  |          rest_init
                                  |          start_kernel
                                  |          x86_64_start_reservations
                                  |          x86_64_start_kernel
                                  |          
                                  |--0.96%-- schedule
                                  |          cpu_idle
                                  |          rest_init
                                  |          start_kernel
                                  |          x86_64_start_reservations
                                  |          x86_64_start_kernel
                                  |          
                                   --0.96%-- clockevents_notify
                                             lapic_timer_state_broadcast
                                             acpi_idle_enter_bm
                                             cpuidle_idle_call
                                             cpu_idle
                                             rest_init
                                             start_kernel
                                             x86_64_start_reservations
                                             x86_64_start_kernel

    23.74%             init  [kernel.kallsyms]  [k] blk_peek_request
                       |
                       --- blk_peek_request
                           scsi_request_fn
                           __blk_run_queue
                          |          
                          |--98.77%-- blk_run_queue
                          |          scsi_run_queue
                          |          scsi_next_command
                          |          scsi_io_completion
                          |          scsi_finish_command
                          |          scsi_softirq_done
                          |          blk_done_softirq
                          |          __do_softirq
                          |          call_softirq
                          |          do_softirq
                          |          irq_exit
                          |          |          
                          |          |--99.87%-- do_IRQ
                          |          |          ret_from_intr
                          |          |          |          
                          |          |          |--98.38%-- cpuidle_idle_call
                          |          |          |          cpu_idle
                          |          |          |          start_secondary
                          |          |          |          
                          |          |          |--0.81%-- schedule
                          |          |          |          cpu_idle
                          |          |          |          start_secondary
                          |          |          |          
                          |          |          |--0.56%-- cpu_idle
                          |          |          |          start_secondary
                          |          |           --0.25%-- [...]
                          |           --0.13%-- [...]
                          |          
                           --1.23%-- elv_completed_request
                                     __blk_put_request
                                     blk_finish_request
                                     blk_end_bidi_request
                                     blk_end_request
                                     scsi_io_completion
                                     scsi_finish_command
                                     scsi_softirq_done
                                     blk_done_softirq
                                     __do_softirq
                                     call_softirq
                                     do_softirq
                                     irq_exit
                                     do_IRQ
                                     ret_from_intr
                                     cpuidle_idle_call
                                     cpu_idle
                                     start_secondary

     5.85%  chromium-browse  [kernel.kallsyms]  [k] blk_peek_request
            |
            --- blk_peek_request
                scsi_request_fn
                __blk_run_queue
                blk_run_queue
                scsi_run_queue
                scsi_next_command
                scsi_io_completion
                scsi_finish_command
                scsi_softirq_done
                blk_done_softirq
                __do_softirq
                call_softirq
                do_softirq
                irq_exit
                do_IRQ
                ret_from_intr
               |          
               |--50.00%-- check_match.8653
               |          
                --50.00%-- unlink_anon_vmas
                          free_pgtables
                          exit_mmap
                          mmput
                          exit_mm
                          do_exit
                          do_group_exit
                          sys_exit_group
                          system_call
...

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-28  1:20 ` Ben Gamari
  2010-03-28  1:29   ` Ben Gamari
@ 2010-03-28  3:42   ` Arjan van de Ven
  2010-03-28 14:06     ` Ben Gamari
  1 sibling, 1 reply; 32+ messages in thread
From: Arjan van de Ven @ 2010-03-28  3:42 UTC (permalink / raw)
  To: Ben Gamari
  Cc: linux-kernel, tytso, npiggin, mingo, Ruald Andreae, Jens Axboe,
	Olly Betts, martin f krafft

On Sat, 27 Mar 2010 18:20:37 -0700 (PDT)
Ben Gamari <bgamari.foss@gmail.com> wrote:

> Hey all,
> 
> I have posted another profile[1] from an incident yesterday. As you
> can see, both swapper and init (strange?) show up prominently in the
> profile. Moreover, most processes seem to be in blk_peek_request a
> disturbingly large percentage of the time. Both of these profiles
> were taken with 2.6.34-rc kernels.
> 
> Anyone have any ideas on how to proceed? Is more profile data
> necessary? Are the existing profiles at all useful? Thanks,


profiles tend to be about cpu usage... and are rather poor to deal with
anything IO related.

latencytop might get closer in giving useful information....

(btw some general suggestion.. make sure you're using noatime or
relatime as mount option) 
-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-28  3:42   ` Arjan van de Ven
@ 2010-03-28 14:06     ` Ben Gamari
  2010-03-28 22:08       ` Andi Kleen
  0 siblings, 1 reply; 32+ messages in thread
From: Ben Gamari @ 2010-03-28 14:06 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: linux-kernel, tytso, npiggin, mingo, Ruald Andreae, Jens Axboe,
	Olly Betts, martin f krafft

On Sat, 27 Mar 2010 20:42:33 -0700, Arjan van de Ven <arjan@infradead.org> wrote:
> On Sat, 27 Mar 2010 18:20:37 -0700 (PDT)
> Ben Gamari <bgamari.foss@gmail.com> wrote:
> 
> > Hey all,
> > 
> > I have posted another profile[1] from an incident yesterday. As you
> > can see, both swapper and init (strange?) show up prominently in the
> > profile. Moreover, most processes seem to be in blk_peek_request a
> > disturbingly large percentage of the time. 
> > 
I suppose this statement was a tad misleading. The provided profiles were taken
with,

perf record -f -g -a -e block:block_rq_issue -c 1

Which I believe measures block requests issued, not CPU usage (correct me if
I'm wrong).

> profiles tend to be about cpu usage... and are rather poor to deal with
> anything IO related.
> 
See above.

> latencytop might get closer in giving useful information....
> 
Latencytop generally shows a large amount of time handling page faults.

> (btw some general suggestion.. make sure you're using noatime or
> relatime as mount option) 

Thanks for the suggestion. I had actually forgotten relatime in my fstab, so
we'll see if there's any improvement now. That being said, I/O loads over small
numbers of files (e.g. xapian) are just as bad as loads over large numbers of
files. To me that weakly suggests perhaps atime updates aren't the issue (I
could be horribly wrong though).

- Ben


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-28 14:06     ` Ben Gamari
@ 2010-03-28 22:08       ` Andi Kleen
  2010-04-09 14:56         ` Ben Gamari
  0 siblings, 1 reply; 32+ messages in thread
From: Andi Kleen @ 2010-03-28 22:08 UTC (permalink / raw)
  To: Ben Gamari
  Cc: Arjan van de Ven, linux-kernel, tytso, npiggin, mingo,
	Ruald Andreae, Jens Axboe, Olly Betts, martin f krafft

Ben Gamari <bgamari.foss@gmail.com> writes:

You don't say which file system you use, but ext3 and the file systems
with similar journal design (like reiserfs) all have known fsync starvation
issues. The problem is that any fsync has to wait for all transactions
to commit, and this might take a long time depending on how busy
the disk is.

ext4/XFS/JFS/btrfs should be better in this regard

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-28 22:08       ` Andi Kleen
@ 2010-04-09 14:56         ` Ben Gamari
  2010-04-11 15:03           ` Avi Kivity
  0 siblings, 1 reply; 32+ messages in thread
From: Ben Gamari @ 2010-04-09 14:56 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Arjan van de Ven, linux-kernel, tytso, npiggin, mingo,
	Ruald Andreae, Jens Axboe, Olly Betts, martin f krafft

On Mon, 29 Mar 2010 00:08:58 +0200, Andi Kleen <andi@firstfloor.org> wrote:
> Ben Gamari <bgamari.foss@gmail.com> writes:
> ext4/XFS/JFS/btrfs should be better in this regard
> 
I am using btrfs, so yes, I was expecting things to be better. Unfortunately, 
the improvement seems to be non-existent under high IO/fsync load.

- Ben


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-17  9:37   ` Ingo Molnar
  2010-03-26  3:31     ` Ben Gamari
@ 2010-04-09 15:21     ` Ben Gamari
  1 sibling, 0 replies; 32+ messages in thread
From: Ben Gamari @ 2010-04-09 15:21 UTC (permalink / raw)
  To: Ingo Molnar, Nick Piggin; +Cc: tytso, linux-kernel, Olly Betts, martin f krafft

On Wed, 17 Mar 2010 10:37:04 +0100, Ingo Molnar <mingo@elte.hu> wrote:
> * Nick Piggin <npiggin@suse.de> wrote:
> 
> > Hi,
> > 
> > On Tue, Mar 16, 2010 at 08:31:12AM -0700, Ben Gamari wrote:
> > > Hey all,
> > > 
> > > Recently I started using the Xapian-based notmuch mail client for everyday
> > > use.  One of the things I was quite surprised by after the switch was the
> > > incredible hit in interactive performance that is observed during database
> > > updates. Things are particularly bad during runs of 'notmuch new,' which scans
> > > the file system looking for new messages and adds them to the database.
> > > Specifically, the worst of the performance hit appears to occur when the
> > > database is being updated.
> > > 
> > > During these periods, even small chunks of I/O can become minute-long ordeals.
> > > It is common for latencytop to show 30 second long latencies for page faults
> > > and writing pages.  Interactive performance is absolutely abysmal, with other
> > > unrelated processes feeling horrible latencies, causing media players,
> > > editors, and even terminals to grind to a halt.
> > > 
> > > Despite the system being clearly I/O bound, iostat shows pitiful disk
> > > throughput (700kByte/second read, 300 kByte/second write). Certainly this poor
> > > performance can, at least to some degree, be attributable to the fact that
> > > Xapian uses fdatasync() to ensure data consistency. That being said, it seems
> > > like Xapian's page usage causes horrible thrashing, hence the performance hit
> > > on unrelated processes.
> > 
> > Where are the unrelated processes waiting? Can you get a sample of several 
> > backtraces?  (/proc/<pid>/stack should do it)
> 
> A call-graph profile will show the precise reason for IO latencies, and their 
> relatively likelihood.
> 
Are these backtraces at all useful? I've received no feedback thusfar, so I can
only that either,
  a) there is insufficient data to draw any conclusions and there is no
     interest in pursuing this further, or
  b) nobody has looked at the backtraces

As I've said in the past, I am very interested in seeing this problem looked at and
would love to contribute whatever I can to that effort. However, without knowing what
information is necessary, I can be of only very limited use in my own debugging
efforts. Thanks,

- Ben


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-09 14:56         ` Ben Gamari
@ 2010-04-11 15:03           ` Avi Kivity
  2010-04-11 16:35             ` Ben Gamari
  2010-04-11 18:16             ` Thomas Gleixner
  0 siblings, 2 replies; 32+ messages in thread
From: Avi Kivity @ 2010-04-11 15:03 UTC (permalink / raw)
  To: Ben Gamari
  Cc: Andi Kleen, Arjan van de Ven, linux-kernel, tytso, npiggin,
	mingo, Ruald Andreae, Jens Axboe, Olly Betts, martin f krafft

On 04/09/2010 05:56 PM, Ben Gamari wrote:
> On Mon, 29 Mar 2010 00:08:58 +0200, Andi Kleen<andi@firstfloor.org>  wrote:
>    
>> Ben Gamari<bgamari.foss@gmail.com>  writes:
>> ext4/XFS/JFS/btrfs should be better in this regard
>>
>>      
> I am using btrfs, so yes, I was expecting things to be better. Unfortunately,
> the improvement seems to be non-existent under high IO/fsync load.
>
>    

btrfs is known to perform poorly under fsync.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-11 15:03           ` Avi Kivity
@ 2010-04-11 16:35             ` Ben Gamari
  2010-04-11 17:20               ` Andi Kleen
  2010-04-11 18:16             ` Thomas Gleixner
  1 sibling, 1 reply; 32+ messages in thread
From: Ben Gamari @ 2010-04-11 16:35 UTC (permalink / raw)
  To: Avi Kivity, linux-btrfs
  Cc: Andi Kleen, Arjan van de Ven, linux-kernel, tytso, npiggin,
	mingo, Ruald Andreae, Jens Axboe, Olly Betts, martin f krafft

On Sun, 11 Apr 2010 18:03:00 +0300, Avi Kivity <avi@redhat.com> wrote:
> On 04/09/2010 05:56 PM, Ben Gamari wrote:
> > On Mon, 29 Mar 2010 00:08:58 +0200, Andi Kleen<andi@firstfloor.org>  wrote:
> >    
> >> Ben Gamari<bgamari.foss@gmail.com>  writes:
> >> ext4/XFS/JFS/btrfs should be better in this regard
> >>
> >>      
> > I am using btrfs, so yes, I was expecting things to be better. Unfortunately,
> > the improvement seems to be non-existent under high IO/fsync load.
> >
> 
> btrfs is known to perform poorly under fsync.
> 
Has the reason for this been identified? Judging from the nature of metadata
loads, it would seem that it should be substantially easier to implement
fsync() efficiently.

- Ben


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-11 16:35             ` Ben Gamari
@ 2010-04-11 17:20               ` Andi Kleen
  0 siblings, 0 replies; 32+ messages in thread
From: Andi Kleen @ 2010-04-11 17:20 UTC (permalink / raw)
  To: Ben Gamari
  Cc: Avi Kivity, linux-btrfs, Andi Kleen, Arjan van de Ven,
	linux-kernel, tytso, npiggin, mingo, Ruald Andreae, Jens Axboe,
	Olly Betts, martin f krafft

> Has the reason for this been identified? Judging from the nature of metadata
> loads, it would seem that it should be substantially easier to implement
> fsync() efficiently.

By design a copy on write tree fs would need to flush a whole
tree hierarchy on a sync. btrfs avoids this by using a special
log for fsync, but that causes more overhead if you have that
log on the same disk.  So IO subsystem will do more work.

It's a bit like JBD data journaling.

However it should not have the stalls inherent in ext3's journaling.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-11 15:03           ` Avi Kivity
  2010-04-11 16:35             ` Ben Gamari
@ 2010-04-11 18:16             ` Thomas Gleixner
  2010-04-11 18:42               ` Andi Kleen
  2010-04-12  0:22               ` Dave Chinner
  1 sibling, 2 replies; 32+ messages in thread
From: Thomas Gleixner @ 2010-04-11 18:16 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Ben Gamari, Andi Kleen, Arjan van de Ven, LKML, tytso, npiggin,
	Ingo Molnar, Ruald Andreae, Jens Axboe, Olly Betts,
	martin f krafft

On Sun, 11 Apr 2010, Avi Kivity wrote:

> On 04/09/2010 05:56 PM, Ben Gamari wrote:
> > On Mon, 29 Mar 2010 00:08:58 +0200, Andi Kleen<andi@firstfloor.org>  wrote:
> >    
> > > Ben Gamari<bgamari.foss@gmail.com>  writes:
> > > ext4/XFS/JFS/btrfs should be better in this regard
> > > 
> > >      
> > I am using btrfs, so yes, I was expecting things to be better.
> > Unfortunately,
> > the improvement seems to be non-existent under high IO/fsync load.
> > 
> >    
> 
> btrfs is known to perform poorly under fsync.

XFS does not do much better. Just moved my VM images back to ext for
that reason.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-11 18:16             ` Thomas Gleixner
@ 2010-04-11 18:42               ` Andi Kleen
  2010-04-11 21:54                 ` Thomas Gleixner
  2010-04-12  0:22               ` Dave Chinner
  1 sibling, 1 reply; 32+ messages in thread
From: Andi Kleen @ 2010-04-11 18:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Avi Kivity, Ben Gamari, Andi Kleen, Arjan van de Ven, LKML,
	tytso, npiggin, Ingo Molnar, Ruald Andreae, Jens Axboe,
	Olly Betts, martin f krafft

> XFS does not do much better. Just moved my VM images back to ext for
> that reason.

Did you move from XFS to ext3?  ext3 defaults to barriers off, XFS on,
which can make a big difference depending on the disk. You can
disable them on XFS too of course, with the known drawbacks.

XFS also typically needs some tuning to get reasonable log sizes.

My point was merely (before people chime in with counter examples) 
that XFS/btrfs/jfs don't suffer from the "need to sync all transactions for
every fsync" issue. There can (and will be) still other issues.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-11 18:42               ` Andi Kleen
@ 2010-04-11 21:54                 ` Thomas Gleixner
  2010-04-11 23:43                   ` Hans-Peter Jansen
  0 siblings, 1 reply; 32+ messages in thread
From: Thomas Gleixner @ 2010-04-11 21:54 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Avi Kivity, Ben Gamari, Arjan van de Ven, LKML, tytso, npiggin,
	Ingo Molnar, Ruald Andreae, Jens Axboe, Olly Betts,
	martin f krafft

On Sun, 11 Apr 2010, Andi Kleen wrote:

> > XFS does not do much better. Just moved my VM images back to ext for
> > that reason.
> 
> Did you move from XFS to ext3?  ext3 defaults to barriers off, XFS on,
> which can make a big difference depending on the disk. You can
> disable them on XFS too of course, with the known drawbacks.
> 
> XFS also typically needs some tuning to get reasonable log sizes.
> 
> My point was merely (before people chime in with counter examples) 
> that XFS/btrfs/jfs don't suffer from the "need to sync all transactions for
> every fsync" issue. There can (and will be) still other issues.

Yes, I moved them back from XFS to ext3 simply because moving them
from ext3 to XFS turned out to be a completely unusable disaster.

I know that I can tweak knobs on XFS (or any other file system), but I
would not have expected that it sucks that much for KVM with the
default settings which are perfectly fine for the other use cases
which made us move to XFS.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-11 21:54                 ` Thomas Gleixner
@ 2010-04-11 23:43                   ` Hans-Peter Jansen
  0 siblings, 0 replies; 32+ messages in thread
From: Hans-Peter Jansen @ 2010-04-11 23:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Thomas Gleixner, Andi Kleen, Avi Kivity, Ben Gamari,
	Arjan van de Ven, tytso, npiggin, Ingo Molnar, Ruald Andreae,
	Jens Axboe, Olly Betts, martin f krafft

On Sunday 11 April 2010, 23:54:34 Thomas Gleixner wrote:
> On Sun, 11 Apr 2010, Andi Kleen wrote:
> > > XFS does not do much better. Just moved my VM images back to ext for
> > > that reason.
> >
> > Did you move from XFS to ext3?  ext3 defaults to barriers off, XFS on,
> > which can make a big difference depending on the disk. You can
> > disable them on XFS too of course, with the known drawbacks.
> >
> > XFS also typically needs some tuning to get reasonable log sizes.
> >
> > My point was merely (before people chime in with counter examples)
> > that XFS/btrfs/jfs don't suffer from the "need to sync all transactions
> > for every fsync" issue. There can (and will be) still other issues.
>
> Yes, I moved them back from XFS to ext3 simply because moving them
> from ext3 to XFS turned out to be a completely unusable disaster.
>
> I know that I can tweak knobs on XFS (or any other file system), but I
> would not have expected that it sucks that much for KVM with the
> default settings which are perfectly fine for the other use cases
> which made us move to XFS.

Thomas, what Andi was merely turning out, is that xfs has a really 
concerning different default: barriers, that hurts with fsync(). 

In order to make a fair comparison of the two, you may want to mount xfs 
with nobarrier or ext3 with barrier option set, and _then_ check which one 
is sucking less.

I guess, that outcome will be interesting for quite a bunch of people in the 
audience (including me¹).

Pete

¹) while in transition of getting rid of even suckier technology junk like
   VMware-Server - but digging out a current², but _stable_ kernel release
   seems harder then ever nowadays. 
²) with operational VT-d support for kvm

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-11 18:16             ` Thomas Gleixner
  2010-04-11 18:42               ` Andi Kleen
@ 2010-04-12  0:22               ` Dave Chinner
  2010-04-14 18:40                 ` Ric Wheeler
  1 sibling, 1 reply; 32+ messages in thread
From: Dave Chinner @ 2010-04-12  0:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Avi Kivity, Ben Gamari, Andi Kleen, Arjan van de Ven, LKML,
	tytso, npiggin, Ingo Molnar, Ruald Andreae, Jens Axboe,
	Olly Betts, martin f krafft

On Sun, Apr 11, 2010 at 08:16:09PM +0200, Thomas Gleixner wrote:
> On Sun, 11 Apr 2010, Avi Kivity wrote:
> > On 04/09/2010 05:56 PM, Ben Gamari wrote:
> > > On Mon, 29 Mar 2010 00:08:58 +0200, Andi Kleen<andi@firstfloor.org>  wrote:
> > > > Ben Gamari<bgamari.foss@gmail.com>  writes:
> > > > ext4/XFS/JFS/btrfs should be better in this regard
> > > > 
> > > I am using btrfs, so yes, I was expecting things to be better.
> > > Unfortunately,
> > > the improvement seems to be non-existent under high IO/fsync load.
> > 
> > btrfs is known to perform poorly under fsync.
> 
> XFS does not do much better. Just moved my VM images back to ext for
> that reason.

Numbers? Workload description? Mount options? I hate it when all I
hear is "XFS sucked, so I went back to extN" reports without any
more details - it's hard to improve anything without any details
of the problems.

Also worth remembering is that XFS defaults to slow-but-safe
options, but ext3 defaults to fast-and-I-don't-give-a-damn-about-
data-safety, so there's a world of difference between the
filesystem defaults....

And FWIW, I run all my VMs on XFS using default mkfs and mount options,
and I can't say that I've noticed any performance problems at all
despite hammering the IO subsystems all the time. The only thing
I've ever done is occasionally run xfs_fsr across permanent qcow2
VM images to defrag them as the grow slowly over time...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-04-12  0:22               ` Dave Chinner
@ 2010-04-14 18:40                 ` Ric Wheeler
  0 siblings, 0 replies; 32+ messages in thread
From: Ric Wheeler @ 2010-04-14 18:40 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Thomas Gleixner, Avi Kivity, Ben Gamari, Andi Kleen,
	Arjan van de Ven, LKML, tytso, npiggin, Ingo Molnar,
	Ruald Andreae, Jens Axboe, Olly Betts, martin f krafft

On 04/11/2010 05:22 PM, Dave Chinner wrote:
> On Sun, Apr 11, 2010 at 08:16:09PM +0200, Thomas Gleixner wrote:
>    
>> On Sun, 11 Apr 2010, Avi Kivity wrote:
>>      
>>> On 04/09/2010 05:56 PM, Ben Gamari wrote:
>>>        
>>>> On Mon, 29 Mar 2010 00:08:58 +0200, Andi Kleen<andi@firstfloor.org>   wrote:
>>>>          
>>>>> Ben Gamari<bgamari.foss@gmail.com>   writes:
>>>>> ext4/XFS/JFS/btrfs should be better in this regard
>>>>>
>>>>>            
>>>> I am using btrfs, so yes, I was expecting things to be better.
>>>> Unfortunately,
>>>> the improvement seems to be non-existent under high IO/fsync load.
>>>>          
>>> btrfs is known to perform poorly under fsync.
>>>        
>> XFS does not do much better. Just moved my VM images back to ext for
>> that reason.
>>      
> Numbers? Workload description? Mount options? I hate it when all I
> hear is "XFS sucked, so I went back to extN" reports without any
> more details - it's hard to improve anything without any details
> of the problems.
>
> Also worth remembering is that XFS defaults to slow-but-safe
> options, but ext3 defaults to fast-and-I-don't-give-a-damn-about-
> data-safety, so there's a world of difference between the
> filesystem defaults....
>
> And FWIW, I run all my VMs on XFS using default mkfs and mount options,
> and I can't say that I've noticed any performance problems at all
> despite hammering the IO subsystems all the time. The only thing
> I've ever done is occasionally run xfs_fsr across permanent qcow2
> VM images to defrag them as the grow slowly over time...
>
> Cheers,
>
> Dave.
>    

And if you are asking for details, the type of storage  you use is also 
quite interesting.

Thanks!

Ric


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-23 13:27 ` Jens Axboe
  2010-03-26  3:35   ` Ben Gamari
@ 2010-03-30 10:46   ` Pawel S
  1 sibling, 0 replies; 32+ messages in thread
From: Pawel S @ 2010-03-30 10:46 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-kernel, Ben Gamari, tytso, Nick Piggin, Ingo Molnar, Andrew Morton

2010/3/23 Jens Axboe <jens.axboe@oracle.com>:

> It's also been my sneaking suspicion that swap is involved. I had lots
> of RAM in anything I use, even the laptop and workstation. I'll try and
> run some tests with lower memory and force it into swap, I've seen nasty
> hangs that way.

I am not sure if the memory swapping is the case. According to KDE
System monitor swap is not even touched when copying files. I also
noticed similar responsiveness problems when extracting some rar files
- twelve parts , each about 100MB and the system becomes unresponsive
for some time since the first seconds of operation, then it behaves
normally and then problem is back. I have 2GB of RAM. I am using ext4
file system and I am using noatime mount option.

P.S. It seems it is a little better when I have AHCI mode set in BIOS
(at least when extracting archives).

P.S.2 I would be glad to provide some useful data. I created perf
chart, but if this is not enough just instruct me what should I do
next, please.

Regards

Pawel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-23 13:27 ` Jens Axboe
@ 2010-03-26  3:35   ` Ben Gamari
  2010-03-30 10:46   ` Pawel S
  1 sibling, 0 replies; 32+ messages in thread
From: Ben Gamari @ 2010-03-26  3:35 UTC (permalink / raw)
  To: Jens Axboe, Pawel S
  Cc: linux-kernel, tytso, Nick Piggin, Ingo Molnar, Andrew Morton

On Tue, 23 Mar 2010 14:27:19 +0100, Jens Axboe <jens.axboe@oracle.com> wrote:
> It's also been my sneaking suspicion that swap is involved. I had lots
> of RAM in anything I use, even the laptop and workstation. I'll try and
> run some tests with lower memory and force it into swap, I've seen nasty
> hangs that way.
> 
Find out anything useful?

- Ben

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
  2010-03-23 11:28 Pawel S
@ 2010-03-23 13:27 ` Jens Axboe
  2010-03-26  3:35   ` Ben Gamari
  2010-03-30 10:46   ` Pawel S
  0 siblings, 2 replies; 32+ messages in thread
From: Jens Axboe @ 2010-03-23 13:27 UTC (permalink / raw)
  To: Pawel S
  Cc: linux-kernel, Ben Gamari, tytso, Nick Piggin, Ingo Molnar, Andrew Morton

On Tue, Mar 23 2010, Pawel S wrote:
> Hello
> 
> I am experiencing very similar issue. My system is a regular desktop
> PC and it suffers from very high I/O latencies (sometimes desktop
> "hangs" for eight seconds or more) when copying large files. I tried
> kernels up to 2.6.34-rc2, but without luck. This issue was raised at
> Phoronix forums and Arjan (from Intel) noticed it can be VM related:
> 
> http://www.phoronix.com/forums/showpost.php?p=114975&postcount=51
> 
> Here is my perf timechart where you can notice I/O "steals" CPU from
> the other tasks:
> 
> http://hotfile.com/dl/30596827/ebe566b/output.svg.gz.html

It's also been my sneaking suspicion that swap is involved. I had lots
of RAM in anything I use, even the laptop and workstation. I'll try and
run some tests with lower memory and force it into swap, I've seen nasty
hangs that way.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Poor interactive performance with I/O loads with fsync()ing
@ 2010-03-23 11:28 Pawel S
  2010-03-23 13:27 ` Jens Axboe
  0 siblings, 1 reply; 32+ messages in thread
From: Pawel S @ 2010-03-23 11:28 UTC (permalink / raw)
  To: linux-kernel, Ben Gamari, tytso, Nick Piggin, Ingo Molnar,
	Andrew Morton, Jens Axboe

Hello

I am experiencing very similar issue. My system is a regular desktop
PC and it suffers from very high I/O latencies (sometimes desktop
"hangs" for eight seconds or more) when copying large files. I tried
kernels up to 2.6.34-rc2, but without luck. This issue was raised at
Phoronix forums and Arjan (from Intel) noticed it can be VM related:

http://www.phoronix.com/forums/showpost.php?p=114975&postcount=51

Here is my perf timechart where you can notice I/O "steals" CPU from
the other tasks:

http://hotfile.com/dl/30596827/ebe566b/output.svg.gz.html

Regards!

P.S. if there is some way I can help more just let me know please.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2010-04-14 18:41 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-16 15:31 Poor interactive performance with I/O loads with fsync()ing Ben Gamari
2010-03-17  1:24 ` tytso
2010-03-17  3:18   ` Ben Gamari
2010-03-17  3:30     ` tytso
2010-03-17  4:31       ` Ben Gamari
2010-03-26  3:16         ` Ben Gamari
2010-03-17  4:53 ` Nick Piggin
2010-03-17  9:37   ` Ingo Molnar
2010-03-26  3:31     ` Ben Gamari
2010-04-09 15:21     ` Ben Gamari
2010-03-26  3:28   ` Ben Gamari
2010-03-23 19:51 ` Jesper Krogh
2010-03-26  3:13 ` Ben Gamari
2010-03-28  1:20 ` Ben Gamari
2010-03-28  1:29   ` Ben Gamari
2010-03-28  3:42   ` Arjan van de Ven
2010-03-28 14:06     ` Ben Gamari
2010-03-28 22:08       ` Andi Kleen
2010-04-09 14:56         ` Ben Gamari
2010-04-11 15:03           ` Avi Kivity
2010-04-11 16:35             ` Ben Gamari
2010-04-11 17:20               ` Andi Kleen
2010-04-11 18:16             ` Thomas Gleixner
2010-04-11 18:42               ` Andi Kleen
2010-04-11 21:54                 ` Thomas Gleixner
2010-04-11 23:43                   ` Hans-Peter Jansen
2010-04-12  0:22               ` Dave Chinner
2010-04-14 18:40                 ` Ric Wheeler
2010-03-23 11:28 Pawel S
2010-03-23 13:27 ` Jens Axboe
2010-03-26  3:35   ` Ben Gamari
2010-03-30 10:46   ` Pawel S

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).