From: Brian Foster <bfoster@redhat.com>
To: Avi Kivity <avi@scylladb.com>
Cc: Glauber Costa <glauber@scylladb.com>, xfs@oss.sgi.com
Subject: Re: sleeps and waits during io_submit
Date: Tue, 1 Dec 2015 08:11:16 -0500
Message-ID: <20151201131114.GA26129@bfoster.bfoster>
In-Reply-To: <565D639F.8070403@scylladb.com>

On Tue, Dec 01, 2015 at 11:08:47AM +0200, Avi Kivity wrote:
> On 11/30/2015 06:14 PM, Brian Foster wrote:
> >On Mon, Nov 30, 2015 at 04:29:13PM +0200, Avi Kivity wrote:
> >>
> >>On 11/30/2015 04:10 PM, Brian Foster wrote:
...
> >The agsize/agcount mkfs-time heuristics change depending on the type of
> >storage. A single AG can be up to 1TB and if the fs is not considered
> >"multidisk" (e.g., no stripe unit/width is defined), 4 AGs is the
> >default up to 4TB. If a stripe unit is set, the agsize/agcount is
> >adjusted depending on the size of the overall volume (see
> >xfsprogs-dev/mkfs/xfs_mkfs.c:calc_default_ag_geometry() for details).
> 
> We'll experiment with this.  Surely it depends on more than the amount of
> storage?  If you have a high op rate you'll be more likely to excite
> contention, no?
> 

Sure. The absolute optimal configuration for your workload probably
depends on more than storage size, but mkfs doesn't have that
information. In general, it tries to use the most reasonable
configuration based on the storage and expected workload. If you want to
tweak it beyond that, indeed, the best bet is to experiment with what
works.

> >
> >>Are those locks held around I/O, or just CPU operations, or a mix?
> >I believe it's a mix of modifications and I/O, though it looks like some
> >of the I/O cases don't necessarily wait on the lock. E.g., the AIL
> >pushing case will trylock and defer to the next list iteration if the
> >buffer is busy.
> >
> 
> Ok.  For us sleeping in io_submit() is death because we have no other thread
> on that core to take its place.
> 

The above is with regard to metadata I/O, whereas io_submit() is
obviously for user I/O. io_submit() can probably block in a variety of
places afaict... it might have to read in the inode extent map, allocate
blocks, take inode/ag locks, reserve log space for transactions, etc.

It sounds to me like, first and foremost, you want to make sure that the
parallel operations you typically have running don't contend on the same
inodes or AGs. Hint: creating files under separate subdirectories is a
quick and easy way to allocate inodes under separate AGs (the agno is
encoded into the upper bits of the inode number). Reducing the frequency
of block allocations/frees might also help (e.g., preallocate and reuse
files, 'mount -o ikeep', etc.). Beyond that, you probably want to make
sure the log is large enough to support all concurrent operations. See
the xfs_log_grant_* tracepoints for a window into if/how long
transaction reservations might be waiting on the log.
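
To make the agno hint above concrete, here is a minimal Python sketch of
how one might check which AG a file's inode landed in. It assumes the
usual inode number layout, where the AG number sits above
ceil(log2(AG size in blocks)) + log2(inodes per block) low bits. The
function names are made up for illustration, and the geometry values
(agsize in blocks, bsize, isize) are assumed to be read by hand from
xfs_info output for the filesystem in question.

```python
import os


def ceil_log2(n: int) -> int:
    """Smallest k such that 2**k >= n, for n >= 1."""
    return (n - 1).bit_length()


def ino_to_agno(ino: int, agblocks: int, blocksize: int, inodesize: int) -> int:
    """Extract the AG number from an XFS inode number.

    Assumption: the low bits hold the inode's offset within the block and
    the block's offset within the AG; everything above that is the agno.
    """
    inopblog = ceil_log2(blocksize // inodesize)  # log2(inodes per block)
    agblklog = ceil_log2(agblocks)                # log2(AG size), rounded up
    return ino >> (agblklog + inopblog)


def agno_of_path(path: str, agblocks: int, blocksize: int, inodesize: int) -> int:
    """Hypothetical helper: AG number of an existing file, via st_ino."""
    return ino_to_agno(os.stat(path).st_ino, agblocks, blocksize, inodesize)
```

Running agno_of_path() on files created under different subdirectories
should then show whether they actually ended up in separate AGs for a
given geometry.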

Brian

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs