From: Jan Kara <jack@suse.cz>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Mel Gorman <mgorman@suse.de>,
	linux-ext4@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Jiri Slaby <jslaby@suse.cz>
Subject: Re: Excessive stall times on ext4 in 3.9-rc2
Date: Thu, 11 Apr 2013 23:33:35 +0200
Message-ID: <20130411213335.GE9379@quack.suse.cz>
In-Reply-To: <20130411183512.GA12298@thunk.org>

On Thu 11-04-13 14:35:12, Ted Tso wrote:
> On Thu, Apr 11, 2013 at 06:04:02PM +0100, Mel Gorman wrote:
> > > If we're stalling on lock_buffer(), that implies that buffer was being
> > > written, and for some reason it was taking a very long time to
> > > complete.
> > > 
> > 
> > Yes.
> > 
> > > It might be worthwhile to put a timestamp in struct dm_crypt_io, and
> > > record the time when a particular I/O encryption/decryption is getting
> > > queued to the kcryptd workqueues, and when they finally squirt out.
> > > 
> > 
> > That somewhat assumes that dm_crypt was at fault which is not unreasonable
> > but I was skeptical as the workload on dm_crypt was opening a maildir
> > and mostly reads.
> 
> Hmm... well, I've reviewed all of the places in the ext4 and jbd2
> layer where we call lock_buffer(), and with one exception[1] we're not
> holding the bh locked any longer than necessary.  There are a few
> places where we grab a spinlock or two before we can do what we need
> to do and then release the lock'ed buffer head, but the only time we
> hold the bh locked for long periods of time is when we submit metadata
> blocks for I/O.
> 
> [1] There is one exception in ext4_xattr_release_block() where I
> believe we should move the call to unlock_buffer(bh) before the call
> to ext4_free_blocks(), since we've already elevated the bh count and
> ext4_free_blocks() does not need to have the bh locked.  It's not
> related to any of the stalls you've reported, though, as near as I can
> tell (none of the stack traces include the ext4 xattr code, and this
> would only affect external extended attribute blocks).
> 
> 
> Could you write code which checks the hold time of lock_buffer(), measuring
> from when the lock is successfully grabbed, to see whether I
> missed some code path in ext4 or jbd2 where the bh is locked and then
> there is some call to some function which needs to block for some
> random reason?  What I'd suggest is putting a timestamp in the buffer_head
> structure, which is set by lock_buffer() once it has successfully grabbed
> the lock, and then in unlock_buffer(), if it was held for more than a
> second or some such, dumping out the stack trace.
> 
> Because at this point, either I'm missing something or I'm beginning
> to suspect that your hard drive (or maybe something in the block layer?)
> is simply taking a long time to service an I/O request.  Putting in
> this check should be able to very quickly determine what code path
> and/or which subsystem we should be focused upon.
  I think it might be more enlightening if Mel traced which process in
which function is holding the buffer lock. I suspect we'll find out that
the flusher thread has submitted the buffer for IO as an async write and
thus it takes a long time to complete in the presence of reads which have
higher priority.
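
A minimal userspace sketch of the instrumentation discussed above may make
the idea concrete: record a timestamp and the holder when the lock is taken,
then report the holder at unlock time if the hold exceeded a threshold. This
is an illustration only, not a kernel patch. struct traced_buf,
traced_lock() and traced_unlock() are hypothetical names standing in for
struct buffer_head, lock_buffer() and unlock_buffer(); a real kernel version
would presumably compare jiffies and call dump_stack() rather than use
clock_gettime() and fprintf().

```c
/* Userspace illustration only. struct traced_buf and the traced_*()
 * helpers are hypothetical stand-ins, not the kernel's buffer_head API. */
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

struct traced_buf {
	int locked;                 /* stand-in for the buffer lock bit */
	struct timespec locked_at;  /* set once the lock has been acquired */
	const char *holder_func;    /* which function took the lock */
};

/* Seconds elapsed from *a to *b. */
static double elapsed_sec(const struct timespec *a, const struct timespec *b)
{
	return (double)(b->tv_sec - a->tv_sec) +
	       (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Take the lock and record who took it and when.
 * This sketch does not block; it only models the bookkeeping. */
static void traced_lock(struct traced_buf *b, const char *func)
{
	assert(!b->locked);
	b->locked = 1;
	b->holder_func = func;
	clock_gettime(CLOCK_MONOTONIC, &b->locked_at);
}

/* Release the lock. Returns 1 and reports the holder if the lock was
 * held for longer than threshold_sec, otherwise returns 0. */
static int traced_unlock(struct traced_buf *b, double threshold_sec)
{
	struct timespec now;
	double held;

	assert(b->locked);
	clock_gettime(CLOCK_MONOTONIC, &now);
	held = elapsed_sec(&b->locked_at, &now);
	b->locked = 0;

	if (held > threshold_sec) {
		fprintf(stderr, "buffer held %.3fs, locked in %s\n",
			held, b->holder_func);
		return 1;
	}
	return 0;
}
```

A caller would pass __func__ to traced_lock() and a threshold such as 1.0 to
traced_unlock(). Recording the holder alongside the timestamp covers both
suggestions: how long the lock was held, and which function was holding it.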

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

