linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Mingming Cao <cmm@us.ibm.com>,
	akpm@linux-foundation.org, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] jbd_commit_transaction() races with journal_try_to_drop_buffers() causing DIO failures
Date: Mon, 5 May 2008 19:06:36 +0200	[thread overview]
Message-ID: <20080505170636.GK25722@duck.suse.cz> (raw)
In-Reply-To: <1209654981.27240.19.camel@badari-desktop>

On Thu 01-05-08 08:16:21, Badari Pulavarty wrote:
> Hi Andrew & Jan,
> 
> I was able to reproduce the customer problem involving DIO
> (invalidate_inode_pages2) problem by writing simple testcase
> to keep writing to a file using buffered writes and DIO writes
> forever in a loop. I see DIO writes fail with -EIO.
> 
> After a long debug, found 2 cases how this could happen.
> These are race conditions with journal_try_to_free_buffers()
> and journal_commit_transaction().
> 
> 1) journal_submit_data_buffers() tries to get bh_state lock. If
> try lock fails, it drops the j_list_lock and sleeps for
> bh_state lock, while holding a reference on the buffer.
> In the meanwhile, journal_try_to_free_buffers() can clean up the
> journal head could call try_to_free_buffers(). try_to_free_buffers()
> would fail due to the reference held by journal_submit_data_buffers()
> - which in turn causes failues for DIO (invalidate_inode_pages2()).
> 
> 2) When the buffer is on t_locked_list waiting for IO to finish,
> we hold a reference and give up the cpu, if we can't get
> bh_state lock. This causes try_to_free_buffers() to fail.
> 
> Fix is to drop the reference on the buffer if we can't get
> bh_state lock, give up the cpu and re-try the whole operation -
> instead of waiting for the vh_state lock.
> 
> Does this look like a resonable fix ?
  As Mingming pointed out there are few other places where we could hold
the bh reference. Note also that we accumulate references to buffers in the
wbuf[] list and we need that for submit_bh() which consumes one bh
reference. Generally, it seems to me as a too fragile and impractical
rule "nobody can hold bh reference when not holding page lock" which is
basically what it comes down to if you really want to be sure that
journal_try_to_free_buffers() succeeds. And also note that in principle
there are other places which hold references to buffers without holding the
page lock - for example writepage() in ordered mode (although this one is
in practice hardly triggerable). So how we could fix at least the races
with commit code is to implement launder_page() callback for ext3/4 which
would wait for the previous transaction commit in case the page has buffers
that are part of that commit (I don't want this logic in
journal_try_to_free_buffers() as that is called also on memory-reclaim
path, but journal_launder_page() is fine with me). This would be correct
but could considerably slow down O_DIRECT writes in cases they're mixed
with buffered writes so I'm not sure if this is acceptable.
  OTOH with the ordered mode rewrite patch, the problem with commit code
also goes away since there we don't need extra references to data buffers
(we use just filemap_fdatawrite).

> 1) journal_submit_data_buffers() tries to get bh_state lock. If
> try lock fails, it drops the j_list_lock and sleeps for
> bh_state lock, while holding a reference on the buffer head.
> In the meanwhile, journal_try_to_free_buffers() can clean up the
> journal head could call try_to_free_buffers(). try_to_free_buffers()
> would fail due to the reference held by journal_submit_data_buffers()
> - which inturn causes failues for DIO (invalidate_inode_pages2()).
> 
> 2) When the buffer is on t_locked_list waiting for IO to finish,
> we hold a reference and give up the cpu, if we can't get 
> bh_state lock. This causes try_to_free_buffers() to fail.
> 
> Fix is to drop the reference on the buffer, give up the cpu
> and re-try the whole operation.
> 
> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
> Reviewed-by: Mingming Cao <mcao@us.ibm.com>

							Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  parent reply	other threads:[~2008-05-05 17:06 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-06 17:42 [RFC] JBD ordered mode rewrite Jan Kara
2008-03-06 19:05 ` Josef Bacik
2008-03-10 16:30   ` Jan Kara
2008-03-06 23:53 ` Andrew Morton
2008-03-10 17:38   ` Jan Kara
2008-03-07  1:34 ` Mark Fasheh
2008-03-10 18:00   ` Jan Kara
2008-03-07 10:55 ` Mingming Cao
2008-03-10 18:29   ` Jan Kara
2008-03-07 23:52 ` Andreas Dilger
2008-03-08  0:08   ` Mingming Cao
2008-03-08 12:14   ` Christoph Hellwig
2008-03-10 19:54   ` Jan Kara
2008-03-10 21:37     ` Andreas Dilger
2008-04-25 23:38 ` Possible race between direct IO and JBD? Mingming Cao
2008-04-26 10:41   ` Andrew Morton
2008-04-28 12:26   ` Jan Kara
2008-04-28 17:11     ` Badari Pulavarty
2008-04-28 18:09       ` Jan Kara
2008-04-28 19:09         ` Mingming Cao
2008-04-29 12:43           ` Jan Kara
2008-04-29 17:49             ` Mingming Cao
2008-05-01 15:16             ` [PATCH] jbd_commit_transaction() races with journal_try_to_drop_buffers() causing DIO failures Badari Pulavarty
2008-05-01 22:08               ` Mingming Cao
2008-05-05 17:06               ` Jan Kara [this message]
2008-05-05 17:53                 ` Mingming Cao
2008-05-06  0:10                 ` Badari Pulavarty
2008-05-09 22:27                 ` Mingming Cao
2008-05-12 15:54                   ` Jan Kara
2008-05-12 19:23                     ` Mingming Cao
2008-05-13 14:20                       ` Jan Kara
2008-05-13  0:39                     ` Mingming Cao
2008-05-13 14:54                       ` Jan Kara
2008-05-13 16:37                         ` Mingming Cao
2008-05-13 22:23                         ` Mingming Cao
2008-05-14 17:08                           ` Jan Kara
2008-05-14 17:41                             ` Mingming Cao
2008-05-14 18:14                               ` Jan Kara
2008-05-16 14:13                                 ` Mingming Cao
2008-05-16 14:14                                 ` [PATCH] Fix DIO EIO error caused by race between jbd_commit_transaction() and journal_try_to_drop_buffers() Mingming Cao
2008-05-16 15:01                                   ` Josef Bacik
2008-05-16 17:11                                     ` Mingming Cao
2008-05-16 17:17                                       ` Badari Pulavarty
2008-05-16 17:30                                         ` Mingming Cao
2008-05-16 17:12                                   ` Badari Pulavarty
2008-05-16 21:01                                     ` [PATCH] JBD: Fix DIO EIO error caused by race between free buffer and commit trasanction Mingming Cao
2008-05-18 22:37                                       ` Jan Kara
2008-05-19 19:59                                         ` Mingming Cao
2008-05-19 20:25                                           ` Andrew Morton
2008-05-19 22:07                                             ` Mingming Cao
2008-05-20  9:30                                               ` Jens Axboe
2008-05-20 17:47                                                 ` Mingming Cao
2008-05-20 18:02                                               ` [PATCH-v2] JBD: Fix " Mingming Cao
2008-05-20 23:53                                                 ` Jan Kara
2008-05-21 17:14                                                   ` Mingming
2008-05-24 22:44                                                     ` Jan Kara
2008-05-28 18:18                                                       ` Mingming Cao
2008-05-28 18:55                                                         ` Jan Kara
2008-05-29  0:15                                                           ` Mingming Cao
2008-05-29  0:16                                                           ` [PATCH][take 5] " Mingming Cao
2008-05-29  0:18                                                             ` [PATCH][take 5] JBD2: " Mingming Cao
2008-05-30  6:24                                                               ` Aneesh Kumar K.V
2008-05-30 15:17                                                                 ` Mingming Cao
2008-05-21 23:38                                                 ` [PATCH 1/2][TAKE3] JBD: " Mingming
2008-05-22  5:57                                                   ` Andrew Morton
2008-05-21 23:39                                                 ` [PATCH 2/2][TAKE3] JBD2: " Mingming
2008-05-20 18:03                                               ` [PATCH -v2] JBD2: Fix race between journal " Mingming Cao
2008-05-16 21:01                                     ` [PATCH] JBD2: Fix DIO EIO error caused by race between " Mingming Cao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080505170636.GK25722@duck.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=cmm@us.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbadari@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).