linux-kernel.vger.kernel.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: cmm@us.ibm.com
Cc: jack@suse.cz, pbadari@us.ibm.com, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jens Axboe <jens.axboe@oracle.com>
Subject: Re: [PATCH] JBD: Fix DIO EIO error caused by race between free buffer and commit trasanction
Date: Mon, 19 May 2008 13:25:53 -0700
Message-ID: <20080519132553.de9b78b0.akpm@linux-foundation.org>
In-Reply-To: <1211227158.3663.25.camel@localhost.localdomain>

On Mon, 19 May 2008 12:59:18 -0700
Mingming Cao <cmm@us.ibm.com> wrote:

> On Mon, 2008-05-19 at 00:37 +0200, Jan Kara wrote:
> >   Hi,
> > 
> > > This patch fixes a few races between direct IO and the kjournald commit
> > > transaction.  An unexpected EIO error is returned to the direct IO
> > > caller when it fails to free those data buffers. This can be
> > > reproduced easily with parallel direct writes and buffered writes to the
> > > same file.
> > > 
> > > More specifically, those races can cause journal_try_to_free_buffers()
> > > to fail to free the data buffers while jbd is committing the transaction
> > > that has those data buffers on its t_syncdata_list or t_locked_list.
> > > journal_commit_transaction() still holds a reference to those
> > > buffers until the data reaches disk and the buffers are removed from the
> > > t_syncdata_list or t_locked_list. This prevents a concurrent
> > > journal_try_to_free_buffers() from freeing those buffers at the same time,
> > > and causes an EIO error to be returned to direct IO.
> > > 
> > > With this patch, in the direct IO case, when try_to_free_buffers() fails,
> > > we wait for journal_commit_transaction() to finish
> > > flushing the currently committing transaction's data buffers to disk,
> > > then try to free those buffers again.
> >   If Andrew or Christoph wouldn't beat you for "inventive use" of
> > gfp_mask, I'm fine with the patch as well ;). You can add
> >   Acked-by: Jan Kara <jack@suse.cz>
> > 
> 
> This is a less intrusive way to fix this problem. The gfp_mask was marked
> as unused in try_to_free_page(). I looked at the filesystems in the kernel;
> only a few define a releasepage() callback, and only xfs checks
> the flag (but does not use it). btrfs is actually using it, though. I thought
> about the approach you suggested, i.e. cleaning up this gfp_mask and
> replacing it with a flag.  I am not entirely sure whether we need to change the
> address_space_operations and fix all the filesystems for this.
> 
> Andrew, what do you think? Is this approach acceptable? 
> 

<wakes up>

Please ensure that the final patch is sufficiently well changelogged to
permit me to remain asleep ;)

The ->releasepage semantics are fairly ad-hoc and have grown over time.
It'd be nice to prevent them from becoming vaguer than they are.

It has been (approximately?) the case that code paths which really care
about having the page released will set __GFP_WAIT (via GFP_KERNEL)
whereas code paths which are happy with best-effort will clear
__GFP_WAIT (with a "0").  And that's reasonable - __GFP_WAIT here
means "be synchronous" whereas !__GFP_WAIT means "be non-blocking".
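
A minimal userspace sketch of that convention, not kernel code. The
SKETCH_GFP_WAIT value, struct sketch_page and every sketch_* function are
hypothetical names made up for illustration; the real ->releasepage
callbacks and the jbd internals look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's __GFP_WAIT bit; the value is arbitrary. */
#define SKETCH_GFP_WAIT 0x10u

/* Toy model of a page whose buffers may be pinned by a committing transaction. */
struct sketch_page {
	bool busy;	/* true while the commit still holds the buffers */
	int waits;	/* how many times the releaser had to block */
};

/* One non-blocking attempt: succeeds only if nothing pins the buffers. */
static bool sketch_try_free(const struct sketch_page *page)
{
	return !page->busy;
}

/* Models blocking until the commit has flushed its data buffers. */
static void sketch_wait_for_commit(struct sketch_page *page)
{
	page->waits++;
	page->busy = false;
}

/*
 * The convention: SKETCH_GFP_WAIT set means "be synchronous" (wait, then
 * retry); SKETCH_GFP_WAIT clear means "be non-blocking" (best effort only).
 */
static bool sketch_releasepage(struct sketch_page *page, unsigned int gfp_mask)
{
	assert(page != NULL);

	if (sketch_try_free(page))
		return true;
	if (!(gfp_mask & SKETCH_GFP_WAIT))
		return false;	/* best-effort caller: give up at once */

	sketch_wait_for_commit(page);
	return sketch_try_free(page);
}
```

With a mask of 0 the call fails immediately while the commit still pins the
buffers; with SKETCH_GFP_WAIT set it blocks once and then succeeds.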

Is that old convention not sufficient here as well?  Two problem areas
I see are mm/vmscan.c and fs/splice.c (there may be others).

In mm/vmscan.c we probably don't want your new synchronous behaviour
and it might well be deadlockable anyway.  No probs, that's what
__GFP_FS is for.
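
One way to read that, sketched in userspace C under stated assumptions:
blocking is allowed only when the caller passes both bits, so a reclaim
path that clears the FS bit stays non-blocking. The flag values and
sketch_may_wait_for_commit() are hypothetical; this is not the patch's code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for __GFP_WAIT and __GFP_FS; values are arbitrary. */
#define SKETCH_GFP_WAIT 0x10u
#define SKETCH_GFP_FS   0x80u

/*
 * Returns true only if the caller may both sleep (WAIT) and re-enter
 * filesystem code (FS). A caller that clears either bit gets the
 * best-effort, non-blocking behaviour.
 */
static bool sketch_may_wait_for_commit(unsigned int gfp_mask)
{
	const unsigned int need = SKETCH_GFP_WAIT | SKETCH_GFP_FS;

	return (gfp_mask & need) == need;
}
```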

In fs/splice.c, reading the comment there I have a feeling that you've
found another bug, and that splice _does_ want your new synchronous
behaviour?
