From: Matthew Bobrowski <mbobrowski@mbobrowski.org>
To: Ritesh Harjani <riteshh@linux.ibm.com>
Cc: tytso@mit.edu, jack@suse.cz, adilger.kernel@dilger.ca,
	hch@infradead.org, darrick.wong@oracle.com,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	david@fromorbit.com
Subject: Re: [PATCH v2 5/6] ext4: introduce direct IO write path using iomap infrastructure
Date: Tue, 10 Sep 2019 21:20:14 +1000
Message-ID: <20190910112014.GB10579@bobrowski>
In-Reply-To: <20190909143219.B549042049@d06av24.portsmouth.uk.ibm.com>

On Mon, Sep 09, 2019 at 08:02:13PM +0530, Ritesh Harjani wrote:
> On 9/9/19 2:56 PM, Ritesh Harjani wrote:
> > > +    ret = iomap_dio_rw(iocb, from, &ext4_iomap_ops, ext4_dio_write_end_io);
> > > +
> > > +    /*
> > > +     * Unaligned direct AIO must be the only IO in flight or else
> > > +     * any overlapping aligned IO after unaligned IO might result
> > > +     * in data corruption. We also need to wait here in the case
> > > +     * where the inode is being extended so that inode extension
> > > +     * routines in ext4_dio_write_end_io() are covered by the
> > > +     * inode_lock().
> > > +     */
> > > +    if (ret == -EIOCBQUEUED && (unaligned_aio || extend))
> > > +        inode_dio_wait(inode);
> 
> So, I looked more closely at your AIO DIO write extend case here.
> AFAICT, this looks good in the sense that it follows the behavior we
> used to have with __blockdev_direct_IO, which waited for AIO DIO
> writes beyond EOF; the iomap framework does not do that. So waiting
> in the case of writes beyond EOF should be the right thing to do here
> for ext4 (following the legacy code).
> 
> But I would like to confirm the exact race that the extend case
> is protecting against here.
> Writes beyond EOF require an update of the inode's i_size
> (ext4_update_inode_size()), which requires us to hold the inode_lock
> in exclusive mode, so we need to wait in the extend case here,
> even for AIO DIO writes.
> 
> Q1. Is the above understanding completely correct?

Yes, that's essentially correct.
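
To make the condition concrete, here is a minimal userspace sketch of the
wait decision. It is not the kernel code: dio_write_extends() and
dio_write_must_wait() are hypothetical helpers, EIOCBQUEUED is defined
locally as a stand-in for the kernel-internal errno, and the sketch assumes
"extend" simply means that the write ends past the current i_size.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for the kernel-internal errno meaning "iocb queued, completion
 * is signalled later". Defined locally so the sketch builds in userspace.
 */
#define EIOCBQUEUED 529

/*
 * Assumption: a write "extends" the file when it ends past the current
 * i_size. The real patch may compare against a different size field.
 */
static bool dio_write_extends(long long pos, long long count, long long i_size)
{
	assert(pos >= 0 && count >= 0 && i_size >= 0);
	return pos + count > i_size;
}

/*
 * Mirrors the condition in the quoted hunk: the submitter only needs to
 * wait when the IO was queued asynchronously and is either unaligned or
 * extending, so that completion-side size updates finish while the
 * submitter still holds the inode lock.
 */
static bool dio_write_must_wait(long ret, bool unaligned_aio, bool extend)
{
	return ret == -EIOCBQUEUED && (unaligned_aio || extend);
}
```

Under these assumptions, an aligned 4096-byte AIO write at offset 4096 into
a 4096-byte file counts as extending, so a queued submission would wait;
the same write at offset 0 would return to the caller without waiting.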

> Q2. Or is it also protecting anything else that I am missing?

No, I think that's it...

> Do we need to hold the inode lock exclusively for ext4_orphan_del() as well?

Yes, I believe so.

> Q3. How about XFS then?
> (I do see some tricks done with IOLOCK handling in the case of
> ki_pos > i_size, and to zero out the buffer space between the old
> i_size and ki_pos.)
> 
> But if we talk only about the above case of extending AIO DIO writes beyond
> EOF, XFS only takes a shared lock. Why?
> 
> Looking into the XFS code, I see that it has both IOLOCK and ILOCK.
> So I guess that to protect the inode->i_size update it uses ILOCK (in
> xfs_dio_write_end_io() -> xfs_iomap_write_unwritten(), or the
> ip->i_flags_lock lock in the non-unwritten case), while the IOLOCK covers
> the IO itself and hence can be taken in shared mode. Is this a correct
> understanding of XFS?

* David/Christoph/Darrick

To be perfectly honest, I haven't looked at the intricate XFS locking
semantics, so I can't really comment until I've read the code. One of the
XFS maintainers should be able to give you the answer you're looking for
on this.

--<M>--
