From: Omar Sandoval <osandov@osandov.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	David Sterba <dsterba@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
Date: Thu, 18 Dec 2014 22:24:05 -0800
Message-ID: <20141219062405.GA11486@mew>
In-Reply-To: <20141217220313.GK22149@ZenIV.linux.org.uk>
On Wed, Dec 17, 2014 at 10:03:13PM +0000, Al Viro wrote:
> On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote:
> > On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote:
> > > See my previous message. If we use O_DIRECT on the original open, then
> > > filesystems that implement bmap but not direct_IO will no longer work.
> > > These are the ones that I found in my tree:
> >
> > In the long run I don't think they are worth keeping. But to keep you
> > out of that discussion you can just try an open without O_DIRECT if the
> > open with the flag failed.
>
> Umm... That's one possibility, of course (and if swapon(2) is on someone's
> hotpath, I really would like to see what the hell they are doing - it has
> to be interesting in a sick way).
If this is the approach you'd prefer, I'll go ahead and do that for v2.
I personally think it looks pretty kludgey, but I'm fine either way:
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 63f55cc..c1b3073 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2379,7 +2379,16 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		name = NULL;
 		goto bad_swap;
 	}
-	swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
+	swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0);
+	if (IS_ERR(swap_file) && PTR_ERR(swap_file) == -EINVAL)
+		swap_file = file_open_name(name, O_RDWR | O_LARGEFILE, 0);
 	if (IS_ERR(swap_file)) {
 		error = PTR_ERR(swap_file);
 		swap_file = NULL;
> BTW, speaking of read/write vs. swap - what's the story with e.g. AFS
> write() checking IS_SWAPFILE() and failing with -EBUSY? Note that
> * it's done before acquiring i_mutex, so it isn't race-free
> * it's dubious from the POSIX POV - EBUSY isn't in the error
> list for write(2).
> * other filesystems generally don't have anything of that sort.
> NFS does, but local ones do not...
> Besides, do we even allow swapfiles on AFS?
AFS doesn't implement ->bmap or ->swap_activate, so that code is dead,
probably cargo-culted from the NFS code. It seems pretty pointless, not
only because it's inconsistent with the local filesystems like you
mentioned, but also because it's trivial to bypass with O_DIRECT on NFS:
ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
{
	struct file *file = iocb->ki_filp;
	struct inode *inode = file_inode(file);
	unsigned long written = 0;
	ssize_t result;
	size_t count = iov_iter_count(from);
	loff_t pos = iocb->ki_pos;

	result = nfs_key_timeout_notify(file, inode);
	if (result)
		return result;

	if (file->f_flags & O_DIRECT)
		return nfs_file_direct_write(iocb, from, pos);

	dprintk("NFS: write(%pD2, %zu@%Ld)\n",
		file, count, (long long) pos);

	result = -EBUSY;
	if (IS_SWAPFILE(inode))
		goto out_swapfile;
I think it's safe to scrap that code. However, this also led me to find that
NFS doesn't prevent truncates on an active swapfile. I'm submitting a patch for
that now.
--
Omar