linux-kernel.vger.kernel.org archive mirror
From: Omar Sandoval <osandov@osandov.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	David Sterba <dsterba@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
Date: Thu, 18 Dec 2014 22:24:05 -0800
Message-ID: <20141219062405.GA11486@mew>
In-Reply-To: <20141217220313.GK22149@ZenIV.linux.org.uk>

On Wed, Dec 17, 2014 at 10:03:13PM +0000, Al Viro wrote:
> On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote:
> > On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote:
> > > See my previous message. If we use O_DIRECT on the original open, then
> > > filesystems that implement bmap but not direct_IO will no longer work.
> > > These are the ones that I found in my tree:
> > 
> > In the long run I don't think they are worth keeping.  But to keep you
> > out of that discussion you can just try an open without O_DIRECT if the
> > open with the flag failed.
> 
> Umm...  That's one possibility, of course (and if swapon(2) is on someone's
> hotpath, I really would like to see what the hell they are doing - it has
> to be interesting in a sick way).

If this is the approach you'd prefer, I'll go ahead and do that for v2.
I personally think it looks pretty kludgey, but I'm fine either way:

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 63f55cc..c1b3073 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2379,7 +2379,16 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
                name = NULL;
                goto bad_swap;
        }
-       swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
+       swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0);
+       if (IS_ERR(swap_file) && PTR_ERR(swap_file) == -EINVAL)
+               swap_file = file_open_name(name, O_RDWR | O_LARGEFILE, 0);
        if (IS_ERR(swap_file)) {
                error = PTR_ERR(swap_file);
                swap_file = NULL;

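For anyone who wants to poke at the fallback outside the kernel, here is a
rough userspace analogue of the same pattern. It is only a sketch:
open_prefer_direct() is a made-up helper name, not an existing kernel or libc
interface, and it assumes (as the hunk above does) that a refused O_DIRECT
open is reported as EINVAL.

```c
#define _GNU_SOURCE		/* exposes O_DIRECT on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0		/* assumption: degrade to a no-op where undefined */
#endif

/*
 * Hypothetical helper: try the open with O_DIRECT first and retry without
 * it only if the first attempt failed with EINVAL. Any other error is
 * passed back unchanged. *used_direct (optional) reports which open won.
 */
static int open_prefer_direct(const char *path, int flags, int *used_direct)
{
	int fd;

	if (used_direct)
		*used_direct = 0;

	fd = open(path, flags | O_DIRECT);
	if (fd >= 0) {
		if (used_direct)
			*used_direct = 1;
		return fd;
	}

	/* Only EINVAL triggers the retry, mirroring the -EINVAL check. */
	if (errno != EINVAL)
		return -1;

	return open(path, flags);
}
```

A caller would use it like open(2), e.g.
open_prefer_direct(path, O_RDWR | O_LARGEFILE, &direct), and check for a
negative return. Retrying only on EINVAL keeps real failures (ENOENT, EACCES,
and so on) from being masked by the second open.
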
> BTW, speaking of read/write vs. swap - what's the story with e.g. AFS
> write() checking IS_SWAPFILE() and failing with -EBUSY?  Note that
> 	* it's done before acquiring i_mutex, so it isn't race-free
> 	* it's dubious from the POSIX POV - EBUSY isn't in the error
> list for write(2).
> 	* other filesystems generally don't have anything of that sort.
> NFS does, but local ones do not...
> Besides, do we even allow swapfiles on AFS?

AFS doesn't implement ->bmap or ->swap_activate, so that code is dead,
probably cargo-culted from the NFS code. It seems pretty pointless, not
only because it's inconsistent with the local filesystems like you
mentioned, but also because it's trivial to bypass with O_DIRECT on NFS:

ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
{
	struct file *file = iocb->ki_filp;
	struct inode *inode = file_inode(file);
	unsigned long written = 0;
	ssize_t result;
	size_t count = iov_iter_count(from);
	loff_t pos = iocb->ki_pos;

	result = nfs_key_timeout_notify(file, inode);
	if (result)
		return result;

	if (file->f_flags & O_DIRECT)
		return nfs_file_direct_write(iocb, from, pos);

	dprintk("NFS: write(%pD2, %zu@%Ld)\n",
		file, count, (long long) pos);

	result = -EBUSY;
	if (IS_SWAPFILE(inode))
		goto out_swapfile;

I think it's safe to scrap that code. However, this also led me to find that
NFS doesn't prevent truncates on an active swapfile. I'm submitting a patch for
that now.
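
To spell out the ordering problem in the nfs_file_write() excerpt above, here
is a toy userspace model of the control flow. Everything in it is
hypothetical (struct toy_file and toy_write() are stand-ins, not kernel
types or functions); it only mirrors the order of the two checks.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the excerpt looks at. */
struct toy_file {
	bool o_direct;	/* models file->f_flags & O_DIRECT */
	bool swapfile;	/* models IS_SWAPFILE(inode) */
};

/* Returns 0 if the write would go ahead, -EBUSY if it is refused. */
static int toy_write(const struct toy_file *f)
{
	/* The direct path returns before the swapfile check is reached. */
	if (f->o_direct)
		return 0;

	if (f->swapfile)
		return -EBUSY;

	return 0;
}
```

With swapfile set, toy_write() returns -EBUSY for a buffered write but 0 once
o_direct is also set, which is the bypass described above.
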

-- 
Omar
