From: Omar Sandoval <osandov@osandov.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Chris Mason <clm@fb.com>, Josef Bacik <jbacik@fb.com>,
	Trond Myklebust <trond.myklebust@primarydata.com>,
	Christoph Hellwig <hch@infradead.org>,
	David Sterba <dsterba@suse.cz>,
	linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-nfs@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Omar Sandoval <osandov@osandov.com>
Subject: [RFC PATCH v3 0/7] btrfs: implement swap file support
Date: Tue,  9 Dec 2014 17:45:41 -0800
Message-ID: <cover.1418173063.git.osandov@osandov.com>

Hi, everyone,

This patch series, based on v3.18, implements support for swap files on BTRFS.
Patches 1, 3, and 4 are for the VFS folks, patch 2 is for NFS, and the rest is
all BTRFS.

The standard swap file implementation uses bmap() to get a list of physical
blocks to do I/O on. This doesn't work for BTRFS, which moves disk blocks around
as part of normal operation (COW, defragmentation, etc.).
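
To illustrate the problem, here is a minimal userspace sketch (not kernel code)
of a block list that is cached once at swapon time and then goes stale after a
COW write. Everything prefixed with toy_ is hypothetical and only models the
idea; toy_bmap() stands in for bmap().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NBLOCKS 4	/* logical blocks in the toy file */

/* Hypothetical stand-in for a file's logical-to-physical block map. */
struct toy_file {
	unsigned long phys[TOY_NBLOCKS];
	unsigned long next_free;	/* next unused physical block */
};

/* Analogue of bmap(): translate a logical block to a physical block. */
static unsigned long toy_bmap(const struct toy_file *f, size_t logical)
{
	assert(logical < TOY_NBLOCKS);
	return f->phys[logical];
}

/* Record the whole mapping once, the way a bmap()-based swapon would. */
static void toy_cache_map(const struct toy_file *f, unsigned long *cache)
{
	for (size_t i = 0; i < TOY_NBLOCKS; i++)
		cache[i] = toy_bmap(f, i);
}

/* A COW write never overwrites in place: the block gets a new location. */
static void toy_cow_write(struct toy_file *f, size_t logical)
{
	assert(logical < TOY_NBLOCKS);
	f->phys[logical] = f->next_free++;
}

/* Count cached entries that no longer match the file's current mapping. */
static size_t toy_stale_entries(const struct toy_file *f,
				const unsigned long *cache)
{
	size_t stale = 0;

	for (size_t i = 0; i < TOY_NBLOCKS; i++)
		if (cache[i] != toy_bmap(f, i))
			stale++;
	return stale;
}
```

After toy_cache_map(), a single toy_cow_write() leaves one cached entry
pointing at the old physical block, so any I/O issued through the cache would
hit the wrong location.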

Swap-over-NFS introduced an interface through which a filesystem can arbitrate
swap I/O through address space operations:

- swap_activate() is called by swapon() and informs the address space that the
  given file is going to be used for swap, so it should take adequate measures
  like reserving space on disk and pinning block lookup information in memory
- swap_deactivate() is used to clean up on swapoff()
- direct_IO() is used to page in and out (as of this patch series, page-in no
  longer goes through readpage)
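
The division of labor between the three hooks can be sketched in userspace as
follows. This is only a toy model: the struct names, the simplified signatures
and the tiny page size are all assumptions made for illustration and do not
match the kernel's address_space_operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16	/* tiny "page" to keep the example small */
#define TOY_NPAGES 4

/* Hypothetical stand-in for a swap file's backing storage and state. */
struct toy_swapfile {
	unsigned char data[TOY_NPAGES][TOY_PAGE_SIZE];
	bool active;
};

/* Simplified analogue of the three hooks; NOT the kernel signatures. */
struct toy_swap_ops {
	int (*swap_activate)(struct toy_swapfile *f);
	void (*swap_deactivate)(struct toy_swapfile *f);
	int (*direct_io)(struct toy_swapfile *f, bool write, size_t page,
			 unsigned char *buf);
};

static int toy_activate(struct toy_swapfile *f)
{
	if (f->active)
		return -1;	/* already in use as swap */
	/* A real filesystem would reserve space and pin mappings here. */
	f->active = true;
	return 0;
}

static void toy_deactivate(struct toy_swapfile *f)
{
	f->active = false;
}

static int toy_direct_io(struct toy_swapfile *f, bool write, size_t page,
			 unsigned char *buf)
{
	if (!f->active || page >= TOY_NPAGES)
		return -1;
	if (write)
		memcpy(f->data[page], buf, TOY_PAGE_SIZE);
	else
		memcpy(buf, f->data[page], TOY_PAGE_SIZE);
	return 0;
}

static const struct toy_swap_ops toy_ops = {
	.swap_activate = toy_activate,
	.swap_deactivate = toy_deactivate,
	.direct_io = toy_direct_io,
};

/* swapon()/swapoff() analogues: they only talk to the ops table. */
static int toy_swapon(const struct toy_swap_ops *ops, struct toy_swapfile *f)
{
	assert(ops && f);
	return ops->swap_activate(f);
}

static void toy_swapoff(const struct toy_swap_ops *ops, struct toy_swapfile *f)
{
	assert(ops && f);
	ops->swap_deactivate(f);
}

/* Page-out and page-in both go through the same direct I/O hook. */
static int toy_page_out(const struct toy_swap_ops *ops, struct toy_swapfile *f,
			size_t page, unsigned char *buf)
{
	return ops->direct_io(f, true, page, buf);
}

static int toy_page_in(const struct toy_swap_ops *ops, struct toy_swapfile *f,
		       size_t page, unsigned char *buf)
{
	return ops->direct_io(f, false, page, buf);
}
```

The point of the sketch is that the swap code never sees physical blocks: it
hands pages to the filesystem, which is free to resolve them however it likes.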

Patches 1-4 clean up the necessary infrastructure. There's more that can make
this better (like resurrecting kernel AIO), but that can be done as a follow-up
to the work here.

Patches 5 and 6 lay the groundwork needed for using a swap file on BTRFS, and
patch 7 implements the actual aops.

Version 3 incorporates a bunch of David Sterba's feedback on both style and
design issues. We now audit various ioctls to prevent them from interfering with
swap file operation, and we handle extents which can't be nocow'd.
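
As a rough illustration of what such an audit amounts to, here is a userspace
sketch of an ioctl-style handler that refuses to touch an active swap file.
The struct, the function name and the choice of -ETXTBSY are assumptions made
for this example, not taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state; is_swapfile would be set at swapon time. */
struct toy_file_state {
	bool is_swapfile;
	unsigned int relocations;	/* how often extents were moved */
};

/* Toy defrag-like operation that would move the file's extents around. */
static int toy_defrag_ioctl(struct toy_file_state *file)
{
	assert(file != NULL);
	if (file->is_swapfile)
		return -ETXTBSY;	/* assumed errno: file is busy as swap */
	file->relocations++;		/* stands in for relocating extents */
	return 0;
}
```

The check comes before any state is modified, so a rejected call leaves the
swap file's layout untouched.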

After some discussion on the mailing list, I decided that for simplicity and
reliability, it's best to simply disallow COW files and files with shared
extents (like files with extents shared with a snapshot). From a user's
perspective, this means that a snapshotted subvolume cannot be used for a swap
file, but keeping the swap file in a separate subvolume that is never
snapshotted seems entirely reasonable to me. An alternative suggestion was to
allow swap files to be snapshotted and to do an implied COW on swap file
activation. I was ready to implement that until I realized that we can't permit
snapshotting a subvolume with an active swap file, so it would create a
surprising inconsistency for users (in my opinion).
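
The rule described above boils down to a check at activation time. Below is a
minimal userspace sketch of it; the structs, the reference-count
representation of "shared" and the -EINVAL return value are assumptions for
illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extent: refs > 1 means it is shared, e.g. with a snapshot. */
struct toy_extent {
	unsigned int refs;
};

/* Hypothetical inode: a NOCOW flag plus the extents backing the file. */
struct toy_inode {
	bool nocow;
	const struct toy_extent *extents;
	size_t nr_extents;
};

/* Return 0 if the file may be used for swap, a negative errno otherwise. */
static int toy_swap_eligible(const struct toy_inode *inode)
{
	assert(inode != NULL);

	if (!inode->nocow)
		return -EINVAL;		/* COW files are disallowed */

	for (size_t i = 0; i < inode->nr_extents; i++)
		if (inode->extents[i].refs > 1)
			return -EINVAL;	/* shared extent, e.g. snapshotted */

	return 0;
}
```

A file in a snapshotted subvolume fails the second check as soon as one of its
extents is referenced by the snapshot.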

As before, this functionality has only been lightly tested in a virtual machine
with some artificial workloads, but it "works for me". I'm pretty happy with the
results on my end, so please comment away.

Thanks!

Omar Sandoval (7):
  direct-io: don't dirty ITER_BVEC pages on read
  nfs: don't dirty ITER_BVEC pages read through direct I/O
  swap: use direct I/O for SWP_FILE swap_readpage
  vfs: update swap_{,de}activate documentation
  btrfs: prevent ioctls from interfering with a swap file
  btrfs: add EXTENT_FLAG_SWAPFILE
  btrfs: enable swap file support

 Documentation/filesystems/Locking |   7 +-
 Documentation/filesystems/vfs.txt |   7 +-
 fs/btrfs/ctree.h                  |   3 +
 fs/btrfs/disk-io.c                |   1 +
 fs/btrfs/extent_io.c              |   1 +
 fs/btrfs/extent_map.h             |   1 +
 fs/btrfs/inode.c                  | 132 ++++++++++++++++++++++++++++++++++++++
 fs/btrfs/ioctl.c                  |  35 ++++++++--
 fs/direct-io.c                    |   8 ++-
 fs/nfs/direct.c                   |   5 +-
 include/trace/events/btrfs.h      |   3 +-
 mm/page_io.c                      |  32 +++++++--
 12 files changed, 216 insertions(+), 19 deletions(-)

-- 
2.1.3

