All of lore.kernel.org
 help / color / mirror / Atom feed
From: Omar Sandoval <osandov@osandov.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-btrfs@vger.kernel.org, Mel Gorman <mgorman@suse.de>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 0/6] btrfs: implement swap file support
Date: Tue, 18 Nov 2014 23:22:35 -0800	[thread overview]
Message-ID: <20141119072235.GA6541@mew> (raw)
In-Reply-To: <20141117154817.GA25670@infradead.org>

[-- Attachment #1: Type: text/plain, Size: 1490 bytes --]

On Mon, Nov 17, 2014 at 07:48:17AM -0800, Christoph Hellwig wrote:
> With the new iov_iter infrastructure that supprots direct I/O to kernel
> pages please get rid of the ->readpage hack first.  I'm still utterly
> disapoined that this crap ever got merged.
> 
That seems reasonable. Using direct I/O circumvents the need for patches 3, 4,
and 5, which were workarounds for readpage being fed a swapcache page, and
patch 1, which is a big, error-prone mess.

Here's a nice little bit of insanity I put together in that direction --
consider it a discussion point more than a patch. It does two things:

- Uses an ITER_BVEC iov_iter to do direct_IO for swap_readpage. This makes
  swap_readpage a synchronous operation, but I think that's the best we can do
  with the existing interface.
- Unless I'm missing something, there don't appear to be any instances of
  ITER_BVEC | READ in the kernel, so the dio path doesn't know not to dirty
  pages it gets that way. Dave Kleikamp and Ming Lei each previously submitted
  patches doing this as part of adding an aio_kernel interface. (The NFS direct
  I/O implementation doesn't know how to deal with these either, so this patch
  actually breaks the only existing user of this code path, but in the interest
  of keeping the patch short, I didn't try to fix it :)

Obviously, there's more to be done if that's how you'd prefer I do this. I'm
far from being an expert in any of this, so please let me know if I'm spewing
nonsense :)

-- 
Omar

[-- Attachment #2: 0001-swap-use-direct_IO-for-swap_readpage.patch --]
[-- Type: text/x-diff, Size: 4279 bytes --]

>From e58c52e69a9aef07c0089f9ce552fca96d42bce9 Mon Sep 17 00:00:00 2001
Message-Id: <e58c52e69a9aef07c0089f9ce552fca96d42bce9.1416380574.git.osandov@osandov.com>
From: Omar Sandoval <osandov@osandov.com>
Date: Tue, 18 Nov 2014 22:42:10 -0800
Subject: [PATCH] swap: use direct_IO for swap_readpage

Signed-off-by: Omar Sandoval <osandov@osandov.com>
---
 fs/direct-io.c |  8 +++++---
 mm/page_io.c   | 37 ++++++++++++++++++++++++++++++-------
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index e181b6b..e542ce4 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -120,6 +120,7 @@ struct dio {
 	spinlock_t bio_lock;		/* protects BIO fields below */
 	int page_errors;		/* errno from get_user_pages() */
 	int is_async;			/* is IO async ? */
+	int should_dirty;		/* should we mark read pages dirty? */
 	bool defer_completion;		/* defer AIO completion to workqueue? */
 	int io_error;			/* IO error in completion path */
 	unsigned long refcount;		/* direct_io_worker() and bios */
@@ -392,7 +393,7 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
 	dio->refcount++;
 	spin_unlock_irqrestore(&dio->bio_lock, flags);
 
-	if (dio->is_async && dio->rw == READ)
+	if (dio->is_async && dio->rw == READ && dio->should_dirty)
 		bio_set_pages_dirty(bio);
 
 	if (sdio->submit_io)
@@ -463,13 +464,13 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
 	if (!uptodate)
 		dio->io_error = -EIO;
 
-	if (dio->is_async && dio->rw == READ) {
+	if (dio->is_async && dio->rw == READ && dio->should_dirty) {
 		bio_check_pages_dirty(bio);	/* transfers ownership */
 	} else {
 		bio_for_each_segment_all(bvec, bio, i) {
 			struct page *page = bvec->bv_page;
 
-			if (dio->rw == READ && !PageCompound(page))
+			if (dio->rw == READ && !PageCompound(page) && dio->should_dirty)
 				set_page_dirty_lock(page);
 			page_cache_release(page);
 		}
@@ -1177,6 +1178,7 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
 
 	dio->inode = inode;
 	dio->rw = rw;
+	dio->should_dirty = !(iter->type & ITER_BVEC);
 
 	/*
 	 * For AIO O_(D)SYNC writes we need to defer completions to a workqueue
diff --git a/mm/page_io.c b/mm/page_io.c
index 955db8b..b9b84b2 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -266,8 +266,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 		struct address_space *mapping = swap_file->f_mapping;
 		struct bio_vec bv = {
 			.bv_page = page,
-			.bv_len  = PAGE_SIZE,
-			.bv_offset = 0
+			.bv_len = PAGE_SIZE,
+			.bv_offset = 0,
 		};
 		struct iov_iter from = {
 			.type = ITER_BVEC | WRITE,
@@ -283,8 +283,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 
 		set_page_writeback(page);
 		unlock_page(page);
-		ret = mapping->a_ops->direct_IO(ITER_BVEC | WRITE,
-						&kiocb, &from,
+		ret = mapping->a_ops->direct_IO(WRITE, &kiocb, &from,
 						kiocb.ki_pos);
 		if (ret == PAGE_SIZE) {
 			count_vm_event(PSWPOUT);
@@ -303,7 +302,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 			set_page_dirty(page);
 			ClearPageReclaim(page);
 			pr_err_ratelimited("Write error on dio swapfile (%Lu)\n",
-				page_file_offset(page));
+					   page_file_offset(page));
 		}
 		end_page_writeback(page);
 		return ret;
@@ -348,12 +347,36 @@ int swap_readpage(struct page *page)
 	}
 
 	if (sis->flags & SWP_FILE) {
+		struct kiocb kiocb;
 		struct file *swap_file = sis->swap_file;
 		struct address_space *mapping = swap_file->f_mapping;
+		struct bio_vec bv = {
+			.bv_page = page,
+			.bv_len = PAGE_SIZE,
+			.bv_offset = 0,
+		};
+		struct iov_iter to = {
+			.type = ITER_BVEC | READ,
+			.count = PAGE_SIZE,
+			.iov_offset = 0,
+			.nr_segs = 1,
+		};
+		to.bvec = &bv;	/* older gcc versions are broken */
+
+		init_sync_kiocb(&kiocb, swap_file);
+		kiocb.ki_pos = page_file_offset(page);
+		kiocb.ki_nbytes = PAGE_SIZE;
 
-		ret = mapping->a_ops->readpage(swap_file, page);
-		if (!ret)
+		ret = mapping->a_ops->direct_IO(READ, &kiocb, &to,
+						kiocb.ki_pos);
+		if (ret == PAGE_SIZE) {
+			SetPageUptodate(page);
 			count_vm_event(PSWPIN);
+			ret = 0;
+		} else {
+			PageError(page); /* XXX: maybe? */
+		}
+		unlock_page(page);
 		return ret;
 	}
 
-- 
2.1.3


  reply	other threads:[~2014-11-19  7:22 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-17 10:36 [RFC PATCH 0/6] btrfs: implement swap file support Omar Sandoval
2014-11-17 10:36 ` [RFC PATCH 1/6] btrfs: convert uses of ->mapping and ->index to wrappers Omar Sandoval
2014-11-17 10:36 ` [RFC PATCH 2/6] btrfs: don't allow -C or +c chattrs on a swap file Omar Sandoval
2014-11-17 10:36 ` [RFC PATCH 3/6] btrfs: don't set ->private on swapcache pages Omar Sandoval
2014-11-17 10:36 ` [RFC PATCH 4/6] btrfs: don't check the cleancache for " Omar Sandoval
2014-11-17 10:36 ` [RFC PATCH 5/6] btrfs: don't mark extents used for swap as up to date Omar Sandoval
2014-11-17 10:36 ` [RFC PATCH 6/6] btrfs: enable swap file support Omar Sandoval
2014-11-17 15:48 ` [RFC PATCH 0/6] btrfs: implement " Christoph Hellwig
2014-11-19  7:22   ` Omar Sandoval [this message]
2014-11-21 10:06     ` Christoph Hellwig
2014-11-21 10:12       ` Omar Sandoval

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141119072235.GA6541@mew \
    --to=osandov@osandov.com \
    --cc=hch@infradead.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.