All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Benjamin Marzinski" <bmarzins@redhat.com>
To: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Cc: cluster-devel <cluster-devel@redhat.com>
Subject: [PATCH 2/2] gfs2: writeout truncated pages
Date: Tue, 14 Jun 2016 17:02:49 -0500	[thread overview]
Message-ID: <1465941769-25596-3-git-send-email-bmarzins@redhat.com> (raw)
In-Reply-To: <1465941769-25596-1-git-send-email-bmarzins@redhat.com>

When gfs2 attempts to write a page to a file that is being truncated,
and notices that the page is completely outside of the file size, it
tries to invalidate it.  However, this may require a transaction for
journaled data files to revoke any buffers from the page on the active
items list. Unfortunately, this can happen inside a log flush, where a
transaction cannot be started. Also, gfs2 may need to be able to remove
the buffer from the ail1 list before it can finish the log flush.

To deal with this, gfs2 now skips the check to see if the write is
outside the file size, and simply writes it anyway. This situation can
only occur when the truncate code still has the file locked exclusively,
and hasn't marked this block as free in the metadata (which happens
later in truc_dealloc).  After gfs2 writes this page out, the truncation
code will shortly invalidate it and write out any revokes if necessary.

To do this, gfs2 now implements its own version of block_write_full_page
without the check, and calls the newly exported __block_write_full_page.
The check still exists in nobh_writepage, which is gfs2 calls in the
non-journaled case. But in this case the buffers will never be in the
journal. Thus, no revokes will need to be issued and the write can
safely be skipped without causing any possible journal replay issues.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 fs/gfs2/aops.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 37b7bc1..d3a7301 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -100,20 +100,11 @@ static int gfs2_writepage_common(struct page *page,
 	struct inode *inode = page->mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
-	loff_t i_size = i_size_read(inode);
-	pgoff_t end_index = i_size >> PAGE_SHIFT;
-	unsigned offset;
 
 	if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
 		goto out;
 	if (current->journal_info)
 		goto redirty;
-	/* Is the page fully outside i_size? (truncate in progress) */
-	offset = i_size & (PAGE_SIZE-1);
-	if (page->index > end_index || (page->index == end_index && !offset)) {
-		page->mapping->a_ops->invalidatepage(page, 0, PAGE_SIZE);
-		goto out;
-	}
 	return 1;
 redirty:
 	redirty_page_for_writepage(wbc, page);
@@ -140,6 +131,32 @@ static int gfs2_writepage(struct page *page, struct writeback_control *wbc)
 	return nobh_writepage(page, gfs2_get_block_noalloc, wbc);
 }
 
+/* This is the same as calling block_write_full_page, but it also
+ * writes pages outside of i_size
+ */
+int gfs2_write_full_page(struct page *page, get_block_t *get_block,
+			 struct writeback_control *wbc)
+{
+	struct inode * const inode = page->mapping->host;
+	loff_t i_size = i_size_read(inode);
+	const pgoff_t end_index = i_size >> PAGE_SHIFT;
+	unsigned offset;
+
+	/*
+	 * The page straddles i_size.  It must be zeroed out on each and every
+	 * writepage invocation because it may be mmapped.  "A file is mapped
+	 * in multiples of the page size.  For a file that is not a multiple of
+	 * the  page size, the remaining memory is zeroed when mapped, and
+	 * writes to that region are not written out to the file."
+	 */
+	offset = i_size & (PAGE_SIZE-1);
+	if (page->index == end_index && offset)
+		zero_user_segment(page, offset, PAGE_SIZE);
+
+	return __block_write_full_page(inode, page, get_block, wbc,
+				       end_buffer_async_write);
+}
+
 /**
  * __gfs2_jdata_writepage - The core of jdata writepage
  * @page: The page to write
@@ -165,7 +182,7 @@ static int __gfs2_jdata_writepage(struct page *page, struct writeback_control *w
 		}
 		gfs2_page_add_databufs(ip, page, 0, sdp->sd_vfs->s_blocksize-1);
 	}
-	return block_write_full_page(page, gfs2_get_block_noalloc, wbc);
+	return gfs2_write_full_page(page, gfs2_get_block_noalloc, wbc);
 }
 
 /**
-- 
1.8.3.1


WARNING: multiple messages have this Message-ID (diff)
From: Benjamin Marzinski <bmarzins@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 2/2] gfs2: writeout truncated pages
Date: Tue, 14 Jun 2016 17:02:49 -0500	[thread overview]
Message-ID: <1465941769-25596-3-git-send-email-bmarzins@redhat.com> (raw)
In-Reply-To: <1465941769-25596-1-git-send-email-bmarzins@redhat.com>

When gfs2 attempts to write a page to a file that is being truncated,
and notices that the page is completely outside of the file size, it
tries to invalidate it.  However, this may require a transaction for
journaled data files to revoke any buffers from the page on the active
items list. Unfortunately, this can happen inside a log flush, where a
transaction cannot be started. Also, gfs2 may need to be able to remove
the buffer from the ail1 list before it can finish the log flush.

To deal with this, gfs2 now skips the check to see if the write is
outside the file size, and simply writes it anyway. This situation can
only occur when the truncate code still has the file locked exclusively,
and hasn't marked this block as free in the metadata (which happens
later in truc_dealloc).  After gfs2 writes this page out, the truncation
code will shortly invalidate it and write out any revokes if necessary.

To do this, gfs2 now implements its own version of block_write_full_page
without the check, and calls the newly exported __block_write_full_page.
The check still exists in nobh_writepage, which is gfs2 calls in the
non-journaled case. But in this case the buffers will never be in the
journal. Thus, no revokes will need to be issued and the write can
safely be skipped without causing any possible journal replay issues.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 fs/gfs2/aops.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 37b7bc1..d3a7301 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -100,20 +100,11 @@ static int gfs2_writepage_common(struct page *page,
 	struct inode *inode = page->mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
-	loff_t i_size = i_size_read(inode);
-	pgoff_t end_index = i_size >> PAGE_SHIFT;
-	unsigned offset;
 
 	if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
 		goto out;
 	if (current->journal_info)
 		goto redirty;
-	/* Is the page fully outside i_size? (truncate in progress) */
-	offset = i_size & (PAGE_SIZE-1);
-	if (page->index > end_index || (page->index == end_index && !offset)) {
-		page->mapping->a_ops->invalidatepage(page, 0, PAGE_SIZE);
-		goto out;
-	}
 	return 1;
 redirty:
 	redirty_page_for_writepage(wbc, page);
@@ -140,6 +131,32 @@ static int gfs2_writepage(struct page *page, struct writeback_control *wbc)
 	return nobh_writepage(page, gfs2_get_block_noalloc, wbc);
 }
 
+/* This is the same as calling block_write_full_page, but it also
+ * writes pages outside of i_size
+ */
+int gfs2_write_full_page(struct page *page, get_block_t *get_block,
+			 struct writeback_control *wbc)
+{
+	struct inode * const inode = page->mapping->host;
+	loff_t i_size = i_size_read(inode);
+	const pgoff_t end_index = i_size >> PAGE_SHIFT;
+	unsigned offset;
+
+	/*
+	 * The page straddles i_size.  It must be zeroed out on each and every
+	 * writepage invocation because it may be mmapped.  "A file is mapped
+	 * in multiples of the page size.  For a file that is not a multiple of
+	 * the  page size, the remaining memory is zeroed when mapped, and
+	 * writes to that region are not written out to the file."
+	 */
+	offset = i_size & (PAGE_SIZE-1);
+	if (page->index == end_index && offset)
+		zero_user_segment(page, offset, PAGE_SIZE);
+
+	return __block_write_full_page(inode, page, get_block, wbc,
+				       end_buffer_async_write);
+}
+
 /**
  * __gfs2_jdata_writepage - The core of jdata writepage
  * @page: The page to write
@@ -165,7 +182,7 @@ static int __gfs2_jdata_writepage(struct page *page, struct writeback_control *w
 		}
 		gfs2_page_add_databufs(ip, page, 0, sdp->sd_vfs->s_blocksize-1);
 	}
-	return block_write_full_page(page, gfs2_get_block_noalloc, wbc);
+	return gfs2_write_full_page(page, gfs2_get_block_noalloc, wbc);
 }
 
 /**
-- 
1.8.3.1



  parent reply	other threads:[~2016-06-14 22:02 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-14 22:02 [PATCH 0/2] fix gfs2 truncate race Benjamin Marzinski
2016-06-14 22:02 ` [Cluster-devel] " Benjamin Marzinski
2016-06-14 22:02 ` [PATCH 1/2] fs: export __block_write_full_page Benjamin Marzinski
2016-06-14 22:02   ` [Cluster-devel] " Benjamin Marzinski
2016-06-14 22:02 ` Benjamin Marzinski [this message]
2016-06-14 22:02   ` [Cluster-devel] [PATCH 2/2] gfs2: writeout truncated pages Benjamin Marzinski
2016-06-15 17:08   ` Bob Peterson
2016-06-15 17:08     ` Bob Peterson
2016-06-15 18:45     ` Benjamin Marzinski
2016-06-15 18:45       ` Benjamin Marzinski
2016-06-16 15:56   ` Steven Whitehouse
2016-06-16 15:56     ` Steven Whitehouse
2016-06-16 17:49     ` Benjamin Marzinski
2016-06-16 17:49       ` Benjamin Marzinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1465941769-25596-3-git-send-email-bmarzins@redhat.com \
    --to=bmarzins@redhat.com \
    --cc=cluster-devel@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.