All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5 v6] ext4: fix 1k block bugs
@ 2011-08-24 19:07 Allison Henderson
  2011-08-24 19:07 ` [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines Allison Henderson
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Allison Henderson @ 2011-08-24 19:07 UTC (permalink / raw)
  To: linux-ext4; +Cc: Allison Henderson

This patch set corrects several bugs that occur
when blocksize = 1k.

The first bug appears in xfstests 75, 112, 127 and occurs
because the partial pages appearing at the head and tail
of a hole are not properly zeroed and unmapped

A second bug in test 127 occurs when i_size and the end of
the hole appear in the same page.  Because punch hole needs
to flush the pages in the hole, it will need to properly
unmap the partial page appearing after i_size.

A similar bug occurred during extra fsx testing for the
first two patches.  This bug happens because truncate also
does not zero and unmap the partial page at the end of the
file.

Similar bugs also occurred when a write operation begins or
ends in a hole or extends EOF.  The partial page of the write
operation needs to be zeroed and unmapped.

These bugs are corrected using a new ext4_discard_partial_page_buffers
routine that zeros a specified region of a page, and unmaps buffer
heads for any block aligned region of the page that was completely
zeroed.

Lastly this set also contains a fix for a bug reported by Lukas while
working on a new patch to add discard support to loop devices using
punch hole.  This bug occurs when calculating the byte block offset for
very large holes.  Since the code now needs the byte offset of the
pages instead of the blocks, the problem code is simply removed.
 
This patch set has passed the following tests
(for both 1k and 4k blocksizes):
xfstests 75, 112, 127, 130, 252 and 256
fsx stress test (12 hours)


v1 -> v2
Added EXT4_BLOCK_ZERO_DISCARD_BUFFER flag

v2 -> v3
Moved code out of ext4_zero_block_page_range and in
to new ext4_unmap_page_range function

v3 -> v4
Renamed ext4_unmap_page_range to ext4_unmap_partial_page_buffers
Moved ext4_unmap_partial_page_buffers from inode.c to extents.c
Corrected comments for non block/page aligned handling
Added checks to avoid unnecessary page unmaps
Removed unneeded journaling and mapping from new routine

v4 -> v5
Renamed ext4_unmap_partial_page_buffers to
ext4_discard_partial_page_buffers_no_lock and added
new ext4_discard_partial_page_buffers wrapper function

Modified ext4_discard_partial_page_buffers_no_lock to
zero the page as well as unmap buffers

Moved ext4_discard_partial_page_buffers functions
back to inode.c, and also put in a separate patch.

Added extra patches for write bugs and truncate bugs

Added extra patch for large hole calculation

v5 -> v6
Merged patch 4 into patch 2 since block offsets are
no longer needed

Removed uneeded ClearPageUptodate when unmapping
buffer heads

Patch set now passes test 130.  The
ext4_discard_partial_page_buffers_no_lock routine
was modified to map buffer heads for unblock aligned
regions of the page (if needed) so that they can be
updated with the contents of the block before being
partially zeroed

Allison Henderson (5):
  ext4: Add new ext4_discard_partial_page_buffers routines
  ext4: fix xfstests 75, 112, 127 punch hole failure
  ext4: fix 2nd xfstests 127 punch hole failure
  ext4: fix fsx truncate failure
  ext4: fix partial page writes

 fs/ext4/ext4.h     |   11 +++
 fs/ext4/extents.c  |  116 +++++++++++++++++++------
 fs/ext4/indirect.c |   16 +++-
 fs/ext4/inode.c    |  242 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 355 insertions(+), 30 deletions(-)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines
  2011-08-24 19:07 [PATCH 0/5 v6] ext4: fix 1k block bugs Allison Henderson
@ 2011-08-24 19:07 ` Allison Henderson
  2011-08-27  4:06   ` Ted Ts'o
  2011-08-24 19:07 ` [PATCH 2/5 v6] ext4: fix xfstests 75, 112, 127 punch hole failure Allison Henderson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Allison Henderson @ 2011-08-24 19:07 UTC (permalink / raw)
  To: linux-ext4; +Cc: Allison Henderson

This patch adds two new routines: ext4_discard_partial_page_buffers
and ext4_discard_partial_page_buffers_no_lock.

The ext4_discard_partial_page_buffers routine is a wrapper
function to ext4_discard_partial_page_buffers_no_lock.
The wrapper function locks the page and passes it to
ext4_discard_partial_page_buffers_no_lock.
Calling functions that already have the page locked can call
ext4_discard_partial_page_buffers_no_lock directly.

The ext4_discard_partial_page_buffers_no_lock function
zeros a specified range in a page, and unmaps the
corresponding buffer heads.  Only block aligned regions of the
page will have their buffer heads unmapped.  Unblock aligned regions
will be mapped if needed so that they can be updated with the
partial zero out.  This function is meant to
be used to update a page and its buffer heads to be zeroed
and unmaped when the corresponding blocks have been released
or will be released.

This routine is used in the following scenarios:
* A hole is punched and the non page aligned regions
  of the head and tail of the hole need to be discarded

* The file is truncated and the partial page beyond EOF needs
  to be discarded

* The end of a hole is in the same page as EOF.  After the
  page is flushed, the partial page beyond EOF needs to be
  discarded.

* A write operation begins or ends inside a hole and the partial
  page appearing before or after the write needs to be discarded

* A write operation extends EOF and the partial page beyond EOF
  needs to be discarded

This function takes a flag EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED
which is used when a write operation begins or ends in a hole.
When the EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED flag is used, only
buffer heads that are already unmapped will have the corresponding
regions of the page zeroed.

Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>
---
:100644 100644 040b3fa... c386f8d... M	fs/ext4/ext4.h
:100644 100644 8fdc298... 6328428... M	fs/ext4/inode.c
 fs/ext4/ext4.h  |   11 +++
 fs/ext4/inode.c |  223 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 234 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 040b3fa..c386f8d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -529,6 +529,11 @@ struct ext4_new_group_data {
 #define EXT4_FREE_BLOCKS_NO_QUOT_UPDATE	0x0008
 
 /*
+ * Flags used by ext4_discard_partial_page_buffers
+ */
+#define EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED	0x0001
+
+/*
  * ioctl commands
  */
 #define	EXT4_IOC_GETFLAGS		FS_IOC_GETFLAGS
@@ -1836,6 +1841,12 @@ extern int ext4_block_truncate_page(handle_t *handle,
 		struct address_space *mapping, loff_t from);
 extern int ext4_block_zero_page_range(handle_t *handle,
 		struct address_space *mapping, loff_t from, loff_t length);
+extern int ext4_discard_partial_page_buffers(handle_t *handle,
+		struct address_space *mapping, loff_t from,
+		loff_t length, int flags);
+extern int ext4_discard_partial_page_buffers_no_lock(handle_t *handle,
+		struct inode *inode, struct page *page, loff_t from,
+		loff_t length, int flags);
 extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
 extern qsize_t *ext4_get_reserved_space(struct inode *inode);
 extern void ext4_da_update_reserve_space(struct inode *inode,
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8fdc298..6328428 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2911,6 +2911,229 @@ void ext4_set_aops(struct inode *inode)
 		inode->i_mapping->a_ops = &ext4_journalled_aops;
 }
 
+
+/*
+ * ext4_discard_partial_page_buffers()
+ * Wrapper function for ext4_discard_partial_page_buffers_no_lock.
+ * This function finds and locks the page containing the offset
+ * "from" and passes it to ext4_discard_partial_page_buffers_no_lock.
+ * Calling functions that already have the page locked should call
+ * ext4_discard_partial_page_buffers_no_lock directly.
+ */
+int ext4_discard_partial_page_buffers(handle_t *handle,
+		struct address_space *mapping, loff_t from,
+		loff_t length, int flags)
+{
+	struct inode *inode = mapping->host;
+	struct page *page;
+	int err = 0;
+
+	page = find_or_create_page(mapping, from >> PAGE_CACHE_SHIFT,
+				   mapping_gfp_mask(mapping) & ~__GFP_FS);
+	if (!page)
+		return -EINVAL;
+
+	err = ext4_discard_partial_page_buffers_no_lock(handle, inode, page,
+		from, length, flags);
+
+	unlock_page(page);
+	page_cache_release(page);
+	return err;
+}
+
+/*
+ * ext4_discard_partial_page_buffers_no_lock()
+ * Zeros a page range of length 'length' starting from offset 'from'.
+ * Buffer heads that correspond to the block aligned regions of the
+ * zeroed range will be unmapped.  Unblock aligned regions
+ * will have the corresponding buffer head mapped if needed so that
+ * that region of the page can be updated with the partial zero out.
+ *
+ * This function assumes that the page has already been  locked.  The
+ * The range to be discarded must be contained with in the given page.
+ * If the specified range exceeds the end of the page it will be shortened
+ * to the end of the page that corresponds to 'from'.  This function is
+ * appropriate for updateing a page and it buffer heads to be unmaped and
+ * zeroed for blocks that have been either released, or are going to be
+ * released.
+ *
+ * handle: The journal handle
+ * inode:  The files inode
+ * page:   A locked page that contains the offset "from"
+ * from:   The starting byte offset (from the begining of the file)
+ *         to begin discarding
+ * len:    The length of bytes to discard
+ * flags:  Optional flags that may be used:
+ *
+ *         EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED
+ *         Only zero the regions of the page whose buffer heads
+ *         have already been unmapped.  This flag is appropriate
+ *         for updateing the contents of a page whose blocks may
+ *         have already been released, and we only want to zero
+ *         out the regions that correspond to those released blocks.
+ *
+ * Returns zero on sucess or negative on failure.
+ */
+int ext4_discard_partial_page_buffers_no_lock(handle_t *handle,
+		struct inode *inode, struct page *page, loff_t from,
+		loff_t length, int flags)
+{
+	ext4_fsblk_t index = from >> PAGE_CACHE_SHIFT;
+	unsigned int offset = from & (PAGE_CACHE_SIZE-1);
+	unsigned int blocksize, max, pos;
+	unsigned int end_of_block, range_to_discard;
+	ext4_lblk_t iblock;
+	struct buffer_head *bh;
+	int err = 0;
+
+	blocksize = inode->i_sb->s_blocksize;
+	max = PAGE_CACHE_SIZE - offset;
+
+	if (index != page->index)
+		return -EINVAL;
+
+	/*
+	 * correct length if it does not fall between
+	 * 'from' and the end of the page
+	 */
+	if (length > max || length < 0)
+		length = max;
+
+	iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);
+
+	if (!page_has_buffers(page)) {
+		/*
+		 * If the range to be discarded covers a partial block
+		 * we need to get the page buffers.  This is because
+		 * partial blocks cannot be released and the page needs
+		 * to be updated with the contents of the block before
+		 * we write the zeros on top of it.
+		 */
+		if (!(from & (blocksize - 1)) ||
+		    !((from + length) & (blocksize - 1))) {
+			create_empty_buffers(page, blocksize, 0);
+		} else {
+			/*
+			 * If there are no partial blocks,
+			 * there is nothing to update,
+			 * so we can return now
+			 */
+			return 0;
+		}
+	}
+
+	/* Find the buffer that contains "offset" */
+	bh = page_buffers(page);
+	pos = blocksize;
+	while (offset >= pos) {
+		bh = bh->b_this_page;
+		iblock++;
+		pos += blocksize;
+	}
+
+	pos = offset;
+	while (pos < offset + length) {
+		err = 0;
+
+		/* The length of space left to zero and unmap */
+		range_to_discard = offset + length - pos;
+
+		/* The length of space until the end of the block */
+		end_of_block = blocksize - (pos & (blocksize-1));
+
+		/*
+		 * Do not unmap or zero past end of block
+		 * for this buffer head
+		 */
+		if (range_to_discard > end_of_block)
+			range_to_discard = end_of_block;
+
+
+		/*
+		 * Skip this buffer head if we are only zeroing unampped
+		 * regions of the page
+		 */
+		if (flags & EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED &&
+			buffer_mapped(bh))
+				goto next;
+
+		/*
+		 * If this block is not completely contained in the range
+		 * to be discarded, then it is not going to be released. Because
+		 * we need to keep this block, we need to make sure this part
+		 * of the page is uptodate before we modify it by writeing
+		 * partial zeros on it.
+		 */
+		if (range_to_discard != blocksize) {
+			/*
+			 * Buffer head must be mapped before we can read
+			 * from the block
+			 */
+			if (!buffer_mapped(bh)) {
+				BUFFER_TRACE(bh, "unmapped");
+				ext4_get_block(inode, iblock, bh, 0);
+				/* unmapped? It's a hole - nothing to do */
+				if (!buffer_mapped(bh)) {
+					BUFFER_TRACE(bh, "still unmapped");
+					goto next;
+				}
+			}
+
+			/* Ok, it's mapped. Make sure it's up-to-date */
+			if (PageUptodate(page))
+				set_buffer_uptodate(bh);
+
+			if (!buffer_uptodate(bh)) {
+				err = -EIO;
+				ll_rw_block(READ, 1, &bh);
+				wait_on_buffer(bh);
+				/* Uhhuh. Read error. Complain and punt.*/
+				if (!buffer_uptodate(bh))
+					goto next;
+			}
+		}
+
+		if (ext4_should_journal_data(inode)) {
+			BUFFER_TRACE(bh, "get write access");
+			err = ext4_journal_get_write_access(handle, bh);
+			if (err)
+				goto next;
+		}
+
+		zero_user(page, pos, range_to_discard);
+
+		BUFFER_TRACE(bh, "buffer discarded");
+
+		err = 0;
+		if (ext4_should_journal_data(inode)) {
+			err = ext4_handle_dirty_metadata(handle, inode, bh);
+		} else {
+			if (ext4_should_order_data(inode) &&
+			   EXT4_I(inode)->jinode)
+				err = ext4_jbd2_file_inode(handle, inode);
+			mark_buffer_dirty(bh);
+		}
+
+		/* If the range is block aligned, unmap */
+		if (range_to_discard == blocksize) {
+			clear_buffer_dirty(bh);
+			bh->b_bdev = NULL;
+			clear_buffer_mapped(bh);
+			clear_buffer_req(bh);
+			clear_buffer_new(bh);
+			clear_buffer_delay(bh);
+			clear_buffer_unwritten(bh);
+			clear_buffer_uptodate(bh);
+		}
+next:
+		bh = bh->b_this_page;
+		iblock++;
+		pos += range_to_discard;
+	}
+
+	return err;
+}
+
 /*
  * ext4_block_truncate_page() zeroes out a mapping from file offset `from'
  * up to the end of the block which corresponds to `from'.
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/5 v6] ext4: fix xfstests 75, 112, 127 punch hole failure
  2011-08-24 19:07 [PATCH 0/5 v6] ext4: fix 1k block bugs Allison Henderson
  2011-08-24 19:07 ` [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines Allison Henderson
@ 2011-08-24 19:07 ` Allison Henderson
  2011-08-24 19:07 ` [PATCH 3/5 v6] ext4: fix 2nd xfstests " Allison Henderson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Allison Henderson @ 2011-08-24 19:07 UTC (permalink / raw)
  To: linux-ext4; +Cc: Allison Henderson

This patch addresses a bug found by xfstests 75, 112, 127
when blocksize = 1k

This bug happens because the punch hole code only zeros
out non block aligned regions of the page.  This means that if the
blocks are smaller than a page, then the block aligned regions of
the page inside the hole are left un-zeroed, and their buffer heads
are still mapped.  This bug is corrected by using
ext4_discard_partial_page_buffers to properly zero the partial page
at the head and tail of the hole, and unmap the corresponding buffer
heads

This patch also addresses a bug reported by Lukas while working on a
new patch to add discard support for loop devices using punch hole.
The bug happened because of the first and last block number
needed to be cast to a larger data type before calculating the
byte offset, but since now we only need the byte offsets of the
pages, we no longer even need to be calculating the byte offsets
of the blocks.  The code to do the block offset calculations is
removed in this patch.

Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>
---
:100644 100644 4d73e11... 653a8aa... M	fs/ext4/extents.c
 fs/ext4/extents.c |   61 +++++++++++++++++++++++++++++++++-------------------
 1 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 4d73e11..653a8aa 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4157,17 +4157,14 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 	struct address_space *mapping = inode->i_mapping;
 	struct ext4_map_blocks map;
 	handle_t *handle;
-	loff_t first_block_offset, last_block_offset, block_len;
-	loff_t first_page, last_page, first_page_offset, last_page_offset;
+	loff_t first_page, last_page, page_len;
+	loff_t first_page_offset, last_page_offset;
 	int ret, credits, blocks_released, err = 0;
 
 	first_block = (offset + sb->s_blocksize - 1) >>
 		EXT4_BLOCK_SIZE_BITS(sb);
 	last_block = (offset + length) >> EXT4_BLOCK_SIZE_BITS(sb);
 
-	first_block_offset = first_block << EXT4_BLOCK_SIZE_BITS(sb);
-	last_block_offset = last_block << EXT4_BLOCK_SIZE_BITS(sb);
-
 	first_page = (offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
 	last_page = (offset + length) >> PAGE_CACHE_SHIFT;
 
@@ -4206,24 +4203,44 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 		goto out;
 
 	/*
-	 * Now we need to zero out the un block aligned data.
-	 * If the file is smaller than a block, just
-	 * zero out the middle
+	 * Now we need to zero out the non-page-aligned data in the
+	 * pages at the start and tail of the hole, and unmap the buffer
+	 * heads for the block aligned regions of the page that were
+	 * completely zeroed.
 	 */
-	if (first_block > last_block)
-		ext4_block_zero_page_range(handle, mapping, offset, length);
-	else {
-		/* zero out the head of the hole before the first block */
-		block_len  = first_block_offset - offset;
-		if (block_len > 0)
-			ext4_block_zero_page_range(handle, mapping,
-						   offset, block_len);
-
-		/* zero out the tail of the hole after the last block */
-		block_len = offset + length - last_block_offset;
-		if (block_len > 0) {
-			ext4_block_zero_page_range(handle, mapping,
-					last_block_offset, block_len);
+	if (first_page > last_page) {
+		/*
+		 * If the file space being truncated is contained within a page
+		 * just zero out and unmap the middle of that page
+		 */
+		err = ext4_discard_partial_page_buffers(handle,
+			mapping, offset, length, 0);
+
+		if (err)
+			goto out;
+	} else {
+		/*
+		 * zero out and unmap the partial page that contains
+		 * the start of the hole
+		 */
+		page_len  = first_page_offset - offset;
+		if (page_len > 0) {
+			err = ext4_discard_partial_page_buffers(handle, mapping,
+						   offset, page_len, 0);
+			if (err)
+				goto out;
+		}
+
+		/*
+		 * zero out and unmap the partial page that contains
+		 * the end of the hole
+		 */
+		page_len = offset + length - last_page_offset;
+		if (page_len > 0) {
+			err = ext4_discard_partial_page_buffers(handle, mapping,
+					last_page_offset, page_len, 0);
+			if (err)
+				goto out;
 		}
 	}
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/5 v6] ext4: fix 2nd xfstests 127 punch hole failure
  2011-08-24 19:07 [PATCH 0/5 v6] ext4: fix 1k block bugs Allison Henderson
  2011-08-24 19:07 ` [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines Allison Henderson
  2011-08-24 19:07 ` [PATCH 2/5 v6] ext4: fix xfstests 75, 112, 127 punch hole failure Allison Henderson
@ 2011-08-24 19:07 ` Allison Henderson
  2011-08-24 19:07 ` [PATCH 4/5 v6] ext4: fix fsx truncate failure Allison Henderson
  2011-08-24 19:07 ` [PATCH 5/5 v6] ext4: fix partial page writes Allison Henderson
  4 siblings, 0 replies; 10+ messages in thread
From: Allison Henderson @ 2011-08-24 19:07 UTC (permalink / raw)
  To: linux-ext4; +Cc: Allison Henderson

This patch fixes a second punch hole bug found by xfstests 127.

This bug happens because punch hole needs to flush the pages
of the hole to avoid race conditions.  But if the end of the
hole is in the same page as i_size, the buffer heads beyond
i_size need to be unmapped and the page needs to be zeroed
after it is flushed.

To correct this, the new ext4_discard_partial_page_buffers
routine is used to zero and unmap the partial page
beyond i_size if the end of the hole appears in the same
page as i_size.

The code has also been optimized to set the end of the hole
to the page after i_size if the specified hole exceeds i_size,
and the code that flushes the pages has been simplified.

Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>
---
:100644 100644 653a8aa... 3e87a43... M	fs/ext4/extents.c
 fs/ext4/extents.c |   41 +++++++++++++++++++++++++++++++++++++----
 1 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 653a8aa..3e87a43 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4161,6 +4161,20 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 	loff_t first_page_offset, last_page_offset;
 	int ret, credits, blocks_released, err = 0;
 
+	/* No need to punch hole beyond i_size */
+	if (offset >= inode->i_size)
+		return 0;
+
+	/*
+	 * If the hole extends beyond i_size, set the hole
+	 * to end after the page that contains i_size
+	 */
+	if (offset + length > inode->i_size) {
+		length = inode->i_size +
+		   PAGE_CACHE_SIZE - (inode->i_size & (PAGE_CACHE_SIZE - 1)) -
+		   offset;
+	}
+
 	first_block = (offset + sb->s_blocksize - 1) >>
 		EXT4_BLOCK_SIZE_BITS(sb);
 	last_block = (offset + length) >> EXT4_BLOCK_SIZE_BITS(sb);
@@ -4177,11 +4191,10 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 	 */
 	if (mapping->nrpages && mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
 		err = filemap_write_and_wait_range(mapping,
-			first_page_offset == 0 ? 0 : first_page_offset-1,
-			last_page_offset);
+			offset, offset + length - 1);
 
-			if (err)
-				return err;
+		if (err)
+			return err;
 	}
 
 	/* Now release the pages */
@@ -4244,6 +4257,26 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 		}
 	}
 
+
+	/*
+	 * If i_size is contained in the last page, we need to
+	 * unmap and zero the partial page after i_size
+	 */
+	if (inode->i_size >> PAGE_CACHE_SHIFT == last_page &&
+	   inode->i_size % PAGE_CACHE_SIZE != 0) {
+
+		page_len = PAGE_CACHE_SIZE -
+			(inode->i_size & (PAGE_CACHE_SIZE - 1));
+
+		if (page_len > 0) {
+			err = ext4_discard_partial_page_buffers(handle,
+			  mapping, inode->i_size, page_len, 0);
+
+			if (err)
+				goto out;
+		}
+	}
+
 	/* If there are no blocks to remove, return now */
 	if (first_block >= last_block)
 		goto out;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/5 v6] ext4: fix fsx truncate failure
  2011-08-24 19:07 [PATCH 0/5 v6] ext4: fix 1k block bugs Allison Henderson
                   ` (2 preceding siblings ...)
  2011-08-24 19:07 ` [PATCH 3/5 v6] ext4: fix 2nd xfstests " Allison Henderson
@ 2011-08-24 19:07 ` Allison Henderson
  2011-08-24 19:07 ` [PATCH 5/5 v6] ext4: fix partial page writes Allison Henderson
  4 siblings, 0 replies; 10+ messages in thread
From: Allison Henderson @ 2011-08-24 19:07 UTC (permalink / raw)
  To: linux-ext4; +Cc: Allison Henderson

While running extended fsx tests to verify the first
two patches, a similar bug was also found in the
truncate operation

This bug happens because the truncate routine only zeros
the unblock aligned portion of the last page.  This means
that the block aligned portions of the page appearing after
i_size are left unzeroed, and the buffer heads still mapped.

This bug is corrected by using ext4_discard_partial_page_buffers
in the truncate routine to zero the partial page and unmap
the buffer headers

Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>
---
:100644 100644 3e87a43... 98a23b2... M	fs/ext4/extents.c
:100644 100644 b8602cd... c11bfcd... M	fs/ext4/indirect.c
 fs/ext4/extents.c  |   14 ++++++++++++--
 fs/ext4/indirect.c |   16 ++++++++++++++--
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 3e87a43..98a23b2 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3644,6 +3644,7 @@ void ext4_ext_truncate(struct inode *inode)
 	struct super_block *sb = inode->i_sb;
 	ext4_lblk_t last_block;
 	handle_t *handle;
+	loff_t page_len;
 	int err = 0;
 
 	/*
@@ -3660,8 +3661,17 @@ void ext4_ext_truncate(struct inode *inode)
 	if (IS_ERR(handle))
 		return;
 
-	if (inode->i_size & (sb->s_blocksize - 1))
-		ext4_block_truncate_page(handle, mapping, inode->i_size);
+	if (inode->i_size % PAGE_CACHE_SIZE != 0) {
+
+		page_len = PAGE_CACHE_SIZE -
+			(inode->i_size & (PAGE_CACHE_SIZE - 1));
+
+		err = ext4_discard_partial_page_buffers(handle,
+			mapping, inode->i_size, page_len, 0);
+
+		if (err)
+			goto out_stop;
+	}
 
 	if (ext4_orphan_add(handle, inode))
 		goto out_stop;
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index b8602cd..c11bfcd 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -1338,7 +1338,9 @@ void ext4_ind_truncate(struct inode *inode)
 	__le32 nr = 0;
 	int n = 0;
 	ext4_lblk_t last_block, max_block;
+	loff_t page_len;
 	unsigned blocksize = inode->i_sb->s_blocksize;
+	int err;
 
 	handle = start_transaction(inode);
 	if (IS_ERR(handle))
@@ -1349,9 +1351,19 @@ void ext4_ind_truncate(struct inode *inode)
 	max_block = (EXT4_SB(inode->i_sb)->s_bitmap_maxbytes + blocksize-1)
 					>> EXT4_BLOCK_SIZE_BITS(inode->i_sb);
 
-	if (inode->i_size & (blocksize - 1))
-		if (ext4_block_truncate_page(handle, mapping, inode->i_size))
+
+	if (inode->i_size % PAGE_CACHE_SIZE != 0) {
+
+		page_len = PAGE_CACHE_SIZE -
+			(inode->i_size & (PAGE_CACHE_SIZE - 1));
+
+		err = ext4_discard_partial_page_buffers(handle,
+			mapping, inode->i_size, page_len, 0);
+
+		if (err)
 			goto out_stop;
+	}
+
 
 	if (last_block != max_block) {
 		n = ext4_block_to_path(inode, last_block, offsets, NULL);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/5 v6] ext4: fix partial page writes
  2011-08-24 19:07 [PATCH 0/5 v6] ext4: fix 1k block bugs Allison Henderson
                   ` (3 preceding siblings ...)
  2011-08-24 19:07 ` [PATCH 4/5 v6] ext4: fix fsx truncate failure Allison Henderson
@ 2011-08-24 19:07 ` Allison Henderson
  2011-08-26 15:47   ` Ted Ts'o
  4 siblings, 1 reply; 10+ messages in thread
From: Allison Henderson @ 2011-08-24 19:07 UTC (permalink / raw)
  To: linux-ext4; +Cc: Allison Henderson

While running extended fsx tests to verify the preceeding patches,
a similar bug was also found in the write operation

When ever a write operation begins or ends in a hole,
or extends EOF, the partial page contained in the hole
or beyond EOF needs to be zeroed out.

To correct this the new ext4_discard_partial_page_buffers_no_lock
routine is used to zero out the partial page, but only for buffer
heads that are already unmapped.

Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>
---
:100644 100644 6328428... 1674fa7... M	fs/ext4/inode.c
 fs/ext4/inode.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6328428..1674fa7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2214,6 +2214,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 	pgoff_t index;
 	struct inode *inode = mapping->host;
 	handle_t *handle;
+	loff_t page_len;
 
 	index = pos >> PAGE_CACHE_SHIFT;
 
@@ -2260,6 +2261,13 @@ retry:
 		 */
 		if (pos + len > inode->i_size)
 			ext4_truncate_failed_write(inode);
+	} else {
+		page_len = pos & (PAGE_CACHE_SIZE - 1);
+		if (page_len > 0) {
+			ret = ext4_discard_partial_page_buffers_no_lock(handle,
+				inode, page, pos - page_len, page_len,
+				EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED);
+		}
 	}
 
 	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
@@ -2302,6 +2310,7 @@ static int ext4_da_write_end(struct file *file,
 	loff_t new_i_size;
 	unsigned long start, end;
 	int write_mode = (int)(unsigned long)fsdata;
+	loff_t page_len;
 
 	if (write_mode == FALL_BACK_TO_NONDELALLOC) {
 		if (ext4_should_order_data(inode)) {
@@ -2350,6 +2359,16 @@ static int ext4_da_write_end(struct file *file,
 	}
 	ret2 = generic_write_end(file, mapping, pos, len, copied,
 							page, fsdata);
+
+	page_len = PAGE_CACHE_SIZE -
+			((pos + copied - 1) & (PAGE_CACHE_SIZE - 1));
+
+	if (page_len > 0) {
+		ret = ext4_discard_partial_page_buffers_no_lock(handle,
+			inode, page, pos + copied - 1, page_len,
+			EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED);
+	}
+
 	copied = ret2;
 	if (ret2 < 0)
 		ret = ret2;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5 v6] ext4: fix partial page writes
  2011-08-24 19:07 ` [PATCH 5/5 v6] ext4: fix partial page writes Allison Henderson
@ 2011-08-26 15:47   ` Ted Ts'o
  2011-08-26 22:46     ` Allison Henderson
  0 siblings, 1 reply; 10+ messages in thread
From: Ted Ts'o @ 2011-08-26 15:47 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-ext4

On Wed, Aug 24, 2011 at 12:07:22PM -0700, Allison Henderson wrote:
> While running extended fsx tests to verify the preceeding patches,
> a similar bug was also found in the write operation
> 
> When ever a write operation begins or ends in a hole,
> or extends EOF, the partial page contained in the hole
> or beyond EOF needs to be zeroed out.
> 
> To correct this the new ext4_discard_partial_page_buffers_no_lock
> routine is used to zero out the partial page, but only for buffer
> heads that are already unmapped.
> 
> Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>

I haven't had a chance to debug this yet, but FYI, with this patch
applied, xfstests #224 causes OOM's to result when running under KVM
with 1024megs.  I tried doubling the memory to 2048 megs, and it
passed in 4k standard ext4 mode, but then got hit with the OOM killer
on test #224 in ext3 emulation mode (nodealloc, and with extents not
enabled in the file system).  Without this patch, and the first four
patches in the v6 patch series applied, xfstests #224 passes in all of
the standard configurations (4k, ext3, 1k, data=journal, etc.)

I don't know why this is happening, but I'll try to dig into it in the
next day or two.  Any thoughts or fixes you might suggest would be
much appreciated.

						- Ted

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 5/5 v6] ext4: fix partial page writes
  2011-08-26 15:47   ` Ted Ts'o
@ 2011-08-26 22:46     ` Allison Henderson
  0 siblings, 0 replies; 10+ messages in thread
From: Allison Henderson @ 2011-08-26 22:46 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: linux-ext4

On 08/26/2011 08:47 AM, Ted Ts'o wrote:
> On Wed, Aug 24, 2011 at 12:07:22PM -0700, Allison Henderson wrote:
>> While running extended fsx tests to verify the preceeding patches,
>> a similar bug was also found in the write operation
>>
>> When ever a write operation begins or ends in a hole,
>> or extends EOF, the partial page contained in the hole
>> or beyond EOF needs to be zeroed out.
>>
>> To correct this the new ext4_discard_partial_page_buffers_no_lock
>> routine is used to zero out the partial page, but only for buffer
>> heads that are already unmapped.
>>
>> Signed-off-by: Allison Henderson<achender@linux.vnet.ibm.com>
>
> I haven't had a chance to debug this yet, but FYI, with this patch
> applied, xfstests #224 causes OOM's to result when running under KVM
> with 1024megs.  I tried doubling the memory to 2048 megs, and it
> passed in 4k standard ext4 mode, but then got hit with the OOM killer
> on test #224 in ext3 emulation mode (nodealloc, and with extents not
> enabled in the file system).  Without this patch, and the first four
> patches in the v6 patch series applied, xfstests #224 passes in all of
> the standard configurations (4k, ext3, 1k, data=journal, etc.)
>
> I don't know why this is happening, but I'll try to dig into it in the
> next day or two.  Any thoughts or fixes you might suggest would be
> much appreciated.
>
> 						- Ted

Alrighty then, I will look into this one and see if I can figure out 
whats going on.  Thx again!

Allison Henderson

> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines
  2011-08-24 19:07 ` [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines Allison Henderson
@ 2011-08-27  4:06   ` Ted Ts'o
  2011-08-28  2:22     ` Allison Henderson
  0 siblings, 1 reply; 10+ messages in thread
From: Ted Ts'o @ 2011-08-27  4:06 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-ext4

On Wed, Aug 24, 2011 at 12:07:18PM -0700, Allison Henderson wrote:
> +	while (pos < offset + length) {
> +		err = 0;
> +
> +		/* The length of space left to zero and unmap */
> +		range_to_discard = offset + length - pos;
> +
> +		/* The length of space until the end of the block */
> +		end_of_block = blocksize - (pos & (blocksize-1));
> +
> +		/*
> +		 * Do not unmap or zero past end of block
> +		 * for this buffer head
> +		 */
> +		if (range_to_discard > end_of_block)
> +			range_to_discard = end_of_block;
> +
> +
> +		/*
> +		 * Skip this buffer head if we are only zeroing unampped
> +		 * regions of the page
> +		 */
> +		if (flags & EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED &&
> +			buffer_mapped(bh))
> +				goto next;
> +


You should move this bit of code here:

		/* If the range is block aligned, unmap */
		if (range_to_discard == blocksize) {
			clear_buffer_dirty(bh);
			bh->b_bdev = NULL;
			clear_buffer_mapped(bh);
			clear_buffer_req(bh);
			clear_buffer_new(bh);
			clear_buffer_delay(bh);
			clear_buffer_unwritten(bh);
			clear_buffer_uptodate(bh);

and add these two lines:
+			zero_user(page, pos, blocksize);
+			goto next;

		}

Why?  Because if the range is block aligned, all you have to do is
unmap the buffer and call zero_user() just in case the page was
mmap'ed into some process's address space.  You don't want to mark the
block dirty --- in fact, if the buffer was already unmapped, you'll
trigger a WARN_ON in fs/buffer.c in mark_buffer_dirty() --- which is
how I noticed the problem and decided to look more closely at this bit
of code.   

You also don't want to engage the journaling machinery and journal the
data block in data=journalling mode, or to put the inode on the
data=ordered writeback list just because of this write.  That's just
wasted work.

If you do this, then you also don't need the conditional below:

> +		/*
> +		 * If this block is not completely contained in the range
> +		 * to be discarded, then it is not going to be released. Because
> +		 * we need to keep this block, we need to make sure this part
> +		 * of the page is uptodate before we modify it by writeing
> +		 * partial zeros on it.
> +		 */
> +		if (range_to_discard != blocksize) {

... which will also reduce a level of indentation, in the code, which
is good.

						- Ted

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines
  2011-08-27  4:06   ` Ted Ts'o
@ 2011-08-28  2:22     ` Allison Henderson
  0 siblings, 0 replies; 10+ messages in thread
From: Allison Henderson @ 2011-08-28  2:22 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: linux-ext4

On 08/26/2011 09:06 PM, Ted Ts'o wrote:
> On Wed, Aug 24, 2011 at 12:07:18PM -0700, Allison Henderson wrote:
>> +	while (pos<  offset + length) {
>> +		err = 0;
>> +
>> +		/* The length of space left to zero and unmap */
>> +		range_to_discard = offset + length - pos;
>> +
>> +		/* The length of space until the end of the block */
>> +		end_of_block = blocksize - (pos&  (blocksize-1));
>> +
>> +		/*
>> +		 * Do not unmap or zero past end of block
>> +		 * for this buffer head
>> +		 */
>> +		if (range_to_discard>  end_of_block)
>> +			range_to_discard = end_of_block;
>> +
>> +
>> +		/*
>> +		 * Skip this buffer head if we are only zeroing unampped
>> +		 * regions of the page
>> +		 */
>> +		if (flags&  EXT4_DSCRD_PARTIAL_PG_ZERO_UNMAPED&&
>> +			buffer_mapped(bh))
>> +				goto next;
>> +
>
>
> You should move this bit of code here:
>
> 		/* If the range is block aligned, unmap */
> 		if (range_to_discard == blocksize) {
> 			clear_buffer_dirty(bh);
> 			bh->b_bdev = NULL;
> 			clear_buffer_mapped(bh);
> 			clear_buffer_req(bh);
> 			clear_buffer_new(bh);
> 			clear_buffer_delay(bh);
> 			clear_buffer_unwritten(bh);
> 			clear_buffer_uptodate(bh);
>
> and add these two lines:
> +			zero_user(page, pos, blocksize);
> +			goto next;
>
> 		}
>
> Why?  Because if the range is block aligned, all you have to do is
> unmap the buffer and call zero_user() just in case the page was
> mmap'ed into some process's address space.  You don't want to mark the
> block dirty --- in fact, if the buffer was already unmapped, you'll
> trigger a WARN_ON in fs/buffer.c in mark_buffer_dirty() --- which is
> how I noticed the problem and decided to look more closely at this bit
> of code.
>
> You also don't want to engage the journaling machinery and journal the
> data block in data=journalling mode, or to put the inode on the
> data=ordered writeback list just because of this write.  That's just
> wasted work.
>
> If you do this, then you also don't need the conditional below:
>
>> +		/*
>> +		 * If this block is not completely contained in the range
>> +		 * to be discarded, then it is not going to be released. Because
>> +		 * we need to keep this block, we need to make sure this part
>> +		 * of the page is uptodate before we modify it by writeing
>> +		 * partial zeros on it.
>> +		 */
>> +		if (range_to_discard != blocksize) {
>
> ... which will also reduce a level of indentation, in the code, which
> is good.
>
> 						- Ted


Alrighty, I have updated my current copy of the set.  I am still working 
trying to narrow down where the OOM bug is coming from.  I noticed that 
I started getting OOM bugs when I updated my kernel sand box to the 
latest code.  But I didnt get them while running the punch hole tests, 
or even while the set was applied  (it was actually a script I wrote for 
secure delete that dumps an image and analyzes it.  I probably just need 
to make the image smaller, but it didnt used to do that).

But since you mention OOM problems, my first thought was to reproduce 
the problem with xfstest 224, and then see if the problem goes away on 
the down level kernel (3.0.0-next-20110725), since Im not immediately 
seeing anything in this set that would use up so much memory. 
Unfortunately, I havent gotten the 224 OOM bug to happen for me yet, but 
I will keep you posted if I find anything.   Thx again for helping me 
out with this set!  :)

Allison Henderson

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-08-28  2:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-24 19:07 [PATCH 0/5 v6] ext4: fix 1k block bugs Allison Henderson
2011-08-24 19:07 ` [PATCH 1/5 v6] ext4: Add new ext4_discard_partial_page_buffers routines Allison Henderson
2011-08-27  4:06   ` Ted Ts'o
2011-08-28  2:22     ` Allison Henderson
2011-08-24 19:07 ` [PATCH 2/5 v6] ext4: fix xfstests 75, 112, 127 punch hole failure Allison Henderson
2011-08-24 19:07 ` [PATCH 3/5 v6] ext4: fix 2nd xfstests " Allison Henderson
2011-08-24 19:07 ` [PATCH 4/5 v6] ext4: fix fsx truncate failure Allison Henderson
2011-08-24 19:07 ` [PATCH 5/5 v6] ext4: fix partial page writes Allison Henderson
2011-08-26 15:47   ` Ted Ts'o
2011-08-26 22:46     ` Allison Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.