All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized
@ 2013-03-08  8:24 Lukas Czerner
  2013-03-08  8:24 ` [PATCH 2/3] ext4: Update reserved space after the 'correction' Lukas Czerner
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Lukas Czerner @ 2013-03-08  8:24 UTC (permalink / raw)
  To: linux-ext4; +Cc: gnehzuil.liu, Lukas Czerner

Currently in delalloc write we do not reserve any space if we're
writing into the uninitialized extent. This is ok for data, because
the space has already been allocated so we do not have to do data
reservation, however we have to reserve metadata for uninitialized
extent conversion on writeback.

Add new ext4_da_reserve_metadata() function to only reserve metadata
blocks for delayed allocation an use it if we're writing delayed blocks
into unwritten extent to reserve metadata for extent conversion.

The problem can be reproduced with xfstest 083 on bigalloc file system.
With this patch I can not reproduce the problem anymore.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/ext4.h  |    1 +
 fs/ext4/inode.c |   76 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 6e16c18..c20efe2 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -581,6 +581,7 @@ enum {
 #define EXT4_GET_BLOCKS_NO_LOCK			0x0100
 	/* Do not put hole in extent cache */
 #define EXT4_GET_BLOCKS_NO_PUT_HOLE		0x0200
+#define EXT4_GET_BLOCKS_METADATA_RESERVED	0x0400
 
 /*
  * Flags used by ext4_free_blocks
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9ea0cde..c3b88ec 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -606,7 +606,8 @@ found:
 	 * let the underlying get_block() function know to
 	 * avoid double accounting
 	 */
-	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
+	if ((flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) ||
+	    (flags & EXT4_GET_BLOCKS_METADATA_RESERVED))
 		ext4_set_inode_state(inode, EXT4_STATE_DELALLOC_RESERVED);
 	/*
 	 * We need to check for EXT4 here because migrate
@@ -636,7 +637,8 @@ found:
 			(flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE))
 			ext4_da_update_reserve_space(inode, retval, 1);
 	}
-	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
+	if ((flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) ||
+	    (flags & EXT4_GET_BLOCKS_METADATA_RESERVED))
 		ext4_clear_inode_state(inode, EXT4_STATE_DELALLOC_RESERVED);
 
 	if (retval > 0) {
@@ -1215,6 +1217,56 @@ static int ext4_journalled_write_end(struct file *file,
 	return ret ? ret : copied;
 }
 
+
+/*
+ * Reserve a metadata for a single block located at lblock
+ */
+static int ext4_da_reserve_metadata(struct inode *inode, ext4_lblk_t lblock)
+{
+	int retries = 0;
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+	struct ext4_inode_info *ei = EXT4_I(inode);
+	unsigned int md_needed;
+	ext4_lblk_t save_last_lblock;
+	int save_len;
+
+	/*
+	 * recalculate the amount of metadata blocks to reserve
+	 * in order to allocate nrblocks
+	 * worse case is one extent per block
+	 */
+repeat:
+	spin_lock(&ei->i_block_reservation_lock);
+	/*
+	 * ext4_calc_metadata_amount() has side effects, which we have
+	 * to be prepared undo if we fail to claim space.
+	 */
+	save_len = ei->i_da_metadata_calc_len;
+	save_last_lblock = ei->i_da_metadata_calc_last_lblock;
+	md_needed = EXT4_NUM_B2C(sbi,
+				 ext4_calc_metadata_amount(inode, lblock));
+	trace_ext4_da_reserve_space(inode, md_needed);
+
+	/*
+	 * We do still charge estimated metadata to the sb though;
+	 * we cannot afford to run out of free blocks.
+	 */
+	if (ext4_claim_free_clusters(sbi, md_needed, 0)) {
+		ei->i_da_metadata_calc_len = save_len;
+		ei->i_da_metadata_calc_last_lblock = save_last_lblock;
+		spin_unlock(&ei->i_block_reservation_lock);
+		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
+			cond_resched();
+			goto repeat;
+		}
+		return -ENOSPC;
+	}
+	ei->i_reserved_meta_blocks += md_needed;
+	spin_unlock(&ei->i_block_reservation_lock);
+
+	return 0;       /* success */
+}
+
 /*
  * Reserve a single cluster located at lblock
  */
@@ -1601,7 +1653,8 @@ static void mpage_da_map_and_submit(struct mpage_da_data *mpd)
 	 */
 	map.m_lblk = next;
 	map.m_len = max_blocks;
-	get_blocks_flags = EXT4_GET_BLOCKS_CREATE;
+	get_blocks_flags = EXT4_GET_BLOCKS_CREATE |
+			   EXT4_GET_BLOCKS_METADATA_RESERVED;
 	if (ext4_should_dioread_nolock(mpd->inode))
 		get_blocks_flags |= EXT4_GET_BLOCKS_IO_CREATE_EXT;
 	if (mpd->b_state & (1 << BH_Delay))
@@ -1766,7 +1819,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 			      struct buffer_head *bh)
 {
 	struct extent_status es;
-	int retval;
+	int retval, ret;
 	sector_t invalid_block = ~((sector_t) 0xffff);
 
 	if (invalid_block < ext4_blocks_count(EXT4_SB(inode->i_sb)->s_es))
@@ -1804,9 +1857,19 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 		map->m_len = retval;
 		if (ext4_es_is_written(&es))
 			map->m_flags |= EXT4_MAP_MAPPED;
-		else if (ext4_es_is_unwritten(&es))
+		else if (ext4_es_is_unwritten(&es)) {
+			/*
+			 * We have delalloc write into the unwritten extent
+			 * which means that we have to reserve metadata
+			 * potentially required for converting unwritten
+			 * extent.
+			 */
+			ret = ext4_da_reserve_metadata(inode, iblock);
+			if (ret)
+				/* not enough space to reserve */
+				retval = ret;
 			map->m_flags |= EXT4_MAP_UNWRITTEN;
-		else
+		} else
 			BUG_ON(1);
 
 		return retval;
@@ -1838,7 +1901,6 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 
 add_delayed:
 	if (retval == 0) {
-		int ret;
 		/*
 		 * XXX: __block_prepare_write() unmaps passed block,
 		 * is it OK?
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] ext4: Update reserved space after the 'correction'
  2013-03-08  8:24 [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized Lukas Czerner
@ 2013-03-08  8:24 ` Lukas Czerner
  2013-03-11  2:50   ` Theodore Ts'o
  2013-03-08  8:24 ` [PATCH 3/3] ext4: reserve metadata block for every delayed write Lukas Czerner
  2013-03-11  2:49 ` [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized Theodore Ts'o
  2 siblings, 1 reply; 6+ messages in thread
From: Lukas Czerner @ 2013-03-08  8:24 UTC (permalink / raw)
  To: linux-ext4; +Cc: gnehzuil.liu, Lukas Czerner

Currently in ext4_ext_map_blocks() in delayed allocation writeback
we would update the reservation and after that check whether we claimed
cluster outside of the range of the allocation and if so, we'll give the
block back to the reservation pool.

However this also means that if the number of reserved data block
dropped to zero before the correction, we would release all the metadata
reservation as well, however we might still need it because the we're
not done with the delayed allocation and there might be more blocks to
come. This will result in error messages such as:

EXT4-fs warning (device sdb): ext4_da_update_reserve_space:361: ino 12,
allocated 1 with only 0 reserved metadata blocks (releasing 1 blocks
with reserved 1 data blocks)

This will only happen on bigalloc file system and it can be easily
reproduced using fiemap-tester from xfstests like this:

./src/fiemap-tester -m DHDHDHDHD -S -p0 /mnt/test/file

Or using xfstests such as 225.

Fix this by doing the correction first and updating the reservation
after that so that we do not accidentally decrease
i_reserved_data_blocks to zero.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/extents.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 28dd8ee..cac80120 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4106,9 +4106,6 @@ got_allocated_blocks:
 			}
 		} else {
 			BUG_ON(allocated_clusters < reserved_clusters);
-			/* We will claim quota for all newly allocated blocks.*/
-			ext4_da_update_reserve_space(inode, allocated_clusters,
-							1);
 			if (reserved_clusters < allocated_clusters) {
 				struct ext4_inode_info *ei = EXT4_I(inode);
 				int reservation = allocated_clusters -
@@ -4159,6 +4156,15 @@ got_allocated_blocks:
 				ei->i_reserved_data_blocks += reservation;
 				spin_unlock(&ei->i_block_reservation_lock);
 			}
+			/*
+			 * We will claim quota for all newly allocated blocks.
+			 * We're updating the reserved space *after* the
+			 * correction above so we do not accidentally free
+			 * all the metadata reservation because we might
+			 * actually need it later on.
+			 */
+			ext4_da_update_reserve_space(inode, allocated_clusters,
+							1);
 		}
 	}
 
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] ext4: reserve metadata block for every delayed write
  2013-03-08  8:24 [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized Lukas Czerner
  2013-03-08  8:24 ` [PATCH 2/3] ext4: Update reserved space after the 'correction' Lukas Czerner
@ 2013-03-08  8:24 ` Lukas Czerner
  2013-03-11  2:52   ` Theodore Ts'o
  2013-03-11  2:49 ` [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized Theodore Ts'o
  2 siblings, 1 reply; 6+ messages in thread
From: Lukas Czerner @ 2013-03-08  8:24 UTC (permalink / raw)
  To: linux-ext4; +Cc: gnehzuil.liu, Lukas Czerner

Currently we only reserve space (data+metadata) in delayed allocation if
we're allocating from new cluster (which is always in non-bigalloc file
system) which is ok for data blocks, because we reserve whole cluster.

However we have to reserve metadata for every delayed block we're going
to write because every block could potentially require metedata block
when we need to grow the extent tree.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/inode.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c3b88ec..e0d5bc6 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1905,8 +1905,11 @@ add_delayed:
 		 * XXX: __block_prepare_write() unmaps passed block,
 		 * is it OK?
 		 */
-		/* If the block was allocated from previously allocated cluster,
-		 * then we dont need to reserve it again. */
+		/*
+		 * If the block was allocated from previously allocated cluster,
+		 * then we don't need to reserve it again. However we still need
+		 * to reserve metadata for every block we're going to write.
+		 */
 		if (!(map->m_flags & EXT4_MAP_FROM_CLUSTER)) {
 			ret = ext4_da_reserve_space(inode, iblock);
 			if (ret) {
@@ -1914,6 +1917,13 @@ add_delayed:
 				retval = ret;
 				goto out_unlock;
 			}
+		} else {
+			ret = ext4_da_reserve_metadata(inode, iblock);
+			if (ret) {
+				/* not enough space to reserve */
+				retval = ret;
+				goto out_unlock;
+			}
 		}
 
 		ret = ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized
  2013-03-08  8:24 [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized Lukas Czerner
  2013-03-08  8:24 ` [PATCH 2/3] ext4: Update reserved space after the 'correction' Lukas Czerner
  2013-03-08  8:24 ` [PATCH 3/3] ext4: reserve metadata block for every delayed write Lukas Czerner
@ 2013-03-11  2:49 ` Theodore Ts'o
  2 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2013-03-11  2:49 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-ext4, gnehzuil.liu

On Fri, Mar 08, 2013 at 09:24:17AM +0100, Lukas Czerner wrote:
> Currently in delalloc write we do not reserve any space if we're
> writing into the uninitialized extent. This is ok for data, because
> the space has already been allocated so we do not have to do data
> reservation, however we have to reserve metadata for uninitialized
> extent conversion on writeback.
> 
> Add new ext4_da_reserve_metadata() function to only reserve metadata
> blocks for delayed allocation an use it if we're writing delayed blocks
> into unwritten extent to reserve metadata for extent conversion.
> 
> The problem can be reproduced with xfstest 083 on bigalloc file system.
> With this patch I can not reproduce the problem anymore.
> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/3] ext4: Update reserved space after the 'correction'
  2013-03-08  8:24 ` [PATCH 2/3] ext4: Update reserved space after the 'correction' Lukas Czerner
@ 2013-03-11  2:50   ` Theodore Ts'o
  0 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2013-03-11  2:50 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-ext4, gnehzuil.liu

On Fri, Mar 08, 2013 at 09:24:18AM +0100, Lukas Czerner wrote:
> Currently in ext4_ext_map_blocks() in delayed allocation writeback
> we would update the reservation and after that check whether we claimed
> cluster outside of the range of the allocation and if so, we'll give the
> block back to the reservation pool.
> 
> However this also means that if the number of reserved data block
> dropped to zero before the correction, we would release all the metadata
> reservation as well, however we might still need it because the we're
> not done with the delayed allocation and there might be more blocks to
> come. This will result in error messages such as:
> 
> EXT4-fs warning (device sdb): ext4_da_update_reserve_space:361: ino 12,
> allocated 1 with only 0 reserved metadata blocks (releasing 1 blocks
> with reserved 1 data blocks)
> 
> This will only happen on bigalloc file system and it can be easily
> reproduced using fiemap-tester from xfstests like this:
> 
> ./src/fiemap-tester -m DHDHDHDHD -S -p0 /mnt/test/file
> 
> Or using xfstests such as 225.
> 
> Fix this by doing the correction first and updating the reservation
> after that so that we do not accidentally decrease
> i_reserved_data_blocks to zero.
> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>

Thanks, applied.

						- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 3/3] ext4: reserve metadata block for every delayed write
  2013-03-08  8:24 ` [PATCH 3/3] ext4: reserve metadata block for every delayed write Lukas Czerner
@ 2013-03-11  2:52   ` Theodore Ts'o
  0 siblings, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2013-03-11  2:52 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-ext4, gnehzuil.liu

On Fri, Mar 08, 2013 at 09:24:19AM +0100, Lukas Czerner wrote:
> Currently we only reserve space (data+metadata) in delayed allocation if
> we're allocating from new cluster (which is always in non-bigalloc file
> system) which is ok for data blocks, because we reserve whole cluster.
> 
> However we have to reserve metadata for every delayed block we're going
> to write because every block could potentially require metedata block
> when we need to grow the extent tree.
> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-03-11  2:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-08  8:24 [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized Lukas Czerner
2013-03-08  8:24 ` [PATCH 2/3] ext4: Update reserved space after the 'correction' Lukas Czerner
2013-03-11  2:50   ` Theodore Ts'o
2013-03-08  8:24 ` [PATCH 3/3] ext4: reserve metadata block for every delayed write Lukas Czerner
2013-03-11  2:52   ` Theodore Ts'o
2013-03-11  2:49 ` [PATCH 1/3] ext4: Reserve metadata if writing into uninitialized Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.