All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] fs: Provide function to unmap metadata for a range of blocks
@ 2016-10-21 11:55 Jan Kara
  2016-10-21 11:55 ` [PATCH 1/4] " Jan Kara
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Jan Kara @ 2016-10-21 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-ext4, Jan Kara

Hello,

I've noticed that in several places we need to unmap metadata in buffer cache
for a range of blocks and we do it by iterating over all blocks in given
range. Let's provide a helper function for that and implement it in a way
more efficient for larger ranges or blocks. The patches passed xfstests
for ext2 and ext4.

Jens, can you merge these patches if they look fine to you?

								Honza

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/4] fs: Provide function to unmap metadata for a range of blocks
  2016-10-21 11:55 [PATCH 0/4] fs: Provide function to unmap metadata for a range of blocks Jan Kara
@ 2016-10-21 11:55 ` Jan Kara
  2016-10-21 12:05   ` Christoph Hellwig
  2016-10-21 11:55 ` [PATCH 2/4] direct-io: Use unmap_underlying_metadata_ext() instead of handmade iteration Jan Kara
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2016-10-21 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-ext4, Jan Kara

Provide function equivalent to unmap_underlying_metadata() for a range
of blocks. We somewhat optimize the function to use pagevec lookups
instead of looking up buffer heads one by one and use page lock to pin
buffer heads instead of mapping's private_lock to improve scalability.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/buffer.c                 | 62 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/buffer_head.h |  2 ++
 2 files changed, 64 insertions(+)

diff --git a/fs/buffer.c b/fs/buffer.c
index b205a629001d..26e2953555e9 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -43,6 +43,7 @@
 #include <linux/bitops.h>
 #include <linux/mpage.h>
 #include <linux/bit_spinlock.h>
+#include <linux/pagevec.h>
 #include <trace/events/block.h>
 
 static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
@@ -1637,6 +1638,67 @@ void unmap_underlying_metadata(struct block_device *bdev, sector_t block)
 EXPORT_SYMBOL(unmap_underlying_metadata);
 
 /*
+ * Functionally, this is like unmap_underlying_metadata() for a range of
+ * blocks. It is implemented to be more efficient for larger ranges of blocks
+ * though.
+ */
+void unmap_underlying_metadata_ext(struct block_device *bdev, sector_t block,
+				   sector_t len)
+{
+	struct inode *bd_inode = bdev->bd_inode;
+	struct address_space *bd_mapping = bd_inode->i_mapping;
+	struct pagevec pvec;
+	pgoff_t index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
+	pgoff_t end;
+	int i;
+	struct buffer_head *bh;
+	struct buffer_head *head;
+
+	end = (block + len - 1) >> (PAGE_SHIFT - bd_inode->i_blkbits);
+	pagevec_init(&pvec, 0);
+	while (index <= end && pagevec_lookup(&pvec, bd_mapping, index,
+			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1)) {
+		for (i = 0; i < pagevec_count(&pvec); i++) {
+			struct page *page = pvec.pages[i];
+
+			index = page->index;
+			if (index > end)
+				break;
+			if (!page_has_buffers(page))
+				continue;
+			/*
+			 * We use page lock instead of bd_mapping->private_lock
+			 * to pin buffers here since we can afford to sleep and
+			 * it scales better than a global spinlock lock.
+			 */
+			lock_page(page);
+			/* Recheck when the page is locked which pins bhs */
+			if (!page_has_buffers(page))
+				goto unlock_page;
+			head = page_buffers(page);
+			bh = head;
+			do {
+				if (!buffer_mapped(bh))
+					goto next;
+				if (bh->b_blocknr >= block + len)
+					break;
+				clear_buffer_dirty(bh);
+				wait_on_buffer(bh);
+				clear_buffer_req(bh);
+next:
+				bh = bh->b_this_page;
+			} while (bh != head);
+unlock_page:
+			unlock_page(page);
+		}
+		pagevec_release(&pvec);
+		cond_resched();
+		index++;
+	}
+}
+EXPORT_SYMBOL(unmap_underlying_metadata_ext);
+
+/*
  * Size is a power-of-two in the range 512..PAGE_SIZE,
  * and the case we care about most is PAGE_SIZE.
  *
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index ebbacd14d450..9f2dd6c2f81a 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -169,6 +169,8 @@ void invalidate_inode_buffers(struct inode *);
 int remove_inode_buffers(struct inode *inode);
 int sync_mapping_buffers(struct address_space *mapping);
 void unmap_underlying_metadata(struct block_device *bdev, sector_t block);
+void unmap_underlying_metadata_ext(struct block_device *bdev, sector_t block,
+				   sector_t len);
 
 void mark_buffer_async_write(struct buffer_head *bh);
 void __wait_on_buffer(struct buffer_head *);
-- 
2.6.6


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/4] direct-io: Use unmap_underlying_metadata_ext() instead of handmade iteration
  2016-10-21 11:55 [PATCH 0/4] fs: Provide function to unmap metadata for a range of blocks Jan Kara
  2016-10-21 11:55 ` [PATCH 1/4] " Jan Kara
@ 2016-10-21 11:55 ` Jan Kara
  2016-10-21 11:55 ` [PATCH 3/4] ext4: Use unmap_underlying_metadata_ext() instead of iteration Jan Kara
  2016-10-21 11:55 ` [PATCH 4/4] ext2: " Jan Kara
  3 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2016-10-21 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-ext4, Jan Kara

Use new provided function instead of an iteration through all allocated
blocks.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/direct-io.c | 28 +++++++---------------------
 1 file changed, 7 insertions(+), 21 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index fb9aa16a7727..656b9bc2112d 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -843,24 +843,6 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
 }
 
 /*
- * Clean any dirty buffers in the blockdev mapping which alias newly-created
- * file blocks.  Only called for S_ISREG files - blockdevs do not set
- * buffer_new
- */
-static void clean_blockdev_aliases(struct dio *dio, struct buffer_head *map_bh)
-{
-	unsigned i;
-	unsigned nblocks;
-
-	nblocks = map_bh->b_size >> dio->inode->i_blkbits;
-
-	for (i = 0; i < nblocks; i++) {
-		unmap_underlying_metadata(map_bh->b_bdev,
-					  map_bh->b_blocknr + i);
-	}
-}
-
-/*
  * If we are not writing the entire block and get_block() allocated
  * the block for us, we need to fill-in the unused portion of the
  * block with zeros. This happens only if user-buffer, fileoffset or
@@ -960,11 +942,15 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 					goto do_holes;
 
 				sdio->blocks_available =
-						map_bh->b_size >> sdio->blkbits;
+						map_bh->b_size >> blkbits;
 				sdio->next_block_for_io =
 					map_bh->b_blocknr << sdio->blkfactor;
-				if (buffer_new(map_bh))
-					clean_blockdev_aliases(dio, map_bh);
+				if (buffer_new(map_bh)) {
+					unmap_underlying_metadata_ext(
+						map_bh->b_bdev,
+						map_bh->b_blocknr,
+						map_bh->b_size >> blkbits);
+				}
 
 				if (!sdio->blkfactor)
 					goto do_holes;
-- 
2.6.6


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/4] ext4: Use unmap_underlying_metadata_ext() instead of iteration
  2016-10-21 11:55 [PATCH 0/4] fs: Provide function to unmap metadata for a range of blocks Jan Kara
  2016-10-21 11:55 ` [PATCH 1/4] " Jan Kara
  2016-10-21 11:55 ` [PATCH 2/4] direct-io: Use unmap_underlying_metadata_ext() instead of handmade iteration Jan Kara
@ 2016-10-21 11:55 ` Jan Kara
  2016-10-21 11:55 ` [PATCH 4/4] ext2: " Jan Kara
  3 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2016-10-21 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-ext4, Jan Kara

Use unmap_underlying_metadata_ext() instead of iterating through blocks
one by one.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/extents.c | 10 +---------
 fs/ext4/inode.c   | 13 ++++---------
 2 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index c930a0110fb4..0cd47355d919 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3777,14 +3777,6 @@ static int ext4_convert_unwritten_extents_endio(handle_t *handle,
 	return err;
 }
 
-static void unmap_underlying_metadata_blocks(struct block_device *bdev,
-			sector_t block, int count)
-{
-	int i;
-	for (i = 0; i < count; i++)
-                unmap_underlying_metadata(bdev, block + i);
-}
-
 /*
  * Handle EOFBLOCKS_FL flag, clearing it if necessary
  */
@@ -4121,7 +4113,7 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode,
 	 * new.
 	 */
 	if (allocated > map->m_len) {
-		unmap_underlying_metadata_blocks(inode->i_sb->s_bdev,
+		unmap_underlying_metadata_ext(inode->i_sb->s_bdev,
 					newblock + map->m_len,
 					allocated - map->m_len);
 		allocated = map->m_len;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9c064727ed62..e2046742a564 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -654,12 +654,9 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 		if (flags & EXT4_GET_BLOCKS_ZERO &&
 		    map->m_flags & EXT4_MAP_MAPPED &&
 		    map->m_flags & EXT4_MAP_NEW) {
-			ext4_lblk_t i;
-
-			for (i = 0; i < map->m_len; i++) {
-				unmap_underlying_metadata(inode->i_sb->s_bdev,
-							  map->m_pblk + i);
-			}
+			unmap_underlying_metadata_ext(inode->i_sb->s_bdev,
+						      map->m_pblk,
+						      map->m_len);
 			ret = ext4_issue_zeroout(inode, map->m_lblk,
 						 map->m_pblk, map->m_len);
 			if (ret) {
@@ -2361,10 +2358,8 @@ static int mpage_map_one_extent(handle_t *handle, struct mpage_da_data *mpd)
 	BUG_ON(map->m_len == 0);
 	if (map->m_flags & EXT4_MAP_NEW) {
 		struct block_device *bdev = inode->i_sb->s_bdev;
-		int i;
 
-		for (i = 0; i < map->m_len; i++)
-			unmap_underlying_metadata(bdev, map->m_pblk + i);
+		unmap_underlying_metadata_ext(bdev, map->m_pblk, map->m_len);
 	}
 	return 0;
 }
-- 
2.6.6


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/4] ext2: Use unmap_underlying_metadata_ext() instead of iteration
  2016-10-21 11:55 [PATCH 0/4] fs: Provide function to unmap metadata for a range of blocks Jan Kara
                   ` (2 preceding siblings ...)
  2016-10-21 11:55 ` [PATCH 3/4] ext4: Use unmap_underlying_metadata_ext() instead of iteration Jan Kara
@ 2016-10-21 11:55 ` Jan Kara
  3 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2016-10-21 11:55 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, linux-ext4, Jan Kara

Use unmap_underlying_metadata_ext() instead of iterating through blocks
one by one.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext2/inode.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index d831e24dc885..335cd1e1f902 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -732,16 +732,13 @@ static int ext2_get_blocks(struct inode *inode,
 	}
 
 	if (IS_DAX(inode)) {
-		int i;
-
 		/*
 		 * We must unmap blocks before zeroing so that writeback cannot
 		 * overwrite zeros with stale data from block device page cache.
 		 */
-		for (i = 0; i < count; i++) {
-			unmap_underlying_metadata(inode->i_sb->s_bdev,
-					le32_to_cpu(chain[depth-1].key) + i);
-		}
+		unmap_underlying_metadata_ext(inode->i_sb->s_bdev,
+					      le32_to_cpu(chain[depth-1].key),
+					      count);
 		/*
 		 * block must be initialised before we put it in the tree
 		 * so that it's not found by another thread before it's
-- 
2.6.6


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/4] fs: Provide function to unmap metadata for a range of blocks
  2016-10-21 11:55 ` [PATCH 1/4] " Jan Kara
@ 2016-10-21 12:05   ` Christoph Hellwig
  2016-10-24 11:45     ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2016-10-21 12:05 UTC (permalink / raw)
  To: Jan Kara; +Cc: Jens Axboe, linux-fsdevel, linux-ext4

> + * Functionally, this is like unmap_underlying_metadata() for a range of
> + * blocks. It is implemented to be more efficient for larger ranges of blocks
> + * though.
> + */
> +void unmap_underlying_metadata_ext(struct block_device *bdev, sector_t block,
> +				   sector_t len)

Please explain what it does and why you'd call it.  And while we're
naming I think the 'metadata' part is highly confusing.  What it does
is to clear buffers from the block device mapping, nothing about
metadata really.

So how about unmap_buffers_range or similar?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/4] fs: Provide function to unmap metadata for a range of blocks
  2016-10-21 12:05   ` Christoph Hellwig
@ 2016-10-24 11:45     ` Jan Kara
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2016-10-24 11:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jan Kara, Jens Axboe, linux-fsdevel, linux-ext4

On Fri 21-10-16 05:05:42, Christoph Hellwig wrote:
> > + * Functionally, this is like unmap_underlying_metadata() for a range of
> > + * blocks. It is implemented to be more efficient for larger ranges of blocks
> > + * though.
> > + */
> > +void unmap_underlying_metadata_ext(struct block_device *bdev, sector_t block,
> > +				   sector_t len)
> 
> Please explain what it does and why you'd call it.  And while we're

OK.

> naming I think the 'metadata' part is highly confusing.  What it does
> is to clear buffers from the block device mapping, nothing about
> metadata really.
> 
> So how about unmap_buffers_range or similar?

I can rename the function but I wanted to be consistent with
unmap_underlying_metadata() function. It seems strange to have a function
for a single block and a function for a range of blocks with very different
names...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-10-24 11:45 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-21 11:55 [PATCH 0/4] fs: Provide function to unmap metadata for a range of blocks Jan Kara
2016-10-21 11:55 ` [PATCH 1/4] " Jan Kara
2016-10-21 12:05   ` Christoph Hellwig
2016-10-24 11:45     ` Jan Kara
2016-10-21 11:55 ` [PATCH 2/4] direct-io: Use unmap_underlying_metadata_ext() instead of handmade iteration Jan Kara
2016-10-21 11:55 ` [PATCH 3/4] ext4: Use unmap_underlying_metadata_ext() instead of iteration Jan Kara
2016-10-21 11:55 ` [PATCH 4/4] ext2: " Jan Kara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.