All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor
@ 2020-12-18  5:16 Qu Wenruo
  2020-12-18  5:17 ` [PATCH 1/2] btrfs: make btrfs_dio_private::bytes to be u32 Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Qu Wenruo @ 2020-12-18  5:16 UTC (permalink / raw)
  To: linux-btrfs

This small patchset is btrfs_dec_test_*_ordered_extent() refactor during
subpage RW support development.

This is mostly to make btrfs_dev_test_* functions more human readable
and prepare it for calling btrfs_dec_test_first_ordered_extent() in
btrfs_writepage_endio_finish_ordered() where we can have one or more
ordered extents for one bvec.

Qu Wenruo (2):
  btrfs: make btrfs_dio_private::bytes to be u32
  btrfs: refactor btrfs_dec_test_* functions for ordered extents

 fs/btrfs/btrfs_inode.h  |  2 +-
 fs/btrfs/inode.c        |  7 ++-
 fs/btrfs/ordered-data.c | 98 ++++++++++++++++++++++-------------------
 fs/btrfs/ordered-data.h | 10 ++---
 4 files changed, 62 insertions(+), 55 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] btrfs: make btrfs_dio_private::bytes to be u32
  2020-12-18  5:16 [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor Qu Wenruo
@ 2020-12-18  5:17 ` Qu Wenruo
  2020-12-18  5:17 ` [PATCH 2/2] btrfs: refactor btrfs_dec_test_* functions for ordered extents Qu Wenruo
  2020-12-18 15:57 ` [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor David Sterba
  2 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2020-12-18  5:17 UTC (permalink / raw)
  To: linux-btrfs

btrfs_dio_private::bytes is only assigned from bio::bi_iter::bi_size,
which is no larger than U32.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/btrfs_inode.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index d9bf53d9ff90..fbd65807f29d 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -325,7 +325,7 @@ struct btrfs_dio_private {
 	struct inode *inode;
 	u64 logical_offset;
 	u64 disk_bytenr;
-	u64 bytes;
+	u32 bytes;
 
 	/*
 	 * References to this structure. There is one reference per in-flight
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] btrfs: refactor btrfs_dec_test_* functions for ordered extents
  2020-12-18  5:16 [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor Qu Wenruo
  2020-12-18  5:17 ` [PATCH 1/2] btrfs: make btrfs_dio_private::bytes to be u32 Qu Wenruo
@ 2020-12-18  5:17 ` Qu Wenruo
  2020-12-18 15:57 ` [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor David Sterba
  2 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2020-12-18  5:17 UTC (permalink / raw)
  To: linux-btrfs

The refactors involves the following modifications:
- Return bool instead of int

- Reduce the width of @io_size for btrfs_dec_test_ordered_pending()
  Function btrfs_dec_test_ordered_pending() is only supposed to handle
  at most one page, thus u32 for @io_size is enough.
  Also added ASSERT()s for such width reduce.

  For btrfs_dev_test_first_ordered_pending(), btrfs_dio_iomap_end() is
  the only blockage, since its @length is loff_t and I'm not that
  confident that @length can be contained in U32, thus
  btrfs_dev_test_first_ordered_pending() still uses u64 for its @iosize.

- Parameter rename for @cached of btrfs_dec_test_first_ordered_pending()
  For btrfs_dec_test_first_ordered_pending(), @cached is only used to
  return the finished ordered extent.
  Rename it to @finished_ret.

- Comments update
  * Change one stale comment
    Which still refers to btrfs_dec_test_ordered_pending(), but the
    context is calling  btrfs_dec_test_first_ordered_pending().
  * Follow the common comment style for both functions
    Add more detailed descriptions for parameters and the return value
  * Move the reason why test_and_set_bit() is used into the call sites

- Change how the return value is calculated
  The most anti-human part of the return value is:

    if (...)
	ret = 1;
    ...
    return ret == 0;

  This means, when we set ret to 1, the function returns 0.
  Change the local variable name to @finished, and directly return the
  value of it.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c        |  7 ++-
 fs/btrfs/ordered-data.c | 98 ++++++++++++++++++++++-------------------
 fs/btrfs/ordered-data.h | 10 ++---
 3 files changed, 61 insertions(+), 54 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index fa9fbed36ec9..ab110552daef 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2920,6 +2920,7 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start,
 	trace_btrfs_writepage_end_io_hook(page, start, end, uptodate);
 
 	ClearPagePrivate2(page);
+	ASSERT(end - start + 1 < U32_MAX);
 	if (!btrfs_dec_test_ordered_pending(inode, &ordered_extent, start,
 					    end - start + 1, uptodate))
 		return;
@@ -7793,10 +7794,7 @@ static void __endio_write_update_ordered(struct btrfs_inode *inode,
 					NULL);
 			btrfs_queue_work(wq, &ordered->work);
 		}
-		/*
-		 * If btrfs_dec_test_ordered_pending does not find any ordered
-		 * extent in the range, we can exit.
-		 */
+		/* No ordered extent found in the range, exit. */
 		if (ordered_offset == last_offset)
 			return;
 		/*
@@ -8216,6 +8214,7 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 				ordered->truncated_len = new_len;
 			spin_unlock_irq(&tree->lock);
 
+			ASSERT(end - start + 1 < U32_MAX);
 			if (btrfs_dec_test_ordered_pending(inode, &ordered,
 							   start,
 							   end - start + 1, 1)) {
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 79d366a36223..f9dff5c90a27 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -297,26 +297,33 @@ void btrfs_add_ordered_sum(struct btrfs_ordered_extent *entry,
 }
 
 /*
- * this is used to account for finished IO across a given range
- * of the file.  The IO may span ordered extents.  If
- * a given ordered_extent is completely done, 1 is returned, otherwise
- * 0.
+ * Finish io for one ordered extent across a given range.
+ * The range can contain several ordered extents.
  *
- * test_and_set_bit on a flag in the struct btrfs_ordered_extent is used
- * to make sure this function only returns 1 once for a given ordered extent.
+ * @found_ret:	 Return the finished ordered extent
+ * @file_offset: File offset for the finished io
+ * 		 Will also be updated to one byte past the range that is
+ * 		 recordered as finished. This allows caller to walk forward.
+ * @io_size:	 Length of the finish io range
+ * @uptodate:	 If the IO finishes without problem.
  *
- * file_offset is updated to one byte past the range that is recorded as
- * complete.  This allows you to walk forward in the file.
+ * Return true if any ordered extent is finished in the range, and update
+ * @found_ret and @file_offset.
+ * Return false otherwise.
+ *
+ * NOTE: Although The range can cross multiple ordered extents, only one
+ * ordered extent will be updated during one call. The caller is responsible
+ * to iterate all ordered extents in the range.
  */
-int btrfs_dec_test_first_ordered_pending(struct btrfs_inode *inode,
-				   struct btrfs_ordered_extent **cached,
+bool btrfs_dec_test_first_ordered_pending(struct btrfs_inode *inode,
+				   struct btrfs_ordered_extent **finished_ret,
 				   u64 *file_offset, u64 io_size, int uptodate)
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct btrfs_ordered_inode_tree *tree = &inode->ordered_tree;
 	struct rb_node *node;
 	struct btrfs_ordered_extent *entry = NULL;
-	int ret;
+	bool finished = false;
 	unsigned long flags;
 	u64 dec_end;
 	u64 dec_start;
@@ -324,16 +331,12 @@ int btrfs_dec_test_first_ordered_pending(struct btrfs_inode *inode,
 
 	spin_lock_irqsave(&tree->lock, flags);
 	node = tree_search(tree, *file_offset);
-	if (!node) {
-		ret = 1;
+	if (!node)
 		goto out;
-	}
 
 	entry = rb_entry(node, struct btrfs_ordered_extent, rb_node);
-	if (!offset_in_entry(entry, *file_offset)) {
-		ret = 1;
+	if (!offset_in_entry(entry, *file_offset))
 		goto out;
-	}
 
 	dec_start = max(*file_offset, entry->file_offset);
 	dec_end = min(*file_offset + io_size,
@@ -354,39 +357,48 @@ int btrfs_dec_test_first_ordered_pending(struct btrfs_inode *inode,
 		set_bit(BTRFS_ORDERED_IOERR, &entry->flags);
 
 	if (entry->bytes_left == 0) {
-		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
+		/* Ensure only one caller can get true returned. */
+		finished = !test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
 		/* test_and_set_bit implies a barrier */
 		cond_wake_up_nomb(&entry->wait);
-	} else {
-		ret = 1;
 	}
 out:
-	if (!ret && cached && entry) {
-		*cached = entry;
+	if (finished && finished_ret && entry) {
+		*finished_ret = entry;
 		refcount_inc(&entry->refs);
 	}
 	spin_unlock_irqrestore(&tree->lock, flags);
-	return ret == 0;
+	return finished;
 }
 
 /*
- * this is used to account for finished IO across a given range
- * of the file.  The IO should not span ordered extents.  If
- * a given ordered_extent is completely done, 1 is returned, otherwise
- * 0.
+ * Finish io for one ordered extent across a given range.
+ * The range can only contain one ordered extent.
+ *
+ * @cached:	 The cached ordered extent.
+ * 		 If not NULL, we can skip the tree search and use the ordered
+ * 		 extent directly.
+ * 		 Will also be used to store the finished ordered extent.
+ * @file_offset: File offset for the finished io
+ * @io_size:	 Length of the finish io range
+ * @uptodate:	 If the IO finishes without problem.
  *
- * test_and_set_bit on a flag in the struct btrfs_ordered_extent is used
- * to make sure this function only returns 1 once for a given ordered extent.
+ * Return true if the ordered extent is finished in the range, and update
+ * @cached.
+ * Return false otherwise.
+ *
+ * NOTE: The range can NOT cross multiple ordered extents.
+ * Thus caller should ensure the range is no larger than one sector.
  */
-int btrfs_dec_test_ordered_pending(struct btrfs_inode *inode,
-				   struct btrfs_ordered_extent **cached,
-				   u64 file_offset, u64 io_size, int uptodate)
+bool btrfs_dec_test_ordered_pending(struct btrfs_inode *inode,
+				    struct btrfs_ordered_extent **cached,
+				    u64 file_offset, u32 io_size, int uptodate)
 {
 	struct btrfs_ordered_inode_tree *tree = &inode->ordered_tree;
 	struct rb_node *node;
 	struct btrfs_ordered_extent *entry = NULL;
 	unsigned long flags;
-	int ret;
+	bool finished = false;
 
 	spin_lock_irqsave(&tree->lock, flags);
 	if (cached && *cached) {
@@ -395,21 +407,18 @@ int btrfs_dec_test_ordered_pending(struct btrfs_inode *inode,
 	}
 
 	node = tree_search(tree, file_offset);
-	if (!node) {
-		ret = 1;
+	if (!node)
 		goto out;
-	}
 
 	entry = rb_entry(node, struct btrfs_ordered_extent, rb_node);
 have_entry:
-	if (!offset_in_entry(entry, file_offset)) {
-		ret = 1;
+	if (!offset_in_entry(entry, file_offset))
 		goto out;
-	}
 
 	if (io_size > entry->bytes_left) {
+		ASSERT(0);
 		btrfs_crit(inode->root->fs_info,
-			   "bad ordered accounting left %llu size %llu",
+			   "bad ordered accounting left %llu size %u",
 		       entry->bytes_left, io_size);
 	}
 	entry->bytes_left -= io_size;
@@ -417,19 +426,18 @@ int btrfs_dec_test_ordered_pending(struct btrfs_inode *inode,
 		set_bit(BTRFS_ORDERED_IOERR, &entry->flags);
 
 	if (entry->bytes_left == 0) {
-		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
+		/* Ensure only one caller can get true returned. */
+		finished = !test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
 		/* test_and_set_bit implies a barrier */
 		cond_wake_up_nomb(&entry->wait);
-	} else {
-		ret = 1;
 	}
 out:
-	if (!ret && cached && entry) {
+	if (finished && cached && entry) {
 		*cached = entry;
 		refcount_inc(&entry->refs);
 	}
 	spin_unlock_irqrestore(&tree->lock, flags);
-	return ret == 0;
+	return finished;
 }
 
 /*
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 0bfa82b58e23..3ee987cd402d 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -152,11 +152,11 @@ btrfs_ordered_inode_tree_init(struct btrfs_ordered_inode_tree *t)
 void btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry);
 void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode,
 				struct btrfs_ordered_extent *entry);
-int btrfs_dec_test_ordered_pending(struct btrfs_inode *inode,
-				   struct btrfs_ordered_extent **cached,
-				   u64 file_offset, u64 io_size, int uptodate);
-int btrfs_dec_test_first_ordered_pending(struct btrfs_inode *inode,
-				   struct btrfs_ordered_extent **cached,
+bool btrfs_dec_test_ordered_pending(struct btrfs_inode *inode,
+				    struct btrfs_ordered_extent **cached,
+				    u64 file_offset, u32 io_size, int uptodate);
+bool btrfs_dec_test_first_ordered_pending(struct btrfs_inode *inode,
+				   struct btrfs_ordered_extent **finished_ret,
 				   u64 *file_offset, u64 io_size,
 				   int uptodate);
 int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor
  2020-12-18  5:16 [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor Qu Wenruo
  2020-12-18  5:17 ` [PATCH 1/2] btrfs: make btrfs_dio_private::bytes to be u32 Qu Wenruo
  2020-12-18  5:17 ` [PATCH 2/2] btrfs: refactor btrfs_dec_test_* functions for ordered extents Qu Wenruo
@ 2020-12-18 15:57 ` David Sterba
  2020-12-19  0:26   ` Qu Wenruo
  2 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2020-12-18 15:57 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Dec 18, 2020 at 01:16:59PM +0800, Qu Wenruo wrote:
> This small patchset is btrfs_dec_test_*_ordered_extent() refactor during
> subpage RW support development.
> 
> This is mostly to make btrfs_dev_test_* functions more human readable
> and prepare it for calling btrfs_dec_test_first_ordered_extent() in
> btrfs_writepage_endio_finish_ordered() where we can have one or more
> ordered extents for one bvec.
> 
> Qu Wenruo (2):
>   btrfs: make btrfs_dio_private::bytes to be u32
>   btrfs: refactor btrfs_dec_test_* functions for ordered extents

The idea makes sense but the patches are IMO in wrong order and still
leave some u64/u32, eg. in btrfs_dec_test_first_ordered_pending the
io_size is still u64 while for btrfs_dec_test_ordered_pending it
switches type to u32 (as expected).

The type cleanup should be done bottom-up, from the leaf functions
upwards. After that, the structure type can be safely switched.

I'm not sure what to do with __endio_write_update_ordered, it can take
u32 for bytes but internally uses u64 for ordered_bytes, that should be
u32 as well. But

 7711                 if (ordered_offset < offset + bytes) {
 7712                         ordered_bytes = offset + bytes - ordered_offset;
 7713                         ordered = NULL;
 7714                 }

expression on line 7712 would need a temporary variable for the u64
calculation and then reassign. Maybe such conversions are inevitable so
we have clean function API and not pass u64 just because.

I like that the hole btrfs_dio_private gets removed so the cleanups are
worthwhile, but maybe not trivial.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor
  2020-12-18 15:57 ` [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor David Sterba
@ 2020-12-19  0:26   ` Qu Wenruo
  2020-12-22  5:37     ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2020-12-19  0:26 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2020/12/18 下午11:57, David Sterba wrote:
> On Fri, Dec 18, 2020 at 01:16:59PM +0800, Qu Wenruo wrote:
>> This small patchset is btrfs_dec_test_*_ordered_extent() refactor during
>> subpage RW support development.
>>
>> This is mostly to make btrfs_dev_test_* functions more human readable
>> and prepare it for calling btrfs_dec_test_first_ordered_extent() in
>> btrfs_writepage_endio_finish_ordered() where we can have one or more
>> ordered extents for one bvec.
>>
>> Qu Wenruo (2):
>>    btrfs: make btrfs_dio_private::bytes to be u32
>>    btrfs: refactor btrfs_dec_test_* functions for ordered extents
>
> The idea makes sense but the patches are IMO in wrong order and still
> leave some u64/u32, eg. in btrfs_dec_test_first_ordered_pending the
> io_size is still u64 while for btrfs_dec_test_ordered_pending it
> switches type to u32 (as expected).

That u64 is left there and the reason is explained in the 2nd patch.

Mostly due to iomap requirement.

But I totally get your point.

Thanks,
Qu

>
> The type cleanup should be done bottom-up, from the leaf functions
> upwards. After that, the structure type can be safely switched.
>
> I'm not sure what to do with __endio_write_update_ordered, it can take
> u32 for bytes but internally uses u64 for ordered_bytes, that should be
> u32 as well. But
>
>   7711                 if (ordered_offset < offset + bytes) {
>   7712                         ordered_bytes = offset + bytes - ordered_offset;
>   7713                         ordered = NULL;
>   7714                 }
>
> expression on line 7712 would need a temporary variable for the u64
> calculation and then reassign. Maybe such conversions are inevitable so
> we have clean function API and not pass u64 just because.
>
> I like that the hole btrfs_dio_private gets removed so the cleanups are
> worthwhile, but maybe not trivial.
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor
  2020-12-19  0:26   ` Qu Wenruo
@ 2020-12-22  5:37     ` Qu Wenruo
  0 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2020-12-22  5:37 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs



On 2020/12/19 上午8:26, Qu Wenruo wrote:
>
>
> On 2020/12/18 下午11:57, David Sterba wrote:
>> On Fri, Dec 18, 2020 at 01:16:59PM +0800, Qu Wenruo wrote:
>>> This small patchset is btrfs_dec_test_*_ordered_extent() refactor during
>>> subpage RW support development.
>>>
>>> This is mostly to make btrfs_dev_test_* functions more human readable
>>> and prepare it for calling btrfs_dec_test_first_ordered_extent() in
>>> btrfs_writepage_endio_finish_ordered() where we can have one or more
>>> ordered extents for one bvec.
>>>
>>> Qu Wenruo (2):
>>>    btrfs: make btrfs_dio_private::bytes to be u32
>>>    btrfs: refactor btrfs_dec_test_* functions for ordered extents
>>
>> The idea makes sense but the patches are IMO in wrong order and still
>> leave some u64/u32, eg. in btrfs_dec_test_first_ordered_pending the
>> io_size is still u64 while for btrfs_dec_test_ordered_pending it
>> switches type to u32 (as expected).
>
> That u64 is left there and the reason is explained in the 2nd patch.
>
> Mostly due to iomap requirement.
>
> But I totally get your point.

After digging more, I prefer to give up the width reduction, but just
focus on the other cleanups in the patch.

The width reduction really needs more concern, especially when iomap is
involved.

Thanks,
Qu
>
> Thanks,
> Qu
>
>>
>> The type cleanup should be done bottom-up, from the leaf functions
>> upwards. After that, the structure type can be safely switched.
>>
>> I'm not sure what to do with __endio_write_update_ordered, it can take
>> u32 for bytes but internally uses u64 for ordered_bytes, that should be
>> u32 as well. But
>>
>>   7711                 if (ordered_offset < offset + bytes) {
>>   7712                         ordered_bytes = offset + bytes -
>> ordered_offset;
>>   7713                         ordered = NULL;
>>   7714                 }
>>
>> expression on line 7712 would need a temporary variable for the u64
>> calculation and then reassign. Maybe such conversions are inevitable so
>> we have clean function API and not pass u64 just because.
>>
>> I like that the hole btrfs_dio_private gets removed so the cleanups are
>> worthwhile, but maybe not trivial.
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-12-22  5:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-18  5:16 [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor Qu Wenruo
2020-12-18  5:17 ` [PATCH 1/2] btrfs: make btrfs_dio_private::bytes to be u32 Qu Wenruo
2020-12-18  5:17 ` [PATCH 2/2] btrfs: refactor btrfs_dec_test_* functions for ordered extents Qu Wenruo
2020-12-18 15:57 ` [PATCH 0/2] btrfs: btrfs_dec_test_*_ordered_extent() refactor David Sterba
2020-12-19  0:26   ` Qu Wenruo
2020-12-22  5:37     ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.