linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] block: optimise bvec_iter_advance()
       [not found] <cover.1575144884.git.asml.silence@gmail.com>
@ 2019-11-30 20:23 ` Pavel Begunkov
  2019-11-30 22:05   ` Arvind Sankar
  2019-12-03 14:37   ` Jens Axboe
  0 siblings, 2 replies; 4+ messages in thread
From: Pavel Begunkov @ 2019-11-30 20:23 UTC (permalink / raw)
  To: Jens Axboe, nivedita, linux-block, linux-kernel

bvec_iter_advance() is quite popular, but compilers fail to do proper
alias analysis and optimise it good enough. The assembly is checked
for gcc 9.2, x86-64.

- remove @iter->bi_size from min(...), as it's always less than @bytes.
Modify at the beginning and forget about it.

- the compiler isn't able to collapse memory dependencies and remove
writes in the loop. Help it by explicitely using local vars.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---

v2: simplify code (Arvind Sankar)

 include/linux/bvec.h | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index a032f01e928c..679a42253170 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -87,26 +87,24 @@ struct bvec_iter_all {
 static inline bool bvec_iter_advance(const struct bio_vec *bv,
 		struct bvec_iter *iter, unsigned bytes)
 {
+	unsigned int idx = iter->bi_idx;
+
 	if (WARN_ONCE(bytes > iter->bi_size,
 		     "Attempted to advance past end of bvec iter\n")) {
 		iter->bi_size = 0;
 		return false;
 	}
 
-	while (bytes) {
-		const struct bio_vec *cur = bv + iter->bi_idx;
-		unsigned len = min3(bytes, iter->bi_size,
-				    cur->bv_len - iter->bi_bvec_done);
-
-		bytes -= len;
-		iter->bi_size -= len;
-		iter->bi_bvec_done += len;
+	iter->bi_size -= bytes;
+	bytes += iter->bi_bvec_done;
 
-		if (iter->bi_bvec_done == cur->bv_len) {
-			iter->bi_bvec_done = 0;
-			iter->bi_idx++;
-		}
+	while (bytes && bytes >= bv[idx].bv_len) {
+		bytes -= bv[idx].bv_len;
+		idx++;
 	}
+
+	iter->bi_idx = idx;
+	iter->bi_bvec_done = bytes;
 	return true;
 }
 
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] block: optimise bvec_iter_advance()
  2019-11-30 20:23 ` [PATCH v2] block: optimise bvec_iter_advance() Pavel Begunkov
@ 2019-11-30 22:05   ` Arvind Sankar
  2019-12-01  0:04     ` Pavel Begunkov
  2019-12-03 14:37   ` Jens Axboe
  1 sibling, 1 reply; 4+ messages in thread
From: Arvind Sankar @ 2019-11-30 22:05 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, nivedita, linux-block, linux-kernel

On Sat, Nov 30, 2019 at 11:23:52PM +0300, Pavel Begunkov wrote:
> bvec_iter_advance() is quite popular, but compilers fail to do proper
> alias analysis and optimise it good enough. The assembly is checked
> for gcc 9.2, x86-64.
> 
> - remove @iter->bi_size from min(...), as it's always less than @bytes.
> Modify at the beginning and forget about it.
> 
> - the compiler isn't able to collapse memory dependencies and remove
> writes in the loop. Help it by explicitely using local vars.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
> 
> v2: simplify code (Arvind Sankar)
> 

Thanks :)

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>

Btw, I discovered that gcc 9.2 doesn't optimize away the second
comparison in something like

	m = min(a,b);
	return m>a;

So the WARN_ONCE bit doesn't get optimized away even in cases like
bio_for_each_bvec where it's guaranteed at compile-time to not trigger.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] block: optimise bvec_iter_advance()
  2019-11-30 22:05   ` Arvind Sankar
@ 2019-12-01  0:04     ` Pavel Begunkov
  0 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2019-12-01  0:04 UTC (permalink / raw)
  To: Arvind Sankar; +Cc: Jens Axboe, linux-block, linux-kernel


[-- Attachment #1.1: Type: text/plain, Size: 1410 bytes --]

On 01/12/2019 01:05, Arvind Sankar wrote:
> On Sat, Nov 30, 2019 at 11:23:52PM +0300, Pavel Begunkov wrote:
>> bvec_iter_advance() is quite popular, but compilers fail to do proper
>> alias analysis and optimise it good enough. The assembly is checked
>> for gcc 9.2, x86-64.
>>
>> - remove @iter->bi_size from min(...), as it's always less than @bytes.
>> Modify at the beginning and forget about it.
>>
>> - the compiler isn't able to collapse memory dependencies and remove
>> writes in the loop. Help it by explicitly using local vars.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>
>> v2: simplify code (Arvind Sankar)
>>
> 
> Thanks :)
> 
> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
> 
> Btw, I discovered that gcc 9.2 doesn't optimize away the second
> comparison in something like

That's unfortunate. IDK, whether it could be easily done in gcc,
but maybe relay it to compiler guys?

> 
> 	m = min(a,b);
> 	return m>a;
> 
> So the WARN_ONCE bit doesn't get optimized away even in cases like
> bio_for_each_bvec where it's guaranteed at compile-time to not trigger.
> 

That's more like a starting point. The idea is to revise and
tune/rewrite iteration helpers including *for_each_bvec(). I assume,
all those are really poorly optimisable (especially with
-fno-strict-aliasing). So, noted to consider

-- 
Pavel Begunkov


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] block: optimise bvec_iter_advance()
  2019-11-30 20:23 ` [PATCH v2] block: optimise bvec_iter_advance() Pavel Begunkov
  2019-11-30 22:05   ` Arvind Sankar
@ 2019-12-03 14:37   ` Jens Axboe
  1 sibling, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2019-12-03 14:37 UTC (permalink / raw)
  To: Pavel Begunkov, nivedita, linux-block, linux-kernel

On 11/30/19 1:23 PM, Pavel Begunkov wrote:
> bvec_iter_advance() is quite popular, but compilers fail to do proper
> alias analysis and optimise it good enough. The assembly is checked
> for gcc 9.2, x86-64.
> 
> - remove @iter->bi_size from min(...), as it's always less than @bytes.
> Modify at the beginning and forget about it.
> 
> - the compiler isn't able to collapse memory dependencies and remove
> writes in the loop. Help it by explicitely using local vars.

Applied, thanks.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-12-03 14:37 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <cover.1575144884.git.asml.silence@gmail.com>
2019-11-30 20:23 ` [PATCH v2] block: optimise bvec_iter_advance() Pavel Begunkov
2019-11-30 22:05   ` Arvind Sankar
2019-12-01  0:04     ` Pavel Begunkov
2019-12-03 14:37   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).