From: David Laight <David.Laight@ACULAB.COM>
To: 'Linus Torvalds' <torvalds@linux-foundation.org>,
	David Howells <dhowells@redhat.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Jens Axboe <axboe@kernel.dk>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-block <linux-block@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH 01/29] iov_iter: Switch to using a table of operations
Date: Sun, 22 Nov 2020 22:34:37 +0000
Message-ID: <3ba98abf0ddb4f16af7166db201fe9c1@AcuMS.aculab.com>
In-Reply-To: <CAHk-=wggLYmTe5jm7nWvywcNNxUd=Vm4eGFYq8MjNZizpOzBLw@mail.gmail.com>

From: Linus Torvalds
> Sent: 22 November 2020 19:22
> Subject: Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
> 
> On Sun, Nov 22, 2020 at 5:33 AM David Howells <dhowells@redhat.com> wrote:
> >
> > I don't know enough about how spectre v2 works to say if this would be a
> > problem for the ops-table approach, but wouldn't it also affect the chain of
> > conditional branches that we currently use, since it's branch-prediction
> > based?
> 
> No, regular conditional branches aren't a problem. Yes, they may
> mispredict, but outside of a few very rare cases that we handle
> specially, that's not an issue.
> 
> Why? Because they always mispredict to one or the other side, so the
> code flow may be mis-predicted, but it is fairly controlled.
> 
> In contrast, an indirect jump can mispredict the target, and branch
> _anywhere_, and the attack vectors can poison the BTB (branch target
> buffer), so our mitigation for that is that every single indirect
> branch isn't predicted at all (using "retpoline").
> 
> So a conditional branch takes zero cycles when predicted (and most
> will predict quite well). And as David Laight pointed out a compiler
> can also turn a series of conditional branches into a tree, means that
> N conditional branches basically only needs log2(N) conditionals
> executed.

The compiler can convert a switch statement into a branch tree.
But I don't think it can convert the 'if chain' in the current code
to one.
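
As a rough illustration of the difference (the flag names and values
below are made up for this example and are not the ones in the kernel),
compare a chain of bit tests with a switch on the same value. The two
agree only when exactly one flag is set:

```c
/* Hypothetical iterator-type flags; illustrative only. */
enum {
	KIND_IOVEC   = 1 << 0,
	KIND_KVEC    = 1 << 1,
	KIND_BVEC    = 1 << 2,
	KIND_PIPE    = 1 << 3,
	KIND_DISCARD = 1 << 4,
};

/*
 * 'if chain': each bit test is a separate condition checked in source
 * order, so the last kind is only reached after four failed tests.
 */
int pick_if_chain(unsigned int type)
{
	if (type & KIND_IOVEC)
		return 0;
	if (type & KIND_KVEC)
		return 1;
	if (type & KIND_BVEC)
		return 2;
	if (type & KIND_PIPE)
		return 3;
	if (type & KIND_DISCARD)
		return 4;
	return -1;
}

/*
 * switch: the cases are known to be mutually exclusive constants, so
 * the compiler is free to emit a compare tree or a jump table.
 */
int pick_switch(unsigned int type)
{
	switch (type) {
	case KIND_IOVEC:
		return 0;
	case KIND_KVEC:
		return 1;
	case KIND_BVEC:
		return 2;
	case KIND_PIPE:
		return 3;
	case KIND_DISCARD:
		return 4;
	default:
		return -1;
	}
}
```

What a particular compiler actually emits for either form depends on
its version and options, so the generated code is worth checking.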

There is also the problem that some x86 CPUs can't predict branches
if too many fall in the same cache line (or similar).

> In contrast, with retpoline in place, an indirect branch will
> basically always take something like 25-30 cycles, because it always
> mispredicts.

I also wonder whether a retpoline trashes the return stack optimisation.
(If that is ever really a significant gain for real functions.)
 
...
> So this is not in any way "indirect branches are bad". It's more of a
> "indirect branches really aren't necessarily better than a couple of
> conditionals, and _may_ be much worse".

Even without retpolines, the jump table is likely to cause a data-cache
miss (and maybe a TLB miss) unless you are running hot-cache.
That is probably an extra cache miss on top of the I-cache ones.
It is even worse if you end up with the jump table near the code,
since the data cache line and TLB entry might never be shared.

So a very short switch statement is likely to be better as
conditional jumps anyway.
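
A minimal sketch of the two dispatch styles being compared, using
made-up operation names (nothing here is from the patch series). The
table version has to load a function pointer from data before making an
indirect call; the switch version uses only direct calls behind
conditional branches:

```c
/* Hypothetical operations, for illustration only. */
enum op_kind { OP_IDENT, OP_DOUBLE, OP_NEGATE, OP_COUNT };

typedef int (*op_fn)(int);

static int op_ident(int x)  { return x; }
static int op_double(int x) { return x * 2; }
static int op_negate(int x) { return -x; }

/* Ops table: read-only data, indexed at run time. */
static const op_fn op_table[OP_COUNT] = {
	[OP_IDENT]  = op_ident,
	[OP_DOUBLE] = op_double,
	[OP_NEGATE] = op_negate,
};

/* Indirect call: one data load (the pointer) plus an indirect branch. */
int run_via_table(enum op_kind k, int x)
{
	return op_table[k](x);
}

/* Short switch: a few conditional branches and direct calls, no table load. */
int run_via_switch(enum op_kind k, int x)
{
	switch (k) {
	case OP_DOUBLE:
		return op_double(x);
	case OP_NEGATE:
		return op_negate(x);
	default:
		return op_ident(x);
	}
}
```

Both return the same results; the difference is only in how the call
target is reached, and an optimising compiler may still transform
either form.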

> For example, look at this gcc bugzilla:
> 
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
> 
> which basically is about the compiler generating a jump table (is a
> single indirect branch) vs a series of conditional branches. With
> retpoline, the cross-over point is basically when you need to have
> over 10 conditional branches - and because of the log2(N) behavior,
> that's around a thousand cases!

That was a hot-cache test.
Cold-cache is likely to favour the retpoline a little sooner.
(And the retpoline (probably) won't be (much) worse than the
mispredicted indirect jump.)

I do wonder how much of the kernel actually runs hot-cache?
Except for parts that explicitly run things in bursts.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
