netdev.vger.kernel.org archive mirror
From: Eric Dumazet <edumazet@google.com>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	David Dworken <ddworken@google.com>,
	Willem de Bruijn <willemb@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: 126 ms irqsoff Latency - Possibly due to commit 190cc82489f4 ("tcp: change source port randomizarion at connect() time")
Date: Sat, 1 Oct 2022 15:31:15 -0700
Message-ID: <CANn89iK3maLVo_G7MGswuXV0Og9tEFJxMZt+34ZKTo4zUNoLRw@mail.gmail.com>
In-Reply-To: <Yzi8Md2tkSYDnF1B@zx2c4.com>

On Sat, Oct 1, 2022 at 3:16 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> (CC+Sebastian)
>
> Hi Eric, Christophe,
>
> I'm trying to understand the context of this and whether/why there's a
> problem. Some overview on how get_random_bytes() works:
>
> Most of the time, get_random_bytes() is completely lockless and operates
> over per-CPU data structures. get_random_bytes() calls
> _get_random_bytes(), which calls crng_make_state(), and then operates
> over stack data to churn out some random bytes. crng_make_state() is
> where all the meat happens.
>
> In crng_make_state(), there are three unlikely conditionals where locks
> are taken. The first is:
>
>     if (!crng_ready()) {
>         ... do some expensive things involving locks ...
>         ... but only during early boot before the rng is initialized ...
>     }
>
> The second one is:
>
>     if (unlikely(time_is_before_jiffies(READ_ONCE(base_crng.birth) + crng_reseed_interval()))) {
>         ... do something less expensive involving locks ...
>         ... which happens approximately once per minute ...
>     }
>
> The third one is:
>
>     if (unlikely(crng->generation != READ_ONCE(base_crng.generation))) {
>         ... do something even less expensive involving locks ...
>         ... which happens after a different CPU hit the above ...
>     }
>
> So all three of these conditions are pretty darn unlikely, with the
> exception of the first one that happens all the time during early boot
> before the RNG is initialized, after which it is static-branched out and
> never triggers again. So as far as /locks/ are concerned, things should
> be good here.
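
For readers following along, here is a rough userspace sketch of the
second and third checks quoted above. It is not the kernel code: every
name in it (base, struct local_state, sketch_make_state, and so on) is
hypothetical, the "reseed" is a non-cryptographic placeholder, the
60-second interval is only taken from the "once per minute" remark, and
the caller passes the current time in explicitly so the behaviour is easy
to reason about. The point is only that the common case touches no lock.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define KEY_LEN 32
#define RESEED_INTERVAL 60L /* seconds; assumption taken from "once per minute" */

/* Shared state, standing in for base_crng. All names are hypothetical. */
static struct {
    atomic_flag lock;           /* tiny spinlock protecting key */
    unsigned char key[KEY_LEN];
    atomic_ulong generation;    /* bumped on every reseed */
    atomic_long birth;          /* time of the last reseed */
} base = { .lock = ATOMIC_FLAG_INIT };

/* Per-consumer state, standing in for the per-CPU crng. */
struct local_state {
    unsigned char key[KEY_LEN];
    unsigned long generation;
};

/* Start with a generation the base can never have, so the first call syncs. */
#define LOCAL_STATE_INIT { .generation = ULONG_MAX }

static void base_lock(void)   { while (atomic_flag_test_and_set(&base.lock)) { } }
static void base_unlock(void) { atomic_flag_clear(&base.lock); }

/* Slow path: refresh the shared key and publish a new generation. */
static void sketch_reseed(long now)
{
    base_lock();
    /* Placeholder only: a real generator would mix in fresh entropy here. */
    for (size_t i = 0; i < KEY_LEN; i++)
        base.key[i] ^= (unsigned char)(now + (long)i);
    atomic_store(&base.birth, now);
    atomic_fetch_add(&base.generation, 1);
    base_unlock();
}

/* Returns how many locked slow paths were taken: 0 in the common case. */
static int sketch_make_state(struct local_state *local, long now)
{
    int slow_paths = 0;

    assert(local != NULL);

    /* Check 2: the shared key is older than the reseed interval. */
    if (now - atomic_load(&base.birth) > RESEED_INTERVAL) {
        sketch_reseed(now);
        slow_paths++;
    }

    /* Check 3: someone reseeded since this consumer last copied the key. */
    if (local->generation != atomic_load(&base.generation)) {
        base_lock();
        memcpy(local->key, base.key, KEY_LEN);
        local->generation = atomic_load(&base.generation);
        base_unlock();
        slow_paths++;
    }

    return slow_paths;
}
```

With two consumers, only the first call after an interval expires pays
for both slow paths; the other consumer pays for one copy, and every
other call returns 0 without taking the lock.
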
>
> However, in order to operate on per-cpu data, and therefore be lockless
> most of the time, it does take a "local lock", which is basically just
> disabling interrupts on non-RT to do a short operation:
>
>     local_lock_irqsave(&crngs.lock, flags);
>     crng = raw_cpu_ptr(&crngs);
>     crng_fast_key_erasure(...);
>     local_unlock_irqrestore(&crngs.lock, flags);
>
> crng_fast_key_erasure(), in turn, computes a single block of chacha20,
> which should be relatively fast. So the critical section is very short
> there.
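
As a loose userspace analogy for the short critical section quoted
above, the sketch below blocks signals (playing the role of interrupts)
around a brief update of process-local state, then restores the previous
mask. It assumes a single-threaded POSIX process. All names are
hypothetical, placeholder_block() is a non-cryptographic stand-in for
the ChaCha20 block, and the "first half replaces the key, second half is
output" split is my reading of what fast key erasure means here, not
something stated in the quoted text.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

#define KEY_LEN   32
#define BLOCK_LEN 64

/* Process-local key, standing in for the per-CPU crng state. */
static unsigned char sketch_key[KEY_LEN];

/* Placeholder mixing, NOT cryptographic: stands in for one cipher block. */
static void placeholder_block(unsigned char block[BLOCK_LEN])
{
    for (size_t i = 0; i < BLOCK_LEN; i++)
        block[i] = (unsigned char)(sketch_key[i % KEY_LEN] + 31 * i + 7);
}

/* Writes at most BLOCK_LEN - KEY_LEN bytes to out; returns the count. */
static size_t sketch_get_bytes(unsigned char *out, size_t len)
{
    sigset_t all, saved;
    unsigned char block[BLOCK_LEN];

    assert(out != NULL);
    if (len > BLOCK_LEN - KEY_LEN)
        len = BLOCK_LEN - KEY_LEN;

    sigfillset(&all);
    /* Analogous to local_lock_irqsave(): no handler can run in here. */
    sigprocmask(SIG_BLOCK, &all, &saved);
    placeholder_block(block);
    memcpy(sketch_key, block, KEY_LEN);   /* overwrite the old key */
    /* Analogous to local_unlock_irqrestore(): restore the saved mask. */
    sigprocmask(SIG_SETMASK, &saved, NULL);

    /* The rest happens on stack data, outside the critical section. */
    memcpy(out, block + KEY_LEN, len);
    memset(block, 0, sizeof(block));
    return len;
}
```

Only the block computation and the key overwrite sit between the two
sigprocmask() calls, which mirrors the argument that the window with
interrupts disabled is short.
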
>
> The reason that's local_lock_irqsave() rather than local_lock() (which
> would only disable preemption, I believe), is because IRQ handlers are
> supposed to be able to have access to random bytes too. It seems like it
> wouldn't be a super nice thing to remove that capability.
>
> It might be possible to double the amount of per-cpu data and have a
> separate state for IRQ than for non-IRQ, but that seems kind of wasteful
> and complex/hairy to implement.
>
> So that leads me to wonder more about the context: why does this matter?
> It looks like you're hitting this from a DO_ONCE() call site, which is
> usually only hit, as the name says, once, and then incurs the overhead of
> firing off a worker to change the once-static-branch, which means
> DO_ONCE()es aren't very fast anyway? Or does that not accurately reflect
> what's happening?
>
> I'll also CC Sebastian here, who worked with me on that local lock and
> might have some insights on IRQ latency as well.

Sorry Jason, it seems I forgot to CC you on the tentative patch I sent
earlier today:

https://patchwork.kernel.org/project/netdevbpf/patch/20221001205102.2319658-1-eric.dumazet@gmail.com/


Thread overview: 8+ messages
2022-10-01 17:16 126 ms irqsoff Latency - Possibly due to commit 190cc82489f4 ("tcp: change source port randomizarion at connect() time") Christophe Leroy
2022-10-01 17:43 ` Eric Dumazet
2022-10-01 17:58   ` Eric Dumazet
2022-10-01 22:16     ` Jason A. Donenfeld
2022-10-01 22:31       ` Eric Dumazet [this message]
2022-10-01 22:37         ` Jason A. Donenfeld
2022-10-01 22:34       ` Jason A. Donenfeld
2022-10-01 22:37         ` Eric Dumazet
