linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Srivatsa S. Bhat" <srivatsa@csail.mit.edu>
To: "Theodore Y. Ts'o" <tytso@mit.edu>,
	Stephan Mueller <smueller@chronox.de>,
	linux-crypto@vger.kernel.org,
	Linux Kernel Developers List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/5] random: fix crng_ready() test
Date: Wed, 16 May 2018 17:07:08 -0700	[thread overview]
Message-ID: <6bb4d4cb-aafa-7440-0dc3-40faf647ec89@csail.mit.edu> (raw)
In-Reply-To: <20180413170037.GA28721@thunk.org>

On 4/13/18 10:00 AM, Theodore Y. Ts'o wrote:
> On Fri, Apr 13, 2018 at 03:05:01PM +0200, Stephan Mueller wrote:
>>
>> What I would like to point out that more and more folks change to 
>> getrandom(2). As this call will now unblock much later in the boot cycle, 
>> these systems see a significant departure from the current system behavior.
>>
>> E.g. an sshd using getrandom(2) would be ready shortly after the boot finishes 
>> as of now. Now it can be a matter minutes before it responds. Thus, is such 
>> change in the kernel behavior something for stable?

[...]

> I was a little worried that on VM's this could end up causing things
> to block for a long time, but an experiment on a GCE VM shows that
> isn't a problem:
> 
> [    0.000000] Linux version 4.16.0-rc3-ext4-00009-gf6b302ebca85 (tytso@cwcc) (gcc version 7.3.0 (Debian 7.3.0-15)) #16 SMP Thu Apr 12 16:57:17 EDT 2018
> [    1.282220] random: fast init done
> [    3.987092] random: crng init done
> [    4.376787] EXT4-fs (sda1): re-mounted. Opts: (null)
> 
> There are some desktops where the "crng_init done" report doesn't
> happen until 45-90 seconds into the boot.  I don't think I've seen
> reports where it takes _minutes_ however.  Can you give me some
> examples of such cases?

On a Photon OS VM running on VMware ESXi, this patch causes a boot speed
regression of 5 minutes :-( [ The VM doesn't have haveged or rng-tools
(rngd) installed. ]

[    1.420246] EXT4-fs (sda2): re-mounted. Opts: barrier,noacl,data=ordered
[    1.469722] tsc: Refined TSC clocksource calibration: 1900.002 MHz
[    1.470707] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x36c65c1a9e1, max_idle_ns: 881590695311 ns
[    1.474249] clocksource: Switched to clocksource tsc
[    1.584427] systemd-journald[216]: Received request to flush runtime journal from PID 1
[  346.620718] random: crng init done

Interestingly, the boot delay is exacerbated on VMs with large amounts
of RAM. For example, the delay is not so noticeable (< 30 seconds) on a
VM with 2GB memory, but goes up to 5 minutes on an 8GB VM.

Also, cloud-init-local.service seems to be the one blocking for entropy
here. systemd-analyze critical-chain shows:

The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

multi-user.target @6min 1.283s
└─vmtoolsd.service @6min 1.282s
  └─cloud-final.service @6min 366ms +914ms
    └─cloud-config.service @5min 59.174s +1.190s
      └─cloud-config.target @5min 59.172s
        └─cloud-init.service @5min 47.423s +11.744s
          └─systemd-networkd-wait-online.service @5min 45.999s +1.420s
            └─systemd-networkd.service @5min 45.975s +21ms
              └─network-pre.target @5min 45.973s
                └─cloud-init-local.service @241ms +5min 45.687s
                  └─systemd-remount-fs.service @222ms +13ms
                    └─systemd-fsck-root.service @193ms +26ms
                      └─systemd-journald.socket @188ms
                        └─-.mount @151ms
                          └─system.slice @161ms
                            └─-.slice @151ms

It would be great if this CVE can be fixed somehow without causing boot speed
to spike from ~20 seconds to 5 minutes, as that makes the system pretty much
unusable. I can workaround this by installing haveged, but ideally an in-kernel
fix would be better. If you need any other info about my setup or if you have
a patch that I can test, please let me know!

Thank you very much!

Regards,
Srivatsa
VMware Photon OS

  reply	other threads:[~2018-05-17  0:23 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-13  1:30 [PATCH 1/5] random: fix crng_ready() test Theodore Ts'o
2018-04-13  1:30 ` [PATCH 2/5] random: use a different mixing algorithm for add_device_randomness() Theodore Ts'o
2018-04-13  1:30 ` [PATCH 3/5] random: set up the NUMA crng instances after the CRNG is fully initialized Theodore Ts'o
2018-04-13 22:31   ` kbuild test robot
2018-04-13  1:30 ` [PATCH 4/5] random: crng_reseed() should lock the crng instance that it is modifying Theodore Ts'o
2018-04-13  1:30 ` [PATCH 5/5] random: add new ioctl RNDRESEEDCRNG Theodore Ts'o
2018-04-13  5:38 ` [PATCH 1/5] random: fix crng_ready() test Stephan Mueller
2018-04-13 12:53   ` Theodore Y. Ts'o
2018-04-13 13:05     ` Stephan Mueller
2018-04-13 17:00       ` Theodore Y. Ts'o
2018-05-17  0:07         ` Srivatsa S. Bhat [this message]
2018-05-17 20:53           ` Theodore Y. Ts'o
2018-05-17  6:01         ` Christophe LEROY
2018-05-17 20:56           ` Theodore Y. Ts'o
2018-05-02 16:18 ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6bb4d4cb-aafa-7440-0dc3-40faf647ec89@csail.mit.edu \
    --to=srivatsa@csail.mit.edu \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=smueller@chronox.de \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).