Date: Sat, 14 Sep 2019 22:05:21 -0400
From: "Theodore Y. Ts'o"
To: Linus Torvalds
Cc: "Ahmed S. Darwish", Andreas Dilger, Jan Kara, Ray Strode,
    William Jon McCann, "Alexander E. Patrakov", zhangjs,
    linux-ext4@vger.kernel.org, Lennart Poettering, lkml
Subject: Re: Linux 5.3-rc8
Message-ID: <20190915020521.GF19710@mit.edu>
References: <20190912034421.GA2085@darwi-home-pc>
    <20190912082530.GA27365@mit.edu>
    <20190914150206.GA2270@darwi-home-pc>
    <20190914211126.GA4355@darwi-home-pc>
    <20190914222432.GC19710@mit.edu>
    <20190915010037.GE19710@mit.edu>
User-Agent: Mutt/1.10.1 (2018-07-13)
List-ID: linux-kernel@vger.kernel.org

On Sat, Sep 14, 2019 at 06:10:47PM -0700, Linus Torvalds wrote:
> > We could return 0 for success, and yet "the best we
> > can do" could be really terrible.
>
> Yes. Which is why we should warn.

I'm all in favor of warning.  But people might just ignore the
warning.  We warn today about systemd trying to read from /dev/urandom
too early, and that just gets ignored.

> But we can't *block*. Because that just breaks people. Like shown in
> this whole discussion.

I'd be willing to let it take at least 2 minutes, since that's slow
enough to be annoying.  I'd be willing to kill the process which tried
to call getrandom too early.  But I believe blocking is better than
returning something potentially not random at all.  I think failing
"safe" is extremely important.  And returning something not random
which then gets used for a long-term private key is a disaster.

You basically want to turn getrandom into /dev/urandom.  And that's
how we got into the mess where 10% of the publicly accessible ssh keys
could be guessed.  I've tried that already, and we saw how that ended.

> Why is warning different? Because hopefully it tells the only person
> who can *do* something about it - the original maintainer or developer
> of the user space tools - that they are doing something wrong and need
> to fix their broken model.

Except the developer could just ignore the warning (and *has*), which
is what happened with /dev/urandom when it was accessed too early.
Even when I drew some developers' attention to the warning, at least
one just said, "meh", and blew me off.  Would making it noisier (e.g.,
a WARN_ON) make enough of a difference?  I guess I'm just not
convinced.

> Blocking doesn't do that. Blocking only makes the system unusable. And
> yes, some security people think "unusable == secure", but honestly,
> those security people shouldn't do system design. They are the worst
> kind of "technically correct" incompetent.

Which is worse really depends on your point of view, and what the
system might be controlling.  If access to the system could let a
malicious attacker trigger a nuclear bomb, failing safe is always
going to be better.  In other cases, failing open is certainly more
convenient, and it leaves the system more "usable".  But how do we
trade off "usable" with "insecure"?  There are times when "unusable"
is WAY better than "could risk life or human safety".

Would you be willing to settle for a CONFIG option or a boot
command-line option which controls whether we fail "safe" or fail
"open" if someone calls getrandom(2) and there isn't enough entropy?
Then each distribution and/or system integrator can decide whether
"proper systems design" treats "usability" or "must not fail
insecurely" as more important.

						- Ted