Date: Mon, 31 May 2021 02:02:35 +0100
From: Andre Przywara
To: Benjamin Herrenschmidt
Cc: Mark Brown, Will Deacon, "Saidi, Ali", linux-arm-kernel@lists.infradead.org, Ard Biesheuvel
Subject: Re: RNDR/SS vs. SMCCC
Message-ID: <20210531020235.6e4ea946@slackpad.fritz.box>
References: <7d5697f3994fc1f9cf39d332525269056e3649b3.camel@kernel.crashing.org> <0339748b54e2faeddeec8d50e32a6c6ff4e8b3b7.camel@kernel.crashing.org>
Organization: Arm Ltd.

On Sat, 29 May 2021 12:36:06 +1000
Benjamin Herrenschmidt wrote:

Hi,

(adding Ard, as he had some say in the decisions back then).

> On Fri, 2021-05-28 at 13:56 +0100, Mark Brown wrote:
> > On Fri, May 28, 2021 at 09:12:38AM +1000, Benjamin Herrenschmidt
> > wrote:
> >
> > > Right, I was thinking about using RNDRSS instead.
> >
> > Yeah, I *suspect* that's where we'll end up longer term now that
> > the random.c code is less enthusiastic about calling the
> > function - it was where the patches were initially until the
> > concerns about overloading were raised and it does seem to map
> > more naturally onto the API. I do think we should hold off until
> > we've got some concrete information on how real systems perform
> > simply to avoid churn but I wouldn't be surprised to see us
> > making changes once we have data. Sounds like you'll be able to
> > help out here!
>
> Hehe yup.

In general, quality of implementation was one concern against using the
instructions. The actual entropy source would be provided by the *SoC
vendor*, so we can't easily match this against, say, the MIDR.
The firmware interface provides a means to disqualify certain
implementations (using its GUID); the firmware might also get
workarounds that fix problems (or even use another entropy source).

> > Note that they do both get washed through the PRNG, not that I
> > think it makes a huge difference to the argument here.
>
> Right. At this point, I think we can wait until HW is there and we have
> enough data to decide what to do with the policy. I wish the ISA was
> clearer in defining timing characteristics (and fail behaviour) of
> those instructions... as-is, we'll probably have to mess around based
> on whatever HW comes out, worse, possibly with quirks.

I was playing around with the TRNG on our Juno board, and found the
raw performance of the TRNG (through MMIO!) to be around 60MB/s. This
seems to be in line with most x86 entropy sources; I guess they also use
clock jitter or thermal noise, which generally should provide "enough"
(TM) entropy, at least for this seeding use case.
So I wouldn't expect many problems with well-behaved users. The
firmware implementation makes it possible to address the (ab)use case
by gently stalling "leechers".
But I agree that this is all quite theoretical at this point, and actual
implementations would be nice to see.

> > > > In practice most of the non-seed arch_get_random usage is either
> > > > a fallback if there's no seed variant or mixed in with the CRNG
> > > > output anyway.
> > >
> > > Right but I still don't believe the end result makes a whole lot of
> > > sense. In absence of SMCCC we end up using RNDR as a seed which
> > > hits the "not a great match" comment. Not a huge deal I suppose,
> > > for our (EC2) case, we could just not implement the SMCCC call,
> > > and let it use RNDR, it's still going to be better than no HW
> > > random source, I just don't like those firmware calls much.
> >
> > I do see them as useful for the seeding case, it shouldn't be in
> > quite such sensitive fast paths as the regular versions and
> > it means that if a system has a better entropy source than the
> > one backing RNDR (especially if the one backing RNDR has some
> > actual problem) then we can override in software. As you say if
> > the SMCCC isn't offering anything over the system registers then
> > platforms don't need to implement it.
>
> As far as I can tell, the ISA has some pretty strict requirements for
> the entropy source backing RNDR, so ideally, if implementations are
> compliant, it *should* be a non-issue. Famous last words... :)

Indeed, reality can be annoying.
But given that the current frequency of the calls to arch_get_random() -
just to reseed the CRNG - is now in the once-per-5-minute range, I think
the performance concerns about the SMCCC are somewhat secondary - even
when including the multiple VMs case.

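To put rough numbers on the steady-state reseeding load, here is a
back-of-the-envelope sketch. Only the ~60MB/s TRNG figure and the
5-minute reseed interval come from this thread; the seed size per reseed
and the number of VMs are purely illustrative assumptions, and the
helper names are made up for this example. It only looks at entropy
throughput, not at the latency of an individual SMCCC call.

```python
# Back-of-the-envelope estimate of steady-state entropy demand.
# Inputs marked "assumption" are hypothetical, not measurements.
TRNG_BYTES_PER_SEC = 60 * 1000 * 1000  # ~60MB/s raw TRNG rate, from the thread
RESEED_INTERVAL_SEC = 5 * 60           # once-per-5-minute reseed, from the thread
SEED_BYTES_PER_RESEED = 32             # assumption: hypothetical seed size
NUM_VMS = 100                          # assumption: hypothetical VM count


def reseed_demand_bytes_per_sec(num_vms, seed_bytes, interval_sec):
    """Average entropy demand if every VM reseeds once per interval."""
    return num_vms * seed_bytes / interval_sec


def utilisation(demand_bytes_per_sec, supply_bytes_per_sec):
    """Fraction of the raw TRNG throughput consumed by that demand."""
    return demand_bytes_per_sec / supply_bytes_per_sec


demand = reseed_demand_bytes_per_sec(
    NUM_VMS, SEED_BYTES_PER_RESEED, RESEED_INTERVAL_SEC)
share = utilisation(demand, TRNG_BYTES_PER_SEC)
print(f"average demand: {demand:.1f} bytes/s, share of TRNG: {share:.2e}")
```

Under these assumptions the periodic reseeds use well under a millionth
of the raw TRNG throughput; different seed sizes or VM counts shift the
result proportionally.
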
> > > > The arch_get_random_ interfaces already provide a return code
> > > > which the callers can handle as needed, they can do something
> > > > appropriate for the scenario rather than the architecture code
> > > > having to pick. Sometimes that's to retry later (random.c does
> > > > this when seeding for example), sometimes that's to just carry on
> > > > with whatever other entropy they've already got.
> > >
> > > Ok. It's unfortunate that the ISA is so vague on the circumstances
> > > where the instructions are allowed to fail... it says "a reasonable
> > > amount of time", it may as well have said a "random amount of time"
> > > for the usefulness of it ;-)
> > > The implementation I'm aware of will fail extremely rarely when the
> > > HW detects an issue that requires corrective action, but I could
> > > imagine some implementations just failing when there's no entropy
> > > at hand (esp. with RNDRSS).
> >
> > Yes, I think there being inadequate entropy at hand to reseed is
> > the big concern here - some of that's going to be a quality
> > tradeoff and it's very hard to actually enforce any constraints
> > even if you define them so ultimately it all comes down to
> > quality of implementation issues.
>
> Yup. I hate this but I foresee a future where we'll have implementation
> quirks. I hope not but ...
>
> > > As long as the callers don't give up permanently, that's fine. I
> > > was just a bit concerned by crng_init_try_arch{_early}. It would
> > > be preferable for these to "try harder".
> >
> > Yeah, there is some scope for retries there - unfortunately the
> > arch_get_random_ interface can't distinguish between temporary
> > and permanent failures, and people won't want to slow down the
> > boot path at all by actually blocking. Looking again there's
> > some scope for improving this process, we will continue to pull
> > seed values in crng_reseed() but not in quite the same way.
> > I'm now wondering if it'd make sense to hook retries of the
> > architecture init into or alongside try_to_generate_entropy() in
> > wait_for_random_bytes() when trust_cpu is set.
>
> Could be. I'm away from the code right now, but I would like to avoid
> having the system behaviour change overall based on whether it happened
> to hit a failure case at boot or not. For deployments at scale, that
> sort of "randomness" isn't the sort we want :)

So are your concerns primarily about the system blocking, or about the
instructions returning empty-handed while waiting for entropy?
Especially during the initial seeding and address space randomisation at
boot time?
In my experience the demand for entropy at this point (a few hundred
bytes, IIRC) is easily handled by any hardware implementation (given
the above-mentioned 60MB/s). And even 100 VMs starting at the same time
should be serviceable by the hardware.
So this stalling case is more relevant if one user is actually hammering
the instructions, which the firmware implementation helps to protect
against.

Cheers,
Andre

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel