Date: Wed, 18 Mar 2020 20:06:09 +0000
From: George Spelvin
To: Alexander Duyck
Cc: Kees Cook, Dan Williams, linux-mm, Andrew Morton, lkml@sdf.org
Subject: Re: [PATCH v2] mm/shuffle.c: Fix races in add_to_free_area_random()
Message-ID: <20200318200609.GE2281@SDF.ORG>
References: <20200317135035.GA19442@SDF.ORG> <202003171435.41F7F0DF9@keescook>
 <20200317230612.GB19442@SDF.ORG> <202003171619.23210A7E0@keescook>
 <20200318014410.GA2281@SDF.ORG> <20200318183500.GC2281@SDF.ORG>

On Wed, Mar 18, 2020 at 12:17:14PM -0700, Alexander Duyck wrote:
> I was just putting it out there as a possibility. What I have seen in
> the past is that under some circumstances gcc can be smart enough to
> interpret that as a "branch on carry". My thought was you are likely
> having to test the value against itself and then you might be able to
> make use of shift and carry flag to avoid that. In addition you could
> get away from having to recast a unsigned value as a signed one in
> order to perform the bit test.

Ah, yes, it would be nice if gcc could use the carry bit for r rather than
having to devote a whole register to it.  But it has to do two unrelated
flag tests (zero and carry), and it's generally pretty bad at taking
advantage of preserved flag bits like that.

My ideal x86-64 object code would be:

	shlq	rand(%rip)
	jz	fixup
fixed:
	jnc	tail
	jmp	add_to_free_area
tail:
	jmp	add_to_free_area_tail
fixup:
	pushq	%rdx
	pushq	%rsi
	pushq	%rdi
	call	get_random_u64
	popq	%rdi
	popq	%rsi
	popq	%rdx
	stc
	adcq	%rax,%rax
	movq	%rax, rand(%rip)
	jmp	fixed

... but I don't know how to induce GCC to generate that, and the function
doesn't seem worthy of platform-specific asm.

(Note that I have to use an add (adcq) on the slow path because lea doesn't
set the carry bit.)
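
For anyone following along, here is a rough portable C sketch of the logic that assembly is meant to express. It is not the code from the patch. The names rand_bits, stub_random_u64(), refill_count and next_random_bit() are made up for illustration, the generator is a deterministic stand-in for get_random_u64(), and the sketch is single-threaded, so it says nothing about the races the patch is about. The word holds unused random bits above a 1 sentinel; shifting left pops the top bit, and a zero result means only the sentinel was left and a refill is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is taken from the patch. */
static uint64_t rand_bits;	/* 0 = empty; else random bits above a 1 sentinel */
static uint64_t stub_state;	/* state of the stand-in generator below */
static unsigned refill_count;	/* how many times the slow path ran */

/* Deterministic stand-in for get_random_u64() (one splitmix64 step). */
static uint64_t stub_random_u64(void)
{
	uint64_t z = (stub_state += 0x9E3779B97F4A7C15ULL);

	z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
	z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
	return z ^ (z >> 31);
}

/* Pop one bit; refill when nothing but the sentinel was left. */
static bool next_random_bit(void)
{
	uint64_t r = rand_bits;
	bool bit = r >> 63;	/* what shlq would leave in the carry flag */

	r <<= 1;
	if (r == 0) {		/* the bit shifted out was the sentinel, or r was empty */
		uint64_t fresh = stub_random_u64();

		refill_count++;
		bit = fresh >> 63;	/* carry out of adcq %rax,%rax */
		r = (fresh << 1) | 1;	/* stc + adcq: shift in a new sentinel */
	}
	rand_bits = r;
	return bit;
}
```

A caller would then take the add_to_free_area() path when next_random_bit() returns true and the add_to_free_area_tail() path otherwise, matching the jnc in the assembly. Each refill yields 64 usable bits, most significant first, before the sentinel reaches the top and triggers the next refill.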