References: <20220716001616.4052225-1-ndesaulniers@google.com>
From: Nick Desaulniers
Date: Mon, 10 Oct 2022 14:23:28 -0700
Subject: Re: [PATCH] arm: lib: implement aeabi_uldivmod via div64_u64_rem
To: Arnd Bergmann
Cc: clang-built-linux, Nathan Chancellor, Miguel Ojeda, Ard Biesheuvel,
 Gary Guo, Russell King, Linux ARM, Linux Kernel Mailing List,
 Craig Topper, Philip Reames, jh@jhauser.us

On Sat, Jul 16, 2022 at 2:47 AM Arnd Bergmann wrote:
>
> On Sat, Jul 16, 2022 at 2:16 AM Nick Desaulniers
> wrote:
> >
> > Compilers frequently need to defer 64b division to a libcall with this
> > symbol name. It essentially is div64_u64_rem, just with a different
> > signature. Kernel developers know to call div64_u64_rem, but compilers
> > don't.
> >
> > Link: https://lore.kernel.org/lkml/20220524004156.0000790e@garyguo.net/
> > Suggested-by: Gary Guo
> > Signed-off-by: Nick Desaulniers

So the existing division by constant issues went away, and Craig was
able to improve division by double-word constants in LLVM:
1. https://reviews.llvm.org/D130862
2. https://reviews.llvm.org/D135541

But we still have one instance left that's not div/rem by constant via
CONFIG_FPE_NWFPE=y that's now blocking Android's compiler upgrade.
https://github.com/ClangBuiltLinux/linux/issues/1666
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/nwfpe/softfloat.c#n2312

Any creative ideas on how to avoid this? Perhaps putting the `aSig -=
bSig;` in inline asm? Inserting a `barrier()` or empty asm statement
into the loops also seems to work.

Otherwise I'd define __aeabi_uldivmod as static in
arch/arm/nwfpe/softfloat.c (with the body from this patch) only for
clang.

I see this function seems to be based on Berkeley Softfloat
http://www.jhauser.us/arithmetic/SoftFloat.html
v2. v3 looks like a total rewrite.
Looking at v3e, it looks like float64_rem() is now called f64_rem() and
defined in f64_rem.c. It doesn't look like there's anything from v3
that we could backport to the kernel's v2 to avoid this.

Otherwise perhaps we just disable OABI_COMPAT for clang. Quite a few
defconfigs explicitly enable FPE_NWFPE though. Are there really a lot
of OABI binaries still in use?

There's also the hidden llvm flag: `-mllvm -replexitval=never` that
seems to work here, though FWICT it's disabling 3 such loop elisions
(I think all three statements in that do while). That's probably the
best way forward here...

https://reviews.llvm.org/D9800 made the decision to do such a
transformation when a loop can be fully elided ("deleted").

>
> This has historically been strongly NAK'd, and I don't think that position
> has changed in the meantime. A variable-argument 64-bit division is
> really expensive, especially on 32-bit machines that lack a native
> 32-bit division instruction, and we don't want developers to accidentally
> insert one in their driver code.
>
> Explicitly calling one of the division helpers in linux/math64.h is the
> established way for driver writers to declare that a particular division
> cannot be turned into a cheaper operation and is never run in a
> performance critical code path. The compiler of course cannot know
> about either of those.
>
>         Arnd

-- 
Thanks,
~Nick Desaulniers
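
For anyone following along, here is a rough userspace sketch of the
"empty asm statement in the loop" idea mentioned above. It is not the
kernel's float64_rem(): the names loop_barrier() and sub_loop() are made
up for illustration, and loop_barrier() is only a stand-in for the
kernel's barrier(). Whether a given compiler version really turns the
un-barriered loop into a division libcall depends on the optimizer, so
this only shows the shape of the workaround.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's barrier(): an empty
 * asm statement with a "memory" clobber (GCC/Clang extension). The
 * compiler has to treat it as opaque, so it cannot delete the loop
 * around it and substitute a closed-form exit value.
 */
#define loop_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical helper (illustration only): repeatedly subtract b from a
 * until the result, viewed as signed, goes negative. Returns the last
 * non-negative value (a % b) and stores the iteration count
 * (a / b + 1) in *q_out.
 *
 * Assumes 0 < b and both a and b fit in 63 bits.
 */
static uint64_t sub_loop(uint64_t a, uint64_t b, uint64_t *q_out)
{
	uint64_t q = 0;
	uint64_t prev;

	assert(b > 0);

	do {
		prev = a;
		++q;
		a -= b;
		loop_barrier();	/* keep the loop as a loop */
	} while ((int64_t)a >= 0);

	*q_out = q;
	return prev;
}
```

Calling sub_loop(10, 3, &q) walks 10, 7, 4, 1 and stops once the value
goes negative, so it returns 1 with q set to 4. The cost of this
approach is that the loop really does run once per subtraction, which is
only acceptable when the iteration count is known to be small.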