From: Masahiro Yamada
Date: Thu, 10 Sep 2020 10:18:05 +0900
Subject: Re: [PATCH v2 00/28] Add support for Clang LTO
To: Sami Tolvanen
Cc: Will Deacon, Peter Zijlstra, Steven Rostedt, Greg Kroah-Hartman,
    "Paul E. McKenney", Kees Cook, Nick Desaulniers, clang-built-linux,
    Kernel Hardening, linux-arch, linux-arm-kernel,
    Linux Kbuild mailing list, Linux Kernel Mailing List,
    linux-pci@vger.kernel.org, X86 ML
References: <20200624203200.78870-1-samitolvanen@google.com>
    <20200903203053.3411268-1-samitolvanen@google.com>
    <20200908234643.GF1060586@google.com>
In-Reply-To: <20200908234643.GF1060586@google.com>

On Wed, Sep 9, 2020 at 8:46 AM Sami Tolvanen wrote:
>
> On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> > On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen wrote:
> > >
> > > This patch series adds support for building x86_64 and arm64 kernels
> > > with Clang's Link Time Optimization (LTO).
> > >
> > > In addition to performance, the primary motivation for LTO is
> > > to allow Clang's Control-Flow Integrity (CFI) to be used in the
> > > kernel. Google has shipped millions of Pixel devices running three
> > > major kernel versions with LTO+CFI since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM
> > > bitcode, which Clang produces with LTO instead of ELF object files,
> > > postponing ELF processing until a later stage, and ensuring initcall
> > > ordering.
> > >
> > > Note that patches 1-4 are not directly related to LTO, but are
> > > needed to compile LTO kernels with ToT Clang, so I'm including them
> > > in the series for your convenience:
> > >
> > >  - Patches 1-3 are required for building the kernel with ToT Clang,
> > >    and IAS, and patch 4 is needed to build allmodconfig with LTO.
> > >
> > >  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
> > >
> >
> >
> > I still do not understand how this patch set works.
> > (only me?)
> >
> > Please let me ask fundamental questions.
> >
> >
> >
> > I applied this series on top of Linus' tree,
> > and compiled for ARCH=arm64.
> >
> > I compared the kernel size with/without LTO.
> >
> >
> >
> > [1] No LTO  (arm64 defconfig, CONFIG_LTO_NONE)
> >
> > $ llvm-size vmlinux
> >     text      data     bss       dec      hex  filename
> > 15848692  10099449  493060  26441201  19375f1  vmlinux
> >
> >
> >
> > [2] Clang LTO  (arm64 defconfig + CONFIG_LTO_CLANG)
> >
> > $ llvm-size vmlinux
> >     text      data     bss       dec      hex  filename
> > 15906864  10197445  490804  26595113  195cf29  vmlinux
> >
> >
> > I compared the size of raw binary, arch/arm64/boot/Image.
> > Its size increased too.
> >
> >
> >
> > So, in my experiment, enabling CONFIG_LTO_CLANG
> > increases the kernel size.
> > Is this correct?
>
> Yes. LTO does produce larger binaries, mostly due to function
> inlining between translation units, I believe. The compiler people
> can probably give you a more detailed answer here. Without -mllvm
> -import-instr-limit, the binaries would be even larger.
>
> > One more thing, could you teach me
> > how Clang LTO optimizes the code against
> > relocatable objects?
> >
> >
> >
> > When I learned Clang LTO first, I read this document:
> > https://llvm.org/docs/LinkTimeOptimization.html
> >
> > It is easy to confirm the final executable
> > does not contain foo2, foo3...
> >
> >
> >
> > In contrast to userspace programs,
> > kernel modules are basically relocatable objects.
> >
> > Does Clang drop unused symbols from relocatable objects?
> > If so, how?
>
> I don't think the compiler can legally drop global symbols from
> relocatable objects, but it can rename and possibly even drop static
> functions.

Compilers can already drop unused static functions without LTO.
In fact, an unused static function triggers a compiler warning
(-Wunused-function), so such code should be cleaned up anyway.

> This is why we need global wrappers for initcalls, for
> example, to have stable symbol names.
>
> Sami

At first, I thought the motivation of LTO
was to remove unused global symbols, and
to perform further optimization.

That is true for userspace programs.
In fact, the example in
https://llvm.org/docs/LinkTimeOptimization.html
produces a smaller binary.

In contrast, this patch set produces a bigger kernel
because LTO cannot remove any unused symbols.

So, I do not understand what the benefit is.

Is inlining beneficial?
I am not sure.

Documentation/process/coding-style.rst
"15) The inline disease"
mentions that inlining is not always
a good thing.

As a whole, I still do not understand
the motivation of this patch set.

-- 
Best Regards
Masahiro Yamada
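
For reference, here is a small sketch that quantifies the size difference
between the two llvm-size outputs quoted in this thread. The byte counts are
copied from those outputs; the helper name `size_delta` and the dictionary
layout are assumptions made for illustration and are not part of the patch
set or of any LLVM tool.

```python
# Section sizes in bytes, copied from the llvm-size output quoted in this thread.
NO_LTO = {"text": 15848692, "data": 10099449, "bss": 493060}
CLANG_LTO = {"text": 15906864, "data": 10197445, "bss": 490804}


def size_delta(before, after):
    """Return {section: (change in bytes, change in percent)}.

    Hypothetical helper for illustration only. A "total" entry is added
    that corresponds to the "dec" column of llvm-size (text + data + bss).
    """
    old_rows = dict(before, total=sum(before.values()))
    new_rows = dict(after, total=sum(after.values()))
    return {
        name: (new_rows[name] - old, 100.0 * (new_rows[name] - old) / old)
        for name, old in old_rows.items()
    }


if __name__ == "__main__":
    for name, (diff, pct) in size_delta(NO_LTO, CLANG_LTO).items():
        print(f"{name:>5}: {diff:+9d} bytes ({pct:+.2f}%)")
```

With the figures from this thread, text grows by 58172 bytes and data by
97996 bytes, while bss shrinks by 2256 bytes, for a total increase of
153912 bytes (roughly 0.6%).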