From: Nick Desaulniers
Date: Sat, 15 Aug 2020 15:17:39 -0700
Subject: Re: [PATCH v2] lib/string.c: implement stpcpy
To: Joe Perches
Cc: Kees Cook, Andrew Morton, Dávid Bolvanský, Eli Friedman, "# 3.4.x", Arvind Sankar, Sami Tolvanen, Vishal Verma, Dan Williams, Andy Shevchenko, "Joel Fernandes (Google)", Daniel Axtens, Ingo Molnar, Yury Norov, Alexandru Ardelean, LKML, clang-built-linux, Rasmus Villemoes
References: <20200815014006.GB99152@rani.riverdale.lan> <20200815020946.1538085-1-ndesaulniers@google.com> <202008150921.B70721A359@keescook> <457a91183581509abfa00575d0392be543acbe07.camel@perches.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Aug 15, 2020 at 2:31 PM Joe Perches wrote:
>
> On Sat,
> 2020-08-15 at 14:28 -0700, Nick Desaulniers wrote:
> > On Sat, Aug 15, 2020 at 2:24 PM Joe Perches wrote:
> > > On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
> > > > On Sat, Aug 15, 2020 at 9:34 AM Kees Cook wrote:
> > > > > On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
> > > > > > LLVM implemented a recent "libcall optimization" that lowers calls to
> > > > > > `sprintf(dest, "%s", str)` where the return value is used to
> > > > > > `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> > > > > > in parsing format strings. Calling `sprintf` with overlapping arguments
> > > > > > was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
> > > > > >
> > > > > > `stpcpy` is just like `strcpy` except it returns the pointer to the new
> > > > > > tail of `dest`. This allows you to chain multiple calls to `stpcpy` in
> > > > > > one statement.
> > > > >
> > > > > O_O What?
> > > > >
> > > > > No; this is a _terrible_ API: there is no bounds checking, there are no
> > > > > buffer sizes. Anything using the example sprintf() pattern is _already_
> > > > > wrong and must be removed from the kernel. (Yes, I realize that the
> > > > > kernel is *filled* with this bad assumption that "I'll never write more
> > > > > than PAGE_SIZE bytes to this buffer", but that's both theoretically
> > > > > wrong ("640k is enough for anybody") and has been known to be wrong in
> > > > > practice too (e.g. when suddenly your writing routine is reachable by
> > > > > splice(2) and you may not have a PAGE_SIZE buffer).
> > > > >
> > > > > But we cannot _add_ another dangerous string API. We're already in a
> > > > > terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This
> > > > > needs to be addressed up by removing the unbounded sprintf() uses. (And
> > > > > to do so without introducing bugs related to using snprintf() when
> > > > > scnprintf() is expected[4].)
> > > >
> > > > Well, everything (-next, mainline, stable) is broken right now (with
> > > > ToT Clang) without providing this symbol. I'm not going to go clean
> > > > the entire kernel's use of sprintf to get our CI back to being green.
> > >
> > > Maybe this should get place in compiler-clang.h so it isn't
> > > generic and public.
> >
> > https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and
> > https://bugs.llvm.org/show_bug.cgi?id=47144
> > Seem to imply that Clang is not the only compiler that can lower a
> > sequence of libcalls to stpcpy. Do we want to wait until we have a
> > fire drill w/ GCC to move such an implementation from
> > include/linux/compiler-clang.h back in to lib/string.c?
>
> My guess is yes, wait until gcc, if ever, needs it.

The suggestion to use static inline doesn't make sense. The
compiler is lowering calls to other library routines; `stpcpy` isn't
being called explicitly. Even if it were, I'm not sure we'd want it
inlined. With static inline, no symbol definition would be emitted,
so the problem would not be solved. And I refuse to add any more code
using `extern inline`. Putting the definition in lib/string.c is the
most straightforward option and avoids revisiting this issue in the
future for other toolchains. I'll limit access by removing the
declaration and adding a comment discouraging its use. But if you're
going to use a GNU target triple without -ffreestanding because you
*want* libcall optimizations, then you have to provide symbols for
all possible library routines!

-- 
Thanks,
~Nick Desaulniers
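
For reference, here is a minimal userspace sketch of the semantics discussed in this thread: a `stpcpy`-style copy that returns the new tail of `dest`, and the expression that `sprintf(dest, "%s", str)` is lowered to when its return value is used. The names `sketch_stpcpy` and `sketch_sprintf_s` are hypothetical, chosen so they don't clash with libc; this is not the code from the patch and should not be read as the lib/string.c implementation. Like `strcpy`, it does no bounds checking, which is exactly the objection raised above.

```c
/*
 * Hypothetical illustration only, not the kernel's implementation.
 * Copies src (including the terminating NUL) into dest and returns a
 * pointer to that NUL in dest, i.e. the new tail of the string.
 * No bounds checking: the caller must guarantee dest is large enough.
 */
char *sketch_stpcpy(char *dest, const char *src)
{
	/* Copy bytes up to and including the terminating NUL. */
	while ((*dest++ = *src++) != '\0')
		;
	/* dest is now one past the NUL; step back so it points at it. */
	return --dest;
}

/*
 * What sprintf(dest, "%s", str) evaluates to when the return value is
 * used: the number of bytes written, excluding the terminating NUL.
 */
int sketch_sprintf_s(char *dest, const char *str)
{
	return (int)(sketch_stpcpy(dest, str) - dest);
}
```

Because the return value is the tail, calls can be chained: `sketch_stpcpy(sketch_stpcpy(buf, "foo"), "bar")` leaves "foobar" in `buf`, assuming `buf` holds at least 7 bytes.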