From: Kees Cook
To: YiFei Zhu
Cc: YiFei Zhu, Linux Containers, bpf, kernel list, Aleksa Sarai,
    Andrea Arcangeli, Andy Lutomirski, Dimitrios Skarlatos,
    Giuseppe Scrivano, Hubertus Franke, Jack Chen, Jann Horn,
    Josep Torrellas, Tianyin Xu, Tobin Feldman-Fitzthum, Tycho Andersen,
    Valentin Rothberg, Will Drewry
Subject: Re: [PATCH v2 seccomp 2/6] asm/syscall.h: Add syscall_arches[] array
Date: Thu, 24 Sep 2020 20:09:07 -0700
Message-ID: <202009242000.DE12689BD8@keescook>
References: <202009241658.A062D6AE@keescook>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 24, 2020 at 08:27:40PM -0500, YiFei Zhu wrote:
> [resending this too]
>
> On Thu, Sep 24, 2020 at 6:01 PM
> Kees Cook wrote:
> > Disregarding the "how" of this, yeah, we'll certainly need something to
> > tell seccomp about the arrangement of syscall tables and how to find
> > them.
> >
> > However, I'd still prefer to do this on a per-arch basis, and include
> > more detail, as I've got in my v1.
> >
> > Something missing from both styles, though, is a consolidation of
> > values, where the AUDIT_ARCH* isn't reused in both the seccomp info and
> > the syscall_get_arch() return. The problems here were two-fold:
> >
> > 1) putting this in syscall.h meant you do not have full NR_syscall*
> >    visibility on some architectures (e.g. arm64 plays weird games with
> >    header include order).
>
> I don't get this one -- I'm not playing with NR_syscall here.

Right, sorry, I may not have been clear. When building my RFC I noticed
that I couldn't use NR_syscall very "early" in the header file include
stack on arm64, which complicated things. So I guess what I mean is
something like "it's probably better to do all these seccomp-specific
macros/etc in asm/include/seccomp.h rather than in syscall.h because I
know at least one architecture that might cause trouble."

> > 2) seccomp needs to handle "multiplexed" tables like x86_x32 (distros
> >    haven't removed CONFIG_X86_X32 widely yet, so it is a reality that
> >    it must be dealt with), which means seccomp's idea of the arch
> >    "number" can't be the same as the AUDIT_ARCH.
>
> Why so? Does anyone actually use x32 in a container? The memory cost
> and analysis cost is on everyone. The worst case scenario if we don't
> support it is that the syscall is not accelerated.

Ironically, that's the only place I actually know for sure where people
are using x32, because it shows a measurable (10%) speed-up for builders:
https://lore.kernel.org/lkml/CAOesGMgu1i3p7XMZuCEtj63T-ST_jh+BfaHy-K6LhgqNriKHAA@mail.gmail.com

So, yes, as you and Jann both point out, it wouldn't be terrible to just
ignore x32, but it seems a shame to penalize it.
That said, if the masking step from my v1 is actually noticeable on a
native workload, then yeah, probably x32 should be ignored. My instinct
(not measured) is that it's faster than walking a small array.

> > So, likely a combo of approaches is needed: an array (or more likely,
> > enum), declared in the per-arch seccomp.h file. And I don't see a way
> > to solve #1 cleanly.
> >
> > Regardless, it needs to be split per architecture so that regressions
> > can be bisected/reverted/isolated cleanly. And if we can't actually test
> > it at runtime (or find someone who can) it's not a good idea to make the
> > change. :)
>
> You have a good point regarding tests. Don't see how it affects
> regressions though. Only one file here is ever included per-build.

It's easier to do a per-arch revert (i.e. all the -stable tree
machinery, etc) with a single SHA instead of having to write a partial
revert, etc.

-- 
Kees Cook
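
For reference, the two lookup strategies compared above ("masking step"
versus "walking a small array") can be sketched in plain userspace C.
This is only a rough illustration under stated assumptions, not the code
from either patch series: the names split_by_mask(), find_arch_slot()
and struct nr_split are hypothetical, and the constants are copied from
the kernel UAPI values (__X32_SYSCALL_BIT, AUDIT_ARCH_X86_64,
AUDIT_ARCH_I386) so the snippet builds on its own. It says nothing about
which approach is faster; that would still have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by the kernel UAPI headers, repeated here so the
 * sketch builds anywhere. The macro names are local to this sketch. */
#define X32_SYSCALL_BIT 0x40000000u /* __X32_SYSCALL_BIT */
#define ARCH_X86_64     0xC000003Eu /* AUDIT_ARCH_X86_64 */
#define ARCH_I386       0x40000003u /* AUDIT_ARCH_I386 */

/* Hypothetical result type: which table a syscall number selects, and
 * the index within that table. */
struct nr_split {
	int is_x32;
	uint32_t index;
};

/* Masking: x86_64 and x32 report the same AUDIT_ARCH value, so the
 * table has to be picked from a bit in the syscall number itself. */
struct nr_split split_by_mask(uint32_t nr)
{
	struct nr_split s;

	s.is_x32 = (nr & X32_SYSCALL_BIT) != 0;
	s.index = nr & ~X32_SYSCALL_BIT;
	return s;
}

/* Array walk: linear search for an arch value. Returns the slot index,
 * or -1 if the arch is not listed. */
int find_arch_slot(const uint32_t *arches, size_t n, uint32_t arch)
{
	size_t i;

	assert(arches != NULL);
	for (i = 0; i < n; i++) {
		if (arches[i] == arch)
			return (int)i;
	}
	return -1;
}
```

The mask costs one AND and one compare per syscall regardless of how
many arches exist, while the walk grows with the array length but needs
no special case for a multiplexed table like x32.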