From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.6 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE, SPF_PASS,USER_AGENT_NEOMUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C93DFC43613 for ; Thu, 20 Jun 2019 03:35:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 96B4820675 for ; Thu, 20 Jun 2019 03:35:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OEIoM2/X" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731390AbfFTDfq (ORCPT ); Wed, 19 Jun 2019 23:35:46 -0400 Received: from mail-pg1-f193.google.com ([209.85.215.193]:40146 "EHLO mail-pg1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726370AbfFTDfq (ORCPT ); Wed, 19 Jun 2019 23:35:46 -0400 Received: by mail-pg1-f193.google.com with SMTP id w10so791607pgj.7; Wed, 19 Jun 2019 20:35:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=zOa1e8oAtRxerEx6kd/Fw1r45fnTVWgZRg3Uxwi0TJw=; b=OEIoM2/X9thwmJuYNhWXcFwNix9ajM8mHleBbahR31DvIUagWIzWj8oT+ibYiL5vxY BLKdrNKj80R9pCGF/sjgerhCQa9Zwj118HOdc1GNNXjWdtcKVThlvIdjd5gpbSa6rPrO qeuB5IPIt1cAOg1Akp0drdZF58OoYZh1HgXPKq2IvkNN32XFkdFpjw0MoRTDLLjvmj74 KQJ3mbVE0Mub1H66xq8wFeOllIowM+N6YKpHjXt9C0Nk31ZkpuhqyRZgfR9SYfz9c/Pk OxBRDIevBrpOCSePGPhUJtLOX5lcPkrCCXDs0xwuPweunJofSXW1Ute/5Y9QFgQmgW4g CD6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=zOa1e8oAtRxerEx6kd/Fw1r45fnTVWgZRg3Uxwi0TJw=; b=tRe+3bji9VDf3Nr0FotWa6W2wLzbo8+AF6kIv+by3XjKuhRJ9cDUyz/wOC+tuHjVO6 WHf638VzCELDf+zsOhFhfDleWRm3sxJ3RLBcNLGaQj9wBLudfNt014iyO27oYrPvLZgu foxHAPMr+RJfXpjyR2MqyDwZXdaYaCEObTGe0ShcfzOcFl6eJdXK5WxW+KnsG0dGVJIJ 71u8t6uJT/4j4wUTUWLuan0jSV3R8GWt0Png9isCsYQcwYdUbmHFRO2n2BBcAS2sQ3eg oQy5i1aZ3ngpKsNW+GQjM1484chXiYAfxuDNJDSKJltH7Bcl9eZf3g5Hkz1oCn9ANloK taGA== X-Gm-Message-State: APjAAAWP2ZgP0w69Whe/DqVhtjZWdno6FZJcdpzpNjnJdtLz5Ij6lXRZ SC2mWPdd4VMBeS5n8yreZ1c= X-Google-Smtp-Source: APXvYqw1P3mUG3ns3NOr/FWd1Lfmw6O29/m6sZbB4KDBSoE2m48JW1eZzvKu4PbyCD0qvXlwsUnqoQ== X-Received: by 2002:a62:6d47:: with SMTP id i68mr130221237pfc.189.1561001745232; Wed, 19 Jun 2019 20:35:45 -0700 (PDT) Received: from ast-mbp.dhcp.thefacebook.com ([2620:10d:c090:180::1:bbbf]) by smtp.gmail.com with ESMTPSA id l13sm2797416pjq.20.2019.06.19.20.35.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 19 Jun 2019 20:35:44 -0700 (PDT) Date: Wed, 19 Jun 2019 23:35:40 -0400 From: Alexei Starovoitov To: John Fastabend Cc: Alexei Starovoitov , davem@davemloft.net, daniel@iogearbox.net, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: Re: [PATCH v3 bpf-next 1/9] bpf: track spill/fill of constants Message-ID: <20190620033538.4oou4mbck6xs64mj@ast-mbp.dhcp.thefacebook.com> References: <20190615191225.2409862-1-ast@kernel.org> <20190615191225.2409862-2-ast@kernel.org> <5d0ad24027106_8822adea29a05b47c@john-XPS-13-9370.notmuch> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5d0ad24027106_8822adea29a05b47c@john-XPS-13-9370.notmuch> User-Agent: NeoMutt/20180223 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Wed, Jun 19, 2019 at 05:24:32PM -0700, John Fastabend wrote: > Alexei Starovoitov wrote: > > Compilers often spill induction variables into the stack, > > hence it is necessary for the verifier to track scalar values > > of the registers through stack slots. > > > > Also few bpf programs were incorrectly rejected in the past, > > since the verifier was not able to track such constants while > > they were used to compute offsets into packet headers. > > > > Tracking constants through the stack significantly decreases > > the chances of state pruning, since two different constants > > are considered to be different by state equivalency. > > End result that cilium tests suffer serious degradation in the number > > of states processed and corresponding verification time increase. > > > > before after > > bpf_lb-DLB_L3.o 1838 6441 > > bpf_lb-DLB_L4.o 3218 5908 > > bpf_lb-DUNKNOWN.o 1064 1064 > > bpf_lxc-DDROP_ALL.o 26935 93790 > > bpf_lxc-DUNKNOWN.o 34439 123886 > > bpf_netdev.o 9721 31413 > > bpf_overlay.o 6184 18561 > > bpf_lxc_jit.o 39389 359445 > > > > After further debugging turned out that cillium progs are > > getting hurt by clang due to the same constant tracking issue. > > Newer clang generates better code by spilling less to the stack. > > Instead it keeps more constants in the registers which > > hurts state pruning since the verifier already tracks constants > > in the registers: > > old clang new clang > > (no spill/fill tracking introduced by this patch) > > bpf_lb-DLB_L3.o 1838 1923 > > bpf_lb-DLB_L4.o 3218 3077 > > bpf_lb-DUNKNOWN.o 1064 1062 > > bpf_lxc-DDROP_ALL.o 26935 166729 > > bpf_lxc-DUNKNOWN.o 34439 174607 > ^^^^^^^^^^^^^^ > Any idea what happened here? Going from 34439 -> 174607 on the new clang? As I was alluding in commit log newer clang is smarter and generates less spill/fill of constants. In particular older clang loads two constants into r8 and r9 and immediately spills them into stack. Then fills later, does a bunch of unrelated code and calls into helper that has ARG_ANYTHING for that position. Then doing a bit more math on filled constants, spills them again and so on. Before this patch (that tracks spill/fill of constants into stack) pruning points were equivalent, but with the patch it sees the difference in registers and declares states not equivalent, though any constant is fine from safety standpoint. With new clang only r9 has this pattern of spill/fill. New clang manages to keep constant in r8 to be around without spill/fill. Existing verifier tracks constants so even without this patch the same pathalogical behavior is observed. The verifier need to walk a lot more instructions only because r8 has different constants. > > bpf_netdev.o 9721 8407 > > bpf_overlay.o 6184 5420 > > bpf_lcx_jit.o 39389 39389 > > > > The final table is depressing: > > old clang old clang new clang new clang > > const spill/fill const spill/fill > > bpf_lb-DLB_L3.o 1838 6441 1923 8128 > > bpf_lb-DLB_L4.o 3218 5908 3077 6707 > > bpf_lb-DUNKNOWN.o 1064 1064 1062 1062 > > bpf_lxc-DDROP_ALL.o 26935 93790 166729 380712 > > bpf_lxc-DUNKNOWN.o 34439 123886 174607 440652 > > bpf_netdev.o 9721 31413 8407 31904 > > bpf_overlay.o 6184 18561 5420 23569 > > bpf_lxc_jit.o 39389 359445 39389 359445 > > > > Tracking constants in the registers hurts state pruning already. > > Adding tracking of constants through stack hurts pruning even more. > > The later patch address this general constant tracking issue > > with coarse/precise logic. > > > > Signed-off-by: Alexei Starovoitov > > Acked-by: Andrii Nakryiko > > --- > > kernel/bpf/verifier.c | 90 +++++++++++++++++++++++++++++++------------ > > 1 file changed, 65 insertions(+), 25 deletions(-) > > I know these are already in bpf-next sorry it took me awhile to get > time to review, but looks good to me. Thanks! We had something similar > in the earlier loop test branch from last year. It's not in bpf-next yet :) Code reviews are appreciated at any time. Looks like we were just lucky with older clang. I haven't tracked which clang version became smarter. If you haven't seen this issue and haven't changed cilium C source to workaround that then there is chance you'll hit it as well. By "new clang" I meant version 9.0 "old clang" is unknown. I just had cilium elf .o around that I kept using for testing without recompiling them. Just by chance I recompiled them to see annotated verifier line info messages with BTF and hit this interesting issue. See patch 9 backtracking logic that resolves this 'precision of scalar' issue for progs compiled with both new and old clangs.