Date: Tue, 04 Feb 2020 12:33:13 -0800
From: John Fastabend
To: Björn Töpel, Palmer Dabbelt
Cc: Daniel Borkmann, Alexei Starovoitov, zlim.lnx@gmail.com, catalin.marinas@arm.com, will@kernel.org, Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, Shuah Khan, Netdev, bpf, linux-arm-kernel@lists.infradead.org, LKML, linux-kselftest@vger.kernel.org, clang-built-linux@googlegroups.com, kernel-team@android.com
Message-ID: <5e39d509c9edc_63882ad0d49345c08@john-XPS-13-9370.notmuch>
References: <20200128021145.36774-1-palmerdabbelt@google.com>
Subject: Re: arm64: bpf: Elide some moves to a0 after calls
X-Mailing-List: linux-kernel@vger.kernel.org

Björn Töpel wrote:
> On Tue, 28 Jan 2020 at 03:14, Palmer Dabbelt wrote:
> >
> > There are four patches here, but only one of them actually does anything.  The
> > first patch fixes a BPF selftests build failure on my machine and has already
> > been sent to the list separately.  The next three are just staged such that
> > there are some patches that avoid changing any functionality pulled out from
> > the whole point of those refactorings, with two cleanups and then the idea.
> >
> > Maybe this is an odd thing to say in a cover letter, but I'm not actually sure
> > this patch set is a good idea.  The issue of extra moves after calls came up as
> > I was reviewing some unrelated performance optimizations to the RISC-V BPF JIT.
> > I figured I'd take a whack at performing the optimization in the context of the
> > arm64 port just to get a breath of fresh air, and I'm not convinced I like the
> > results.
> >
> > That said, I think I would accept something like this for the RISC-V port
> > because we're already doing a multi-pass optimization for shrinking function
> > addresses so it's not as much extra complexity over there.  If we do that we
> > should probably start pulling some of this code into the shared BPF compiler,
> > but we're also opening the doors to more complicated BPF JIT optimizations.
> > Given that the BPF JIT appears to have been designed explicitly to be
> > simple/fast as opposed to performing complex optimization, I'm not sure this is a
> > sane way to move forward.
> >
>
> Obviously I can only speak for myself and the RISC-V JIT, but given
> that we already have opened the door for more advanced translations
> (branch relaxation e.g.), I think that this makes sense. At the same
> time we don't want to go all JVM on the JITs. :-P

I'm not against it, although if we start to go this route I would want
some way to quantify how we are increasing/decreasing load times.

>
> > I figured I'd send the patch set out as more of a question than anything else.
> > Specifically:
> >
> > * How should I go about measuring the performance of this sort of
> >   optimization?  I'd like to balance the time it takes to run the JIT with the
> >   time spent executing the program, but I don't have any feel for what real BPF
> >   programs look like or have any benchmark suite to run.  Is there something
> >   out there this should be benchmarked against?  (I'd also like to know that to
> >   run those benchmarks on the RISC-V port.)
>
> If you run the selftests 'test_progs' with -v it'll measure/print the
> execution time of the programs. I'd say *most* BPF programs invoke a
> helper (via call). It would be interesting to see, for say the
> selftests, how often the optimization can be performed.
>
> > * Is this the sort of thing that makes sense in a BPF JIT?  I guess I've just
> >   realized I turned "review this patch" into a way bigger rabbit hole than I
> >   really want to go down...
> >
>
> I'd say 'yes'. My hunch, and the workloads I've seen, BPF programs are
> usually loaded, and then resident for a long time. So, the JIT time is
> not super critical. The FB/Cilium folks can definitely provide a
> better sample point, than my hunch. ;-)

In our case the JIT time can be relevant because we are effectively
holding up a Kubernetes pod load waiting for programs to load. However,
we can probably work around it by doing more aggressive dynamic linking
now that this is starting to land. It would be interesting to have a test
to measure load time in selftests or selftests/benchmark/ perhaps. We have
some of these out of tree that we could push in, I think, if there is
interest.

>
>
> Björn
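
To make the load-time versus run-time trade-off above concrete, here is a
minimal sketch of the arithmetic, not anything taken from the selftests.
The helper names (median_seconds, break_even_runs) and the two stand-in
"loaders" are hypothetical; a real test would time the actual program
load and the actual program run instead.

```python
import time
from statistics import median
from typing import Callable


def median_seconds(fn: Callable[[], object], repeats: int = 5) -> float:
    """Median wall-clock time of fn() over `repeats` calls."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def break_even_runs(extra_load_s: float, saved_per_run_s: float) -> float:
    """Invocations needed before a slower, optimizing JIT pays for itself.

    extra_load_s:    added load (JIT) time caused by the optimization
    saved_per_run_s: execution time saved on each program invocation
    Returns float('inf') when nothing is saved per run.
    """
    if saved_per_run_s <= 0:
        return float("inf")
    return max(extra_load_s, 0.0) / saved_per_run_s


if __name__ == "__main__":
    # Hypothetical stand-ins for "load without" and "load with" the optimization.
    baseline_load = lambda: sum(range(10_000))
    optimized_load = lambda: sum(range(20_000))

    extra = median_seconds(optimized_load) - median_seconds(baseline_load)
    # Assumed per-invocation saving; a real value would have to be measured.
    print(break_even_runs(extra, saved_per_run_s=1e-9))
```

A long-resident program clears the break-even count quickly, while a
program loaded on a latency-sensitive path (such as a pod start) may care
more about the extra load time itself than about the break-even point.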