From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 18 Feb 2020 11:28:13 -0800 (PST)
Subject: Re: arm64: bpf: Elide some moves to a0 after calls
In-Reply-To: <5e39d509c9edc_63882ad0d49345c08@john-XPS-13-9370.notmuch>
From: Palmer Dabbelt
To: john.fastabend@gmail.com
CC: Bjorn Topel, daniel@iogearbox.net, ast@kernel.org, zlim.lnx@gmail.com,
    catalin.marinas@arm.com, will@kernel.org, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, andriin@fb.com, shuah@kernel.org,
    netdev@vger.kernel.org, bpf@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org, clang-built-linux@googlegroups.com,
    kernel-team@android.com
Content-Type: text/plain; charset=utf-8; format=flowed
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 04 Feb 2020 12:33:13 PST (-0800), john.fastabend@gmail.com wrote:
> Björn Töpel wrote:
>> On Tue, 28 Jan 2020 at 03:14, Palmer Dabbelt wrote:
>> >
>> > There are four patches here, but only one of them actually does anything.
>> > The first patch fixes a BPF selftests build failure on my machine and has
>> > already been sent to the list separately. The next three are staged so
>> > that the refactorings that don't change any functionality are split out
>> > from the patch that implements the idea: two cleanups and then the
>> > optimization itself.
>> >
>> > Maybe this is an odd thing to say in a cover letter, but I'm not actually
>> > sure this patch set is a good idea. The issue of extra moves after calls
>> > came up as I was reviewing some unrelated performance optimizations to
>> > the RISC-V BPF JIT. I figured I'd take a whack at performing the
>> > optimization in the context of the arm64 port just to get a breath of
>> > fresh air, and I'm not convinced I like the results.
>> >
>> > That said, I think I would accept something like this for the RISC-V
>> > port, because we're already doing a multi-pass optimization for shrinking
>> > function addresses, so it's not as much extra complexity over there. If
>> > we do that we should probably start pulling some of this code into the
>> > shared BPF compiler, but we're also opening the door to more complicated
>> > BPF JIT optimizations. Given that the BPF JIT appears to have been
>> > designed explicitly to be simple and fast, as opposed to performing
>> > complex optimizations, I'm not sure this is a sane way to move forward.
>> >
>>
>> Obviously I can only speak for myself and the RISC-V JIT, but given
>> that we already have opened the door for more advanced translations
>> (branch relaxation, e.g.), I think that this makes sense. At the same
>> time we don't want to go all JVM on the JITs. :-P
>
> I'm not against it, although if we start to go this route I would want
> some way to quantify how we are increasing/decreasing load times.
>
>>
>> > I figured I'd send the patch set out as more of a question than
>> > anything else. Specifically:
>> >
>> > * How should I go about measuring the performance of these sorts of
>> >   optimizations? I'd like to balance the time it takes to run the JIT
>> >   with the time spent executing the program, but I don't have any feel
>> >   for what real BPF programs look like or have any benchmark suite to
>> >   run. Is there something out there this should be benchmarked
>> >   against? (I'd also like to know so I can run those benchmarks on the
>> >   RISC-V port.)
>>
>> If you run the selftests 'test_progs' with -v it'll measure and print
>> the execution time of the programs. I'd say *most* BPF programs invoke
>> a helper (via a call). It would be interesting to see, for say the
>> selftests, how often the optimization can be performed.
>>
>> > * Is this the sort of thing that makes sense in a BPF JIT? I guess
>> >   I've just realized I turned "review this patch" into a way bigger
>> >   rabbit hole than I really want to go down...
>> >
>>
>> I'd say 'yes'. My hunch, from the workloads I've seen, is that BPF
>> programs are usually loaded once and then stay resident for a long
>> time, so the JIT time is not super critical. The FB/Cilium folks can
>> definitely provide a better sample point than my hunch. ;-)
>
> In our case the JIT time can be relevant, because we are effectively
> holding up a Kubernetes pod load waiting for programs to load. However,
> we can probably work around it by doing more aggressive dynamic linking
> now that this is starting to land.
>
> It would be interesting to have a test to measure load time in selftests
> or selftests/benchmark/ perhaps. We have some of these out of tree we
> could push in, I think, if there is interest.

I'd be interested in some sort of benchmark suite for BPF.
Something like selftests/bpf/benchmarks/ seems like a reasonable place to
me.

>>
>> Björn