From mboxrd@z Thu Jan 1 00:00:00 1970
X-Mailing-List: llvm@lists.linux.dev
References: <76CB17D0-5A66-4D49-A389-8F40EC830DC0@sladewatkins.net> <85822c3c-2254-52cc-e6b1-9c89adb63771@amd.com> <85aabdc8-07cd-3285-1f3f-605f9ebbab18@amd.com>
From: Namhyung Kim
Date: Fri, 23 Jun 2023 16:18:42 -0700
Subject: Re: Invalid event (cycles:pp) in per-thread mode, enable system wide with '-a'.
To: Nick Desaulniers
Cc: Ravi Bangoria, Stephane Eranian, Slade Watkins, linux-perf-users, LKML, Ian Rogers, Kees Cook, sandipan.das@amd.com, Bill Wendling, clang-built-linux, Yonghong Song, Peter Zijlstra
Content-Type: text/plain; charset="UTF-8"

Hi Nick,

On Fri, Jun 23, 2023 at 9:23 AM Nick Desaulniers wrote:
>
> On Tue, Oct 11, 2022 at 10:05 PM Ravi Bangoria wrote:
> >
> > On 12-Oct-22 9:36 AM, Ravi Bangoria wrote:
> > > On 12-Oct-22 3:02 AM, Nick Desaulniers wrote:
> > >> On Thu, Oct 6, 2022 at 8:56 PM Ravi Bangoria wrote:
> > >>>
> > >>> +cc: PeterZ
> > >>>
> > >>>>>>>> +Ravi who may be able to say if there are any issues with the precise
> > >>>>>>>> sampling on AMD.
> > >>>>>>>
> > >>>>>>> Afaik cycles:pp will use IBS but it doesn't support per-task profiling
> > >>>>>>> since it has no task context. Ravi is working on it..
> > >>>>>>
> > >>>>>> Right.
> > >>>>>> https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
> > >>>>>
> > >>>>> Cool, thanks for working on this Ravi.
> > >>>>>
> > >>>>> I'm not sure yet whether I may replace the kernel on my corporate
> > >>>>> provided workstation, so I'm not sure yet I can help test that patch.
> > >>>>>
> > >>>>> Can you confirm that
> > >>>>> $ perf record -e cycles:pp --freq=128 --call-graph lbr --
> > >>>>>
> > >>>>> works with just that patch applied? Or is there more work required?
> > >>>>> What is the status of that patch?
> > >>>>>
> > >>>>> For context, we had difficulty upstreaming support for instrumentation
> > >>>>> based profile guided optimizations in the Linux kernel.
> > >>>>> https://lore.kernel.org/lkml/CAHk-=whqCT0BeqBQhW8D-YoLLgp_eFY=8Y=9ieREM5xx0ef08w@mail.gmail.com/
> > >>>>> We'd like to be able to use either instrumentation or sampling to
> > >>>>> optimize our builds. The major barrier to sample based approaches are
> > >>>>> architecture / micro architecture issues with sample based profile
> > >>>>> data collection, and bitrot of data processing utilities.
> > >>>>> https://github.com/google/autofdo/issues/144
> > >>>>
> > >>>> On existing AMD Zen2, Zen3 the following cmdline:
> > >>>> $ perf record -e cycles:pp --freq=128 --call-graph lbr --
> > >>>>
> > >>>> does not work. I see two reasons:
> > >>>>
> > >>>> 1. cycles:pp is likely converted into IBS op in cycle mode.
> > >>>>    Current kernels do not support IBS in per-thread mode.
> > >>>>    This is purely a kernel limitation
> > >>>
> > >>> Right, it's purely a kernel limitation. And below simple patch on top
> > >>> of event-context rewrite patch[1] should be sufficient to make cycles:pp
> > >>> working in per-process mode on AMD Zen.
> > >>>
> > >>> ---
> > >>> diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
> > >>> index c251bc44c088..de01b5d27e40 100644
> > >>> --- a/arch/x86/events/amd/ibs.c
> > >>> +++ b/arch/x86/events/amd/ibs.c
> > >>> @@ -665,7 +665,7 @@ static struct perf_ibs perf_ibs_fetch = {
> > >>>
> > >>>  static struct perf_ibs perf_ibs_op = {
> > >>>         .pmu = {
> > >>> -               .task_ctx_nr    = perf_invalid_context,
> > >>> +               .task_ctx_nr    = perf_hw_context,
> > >>>
> > >>>                 .event_init     = perf_ibs_init,
> > >>>                 .add            = perf_ibs_add,
> > >>> ---
> > >>>
> > >>> [1]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
> > >>
> > >> Hi Ravi,
> > >> I didn't see the above diff in
> > >> https://lore.kernel.org/lkml/20221008062424.313-1-ravi.bangoria@amd.com/
> > >> Was there another distinct patch you were going to send for the above?
> > >
> > > Yes Nick. I was planning to send it once the rewrite stuff goes in.
> >
> > Hi Nick,
> >
> > Since you have practical use case, would it be possible to run your workflow
> > with perf rewrite and IBS patches applied? It will help us in finding/fixing
> > more bugs and upstreaming these changes.
>
> Hi Ravi,
> Sorry, I'm not able to load a custom kernel image on my employer
> provided workstation, and I never got approval to expense hardware for
> testing this otherwise.
>
> Was there ever any update on this? I'm on 6.1.25 now and still can't run
> $ perf record -e cycles:pp --call-graph lbr
> $ cat /proc/cpuinfo
> ...
> model name : AMD Ryzen Threadripper PRO 3995WX 64-Cores
> ...

This was addressed by commit 30093056f7b2 ("perf/amd/ibs: Make IBS a core
pmu"), which landed in v6.2.

  $ git name-rev --tags --refs=v[2-6].* 30093056f7b2
  30093056f7b2 v6.2-rc1~176^2~16

https://git.kernel.org/torvalds/c/30093056f7b2

Thanks,
Namhyung
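
To check quickly whether a given machine is new enough, something like the
following could be used. It is only a minimal sketch: it assumes the fix is
present exactly when the kernel release is v6.2 or later (per the name-rev
output above) and cannot detect a distro backport of 30093056f7b2 to an older
kernel. The helper names parse_release and has_ibs_core_pmu are made up for
this illustration.

```python
import platform
import re

# Assumption: commit 30093056f7b2 first shipped in v6.2 (see the name-rev
# output above). A backport to an older kernel would not be detected here.
FIXED_IN = (6, 2)


def parse_release(release):
    """Return (major, minor) from a release string such as '6.1.25-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def has_ibs_core_pmu(release):
    """True if the release is v6.2 or later (hypothetical helper)."""
    return parse_release(release) >= FIXED_IN


if __name__ == "__main__":
    release = platform.release()
    status = "v6.2 or later" if has_ibs_core_pmu(release) else "predates v6.2"
    print(f"{release}: {status}")
```

With the 6.1.25 kernel mentioned above, has_ibs_core_pmu("6.1.25") is False,
which matches the failure reported for per-thread cycles:pp.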