From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A, SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54B3FC48BCD for ; Wed, 9 Jun 2021 08:23:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2CE2860FE5 for ; Wed, 9 Jun 2021 08:23:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237618AbhFIIZG (ORCPT ); Wed, 9 Jun 2021 04:25:06 -0400 Received: from mga01.intel.com ([192.55.52.88]:7219 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236161AbhFIIZC (ORCPT ); Wed, 9 Jun 2021 04:25:02 -0400 IronPort-SDR: npEVE186Hlzc8j6uSYt+hBaEPhWWnzZ+YTkyfMWbnX0/GAD9f9meAKZnxpLbbrq7qpHlEMJ9Jn MtIoLOzh/Yvw== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="226395376" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="226395376" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Jun 2021 01:23:00 -0700 IronPort-SDR: 7PEud1cuUSnZaaYrwPItsgM29nX57AV8UOQZF+rIBpJ1HMHahxHxx4pGAdJBq15m7Pker0o5nl bzOIPse5OozQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482294965" Received: from ahunter-desktop.fi.intel.com (HELO [10.237.72.79]) ([10.237.72.79]) by orsmga001.jf.intel.com with ESMTP; 09 Jun 2021 01:22:56 -0700 Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case To: Leo Yan Cc: Arnaldo Carvalho de Melo , Peter Zijlstra , Ingo Molnar , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Thomas Gleixner , x86@kernel.org, "H. Peter Anvin" , Mathieu Poirier , Suzuki K Poulose , Mike Leach , Andi Kleen , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org References: <20210602103007.184993-1-leo.yan@linaro.org> <20210602103007.184993-9-leo.yan@linaro.org> <20210602123847.GE10272@leoy-ThinkPad-X240s> <96e5fac6-17a2-ea03-9b15-338b84321ecf@intel.com> <20210607150903.GC1071897@leoy-ThinkPad-X240s> From: Adrian Hunter Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki Message-ID: Date: Wed, 9 Jun 2021 11:23:25 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210607150903.GC1071897@leoy-ThinkPad-X240s> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 7/06/21 6:09 pm, Leo Yan wrote: > On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote: >> On 2/06/21 3:38 pm, Leo Yan wrote: >>> Hi Adrain, >>> >>> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote: >>>> On 2/06/21 1:30 pm, Leo Yan wrote: >>>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly >>>>> report the error and bail out for this case. >>>>> >>>>> Now only applies on x86_64 and Arm64 platforms. >>>>> >>>>> Suggested-by: Adrian Hunter >>>> >>>> Maybe we can do better for the compat case. >>>> >>>> We can assume the upper 32-bits change very seldom, >>>> and always increase. So for the 'read' case: >>>> >>>> u64 first, second, last; >>>> u64 mask = (u64)((u32)-1) << 32; >>>> >>>> do { >>>> first = READ_ONCE(pc->aux_head); >>>> rmb(); >>>> second = READ_ONCE(pc->aux_head); >>>> rmb(); >>>> last = READ_ONCE(pc->aux_head); >>>> } while ((first & mask) != (last & mask)); >>>> return second; >>>> >>>> For the write case, we can cause a fatal error only if the new >>>> tail has non-zero upper 32-bits. That gives up to 4GiB of data >>>> before aborting: >>>> >>>> if (tail & mask) >>>> return -1; >>>> smp_mb(); >>>> WRITE_ONCE(pc->aux_tail, tail); >>> >>> Seems to me, it's pointless to only support aux_head for 64-bit and >>> support aux_tail for 32-bit. I understand this can be helpful for the >>> snapshot mode which only uses aux_head, but it still fails to support >>> the normal case for AUX ring buffer using 64-bit head/tail. >> >> I am not sure why you say it is pointless. 'perf record' would still be >> able to capture up to 4GiB of data. Do you mean you usually capture more >> than 4GiB of data? > > Okay, understand. We can support 32-bit perf for compat mode when the > trace data is less than 4GiB. > >> I was thinking we would separate out the compat case: >> >> #if BITS_PER_LONG == 32 >> if (kernel_is_64_bit) >> return compat_auxtrace_mmap__[read_head/write_tail]() >> #endif >> >> So the non-compat cases would not be affected. > > Because I don't want to introduce the complexity for read/write head > and tail, and we also need to handle the same issue for the perf ring > buffer. So how about below change? > > The main idea for below change is it allows the perf to run normally > on the compat mode and exitly if detects the buffer head is close to > the low 32-bit's overflow: when detect the low 32-bit value is bigger > than 0xf0000000 (so we have 256MiB margin to the overflow), it reports > error and exit. > > diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c > index 1b4091a3b508..2a9965bfeab4 100644 > --- a/tools/perf/util/auxtrace.c > +++ b/tools/perf/util/auxtrace.c > @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map, > pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n", > mm->idx, old, head, head - old); > > +#ifdef BITS_PER_LONG == 32 > + if (kernel_is_64bit() && head >= 0xf0000000) { You are assuming the head never increases by more than 256MiB which means you should limit the buffer size to 256MiB maximum. To me this seems a bit too far from an ideal solution. I would have thought separating out the compat case makes things simpler to understand. > + pr_err("32-bit perf cannot read 64-bit value atomically;\n"); > + pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n"); > + return -ENOMEM; > + } > +#endif > + > if (mm->mask) { > head_off = head & mm->mask; > old_off = old & mm->mask; > diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c > index 9130f6fad8d5..823b69895b85 100644 > --- a/tools/perf/util/env.c > +++ b/tools/perf/util/env.c > @@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu) > > return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1; > } > + > +int perf_kernel_is_64bit(void) > +{ > + struct utsname uts; > + int ret; > + > + ret = uname(&uts); > + if (ret < 0) > + return 0; > + > + if (!strncmp(uts.machine, "x86_64", 6) || > + !strncmp(uts.machine, "aarch64", 7) || > + !strncmp(uts.machine, "arm64", 5)) > + return 1; > + > + return 0; > +} Obviously, we don't need to keep checking uname. It could be a global variable that is always set up early. > diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h > index ca249bf5e984..c6c034fc08f6 100644 > --- a/tools/perf/util/env.h > +++ b/tools/perf/util/env.h > @@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node); > struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id); > > int perf_env__numa_node(struct perf_env *env, int cpu); > + > +int perf_kernel_is_64bit(void); > #endif /* __PERF_ENV_H */ > diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c > index ab7108d22428..f1d3725d599a 100644 > --- a/tools/perf/util/mmap.c > +++ b/tools/perf/util/mmap.c > @@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to, > if (rc < 0) > return (rc == -EAGAIN) ? 1 : -1; > > +#ifdef BITS_PER_LONG == 32 > + if (kernel_is_64bit() && head >= 0xf0000000) { > + pr_err("32-bit perf cannot read 64-bit value atomically;\n"); > + pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n"); > + return -ENOMEM; > + } > +#endif > + > size = md->core.end - md->core.start; > > if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) { > > Thanks, > Leo > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 981C3C47095 for ; Wed, 9 Jun 2021 09:09:50 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 53DD3610E6 for ; Wed, 9 Jun 2021 09:09:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 53DD3610E6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:Date: Message-ID:From:References:Cc:To:Subject:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=pKzJMqjgiVykXlKyS9nWT0KFbs3e8pVz7Wdga9zzyn8=; b=42bK3wNAPGkwPCd6di5cWH/iy1 Tcq9rsXNJkqYwfTy8mW+i7or7Rwby42Wn7nBGxisLi0XrIWdPKjRf6XTtNFHf8Wr2LdVyHP4o3YgM hus9Mz6Icgvurmj1ALhzpPNqnfX6cjy+iV9u1zeFrFGhzpYBAhrEhiwQCzjblMQR9S6P4jw2mnraA mQLKadM+KoVXnPOVdUtoeE16QTeYZFZoHLFuZeyFj7wOz5j6AX0HWCvXall9q8V96RxPJ5UlL/gap 8KmUqky6bem1C1SqOK+/+NclJHWtI6pjM5coQbxsH4PxQcijoT5LdzDzhNCRgoZcrtLZx6YYIBY/V OQhSscsg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1lquBa-00CZII-CF; Wed, 09 Jun 2021 09:07:43 +0000 Received: from mga06.intel.com ([134.134.136.31]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1lqtUM-00CM8w-UY for linux-arm-kernel@lists.infradead.org; Wed, 09 Jun 2021 08:23:05 +0000 IronPort-SDR: /OeG/rA1YDV4bUdP2NFtXZpZAvzV/veKzFS2AxER0E+dH7z6YmFYaWqMKyNsIHF8tLJdkOnbBE ToYc6WxIPQCA== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="266182905" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="266182905" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Jun 2021 01:23:00 -0700 IronPort-SDR: 7PEud1cuUSnZaaYrwPItsgM29nX57AV8UOQZF+rIBpJ1HMHahxHxx4pGAdJBq15m7Pker0o5nl bzOIPse5OozQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482294965" Received: from ahunter-desktop.fi.intel.com (HELO [10.237.72.79]) ([10.237.72.79]) by orsmga001.jf.intel.com with ESMTP; 09 Jun 2021 01:22:56 -0700 Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case To: Leo Yan Cc: Arnaldo Carvalho de Melo , Peter Zijlstra , Ingo Molnar , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Thomas Gleixner , x86@kernel.org, "H. Peter Anvin" , Mathieu Poirier , Suzuki K Poulose , Mike Leach , Andi Kleen , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org References: <20210602103007.184993-1-leo.yan@linaro.org> <20210602103007.184993-9-leo.yan@linaro.org> <20210602123847.GE10272@leoy-ThinkPad-X240s> <96e5fac6-17a2-ea03-9b15-338b84321ecf@intel.com> <20210607150903.GC1071897@leoy-ThinkPad-X240s> From: Adrian Hunter Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki Message-ID: Date: Wed, 9 Jun 2021 11:23:25 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210607150903.GC1071897@leoy-ThinkPad-X240s> Content-Language: en-US X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210609_012303_134723_4B6DC8DC X-CRM114-Status: GOOD ( 41.24 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 7/06/21 6:09 pm, Leo Yan wrote: > On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote: >> On 2/06/21 3:38 pm, Leo Yan wrote: >>> Hi Adrain, >>> >>> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote: >>>> On 2/06/21 1:30 pm, Leo Yan wrote: >>>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly >>>>> report the error and bail out for this case. >>>>> >>>>> Now only applies on x86_64 and Arm64 platforms. >>>>> >>>>> Suggested-by: Adrian Hunter >>>> >>>> Maybe we can do better for the compat case. >>>> >>>> We can assume the upper 32-bits change very seldom, >>>> and always increase. So for the 'read' case: >>>> >>>> u64 first, second, last; >>>> u64 mask = (u64)((u32)-1) << 32; >>>> >>>> do { >>>> first = READ_ONCE(pc->aux_head); >>>> rmb(); >>>> second = READ_ONCE(pc->aux_head); >>>> rmb(); >>>> last = READ_ONCE(pc->aux_head); >>>> } while ((first & mask) != (last & mask)); >>>> return second; >>>> >>>> For the write case, we can cause a fatal error only if the new >>>> tail has non-zero upper 32-bits. That gives up to 4GiB of data >>>> before aborting: >>>> >>>> if (tail & mask) >>>> return -1; >>>> smp_mb(); >>>> WRITE_ONCE(pc->aux_tail, tail); >>> >>> Seems to me, it's pointless to only support aux_head for 64-bit and >>> support aux_tail for 32-bit. I understand this can be helpful for the >>> snapshot mode which only uses aux_head, but it still fails to support >>> the normal case for AUX ring buffer using 64-bit head/tail. >> >> I am not sure why you say it is pointless. 'perf record' would still be >> able to capture up to 4GiB of data. Do you mean you usually capture more >> than 4GiB of data? > > Okay, understand. We can support 32-bit perf for compat mode when the > trace data is less than 4GiB. > >> I was thinking we would separate out the compat case: >> >> #if BITS_PER_LONG == 32 >> if (kernel_is_64_bit) >> return compat_auxtrace_mmap__[read_head/write_tail]() >> #endif >> >> So the non-compat cases would not be affected. > > Because I don't want to introduce the complexity for read/write head > and tail, and we also need to handle the same issue for the perf ring > buffer. So how about below change? > > The main idea for below change is it allows the perf to run normally > on the compat mode and exitly if detects the buffer head is close to > the low 32-bit's overflow: when detect the low 32-bit value is bigger > than 0xf0000000 (so we have 256MiB margin to the overflow), it reports > error and exit. > > diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c > index 1b4091a3b508..2a9965bfeab4 100644 > --- a/tools/perf/util/auxtrace.c > +++ b/tools/perf/util/auxtrace.c > @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map, > pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n", > mm->idx, old, head, head - old); > > +#ifdef BITS_PER_LONG == 32 > + if (kernel_is_64bit() && head >= 0xf0000000) { You are assuming the head never increases by more than 256MiB which means you should limit the buffer size to 256MiB maximum. To me this seems a bit too far from an ideal solution. I would have thought separating out the compat case makes things simpler to understand. > + pr_err("32-bit perf cannot read 64-bit value atomically;\n"); > + pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n"); > + return -ENOMEM; > + } > +#endif > + > if (mm->mask) { > head_off = head & mm->mask; > old_off = old & mm->mask; > diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c > index 9130f6fad8d5..823b69895b85 100644 > --- a/tools/perf/util/env.c > +++ b/tools/perf/util/env.c > @@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu) > > return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1; > } > + > +int perf_kernel_is_64bit(void) > +{ > + struct utsname uts; > + int ret; > + > + ret = uname(&uts); > + if (ret < 0) > + return 0; > + > + if (!strncmp(uts.machine, "x86_64", 6) || > + !strncmp(uts.machine, "aarch64", 7) || > + !strncmp(uts.machine, "arm64", 5)) > + return 1; > + > + return 0; > +} Obviously, we don't need to keep checking uname. It could be a global variable that is always set up early. > diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h > index ca249bf5e984..c6c034fc08f6 100644 > --- a/tools/perf/util/env.h > +++ b/tools/perf/util/env.h > @@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node); > struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id); > > int perf_env__numa_node(struct perf_env *env, int cpu); > + > +int perf_kernel_is_64bit(void); > #endif /* __PERF_ENV_H */ > diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c > index ab7108d22428..f1d3725d599a 100644 > --- a/tools/perf/util/mmap.c > +++ b/tools/perf/util/mmap.c > @@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to, > if (rc < 0) > return (rc == -EAGAIN) ? 1 : -1; > > +#ifdef BITS_PER_LONG == 32 > + if (kernel_is_64bit() && head >= 0xf0000000) { > + pr_err("32-bit perf cannot read 64-bit value atomically;\n"); > + pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n"); > + return -ENOMEM; > + } > +#endif > + > size = md->core.end - md->core.start; > > if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) { > > Thanks, > Leo > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel