[PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
@ 2021-03-03 13:42 kan.liang
  2021-03-03 18:59 ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: kan.liang @ 2021-03-03 13:42 UTC (permalink / raw)
  To: peterz, mingo, linux-kernel
  Cc: acme, mark.rutland, alexander.shishkin, jolsa, namhyung, eranian,
	ak, Kan Liang, stable

From: Kan Liang <kan.liang@linux.intel.com>

This reverts commit 01330d7288e0 ("perf/x86: Allow zero PEBS status with
only single active event")

A repeatable crash can be triggered by the perf_fuzzer on some Haswell
system.
https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/

For some old CPUs (HSW and earlier), the PEBS status in a PEBS record
may be mistakenly set to 0. To minimize the impact of the defect, the
commit was introduced to try to avoid dropping the PEBS record for some
cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates
the local pebs_status accordingly. However, it doesn't correct the PEBS
status in the PEBS record, which may trigger the crash, especially for
the large PEBS.

It's possible that all the PEBS records in a large PEBS have the PEBS
status 0. If so, the first get_next_pebs_record_by_bit() in the
__intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large
PEBS, the 'count' parameter must > 1. The second
get_next_pebs_record_by_bit() will crash.

Two solutions were considered to fix the crash.
- Keep the SW workaround and add extra checks in the
  get_next_pebs_record_by_bit() to workaround the issue. The
  get_next_pebs_record_by_bit() is a critical path. The extra checks
  will bring extra overhead for the latest CPUs which don't have the
  defect. Also, the defect can only be observed on some old CPUs
  (For example, the issue can be reproduced on an HSW client, but I
  didn't observe the issue on my Haswell server machine.). The impact
  of the defect should be limit.
  This solution is dropped.
- Drop the SW workaround and revert the commit.
  It seems that the commit never works, because the PEBS status in the
  PEBS record never be changed. The get_next_pebs_record_by_bit() only
  checks the PEBS status in the PEBS record. The record is dropped
  eventually. Reverting the commit should not change the current
  behavior.

Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/events/intel/ds.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7ebae18..9c90d1e 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 			continue;
 		}
 
-		/*
-		 * On some CPUs the PEBS status can be zero when PEBS is
-		 * racing with clearing of GLOBAL_STATUS.
-		 *
-		 * Normally we would drop that record, but in the
-		 * case when there is only a single active PEBS event
-		 * we can assume it's for that event.
-		 */
-		if (!pebs_status && cpuc->pebs_enabled &&
-			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
-			pebs_status = cpuc->pebs_enabled;
-
 		bit = find_first_bit((unsigned long *)&pebs_status,
 					x86_pmu.max_pebs_events);
 		if (bit >= x86_pmu.max_pebs_events)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-03 13:42 [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event" kan.liang
@ 2021-03-03 18:59 ` Peter Zijlstra
  2021-03-03 19:53   ` Liang, Kan
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2021-03-03 18:59 UTC (permalink / raw)
  To: kan.liang
  Cc: mingo, linux-kernel, acme, mark.rutland, alexander.shishkin,
	jolsa, namhyung, eranian, ak, stable

On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:

> For some old CPUs (HSW and earlier), the PEBS status in a PEBS record
> may be mistakenly set to 0. To minimize the impact of the defect, the
> commit was introduced to try to avoid dropping the PEBS record for some
> cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates
> the local pebs_status accordingly. However, it doesn't correct the PEBS
> status in the PEBS record, which may trigger the crash, especially for
> the large PEBS.
> 
> It's possible that all the PEBS records in a large PEBS have the PEBS
> status 0. If so, the first get_next_pebs_record_by_bit() in the
> __intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large
> PEBS, the 'count' parameter must > 1. The second
> get_next_pebs_record_by_bit() will crash.
> 
> Two solutions were considered to fix the crash.
> - Keep the SW workaround and add extra checks in the
>   get_next_pebs_record_by_bit() to workaround the issue. The
>   get_next_pebs_record_by_bit() is a critical path. The extra checks
>   will bring extra overhead for the latest CPUs which don't have the
>   defect. Also, the defect can only be observed on some old CPUs
>   (For example, the issue can be reproduced on an HSW client, but I
>   didn't observe the issue on my Haswell server machine.). The impact
>   of the defect should be limit.
>   This solution is dropped.
> - Drop the SW workaround and revert the commit.
>   It seems that the commit never works, because the PEBS status in the
>   PEBS record never be changed. The get_next_pebs_record_by_bit() only
>   checks the PEBS status in the PEBS record. The record is dropped
>   eventually. Reverting the commit should not change the current
>   behavior.

> +++ b/arch/x86/events/intel/ds.c
> @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>  			continue;
>  		}
>  
> -		/*
> -		 * On some CPUs the PEBS status can be zero when PEBS is
> -		 * racing with clearing of GLOBAL_STATUS.
> -		 *
> -		 * Normally we would drop that record, but in the
> -		 * case when there is only a single active PEBS event
> -		 * we can assume it's for that event.
> -		 */
> -		if (!pebs_status && cpuc->pebs_enabled &&
> -			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
> -			pebs_status = cpuc->pebs_enabled;

Wouldn't something like:

			pebs_status = p->status = cpus->pebs_enabled;

actually fix things without adding overhead?



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-03 18:59 ` Peter Zijlstra
@ 2021-03-03 19:53   ` Liang, Kan
  2021-03-03 20:21     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Liang, Kan @ 2021-03-03 19:53 UTC (permalink / raw)
  To: Peter Zijlstra, Vince Weaver
  Cc: mingo, linux-kernel, acme, mark.rutland, alexander.shishkin,
	jolsa, namhyung, eranian, ak, stable



On 3/3/2021 1:59 PM, Peter Zijlstra wrote:
> On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:
> 
>> For some old CPUs (HSW and earlier), the PEBS status in a PEBS record
>> may be mistakenly set to 0. To minimize the impact of the defect, the
>> commit was introduced to try to avoid dropping the PEBS record for some
>> cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates
>> the local pebs_status accordingly. However, it doesn't correct the PEBS
>> status in the PEBS record, which may trigger the crash, especially for
>> the large PEBS.
>>
>> It's possible that all the PEBS records in a large PEBS have the PEBS
>> status 0. If so, the first get_next_pebs_record_by_bit() in the
>> __intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large
>> PEBS, the 'count' parameter must > 1. The second
>> get_next_pebs_record_by_bit() will crash.
>>
>> Two solutions were considered to fix the crash.
>> - Keep the SW workaround and add extra checks in the
>>    get_next_pebs_record_by_bit() to workaround the issue. The
>>    get_next_pebs_record_by_bit() is a critical path. The extra checks
>>    will bring extra overhead for the latest CPUs which don't have the
>>    defect. Also, the defect can only be observed on some old CPUs
>>    (For example, the issue can be reproduced on an HSW client, but I
>>    didn't observe the issue on my Haswell server machine.). The impact
>>    of the defect should be limit.
>>    This solution is dropped.
>> - Drop the SW workaround and revert the commit.
>>    It seems that the commit never works, because the PEBS status in the
>>    PEBS record never be changed. The get_next_pebs_record_by_bit() only
>>    checks the PEBS status in the PEBS record. The record is dropped
>>    eventually. Reverting the commit should not change the current
>>    behavior.
> 
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>>   			continue;
>>   		}
>>   
>> -		/*
>> -		 * On some CPUs the PEBS status can be zero when PEBS is
>> -		 * racing with clearing of GLOBAL_STATUS.
>> -		 *
>> -		 * Normally we would drop that record, but in the
>> -		 * case when there is only a single active PEBS event
>> -		 * we can assume it's for that event.
>> -		 */
>> -		if (!pebs_status && cpuc->pebs_enabled &&
>> -			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
>> -			pebs_status = cpuc->pebs_enabled;
> 
> Wouldn't something like:
> 
> 			pebs_status = p->status = cpus->pebs_enabled;
>

I didn't consider it as a potential solution in this patch because I 
don't think it's a proper way that SW modifies the buffer, which is 
supposed to be manipulated by the HW.
It's just a personal preference. I don't see any issue here. We may try it.

Vince, could you please help check whether Peter's suggestion fixes the 
crash?

Thanks,
Kan

> actually fix things without adding overhead?
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-03 19:53   ` Liang, Kan
@ 2021-03-03 20:21     ` Peter Zijlstra
  2021-03-16  7:22       ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2021-03-03 20:21 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Vince Weaver, mingo, linux-kernel, acme, mark.rutland,
	alexander.shishkin, jolsa, namhyung, eranian, ak, stable

On Wed, Mar 03, 2021 at 02:53:00PM -0500, Liang, Kan wrote:
> On 3/3/2021 1:59 PM, Peter Zijlstra wrote:
> > On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:

> > > +++ b/arch/x86/events/intel/ds.c
> > > @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> > >   			continue;
> > >   		}
> > > -		/*
> > > -		 * On some CPUs the PEBS status can be zero when PEBS is
> > > -		 * racing with clearing of GLOBAL_STATUS.
> > > -		 *
> > > -		 * Normally we would drop that record, but in the
> > > -		 * case when there is only a single active PEBS event
> > > -		 * we can assume it's for that event.
> > > -		 */
> > > -		if (!pebs_status && cpuc->pebs_enabled &&
> > > -			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
> > > -			pebs_status = cpuc->pebs_enabled;
> > 
> > Wouldn't something like:
> > 
> > 			pebs_status = p->status = cpus->pebs_enabled;
> > 
> 
> I didn't consider it as a potential solution in this patch because I don't
> think it's a proper way that SW modifies the buffer, which is supposed to be
> manipulated by the HW.

Right, but then HW was supposed to write sane values and it doesn't do
that either ;-)

> It's just a personal preference. I don't see any issue here. We may try it.

So I mostly agree with you, but I think it's a shame to unsupport such
chips, HSW is still a plenty useable chip today.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-03 20:21     ` Peter Zijlstra
@ 2021-03-16  7:22       ` Namhyung Kim
  2021-03-16 12:28         ` Liang, Kan
  0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2021-03-16  7:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Liang, Kan, Vince Weaver, Ingo Molnar, linux-kernel,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Stephane Eranian, Andi Kleen, stable # 4 . 5

Hi Peter and Kan,

On Thu, Mar 4, 2021 at 5:22 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Mar 03, 2021 at 02:53:00PM -0500, Liang, Kan wrote:
> > On 3/3/2021 1:59 PM, Peter Zijlstra wrote:
> > > On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:
>
> > > > +++ b/arch/x86/events/intel/ds.c
> > > > @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> > > >                           continue;
> > > >                   }
> > > > -         /*
> > > > -          * On some CPUs the PEBS status can be zero when PEBS is
> > > > -          * racing with clearing of GLOBAL_STATUS.
> > > > -          *
> > > > -          * Normally we would drop that record, but in the
> > > > -          * case when there is only a single active PEBS event
> > > > -          * we can assume it's for that event.
> > > > -          */
> > > > -         if (!pebs_status && cpuc->pebs_enabled &&
> > > > -                 !(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
> > > > -                 pebs_status = cpuc->pebs_enabled;
> > >
> > > Wouldn't something like:
> > >
> > >                     pebs_status = p->status = cpus->pebs_enabled;
> > >
> >
> > I didn't consider it as a potential solution in this patch because I don't
> > think it's a proper way that SW modifies the buffer, which is supposed to be
> > manipulated by the HW.
>
> Right, but then HW was supposed to write sane values and it doesn't do
> that either ;-)
>
> > It's just a personal preference. I don't see any issue here. We may try it.
>
> So I mostly agree with you, but I think it's a shame to unsupport such
> chips, HSW is still a plenty useable chip today.

I got a similar issue on ivybridge machines which caused kernel crash.
My case it's related to the branch stack with PEBS events but I think
it's the same issue.  And I can confirm that the above approach of
updating p->status fixed the problem.

I've talked to Stephane about this, and he wants to make it more
robust when we see stale (or invalid) PEBS records.  I'll send the
patch soon.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-16  7:22       ` Namhyung Kim
@ 2021-03-16 12:28         ` Liang, Kan
  2021-03-16 18:34           ` Stephane Eranian
  2021-03-17  2:04           ` Namhyung Kim
  0 siblings, 2 replies; 9+ messages in thread
From: Liang, Kan @ 2021-03-16 12:28 UTC (permalink / raw)
  To: Namhyung Kim, Peter Zijlstra
  Cc: Vince Weaver, Ingo Molnar, linux-kernel,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Stephane Eranian, Andi Kleen, stable # 4 . 5



On 3/16/2021 3:22 AM, Namhyung Kim wrote:
> Hi Peter and Kan,
> 
> On Thu, Mar 4, 2021 at 5:22 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Wed, Mar 03, 2021 at 02:53:00PM -0500, Liang, Kan wrote:
>>> On 3/3/2021 1:59 PM, Peter Zijlstra wrote:
>>>> On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:
>>
>>>>> +++ b/arch/x86/events/intel/ds.c
>>>>> @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>>>>>                            continue;
>>>>>                    }
>>>>> -         /*
>>>>> -          * On some CPUs the PEBS status can be zero when PEBS is
>>>>> -          * racing with clearing of GLOBAL_STATUS.
>>>>> -          *
>>>>> -          * Normally we would drop that record, but in the
>>>>> -          * case when there is only a single active PEBS event
>>>>> -          * we can assume it's for that event.
>>>>> -          */
>>>>> -         if (!pebs_status && cpuc->pebs_enabled &&
>>>>> -                 !(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
>>>>> -                 pebs_status = cpuc->pebs_enabled;
>>>>
>>>> Wouldn't something like:
>>>>
>>>>                      pebs_status = p->status = cpus->pebs_enabled;
>>>>
>>>
>>> I didn't consider it as a potential solution in this patch because I don't
>>> think it's a proper way that SW modifies the buffer, which is supposed to be
>>> manipulated by the HW.
>>
>> Right, but then HW was supposed to write sane values and it doesn't do
>> that either ;-)
>>
>>> It's just a personal preference. I don't see any issue here. We may try it.
>>
>> So I mostly agree with you, but I think it's a shame to unsupport such
>> chips, HSW is still a plenty useable chip today.
> 
> I got a similar issue on ivybridge machines which caused kernel crash.
> My case it's related to the branch stack with PEBS events but I think
> it's the same issue.  And I can confirm that the above approach of
> updating p->status fixed the problem.
> 
> I've talked to Stephane about this, and he wants to make it more
> robust when we see stale (or invalid) PEBS records.  I'll send the
> patch soon.
> 

Hi Namhyung,

In case you didn't see it, I've already submitted a patch to fix the 
issue last Friday.
https://lore.kernel.org/lkml/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com/
But if you have a more robust proposal, please feel free to submit it.

BTW: The patch set from last Friday also fixed another bug found by the 
perf_fuzzer test. You may be interested.

Thanks,
Kan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-16 12:28         ` Liang, Kan
@ 2021-03-16 18:34           ` Stephane Eranian
  2021-03-16 19:36             ` Liang, Kan
  2021-03-17  2:04           ` Namhyung Kim
  1 sibling, 1 reply; 9+ messages in thread
From: Stephane Eranian @ 2021-03-16 18:34 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Namhyung Kim, Peter Zijlstra, Vince Weaver, Ingo Molnar,
	linux-kernel, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Andi Kleen, stable # 4 . 5

On Tue, Mar 16, 2021 at 5:28 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 3/16/2021 3:22 AM, Namhyung Kim wrote:
> > Hi Peter and Kan,
> >
> > On Thu, Mar 4, 2021 at 5:22 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Wed, Mar 03, 2021 at 02:53:00PM -0500, Liang, Kan wrote:
> >>> On 3/3/2021 1:59 PM, Peter Zijlstra wrote:
> >>>> On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:
> >>
> >>>>> +++ b/arch/x86/events/intel/ds.c
> >>>>> @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> >>>>>                            continue;
> >>>>>                    }
> >>>>> -         /*
> >>>>> -          * On some CPUs the PEBS status can be zero when PEBS is
> >>>>> -          * racing with clearing of GLOBAL_STATUS.
> >>>>> -          *
> >>>>> -          * Normally we would drop that record, but in the
> >>>>> -          * case when there is only a single active PEBS event
> >>>>> -          * we can assume it's for that event.
> >>>>> -          */
> >>>>> -         if (!pebs_status && cpuc->pebs_enabled &&
> >>>>> -                 !(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
> >>>>> -                 pebs_status = cpuc->pebs_enabled;
> >>>>
> >>>> Wouldn't something like:
> >>>>
> >>>>                      pebs_status = p->status = cpus->pebs_enabled;
> >>>>
> >>>
> >>> I didn't consider it as a potential solution in this patch because I don't
> >>> think it's a proper way that SW modifies the buffer, which is supposed to be
> >>> manipulated by the HW.
> >>
> >> Right, but then HW was supposed to write sane values and it doesn't do
> >> that either ;-)
> >>
> >>> It's just a personal preference. I don't see any issue here. We may try it.
> >>
> >> So I mostly agree with you, but I think it's a shame to unsupport such
> >> chips, HSW is still a plenty useable chip today.
> >
> > I got a similar issue on ivybridge machines which caused kernel crash.
> > My case it's related to the branch stack with PEBS events but I think
> > it's the same issue.  And I can confirm that the above approach of
> > updating p->status fixed the problem.
> >
> > I've talked to Stephane about this, and he wants to make it more
> > robust when we see stale (or invalid) PEBS records.  I'll send the
> > patch soon.
> >
>
> Hi Namhyung,
>
> In case you didn't see it, I've already submitted a patch to fix the
> issue last Friday.
> https://lore.kernel.org/lkml/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com/
> But if you have a more robust proposal, please feel free to submit it.
>
This fixes the problem on the older systems. The other problem we
identified related to the
PEBS sample processing code is that you can end up with uninitialized
perf_sample_data
struct passed to perf_event_overflow():

 setup_pebs_fixed_sample_data(pebs, data)
{
        if (!pebs)
                return;
        perf_sample_data_init(data);  <<< must be moved before the if (!pebs)
        ...
}

__intel_pmu_pebs_event(pebs, data)
{
        setup_sample(pebs, data)
        perf_event_overflow(data);
        ...
}

If there is any other reason to get a pebs = NULL in fix_sample_data()
or adaptive_sample_data(), then
you must call perf_sample_data_init(data) BEFORE you return otherwise
you end up in perf_event_overflow()
with uninitialized data and you may die as follows:

[<ffffffff812f283d>] ? perf_output_copy+0x4d/0xb0
[<ffffffff812e0fb1>] perf_output_sample+0x561/0xab0
[<ffffffff812e0952>] ? __perf_event_header__init_id+0x112/0x130
[<ffffffff812e1be1>] ? perf_prepare_sample+0x1b1/0x730
[<ffffffff812e21b9>] perf_event_output_forward+0x59/0x80
[<ffffffff812e0634>] ? perf_event_update_userpage+0xf4/0x110
[<ffffffff812e4468>] perf_event_overflow+0x88/0xe0
[<ffffffff810175b8>] __intel_pmu_pebs_event+0x328/0x380

This all stems from get_next_pebs_record_by_bit()  potentially
returning NULL but the NULL is not handled correctly
by the callers. This is what I'd like to see cleaned up in
__intel_pmu_pebs_event() to  avoid future problems.

I have a patch that moves the perf_sample_data_init() and I can send
it to LKML but it would also need the cleanup
for get_next_pebs_record_by_bit() to be complete.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-16 18:34           ` Stephane Eranian
@ 2021-03-16 19:36             ` Liang, Kan
  0 siblings, 0 replies; 9+ messages in thread
From: Liang, Kan @ 2021-03-16 19:36 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Namhyung Kim, Peter Zijlstra, Vince Weaver, Ingo Molnar,
	linux-kernel, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Andi Kleen, stable # 4 . 5



On 3/16/2021 2:34 PM, Stephane Eranian wrote:
> On Tue, Mar 16, 2021 at 5:28 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 3/16/2021 3:22 AM, Namhyung Kim wrote:
>>> Hi Peter and Kan,
>>>
>>> On Thu, Mar 4, 2021 at 5:22 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>>>
>>>> On Wed, Mar 03, 2021 at 02:53:00PM -0500, Liang, Kan wrote:
>>>>> On 3/3/2021 1:59 PM, Peter Zijlstra wrote:
>>>>>> On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:
>>>>
>>>>>>> +++ b/arch/x86/events/intel/ds.c
>>>>>>> @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>>>>>>>                             continue;
>>>>>>>                     }
>>>>>>> -         /*
>>>>>>> -          * On some CPUs the PEBS status can be zero when PEBS is
>>>>>>> -          * racing with clearing of GLOBAL_STATUS.
>>>>>>> -          *
>>>>>>> -          * Normally we would drop that record, but in the
>>>>>>> -          * case when there is only a single active PEBS event
>>>>>>> -          * we can assume it's for that event.
>>>>>>> -          */
>>>>>>> -         if (!pebs_status && cpuc->pebs_enabled &&
>>>>>>> -                 !(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
>>>>>>> -                 pebs_status = cpuc->pebs_enabled;
>>>>>>
>>>>>> Wouldn't something like:
>>>>>>
>>>>>>                       pebs_status = p->status = cpus->pebs_enabled;
>>>>>>
>>>>>
>>>>> I didn't consider it as a potential solution in this patch because I don't
>>>>> think it's a proper way that SW modifies the buffer, which is supposed to be
>>>>> manipulated by the HW.
>>>>
>>>> Right, but then HW was supposed to write sane values and it doesn't do
>>>> that either ;-)
>>>>
>>>>> It's just a personal preference. I don't see any issue here. We may try it.
>>>>
>>>> So I mostly agree with you, but I think it's a shame to unsupport such
>>>> chips, HSW is still a plenty useable chip today.
>>>
>>> I got a similar issue on ivybridge machines which caused kernel crash.
>>> My case it's related to the branch stack with PEBS events but I think
>>> it's the same issue.  And I can confirm that the above approach of
>>> updating p->status fixed the problem.
>>>
>>> I've talked to Stephane about this, and he wants to make it more
>>> robust when we see stale (or invalid) PEBS records.  I'll send the
>>> patch soon.
>>>
>>
>> Hi Namhyung,
>>
>> In case you didn't see it, I've already submitted a patch to fix the
>> issue last Friday.
>> https://lore.kernel.org/lkml/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com/
>> But if you have a more robust proposal, please feel free to submit it.
>>
> This fixes the problem on the older systems. The other problem we
> identified related to the
> PEBS sample processing code is that you can end up with uninitialized
> perf_sample_data
> struct passed to perf_event_overflow():
> 
>   setup_pebs_fixed_sample_data(pebs, data)
> {
>          if (!pebs)
>                  return;
>          perf_sample_data_init(data);  <<< must be moved before the if (!pebs)
>          ...
> }
> 
> __intel_pmu_pebs_event(pebs, data)
> {
>          setup_sample(pebs, data)
>          perf_event_overflow(data);
>          ...
> }
> 
> If there is any other reason to get a pebs = NULL in fix_sample_data()
> or adaptive_sample_data(), then
> you must call perf_sample_data_init(data) BEFORE you return otherwise
> you end up in perf_event_overflow()
> with uninitialized data and you may die as follows:
> 
> [<ffffffff812f283d>] ? perf_output_copy+0x4d/0xb0
> [<ffffffff812e0fb1>] perf_output_sample+0x561/0xab0
> [<ffffffff812e0952>] ? __perf_event_header__init_id+0x112/0x130
> [<ffffffff812e1be1>] ? perf_prepare_sample+0x1b1/0x730
> [<ffffffff812e21b9>] perf_event_output_forward+0x59/0x80
> [<ffffffff812e0634>] ? perf_event_update_userpage+0xf4/0x110
> [<ffffffff812e4468>] perf_event_overflow+0x88/0xe0
> [<ffffffff810175b8>] __intel_pmu_pebs_event+0x328/0x380
> 
> This all stems from get_next_pebs_record_by_bit()  potentially
> returning NULL but the NULL is not handled correctly
> by the callers. This is what I'd like to see cleaned up in
> __intel_pmu_pebs_event() to  avoid future problems.
> 

Got it. Yes, we need another patch to correctly handle the potentially
returning NULL. Thanks for pointing it out.

Thanks,
Kan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
  2021-03-16 12:28         ` Liang, Kan
  2021-03-16 18:34           ` Stephane Eranian
@ 2021-03-17  2:04           ` Namhyung Kim
  1 sibling, 0 replies; 9+ messages in thread
From: Namhyung Kim @ 2021-03-17  2:04 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Peter Zijlstra, Vince Weaver, Ingo Molnar, linux-kernel,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Stephane Eranian, Andi Kleen, stable # 4 . 5

On Tue, Mar 16, 2021 at 9:28 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 3/16/2021 3:22 AM, Namhyung Kim wrote:
> > Hi Peter and Kan,
> >
> > On Thu, Mar 4, 2021 at 5:22 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Wed, Mar 03, 2021 at 02:53:00PM -0500, Liang, Kan wrote:
> >>> On 3/3/2021 1:59 PM, Peter Zijlstra wrote:
> >>>> On Wed, Mar 03, 2021 at 05:42:18AM -0800, kan.liang@linux.intel.com wrote:
> >>
> >>>>> +++ b/arch/x86/events/intel/ds.c
> >>>>> @@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> >>>>>                            continue;
> >>>>>                    }
> >>>>> -         /*
> >>>>> -          * On some CPUs the PEBS status can be zero when PEBS is
> >>>>> -          * racing with clearing of GLOBAL_STATUS.
> >>>>> -          *
> >>>>> -          * Normally we would drop that record, but in the
> >>>>> -          * case when there is only a single active PEBS event
> >>>>> -          * we can assume it's for that event.
> >>>>> -          */
> >>>>> -         if (!pebs_status && cpuc->pebs_enabled &&
> >>>>> -                 !(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
> >>>>> -                 pebs_status = cpuc->pebs_enabled;
> >>>>
> >>>> Wouldn't something like:
> >>>>
> >>>>                      pebs_status = p->status = cpus->pebs_enabled;
> >>>>
> >>>
> >>> I didn't consider it as a potential solution in this patch because I don't
> >>> think it's a proper way that SW modifies the buffer, which is supposed to be
> >>> manipulated by the HW.
> >>
> >> Right, but then HW was supposed to write sane values and it doesn't do
> >> that either ;-)
> >>
> >>> It's just a personal preference. I don't see any issue here. We may try it.
> >>
> >> So I mostly agree with you, but I think it's a shame to unsupport such
> >> chips, HSW is still a plenty useable chip today.
> >
> > I got a similar issue on ivybridge machines which caused kernel crash.
> > My case it's related to the branch stack with PEBS events but I think
> > it's the same issue.  And I can confirm that the above approach of
> > updating p->status fixed the problem.
> >
> > I've talked to Stephane about this, and he wants to make it more
> > robust when we see stale (or invalid) PEBS records.  I'll send the
> > patch soon.
> >
>
> Hi Namhyung,
>
> In case you didn't see it, I've already submitted a patch to fix the
> issue last Friday.
> https://lore.kernel.org/lkml/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com/
> But if you have a more robust proposal, please feel free to submit it.
>
> BTW: The patch set from last Friday also fixed another bug found by the
> perf_fuzzer test. You may be interested.

Right, I missed it.  It'd be nice if you could CC me for perf patches later.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-03-17  2:04 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-03 13:42 [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event" kan.liang
2021-03-03 18:59 ` Peter Zijlstra
2021-03-03 19:53   ` Liang, Kan
2021-03-03 20:21     ` Peter Zijlstra
2021-03-16  7:22       ` Namhyung Kim
2021-03-16 12:28         ` Liang, Kan
2021-03-16 18:34           ` Stephane Eranian
2021-03-16 19:36             ` Liang, Kan
2021-03-17  2:04           ` Namhyung Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.