stable.vger.kernel.org archive mirror
* [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status
@ 2021-03-12 13:21 kan.liang
  2021-03-12 13:21 ` [PATCH 2/2] perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT kan.liang
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: kan.liang @ 2021-03-12 13:21 UTC
  To: peterz, mingo, linux-kernel
  Cc: vincent.weaver, eranian, ak, Kan Liang, stable

From: Kan Liang <kan.liang@linux.intel.com>

A repeatable crash can be triggered by the perf_fuzzer on some Haswell
systems.
https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/

On some old CPUs (HSW and earlier), the PEBS status in a PEBS record
may be mistakenly set to 0. To minimize the impact of this defect,
commit 01330d7288e0 was introduced to avoid dropping the PEBS record in
some cases. It adds a check in intel_pmu_drain_pebs_nhm() and updates
the local pebs_status accordingly. However, it doesn't correct the PEBS
status in the PEBS record itself, which may trigger the crash,
especially with large PEBS.

It's possible that all the PEBS records in a large PEBS have a PEBS
status of 0. If so, the first get_next_pebs_record_by_bit() call in
__intel_pmu_pebs_event() returns NULL, so at = NULL. Since it's a large
PEBS, the 'count' parameter must be > 1, and the second
get_next_pebs_record_by_bit() call will crash.

Besides the local pebs_status, correct the PEBS status in the PEBS
record as well.

Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/events/intel/ds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7ebae18..bcf4fa5 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2010,7 +2010,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 		 */
 		if (!pebs_status && cpuc->pebs_enabled &&
 			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
-			pebs_status = cpuc->pebs_enabled;
+			pebs_status = p->status = cpuc->pebs_enabled;
 
 		bit = find_first_bit((unsigned long *)&pebs_status,
 					x86_pmu.max_pebs_events);
-- 
2.7.4
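
The effect of the one-line change above can be modelled in user space.
The following is only a minimal sketch, not the kernel code: struct
pebs_rec, fixup_status() and next_record_by_bit() are hypothetical,
simplified stand-ins, and the fix_record flag exists only to compare the
behaviour before and after the patch. It shows that patching just the
local pebs_status leaves every record with status 0, so a later by-bit
lookup finds nothing, whereas also writing p->status lets the lookup
succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PEBS record; only the status field matters. */
struct pebs_rec {
	uint64_t status;
};

/* True when exactly one bit is set, i.e. only one PEBS event is active. */
static int single_bit_set(uint64_t mask)
{
	return mask && !(mask & (mask - 1));
}

/*
 * Model of the per-record fixup. With fix_record == 0 only the local copy
 * is corrected (before the patch); with fix_record != 0 the record is
 * corrected as well (after the patch).
 */
static uint64_t fixup_status(struct pebs_rec *p, uint64_t pebs_enabled,
			     int fix_record)
{
	uint64_t pebs_status = p->status;

	if (!pebs_status && single_bit_set(pebs_enabled)) {
		pebs_status = pebs_enabled;
		if (fix_record)
			p->status = pebs_enabled;
	}
	return pebs_status;
}

/*
 * Simplified model of a later lookup by bit: return the first record at
 * index >= from whose status has 'bit' set, or NULL if there is none.
 */
static struct pebs_rec *next_record_by_bit(struct pebs_rec *recs, size_t n,
					   size_t from, int bit)
{
	for (size_t i = from; i < n; i++)
		if (recs[i].status & (1ULL << bit))
			return &recs[i];
	return NULL;
}
```

In this model, running fixup_status() with fix_record == 0 over a buffer
of all-zero records still makes next_record_by_bit() return NULL, which
corresponds to the at = NULL case described in the commit message.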



* [PATCH 2/2] perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT
  2021-03-12 13:21 [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status kan.liang
@ 2021-03-12 13:21 ` kan.liang
  2021-03-17 12:38   ` [tip: perf/urgent] " tip-bot2 for Kan Liang
  2021-03-17  3:01 ` [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status Namhyung Kim
  2021-03-17 12:38 ` [tip: perf/urgent] " tip-bot2 for Kan Liang
  2 siblings, 1 reply; 5+ messages in thread
From: kan.liang @ 2021-03-12 13:21 UTC
  To: peterz, mingo, linux-kernel
  Cc: vincent.weaver, eranian, ak, Kan Liang, stable

From: Kan Liang <kan.liang@linux.intel.com>

On a Haswell machine, the perf_fuzzer managed to trigger this message:

[117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to
write 0x0400000000000000) at rIP: 0xffffffff8106e4f4
(native_write_msr+0x4/0x20)
[117248.089957] Call Trace:
[117248.092685]  intel_pmu_pebs_enable_all+0x31/0x40
[117248.097737]  intel_pmu_enable_all+0xa/0x10
[117248.102210]  __perf_event_task_sched_in+0x2df/0x2f0
[117248.107511]  finish_task_switch.isra.0+0x15f/0x280
[117248.112765]  schedule_tail+0xc/0x40
[117248.116562]  ret_from_fork+0x8/0x30

A fake event called VLBR_EVENT may use bit 58 of PEBS_ENABLE if
precise_ip is set. Bit 58 is reserved by the HW, so writing it causes
the unchecked MSR access error.

The fake event doesn't support PEBS, so this case should be rejected.

Fixes: 097e4311cda9 ("perf/x86: Add constraint to create guest LBR event without hw counter")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/events/intel/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5bac48d..6789619 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3659,6 +3659,9 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		return ret;
 
 	if (event->attr.precise_ip) {
+		if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == INTEL_FIXED_VLBR_EVENT)
+			return -EINVAL;
+
 		if (!(event->attr.freq || (event->attr.wakeup_events && !event->attr.watermark))) {
 			event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
 			if (!(event->attr.sample_type &
-- 
2.7.4
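
The value in the log and the new check can be reproduced with a small
user-space sketch. This is an illustration under stated assumptions, not
the kernel implementation: the macro values below (a 0xffff event mask
and 0x1b00 for the VLBR event encoding) are assumed to mirror the
kernel's INTEL_ARCH_EVENT_MASK and INTEL_FIXED_VLBR_EVENT and should be
verified against the kernel headers; pebs_enable_bit() and
check_precise() are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed values; verify against the kernel's perf_event.h before relying on them. */
#define ARCH_EVENT_MASK		0xffffULL	/* event select + unit mask */
#define FIXED_VLBR_EVENT	0x1b00ULL	/* encoding of the fake VLBR event */
#define VLBR_IDX		58		/* bit named in the commit message */

/* PEBS_ENABLE bit that an event at counter index idx would set. */
static uint64_t pebs_enable_bit(int idx)
{
	return 1ULL << idx;
}

/*
 * Model of the added check: a precise (PEBS) request for the fake VLBR
 * event is rejected; every other combination is left alone.
 */
static int check_precise(uint64_t config, int precise_ip)
{
	if (precise_ip && (config & ARCH_EVENT_MASK) == FIXED_VLBR_EVENT)
		return -EINVAL;
	return 0;
}
```

Here pebs_enable_bit(VLBR_IDX) evaluates to 0x0400000000000000, the value
the trace shows being written to MSR 0x3f1.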



* Re: [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status
  2021-03-12 13:21 [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status kan.liang
  2021-03-12 13:21 ` [PATCH 2/2] perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT kan.liang
@ 2021-03-17  3:01 ` Namhyung Kim
  2021-03-17 12:38 ` [tip: perf/urgent] " tip-bot2 for Kan Liang
  2 siblings, 0 replies; 5+ messages in thread
From: Namhyung Kim @ 2021-03-17  3:01 UTC
  To: kan.liang
  Cc: peterz, mingo, linux-kernel, vincent.weaver, eranian, ak, stable

On Fri, Mar 12, 2021 at 05:21:37AM -0800, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> A repeatable crash can be triggered by the perf_fuzzer on some Haswell
> systems.
> https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/
> 
> On some old CPUs (HSW and earlier), the PEBS status in a PEBS record
> may be mistakenly set to 0. To minimize the impact of this defect,
> commit 01330d7288e0 was introduced to avoid dropping the PEBS record in
> some cases. It adds a check in intel_pmu_drain_pebs_nhm() and updates
> the local pebs_status accordingly. However, it doesn't correct the PEBS
> status in the PEBS record itself, which may trigger the crash,
> especially with large PEBS.
> 
> It's possible that all the PEBS records in a large PEBS have a PEBS
> status of 0. If so, the first get_next_pebs_record_by_bit() call in
> __intel_pmu_pebs_event() returns NULL, so at = NULL. Since it's a large
> PEBS, the 'count' parameter must be > 1, and the second
> get_next_pebs_record_by_bit() call will crash.
> 
> Besides the local pebs_status, correct the PEBS status in the PEBS
> record as well.
> 
> Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event")
> Reported-by: Vince Weaver <vincent.weaver@maine.edu>
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: stable@vger.kernel.org

Tested-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung


> ---
>  arch/x86/events/intel/ds.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 7ebae18..bcf4fa5 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -2010,7 +2010,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>  		 */
>  		if (!pebs_status && cpuc->pebs_enabled &&
>  			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
> -			pebs_status = cpuc->pebs_enabled;
> +			pebs_status = p->status = cpuc->pebs_enabled;
>  
>  		bit = find_first_bit((unsigned long *)&pebs_status,
>  					x86_pmu.max_pebs_events);
> -- 
> 2.7.4
> 


* [tip: perf/urgent] perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT
  2021-03-12 13:21 ` [PATCH 2/2] perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT kan.liang
@ 2021-03-17 12:38   ` tip-bot2 for Kan Liang
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Kan Liang @ 2021-03-17 12:38 UTC
  To: linux-tip-commits
  Cc: Vince Weaver, Kan Liang, Peter Zijlstra (Intel),
	stable, x86, linux-kernel

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     2dc0572f2cef87425147658698dce2600b799bd3
Gitweb:        https://git.kernel.org/tip/2dc0572f2cef87425147658698dce2600b799bd3
Author:        Kan Liang <kan.liang@linux.intel.com>
AuthorDate:    Fri, 12 Mar 2021 05:21:38 -08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 16 Mar 2021 21:44:39 +01:00

perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT

On a Haswell machine, the perf_fuzzer managed to trigger this message:

[117248.075892] unchecked MSR access error: WRMSR to 0x3f1 (tried to
write 0x0400000000000000) at rIP: 0xffffffff8106e4f4
(native_write_msr+0x4/0x20)
[117248.089957] Call Trace:
[117248.092685]  intel_pmu_pebs_enable_all+0x31/0x40
[117248.097737]  intel_pmu_enable_all+0xa/0x10
[117248.102210]  __perf_event_task_sched_in+0x2df/0x2f0
[117248.107511]  finish_task_switch.isra.0+0x15f/0x280
[117248.112765]  schedule_tail+0xc/0x40
[117248.116562]  ret_from_fork+0x8/0x30

A fake event called VLBR_EVENT may use bit 58 of PEBS_ENABLE if
precise_ip is set. Bit 58 is reserved by the HW, so writing it causes
the unchecked MSR access error.

The fake event doesn't support PEBS, so this case should be rejected.

Fixes: 097e4311cda9 ("perf/x86: Add constraint to create guest LBR event without hw counter")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1615555298-140216-2-git-send-email-kan.liang@linux.intel.com
---
 arch/x86/events/intel/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 7bbb5bb..37ce384 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3659,6 +3659,9 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		return ret;
 
 	if (event->attr.precise_ip) {
+		if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == INTEL_FIXED_VLBR_EVENT)
+			return -EINVAL;
+
 		if (!(event->attr.freq || (event->attr.wakeup_events && !event->attr.watermark))) {
 			event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
 			if (!(event->attr.sample_type &


* [tip: perf/urgent] perf/x86/intel: Fix a crash caused by zero PEBS status
  2021-03-12 13:21 [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status kan.liang
  2021-03-12 13:21 ` [PATCH 2/2] perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT kan.liang
  2021-03-17  3:01 ` [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status Namhyung Kim
@ 2021-03-17 12:38 ` tip-bot2 for Kan Liang
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Kan Liang @ 2021-03-17 12:38 UTC
  To: linux-tip-commits
  Cc: Vince Weaver, Peter Zijlstra (Intel),
	Kan Liang, stable, x86, linux-kernel

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     d88d05a9e0b6d9356e97129d4ff9942d765f46ea
Gitweb:        https://git.kernel.org/tip/d88d05a9e0b6d9356e97129d4ff9942d765f46ea
Author:        Kan Liang <kan.liang@linux.intel.com>
AuthorDate:    Fri, 12 Mar 2021 05:21:37 -08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 16 Mar 2021 21:44:39 +01:00

perf/x86/intel: Fix a crash caused by zero PEBS status

A repeatable crash can be triggered by the perf_fuzzer on some Haswell
systems.
https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/

On some old CPUs (HSW and earlier), the PEBS status in a PEBS record
may be mistakenly set to 0. To minimize the impact of this defect,
commit 01330d7288e0 was introduced to avoid dropping the PEBS record in
some cases. It adds a check in intel_pmu_drain_pebs_nhm() and updates
the local pebs_status accordingly. However, it doesn't correct the PEBS
status in the PEBS record itself, which may trigger the crash,
especially with large PEBS.

It's possible that all the PEBS records in a large PEBS have a PEBS
status of 0. If so, the first get_next_pebs_record_by_bit() call in
__intel_pmu_pebs_event() returns NULL, so at = NULL. Since it's a large
PEBS, the 'count' parameter must be > 1, and the second
get_next_pebs_record_by_bit() call will crash.

Besides the local pebs_status, correct the PEBS status in the PEBS
record as well.

Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1615555298-140216-1-git-send-email-kan.liang@linux.intel.com
---
 arch/x86/events/intel/ds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7ebae18..d32b302 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2010,7 +2010,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 		 */
 		if (!pebs_status && cpuc->pebs_enabled &&
 			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
-			pebs_status = cpuc->pebs_enabled;
+			pebs_status = p->status = cpuc->pebs_enabled;
 
 		bit = find_first_bit((unsigned long *)&pebs_status,
 					x86_pmu.max_pebs_events);

