[PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes

linux-arm-msm.vger.kernel.org archive mirror

* [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes
@ 2020-10-07 13:00 Sai Prakash Ranjan
  2020-10-07 13:00 ` [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf() Sai Prakash Ranjan
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-07 13:00 UTC (permalink / raw)
  To: Mathieu Poirier, Suzuki K Poulose, Mike Leach
  Cc: coresight, Stephen Boyd, linux-arm-msm, linux-kernel,
	linux-arm-kernel, denik, leo.yan, peterz, Sai Prakash Ranjan

There was a report of a NULL pointer dereference in the ETF enable
path for perf CS mode with PID monitoring. It is almost 100%
reproducible when the monitored process is something very active,
such as chrome, and the sink is ETF rather than ETR. Currently, to
find the PID, the owner is dereferenced via a task_pid_nr() call in
tmc_enable_etf_sink_perf(), and with the owner being NULL we get a
NULL pointer dereference.

Looking at ETR and other places in the kernel, ETF and ETB are the
only sinks that dereference the task (owner) in the enable path;
for ETF this is tmc_enable_etf_sink_perf(), which is also called
from the sched_in path as in the call trace. The owner (task) is
NULL in tmc_enable_etr_sink_perf() for ETR as well, but since the
PID is cached in the alloc_buffer() callback, which runs as part of
etm_setup_aux() when allocating the buffer for the ETR sink, we
never dereference this NULL pointer and we are safe. So let's do the
same for ETF and ETB and cache the PID to which the cs_buffer
belongs in their alloc_buffer() callbacks, as done for ETR. This
also removes the unnecessary task_pid_nr() calls in
tmc_enable_etf_sink_perf() and etb_enable_perf().
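
The caching idea described above can be sketched in plain userspace C.
This is only a model under stated assumptions, not the driver code: the
model_* types and functions are hypothetical stand-ins for perf_event,
cs_buffers and the sink callbacks, and the owner is assumed to be valid
at buffer-allocation time, as the text above argues.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct model_task  { int pid; };
struct model_event { struct model_task *owner; };
struct model_buf   { int pid; };

/*
 * Setup path (models the alloc_buffer() callback): the owner is assumed
 * to be valid here, so its PID is read once and cached in the buffer.
 */
struct model_buf *model_alloc_buffer(const struct model_event *event)
{
	struct model_buf *buf = calloc(1, sizeof(*buf));

	if (!buf)
		return NULL;

	buf->pid = event->owner->pid;
	return buf;
}

/*
 * Enable path (models the sink enable callback): it never touches the
 * event owner, only the cached PID. *sink_pid == -1 means the sink is
 * free. Returns 0 on success or -EBUSY if another PID holds the sink.
 */
int model_enable_sink(int *sink_pid, const struct model_buf *buf)
{
	assert(buf != NULL);

	if (*sink_pid != -1 && *sink_pid != buf->pid)
		return -EBUSY;

	*sink_pid = buf->pid;
	return 0;
}
```

In this model, clearing event.owner after model_alloc_buffer() has run
does not affect model_enable_sink(), which is the property the series
relies on.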

Easily reproducible by running the command below:

 perf record -e cs_etm/@tmc_etf0/ -N -p <pid>

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000548
Mem abort info:
  ESR = 0x96000006
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000006
  CM = 0, WnR = 0
<snip>...
Call trace:
 tmc_enable_etf_sink+0xe4/0x280
 coresight_enable_path+0x168/0x1fc
 etm_event_start+0x8c/0xf8
 etm_event_add+0x38/0x54
 event_sched_in+0x194/0x2ac
 group_sched_in+0x54/0x12c
 flexible_sched_in+0xd8/0x120
 visit_groups_merge+0x100/0x16c
 ctx_flexible_sched_in+0x50/0x74
 ctx_sched_in+0xa4/0xa8
 perf_event_sched_in+0x60/0x6c
 perf_event_context_sched_in+0x98/0xe0
 __perf_event_task_sched_in+0x5c/0xd8
 finish_task_switch+0x184/0x1cc
 schedule_tail+0x20/0xec
 ret_from_fork+0x4/0x18

Sai Prakash Ranjan (2):
  coresight: tmc-etf: Fix NULL ptr dereference in
    tmc_enable_etf_sink_perf()
  coresight: etb10: Fix possible NULL ptr dereference in
    etb_enable_perf()

 drivers/hwtracing/coresight/coresight-etb10.c   | 4 +++-
 drivers/hwtracing/coresight/coresight-priv.h    | 2 ++
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 4 +++-
 3 files changed, 8 insertions(+), 2 deletions(-)

base-commit: 3477326277451000bc667dfcc4fd0774c039184c
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


* [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-07 13:00 [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes Sai Prakash Ranjan
@ 2020-10-07 13:00 ` Sai Prakash Ranjan
  2020-10-13 16:35   ` Suzuki K Poulose
  2020-10-07 13:00 ` [PATCH 2/2] coresight: etb10: Fix possible NULL ptr dereference in etb_enable_perf() Sai Prakash Ranjan
  2020-12-29 20:15 ` [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes patchwork-bot+linux-arm-msm
  2 siblings, 1 reply; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-07 13:00 UTC (permalink / raw)
  To: Mathieu Poirier, Suzuki K Poulose, Mike Leach
  Cc: coresight, Stephen Boyd, linux-arm-msm, linux-kernel,
	linux-arm-kernel, denik, leo.yan, peterz, Sai Prakash Ranjan

There was a report of a NULL pointer dereference in the ETF enable
path for perf CS mode with PID monitoring. It is almost 100%
reproducible when the monitored process is something very active,
such as chrome, and the sink is ETF rather than ETR. Currently, to
find the PID, the owner is dereferenced via a task_pid_nr() call in
tmc_enable_etf_sink_perf(), and with the owner being NULL we get a
NULL pointer dereference.

Looking at ETR and other places in the kernel, ETF and ETB are the
only sinks that dereference the task (owner) in the enable path;
for ETF this is tmc_enable_etf_sink_perf(), which is also called
from the sched_in path as in the call trace. The owner (task) is
NULL in tmc_enable_etr_sink_perf() for ETR as well, but since the
PID is cached in the alloc_buffer() callback, which runs as part of
etm_setup_aux() when allocating the buffer for the ETR sink, we
never dereference this NULL pointer and we are safe. So let's do the
same for ETF and cache the PID to which the cs_buffer belongs in
tmc_alloc_etf_buffer(), as done for ETR. Caching the PID also
removes the unnecessary task_pid_nr() call from the enable path.

Easily reproducible by running the command below:

 perf record -e cs_etm/@tmc_etf0/ -N -p <pid>

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000548
Mem abort info:
  ESR = 0x96000006
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000006
  CM = 0, WnR = 0
<snip>...
Call trace:
 tmc_enable_etf_sink+0xe4/0x280
 coresight_enable_path+0x168/0x1fc
 etm_event_start+0x8c/0xf8
 etm_event_add+0x38/0x54
 event_sched_in+0x194/0x2ac
 group_sched_in+0x54/0x12c
 flexible_sched_in+0xd8/0x120
 visit_groups_merge+0x100/0x16c
 ctx_flexible_sched_in+0x50/0x74
 ctx_sched_in+0xa4/0xa8
 perf_event_sched_in+0x60/0x6c
 perf_event_context_sched_in+0x98/0xe0
 __perf_event_task_sched_in+0x5c/0xd8
 finish_task_switch+0x184/0x1cc
 schedule_tail+0x20/0xec
 ret_from_fork+0x4/0x18

Fixes: 880af782c6e8 ("coresight: tmc-etf: Add support for CPU-wide trace scenarios")
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/hwtracing/coresight/coresight-priv.h    | 2 ++
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index 65a29293b6cb..f5f654ea2994 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -87,6 +87,7 @@ enum cs_mode {
  * struct cs_buffer - keep track of a recording session' specifics
  * @cur:	index of the current buffer
  * @nr_pages:	max number of pages granted to us
+ * @pid:	PID this cs_buffer belongs to
  * @offset:	offset within the current buffer
  * @data_size:	how much we collected in this run
  * @snapshot:	is this run in snapshot mode
@@ -95,6 +96,7 @@ enum cs_mode {
 struct cs_buffers {
 	unsigned int		cur;
 	unsigned int		nr_pages;
+	pid_t			pid;
 	unsigned long		offset;
 	local_t			data_size;
 	bool			snapshot;
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 44402d413ebb..989d965f3d90 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -227,6 +227,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data)
 	unsigned long flags;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 	struct perf_output_handle *handle = data;
+	struct cs_buffers *buf = etm_perf_sink_config(handle);
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 	do {
@@ -243,7 +244,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data)
 		}
 
 		/* Get a handle on the pid of the process to monitor */
-		pid = task_pid_nr(handle->event->owner);
+		pid = buf->pid;
 
 		if (drvdata->pid != -1 && drvdata->pid != pid) {
 			ret = -EBUSY;
@@ -399,6 +400,7 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev,
 	if (!buf)
 		return NULL;
 
+	buf->pid = task_pid_nr(event->owner);
 	buf->snapshot = overwrite;
 	buf->nr_pages = nr_pages;
 	buf->data_pages = pages;

* [PATCH 2/2] coresight: etb10: Fix possible NULL ptr dereference in etb_enable_perf()
  2020-10-07 13:00 [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes Sai Prakash Ranjan
  2020-10-07 13:00 ` [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf() Sai Prakash Ranjan
@ 2020-10-07 13:00 ` Sai Prakash Ranjan
  2020-12-29 20:15 ` [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes patchwork-bot+linux-arm-msm
  2 siblings, 0 replies; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-07 13:00 UTC (permalink / raw)
  To: Mathieu Poirier, Suzuki K Poulose, Mike Leach
  Cc: coresight, Stephen Boyd, linux-arm-msm, linux-kernel,
	linux-arm-kernel, denik, leo.yan, peterz, Sai Prakash Ranjan

There was a report of a NULL pointer dereference in the ETF enable
path for perf CS mode with PID monitoring. It is almost 100%
reproducible when the monitored process is something very active,
such as chrome, and the sink is ETF rather than ETR.

The ETB code follows a similar path to ETF, so the same NULL
pointer dereference is possible in ETB as well. Currently, to find
the PID, the owner is dereferenced via a task_pid_nr() call in
etb_enable_perf(), and with the owner being NULL we can get a NULL
pointer dereference. Apply the same fix as for ETF: cache the PID
in the alloc_buffer() callback, which is called as part of
etm_setup_aux().

Fixes: 75d7dbd38824 ("coresight: etb10: Add support for CPU-wide trace scenarios")
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/hwtracing/coresight/coresight-etb10.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 248cc82c838e..1b320ab581ca 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -176,6 +176,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data)
 	unsigned long flags;
 	struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 	struct perf_output_handle *handle = data;
+	struct cs_buffers *buf = etm_perf_sink_config(handle);
 
 	spin_lock_irqsave(&drvdata->spinlock, flags);
 
@@ -186,7 +187,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data)
 	}
 
 	/* Get a handle on the pid of the process to monitor */
-	pid = task_pid_nr(handle->event->owner);
+	pid = buf->pid;
 
 	if (drvdata->pid != -1 && drvdata->pid != pid) {
 		ret = -EBUSY;
@@ -383,6 +384,7 @@ static void *etb_alloc_buffer(struct coresight_device *csdev,
 	if (!buf)
 		return NULL;
 
+	buf->pid = task_pid_nr(event->owner);
 	buf->snapshot = overwrite;
 	buf->nr_pages = nr_pages;
 	buf->data_pages = pages;

* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-07 13:00 ` [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf() Sai Prakash Ranjan
@ 2020-10-13 16:35   ` Suzuki K Poulose
  2020-10-14  7:50     ` Sai Prakash Ranjan
  2020-10-14  9:36     ` Sai Prakash Ranjan
  0 siblings, 2 replies; 17+ messages in thread
From: Suzuki K Poulose @ 2020-10-13 16:35 UTC (permalink / raw)
  To: saiprakash.ranjan, mathieu.poirier, mike.leach
  Cc: coresight, swboyd, linux-arm-msm, linux-kernel, linux-arm-kernel,
	denik, leo.yan, peterz

On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
> There was a report of NULL pointer dereference in ETF enable
> path for perf CS mode with PID monitoring. It is almost 100%
> reproducible when the process to monitor is something very
> active such as chrome and with ETF as the sink and not ETR.
> Currently in a bid to find the pid, the owner is dereferenced
> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
> owner being NULL, we get a NULL pointer dereference.
> 
> Looking at the ETR and other places in the kernel, ETF and the
> ETB are the only places trying to dereference the task(owner)
> in tmc_enable_etf_sink_perf() which is also called from the
> sched_in path as in the call trace. Owner(task) is NULL even
> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
> cache the PID in alloc_buffer() callback and it is done as part
> of etm_setup_aux() when allocating buffer for ETR sink, we never
> dereference this NULL pointer and we are safe. So lets do the

The patch is necessary to fix some of the issues. But I feel it is
not complete. Why is it safe earlier and not later? I believe we are
simply reducing the chances of hitting the issue by doing this earlier
rather than later. I would say we had better fix all instances to make
sure that event->owner is valid (e.g., I can see that for kernel
events event->owner == -1?).

struct task_struct *tsk = READ_ONCE(event->owner);

if (!tsk || is_kernel_event(event))
    /* skip ? */

Suzuki



* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-13 16:35   ` Suzuki K Poulose
@ 2020-10-14  7:50     ` Sai Prakash Ranjan
  2020-10-14  9:36     ` Sai Prakash Ranjan
  1 sibling, 0 replies; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-14  7:50 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

Hi Suzuki,

On 2020-10-13 22:05, Suzuki K Poulose wrote:
> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>> There was a report of NULL pointer dereference in ETF enable
>> path for perf CS mode with PID monitoring. It is almost 100%
>> reproducible when the process to monitor is something very
>> active such as chrome and with ETF as the sink and not ETR.
>> Currently in a bid to find the pid, the owner is dereferenced
>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>> owner being NULL, we get a NULL pointer dereference.
>> 
>> Looking at the ETR and other places in the kernel, ETF and the
>> ETB are the only places trying to dereference the task(owner)
>> in tmc_enable_etf_sink_perf() which is also called from the
>> sched_in path as in the call trace. Owner(task) is NULL even
>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>> cache the PID in alloc_buffer() callback and it is done as part
>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>> dereference this NULL pointer and we are safe. So lets do the
> 
> The patch is necessary to fix some of the issues. But I feel it is
> not complete. Why is it safe earlier and not later ? I believe we are
> simply reducing the chances of hitting the issue, by doing this earlier than
> later.

I did stress it for a long time with this patch and did not face
any issues but I guess it doesn't hurt to have the check as you
suggested.

> I would say we better fix all instances to make sure that the
> event->owner is valid. (e.g, I can see that the for kernel events
> event->owner == -1 ?)
> 
> struct task_struct *tsk = READ_ONCE(event->owner);
> 
> if (!tsk || is_kernel_event(event))
>    /* skip ? */
> 

So, to confirm my understanding: should I add the above checks on top
of this patch for ETR, ETB and ETF, something like below?

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 989d965f3d90..86ff0dda0444 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -392,6 +392,10 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev,
  {
         int node;
         struct cs_buffers *buf;
+       struct task_struct *task = READ_ONCE(event->owner);
+
+       if (!task || is_kernel_event(event))
+               return NULL;

         node = (event->cpu == -1) ? NUMA_NO_NODE : cpu_to_node(event->cpu);

@@ -400,7 +404,7 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev,
         if (!buf)
                 return NULL;

-       buf->pid = task_pid_nr(event->owner);
+       buf->pid = task_pid_nr(task);
         buf->snapshot = overwrite;
         buf->nr_pages = nr_pages;
         buf->data_pages = pages;


Thanks,
Sai

* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-13 16:35   ` Suzuki K Poulose
  2020-10-14  7:50     ` Sai Prakash Ranjan
@ 2020-10-14  9:36     ` Sai Prakash Ranjan
  2020-10-14 13:16       ` Suzuki K Poulose
  1 sibling, 1 reply; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-14  9:36 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 2020-10-13 22:05, Suzuki K Poulose wrote:
> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>> There was a report of NULL pointer dereference in ETF enable
>> path for perf CS mode with PID monitoring. It is almost 100%
>> reproducible when the process to monitor is something very
>> active such as chrome and with ETF as the sink and not ETR.
>> Currently in a bid to find the pid, the owner is dereferenced
>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>> owner being NULL, we get a NULL pointer dereference.
>> 
>> Looking at the ETR and other places in the kernel, ETF and the
>> ETB are the only places trying to dereference the task(owner)
>> in tmc_enable_etf_sink_perf() which is also called from the
>> sched_in path as in the call trace. Owner(task) is NULL even
>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>> cache the PID in alloc_buffer() callback and it is done as part
>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>> dereference this NULL pointer and we are safe. So lets do the
> 
> The patch is necessary to fix some of the issues. But I feel it is
> not complete. Why is it safe earlier and not later ? I believe we are
> simply reducing the chances of hitting the issue, by doing this earlier than
> later. I would say we better fix all instances to make sure that the
> event->owner is valid. (e.g, I can see that the for kernel events
> event->owner == -1 ?)
> 
> struct task_struct *tsk = READ_ONCE(event->owner);
> 
> if (!tsk || is_kernel_event(event))
>    /* skip ? */
> 

Looking at it some more, is_kernel_event() is not exposed
outside events core and probably for good reason. Why do
we need to check for this and not just tsk?

Thanks,
Sai


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-14  9:36     ` Sai Prakash Ranjan
@ 2020-10-14 13:16       ` Suzuki K Poulose
  2020-10-14 15:59         ` Sai Prakash Ranjan
  0 siblings, 1 reply; 17+ messages in thread
From: Suzuki K Poulose @ 2020-10-14 13:16 UTC (permalink / raw)
  To: saiprakash.ranjan
  Cc: mathieu.poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>> There was a report of NULL pointer dereference in ETF enable
>>> path for perf CS mode with PID monitoring. It is almost 100%
>>> reproducible when the process to monitor is something very
>>> active such as chrome and with ETF as the sink and not ETR.
>>> Currently in a bid to find the pid, the owner is dereferenced
>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>> owner being NULL, we get a NULL pointer dereference.
>>>
>>> Looking at the ETR and other places in the kernel, ETF and the
>>> ETB are the only places trying to dereference the task(owner)
>>> in tmc_enable_etf_sink_perf() which is also called from the
>>> sched_in path as in the call trace. Owner(task) is NULL even
>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>> cache the PID in alloc_buffer() callback and it is done as part
>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>> dereference this NULL pointer and we are safe. So lets do the
>>
>> The patch is necessary to fix some of the issues. But I feel it is
>> not complete. Why is it safe earlier and not later ? I believe we are
>> simply reducing the chances of hitting the issue, by doing this earlier than
>> later. I would say we better fix all instances to make sure that the
>> event->owner is valid. (e.g, I can see that the for kernel events
>> event->owner == -1 ?)
>>
>> struct task_struct *tsk = READ_ONCE(event->owner);
>>
>> if (!tsk || is_kernel_event(event))
>>    /* skip ? */
>>
> 
> Looking at it some more, is_kernel_event() is not exposed
> outside events core and probably for good reason. Why do
> we need to check for this and not just tsk?

Because the event->owner could be :

  = NULL
  = -1UL  // kernel event
  = valid.
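
The three cases listed above can be sketched as a small classifier in
userspace C. This is a model only: the model_* names and the
MODEL_KERNEL_OWNER sentinel are hypothetical stand-ins chosen for
illustration (the sentinel mirrors the "-1UL" kernel-event case), not
the perf core's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; not the real definition. */
struct model_task { int pid; };

/* Hypothetical sentinel modelling the "-1UL // kernel event" case. */
#define MODEL_KERNEL_OWNER ((struct model_task *)-1L)

enum owner_state { OWNER_NONE, OWNER_KERNEL, OWNER_VALID };

/* Classify an owner pointer without ever dereferencing it. */
enum owner_state model_classify_owner(const struct model_task *owner)
{
	if (!owner)
		return OWNER_NONE;
	if (owner == MODEL_KERNEL_OWNER)
		return OWNER_KERNEL;
	return OWNER_VALID;
}

/*
 * Read the PID only when the owner is a real task. Returns false and
 * leaves *pid untouched for the NULL and kernel-event cases.
 */
bool model_owner_pid(const struct model_task *owner, int *pid)
{
	assert(pid != NULL);

	if (model_classify_owner(owner) != OWNER_VALID)
		return false;

	*pid = owner->pid;
	return true;
}
```

A NULL-only check would let the sentinel value through to the
dereference in model_owner_pid(), which is why both non-valid cases are
tested before the PID is read.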


Kind regards
Suzuki


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-14 13:16       ` Suzuki K Poulose
@ 2020-10-14 15:59         ` Sai Prakash Ranjan
  2020-10-20 16:10           ` Sai Prakash Ranjan
  0 siblings, 1 reply; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-14 15:59 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 2020-10-14 18:46, Suzuki K Poulose wrote:
> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>> There was a report of NULL pointer dereference in ETF enable
>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>> reproducible when the process to monitor is something very
>>>> active such as chrome and with ETF as the sink and not ETR.
>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>> owner being NULL, we get a NULL pointer dereference.
>>>> 
>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>> ETB are the only places trying to dereference the task(owner)
>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>> dereference this NULL pointer and we are safe. So lets do the
>>> 
>>> The patch is necessary to fix some of the issues. But I feel it is
>>> not complete. Why is it safe earlier and not later ? I believe we are
>>> simply reducing the chances of hitting the issue, by doing this earlier than
>>> later. I would say we better fix all instances to make sure that the
>>> event->owner is valid. (e.g, I can see that the for kernel events
>>> event->owner == -1 ?)
>>> 
>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>> 
>>> if (!tsk || is_kernel_event(event))
>>>    /* skip ? */
>>> 
>> 
>> Looking at it some more, is_kernel_event() is not exposed
>> outside events core and probably for good reason. Why do
>> we need to check for this and not just tsk?
> 
> Because the event->owner could be :
> 
>  = NULL
>  = -1UL  // kernel event
>  = valid.
> 

Yes, I understood that part, but here we were trying to
fix the NULL pointer dereference, right? Hence the question
as to why we need to check for kernel events. I am no expert
in perf, but I don't see anything in the kernel outside the
events core checking is_kernel_event(), so I am a bit
skeptical about whether exporting it is actually right.

Thanks,
Sai


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-14 15:59         ` Sai Prakash Ranjan
@ 2020-10-20 16:10           ` Sai Prakash Ranjan
  2020-10-21  7:29             ` Sai Prakash Ranjan
  0 siblings, 1 reply; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-20 16:10 UTC (permalink / raw)
  To: Suzuki K Poulose, Mathieu Poirier
  Cc: mike.leach, coresight, swboyd, linux-arm-msm, linux-kernel,
	linux-arm-kernel, denik, leo.yan, peterz

On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>> reproducible when the process to monitor is something very
>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>> 
>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>> ETB are the only places trying to dereference the task(owner)
>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>> 
>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>> not complete. Why is it safe earlier and not later ? I believe we are
>>>> simply reducing the chances of hitting the issue, by doing this earlier than
>>>> later. I would say we better fix all instances to make sure that the
>>>> event->owner is valid. (e.g, I can see that the for kernel events
>>>> event->owner == -1 ?)
>>>> 
>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>> 
>>>> if (!tsk || is_kernel_event(event))
>>>>    /* skip ? */
>>>> 
>>> 
>>> Looking at it some more, is_kernel_event() is not exposed
>>> outside events core and probably for good reason. Why do
>>> we need to check for this and not just tsk?
>> 
>> Because the event->owner could be :
>> 
>>  = NULL
>>  = -1UL  // kernel event
>>  = valid.
>> 
> 
> Yes I understood that part, but here we were trying to
> fix the NULL pointer dereference right and hence the
> question as to why we need to check for kernel events?
> I am no expert in perf but I don't see anywhere in the
> kernel checking for is_kernel_event(), so I am a bit
> skeptical if exporting that is actually right or not.
> 

I have stress tested with the original patch many times
now, i.e., without a check for event->owner and is_kernel_event(),
and didn't observe any crash. Also, on ETR, where this was already
done, no crashes have been reported to date, whereas with ETF the
issue was quickly reproducible, so I am fairly confident that this
doesn't just delay the original issue but actually fixes
it. I will run an overnight test again to confirm this.

Thanks,
Sai


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-20 16:10           ` Sai Prakash Ranjan
@ 2020-10-21  7:29             ` Sai Prakash Ranjan
  2020-10-21 10:08               ` Suzuki Poulose
  0 siblings, 1 reply; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-21  7:29 UTC (permalink / raw)
  To: Suzuki K Poulose, Mathieu Poirier
  Cc: mike.leach, coresight, swboyd, linux-arm-msm, linux-kernel,
	linux-arm-kernel, denik, leo.yan, peterz

On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>> reproducible when the process to monitor is something very
>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>> 
>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>> 
>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>> not complete. Why is it safe earlier and not later ? I believe we are
>>>>> simply reducing the chances of hitting the issue, by doing this earlier than
>>>>> later. I would say we better fix all instances to make sure that the
>>>>> event->owner is valid. (e.g, I can see that the for kernel events
>>>>> event->owner == -1 ?)
>>>>> 
>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>> 
>>>>> if (!tsk || is_kernel_event(event))
>>>>>    /* skip ? */
>>>>> 
>>>> 
>>>> Looking at it some more, is_kernel_event() is not exposed
>>>> outside events core and probably for good reason. Why do
>>>> we need to check for this and not just tsk?
>>> 
>>> Because the event->owner could be :
>>> 
>>>  = NULL
>>>  = -1UL  // kernel event
>>>  = valid.
>>> 
>> 
>> Yes I understood that part, but here we were trying to
>> fix the NULL pointer dereference right and hence the
>> question as to why we need to check for kernel events?
>> I am no expert in perf but I don't see anywhere in the
>> kernel checking for is_kernel_event(), so I am a bit
>> skeptical if exporting that is actually right or not.
>> 
> 
> I have stress tested with the original patch many times
> now, i.e., without a check for event->owner and is_kernel_event()
> and didn't observe any crash. Plus on ETR where this was already
> done, no crashes were reported till date and with ETF, the issue
> was quickly reproducible, so I am fairly confident that this
> doesn't just delay the original issue but actually fixes
> it. I will run an overnight test again to confirm this.
> 

I ran the overnight test which collected around 4G data (see below),
with the following small change to see if the two cases
(event->owner=NULL and is_kernel_event()) are triggered
with suggested changes and it didn't trigger at all.
Do we still need those additional checks?

[ perf record: Captured and wrote 4677.989 MB perf.data ]

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 989d965f3d90..123c446ce585 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -13,6 +13,13 @@
  #include "coresight-tmc.h"
  #include "coresight-etm-perf.h"

+#define TASK_TOMBSTONE ((void *)-1L)
+
+static bool is_kernel_event2(struct perf_event *event)
+{
+       return READ_ONCE(event->owner) == TASK_TOMBSTONE;
+}
+
  static int tmc_set_etf_buffer(struct coresight_device *csdev,
                               struct perf_output_handle *handle);

@@ -392,6 +399,15 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev,
  {
         int node;
         struct cs_buffers *buf;
+       struct task_struct *task = READ_ONCE(event->owner);
+
+       if (!task) {
+               pr_info("**sai in task=NULL**\n");
+               return NULL;
+       }
+
+       if (is_kernel_event2(event))
+               pr_info("**sai in is_kernel_event**\n");

         node = (event->cpu == -1) ? NUMA_NO_NODE : cpu_to_node(event->cpu);


Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-21  7:29             ` Sai Prakash Ranjan
@ 2020-10-21 10:08               ` Suzuki Poulose
  2020-10-22  8:02                 ` Sai Prakash Ranjan
  0 siblings, 1 reply; 17+ messages in thread
From: Suzuki Poulose @ 2020-10-21 10:08 UTC (permalink / raw)
  To: Sai Prakash Ranjan, Mathieu Poirier
  Cc: mike.leach, coresight, swboyd, linux-arm-msm, linux-kernel,
	linux-arm-kernel, denik, leo.yan, peterz

On 10/21/20 8:29 AM, Sai Prakash Ranjan wrote:
> On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
>> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>>> reproducible when the process to monitor is something very
>>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>>>
>>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>>>
>>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>>> simply reducing the chances of hitting the issue, by doing this
>>>>>> earlier than later. I would say we better fix all instances to make
>>>>>> sure that the event->owner is valid. (e.g., I can see that for kernel
>>>>>> events event->owner == -1 ?)
>>>>>>
>>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>>>
>>>>>> if (!tsk || is_kernel_event(event))
>>>>>>    /* skip ? */
>>>>>>
>>>>>
>>>>> Looking at it some more, is_kernel_event() is not exposed
>>>>> outside events core and probably for good reason. Why do
>>>>> we need to check for this and not just tsk?
>>>>
>>>> Because the event->owner could be :
>>>>
>>>>  = NULL
>>>>  = -1UL  // kernel event
>>>>  = valid.
>>>>
>>>
>>> Yes I understood that part, but here we were trying to
>>> fix the NULL pointer dereference right and hence the
>>> question as to why we need to check for kernel events?
>>> I am no expert in perf but I don't see anywhere in the
>>> kernel checking for is_kernel_event(), so I am a bit
>>> skeptical if exporting that is actually right or not.
>>>
>>
>> I have stress tested with the original patch many times
>> now, i.e., without a check for event->owner and is_kernel_event()
>> and didn't observe any crash. Plus on ETR where this was already
>> done, no crashes were reported till date and with ETF, the issue
>> was quickly reproducible, so I am fairly confident that this
>> doesn't just delay the original issue but actually fixes
>> it. I will run an overnight test again to confirm this.
>>
> 
> I ran the overnight test which collected around 4G data (see below),
> with the following small change to see if the two cases
> (event->owner=NULL and is_kernel_event()) are triggered
> with suggested changes and it didn't trigger at all.
> Do we still need those additional checks?
> 

Yes. Please see perf_event_create_kernel_counter(), which is
an exported function allowing any kernel code (including modules)
to use the PMU (just like the userspace perf tool would do).
Just because your use case doesn't trigger this (because
you don't run something that can trigger this) doesn't mean
this can't be triggered.
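
The three owner states discussed earlier in this thread (NULL, the
kernel-event marker, a valid task) can be modelled in userspace roughly
as below. This is only a sketch under assumptions: struct fake_event and
classify_owner() are hypothetical names, and TASK_TOMBSTONE is redefined
here with the same value as in the debug diff quoted below rather than
taken from the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define TASK_TOMBSTONE ((void *)-1L)	/* same value as in the debug diff */

struct fake_event {
	void *owner;	/* models the owner field of struct perf_event */
};

enum owner_state {
	OWNER_NULL,	/* no owner: dereferencing crashes */
	OWNER_KERNEL,	/* marker value: not a real task either */
	OWNER_VALID,	/* points at a real task */
};

/* Classify the owner pointer without ever dereferencing it. */
static enum owner_state classify_owner(const struct fake_event *event)
{
	void *owner = event->owner;

	if (owner == NULL)
		return OWNER_NULL;
	if (owner == TASK_TOMBSTONE)
		return OWNER_KERNEL;
	return OWNER_VALID;
}
```

Only the OWNER_VALID case may be passed on to something like
task_pid_nr(); a check for NULL alone lets the marker value through.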

Cheers
Suzuki


> [ perf record: Captured and wrote 4677.989 MB perf.data ]
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> index 989d965f3d90..123c446ce585 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> @@ -13,6 +13,13 @@
>   #include "coresight-tmc.h"
>   #include "coresight-etm-perf.h"
> 
> +#define TASK_TOMBSTONE ((void *)-1L)
> +
> +static bool is_kernel_event2(struct perf_event *event)
> +{
> +       return READ_ONCE(event->owner) == TASK_TOMBSTONE;
> +}
> +
>   static int tmc_set_etf_buffer(struct coresight_device *csdev,
>                                struct perf_output_handle *handle);
> 
> @@ -392,6 +399,15 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev,
>   {
>          int node;
>          struct cs_buffers *buf;
> +       struct task_struct *task = READ_ONCE(event->owner);
> +
> +       if (!task) {
> +               pr_info("**sai in task=NULL**\n");
> +               return NULL;
> +       }
> +
> +       if (is_kernel_event2(event))
> +               pr_info("**sai in is_kernel_event**\n");
> 
>          node = (event->cpu == -1) ? NUMA_NO_NODE : cpu_to_node(event->cpu);
> 
> 
> Thanks,
> Sai
> 



* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-21 10:08               ` Suzuki Poulose
@ 2020-10-22  8:02                 ` Sai Prakash Ranjan
  2020-10-22  9:27                   ` Suzuki Poulose
  0 siblings, 1 reply; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-22  8:02 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Mathieu Poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 2020-10-21 15:38, Suzuki Poulose wrote:
> On 10/21/20 8:29 AM, Sai Prakash Ranjan wrote:
>> On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
>>> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>>>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>>>> reproducible when the process to monitor is something very
>>>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>>>> 
>>>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>>>> 
>>>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>>>> simply reducing the chances of hitting the issue, by doing this
>>>>>>> earlier than later. I would say we better fix all instances to make
>>>>>>> sure that the event->owner is valid. (e.g., I can see that for kernel
>>>>>>> events event->owner == -1 ?)
>>>>>>> 
>>>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>>>> 
>>>>>>> if (!tsk || is_kernel_event(event))
>>>>>>>    /* skip ? */
>>>>>>> 
>>>>>> 
>>>>>> Looking at it some more, is_kernel_event() is not exposed
>>>>>> outside events core and probably for good reason. Why do
>>>>>> we need to check for this and not just tsk?
>>>>> 
>>>>> Because the event->owner could be :
>>>>> 
>>>>>  = NULL
>>>>>  = -1UL  // kernel event
>>>>>  = valid.
>>>>> 
>>>> 
>>>> Yes I understood that part, but here we were trying to
>>>> fix the NULL pointer dereference right and hence the
>>>> question as to why we need to check for kernel events?
>>>> I am no expert in perf but I don't see anywhere in the
>>>> kernel checking for is_kernel_event(), so I am a bit
>>>> skeptical if exporting that is actually right or not.
>>>> 
>>> 
>>> I have stress tested with the original patch many times
>>> now, i.e., without a check for event->owner and is_kernel_event()
>>> and didn't observe any crash. Plus on ETR where this was already
>>> done, no crashes were reported till date and with ETF, the issue
>>> was quickly reproducible, so I am fairly confident that this
>>> doesn't just delay the original issue but actually fixes
>>> it. I will run an overnight test again to confirm this.
>>> 
>> 
>> I ran the overnight test which collected around 4G data (see below),
>> with the following small change to see if the two cases
>> (event->owner=NULL and is_kernel_event()) are triggered
>> with suggested changes and it didn't trigger at all.
>> Do we still need those additional checks?
>> 
> 
> Yes. Please see perf_event_create_kernel_counter(), which is
> an exported function allowing any kernel code (including modules)
> to use the PMU (just like the userspace perf tool would do).
> Just because your use case doesn't trigger this (because
> you don't run something that can trigger this) doesn't mean
> this can't be triggered.
> 

Thanks for that pointer, I will add them in the next version.

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-22  8:02                 ` Sai Prakash Ranjan
@ 2020-10-22  9:27                   ` Suzuki Poulose
  2020-10-22 11:07                     ` Sai Prakash Ranjan
  0 siblings, 1 reply; 17+ messages in thread
From: Suzuki Poulose @ 2020-10-22  9:27 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Mathieu Poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 10/22/20 9:02 AM, Sai Prakash Ranjan wrote:
> On 2020-10-21 15:38, Suzuki Poulose wrote:
>> On 10/21/20 8:29 AM, Sai Prakash Ranjan wrote:
>>> On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
>>>> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>>>>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>>>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>>>>> reproducible when the process to monitor is something very
>>>>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>>>>>
>>>>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>>>>>
>>>>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>>>>> simply reducing the chances of hitting the issue, by doing this
>>>>>>>> earlier than later. I would say we better fix all instances to make
>>>>>>>> sure that the event->owner is valid. (e.g., I can see that for kernel
>>>>>>>> events event->owner == -1 ?)
>>>>>>>>
>>>>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>>>>>
>>>>>>>> if (!tsk || is_kernel_event(event))
>>>>>>>>    /* skip ? */
>>>>>>>>
>>>>>>>
>>>>>>> Looking at it some more, is_kernel_event() is not exposed
>>>>>>> outside events core and probably for good reason. Why do
>>>>>>> we need to check for this and not just tsk?
>>>>>>
>>>>>> Because the event->owner could be :
>>>>>>
>>>>>>  = NULL
>>>>>>  = -1UL  // kernel event
>>>>>>  = valid.
>>>>>>
>>>>>
>>>>> Yes I understood that part, but here we were trying to
>>>>> fix the NULL pointer dereference right and hence the
>>>>> question as to why we need to check for kernel events?
>>>>> I am no expert in perf but I don't see anywhere in the
>>>>> kernel checking for is_kernel_event(), so I am a bit
>>>>> skeptical if exporting that is actually right or not.
>>>>>
>>>>
>>>> I have stress tested with the original patch many times
>>>> now, i.e., without a check for event->owner and is_kernel_event()
>>>> and didn't observe any crash. Plus on ETR where this was already
>>>> done, no crashes were reported till date and with ETF, the issue
>>>> was quickly reproducible, so I am fairly confident that this
>>>> doesn't just delay the original issue but actually fixes
>>>> it. I will run an overnight test again to confirm this.
>>>>
>>>
>>> I ran the overnight test which collected around 4G data (see below),
>>> with the following small change to see if the two cases
>>> (event->owner=NULL and is_kernel_event()) are triggered
>>> with suggested changes and it didn't trigger at all.
>>> Do we still need those additional checks?
>>>
>>
>> Yes. Please see perf_event_create_kernel_counter(), which is
>> an exported function allowing any kernel code (including modules)
>> to use the PMU (just like the userspace perf tool would do).
>> Just because your use case doesn't trigger this (because
>> you don't run something that can trigger this) doesn't mean
>> this can't be triggered.
>>
> 
> Thanks for that pointer, I will add them in the next version.
> 

And instead of redefining TASK_TOMBSTONE in the driver, you
may simply use IS_ERR_OR_NULL(tsk) to cover both NULL case
and kernel event.
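
To see why one check covers both cases, here is a small userspace model
of the test. It is a sketch, not the kernel's code: model_is_err_or_null()
is a hypothetical name, and it assumes (as in include/linux/err.h) that
error pointers are the top MAX_ERRNO = 4095 values of the address space.
The kernel-event marker (void *)-1L is the very top value, so it lands
inside that range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel's MAX_ERRNO; redefined for this sketch. */
#define MAX_ERRNO 4095

/*
 * Hypothetical userspace model of an "error pointer or NULL" test:
 * true for NULL and for any pointer in the last MAX_ERRNO values of
 * the address space, which includes (void *)-1L.
 */
static bool model_is_err_or_null(const void *ptr)
{
	uintptr_t value = (uintptr_t)ptr;

	return ptr == NULL || value >= (uintptr_t)-MAX_ERRNO;
}
```

With this, a caller can skip the event when the check is true and only
dereference the owner otherwise, without a local TASK_TOMBSTONE define.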

Cheers
Suzuki


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-22  9:27                   ` Suzuki Poulose
@ 2020-10-22 11:07                     ` Sai Prakash Ranjan
  2020-10-22 11:14                       ` Suzuki Poulose
  0 siblings, 1 reply; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-22 11:07 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Mathieu Poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 2020-10-22 14:57, Suzuki Poulose wrote:
> On 10/22/20 9:02 AM, Sai Prakash Ranjan wrote:
>> On 2020-10-21 15:38, Suzuki Poulose wrote:
>>> On 10/21/20 8:29 AM, Sai Prakash Ranjan wrote:
>>>> On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
>>>>> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>>>>>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>>>>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>>>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>>>>>> reproducible when the process to monitor is something very
>>>>>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>>>>>> 
>>>>>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>>>>>> 
>>>>>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>>>>>> simply reducing the chances of hitting the issue, by doing this
>>>>>>>>> earlier than later. I would say we better fix all instances to make
>>>>>>>>> sure that the event->owner is valid. (e.g., I can see that for kernel
>>>>>>>>> events event->owner == -1 ?)
>>>>>>>>> 
>>>>>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>>>>>> 
>>>>>>>>> if (!tsk || is_kernel_event(event))
>>>>>>>>>    /* skip ? */
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Looking at it some more, is_kernel_event() is not exposed
>>>>>>>> outside events core and probably for good reason. Why do
>>>>>>>> we need to check for this and not just tsk?
>>>>>>> 
>>>>>>> Because the event->owner could be :
>>>>>>> 
>>>>>>>  = NULL
>>>>>>>  = -1UL  // kernel event
>>>>>>>  = valid.
>>>>>>> 
>>>>>> 
>>>>>> Yes I understood that part, but here we were trying to
>>>>>> fix the NULL pointer dereference right and hence the
>>>>>> question as to why we need to check for kernel events?
>>>>>> I am no expert in perf but I don't see anywhere in the
>>>>>> kernel checking for is_kernel_event(), so I am a bit
>>>>>> skeptical if exporting that is actually right or not.
>>>>>> 
>>>>> 
>>>>> I have stress tested with the original patch many times
>>>>> now, i.e., without a check for event->owner and is_kernel_event()
>>>>> and didn't observe any crash. Plus on ETR where this was already
>>>>> done, no crashes were reported till date and with ETF, the issue
>>>>> was quickly reproducible, so I am fairly confident that this
>>>>> doesn't just delay the original issue but actually fixes
>>>>> it. I will run an overnight test again to confirm this.
>>>>> 
>>>> 
>>>> I ran the overnight test which collected around 4G data (see below),
>>>> with the following small change to see if the two cases
>>>> (event->owner=NULL and is_kernel_event()) are triggered
>>>> with suggested changes and it didn't trigger at all.
>>>> Do we still need those additional checks?
>>>> 
>>> 
>>> Yes. Please see perf_event_create_kernel_counter(), which is
>>> an exported function allowing any kernel code (including modules)
>>> to use the PMU (just like the userspace perf tool would do).
>>> Just because your use case doesn't trigger this (because
>>> you don't run something that can trigger this) doesn't mean
>>> this can't be triggered.
>>> 
>> 
>> Thanks for that pointer, I will add them in the next version.
>> 
> 
> And instead of redefining TASK_TOMBSTONE in the driver, you
> may simply use IS_ERR_OR_NULL(tsk) to cover both NULL case
> and kernel event.
> 

Ugh sorry, sent out v2 exporting is_kernel_event() before seeing
this comment, I will resend.

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-22 11:07                     ` Sai Prakash Ranjan
@ 2020-10-22 11:14                       ` Suzuki Poulose
  2020-10-22 11:20                         ` Sai Prakash Ranjan
  0 siblings, 1 reply; 17+ messages in thread
From: Suzuki Poulose @ 2020-10-22 11:14 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Mathieu Poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 10/22/20 12:07 PM, Sai Prakash Ranjan wrote:
> On 2020-10-22 14:57, Suzuki Poulose wrote:
>> On 10/22/20 9:02 AM, Sai Prakash Ranjan wrote:
>>> On 2020-10-21 15:38, Suzuki Poulose wrote:
>>>> On 10/21/20 8:29 AM, Sai Prakash Ranjan wrote:
>>>>> On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
>>>>>> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>>>>>>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>>>>>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>>>>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>>>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>>>>>>> reproducible when the process to monitor is something very
>>>>>>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>>>>>>>
>>>>>>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>>>>>>>
>>>>>>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>>>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>>>>>>> simply reducing the chances of hitting the issue, by doing this
>>>>>>>>>> earlier than later. I would say we better fix all instances to make
>>>>>>>>>> sure that the event->owner is valid. (e.g., I can see that for kernel
>>>>>>>>>> events event->owner == -1 ?)
>>>>>>>>>>
>>>>>>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>>>>>>>
>>>>>>>>>> if (!tsk || is_kernel_event(event))
>>>>>>>>>>    /* skip ? */
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Looking at it some more, is_kernel_event() is not exposed
>>>>>>>>> outside events core and probably for good reason. Why do
>>>>>>>>> we need to check for this and not just tsk?
>>>>>>>>
>>>>>>>> Because the event->owner could be :
>>>>>>>>
>>>>>>>>  = NULL
>>>>>>>>  = -1UL  // kernel event
>>>>>>>>  = valid.
>>>>>>>>
>>>>>>>
>>>>>>> Yes I understood that part, but here we were trying to
>>>>>>> fix the NULL pointer dereference right and hence the
>>>>>>> question as to why we need to check for kernel events?
>>>>>>> I am no expert in perf but I don't see anywhere in the
>>>>>>> kernel checking for is_kernel_event(), so I am a bit
>>>>>>> skeptical if exporting that is actually right or not.
>>>>>>>
>>>>>>
>>>>>> I have stress tested with the original patch many times
>>>>>> now, i.e., without a check for event->owner and is_kernel_event()
>>>>>> and didn't observe any crash. Plus on ETR where this was already
>>>>>> done, no crashes were reported till date and with ETF, the issue
>>>>>> was quickly reproducible, so I am fairly confident that this
>>>>>> doesn't just delay the original issue but actually fixes
>>>>>> it. I will run an overnight test again to confirm this.
>>>>>>
>>>>>
>>>>> I ran the overnight test which collected around 4G data (see below),
>>>>> with the following small change to see if the two cases
>>>>> (event->owner=NULL and is_kernel_event()) are triggered
>>>>> with suggested changes and it didn't trigger at all.
>>>>> Do we still need those additional checks?
>>>>>
>>>>
>>>> Yes. Please see perf_event_create_kernel_counter(), which is
>>>> an exported function allowing any kernel code (including modules)
>>>> to use the PMU (just like the userspace perf tool would do).
>>>> Just because your use case doesn't trigger this (because
>>>> you don't run something that can trigger this) doesn't mean
>>>> this can't be triggered.
>>>>
>>>
>>> Thanks for that pointer, I will add them in the next version.
>>>
>>
>> And instead of redefining TASK_TOMBSTONE in the driver, you
>> may simply use IS_ERR_OR_NULL(tsk) to cover both NULL case
>> and kernel event.
>>
> 
> Ugh sorry, sent out v2 exporting is_kernel_event() before seeing
> this comment, I will resend.

Saw that. I would say, wait until someone complains about that. If
people are Ok with exporting it, it is fine. I guess it will be useful.
You could fall back to this approach if there is resistance.

Cheers
Suzuki


* Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
  2020-10-22 11:14                       ` Suzuki Poulose
@ 2020-10-22 11:20                         ` Sai Prakash Ranjan
  0 siblings, 0 replies; 17+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-22 11:20 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Mathieu Poirier, mike.leach, coresight, swboyd, linux-arm-msm,
	linux-kernel, linux-arm-kernel, denik, leo.yan, peterz

On 2020-10-22 16:44, Suzuki Poulose wrote:
> On 10/22/20 12:07 PM, Sai Prakash Ranjan wrote:
>> On 2020-10-22 14:57, Suzuki Poulose wrote:
>>> On 10/22/20 9:02 AM, Sai Prakash Ranjan wrote:
>>>> On 2020-10-21 15:38, Suzuki Poulose wrote:
>>>>> On 10/21/20 8:29 AM, Sai Prakash Ranjan wrote:
>>>>>> On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
>>>>>>> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>>>>>>>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>>>>>>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>>>>>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>>>>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>>>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>>>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>>>>>>>> reproducible when the process to monitor is something very
>>>>>>>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>>>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>>>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>>>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>>>>>>>>
>>>>>>>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>>>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>>>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>>>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>>>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>>>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>>>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>>>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>>>>>>>> 
>>>>>>>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>>>>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>>>>>>>> simply reducing the chances of hitting the issue, by doing this
>>>>>>>>>>> earlier than later. I would say we better fix all instances to make
>>>>>>>>>>> sure that the event->owner is valid. (e.g., I can see that for kernel
>>>>>>>>>>> events event->owner == -1 ?)
>>>>>>>>>>> 
>>>>>>>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>>>>>>>> 
>>>>>>>>>>> if (!tsk || is_kernel_event(event))
>>>>>>>>>>>    /* skip ? */
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Looking at it some more, is_kernel_event() is not exposed
>>>>>>>>>> outside events core and probably for good reason. Why do
>>>>>>>>>> we need to check for this and not just tsk?
>>>>>>>>> 
>>>>>>>>> Because the event->owner could be :
>>>>>>>>> 
>>>>>>>>>  = NULL
>>>>>>>>>  = -1UL  // kernel event
>>>>>>>>>  = valid.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Yes I understood that part, but here we were trying to
>>>>>>>> fix the NULL pointer dereference right and hence the
>>>>>>>> question as to why we need to check for kernel events?
>>>>>>>> I am no expert in perf but I don't see anywhere in the
>>>>>>>> kernel checking for is_kernel_event(), so I am a bit
>>>>>>>> skeptical if exporting that is actually right or not.
>>>>>>>> 
>>>>>>> 
>>>>>>> I have stress tested with the original patch many times
>>>>>>> now, i.e., without a check for event->owner and is_kernel_event()
>>>>>>> and didn't observe any crash. Plus on ETR where this was already
>>>>>>> done, no crashes were reported till date and with ETF, the issue
>>>>>>> was quickly reproducible, so I am fairly confident that this
>>>>>>> doesn't just delay the original issue but actually fixes
>>>>>>> it. I will run an overnight test again to confirm this.
>>>>>>> 
>>>>>> 
>>>>>> I ran the overnight test which collected around 4G data (see below),
>>>>>> with the following small change to see if the two cases
>>>>>> (event->owner=NULL and is_kernel_event()) are triggered
>>>>>> with suggested changes and it didn't trigger at all.
>>>>>> Do we still need those additional checks?
>>>>>> 
>>>>> 
>>>>> Yes. Please see perf_event_create_kernel_event(), which is
>>>>> an exported function allowing any kernel code (including modules)
>>>>> to use the PMU (just like the userspace perf tool would do).
>>>>> Just because your use case doesn't trigger this (because
>>>>> you don't run something that can trigger this) doesn't mean
>>>>> this can't be triggered.
>>>>> 
>>>> 
>>>> Thanks for that pointer, I will add them in the next version.
>>>> 
>>> 
>>> And instead of redefining TASK_TOMBSTONE in the driver, you
>>> may simply use IS_ERR_OR_NULL(tsk) to cover both NULL case
>>> and kernel event.
>>> 
>> 
>> Ugh sorry, sent out v2 exporting is_kernel_event() before seeing
>> this comment, I will resend.
> 
> Saw that. I would say, wait until someone complains about that. If
> people are Ok with exporting it, it is fine. I guess it will be useful.
> You could fall back to this approach if there is resistance.
> 

Sure, I will wait for some comments although I hurried
to tell them to ignore it :(
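
For reference, the IS_ERR_OR_NULL(tsk) suggestion quoted above can be sketched in plain userspace C. This is only an illustration under stated assumptions: is_err_or_null() mimics the kernel macro's range test (MAX_ERRNO of 4095), OWNER_TOMBSTONE stands in for the perf core's (void *)-1L kernel-event marker, and owner_is_usable() and owner_checks_selftest() are hypothetical names that appear in neither patch.

```c
/* Userspace sketch only; it mirrors the kernel's err.h test, not its headers. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that the kernel encodes in an error pointer. */
#define MAX_ERRNO 4095

/* Assumed stand-in for the marker stored in event->owner for kernel events. */
#define OWNER_TOMBSTONE ((void *)-1L)

/* True for NULL and for any address in the top MAX_ERRNO bytes. */
static bool is_err_or_null(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-(uintptr_t)MAX_ERRNO;
}

/* Hypothetical helper: true only when the owner may be dereferenced. */
static bool owner_is_usable(const void *owner)
{
	return !is_err_or_null(owner);
}

/* Covers the three owner states listed earlier in the thread. */
static void owner_checks_selftest(void)
{
	int task = 0;	/* any valid object address will do */

	assert(!owner_is_usable(NULL));			/* owner == NULL */
	assert(!owner_is_usable(OWNER_TOMBSTONE));	/* kernel event */
	assert(owner_is_usable(&task));			/* valid owner */
}
```

Because -1 falls inside the error-pointer range, a single range test rejects both the NULL owner and the kernel-event marker, which is why the driver would not need its own copy of TASK_TOMBSTONE.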

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes
  2020-10-07 13:00 [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes Sai Prakash Ranjan
  2020-10-07 13:00 ` [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf() Sai Prakash Ranjan
  2020-10-07 13:00 ` [PATCH 2/2] coresight: etb10: Fix possible NULL ptr dereference in etb_enable_perf() Sai Prakash Ranjan
@ 2020-12-29 20:15 ` patchwork-bot+linux-arm-msm
  2 siblings, 0 replies; 17+ messages in thread
From: patchwork-bot+linux-arm-msm @ 2020-12-29 20:15 UTC (permalink / raw)
  To: Sai Prakash Ranjan; +Cc: linux-arm-msm

Hello:

This series was applied to qcom/linux.git (refs/heads/for-next):

On Wed,  7 Oct 2020 18:30:23 +0530 you wrote:
> There was a report of NULL pointer dereference in ETF enable
> path for perf CS mode with PID monitoring. It is almost 100%
> reproducible when the process to monitor is something very
> active such as chrome and with ETF as the sink and not ETR.
> Currently in a bid to find the pid, the owner is dereferenced
> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
> owner being NULL, we get a NULL pointer dereference.
> 
> [...]

Here is the summary with links:
  - [1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
    https://git.kernel.org/qcom/c/868663dd5d69
  - [2/2] coresight: etb10: Fix possible NULL ptr dereference in etb_enable_perf()
    https://git.kernel.org/qcom/c/22b2beaa7f16

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-12-29 20:18 UTC | newest]

Thread overview: 17+ messages
2020-10-07 13:00 [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes Sai Prakash Ranjan
2020-10-07 13:00 ` [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf() Sai Prakash Ranjan
2020-10-13 16:35   ` Suzuki K Poulose
2020-10-14  7:50     ` Sai Prakash Ranjan
2020-10-14  9:36     ` Sai Prakash Ranjan
2020-10-14 13:16       ` Suzuki K Poulose
2020-10-14 15:59         ` Sai Prakash Ranjan
2020-10-20 16:10           ` Sai Prakash Ranjan
2020-10-21  7:29             ` Sai Prakash Ranjan
2020-10-21 10:08               ` Suzuki Poulose
2020-10-22  8:02                 ` Sai Prakash Ranjan
2020-10-22  9:27                   ` Suzuki Poulose
2020-10-22 11:07                     ` Sai Prakash Ranjan
2020-10-22 11:14                       ` Suzuki Poulose
2020-10-22 11:20                         ` Sai Prakash Ranjan
2020-10-07 13:00 ` [PATCH 2/2] coresight: etb10: Fix possible NULL ptr dereference in etb_enable_perf() Sai Prakash Ranjan
2020-12-29 20:15 ` [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes patchwork-bot+linux-arm-msm
