linux-kernel.vger.kernel.org archive mirror
From: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
To: Suzuki K Poulose <suzuki.poulose@arm.com>,
	Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: mike.leach@linaro.org, coresight@lists.linaro.org,
	swboyd@chromium.org, linux-arm-msm@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, denik@google.com,
	leo.yan@linaro.org, peterz@infradead.org
Subject: Re: [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()
Date: Wed, 21 Oct 2020 12:59:35 +0530
Message-ID: <0ee3566e50143bac5b662b2edf551b89@codeaurora.org>
In-Reply-To: <8affc09d4045812e2f5a065695b375de@codeaurora.org>

On 2020-10-20 21:40, Sai Prakash Ranjan wrote:
> On 2020-10-14 21:29, Sai Prakash Ranjan wrote:
>> On 2020-10-14 18:46, Suzuki K Poulose wrote:
>>> On 10/14/2020 10:36 AM, Sai Prakash Ranjan wrote:
>>>> On 2020-10-13 22:05, Suzuki K Poulose wrote:
>>>>> On 10/07/2020 02:00 PM, Sai Prakash Ranjan wrote:
>>>>>> There was a report of NULL pointer dereference in ETF enable
>>>>>> path for perf CS mode with PID monitoring. It is almost 100%
>>>>>> reproducible when the process to monitor is something very
>>>>>> active such as chrome and with ETF as the sink and not ETR.
>>>>>> Currently in a bid to find the pid, the owner is dereferenced
>>>>>> via task_pid_nr() call in tmc_enable_etf_sink_perf() and with
>>>>>> owner being NULL, we get a NULL pointer dereference.
>>>>>> 
>>>>>> Looking at the ETR and other places in the kernel, ETF and the
>>>>>> ETB are the only places trying to dereference the task(owner)
>>>>>> in tmc_enable_etf_sink_perf() which is also called from the
>>>>>> sched_in path as in the call trace. Owner(task) is NULL even
>>>>>> in the case of ETR in tmc_enable_etr_sink_perf(), but since we
>>>>>> cache the PID in alloc_buffer() callback and it is done as part
>>>>>> of etm_setup_aux() when allocating buffer for ETR sink, we never
>>>>>> dereference this NULL pointer and we are safe. So lets do the
>>>>> 
>>>>> The patch is necessary to fix some of the issues. But I feel it is
>>>>> not complete. Why is it safe earlier and not later? I believe we are
>>>>> simply reducing the chances of hitting the issue by doing this
>>>>> earlier rather than later. I would say we had better fix all
>>>>> instances to make sure that event->owner is valid. (e.g., I can see
>>>>> that for kernel events event->owner == -1?)
>>>>> 
>>>>> struct task_struct *tsk = READ_ONCE(event->owner);
>>>>> 
>>>>> if (!tsk || is_kernel_event(event))
>>>>>    /* skip ? */
>>>>> 
>>>> 
>>>> Looking at it some more, is_kernel_event() is not exposed
>>>> outside events core and probably for good reason. Why do
>>>> we need to check for this and not just tsk?
>>> 
>>> Because the event->owner could be :
>>> 
>>>  = NULL
>>>  = -1UL  // kernel event
>>>  = valid.
>>> 
>> 
>> Yes I understood that part, but here we were trying to
>> fix the NULL pointer dereference right and hence the
>> question as to why we need to check for kernel events?
>> I am no expert in perf but I don't see anywhere in the
>> kernel checking for is_kernel_event(), so I am a bit
>> skeptical if exporting that is actually right or not.
>> 
> 
> I have stress tested with the original patch many times
> now, i.e., without a check for event->owner and is_kernel_event()
> and didn't observe any crash. Plus on ETR where this was already
> done, no crashes were reported till date and with ETF, the issue
> was quickly reproducible, so I am fairly confident that this
> doesn't just delay the original issue but actually fixes
> it. I will run an overnight test again to confirm this.
> 

I ran the overnight test, which collected around 4G of data (see below),
with the following small change to see whether the two cases
(event->owner == NULL and is_kernel_event()) are hit with the
suggested changes, and neither was hit at all.
Do we still need those additional checks?

[ perf record: Captured and wrote 4677.989 MB perf.data ]

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 989d965f3d90..123c446ce585 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -13,6 +13,13 @@
  #include "coresight-tmc.h"
  #include "coresight-etm-perf.h"

+#define TASK_TOMBSTONE ((void *)-1L)
+
+static bool is_kernel_event2(struct perf_event *event)
+{
+       return READ_ONCE(event->owner) == TASK_TOMBSTONE;
+}
+
  static int tmc_set_etf_buffer(struct coresight_device *csdev,
                               struct perf_output_handle *handle);

@@ -392,6 +399,15 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev,
  {
         int node;
         struct cs_buffers *buf;
+       struct task_struct *task = READ_ONCE(event->owner);
+
+       if (!task) {
+               pr_info("**sai in task=NULL**\n");
+               return NULL;
+       }
+
+       if (is_kernel_event2(event))
+               pr_info("**sai in is_kernel_event**\n");

         node = (event->cpu == -1) ? NUMA_NO_NODE : cpu_to_node(event->cpu);

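For reference, the three event->owner states discussed above (NULL,
-1UL for a kernel event, or a valid task) can be sketched in user space
as below. This is only an illustration under stated assumptions:
struct fake_task, classify_owner() and owner_pid() are hypothetical
names that stand in for struct task_struct and the driver code; they
are not kernel APIs, and TASK_TOMBSTONE is simply copied from the debug
diff above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct task_struct. */
struct fake_task {
	int pid;
};

/* Same sentinel value as in the debug diff above (owner of a kernel event). */
#define TASK_TOMBSTONE ((void *)-1L)

enum owner_state {
	OWNER_NULL,	/* owner == NULL */
	OWNER_KERNEL,	/* owner == -1UL, i.e. a kernel event */
	OWNER_VALID,	/* anything else is treated as a real task pointer */
};

/* Classify an owner pointer without ever dereferencing it. */
static enum owner_state classify_owner(const struct fake_task *owner)
{
	if (!owner)
		return OWNER_NULL;
	if ((const void *)owner == TASK_TOMBSTONE)
		return OWNER_KERNEL;
	return OWNER_VALID;
}

/*
 * Store the pid in *pid and return true only when the owner is safe to
 * dereference; return false for the NULL and kernel-event cases.
 */
static bool owner_pid(const struct fake_task *owner, int *pid)
{
	if (classify_owner(owner) != OWNER_VALID)
		return false;
	*pid = owner->pid;
	return true;
}
```

The point of the sketch is only that both non-valid cases have to be
filtered before the dereference; whether either case can actually be
reached from the alloc_buffer() path is the open question in this mail.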

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

Thread overview: 16+ messages
2020-10-07 13:00 [PATCH 0/2] coresight: etf/etb: NULL Pointer dereference crash fixes Sai Prakash Ranjan
2020-10-07 13:00 ` [PATCH 1/2] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf() Sai Prakash Ranjan
2020-10-13 16:35   ` Suzuki K Poulose
2020-10-14  7:50     ` Sai Prakash Ranjan
2020-10-14  9:36     ` Sai Prakash Ranjan
2020-10-14 13:16       ` Suzuki K Poulose
2020-10-14 15:59         ` Sai Prakash Ranjan
2020-10-20 16:10           ` Sai Prakash Ranjan
2020-10-21  7:29             ` Sai Prakash Ranjan [this message]
2020-10-21 10:08               ` Suzuki Poulose
2020-10-22  8:02                 ` Sai Prakash Ranjan
2020-10-22  9:27                   ` Suzuki Poulose
2020-10-22 11:07                     ` Sai Prakash Ranjan
2020-10-22 11:14                       ` Suzuki Poulose
2020-10-22 11:20                         ` Sai Prakash Ranjan
2020-10-07 13:00 ` [PATCH 2/2] coresight: etb10: Fix possible NULL ptr dereference in etb_enable_perf() Sai Prakash Ranjan
