All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
To: Jiri Olsa <jolsa@kernel.org>, Masami Hiramatsu <mhiramat@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
	"bibo,mao" <bibo.mao@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	lkml <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Ziqian SUN (Zamir)" <zsun@redhat.com>
Subject: Re: [RFC] kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
Date: Thu, 09 Apr 2020 18:46:47 +0530	[thread overview]
Message-ID: <1586437540.j6vekko069.naveen@linux.ibm.com> (raw)
In-Reply-To: <20200409213806.7657ec27d1b5cbd8243505b9@kernel.org>

Hi Masami,

Masami Hiramatsu wrote:
> Hi Jiri,
> 
> On Wed,  8 Apr 2020 18:46:41 +0200
> Jiri Olsa <jolsa@kernel.org> wrote:
> 
>> hi,
>> Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
> 
> Hmm, kprobe is lockless, but kretprobe involves spinlock.
> Thus, eventually, I will blacklist the _raw_spin_lock_irqsave()
> for kretprobe.

As far as I can see, this is the only place where probing 
_raw_spin_lock_irqsave() is an issue.  Should we blacklist all users for 
this case alone?

> If you need to trace spinlock return, please consider to putting
> kprobe at "ret" instruction.
> 
>> My test was also able to trigger lockdep output:
>> 
>>  ============================================
>>  WARNING: possible recursive locking detected
>>  5.6.0-rc6+ #6 Not tainted
>>  --------------------------------------------
>>  sched-messaging/2767 is trying to acquire lock:
>>  ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0
>> 
>>  but task is already holding lock:
>>  ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
>> 
>>  other info that might help us debug this:
>>   Possible unsafe locking scenario:
>> 
>>         CPU0
>>         ----
>>    lock(&(kretprobe_table_locks[i].lock));
>>    lock(&(kretprobe_table_locks[i].lock));
>> 
>>   *** DEADLOCK ***
>> 
>>   May be due to missing lock nesting notation
>> 
>>  1 lock held by sched-messaging/2767:
>>   #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
>> 
>>  stack backtrace:
>>  CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
>>  Call Trace:
>>   dump_stack+0x96/0xe0
>>   __lock_acquire.cold.57+0x173/0x2b7
>>   ? native_queued_spin_lock_slowpath+0x42b/0x9e0
>>   ? lockdep_hardirqs_on+0x590/0x590
>>   ? __lock_acquire+0xf63/0x4030
>>   lock_acquire+0x15a/0x3d0
>>   ? kretprobe_hash_lock+0x52/0xa0
>>   _raw_spin_lock_irqsave+0x36/0x70
>>   ? kretprobe_hash_lock+0x52/0xa0
>>   kretprobe_hash_lock+0x52/0xa0
>>   trampoline_handler+0xf8/0x940
>>   ? kprobe_fault_handler+0x380/0x380
>>   ? find_held_lock+0x3a/0x1c0
>>   kretprobe_trampoline+0x25/0x50
>>   ? lock_acquired+0x392/0xbc0
>>   ? _raw_spin_lock_irqsave+0x50/0x70
>>   ? __get_valid_kprobe+0x1f0/0x1f0
>>   ? _raw_spin_unlock_irqrestore+0x3b/0x40
>>   ? finish_task_switch+0x4b9/0x6d0
>>   ? __switch_to_asm+0x34/0x70
>>   ? __switch_to_asm+0x40/0x70
>> 
>> The code within the kretprobe handler checks for probe reentrancy,
>> so we won't trigger any _raw_spin_lock_irqsave probe in there.
>> 
>> The problem is in outside kprobe_flush_task, where we call:
>> 
>>   kprobe_flush_task
>>     kretprobe_table_lock
>>       raw_spin_lock_irqsave
>>         _raw_spin_lock_irqsave
>> 
>> where _raw_spin_lock_irqsave triggers the kretprobe and installs
>> kretprobe_trampoline handler on _raw_spin_lock_irqsave return.
> 
> Hmm, OK. In this case, I think we should mark this process is
> going to die and never try to kretprobe on it.
> 
>> 
>> The kretprobe_trampoline handler is then executed with already
>> locked kretprobe_table_locks, and first thing it does is to
>> lock kretprobe_table_locks ;-) the whole lockup path like:
>> 
>>   kprobe_flush_task
>>     kretprobe_table_lock
>>       raw_spin_lock_irqsave
>>         _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed
>> 
>>         ---> kretprobe_table_locks locked
>> 
>>         kretprobe_trampoline
>>           trampoline_handler
>>             kretprobe_hash_lock(current, &head, &flags);  <--- deadlock
>> 
>> The change below sets current_kprobe in kprobe_flush_task, so the probe
>> recursion protection check is hit and the probe is never set. It seems
>> to fix the deadlock.
>> 
>> I'm not sure this is the best fix, any ideas are welcome ;-)
> 
> Hmm, this is a bit tricky to fix this issue. Of course, temporary disable
> kprobes (and kretprobe) on an area by filling current_kprobe might
> be a good idea, but it also involves other kprobes.

Not sure how you mean that. Jiri's RFC patch would be disabling 
k[ret]probes within kprobe_flush_task(), which is only ever invoked from 
finish_task_switch(). I only see calls to spin locks and kfree() from 
here. Besides, kprobe_flush_task() itself is NOKPROBE, so we would 
ideally want to not trace/probe other functions it calls.

> 
> How about let kretprobe skip the task which state == TASK_DEAD ?
> 
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 627fc1b7011a..3f207d2e0afb 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1874,9 +1874,12 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
>  	 * To avoid deadlocks, prohibit return probing in NMI contexts,
>  	 * just skip the probe and increase the (inexact) 'nmissed'
>  	 * statistical counter, so that the user is informed that
> -	 * something happened:
> +	 * something happened.
> +	 * Also, if the current task is dead, we will already in the process
> +	 * to reclaim kretprobe instances from hash list. To avoid memory
> +	 * leak, skip to run the kretprobe on such task.
>  	 */
> -	if (unlikely(in_nmi())) {
> +	if (unlikely(in_nmi()) || current->state == TASK_DEAD) {

I'm wondering if this actually works. kprobe_flush_task() seems to be 
called from finish_task_switch(), after the task switch is complete. So, 
current task won't actually be dead here.


- Naveen


  parent reply	other threads:[~2020-04-09 13:17 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-08 16:46 [RFC] kretprobe: Prevent triggering kretprobe from within kprobe_flush_task Jiri Olsa
2020-04-09  9:02 ` Naveen N. Rao
2020-04-09 18:43   ` Jiri Olsa
2020-04-09 12:38 ` Masami Hiramatsu
2020-04-09 12:52   ` Jiri Olsa
2020-04-09 14:16     ` Masami Hiramatsu
2020-04-09 13:16   ` Naveen N. Rao [this message]
2020-04-09 14:26     ` Masami Hiramatsu
2020-04-09 14:41 ` Masami Hiramatsu
2020-04-09 18:44   ` Jiri Olsa
2020-04-09 20:13     ` Jiri Olsa
2020-04-10  0:31       ` Masami Hiramatsu
2020-04-14 16:03         ` Jiri Olsa
2020-04-15  9:05           ` [PATCH] " Jiri Olsa
2020-04-16  1:55             ` Masami Hiramatsu
2020-04-16  9:13               ` Jiri Olsa
2020-04-16 13:42                 ` Masami Hiramatsu
2020-04-16 14:31                   ` [PATCHv2] " Jiri Olsa
2020-04-17  7:38                     ` Masami Hiramatsu
2020-04-28 21:36                       ` Jiri Olsa
2020-05-01  2:01                         ` Masami Hiramatsu
2020-05-07 10:15                           ` Jiri Olsa
2020-04-10  1:31       ` [RFC] " Ziqian SUN (Zamir)
2020-04-14 16:03         ` Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1586437540.j6vekko069.naveen@linux.ibm.com \
    --to=naveen.n.rao@linux.ibm.com \
    --cc=anil.s.keshavamurthy@intel.com \
    --cc=bibo.mao@intel.com \
    --cc=davem@davemloft.net \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=peterz@infradead.org \
    --cc=zsun@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.