From: Masami Hiramatsu <mhiramat@kernel.org>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
	"Naveen N. Rao" <naveen.n.rao@linux.ibm.com>,
	Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	Peter Zijlstra <peterz@infradead.org>,
	lkml <linux-kernel@vger.kernel.org>,
	"bibo,mao" <bibo.mao@intel.com>,
	"Ziqian SUN (Zamir)" <zsun@redhat.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
Date: Thu, 16 Apr 2020 10:55:06 +0900
Message-ID: <20200416105506.904b7847a1b621b75463076d@kernel.org>
In-Reply-To: <20200415090507.GG208694@krava>

On Wed, 15 Apr 2020 11:05:07 +0200
Jiri Olsa <jolsa@redhat.com> wrote:

> Ziqian reported a lockup when adding a kretprobe on _raw_spin_lock_irqsave.
> My test was also able to trigger lockdep output:
> 
>  ============================================
>  WARNING: possible recursive locking detected
>  5.6.0-rc6+ #6 Not tainted
>  --------------------------------------------
>  sched-messaging/2767 is trying to acquire lock:
>  ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0
> 
>  but task is already holding lock:
>  ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
> 
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
> 
>         CPU0
>         ----
>    lock(&(kretprobe_table_locks[i].lock));
>    lock(&(kretprobe_table_locks[i].lock));
> 
>   *** DEADLOCK ***
> 
>   May be due to missing lock nesting notation
> 
>  1 lock held by sched-messaging/2767:
>   #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
> 
>  stack backtrace:
>  CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
>  Call Trace:
>   dump_stack+0x96/0xe0
>   __lock_acquire.cold.57+0x173/0x2b7
>   ? native_queued_spin_lock_slowpath+0x42b/0x9e0
>   ? lockdep_hardirqs_on+0x590/0x590
>   ? __lock_acquire+0xf63/0x4030
>   lock_acquire+0x15a/0x3d0
>   ? kretprobe_hash_lock+0x52/0xa0
>   _raw_spin_lock_irqsave+0x36/0x70
>   ? kretprobe_hash_lock+0x52/0xa0
>   kretprobe_hash_lock+0x52/0xa0
>   trampoline_handler+0xf8/0x940
>   ? kprobe_fault_handler+0x380/0x380
>   ? find_held_lock+0x3a/0x1c0
>   kretprobe_trampoline+0x25/0x50
>   ? lock_acquired+0x392/0xbc0
>   ? _raw_spin_lock_irqsave+0x50/0x70
>   ? __get_valid_kprobe+0x1f0/0x1f0
>   ? _raw_spin_unlock_irqrestore+0x3b/0x40
>   ? finish_task_switch+0x4b9/0x6d0
>   ? __switch_to_asm+0x34/0x70
>   ? __switch_to_asm+0x40/0x70
> 
> The code within the kretprobe handler checks for probe reentrancy,
> so we won't trigger any _raw_spin_lock_irqsave probe in there.
> 
> The problem is outside the handler, in kprobe_flush_task, where we call:
> 
>   kprobe_flush_task
>     kretprobe_table_lock
>       raw_spin_lock_irqsave
>         _raw_spin_lock_irqsave
> 
> where _raw_spin_lock_irqsave triggers the kretprobe, which installs the
> kretprobe_trampoline handler on the _raw_spin_lock_irqsave return.
> 
> The kretprobe_trampoline handler is then executed with
> kretprobe_table_locks already held, and the first thing it does is
> take kretprobe_table_locks again ;-) The whole lockup path looks like:
> 
>   kprobe_flush_task
>     kretprobe_table_lock
>       raw_spin_lock_irqsave
>         _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed
> 
>         ---> kretprobe_table_locks locked
> 
>         kretprobe_trampoline
>           trampoline_handler
>             kretprobe_hash_lock(current, &head, &flags);  <--- deadlock
> 
> Add kprobe_busy_begin/end helpers that mark a code section as having
> a fake probe installed, which prevents another kprobe from triggering
> within that section.
> 
> Use these helpers in kprobe_flush_task so that the probe recursion
> protection check is hit and the kretprobe is never armed, which
> prevents the lockup above.
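
To make the mechanism described above concrete, here is a minimal
userspace sketch of the recursion guard. It is only a model under stated
assumptions, not the kernel implementation: current_probe stands in for
the per-CPU current_kprobe, busy_probe for kprobe_busy, and all function
names (busy_begin, busy_end, probed_lock, return_handler, flush_task) are
hypothetical. Preemption control and real locking are left out; the lock
is a plain flag so that the would-be deadlock can be counted instead of
hanging.

```c
/* Userspace model only: none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct probe { const char *name; };

/* Stands in for the per-CPU current_kprobe pointer. */
static struct probe *current_probe;
/* Dummy probe, playing the role of kprobe_busy in the patch. */
static struct probe busy_probe = { "busy" };
/* Hypothetical return probe registered on the lock function. */
static struct probe lock_retprobe = { "lock_retprobe" };

static bool table_locked;     /* models one kretprobe_table_locks entry */
static int handler_hits;      /* how often the return handler ran */
static int recursive_locks;   /* lock attempts made while already held */

/* Mark the section as "a probe is already running here". */
static void busy_begin(void) { current_probe = &busy_probe; }

static void busy_end(void)
{
	assert(current_probe == &busy_probe);
	current_probe = NULL;
}

/* Models trampoline_handler(): it needs the table lock itself. */
static void return_handler(void)
{
	handler_hits++;
	if (table_locked)
		recursive_locks++;  /* a real spinlock would spin forever */
}

/* Models _raw_spin_lock_irqsave() with a return probe attached. */
static void probed_lock(void)
{
	/* Recursion check: arm the probe only if no probe is active. */
	bool armed = (current_probe == NULL);

	table_locked = true;
	if (armed) {
		current_probe = &lock_retprobe;
		return_handler();   /* fires on return, lock already held */
		current_probe = NULL;
	}
}

static void unlock(void) { table_locked = false; }

/* Models kprobe_flush_task(), with or without the busy marker. */
static int flush_task(bool use_busy_marker)
{
	recursive_locks = 0;
	if (use_busy_marker)
		busy_begin();
	probed_lock();
	unlock();
	if (use_busy_marker)
		busy_end();
	return recursive_locks;
}
```

In this model flush_task(false) counts one recursive lock attempt, which
corresponds to the deadlock in the trace, while flush_task(true) counts
none because the return probe is never armed inside the marked section.
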
> 
> Reported-by: "Ziqian SUN (Zamir)" <zsun@redhat.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

Thanks Jiri and Ziqian!

Looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

BTW, this is a bug fix, so shouldn't it carry a Fixes tag?

Fixes: ef53d9c5e4da ("kprobes: improve kretprobe scalability with hashed locking")
Cc: stable@vger.kernel.org

Thank you,

> ---
>  arch/x86/kernel/kprobes/core.c | 16 +++-------------
>  include/linux/kprobes.h        |  4 ++++
>  kernel/kprobes.c               | 24 ++++++++++++++++++++++++
>  3 files changed, 31 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index 4d7022a740ab..a12adbe1559d 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -753,16 +753,11 @@ asm(
>  NOKPROBE_SYMBOL(kretprobe_trampoline);
>  STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
>  
> -static struct kprobe kretprobe_kprobe = {
> -	.addr = (void *)kretprobe_trampoline,
> -};
> -
>  /*
>   * Called from kretprobe_trampoline
>   */
>  __used __visible void *trampoline_handler(struct pt_regs *regs)
>  {
> -	struct kprobe_ctlblk *kcb;
>  	struct kretprobe_instance *ri = NULL;
>  	struct hlist_head *head, empty_rp;
>  	struct hlist_node *tmp;
> @@ -772,16 +767,12 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
>  	void *frame_pointer;
>  	bool skipped = false;
>  
> -	preempt_disable();
> -
>  	/*
>  	 * Set a dummy kprobe for avoiding kretprobe recursion.
>  	 * Since kretprobe never run in kprobe handler, kprobe must not
>  	 * be running at this point.
>  	 */
> -	kcb = get_kprobe_ctlblk();
> -	__this_cpu_write(current_kprobe, &kretprobe_kprobe);
> -	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> +	kprobe_busy_begin();
>  
>  	INIT_HLIST_HEAD(&empty_rp);
>  	kretprobe_hash_lock(current, &head, &flags);
> @@ -857,7 +848,7 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
>  			__this_cpu_write(current_kprobe, &ri->rp->kp);
>  			ri->ret_addr = correct_ret_addr;
>  			ri->rp->handler(ri, regs);
> -			__this_cpu_write(current_kprobe, &kretprobe_kprobe);
> +			__this_cpu_write(current_kprobe, &kprobe_busy);
>  		}
>  
>  		recycle_rp_inst(ri, &empty_rp);
> @@ -873,8 +864,7 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
>  
>  	kretprobe_hash_unlock(current, &flags);
>  
> -	__this_cpu_write(current_kprobe, NULL);
> -	preempt_enable();
> +	kprobe_busy_end();
>  
>  	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
>  		hlist_del(&ri->hlist);
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index 04bdaf01112c..645fd401c856 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -350,6 +350,10 @@ static inline struct kprobe_ctlblk *get_kprobe_ctlblk(void)
>  	return this_cpu_ptr(&kprobe_ctlblk);
>  }
>  
> +extern struct kprobe kprobe_busy;
> +void kprobe_busy_begin(void);
> +void kprobe_busy_end(void);
> +
>  kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset);
>  int register_kprobe(struct kprobe *p);
>  void unregister_kprobe(struct kprobe *p);
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 2625c241ac00..75bb4a8458e7 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1236,6 +1236,26 @@ __releases(hlist_lock)
>  }
>  NOKPROBE_SYMBOL(kretprobe_table_unlock);
>  
> +struct kprobe kprobe_busy = {
> +	.addr = (void *) get_kprobe,
> +};
> +
> +void kprobe_busy_begin(void)
> +{
> +	struct kprobe_ctlblk *kcb;
> +
> +	preempt_disable();
> +	__this_cpu_write(current_kprobe, &kprobe_busy);
> +	kcb = get_kprobe_ctlblk();
> +	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> +}
> +
> +void kprobe_busy_end(void)
> +{
> +	__this_cpu_write(current_kprobe, NULL);
> +	preempt_enable();
> +}
> +
>  /*
>   * This function is called from finish_task_switch when task tk becomes dead,
>   * so that we can recycle any function-return probe instances associated
> @@ -1253,6 +1273,8 @@ void kprobe_flush_task(struct task_struct *tk)
>  		/* Early boot.  kretprobe_table_locks not yet initialized. */
>  		return;
>  
> +	kprobe_busy_begin();
> +
>  	INIT_HLIST_HEAD(&empty_rp);
>  	hash = hash_ptr(tk, KPROBE_HASH_BITS);
>  	head = &kretprobe_inst_table[hash];
> @@ -1266,6 +1288,8 @@ void kprobe_flush_task(struct task_struct *tk)
>  		hlist_del(&ri->hlist);
>  		kfree(ri);
>  	}
> +
> +	kprobe_busy_end();
>  }
>  NOKPROBE_SYMBOL(kprobe_flush_task);
>  
> -- 
> 2.18.2
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>
