From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.6 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI, NICE_REPLY_A,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F31CC433DF for ; Mon, 3 Aug 2020 14:50:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5BDB92076C for ; Mon, 3 Aug 2020 14:50:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1596466252; bh=hYqtB5D8bWSgMhqH73CrhvUPEYWs3SY+kGQi2srIaeo=; h=Date:From:To:Cc:Subject:In-Reply-To:References:List-ID:From; b=flYyVwke5Oi6oZ6ktlWTcO7yVvDpnJDOUOdG3n1orMbw35ZiWlzNwLjUDsWFYgoUF q1me9lCUv6XwVjhpEMV50UtQd9N6cOEOa49Oa7yvFR4ZaS/mHTGmdj39LnKhk2d7+7 Dy3Ux9JAg8JugIhzX1lXNWk76PqErv7K2Vtbyp7o= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728178AbgHCOuv (ORCPT ); Mon, 3 Aug 2020 10:50:51 -0400 Received: from mail.kernel.org ([198.145.29.99]:42752 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726189AbgHCOuu (ORCPT ); Mon, 3 Aug 2020 10:50:50 -0400 Received: from devnote2 (NE2965lan1.rev.em-net.ne.jp [210.141.244.193]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 74A152076E; Mon, 3 Aug 2020 14:50:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1596466249; bh=hYqtB5D8bWSgMhqH73CrhvUPEYWs3SY+kGQi2srIaeo=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=q+iSnGP5ZNkXtGQxke2PTU+GyeDD/wYYeOyi+w/42hBZHwEgwWfTWTG6noTc02Lfo qcPdCNpZTW5UcOW0Xfs6iJ4erN+q3x9pHzlt8aWgfzoliUC53iWlbYEX+5MoQV20iD lKE2twHBN05pcyPdii3h1bNPN6r9NIHtyS9sFf3U= Date: Mon, 3 Aug 2020 23:50:42 +0900 From: Masami Hiramatsu To: Muchun Song , Steven Rostedt Cc: naveen.n.rao@linux.ibm.com, anil.s.keshavamurthy@intel.com, davem@davemloft.net, ast@kernel.org, daniel@iogearbox.net, kafai@fb.com, songliubraving@fb.com, yhs@fb.com, andriin@fb.com, john.fastabend@gmail.com, kpsingh@chromium.org, LKML , netdev@vger.kernel.org, bpf@vger.kernel.org, Chengming Zhou Subject: Re: [PATCH] kprobes: fix NULL pointer dereference at kprobe_ftrace_handler Message-Id: <20200803235042.6bacaf3eb53b7ab831f4edd3@kernel.org> In-Reply-To: References: <20200728064536.24405-1-songmuchun@bytedance.com> X-Mailer: Sylpheed 3.7.0 (GTK+ 2.24.32; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Mon, 3 Aug 2020 13:46:25 +0800 Muchun Song wrote: > Ping guys. Any comments or suggestions? > > On Tue, Jul 28, 2020 at 2:45 PM Muchun Song wrote: > > > > We found a case of kernel panic on our server. The stack trace is as > > follows(omit some irrelevant information): > > > > BUG: kernel NULL pointer dereference, address: 0000000000000080 > > RIP: 0010:kprobe_ftrace_handler+0x5e/0xe0 > > RSP: 0018:ffffb512c6550998 EFLAGS: 00010282 > > RAX: 0000000000000000 RBX: ffff8e9d16eea018 RCX: 0000000000000000 > > RDX: ffffffffbe1179c0 RSI: ffffffffc0535564 RDI: ffffffffc0534ec0 > > RBP: ffffffffc0534ec1 R08: ffff8e9d1bbb0f00 R09: 0000000000000004 > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > > R13: ffff8e9d1f797060 R14: 000000000000bacc R15: ffff8e9ce13eca00 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000000000080 CR3: 00000008453d0005 CR4: 00000000003606e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > > > ftrace_ops_assist_func+0x56/0xe0 > > ftrace_call+0x5/0x34 > > tcpa_statistic_send+0x5/0x130 [ttcp_engine] > > > > The tcpa_statistic_send is the function being kprobed. After analysis, > > the root cause is that the fourth parameter regs of kprobe_ftrace_handler > > is NULL. Why regs is NULL? We use the crash tool to analyze the kdump. > > > > crash> dis tcpa_statistic_send -r > > : callq 0xffffffffbd8018c0 > > > > The tcpa_statistic_send calls ftrace_caller instead of ftrace_regs_caller. > > So it is reasonable that the fourth parameter regs of kprobe_ftrace_handler > > is NULL. In theory, we should call the ftrace_regs_caller instead of the > > ftrace_caller. After in-depth analysis, we found a reproducible path. > > > > Writing a simple kernel module which starts a periodic timer. The > > timer's handler is named 'kprobe_test_timer_handler'. The module > > name is kprobe_test.ko. > > > > 1) insmod kprobe_test.ko > > 2) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}' > > 3) echo 0 > /proc/sys/kernel/ftrace_enabled > > 4) rmmod kprobe_test > > 5) stop step 2) kprobe > > 6) insmod kprobe_test.ko > > 7) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}' > > > > We mark the kprobe as GONE but not disarm the kprobe in the step 4). > > The step 5) also do not disarm the kprobe when unregister kprobe. So > > we do not remove the ip from the filter. In this case, when the module > > loads again in the step 6), we will replace the code to ftrace_caller > > via the ftrace_module_enable(). When we register kprobe again, we will > > not replace ftrace_caller to ftrace_regs_caller because the ftrace is > > disabled in the step 3). So the step 7) will trigger kernel panic. Fix > > this problem by disarming the kprobe when the module is going away. > > Nice catch! Acked-by: Masami Hiramatsu Fixes: ae6aa16fdc16 ("kprobes: introduce ftrace based optimization") Cc: stable@vger.kernel.org Thank you! > > Signed-off-by: Muchun Song > > Co-developed-by: Chengming Zhou > > Signed-off-by: Chengming Zhou > > --- > > kernel/kprobes.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c > > index 146c648eb943..503add629599 100644 > > --- a/kernel/kprobes.c > > +++ b/kernel/kprobes.c > > @@ -2148,6 +2148,13 @@ static void kill_kprobe(struct kprobe *p) > > * the original probed function (which will be freed soon) any more. > > */ > > arch_remove_kprobe(p); > > + > > + /* > > + * The module is going away. We should disarm the kprobe which > > + * is using ftrace. > > + */ > > + if (kprobe_ftrace(p)) > > + disarm_kprobe_ftrace(p); > > } > > > > /* Disable one kprobe */ > > -- > > 2.11.0 > > > > > -- > Yours, > Muchun -- Masami Hiramatsu