From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751478AbdJ1InU (ORCPT );
        Sat, 28 Oct 2017 04:43:20 -0400
Received: from mail.kernel.org ([198.145.29.99]:57854 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750803AbdJ1InQ (ORCPT );
        Sat, 28 Oct 2017 04:43:16 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D7D8D218A5
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=mhiramat@kernel.org
Date: Sat, 28 Oct 2017 17:43:10 +0900
From: Masami Hiramatsu
To: zhouchengming
Cc: Borislav Petkov , , , , , , , , , , , ,
Subject: Re: [PATCH] kprobes, x86/alternatives: use text_mutex to protect smp_alt_modules
Message-Id: <20171028174310.384d62976cc5ba4859325d3a@kernel.org>
In-Reply-To: <59F334F0.2070900@huawei.com>
References: <1509096884-22993-1-git-send-email-zhouchengming1@huawei.com>
        <20171027111527.GD1305@nazgul.tnic>
        <59F31BB5.90905@huawei.com>
        <20171027123348.GE1305@nazgul.tnic>
        <59F334F0.2070900@huawei.com>
X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.31; x86_64-redhat-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 27 Oct 2017 21:30:24 +0800
zhouchengming wrote:

> On 2017/10/27 20:33, Borislav Petkov wrote:
> > On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
> >> This is a real bug happened on one of our machines, below is the calltrace.
> >> We can see the trigger is at alternatives_text_reserved+0x20/0x80, and
> >> encounter a deleted (poisoned) list_head.
> > Looks like some out-of-tree, old kernel thing. We don't have
> > mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
> > it could be that register_jprobe() is not ready yet.
>
> Yes, it's an out-of-tree module, loaded when boot kernel. register_kprobe()
> maybe not ready yet, but the bug is not caused by it obviously.
>
> >
> > Looking at the Code, though:
> >
> >   20:   74 59                   je     0x7b
> >   22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
> >   29:   00 00
> >   2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi    <-- trapping instruction
> >   2f:   72 3a                   jb     0x6b
> >   31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
> >   35:   77 34                   ja     0x6b
> >
> > %rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
> > that looks more like smp_alt_modules is not initialized yet but I could
> > could very well be wrong because this is an old kernel. So trigger that
> > with the upstream kernel without out of tree modules.
>
> The smp_alt_modules is defined by LIST_HEAD, so it's initialized at start.
>
> A deleted list_head->next = LIST_POISON1 = 0xdead000000000000 + 0x100, then
> container_of() to get the struct smp_alt_module: -0x30 = 0xdead0000000000d0
>
> Obviously, it's a deleted list_head, and I have explained clearly how it happen in
> the patch comment.

Ah, I see. At a glance, this looks like a bug in alternatives_text_reserved().
But simply taking the smp_alt mutex in alternatives_text_reserved() would
cause an ABBA deadlock in the kprobes path. So your solution is to replace
smp_alt with text_mutex, since alternatives_text_reserved() is an
x86-specific function.

Hmm, let me see... I agree that this is a simple way to solve it, but it
also means we will have two resources protected by text_mutex.

Thank you,

-- 
Masami Hiramatsu