From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754421AbdKJVaE (ORCPT );
        Fri, 10 Nov 2017 16:30:04 -0500
Received: from Galois.linutronix.de ([146.0.238.70]:56762 "EHLO
        Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750933AbdKJVaC (ORCPT );
        Fri, 10 Nov 2017 16:30:02 -0500
Date: Fri, 10 Nov 2017 22:29:59 +0100 (CET)
From: Thomas Gleixner
To: Linus Torvalds
cc: Fengguang Wu, Network Development, Linux Wireless List,
        Linux Kernel Mailing List
Subject: Re: [run_timer_softirq] BUG: unable to handle kernel paging request
        at 0000000000010007
In-Reply-To:
Message-ID:
References: <20171029225155.qcum5i75awrt5tzm@wfg-t540p.sh.intel.com>
        <20171029234820.nzwavupqlv2iqo3m@wfg-t540p.sh.intel.com>
        <20171109051905.pdlsyrbzrwlsjbrs@wfg-t540p.sh.intel.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-Linutronix-Spam-Score: -1.0
X-Linutronix-Spam-Level: -
X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 10 Nov 2017, Linus Torvalds wrote:
> On Wed, Nov 8, 2017 at 9:19 PM, Fengguang Wu wrote:
> >
> > Yes it's accessing the list. Here is the faddr2line output.
>
> Ok, so it's a corrupted timer list. Which is not a big surprise.
>
> It's
>
>     next->pprev = pprev;
>
> in __hlist_del(), and the trapping instruction decodes as
>
>     mov %rdx,0x8(%rax)
>
> with %rax having the value dead000000000200,
>
> Which is just LIST_POISON2.
>
> So we've deleted that entry twice - LIST_POISON2 is what hlist_del()
> sets pprev to after already deleting it once.
>
> Although in this case it might not be hlist_del(), because
> detach_timer() also sets entry->next to LIST_POISON2.
>
> Which is pretty bogus, we are supposed to use LIST_POISON1 for the
> "next" pointer. Oh well. Nobody cares, except for the list entry
> debugging code, which isn't run on the hlist cases.
>
> Adding Thomas Gleixner to the cc. It should not be possible to delete
> the same timer twice.

Right, it shouldn't.

Fengguang, can you please enable:

  CONFIG_DEBUG_OBJECTS
  CONFIG_DEBUG_OBJECTS_TIMERS

and try to reproduce? Debugobject should catch that hopefully.

Thanks,

	tglx
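
For readers following the analysis quoted above, here is a rough userspace sketch of why a second delete traps at LIST_POISON2 + 8. It is not the kernel code: the struct layout mirrors the kernel's hlist types, the poison constants are the x86-64 values implied by the register dump in the mail, and detach_like_timer() and pprev_store_addr() are hypothetical helpers that only model the behaviour described for detach_timer() (it leaves pprev untouched for simplicity). The sketch assumes a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's hlist types. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Poison values as they appear on x86-64 (assumed from the %rax value
 * dead000000000200 quoted in the mail). Never dereferenced here. */
#define LIST_POISON1 ((struct hlist_node *)0xdead000000000100ULL)
#define LIST_POISON2 ((struct hlist_node *)0xdead000000000200ULL)

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Hypothetical model of the behaviour described for detach_timer():
 * unlink the node, then poison ->next with LIST_POISON2. */
static void detach_like_timer(struct hlist_node *n)
{
    struct hlist_node *next = n->next;
    struct hlist_node **pprev = n->pprev;

    *pprev = next;
    if (next)
        next->pprev = pprev;   /* the store that traps on a second delete */
    n->next = LIST_POISON2;
}

/* Address that "next->pprev = pprev" would write to for a given next
 * pointer. Computed arithmetically; nothing is dereferenced. */
static uint64_t pprev_store_addr(const struct hlist_node *next)
{
    return (uint64_t)(uintptr_t)next + offsetof(struct hlist_node, pprev);
}

/* Walks through one delete and shows where a second one would write. */
static void demo_double_delete(void)
{
    struct hlist_head head = { NULL };
    struct hlist_node a, b;

    hlist_add_head(&b, &head);
    hlist_add_head(&a, &head);

    detach_like_timer(&a);              /* first delete: fine */
    assert(head.first == &b);
    assert(a.next == LIST_POISON2);

    /* A second delete would load next = LIST_POISON2 and store to
     * next->pprev, i.e. offset 8 from the poison value. */
    assert(pprev_store_addr(a.next) == 0xdead000000000208ULL);
}
```

Calling demo_double_delete() from a test harness exercises the single delete and checks the computed address; the second delete itself is deliberately never performed, since it would fault in the same way as the reported trap at 0x8(%rax).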