From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754209Ab3BPTrV (ORCPT ); Sat, 16 Feb 2013 14:47:21 -0500
Received: from mail-vb0-f47.google.com ([209.85.212.47]:41042 "EHLO
        mail-vb0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1754160Ab3BPTrU (ORCPT );
        Sat, 16 Feb 2013 14:47:20 -0500
MIME-Version: 1.0
In-Reply-To: <20130216192554.GB7035@linux.vnet.ibm.com>
References: <20130213041629.GA28622@redhat.com>
        <20130213193411.GA15928@redhat.com>
        <20130215011503.GA11914@redhat.com>
        <20130215174435.GA2792@linux.vnet.ibm.com>
        <20130216192554.GB7035@linux.vnet.ibm.com>
From: Linus Torvalds
Date: Sat, 16 Feb 2013 11:46:59 -0800
X-Google-Sender-Auth: e2u-tTlC_nr77QZTGtPjdByTCwc
Message-ID:
Subject: Re: Debugging Thinkpad T430s occasional suspend failure.
To: Paul McKenney, "H. Peter Anvin"
Cc: Frederic Weisbecker, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
        Dave Jones, Hugh Dickins, Linux Kernel Mailing List, Paul McKenney
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Feb 16, 2013 at 11:25 AM, Paul E. McKenney wrote:
>
> Sorry for the delay in testing this, but there was a need to upgrade
> my laptop, and bozo here figured "why not go to 64 bits while I am at
> it?" -- and then proceeded to learn the hard way that it is necessary
> to do "make mrproper" before doing a build in 64-bit mode.  :-/

Hmm. Our object file dependency check includes checking that the
compiler options are the same, but that's only true for normal C
files. Some of the other rules do *not* test the full range of config
options, so in general, if you change architecture etc models, you do
indeed want to make sure that you do a "make distclean" (aka "make
mrproper") or something like "git clean -dqfx".

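To make the difference between the two kinds of rule concrete, here is a
minimal Python sketch. It is not the actual kernel build logic, only an
illustration under stated assumptions: the file names (stub.S, stub.o,
.stub.o.cmd) and the "cc -m32"/"cc -m64" command strings are hypothetical.
One check trusts timestamps alone; the other also compares the command
that was recorded when the object was last built.

```python
import os
import tempfile
import time


def stale_by_timestamp(obj, sources):
    """Timestamp-only rule: rebuild if the object is missing or older than a source."""
    if not os.path.exists(obj):
        return True
    obj_mtime = os.path.getmtime(obj)
    return any(os.path.getmtime(src) > obj_mtime for src in sources)


def stale_by_command(obj, sources, command, cmd_file):
    """Stricter rule: also rebuild if the recorded build command has changed."""
    if stale_by_timestamp(obj, sources):
        return True
    if not os.path.exists(cmd_file):
        return True
    with open(cmd_file) as f:
        return f.read() != command


def demo():
    """Build a toy tree where the object is newer than its source but was
    produced by a different (hypothetical) 32-bit command."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "stub.S")        # hypothetical source file
        obj = os.path.join(d, "stub.o")        # hypothetical object file
        cmd = os.path.join(d, ".stub.o.cmd")   # hypothetical saved-command file
        for path, text in ((src, "source"), (obj, "object"),
                           (cmd, "cc -m32 -c stub.S")):
            with open(path, "w") as f:
                f.write(text)
        now = time.time()
        os.utime(src, (now - 100, now - 100))  # source is older ...
        os.utime(obj, (now, now))              # ... than the object
        new_command = "cc -m64 -c stub.S"      # the build mode has changed
        return (stale_by_timestamp(obj, [src]),
                stale_by_command(obj, [src], new_command, cmd))
```

In this toy tree the timestamp-only check trusts the stale object, while
the command-aware check asks for a rebuild. That is the failure mode
described here: an object left over from a build in a different mode looks
up to date to any rule that only compares timestamps.
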
For a number of other files, we just depend on the normal make
timestamp logic, which means that "if the object file is newer than
the sources", we'll trust it. Which obviously doesn't work for cases
where the object file may have been generated under totally different
architecture rules..

(That said, what kind of old environment did you do this in?
stub32_sigaltstack was removed during the merge window, so I'm
assuming you applied my patch on top of plain 3.7 or something?)

> The kernel build system's way of telling you this at the moment is:
>
> arch/x86/built-in.o:(.rodata+0x4990): undefined reference to `stub32_sigaltstack'

Adding Peter Anvin to the people, just in case he sees what's wrong
with the system call stub generation that keeps excessively old object
files around. If it's easy to fix, it might be worth trying to make it
ok to switch from i386 to x86-64 and back in the same tree.

Peter? Not a big deal, but if you see something obvious, let's just
try to fix it, ok?

> Anyway, with this patch, I see CPU stall warnings when running rcutorture
> as shown below.  This is not a hard failure:

Yeah, there's something wrong with the patch, I didn't bother trying
to figure it out for now. It also causes a hard failure with lockdep
(or lock proving/debugging, I'm not sure which one triggered it) - and
it happens too early to even see anything on the screen.

So I'd like to make that "downgrade from hardirq to softirq" atomic,
and I think it would clean up the crazy code too (currently it does a
*lot* of back-and-forth on the preempt flags), but I clearly missed
some case where we used a wrapper or two to add some tracepoint or a
RCU scheduling point.

And I'm not going to worry about it right now, since I'm preparing to
make v3.8 soon. But if somebody spots the bug, holler.

                 Linus