From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932084Ab1JZBsM (ORCPT );
	Tue, 25 Oct 2011 21:48:12 -0400
Received: from mail-wy0-f174.google.com ([74.125.82.174]:53925 "EHLO
	mail-wy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754297Ab1JZBsK (ORCPT );
	Tue, 25 Oct 2011 21:48:10 -0400
Date: Wed, 26 Oct 2011 09:47:58 +0800
From: Yong Zhang
To: Simon Kirby
Cc: Linus Torvalds, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Linux Kernel Mailing List, Dave Jones, Martin Schwidefsky,
	David Miller
Subject: Re: Linux 3.1-rc9
Message-ID: <20111026014758.GA7195@zhy>
Reply-To: Yong Zhang
References: <1318874090.4172.84.camel@twins>
	<1318879396.4172.92.camel@twins>
	<1318928713.21167.4.camel@twins>
	<20111018182046.GF1309@hostway.ca>
	<20111025152631.GA17008@hostway.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20111025152631.GA17008@hostway.ca>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 25, 2011 at 08:26:31AM -0700, Simon Kirby wrote:
> On Tue, Oct 18, 2011 at 01:12:41PM -0700, Linus Torvalds wrote:
>
> > On Tue, Oct 18, 2011 at 12:48 PM, Thomas Gleixner wrote:
> > >
> > > It does not look related.
> >
> > Yeah, the only lock held there seems to be the socket lock, and it
> > looks like all CPU's are spinning on it.
> >
> > > Could you try to reproduce that problem with
> > > lockdep enabled? lockdep might make it go away, but it's definitely
> > > worth a try.
> >
> > And DEBUG_SPINLOCK / DEBUG_SPINLOCK_SLEEP too. Maybe you're triggering
> > some odd networking thing. It sounds unlikely, but maybe some error
> > case you get into doesn't release the socket lock.
> >
> > I think PROVE_LOCKING already enables DEBUG_SPINLOCK, but the sleeping
> > lock thing is separate, iirc.
>
> I think the config option you were trying to think of is
> CONFIG_DEBUG_ATOMIC_SLEEP, which enables CONFIG_PREEMPT_COUNT.
>
> By the way, we got this WARN_ON_ONCE while running lockdep elsewhere:
>
> 	/*
> 	 * We can walk the hash lockfree, because the hash only
> 	 * grows, and we are careful when adding entries to the end:
> 	 */
> 	list_for_each_entry(class, hash_head, hash_entry) {
> 		if (class->key == key) {
> 			WARN_ON_ONCE(class->name != lock->name);

Someone has hit this before, maybe you can try the patch in:
http://marc.info/?l=linux-kernel&m=131919035525533

Thanks,
Yong

> 			return class;
> 		}
> 	}
>
> [19274.691090] ------------[ cut here ]------------
> [19274.691107] WARNING: at kernel/lockdep.c:690 __lock_acquire+0xfd6/0x2180()
> [19274.691112] Hardware name: PowerEdge 2950
> [19274.691115] Modules linked in: drbd lru_cache cn ipmi_devintf ipmi_si ipmi_msghandler sata_sil24 bnx2
> [19274.691137] Pid: 4416, comm: heartbeat Not tainted 3.1.0-hw-lockdep+ #52
> [19274.691141] Call Trace:
> [19274.691149] [] ? __lock_acquire+0xfd6/0x2180
> [19274.691156] [] warn_slowpath_common+0x80/0xc0
> [19274.691163] [] warn_slowpath_null+0x15/0x20
> [19274.691169] [] __lock_acquire+0xfd6/0x2180
> [19274.691175] [] ? lock_release_non_nested+0x1a9/0x340
> [19274.691181] [] lock_acquire+0x109/0x140
> [19274.691185] [] ? double_rq_lock+0x52/0x80
> [19274.691191] [] ? __delay+0xa/0x10
> [19274.691197] [] _raw_spin_lock_nested+0x3a/0x50
> [19274.691201] [] ? double_rq_lock+0x52/0x80
> [19274.691205] [] double_rq_lock+0x52/0x80
> [19274.691210] [] load_balance+0x897/0x16e0
> [19274.691215] [] ? load_balance+0x8c9/0x16e0
> [19274.691219] [] ? update_shares+0xd2/0x150
> [19274.691226] [] ? __schedule+0x842/0xa20
> [19274.691232] [] __schedule+0x8d8/0xa20
> [19274.691238] [] ? __schedule+0x842/0xa20
> [19274.691243] [] ? local_bh_enable+0xa7/0x110
> [19274.691249] [] ? unix_stream_recvmsg+0x1d8/0x7f0
> [19274.691254] [] ? dev_queue_xmit+0x1a8/0x8a0
> [19274.691258] [] schedule+0x3a/0x60
> [19274.691265] [] schedule_hrtimeout_range_clock+0x105/0x120
> [19274.691270] [] ? trace_hardirqs_on+0xd/0x10
> [19274.691276] [] ? add_wait_queue+0x49/0x60
> [19274.691282] [] schedule_hrtimeout_range+0xe/0x10
> [19274.691291] [] poll_schedule_timeout+0x44/0x70
> [19274.691297] [] do_sys_poll+0x33c/0x4f0
> [19274.691303] [] ? poll_freewait+0xc0/0xc0
> [19274.691309] [] ? __pollwait+0x100/0x100
> [19274.691317] [] ? sock_update_classid+0xfd/0x140
> [19274.691323] [] ? sock_update_classid+0x70/0x140
> [19274.691330] [] ? sock_recvmsg+0xf7/0x130
> [19274.691336] [] ? __lock_acquire+0x490/0x2180
> [19274.691343] [] ? might_fault+0x4e/0xa0
> [19274.691351] [] ? sched_clock+0x9/0x10
> [19274.691356] [] ? trace_hardirqs_off+0xd/0x10
> [19274.691363] [] ? sys_recvfrom+0xbb/0x120
> [19274.691370] [] ? process_cpu_clock_getres+0x10/0x10
> [19274.691376] [] ? might_fault+0x4e/0xa0
> [19274.691383] [] ? might_fault+0x4e/0xa0
> [19274.691390] [] ? sysret_check+0x2e/0x69
> [19274.691396] [] sys_poll+0x77/0x110
> [19274.691402] [] system_call_fastpath+0x16/0x1b
> [19274.691407] ---[ end trace 74fbaae9066aadcc ]---
>
> Simon-
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Only stand for myself