From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932102AbWFXEQE (ORCPT ); Sat, 24 Jun 2006 00:16:04 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S932177AbWFXEQE (ORCPT ); Sat, 24 Jun 2006 00:16:04 -0400 Received: from dsl-202-45-110-141-static.VIC.netspace.net.au ([202.45.110.141]:27588 "EHLO firewall.reed.wattle.id.au") by vger.kernel.org with ESMTP id S932160AbWFXEQD (ORCPT ); Sat, 24 Jun 2006 00:16:03 -0400 From: Darren Reed Message-Id: <200606240247.k5O2lU3C009083@firewall.reed.wattle.id.au> Subject: Re: 2.6.11: spinlock problem In-Reply-To: To: "linux-os (Dick Johnson)" Date: Sat, 24 Jun 2006 12:47:30 +1000 (EST) CC: Darren Reed , linux-kernel@vger.kernel.org X-Mailer: ELM [version 2.4ME+ PL107a (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org [ Charset ISO-8859-1 unsupported, converting... ] > On Fri, 23 Jun 2006, Darren Reed wrote: > > > Hi, > > > > I'm seeing a spinlock held panic with a kernel stack like this: > > > > spinlock - panic, lock already held > > .. > > __do_softirq > > do_softirq > > ========= > > do_IRQ > > common_interrupt > > spinlock/spinunlock > > .. > > > > when I load up the system in testing. > > The code protected by the spinlock is quite small - counter increment. > > > > I'm using 2.6.11-1.1369_FC4 #1, installed inside of vmware, > > running as a guest on a Windows XP box. > > > > Is this > > (a) linux allowing the IRQ too early > > (b) vmware not doing something right > > (c) enivitable > > (d) somehow my fault > > (e) something else? > > > > Thanks, > > Darren > > Where's the code? Also, did you initialize the spin-lock variable > before use? locks are intialised...code runs for a minute or so before panic'ing I write my own wrappers to read/write_lock/unlock. The call stack for the panic is: panic ipf_read_enter .. do_softirq ===== do_IRQ common_interrupt ipf_rw_exit ipf_read_enter and ipf_rw_exit are being called for different locks. The panic is occuring in the spin_lock() for the counter increment. The counter incrememnt/decrement uses the same lock, regardless of the counter being used. I believe I'm hitting a race condition of sorts...I just don't know who owns it yet - vmware or linux and I cant test running linux natively at present because I only have one computer. Darren INLINE void ipf_rw_exit(rwlk) ipfrwlock_t *rwlk; { if (rwlk->ipf_isw > 0) { rwlk->ipf_isw = 0; write_unlock(&rwlk->ipf_lk); } else if (rwlk->ipf_isr > 0) { ATOMIC_DEC32(rwlk->ipf_isr); read_unlock(&rwlk->ipf_lk); } else { panic("rwlk->ipf_isw %d isr %d rwlk %p name [%s]\n", rwlk->ipf_isw, rwlk->ipf_isr, rwlk, rwlk->ipf_lname); } } INLINE void ipf_read_enter(rwlk) ipfrwlock_t *rwlk; { read_lock(&rwlk->ipf_lk); ATOMIC_INC32(rwlk->ipf_isr); } # define ATOMIC_INC32(x) MUTEX_ENTER(&ipf_rw); (x)++; \ MUTEX_EXIT(&ipf_rw) # define ATOMIC_DEC32(x) MUTEX_ENTER(&ipf_rw); (x)--; \ MUTEX_EXIT(&ipf_rw) # define MUTEX_ENTER(x) spin_lock(&(x)->ipf_lk) # define MUTEX_EXIT(x) spin_unlock(&(x)->ipf_lk)