From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756184AbZHYUkw (ORCPT ); Tue, 25 Aug 2009 16:40:52 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1756114AbZHYUkw (ORCPT ); Tue, 25 Aug 2009 16:40:52 -0400
Received: from emroute4.ornl.gov ([160.91.86.27]:35738 "EHLO emroute4.ornl.gov" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756109AbZHYUkv (ORCPT ); Tue, 25 Aug 2009 16:40:51 -0400
Date: Tue, 25 Aug 2009 16:40:48 -0400
From: David Dillow
Subject: Re: [PATCH 2.6.30-rc4] r8169: avoid losing MSI interrupts
In-reply-to:
To: "Eric W. Biederman"
Cc: Michael Riepe, Michael Buesch, Francois Romieu, Rui Santos, Michael Büker, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Message-id: <1251232848.9607.15.camel@lap75545.ornl.gov>
MIME-version: 1.0
X-Mailer: Evolution 2.24.5 (2.24.5-2.fc10)
Content-type: text/plain
Content-transfer-encoding: 7bit
References: <200903041828.49972.m.bueker@berlin.de>
 <1242001754.4093.12.camel@obelisk.thedillows.org>
 <200905112248.44868.mb@bu3sch.de>
 <200905112310.08534.mb@bu3sch.de>
 <1242077392.3716.15.camel@lap75545.ornl.gov>
 <4A09DC3E.2080807@googlemail.com>
 <1242268709.4979.7.camel@obelisk.thedillows.org>
 <4A0C6504.8000704@googlemail.com>
 <1242328457.32579.12.camel@lap75545.ornl.gov>
 <4A0C7443.1010000@googlemail.com>
 <1243042174.3580.23.camel@obelisk.thedillows.org>
 <1250895567.23419.1.camel@obelisk.thedillows.org>
 <1250897657.23419.5.camel@obelisk.thedillows.org>
 <1250973787.3582.14.camel@obelisk.thedillows.org>
 <1251169150.4023.11.camel@obelisk.thedillows.org>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2009-08-25 at 13:22 -0700, Eric W. Biederman wrote:
> David Dillow writes:
> > I'm not real happy with the interrupt handling in the driver; it makes a
> > certain amount of sense to split the MSI vs non-MSI interrupt cases out.
> > It also means another pass through re-auditing things against the vendor
> > driver. That's more work than I'm able to commit to at the moment.
> >
> > I've not been able to reproduce it locally on my r8169d, running for ~30
> > minutes straight at full speed. I've not tried running it in UP, though.
> > Perhaps I can do that tomorrow.
> >
> > Here's a possible patch to mask the NAPI events while we're running in
> > NAPI mode. I'm not sure it is going to help, since the intr_mask was
> > 0xffff when you hit the loop guard, so I left it in for now.
>
> Interesting.
>
> If I understand this correctly the situation is that we have on the
> chip there is correct logic for a level triggered interrupt and that
> the msi logic sits on it and sends an event when the interrupt signal
> goes high, but when we acknowledge some bits but not all it does not
> send another interrupt.

Correct, we have to acknowledge all current outstanding event sources
before we get another MSI interrupt. It looks like the MSI interrupt is
triggered on the edge transition of a logical OR of all irq sources.

> Baring playing games with what version of the card has working logic
> and which does not we seem to have to simple choices (if we don't want
> to loop possibly forever).
> - Don't use the msi logic on this card.
> - Move all of the logic into rtl8169_poll and only come out of NAPI
>   mode when we have caught up with all of the interrupt work.
>
> Is that how you understand the hardware issue you are trying to work
> around?

That's how I understood the issue I was working around with the
problematic patch, but I thought I had covered both issues fairly well
without having to split the handling any further -- we ACK all existing
sources each pass through the loop, so we'll get a new interrupt on the
unmasked events, but not on ones we've masked out for NAPI until NAPI
completes and unmasks them.
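
To make the edge-on-OR behaviour concrete, here is a toy C model. It is
only a sketch, not r8169 code: the names (struct chip_model, chip_raise,
chip_ack) are made up for illustration, and it assumes the MSI message is
sent only when the OR of all pending sources goes from 0 to 1, which is
what the hardware appears to do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not r8169 code: all names here are hypothetical. */
struct chip_model {
	uint16_t pending;   /* event sources currently asserted */
	unsigned msi_count; /* MSI messages sent so far */
};

/*
 * Assumed behaviour: an MSI is sent only on the 0 -> 1 edge of the
 * logical OR of all sources, not on each new source.
 */
static void chip_raise(struct chip_model *c, uint16_t events)
{
	bool was_idle = (c->pending == 0);

	c->pending |= events;
	if (was_idle && c->pending != 0)
		c->msi_count++;
}

/* Acknowledge (clear) the given sources; clearing never sends an MSI. */
static void chip_ack(struct chip_model *c, uint16_t events)
{
	c->pending &= (uint16_t)~events;
}
```

With arbitrary bit values, raising 0x0001 and then 0x0004, acking only
0x0001 and raising it again leaves msi_count at 1, because 0x0004 kept
the OR high. Only after both bits are acked does the next event produce
a second MSI.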

I'm curious how you managed to receive a packet between us clearing all
the current sources and reading the current source list continuously for
60+ seconds -- the loop is basically

	status = get IRQ events from chip
	while (status) {
		/* process events, start NAPI if needed */
		clear current events from chip
		status = get IRQ events from chip
	}

That seems like a very small race window to consistently hit --
especially for long enough to trigger soft lockups.
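
For anyone who wants to poke at that loop outside the driver, here is a
small stand-alone C sketch of it. Everything in it is hypothetical
(struct fake_chip, read_events, clear_events, irq_loop), and the
window_arrivals counter is just a stand-in for "a new event landed
between the clear and the re-read". It is not the driver's code, and the
max_loops argument only plays the role of a loop guard.

```c
#include <stdint.h>

/* Toy stand-in for the chip; every name here is hypothetical. */
struct fake_chip {
	uint16_t status;          /* pending event bits */
	unsigned window_arrivals; /* events still to land in the race window */
};

static uint16_t read_events(struct fake_chip *c)
{
	return c->status;
}

static void clear_events(struct fake_chip *c, uint16_t events)
{
	c->status &= (uint16_t)~events;
	/* Model the race: a new event lands right after the clear. */
	if (c->window_arrivals > 0) {
		c->window_arrivals--;
		c->status |= 0x0001;
	}
}

/* Returns the number of passes made; stops at max_loops (a loop guard). */
static unsigned irq_loop(struct fake_chip *c, unsigned max_loops)
{
	unsigned passes = 0;
	uint16_t status = read_events(c);

	while (status && passes < max_loops) {
		/* process events, start NAPI if needed (omitted) */
		clear_events(c, status);
		status = read_events(c);
		passes++;
	}
	return passes;
}
```

In this model the loop runs one extra pass for every event that lands in
the window, so it only keeps spinning if the window is hit on every
single pass.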