From: Bjorn Helgaas
Date: Wed, 28 Sep 2011 22:40:43 -0600
Subject: Re: [PATCH] sysfs: add per pci device msi[x] irq listing (v3)
To: Neil Horman
Cc: Matthew Wilcox, linux-kernel@vger.kernel.org, Greg Kroah-Hartman, Jesse Barnes, linux-pci@vger.kernel.org
In-Reply-To: <20110929004214.GA2241@neilslaptop.think-freely.org>
References: <1316025413-5855-1-git-send-email-nhorman@tuxdriver.com> <1316447235-31345-1-git-send-email-nhorman@tuxdriver.com> <20110922135428.GC16740@parisc-linux.org> <20110922143202.GC13359@shamino.rdu.redhat.com> <20110929004214.GA2241@neilslaptop.think-freely.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 28, 2011 at 6:42 PM, Neil Horman wrote:
>
> On Wed, Sep 28, 2011 at 04:18:55PM -0600, Bjorn Helgaas wrote:
> > On Thu, Sep 22, 2011 at 8:32 AM, Neil Horman wrote:
> > >
> > > On Thu, Sep 22, 2011 at 07:54:28AM -0600, Matthew Wilcox wrote:
> > > > On Mon, Sep 19, 2011 at 11:47:15AM -0400, Neil Horman wrote:
> > > > > So a while back, I wanted to provide a way for irqbalance (and other apps) to
> > > > > definitively map irqs to devices,
> > > > > which, for msi[x] irqs is currently not really
> > > > > possible in user space.  My first attempt went not so well:
> > > > > https://lkml.org/lkml/2011/4/21/308
> > > > >
> > > > > It was plagued by the same issues that prior attempts were, namely that it
> > > > > violated the one-file-one-value sysfs rule.  I wandered off but have recently
> > > > > come back to this.  I've got a new implementation here that exports a new
> > > > > subdirectory for every pci device, called msi_irqs.  This subdirectory contains
> > > > > a variable number of numbered subdirectories, in which the number represents an
> > > > > msi irq.  Each numbered subdirectory contains attributes for that irq, which
> > > > > currently is only the mode it is operating in (msi vs. msix).  I think it fits
> > > > > within the constraints sysfs requires, and will allow irqbalance to properly map
> > > > > msi irqs to devices without having to rely on rickety, best guess methods like
> > > > > interface name matching.
> > > >
> > > > This approach feels like building bigger rockets instead of a space
> > > > elevator :-)
> > > >
> > > In which case your comments make me think that you're trying to build the
> > > Death Star instead of buying more tie fighters :)
> > > https://docs.google.com/viewer?url=http://www.dau.mil/pubscats/ATL%20Docs/Sep-Oct11/Ward.pdf
> > >
> > > > What we need is to allow device drivers to ask for per-CPU interrupts,
> > > > and implement them in terms of MSI-X.  I've made a couple of stabs at
> > > > implementing this, but haven't got anything working yet.  It would solve
> > > Yes, IIRC you were trying to do this the first time I proposed this:
> > > https://lkml.org/lkml/2011/4/21/315
> > >
> > > > a number of problems:
> > > >
> > > That's great, I don't see how this precludes what I'm trying to do here.  All
> > > this patch does is expose a definitive relationship between msi irqs and the pci
> > > devices that allocate them.
> > > The kernel internal model used to allocate msi
> > > interrupts can change, the kobject creation and removal just has to change with
> > > it (presumably to create and destroy the msi irq kobjects when the individual
> > > irqs are allocated/freed, rather than in a batch).  I don't see why we should
> > > block enhancements to the existing msi implementation until you get the new model
> > > sorted, especially when this feature works equally well, despite the model we
> > > use internally.
> >
> > Matthew, I don't understand this issue well enough to know whether
> > Neil's patch would get in the way of your planned enhancements, or
> > whether it would be baggage we won't want to maintain forever.  As far
> > as I can tell, the patch exposes an (IRQ -> device) mapping, which
> > would still be meaningful even with per-CPU interrupts.  Can you
> > educate me?
> >
> That's my view on the subject, to which I think I commented.  Matthew's
> enhancements are perfectly reasonable, but they're orthogonal to these changes.
> Regardless of the way they're allocated (Matthew's changes), there's still an
> association between the irq and the device (my changes)
>
> > Neil, why do you propose doing this just for MSI IRQs?  I would think
> > it'd be useful information for *all* IRQs, regardless of type, and
> > that exposing the mapping for all IRQs would make it easier for tools.
> >
> Because legacy (non-msi) irqs are already ostensibly exposed via
> /proc/bus/pci/devices/.../irq.  So non-msi irqs are already covered.

But that's a different mechanism, in a different directory hierarchy.
It seems like it could be easier for user-space if all types of IRQs
were exposed uniformly in sysfs, even if we had the leftover /proc/
stuff that only covers non-MSI IRQs.

I guess one could argue that we shouldn't have non-MSI IRQs in both
places, since we can never remove the /proc stuff anyway.

Bjorn
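
For illustration only, here is a minimal user-space sketch of how a tool
such as irqbalance might consume the layout described in the quoted patch
summary: a per-device msi_irqs directory holding one numbered subdirectory
per IRQ. It is not taken from the patch. The attribute file name "mode",
the /sys/bus/pci/devices root, and the function names list_msi_irqs and
irq_to_device are all assumptions made for the example.

```python
import os


def list_msi_irqs(pci_devices_root="/sys/bus/pci/devices"):
    """Return {pci_address: {irq_number: mode}} for devices with msi_irqs.

    Assumes the layout <root>/<device>/msi_irqs/<irq>/mode, where the
    attribute file name "mode" is a guess based on the description above.
    """
    mapping = {}
    if not os.path.isdir(pci_devices_root):
        return mapping
    for dev in sorted(os.listdir(pci_devices_root)):
        msi_dir = os.path.join(pci_devices_root, dev, "msi_irqs")
        if not os.path.isdir(msi_dir):
            continue  # no MSI/MSI-X IRQs exposed for this device
        irqs = {}
        for entry in os.listdir(msi_dir):
            if not entry.isdigit():
                continue  # only numbered subdirectories name IRQs
            try:
                with open(os.path.join(msi_dir, entry, "mode")) as f:
                    irqs[int(entry)] = f.read().strip()
            except OSError:
                irqs[int(entry)] = None  # attribute missing or unreadable
        mapping[dev] = irqs
    return mapping


def irq_to_device(mapping):
    """Invert the result of list_msi_irqs() into {irq_number: pci_address}."""
    return {irq: dev for dev, irqs in mapping.items() for irq in irqs}
```

Calling irq_to_device(list_msi_irqs()) would give the (IRQ -> device)
mapping for MSI IRQs only. Legacy IRQs would still have to be read
through the separate, existing mechanism.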