* irqdomain API: how to set affinity of parent irq of chained irqs?
From: Marek Behún @ 2022-05-02  8:21 UTC
  To: Marc Zyngier; +Cc: linux-kernel, Thomas Gleixner, pali

Dear Marc, Thomas,

we have encountered the following problem that you can hopefully shed
some light on: What is the intended way to set affinity (and possibly
other irq attributes) of parent IRQ of chained IRQs, when using the
irqdomain API?

We are working on a driver that
- registers an irqchip and adds an irqdomain
- calls irq_set_chained_handler_and_data(parent_irq, handler)
  where handler triggers handling of child IRQs
- but since parent_irq isn't requested with request_threaded_irq(),
  it does not show up in proc/sysfs, only in debugfs
- the HW does not support setting affinity for the chained IRQs, only
  the parent (which comes from a GIC chip)

The problem is that the parent IRQ, as mentioned in the third point, does
not show up in proc/sysfs.

Is there some precedent for this?

Thank you.

Marek
* Re: irqdomain API: how to set affinity of parent irq of chained irqs?
From: Marc Zyngier @ 2022-05-02  9:31 UTC
  To: Marek Behún; +Cc: linux-kernel, Thomas Gleixner, pali

On Mon, 02 May 2022 09:21:37 +0100,
Marek Behún <kabel@kernel.org> wrote:
>
> Dear Marc, Thomas,
>
> we have encountered the following problem that you can hopefully shed
> some light on: What is the intended way to set affinity (and possibly
> other irq attributes) of parent IRQ of chained IRQs, when using the
> irqdomain API?

Simples: you can't. What sense does it make to change the affinity of
the parent interrupt, given that its fate is tied to *all* of the
other interrupts that are muxed to it?

Moving the parent interrupt breaks userspace's view of how interrupt
affinity is managed (change the affinity of one interrupt, see all the
others move the same way). Which is why we don't expose this interrupt
to userspace, as this can only lead to bad things.

Note that this has nothing to do with the irqdomain API, but
everything to do with the userspace ABI.

>
> We are working on a driver that
> - registers an irqchip and adds an irqdomain
> - calls irq_set_chained_handler_and_data(parent_irq, handler)
>   where handler triggers handling of child IRQs
> - but since parent_irq isn't requested with request_threaded_irq(),
>   it does not show up in proc/sysfs, only in debugfs
> - the HW does not support setting affinity for the chained IRQs, only
>   the parent (which comes from a GIC chip)
>
> The problem is that the parent IRQ, as mentioned in the third point, does
> not show up in proc/sysfs.
>
> Is there some precedent for this?

There were precedents of irqchips doing terrible things, such as
implementing a set_affinity() callback in the chained irqchip.
Thankfully, they have been either fixed or eradicated.

	M.

--
Without deviation from the norm, progress is not possible.
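[Editorial illustration] The coupling described in the reply above
("change the affinity of one interrupt, see all the others move the
same way") can be pictured with a toy userspace model. This is only a
sketch, not kernel code: the class, its methods and the IRQ numbers
42, 56 and 57 are hypothetical names picked for illustration.

```python
# Toy userspace model, not kernel code. Every name here is hypothetical.

class MuxedParent:
    """One parent interrupt line that fans out to several child IRQs."""

    def __init__(self, irq, cpu, children):
        self.irq = irq
        self.cpu = cpu                  # CPU the parent is currently routed to
        self.children = list(children)  # child IRQs muxed behind the parent
        self.pending = set()            # child IRQs with a pending event

    def raise_child(self, child):
        if child not in self.children:
            raise ValueError(f"IRQ {child} is not muxed behind IRQ {self.irq}")
        self.pending.add(child)

    def handle(self):
        """Chained handler: runs on the parent's CPU and dispatches there."""
        handled = [(child, self.cpu) for child in sorted(self.pending)]
        self.pending.clear()
        return handled

    def set_affinity(self, cpu):
        """Re-route the parent; every child follows implicitly."""
        self.cpu = cpu


def effective_cpus(parent):
    """CPU on which each child would be serviced right now."""
    return {child: parent.cpu for child in parent.children}


parent = MuxedParent(irq=42, cpu=0, children=[56, 57])
before = effective_cpus(parent)
parent.set_affinity(1)            # meant for IRQ 56 only...
after = effective_cpus(parent)    # ...but IRQ 57 has moved as well
parent.raise_child(56)
parent.raise_child(57)
print(before, after, parent.handle())
```

In this model there is no way to move IRQ 56 without also moving
IRQ 57, which is the userspace-visible surprise the reply warns about.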
* Re: irqdomain API: how to set affinity of parent irq of chained irqs?
From: Marek Behún @ 2022-05-02 15:45 UTC
  To: Marc Zyngier; +Cc: linux-kernel, Thomas Gleixner, pali

On Mon, 02 May 2022 10:31:11 +0100
Marc Zyngier <maz@kernel.org> wrote:

> On Mon, 02 May 2022 09:21:37 +0100,
> Marek Behún <kabel@kernel.org> wrote:
> >
> > Dear Marc, Thomas,
> >
> > we have encountered the following problem that you can hopefully shed
> > some light on: What is the intended way to set affinity (and possibly
> > other irq attributes) of parent IRQ of chained IRQs, when using the
> > irqdomain API?
>
> Simples: you can't. What sense does it make to change the affinity of
> the parent interrupt, given that its fate is tied to *all* of the
> other interrupts that are muxed to it?

Dear Marc,

thank you for your answer. Still:

What about when we want to set the same affinity for all the chained
interrupts?

Example: on Armada 385 there are 4 PCIe controllers. Each controller
has one interrupt from which we trigger chained interrupts. We would
like to configure each controller to trigger its interrupt (and thus all
chained interrupts in the domain) on a different CPU core.

Moreover we would really like to do this at runtime, through sysfs,
depending on for example whether there are cards plugged in the PCIe
ports.

Maybe there should be some mechanism to change the affinity for the
whole irqdomain, or something?

Marek
* Re: irqdomain API: how to set affinity of parent irq of chained irqs?
From: Marc Zyngier @ 2022-05-03  9:32 UTC
  To: Marek Behún; +Cc: linux-kernel, Thomas Gleixner, pali

On Mon, 02 May 2022 16:45:59 +0100,
Marek Behún <kabel@kernel.org> wrote:
>
> On Mon, 02 May 2022 10:31:11 +0100
> Marc Zyngier <maz@kernel.org> wrote:
>
> > On Mon, 02 May 2022 09:21:37 +0100,
> > Marek Behún <kabel@kernel.org> wrote:
> > >
> > > Dear Marc, Thomas,
> > >
> > > we have encountered the following problem that you can hopefully shed
> > > some light on: What is the intended way to set affinity (and possibly
> > > other irq attributes) of parent IRQ of chained IRQs, when using the
> > > irqdomain API?
> >
> > Simples: you can't. What sense does it make to change the affinity of
> > the parent interrupt, given that its fate is tied to *all* of the
> > other interrupts that are muxed to it?
>
> Dear Marc,
>
> thank you for your answer. Still:
>
> What about when we want to set the same affinity for all the chained
> interrupts?
>
> Example: on Armada 385 there are 4 PCIe controllers. Each controller
> has one interrupt from which we trigger chained interrupts. We would
> like to configure each controller to trigger its interrupt (and thus all
> chained interrupts in the domain) on a different CPU core.
>
> Moreover we would really like to do this at runtime, through sysfs,
> depending on for example whether there are cards plugged in the PCIe
> ports.
>
> Maybe there should be some mechanism to change the affinity for the
> whole irqdomain, or something?

Should? Maybe. But not for an irqdomain (which really doesn't have
anything to do with interrupt affinity).

What you may want is a new sysfs interface that would allow a parent
interrupt's affinity to be changed, while also exposing to userspace all
the interrupts this affects *at the same time*. Something like:

/sys/kernel/irq/42/smp_affinity_list
/sys/kernel/irq/42/muxed_irqs/
/sys/kernel/irq/42/muxed_irqs/56 -> ../../56
/sys/kernel/irq/42/muxed_irqs/57 -> ../../57

The main issues are that:

- we don't really track the muxing information in any of the data
  structures, so you can't just walk a short list and generate this
  information. You'd need to build the topology information at
  allocation time (or fish it out at runtime, but that's likely a
  pain).

- sysfs doesn't deal with affinities at all. procfs does, but adding
  more crap there is frowned upon.

- it *must* be a new interface. You can't repurpose the existing one,
  as something like irqbalance would otherwise be massively
  confused by seeing interrupts moving around behind its back.

- conversely, you'll need to teach irqbalance how to deal with this
  new interface.

- this needs to be safe against CPU hotplug. It probably already is,
  but nobody ever tested it, given that userspace can't interact with
  these interrupts at the moment.

	M.

--
Without deviation from the norm, progress is not possible.
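[Editorial illustration] To make the proposed layout concrete, the
sketch below recreates it in a scratch directory and shows how a
userspace tool (irqbalance, say) might discover which interrupts move
together. It is an assumption-laden mock: the smp_affinity_list and
muxed_irqs entries exist only as a proposal in this thread, the helper
names are hypothetical, and the "0-1" affinity value is a placeholder.

```python
import os
import tempfile


def build_mock_tree(root, parent, children):
    """Create the proposed layout under a scratch directory (not real sysfs)."""
    for irq in [parent, *children]:
        os.makedirs(os.path.join(root, str(irq)))
    # Placeholder CPU list; a real file would be produced by the kernel.
    with open(os.path.join(root, str(parent), "smp_affinity_list"), "w") as f:
        f.write("0-1\n")
    muxed = os.path.join(root, str(parent), "muxed_irqs")
    os.mkdir(muxed)
    for irq in children:
        # Same relative target as in the proposal: muxed_irqs/56 -> ../../56
        os.symlink(os.path.join("..", "..", str(irq)),
                   os.path.join(muxed, str(irq)))


def affected_irqs(root, parent):
    """Return the IRQs that would move together with `parent`."""
    muxed = os.path.join(root, str(parent), "muxed_irqs")
    result = []
    for name in sorted(os.listdir(muxed), key=int):
        target = os.path.realpath(os.path.join(muxed, name))
        result.append(int(os.path.basename(target)))
    return result


def read_affinity(root, irq):
    """Read the (mock) CPU list of an interrupt."""
    with open(os.path.join(root, str(irq), "smp_affinity_list")) as f:
        return f.read().strip()


with tempfile.TemporaryDirectory() as tmp:
    irq_root = os.path.join(tmp, "sys", "kernel", "irq")
    os.makedirs(irq_root)
    build_mock_tree(irq_root, 42, [56, 57])
    moved_together = affected_irqs(irq_root, 42)
    parent_affinity = read_affinity(irq_root, 42)

print(moved_together, parent_affinity)
```

The point of the symlinks is that a tool can learn the blast radius of
an affinity change before making it, rather than watching interrupts
move behind its back.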
* Re: irqdomain API: how to set affinity of parent irq of chained irqs?
From: Radu Rendec @ 2023-04-06 23:56 UTC
  To: Marc Zyngier, Marek Behún
  Cc: linux-kernel, Thomas Gleixner, pali, Brian Masney

Hello Marc, Marek,

On Tue, 2022-05-03 at 10:32 +0100, Marc Zyngier wrote:
> On Mon, 02 May 2022 16:45:59 +0100,
> Marek Behún <kabel@kernel.org> wrote:
> >
> > On Mon, 02 May 2022 10:31:11 +0100
> > Marc Zyngier <maz@kernel.org> wrote:
> >
> > > On Mon, 02 May 2022 09:21:37 +0100,
> > > Marek Behún <kabel@kernel.org> wrote:
> > > >
> > > > Dear Marc, Thomas,
> > > >
> > > > we have encountered the following problem that you can hopefully shed
> > > > some light on: What is the intended way to set affinity (and possibly
> > > > other irq attributes) of parent IRQ of chained IRQs, when using the
> > > > irqdomain API?
> > >
> > > Simples: you can't. What sense does it make to change the affinity of
> > > the parent interrupt, given that its fate is tied to *all* of the
> > > other interrupts that are muxed to it?
> >
> > Dear Marc,
> >
> > thank you for your answer. Still:
> >
> > What about when we want to set the same affinity for all the chained
> > interrupts?
> >
> > Example: on Armada 385 there are 4 PCIe controllers. Each controller
> > has one interrupt from which we trigger chained interrupts. We would
> > like to configure each controller to trigger its interrupt (and thus all
> > chained interrupts in the domain) on a different CPU core.
> >
> > Moreover we would really like to do this at runtime, through sysfs,
> > depending on for example whether there are cards plugged in the PCIe
> > ports.
> >
> > Maybe there should be some mechanism to change the affinity for the
> > whole irqdomain, or something?
>
> Should? Maybe. But not for an irqdomain (which really doesn't have
> anything to do with interrupt affinity).
>
> What you may want is a new sysfs interface that would allow a parent
> interrupt's affinity to be changed, while also exposing to userspace all
> the interrupts this affects *at the same time*. Something like:
>
> /sys/kernel/irq/42/smp_affinity_list
> /sys/kernel/irq/42/muxed_irqs/
> /sys/kernel/irq/42/muxed_irqs/56 -> ../../56
> /sys/kernel/irq/42/muxed_irqs/57 -> ../../57
>
> The main issues are that:
>
> - we don't really track the muxing information in any of the data
>   structures, so you can't just walk a short list and generate this
>   information. You'd need to build the topology information at
>   allocation time (or fish it out at runtime, but that's likely a
>   pain).
>
> - sysfs doesn't deal with affinities at all. procfs does, but adding
>   more crap there is frowned upon.
>
> - it *must* be a new interface. You can't repurpose the existing one,
>   as something like irqbalance would otherwise be massively
>   confused by seeing interrupts moving around behind its back.
>
> - conversely, you'll need to teach irqbalance how to deal with this
>   new interface.
>
> - this needs to be safe against CPU hotplug. It probably already is,
>   but nobody ever tested it, given that userspace can't interact with
>   these interrupts at the moment.

Are you aware of any work being done (or having been done) in this
area? Thanks in advance!

My colleagues and I are looking into picking this up and implementing
the new sysfs interface and the related irqbalance changes, and we are
currently evaluating the level of effort. Obviously, we would like to
avoid any effort duplication.

Best regards,
Radu
* Re: irqdomain API: how to set affinity of parent irq of chained irqs?
From: Marc Zyngier @ 2023-04-07  9:18 UTC
  To: Radu Rendec
  Cc: Marek Behún, linux-kernel, Thomas Gleixner, pali, Brian Masney

Hi Radu,

On Fri, 07 Apr 2023 00:56:40 +0100,
Radu Rendec <rrendec@redhat.com> wrote:
>
> Hello Marc, Marek,
>
> On Tue, 2022-05-03 at 10:32 +0100, Marc Zyngier wrote:
> > On Mon, 02 May 2022 16:45:59 +0100,
> > Marek Behún <kabel@kernel.org> wrote:
> > >
> > > On Mon, 02 May 2022 10:31:11 +0100
> > > Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > > On Mon, 02 May 2022 09:21:37 +0100,
> > > > Marek Behún <kabel@kernel.org> wrote:
> > > > >
> > > > > Dear Marc, Thomas,
> > > > >
> > > > > we have encountered the following problem that you can hopefully shed
> > > > > some light on: What is the intended way to set affinity (and possibly
> > > > > other irq attributes) of parent IRQ of chained IRQs, when using the
> > > > > irqdomain API?
> > > >
> > > > Simples: you can't. What sense does it make to change the affinity of
> > > > the parent interrupt, given that its fate is tied to *all* of the
> > > > other interrupts that are muxed to it?
> > >
> > > Dear Marc,
> > >
> > > thank you for your answer. Still:
> > >
> > > What about when we want to set the same affinity for all the chained
> > > interrupts?
> > >
> > > Example: on Armada 385 there are 4 PCIe controllers. Each controller
> > > has one interrupt from which we trigger chained interrupts. We would
> > > like to configure each controller to trigger its interrupt (and thus all
> > > chained interrupts in the domain) on a different CPU core.
> > >
> > > Moreover we would really like to do this at runtime, through sysfs,
> > > depending on for example whether there are cards plugged in the PCIe
> > > ports.
> > >
> > > Maybe there should be some mechanism to change the affinity for the
> > > whole irqdomain, or something?
> >
> > Should? Maybe. But not for an irqdomain (which really doesn't have
> > anything to do with interrupt affinity).
> >
> > What you may want is a new sysfs interface that would allow a parent
> > interrupt's affinity to be changed, while also exposing to userspace all
> > the interrupts this affects *at the same time*. Something like:
> >
> > /sys/kernel/irq/42/smp_affinity_list
> > /sys/kernel/irq/42/muxed_irqs/
> > /sys/kernel/irq/42/muxed_irqs/56 -> ../../56
> > /sys/kernel/irq/42/muxed_irqs/57 -> ../../57
> >
> > The main issues are that:
> >
> > - we don't really track the muxing information in any of the data
> >   structures, so you can't just walk a short list and generate this
> >   information. You'd need to build the topology information at
> >   allocation time (or fish it out at runtime, but that's likely a
> >   pain).
> >
> > - sysfs doesn't deal with affinities at all. procfs does, but adding
> >   more crap there is frowned upon.
> >
> > - it *must* be a new interface. You can't repurpose the existing one,
> >   as something like irqbalance would otherwise be massively
> >   confused by seeing interrupts moving around behind its back.
> >
> > - conversely, you'll need to teach irqbalance how to deal with this
> >   new interface.
> >
> > - this needs to be safe against CPU hotplug. It probably already is,
> >   but nobody ever tested it, given that userspace can't interact with
> >   these interrupts at the moment.
>
> Are you aware of any work being done (or having been done) in this
> area? Thanks in advance!
>
> My colleagues and I are looking into picking this up and implementing
> the new sysfs interface and the related irqbalance changes, and we are
> currently evaluating the level of effort. Obviously, we would like to
> avoid any effort duplication.

I don't think anyone ever tried it (it's far easier to just moan about
it than to do anything useful). But if you want to start looking into
that, that'd be great.

One of my concerns is that allowing affinity changes for chained
interrupts may uncover issues in existing drivers, so it would have to
be an explicit buy-in for any chained irqchip. That's probably not too
hard to achieve anyway given that you'll need some new infrastructure
to track the muxed interrupts.

Hopefully this will result in something actually happening! ;-)

Cheers,

	M.

--
Without deviation from the norm, progress is not possible.
* Re: irqdomain API: how to set affinity of parent irq of chained irqs?
From: Radu Rendec @ 2023-04-20 22:22 UTC
  To: Marc Zyngier
  Cc: Marek Behún, linux-kernel, Thomas Gleixner, pali, Brian Masney

Hi Marc,

On Fri, 2023-04-07 at 10:18 +0100, Marc Zyngier wrote:
> On Fri, 07 Apr 2023 00:56:40 +0100, Radu Rendec <rrendec@redhat.com> wrote:
> > Are you aware of any work being done (or having been done) in this
> > area? Thanks in advance!
> >
> > My colleagues and I are looking into picking this up and implementing
> > the new sysfs interface and the related irqbalance changes, and we are
> > currently evaluating the level of effort. Obviously, we would like to
> > avoid any effort duplication.
>
> I don't think anyone ever tried it (it's far easier to just moan about
> it than to do anything useful). But if you want to start looking into
> that, that'd be great.

Thanks for the feedback and sorry for the late reply. It looks like I
already started. I have been working on a "sandbox" driver that
implements hierarchical/muxed interrupts and would allow me to test in
a generic environment, without requiring mux hardware or messing with
real interrupts.

But first, I would like to clarify something, just to make sure I'm on
the right track. It looks to me like with the hierarchical IRQ domain
API, there is always a 1:1 end-to-end mapping between virqs and the
hwirqs near the CPU. IOW, there is a 1:1 mapping between a given virq
and the corresponding hwirq in each IRQ domain along the chain, and
there is no other virq in-between.

I looked at many of the irqchip drivers that implement the hierarchical
API, and couldn't find a single one that does muxed IRQs. Furthermore,
the revmap in struct irq_domain is clearly a 1:1 map, so when an IRQ
vector is entered, there is no way to map multiple virqs (and run the
associated handlers). I tried it in my test driver, and if the .alloc
domain op implementation allocates the same hwirq in the parent domain
for two different (v)irqs, the revmap slot in the parent domain is
overwritten.

If my understanding is correct, muxed IRQs are not possible with the
hierarchical IRQ domain API. That means in this particular case you can
never indirectly change the affinity of a different IRQ because hwirqs
are never shared. So, this is just a matter of exposing the affinity
through the new sysfs API for every irqchip driver that opts in.

On the other hand, muxed IRQs *are* possible with the legacy API, and
drivers/irqchip/irq-imx-intmux.c is a clear example of that. However,
in this case one (or multiple) additional virq(s) exist at the mux
level, and it is the virq handler that implements the logic to invoke
the appropriate downstream (child) virq handler(s). Then the virq(s) at
the mux level and all the corresponding downstream virqs share the same
affinity setting, because they also share the same hwirq in the root
domain (which is where affinity is really implemented). And yes, in
this case the relationship between these virqs is not tracked anywhere
currently. Is this what you had in mind when you mentioned below a "new
infrastructure to track muxed interrupts"?

> One of my concerns is that allowing affinity changes for chained
> interrupts may uncover issues in existing drivers, so it would have to
> be an explicit buy-in for any chained irqchip. That's probably not too
> hard to achieve anyway given that you'll need some new infrastructure
> to track the muxed interrupts.

The first thing that comes to mind for the "explicit buy-in" is a new
function pointer in struct irq_chip to set the affinity in a mux-aware
manner. Something like irq_set_affinity_shared or _chained. I may not
see the whole picture yet but so far my thinking is that the existing
irq_set_affinity must remain unchanged in order to preserve
compatibility/behavior of the procfs interface.

> Hopefully this will result in something actually happening! ;-)

I really hope so. I am also excited to have the opportunity to work on
this. I will likely need your guidance along the way but I think it's
better to talk in advance than submit a huge patch series that makes no
sense :)

Thanks,
Radu
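[Editorial illustration] The revmap observation in the message above
(a second virq allocated on the same parent hwirq overwrites the
first) can be mimicked with a plain dictionary. This is a toy
userspace model under stated assumptions, not the kernel's struct
irq_domain: the class, its methods and the hwirq number 7 are
hypothetical.

```python
# Toy model of a 1:1 reverse map (hwirq -> virq). Not kernel code;
# every name below is hypothetical.

class ToyDomain:
    """Stand-in for an IRQ domain whose revmap holds one virq per hwirq."""

    def __init__(self, name):
        self.name = name
        self.revmap = {}

    def associate(self, virq, hwirq):
        """Record the mapping; return the virq that was displaced, if any."""
        displaced = self.revmap.get(hwirq)
        self.revmap[hwirq] = virq
        return displaced if displaced != virq else None

    def resolve(self, hwirq):
        """What the entry path can find for a hwirq: at most one virq."""
        return self.revmap.get(hwirq)


parent_domain = ToyDomain("parent")
first = parent_domain.associate(virq=56, hwirq=7)
second = parent_domain.associate(virq=57, hwirq=7)  # same parent hwirq again
print(first, second, parent_domain.resolve(7))
```

With a single slot per hwirq, only the last association survives, so
in this model the fan-out to several children has to live in a chained
handler at the mux level rather than in the map itself.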