* About irq_create_affinity_masks() for a platform device driver
@ 2020-01-22 10:09 John Garry
From: John Garry @ 2020-01-22 10:09 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang
Hi Thomas,
A question about the function in the $subject:
Would there be any issue with a SCSI platform device driver referencing
this function?
So I have a multi-queue platform device, and I want to spread interrupts
over all possible CPUs, just like we can do for PCI MSI vectors. This
topic was touched on in [0].
And, if that is ok, could we export that symbol?
Cheers,
John
[0]
https://lore.kernel.org/lkml/88d64d51-4344-e908-b55b-0583b0137ddf@huawei.com/
* Re: About irq_create_affinity_masks() for a platform device driver
From: Thomas Gleixner @ 2020-01-22 10:59 UTC (permalink / raw)
To: John Garry; +Cc: Marc Zyngier, linux-kernel, chenxiang
John,
John Garry <john.garry@huawei.com> writes:
> Would there be any issue with a SCSI platform device driver referencing
> this function?
>
> So I have a multi-queue platform device, and I want to spread interrupts
> over all possible CPUs, just like we can do for PCI MSI vectors. This
> topic was touched on in [0].
>
> And, if that is ok, could we export that symbol?
I think you will need something similar to what we have in the pci/msi
code, but that shouldn't be in your device driver. So I'd rather create
platform infrastructure for this and export that.
Thanks,
tglx
* Re: About irq_create_affinity_masks() for a platform device driver
From: John Garry @ 2020-01-22 11:27 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang
On 22/01/2020 10:59, Thomas Gleixner wrote:
Hi Thomas,
> John Garry <john.garry@huawei.com> writes:
>> Would there be any issue with a SCSI platform device driver referencing
>> this function?
>>
>> So I have a multi-queue platform device, and I want to spread interrupts
>> over all possible CPUs, just like we can do for PCI MSI vectors. This
>> topic was touched on in [0].
>>
>> And, if that is ok, could we export that symbol?
>
> I think you will need something similar to what we have in the pci/msi
> code, but that shouldn't be in your device driver. So I'd rather create
> platform infrastructure for this and export that.
>
That would seem the proper thing to do.
So I was doing this for legacy hw as a cheap and quick performance
boost, but I doubt there would be many other users of any new API in
future. Also, the effort could outweigh the reward, so I may consider
dropping the whole idea.
But I'll have a play with how the code could look now.
Cheers,
john
* Re: About irq_create_affinity_masks() for a platform device driver
From: John Garry @ 2020-01-31 14:25 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang
>> John Garry <john.garry@huawei.com> writes:
>>> Would there be any issue with a SCSI platform device driver referencing
>>> this function?
>>>
>>> So I have a multi-queue platform device, and I want to spread interrupts
>>> over all possible CPUs, just like we can do for PCI MSI vectors. This
>>> topic was touched on in [0].
>>>
>>> And, if that is ok, could we export that symbol?
>>
Hi Thomas,
>> I think you will need something similar to what we have in the pci/msi
>> code, but that shouldn't be in your device driver. So I'd rather create
>> platform infrastructure for this and export that.
>>
>
> That would seem the proper thing to do.
>
> So I was doing this for legacy hw as a cheap and quick performance
> boost, but I doubt there would be many other users of any new API in
> future. Also, the effort could outweigh the reward, so I may consider
> dropping the whole idea.
>
> But I'll have a play with how the code could look now.
So I'd figure that an API like this would be required:
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -11,6 +11,7 @@
 #define _PLATFORM_DEVICE_H_

 #include <linux/device.h>
+#include <linux/interrupt.h>

 #define PLATFORM_DEVID_NONE	(-1)
 #define PLATFORM_DEVID_AUTO	(-2)
@@ -27,6 +28,7 @@ struct platform_device {
 	u64		dma_mask;
 	u32		num_resources;
 	struct resource	*resource;
+	struct irq_affinity_desc *desc;
and in platform.c, adding:
/**
 * platform_get_irqs_affinity - get all IRQs for a device with affinity
 * @dev: platform device
 * @affd: affinity descriptor
 * @count: pointer to count of IRQs
 * @irqs: pointer holder for IRQ numbers
 *
 * Gets a full set of IRQs for a platform device.
 *
 * Return: 0 on success, negative error number on failure.
 */
int platform_get_irqs_affinity(struct platform_device *dev,
			       struct irq_affinity *affd,
			       unsigned int *count, int **irqs)
{
	int i;
	int *pirqs;

	if (ACPI_COMPANION(&dev->dev)) {
		*count = acpi_irq_get_count(ACPI_HANDLE(&dev->dev));
	} else {
		// TODO
	}

	pirqs = kzalloc(*count * sizeof(int), GFP_KERNEL);
	if (!pirqs)
		return -ENOMEM;

	dev->desc = irq_create_affinity_masks(*count, affd);
	if (!dev->desc) {
		kfree(irqs);
		return -ENOMEM;
	}

	for (i = 0; i < *count; i++) {
		pirqs[i] = platform_get_irq(dev, i);
		if (irqs[i] < 0) {
			kfree(dev->desc);
			kfree(irqs);
			return -ENOMEM;
		}
	}

	*irqs = pirqs;

	return 0;
}
EXPORT_SYMBOL_GPL(platform_get_irqs_affinity);
Here we pass the affinity descriptor and allocate all IRQs for a device.
So this is less than a half-baked solution. We only create the affinity
masks but do nothing with them, and the irq_descs generated would not
have their affinity masks set and would not be managed. Only the
platform device driver itself would access the masks, to set the irq
affinity hint, etc.
To achieve the proper result, we would somehow need to pass the per-IRQ
affinity descriptor all the way down through
platform_get_irq()->acpi_irq_get()->irq_create_fwspec_mapping()->irq_domain_alloc_irqs(),
which could involve disruptive changes in different subsystems - not
welcome, I'd say.
I could take the alternative approach of generating the interrupt
affinity masks in my LLDD instead. Since I know some of the CPU and
NUMA node properties of the device host, I could generate the masks
simply in the LLDD itself, but I would still rather use standard APIs
if possible.
So if there are any better ideas on this, then it would be good to hear
them.
Thanks,
john
* Re: About irq_create_affinity_masks() for a platform device driver
From: Thomas Gleixner @ 2020-01-31 21:41 UTC (permalink / raw)
To: John Garry; +Cc: Marc Zyngier, linux-kernel, chenxiang
John!
John Garry <john.garry@huawei.com> writes:
> So I'd figure that an API like this would be required:
>
> --- a/include/linux/platform_device.h
> +++ b/include/linux/platform_device.h
> @@ -11,6 +11,7 @@
>  #define _PLATFORM_DEVICE_H_
>
>  #include <linux/device.h>
> +#include <linux/interrupt.h>
>
>  #define PLATFORM_DEVID_NONE	(-1)
>  #define PLATFORM_DEVID_AUTO	(-2)
> @@ -27,6 +28,7 @@ struct platform_device {
>  	u64		dma_mask;
>  	u32		num_resources;
>  	struct resource	*resource;
> +	struct irq_affinity_desc *desc;
>
> and in platform.c, adding:
>
> /**
> * platform_get_irqs_affinity - get all IRQs for a device with affinity
> * @dev: platform device
> * @affd: Affinity descriptor
> * @count: pointer to count of IRQs
> * @irqs: pointer holder for IRQ numbers
> *
> * Gets a full set of IRQs for a platform device
> *
> * Return: 0 on success, negative error number on failure.
> */
> int platform_get_irqs_affinity(struct platform_device *dev,
> 			       struct irq_affinity *affd,
> 			       unsigned int *count, int **irqs)
> {
> 	int i;
> 	int *pirqs;
>
> 	if (ACPI_COMPANION(&dev->dev)) {
> 		*count = acpi_irq_get_count(ACPI_HANDLE(&dev->dev));
> 	} else {
> 		// TODO
> 	}
>
> 	pirqs = kzalloc(*count * sizeof(int), GFP_KERNEL);
> 	if (!pirqs)
> 		return -ENOMEM;
>
> 	dev->desc = irq_create_affinity_masks(*count, affd);
> 	if (!dev->desc) {
> 		kfree(irqs);
pirqs I assume and this also leaks the affinity masks and the pointer in
dev.
> 		return -ENOMEM;
> 	}
>
> 	for (i = 0; i < *count; i++) {
> 		pirqs[i] = platform_get_irq(dev, i);
> 		if (irqs[i] < 0) {
> 			kfree(dev->desc);
> 			kfree(irqs);
> 			return -ENOMEM;
That's obviously broken as well :)
> 		}
> 	}
>
> 	*irqs = pirqs;
>
> 	return 0;
> }
> EXPORT_SYMBOL_GPL(platform_get_irqs_affinity);
>
> Here we pass the affinity descriptor and allocate all IRQs for a device.
>
> So this is less than a half-baked solution. We only create the affinity
> masks but do nothing with them, and the irq_descs generated would not
> have their affinity masks set and would not be managed. Only the
> platform device driver itself would access the masks, to set the irq
> affinity hint, etc.
>
> To achieve the proper result, we would somehow need to pass the
> per-IRQ affinity descriptor all the way down through
> platform_get_irq()->acpi_irq_get()->irq_create_fwspec_mapping()->irq_domain_alloc_irqs(),
> which could involve disruptive changes in different subsystems - not
> welcome, I'd say.
>
> I could take the alternative approach of generating the interrupt
> affinity masks in my LLDD instead. Since I know some of the CPU and
> NUMA node properties of the device host, I could generate the masks
> simply in the LLDD itself, but I would still rather use standard APIs
> if possible.
>
> So if there are any better ideas on this, then it would be good to hear
> them.
I wouldn't mind exposing a function which allows you to switch the
allocated interrupts to managed. The reason we do it in one go in the
PCI code is that the irq descriptors are then automatically allocated
on the correct node. So if the node-aware allocation is not a
showstopper for this then your function would do:
...
	for (i = 0; i < count; i++) {
		pirqs[i] = platform_get_irq(dev, i);

		irq_update_affinity_desc(pirqs[i], affdescs + i);
	}

int irq_update_affinity_desc(unsigned int irq, irq_affinity_desc *affinity)
{
	unsigned long flags;
	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);

	if (!desc)
		return -EINVAL;

	if (affinity->is_managed) {
		irqd_set(&desc->irq_data, IRQD_IRQ_DISABLED);
		irqd_set(&desc->irq_data, IRQD_IRQ_MASKED);
	}
	cpumask_copy(desc->irq_common_data.affinity, affinity);
	return 0;
}
That should just work.
Thanks,
tglx
* Re: About irq_create_affinity_masks() for a platform device driver
From: John Garry @ 2020-02-03 15:00 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang
Hi Thomas,
>>
>> 	pirqs = kzalloc(*count * sizeof(int), GFP_KERNEL);
>> 	if (!pirqs)
>> 		return -ENOMEM;
>>
>> 	dev->desc = irq_create_affinity_masks(*count, affd);
>> 	if (!dev->desc) {
>> 		kfree(irqs);
>
> pirqs I assume and this also leaks the affinity masks and the pointer in
> dev.
Right
>
>> 		return -ENOMEM;
>> 	}
>>
>> 	for (i = 0; i < *count; i++) {
>> 		pirqs[i] = platform_get_irq(dev, i);
>> 		if (irqs[i] < 0) {
>> 			kfree(dev->desc);
>> 			kfree(irqs);
>> 			return -ENOMEM;
>
> That's obviously broken as well :)
Right, again
>
>> 		}
>> 	}
>>
>> 	*irqs = pirqs;
>>
>> 	return 0;
>> }
>> EXPORT_SYMBOL_GPL(platform_get_irqs_affinity);
[...]
>
> I wouldn't mind exposing a function which allows you to switch the
> allocated interrupts to managed. The reason we do it in one go in the
> PCI code is that the irq descriptors are then automatically allocated
> on the correct node. So if the node-aware allocation is not a
> showstopper
I wouldn't say so for now.
> for this then your function would do:
>
> ...
> 	for (i = 0; i < count; i++) {
> 		pirqs[i] = platform_get_irq(dev, i);
>
> 		irq_update_affinity_desc(pirqs[i], affdescs + i);
> 	}
>
> int irq_update_affinity_desc(unsigned int irq, irq_affinity_desc *affinity)
> {
> 	unsigned long flags;
> 	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
>
> 	if (!desc)
> 		return -EINVAL;
>
> 	if (affinity->is_managed) {
> 		irqd_set(&desc->irq_data, IRQD_IRQ_DISABLED);
> 		irqd_set(&desc->irq_data, IRQD_IRQ_MASKED);
Are these correct? I assume we want to follow alloc_descs() here.
> 	}
> 	cpumask_copy(desc->irq_common_data.affinity, affinity);
> 	return 0;
> }
I see. So I made a couple of changes and it did work:
int irq_update_affinity_desc(unsigned int irq,
			     struct irq_affinity_desc *affinity)
{
	unsigned long flags;
	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);

	if (!desc)
		return -EINVAL;

	if (affinity->is_managed) {
		irqd_set(&desc->irq_data, IRQD_AFFINITY_MANAGED);
		irqd_set(&desc->irq_data, IRQD_MANAGED_SHUTDOWN);
	}

	cpumask_copy(desc->irq_common_data.affinity, &affinity->mask);
	irq_put_desc_unlock(desc, flags);
	return 0;
}
And if we were to go this way, then we don't need to add the pointer in
struct platform_device to hold affinity mask descriptors as we're using
them immediately. Or even have a single function to do it all in the irq
code (create the masks and update the affinity desc).
And since we're just updating the masks, I figure we shouldn't need to
add acpi_irq_get_count(), which I invented to get the irq count (without
creating the IRQ mapping).
Thanks,
John
* Re: About irq_create_affinity_masks() for a platform device driver
From: Thomas Gleixner @ 2020-02-04 9:20 UTC (permalink / raw)
To: John Garry; +Cc: Marc Zyngier, linux-kernel, chenxiang
John,
John Garry <john.garry@huawei.com> writes:
>> I wouldn't mind exposing a function which allows you to switch the
>> allocated interrupts to managed. The reason we do it in one go in the
>> PCI code is that the irq descriptors are then automatically allocated
>> on the correct node. So if the node-aware allocation is not a
>> showstopper
>
> I wouldn't say so for now.
Good.
>> for this then your function would do:
>>
>> ...
>> 	for (i = 0; i < count; i++) {
>> 		pirqs[i] = platform_get_irq(dev, i);
>>
>> 		irq_update_affinity_desc(pirqs[i], affdescs + i);
>> 	}
>>
>> int irq_update_affinity_desc(unsigned int irq, irq_affinity_desc *affinity)
>> {
>> 	unsigned long flags;
>> 	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
>>
>> 	if (!desc)
>> 		return -EINVAL;
>>
>> 	if (affinity->is_managed) {
>> 		irqd_set(&desc->irq_data, IRQD_IRQ_DISABLED);
>> 		irqd_set(&desc->irq_data, IRQD_IRQ_MASKED);
>
> Are these correct? I assume we want to follow alloc_descs() here.
Yeah, copied the wrong chunk :)
>> 	}
>> 	cpumask_copy(desc->irq_common_data.affinity, affinity);
>> 	return 0;
>> }
>
> I see. So I made a couple of changes and it did work:
>
> int irq_update_affinity_desc(unsigned int irq,
> 			     struct irq_affinity_desc *affinity)
> {
> 	unsigned long flags;
> 	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
>
> 	if (!desc)
> 		return -EINVAL;
>
> 	if (affinity->is_managed) {
> 		irqd_set(&desc->irq_data, IRQD_AFFINITY_MANAGED);
> 		irqd_set(&desc->irq_data, IRQD_MANAGED_SHUTDOWN);
> 	}
>
> 	cpumask_copy(desc->irq_common_data.affinity, &affinity->mask);
> 	irq_put_desc_unlock(desc, flags);
> 	return 0;
> }
Looks correct.
> And if we were to go this way, then we don't need to add the pointer in
> struct platform_device to hold affinity mask descriptors as we're using
> them immediately. Or even have a single function to do it all in the irq
> code (create the masks and update the affinity desc).
>
> And since we're just updating the masks, I figure we shouldn't need to
> add acpi_irq_get_count(), which I invented to get the irq count (without
> creating the IRQ mapping).
Yes, you can create and apply the masks after setting up the interrupts.
Thanks,
tglx
* Re: About irq_create_affinity_masks() for a platform device driver
From: John Garry @ 2020-02-04 9:55 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang
Hi Thomas,
>
>> And if we were to go this way, then we don't need to add the pointer in
>> struct platform_device to hold affinity mask descriptors as we're using
>> them immediately. Or even have a single function to do it all in the irq
>> code (create the masks and update the affinity desc).
>>
>> And since we're just updating the masks, I figure we shouldn't need to
>> add acpi_irq_get_count(), which I invented to get the irq count (without
>> creating the IRQ mapping).
> Yes, you can create and apply the masks after setting up the interrupts.
>
So my original concern was that the changes here would be quite
disruptive, but that does not look to be the case.
I need a couple more things to go into the kernel before I can safely
switch to managed interrupts in the LLDD, like the "blk-mq: improvement
CPU hotplug" series, so I will need to wait on that for the moment.
Thanks,
John