* About irq_create_affinity_masks() for a platform device driver
@ 2020-01-22 10:09 John Garry
  2020-01-22 10:59 ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: John Garry @ 2020-01-22 10:09 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang

Hi Thomas,

A question about the function in the $subject: would there be any issue
with a SCSI platform device driver referencing this function?

So I have a multi-queue platform device, and I want to spread interrupts
over all possible CPUs, just like we can do for PCI MSI vectors. This
topic was touched on in [0].

And, if it's ok, could we export that same symbol?

Cheers,
John

[0] https://lore.kernel.org/lkml/88d64d51-4344-e908-b55b-0583b0137ddf@huawei.com/

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: About irq_create_affinity_masks() for a platform device driver
  2020-01-22 10:09 About irq_create_affinity_masks() for a platform device driver John Garry
@ 2020-01-22 10:59 ` Thomas Gleixner
  2020-01-22 11:27   ` John Garry
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2020-01-22 10:59 UTC (permalink / raw)
  To: John Garry; +Cc: Marc Zyngier, linux-kernel, chenxiang

John,

John Garry <john.garry@huawei.com> writes:
> Would there be any issue with a SCSI platform device driver referencing
> this function?
>
> So I have a multi-queue platform device, and I want to spread interrupts
> over all possible CPUs, just like we can do for PCI MSI vectors. This
> topic was touched on in [0].
>
> And, if it's ok, could we export that same symbol?

I think you will need something similar to what we have in the pci/msi
code, but that shouldn't be in your device driver. So I'd rather create
platform infrastructure for this and export that.

Thanks,

        tglx
* Re: About irq_create_affinity_masks() for a platform device driver
  2020-01-22 10:59 ` Thomas Gleixner
@ 2020-01-22 11:27   ` John Garry
  2020-01-31 14:25     ` John Garry
  0 siblings, 1 reply; 8+ messages in thread
From: John Garry @ 2020-01-22 11:27 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang

On 22/01/2020 10:59, Thomas Gleixner wrote:

Hi Thomas,

> John Garry <john.garry@huawei.com> writes:
>> Would there be any issue with a SCSI platform device driver referencing
>> this function?
>>
>> So I have a multi-queue platform device, and I want to spread interrupts
>> over all possible CPUs, just like we can do for PCI MSI vectors. This
>> topic was touched on in [0].
>>
>> And, if it's ok, could we export that same symbol?
>
> I think you will need something similar to what we have in the pci/msi
> code, but that shouldn't be in your device driver. So I'd rather create
> platform infrastructure for this and export that.

That would seem the proper thing to do.

So I was doing this for legacy hw as a cheap and quick performance
boost, but I doubt there would be many other users in future for any new
API. Also, the effort could be more than the reward, and so I may
consider dropping the whole idea.

But I'll have a play with how the code could look now.

Cheers,
john
* Re: About irq_create_affinity_masks() for a platform device driver
  2020-01-22 11:27   ` John Garry
@ 2020-01-31 14:25     ` John Garry
  2020-01-31 21:41       ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: John Garry @ 2020-01-31 14:25 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang

Hi Thomas,

>> John Garry <john.garry@huawei.com> writes:
>>> Would there be any issue with a SCSI platform device driver referencing
>>> this function?
>>>
>>> So I have a multi-queue platform device, and I want to spread interrupts
>>> over all possible CPUs, just like we can do for PCI MSI vectors. This
>>> topic was touched on in [0].
>>>
>>> And, if it's ok, could we export that same symbol?
>>
>> I think you will need something similar to what we have in the pci/msi
>> code, but that shouldn't be in your device driver. So I'd rather create
>> platform infrastructure for this and export that.
>
> That would seem the proper thing to do.
>
> So I was doing this for legacy hw as a cheap and quick performance
> boost, but I doubt there would be many other users in future for any new
> API. Also, the effort could be more than the reward, and so I may
> consider dropping the whole idea.
>
> But I'll have a play with how the code could look now.
So I'd figure that an API like this would be required:

--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -11,6 +11,7 @@
 #define _PLATFORM_DEVICE_H_

 #include <linux/device.h>
+#include <linux/interrupt.h>

 #define PLATFORM_DEVID_NONE	(-1)
 #define PLATFORM_DEVID_AUTO	(-2)
@@ -27,6 +28,7 @@ struct platform_device {
 	u64		dma_mask;
 	u32		num_resources;
 	struct resource	*resource;
+	struct irq_affinity_desc *desc;

and in platform.c, adding:

/**
 * platform_get_irqs_affinity - get all IRQs for a device with affinity
 * @dev: platform device
 * @affd: Affinity descriptor
 * @count: pointer to count of IRQs
 * @irqs: pointer holder for irq numbers
 *
 * Gets a full set of IRQs for a platform device
 *
 * Return: 0 on success, negative error number on failure.
 */
int platform_get_irqs_affinity(struct platform_device *dev,
			       struct irq_affinity *affd,
			       unsigned int *count, int **irqs)
{
	int i;
	int *pirqs;

	if (ACPI_COMPANION(&dev->dev)) {
		*count = acpi_irq_get_count(ACPI_HANDLE(&dev->dev));
	} else {
		// TODO
	}

	pirqs = kzalloc(*count * sizeof(int), GFP_KERNEL);
	if (!pirqs)
		return -ENOMEM;

	dev->desc = irq_create_affinity_masks(*count, affd);
	if (!dev->desc) {
		kfree(irqs);
		return -ENOMEM;
	}

	for (i = 0; i < *count; i++) {
		pirqs[i] = platform_get_irq(dev, i);
		if (irqs[i] < 0) {
			kfree(dev->desc);
			kfree(irqs);
			return -ENOMEM;
		}
	}

	*irqs = pirqs;

	return 0;
}
EXPORT_SYMBOL_GPL(platform_get_irqs_affinity);

Here we pass the affinity descriptor and allocate all IRQs for a device.

So this is less than a half-baked solution. We only create the affinity
masks but do nothing with them, and the actual irq_descs generated would
not have their affinity mask set and would not be managed. Only the
platform device driver itself would access the masks, to set the irq
affinity hint, etc.
To achieve the proper result, we would somehow need to pass the per-IRQ
affinity descriptor all the way down through
platform_get_irq()->acpi_irq_get()->irq_create_fwspec_mapping()->irq_domain_alloc_irqs(),
which could involve disruptive changes in different subsystems - not
welcome, I'd say.

I could take the alternative approach of generating the interrupt
affinity masks in my LLDD instead. Considering I know some of the CPU
and NUMA node properties of the device host, I could generate the masks
in the LLDD itself simply, but I would still rather avoid this if
possible and use standard APIs.

So if there are any better ideas on this, then it would be good to hear
them.

Thanks,
john
* Re: About irq_create_affinity_masks() for a platform device driver
  2020-01-31 14:25     ` John Garry
@ 2020-01-31 21:41       ` Thomas Gleixner
  2020-02-03 15:00         ` John Garry
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2020-01-31 21:41 UTC (permalink / raw)
  To: John Garry; +Cc: Marc Zyngier, linux-kernel, chenxiang

John!

John Garry <john.garry@huawei.com> writes:
> So I'd figure that an API like this would be required:
>
> int platform_get_irqs_affinity(struct platform_device *dev,
> 			       struct irq_affinity *affd,
> 			       unsigned int *count, int **irqs)
> {
> 	int i;
> 	int *pirqs;
>
> 	if (ACPI_COMPANION(&dev->dev)) {
> 		*count = acpi_irq_get_count(ACPI_HANDLE(&dev->dev));
> 	} else {
> 		// TODO
> 	}
>
> 	pirqs = kzalloc(*count * sizeof(int), GFP_KERNEL);
> 	if (!pirqs)
> 		return -ENOMEM;
>
> 	dev->desc = irq_create_affinity_masks(*count, affd);
> 	if (!dev->desc) {
> 		kfree(irqs);

pirqs I assume, and this also leaks the affinity masks and the pointer
in dev.

> 		return -ENOMEM;
> 	}
>
> 	for (i = 0; i < *count; i++) {
> 		pirqs[i] = platform_get_irq(dev, i);
> 		if (irqs[i] < 0) {
> 			kfree(dev->desc);
> 			kfree(irqs);
> 			return -ENOMEM;

That's obviously broken as well :)

> 		}
> 	}
>
> 	*irqs = pirqs;
>
> 	return 0;
> }
> EXPORT_SYMBOL_GPL(platform_get_irqs_affinity);
>
> Here we pass the affinity descriptor and allocate all IRQs for a device.
>
> So this is less than a half-baked solution. We only create the affinity
> masks but do nothing with them, and the actual irq_descs generated would
> not have their affinity mask set and would not be managed. Only the
> platform device driver itself would access the masks, to set the irq
> affinity hint, etc.
>
> To achieve the proper result, we would somehow need to pass the per-IRQ
> affinity descriptor all the way down through
> platform_get_irq()->acpi_irq_get()->irq_create_fwspec_mapping()->irq_domain_alloc_irqs(),
> which could involve disruptive changes in different subsystems - not
> welcome, I'd say.
>
> I could take the alternative approach of generating the interrupt
> affinity masks in my LLDD instead. Considering I know some of the CPU
> and NUMA node properties of the device host, I could generate the masks
> in the LLDD itself simply, but I would still rather avoid this if
> possible and use standard APIs.
>
> So if there are any better ideas on this, then it would be good to hear
> them.

I wouldn't mind to expose a function which allows you to switch the
allocated interrupts to managed. The reason why we do it in one go in
the PCI code is that we get automatically the irq descriptors allocated
on the correct node. So if the node aware allocation is not a
showstopper for this then your function would do:

	...
	for (i = 0; i < count; i++) {
		pirqs[i] = platform_get_irq(dev, i);

		irq_update_affinity_desc(pirqs[i], affdescs + i);
	}

int irq_update_affinity_desc(unsigned int irq, irq_affinity_desc *affinity)
{
	unsigned long flags;
	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);

	if (!desc)
		return -EINVAL;

	if (affinity->is_managed) {
		irqd_set(&desc->irq_data, IRQD_IRQ_DISABLED);
		irqd_set(&desc->irq_data, IRQD_IRQ_MASKED);
	}
	cpumask_copy(desc->irq_common_data.affinity, affinity);
	return 0;
}

That should just work.

Thanks,

        tglx
* Re: About irq_create_affinity_masks() for a platform device driver
  2020-01-31 21:41       ` Thomas Gleixner
@ 2020-02-03 15:00         ` John Garry
  2020-02-04  9:20           ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: John Garry @ 2020-02-03 15:00 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang

Hi Thomas,

>> 	pirqs = kzalloc(*count * sizeof(int), GFP_KERNEL);
>> 	if (!pirqs)
>> 		return -ENOMEM;
>>
>> 	dev->desc = irq_create_affinity_masks(*count, affd);
>> 	if (!dev->desc) {
>> 		kfree(irqs);
>
> pirqs I assume, and this also leaks the affinity masks and the pointer
> in dev.

Right.

>> 		return -ENOMEM;
>> 	}
>>
>> 	for (i = 0; i < *count; i++) {
>> 		pirqs[i] = platform_get_irq(dev, i);
>> 		if (irqs[i] < 0) {
>> 			kfree(dev->desc);
>> 			kfree(irqs);
>> 			return -ENOMEM;
>
> That's obviously broken as well :)

Right, again.

>> 		}
>> 	}
>>
>> 	*irqs = pirqs;
>>
>> 	return 0;
>> }
>> EXPORT_SYMBOL_GPL(platform_get_irqs_affinity);

[...]

> I wouldn't mind to expose a function which allows you to switch the
> allocated interrupts to managed. The reason why we do it in one go in
> the PCI code is that we get automatically the irq descriptors allocated
> on the correct node. So if the node aware allocation is not a
> showstopper

I wouldn't say so for now.

> for this then your function would do:
>
> 	...
> 	for (i = 0; i < count; i++) {
> 		pirqs[i] = platform_get_irq(dev, i);
>
> 		irq_update_affinity_desc(pirqs[i], affdescs + i);
> 	}
>
> int irq_update_affinity_desc(unsigned int irq, irq_affinity_desc *affinity)
> {
> 	unsigned long flags;
> 	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
>
> 	if (!desc)
> 		return -EINVAL;
>
> 	if (affinity->is_managed) {
> 		irqd_set(&desc->irq_data, IRQD_IRQ_DISABLED);
> 		irqd_set(&desc->irq_data, IRQD_IRQ_MASKED);

Are these correct? I assume we want to follow alloc_descs() here.

> 	}
> 	cpumask_copy(desc->irq_common_data.affinity, affinity);
> 	return 0;
> }

I see. So I made a couple of changes and it did work:

int irq_update_affinity_desc(unsigned int irq,
			     struct irq_affinity_desc *affinity)
{
	unsigned long flags;
	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);

	if (!desc)
		return -EINVAL;

	if (affinity->is_managed) {
		irqd_set(&desc->irq_data, IRQD_AFFINITY_MANAGED);
		irqd_set(&desc->irq_data, IRQD_MANAGED_SHUTDOWN);
	}

	cpumask_copy(desc->irq_common_data.affinity, &affinity->mask);
	irq_put_desc_unlock(desc, flags);
	return 0;
}

And if we were to go this way, then we don't need to add the pointer in
struct platform_device to hold affinity mask descriptors, as we're using
them immediately. Or we could even have a single function in the irq
code to do it all (create the masks and update the affinity desc).

And since we're just updating the masks, I figure we shouldn't need to
add acpi_irq_get_count(), which I invented to get the irq count (without
creating the IRQ mapping).

Thanks,
John
* Re: About irq_create_affinity_masks() for a platform device driver
  2020-02-03 15:00         ` John Garry
@ 2020-02-04  9:20           ` Thomas Gleixner
  2020-02-04  9:55             ` John Garry
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2020-02-04  9:20 UTC (permalink / raw)
  To: John Garry; +Cc: Marc Zyngier, linux-kernel, chenxiang

John,

John Garry <john.garry@huawei.com> writes:
>> I wouldn't mind to expose a function which allows you to switch the
>> allocated interrupts to managed. The reason why we do it in one go in
>> the PCI code is that we get automatically the irq descriptors allocated
>> on the correct node. So if the node aware allocation is not a
>> showstopper
>
> I wouldn't say so for now.

Good.

>> for this then your function would do:
>>
>> 	...
>> 	for (i = 0; i < count; i++) {
>> 		pirqs[i] = platform_get_irq(dev, i);
>>
>> 		irq_update_affinity_desc(pirqs[i], affdescs + i);
>> 	}
>>
>> 	if (affinity->is_managed) {
>> 		irqd_set(&desc->irq_data, IRQD_IRQ_DISABLED);
>> 		irqd_set(&desc->irq_data, IRQD_IRQ_MASKED);
>
> Are these correct? I assume we want to follow alloc_descs() here.

Yeah, copied the wrong chunk :)

> I see. So I made a couple of changes and it did work:
>
> int irq_update_affinity_desc(unsigned int irq,
> 			     struct irq_affinity_desc *affinity)
> {
> 	unsigned long flags;
> 	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
>
> 	if (!desc)
> 		return -EINVAL;
>
> 	if (affinity->is_managed) {
> 		irqd_set(&desc->irq_data, IRQD_AFFINITY_MANAGED);
> 		irqd_set(&desc->irq_data, IRQD_MANAGED_SHUTDOWN);
> 	}
>
> 	cpumask_copy(desc->irq_common_data.affinity, &affinity->mask);
> 	irq_put_desc_unlock(desc, flags);
> 	return 0;
> }

Looks correct.

> And if we were to go this way, then we don't need to add the pointer in
> struct platform_device to hold affinity mask descriptors, as we're using
> them immediately. Or we could even have a single function in the irq
> code to do it all (create the masks and update the affinity desc).
>
> And since we're just updating the masks, I figure we shouldn't need to
> add acpi_irq_get_count(), which I invented to get the irq count (without
> creating the IRQ mapping).

Yes, you can create and apply the masks after setting up the interrupts.

Thanks,

        tglx
* Re: About irq_create_affinity_masks() for a platform device driver
  2020-02-04  9:20           ` Thomas Gleixner
@ 2020-02-04  9:55             ` John Garry
  0 siblings, 0 replies; 8+ messages in thread
From: John Garry @ 2020-02-04  9:55 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Marc Zyngier, linux-kernel, chenxiang

Hi Thomas,

>> And if we were to go this way, then we don't need to add the pointer in
>> struct platform_device to hold affinity mask descriptors, as we're using
>> them immediately. Or we could even have a single function in the irq
>> code to do it all (create the masks and update the affinity desc).
>>
>> And since we're just updating the masks, I figure we shouldn't need to
>> add acpi_irq_get_count(), which I invented to get the irq count (without
>> creating the IRQ mapping).
>
> Yes, you can create and apply the masks after setting up the interrupts.

So my original concern was that the changes here would be quite
disruptive, but that does not look to be the case.

I need a couple more things to go into the kernel before I can safely
switch to use managed interrupts in the LLDD, like the "blk-mq:
improvement CPU hotplug" series, so I will need to wait on that for the
moment.

Thanks,
John
end of thread, other threads:[~2020-02-04  9:55 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-22 10:09 About irq_create_affinity_masks() for a platform device driver John Garry
2020-01-22 10:59 ` Thomas Gleixner
2020-01-22 11:27   ` John Garry
2020-01-31 14:25     ` John Garry
2020-01-31 21:41       ` Thomas Gleixner
2020-02-03 15:00         ` John Garry
2020-02-04  9:20           ` Thomas Gleixner
2020-02-04  9:55             ` John Garry