linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: John Garry <john.garry@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	gregkh@linuxfoundation.org, rafael@kernel.org,
	martin.petersen@oracle.com, jejb@linux.ibm.com,
	linuxarm@huawei.com, linux-scsi@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] genirq/affinity: Add irq_update_affinity_desc()
Date: Wed, 25 Nov 2020 18:35:27 +0000	[thread overview]
Message-ID: <5a588f5d86010602ff9a90e8f057743c@kernel.org> (raw)
In-Reply-To: <702e1729-9a4b-b16f-6a58-33172b1a3220@huawei.com>

On 2020-11-24 17:38, John Garry wrote:
> Hi Marc,
> 
>>> So initially in the msi_prepare method we setup the its dev - this is
>>> from the mbigen probe. Then when all the irqs are unmapped later for
>>> end device driver removal, we release this its device in
>>> its_irq_domain_free(). But I don't see anything to set it up again. 
>>> Is
>>> it improper to have released the its device in this scenario?
>>> Commenting out the release makes things "good" again.
>> 
>> Huh, that's ugly. The issue is that the device that deals with the
>> interrupts isn't the device that the ITS knows about (there isn't a
>> 1:1 mapping between mbigen and the endpoint).
>> 
>> The mbigen is responsible for the creation of the corresponding
>> irqdomain, and and crucially for the "prepare" phase, which results
>> in storing the its_dev pointer in info->scratchpad[0].
>> 
>> As we free all the interrupts associated with the endpoint, we
>> free the its_dev (nothing else needs it at this point). On the
>> next allocation, we reuse the damn its_dev pointer, and we're SOL.
>> This is wrong, because we haven't removed the mbigen, only the
>> device *connected* to the mbigen. And since the mbigen can be shared
>> across endpoints, we can't reliably tear it down at all. Boo.
>> 
>> The only thing to do is to convey that by marking the its_dev as
>> shared so that it isn't deleted when no LPIs are being used. After
>> all, it isn't like the mbigen is going anywhere.
> 
> Right, I did consider this.

FWIW, I've pushed my hack branch[1] out with a couple of patches
for you to try (the top 3 patches). They allow platform-MSI domains
created by devices (mbigen, ICU) to be advertised as shared between
devices, so that the low-level driver can handle that in an appropriate
way.

I gave it a go on my D05 and nothing blew up, but I can't really remove
the kernel module, as that's where my disks are... :-/
Please let me know if that helps.

>> It is just that passing that information down isn't a simple affair,
>> as msi_alloc_info_t isn't a generic type... Let me have a think.
> 
> I think that there is a way to circumvent the problem, which you might
> call hacky, but OTOH, not sure if there's much point changing mbigen
> or related infrastructure at this stage.

Bah, it's a simple change, and there is now more than the mbigen using
the same API...

> 
> Anyway, so we have 128 irqs in total for the mbigen domain, but the
> driver only is interesting in something like irq indexes 1,2,72-81,
> and 96-112. So we can just dispose the mappings for irq index 0-112 at
> removal stage, thereby keeping the its device around. We do still call
> platform_irq_count(), which sets up all 128 mappings, so maybe we
> should be unmapping all of these - this would be the contentious part.
> But maybe not, as the device driver is only interested in that subset,
> and has no business unmapping the rest.

I don't think the driver should mess with interrupts it doesn't own.
And while the mbigen port that is connected to the SAS controller
doesn't seem to be shared between endpoints, some other ports definitely
are:

# cat /sys/kernel/debug/irq/domains/\\_SB.MBI1
name:   \_SB.MBI1
  size:   409
  mapped: 192
  flags:  0x00000003

[...]

I guess that the other 217 lines are connected somewhere.

> With that change, the platform.c API would work a bit more like the
> pci msi code equivalent, where we request a min and max number of
> vectors. In fact, that platform.c change needs to be made anyway as
> platform_get_irqs_affinity() is broken currently for when nr_cpus <
> #hw queues.
> 
> Thoughts?

I'm happy to look at some code! ;-)

         M.
-- 
Jazz is not dead. It just smells funny...

  reply	other threads:[~2020-11-25 18:35 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-28 12:33 [PATCH v2 0/3] Support managed interrupts for platform devices John Garry
2020-10-28 12:33 ` [PATCH v2 1/3] genirq/affinity: Add irq_update_affinity_desc() John Garry
2020-10-28 18:22   ` Thomas Gleixner
2020-11-02 17:32     ` John Garry
2020-11-02 20:35       ` Thomas Gleixner
2020-11-17 21:28         ` Thomas Gleixner
2020-11-18 11:34           ` John Garry
2020-11-18 20:38             ` Thomas Gleixner
2020-11-19  9:31               ` John Garry
2020-11-19 18:09                 ` Thomas Gleixner
2020-11-19 19:56                   ` John Garry
2020-11-19 21:03                     ` Thomas Gleixner
2020-11-20 11:52                       ` John Garry
2020-11-22 13:38                         ` Marc Zyngier
2020-11-23 12:54                           ` John Garry
2020-11-23 13:26                             ` Marc Zyngier
2020-11-23 15:45                               ` John Garry
2020-11-24 16:51                                 ` Marc Zyngier
2020-11-24 17:38                                   ` John Garry
2020-11-25 18:35                                     ` Marc Zyngier [this message]
2020-11-26 10:47                                       ` John Garry
2020-11-26 11:08                                         ` Marc Zyngier
2020-11-26 11:29                                           ` John Garry
2020-11-26 16:52                                             ` John Garry
2020-11-27  9:57                                               ` Marc Zyngier
2020-11-27 12:45                                                 ` John Garry
2020-11-27 12:49                                                   ` Marc Zyngier
2020-10-28 12:33 ` [PATCH v2 2/3] Driver core: platform: Add platform_get_irqs_affinity() John Garry
2020-10-28 12:33 ` [PATCH v2 3/3] scsi: hisi_sas: Expose HW queues for v2 hw John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5a588f5d86010602ff9a90e8f057743c@kernel.org \
    --to=maz@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jejb@linux.ibm.com \
    --cc=john.garry@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=martin.petersen@oracle.com \
    --cc=rafael@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).