dmaengine.vger.kernel.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: "Dey, Megha" <megha.dey@intel.com>,
	Marc Zyngier <maz@kernel.org>,
	"Jiang, Dave" <dave.jiang@intel.com>,
	"vkoul@kernel.org" <vkoul@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"Pan, Jacob jun" <jacob.jun.pan@intel.com>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"Lu, Baolu" <baolu.lu@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"Kumar, Sanjay K" <sanjay.k.kumar@intel.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	"Lin, Jing" <jing.lin@intel.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"parav@mellanox.com" <parav@mellanox.com>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	"netanelg@mellanox.com" <netanelg@mellanox.com>,
	"shahafs@mellanox.com" <shahafs@mellanox.com>,
	"yan.y.zhao@linux.intel.com" <yan.y.zhao@linux.intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"Ortiz, Samuel" <samuel.ortiz@intel.com>,
	"Hossain, Mona" <mona.hossain@intel.com>,
	"dmaengine@vger.kernel.org" <dmaengine@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [PATCH RFC v2 02/18] irq/dev-msi: Add support for a new DEV_MSI irq domain
Date: Fri, 07 Aug 2020 17:22:34 +0200
Message-ID: <87y2mqfpl1.fsf@nanos.tec.linutronix.de>
In-Reply-To: <20200807120650.GR16789@nvidia.com>

Jason,

Jason Gunthorpe <jgg@nvidia.com> writes:
> On Thu, Aug 06, 2020 at 10:21:11PM +0200, Thomas Gleixner wrote:
>
>> Optionally? Please tell the hardware folks to make this mandatory. We
>> have enough pain with non maskable MSI interrupts already so introducing
>> yet another non maskable interrupt trainwreck is not an option.
>
> Can you elaborate on the flows where Linux will need to trigger
> masking?

1) disable/enable_irq() obviously needs masking
   
2) Affinity changes are preferably done with masking to avoid a
   boatload of nasty side effects. We have a "fix" for 32bit addressing
   mode which works by chance due to the layout but it would fail
   miserably with 64bit addressing mode. 64bit addressing mode is only
   relevant for more than 256 CPUs which requires X2APIC which in turn
   requires interrupt remapping. Interrupt remapping saves us here
   because the interrupt can be disabled at the remapping level.

3) The ability to shut down an irq at the interrupt level in case of
   malfunction. Of course that's pure paranoia because devices are
   perfect and never misbehave :)

So it's nowhere in the hot path of interrupt handling itself.
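
To make the three flows above concrete, here is a rough userspace toy model
in C. It is only a sketch: every name in it (toy_irq, toy_mask, ...) is
made up for illustration and none of it is kernel API. It masks eagerly
for simplicity, while the real interrupt core is considerably more
involved. The point is merely that masking is touched on the slow paths
and that the delivery check itself never changes the mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
struct toy_irq {
	bool masked;		/* state of the (assumed) per-source mask bit */
	unsigned int depth;	/* nested disable count */
	unsigned int cpu;	/* current target CPU */
	unsigned int mask_ops;	/* how often the mask bit was written */
};

void toy_mask(struct toy_irq *irq)
{
	irq->masked = true;
	irq->mask_ops++;
}

void toy_unmask(struct toy_irq *irq)
{
	irq->masked = false;
	irq->mask_ops++;
}

/* Flow 1: disable/enable nest, only the outermost pair writes the mask. */
void toy_disable_irq(struct toy_irq *irq)
{
	if (irq->depth++ == 0)
		toy_mask(irq);
}

void toy_enable_irq(struct toy_irq *irq)
{
	assert(irq->depth > 0);
	if (--irq->depth == 0)
		toy_unmask(irq);
}

/* Flow 2: retarget while masked so no half-updated message can fire. */
void toy_set_affinity(struct toy_irq *irq, unsigned int cpu)
{
	bool was_masked = irq->masked;

	if (!was_masked)
		toy_mask(irq);
	irq->cpu = cpu;
	if (!was_masked)
		toy_unmask(irq);
}

/* Flow 3: silence a misbehaving source and leave it masked. */
void toy_shutdown(struct toy_irq *irq)
{
	toy_mask(irq);
}

/* Delivery only reads the mask state, it never writes it. */
bool toy_deliver(const struct toy_irq *irq)
{
	return !irq->masked;
}
```

All three slow paths end up in toy_mask(), which is why a source without
a mask bit leaves each of them without a safe way to proceed.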

> I expect that masking will be available in our NIC HW too - but it
> will require a spin loop if masking has to be done in an atomic
> context.

Yes, it's all in atomic context.

We have functionality in the interrupt core to do #1 and #2 from task
context (requires the caller to be in task context as well). #3 not so
much.

>> It's more than a decade now that I tell HW people not to repeat the
>> non-maskable MSI failure, but obviously they still think that
>> non-maskable interrupts are a brilliant idea. I know that HW folks
>> believe that everything they omit can be fixed in software, but they
>> have to finally understand that this particular issue _cannot_ be fixed
>> at all.
>
> Sure, the CPU should always be able to shut off an interrupt!

Oh yes!

> Maybe explaining the goals would help understand the HW perspective.
>
> Today HW can process > 100k queues of work at once. Interrupt delivery
> works by having a MSI index in each queue's metadata and the interrupt
> indirects through a MSI-X table on-chip which has the
> addr/data/mask/etc.
>
> What IMS proposes is that the interrupt data can move into the queue
> meta data (which is not required to be on-chip), eg along side the
> producer/consumer pointers, and the central MSI-X table is not
> needed. This is necessary because the PCI spec has very harsh design
> requirements for a MSI-X table that make scaling it prohibitive.

I know.

> So an IRQ can be silenced by deleting or stopping the queue(s)
> triggering it.

We cannot do that from the interrupt layer without squaring the
circle and violating all locking and layering rules in one go.

> It can be masked by including masking in the queue metadata. We can
> detect pending by checking the producer/consumer values.
>
> However synchronizing all the HW and all the state is now more
> complicated than just writing a mask bit via MMIO to an on-die memory.

That's one of the reasons why I think that the IMS handling has to be a
per device irqdomain with its own interrupt chip because the way
IMS is managed is completely device specific.

There is certainly opportunity for sharing some of the functionality and
code, but not by creating a pseudo-shared entity which is customized per
device with indirections and magic storage plus device specific IMS slot
management glued onto it as a wart. Such concepts fall apart in no time or
end up in a completely unmaintainable mess.
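
As a rough illustration of the per device split, here is a userspace
sketch in C. Everything in it is hypothetical: toy_ims_chip is a stand-in
for an interrupt chip, and "device A" and "device B" with their storage
layouts are invented for the example. Device A keeps the message in a
slot with a control word, device B keeps it in queue metadata next to the
producer/consumer indices. Only the small helper at the end is shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a per-device interrupt chip. */
struct toy_ims_chip {
	const char *name;
	void (*mask)(void *slot);
	void (*unmask)(void *slot);
	bool (*pending)(const void *slot);
};

/* Device A (made up): message store is a slot with a control word. */
#define DEV_A_CTRL_MASK		0x1u
#define DEV_A_CTRL_PENDING	0x2u

struct dev_a_slot {
	uint64_t addr;
	uint32_t data;
	uint32_t ctrl;
};

static void dev_a_mask(void *slot)
{
	((struct dev_a_slot *)slot)->ctrl |= DEV_A_CTRL_MASK;
}

static void dev_a_unmask(void *slot)
{
	((struct dev_a_slot *)slot)->ctrl &= ~DEV_A_CTRL_MASK;
}

static bool dev_a_pending(const void *slot)
{
	return (((const struct dev_a_slot *)slot)->ctrl & DEV_A_CTRL_PENDING) != 0;
}

/* Device B (made up): message lives in the queue metadata. */
struct dev_b_queue {
	uint32_t producer;
	uint32_t consumer;
	uint64_t msg_addr;
	uint32_t msg_data;
	bool irq_masked;
};

static void dev_b_mask(void *slot)
{
	((struct dev_b_queue *)slot)->irq_masked = true;
}

static void dev_b_unmask(void *slot)
{
	((struct dev_b_queue *)slot)->irq_masked = false;
}

/* Pending is derived from the producer/consumer values, not from a bit. */
static bool dev_b_pending(const void *slot)
{
	const struct dev_b_queue *q = slot;

	return q->producer != q->consumer;
}

/* One chip per device type; the storage details stay with the device. */
const struct toy_ims_chip dev_a_chip = {
	.name = "dev-a-ims", .mask = dev_a_mask,
	.unmask = dev_a_unmask, .pending = dev_a_pending,
};

const struct toy_ims_chip dev_b_chip = {
	.name = "dev-b-ims", .mask = dev_b_mask,
	.unmask = dev_b_unmask, .pending = dev_b_pending,
};

/* Shared code only talks to the callbacks, never to the storage. */
bool toy_ims_mask_and_test_pending(const struct toy_ims_chip *chip, void *slot)
{
	assert(chip && chip->mask && chip->pending);
	chip->mask(slot);
	return chip->pending(slot);
}
```

The shared helper never learns where either device keeps its message or
its mask state, which is the kind of sharing that does not need a
pseudo-shared entity.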

Coming back to mask/unmask. We could lift that requirement if and only
if irq remapping is mandatory to make use of those magic devices because
the remapping unit allows us to do the masking. That still would not
justify the pseudo-shared irqdomain because the IMS slot management
still stays per device.

Thanks,

        tglx

