From: Logan Gunthorpe <logang@deltatee.com>
To: Dave Jiang <dave.jiang@intel.com>,
	linux-kernel@vger.kernel.org, linux-ntb@googlegroups.com,
	linux-pci@vger.kernel.org, iommu@lists.linux-foundation.org,
	linux-kselftest@vger.kernel.org, Jon Mason <jdmason@kudzu.us>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Joerg Roedel <joro@8bytes.org>
Cc: Allen Hubbe <allenbh@gmail.com>,
	Serge Semin <fancer.lancer@gmail.com>,
	Eric Pilmore <epilmore@gigaio.com>
Subject: Re: [PATCH 0/9] Support using MSI interrupts in ntb_transport
Date: Thu, 31 Jan 2019 15:39:51 -0700
Message-ID: <29f7e3fe-5354-6156-1243-7248ffb2249f@deltatee.com>
In-Reply-To: <ea7b7c3c-8ed5-51ec-022a-7d28a5afce82@intel.com>



On 2019-01-31 1:58 p.m., Dave Jiang wrote:
> 
> On 1/31/2019 1:48 PM, Logan Gunthorpe wrote:
>>
>> On 2019-01-31 1:20 p.m., Dave Jiang wrote:
>>> Does this work when the system moves the MSI vector either via software
>>> (irqbalance) or BIOS APIC programming (some modes cause round robin
>>> behavior)?
>>
>> I don't know how irqbalance works, and I'm not sure what you are
>> referring to by BIOS APIC programming, however I would expect these
>> things would not be a problem.
>>
>> The MSI code I'm presenting here doesn't do anything crazy with the
>> interrupts, it allocates and uses them just as any PCI driver would. The
>> only real difference here is that instead of a piece of hardware sending
>> the IRQ TLP, it will be sent through the memory window (which, from the
>> OS's perspective, is just coming from an NTB hardware proxy alias).
>>
>> Logan
> Right. I did that as a hack a while back for some silicon errata 
> workaround. When the vector moves, the address for the LAPIC changes. So 
> unless it gets updated, you end up writing to the old location and lose 
> all the new interrupts. irqbalance is a user daemon that rotates the 
> system interrupts around to ensure that not all interrupts are pinned on 
> a single core. 

Yes, that would be a problem if something changes the MSI vectors out
from under us. Seems like that would be a bit difficult to do even with
regular hardware. So far I haven't seen anything that would do that. If
you know where in the kernel this happens, I'd be interested in
getting a pointer to the flow in the code. If that is the case, this MSI
stuff will need to get much more complicated...
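
To make the failure mode described above concrete, here is a toy Python
model (not kernel code). It assumes the simplified x86 MSI encoding
without interrupt remapping: the destination APIC ID sits in address
bits 19:12 and the vector in data bits 7:0. The Host and Peer classes
and their methods are hypothetical names invented for this sketch; the
peer stands in for the remote side of the NTB that caches the
address/data pair and writes it through the memory window.

```python
from dataclasses import dataclass

MSI_BASE = 0xFEE00000  # x86 MSI address window (no interrupt remapping assumed)


@dataclass(frozen=True)
class MsiMsg:
    address: int
    data: int


def compose_msi(dest_apic_id: int, vector: int) -> MsiMsg:
    """Simplified encoding: APIC ID in address bits 19:12, vector in data bits 7:0."""
    return MsiMsg(MSI_BASE | ((dest_apic_id & 0xFF) << 12), vector & 0xFF)


class Host:
    """Hypothetical receiver: only the currently programmed message is handled."""

    def __init__(self, apic_id: int, vector: int) -> None:
        self.msg = compose_msi(apic_id, vector)
        self.delivered = 0
        self.lost = 0

    def migrate(self, apic_id: int, vector: int) -> None:
        # Models an affinity change: the address/data pair is rewritten.
        self.msg = compose_msi(apic_id, vector)

    def receive(self, msg: MsiMsg) -> None:
        if msg == self.msg:
            self.delivered += 1
        else:
            self.lost += 1  # write landed on the old CPU/vector


class Peer:
    """Hypothetical remote side that caches the pair it was given at setup."""

    def __init__(self, host: Host) -> None:
        self.host = host
        self.cached = host.msg

    def refresh(self) -> None:
        self.cached = self.host.msg

    def trigger(self) -> None:
        self.host.receive(self.cached)


host = Host(apic_id=0, vector=0x41)
peer = Peer(host)
peer.trigger()          # delivered: cached pair matches
host.migrate(2, 0x52)   # vector moves to another CPU
peer.trigger()          # lost: peer still writes the stale pair
peer.refresh()
peer.trigger()          # delivered again after the cache is updated
print(host.delivered, host.lost)  # 2 1
```

In this model the only fix is the refresh() step, which is the extra
complication referred to above: the remote side would have to be told
every time the pair changes.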

> I think it's enabled by default on several distros. 

> Although MSI-X has nothing to do with the IOAPIC, the mode in which the
> APIC is programmed can influence how the interrupts are delivered.
> Certain Intel platforms (I don't know if AMD does anything like that)
> put the IOAPIC in a configuration that causes the interrupts to be
> moved in a round robin fashion. I think it's physical flat mode? I
> don't quite recall. Normally on the low end Xeons. It's probably worth
> doing a test run with the irqbalance daemon running to make sure your
> traffic stream doesn't all of a sudden stop.

I've tested with irqbalance running and haven't found any noticeable
difference.
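
One way such a test run could be checked is sketched below in Python:
sample /proc/interrupts before and after a traffic run and see which
CPUs took interrupts for a given IRQ. The helper names, the IRQ number
24 and the "ntb_example" label are made up for illustration; the sample
text only mimics the usual /proc/interrupts layout (a CPU header row
followed by one row per IRQ).

```python
from typing import List


def parse_irq_counts(text: str, irq: str) -> List[int]:
    """Return the per-CPU counts for one IRQ from /proc/interrupts-style text."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == irq:
            return [int(field) for field in rest.split()[:ncpus]]
    raise KeyError(irq)


def active_cpus(before: List[int], after: List[int]) -> List[int]:
    """CPUs whose count for the IRQ increased between the two samples."""
    return [cpu for cpu, (b, a) in enumerate(zip(before, after)) if a > b]


# Made-up samples; on a real system read /proc/interrupts twice instead.
SAMPLE_BEFORE = """\
           CPU0       CPU1
 24:        100          0   PCI-MSI 512000-edge      ntb_example
"""
SAMPLE_AFTER = """\
           CPU0       CPU1
 24:        100         37   PCI-MSI 512000-edge      ntb_example
"""

before = parse_irq_counts(SAMPLE_BEFORE, "24")
after = parse_irq_counts(SAMPLE_AFTER, "24")
print(active_cpus(before, after))  # [1]
```

If the list of active CPUs changes between runs while the counts keep
increasing, the interrupt has moved and is still being delivered. An
empty list during a traffic run would point to lost interrupts.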

Logan

Thread overview:
2019-01-31 18:56 [PATCH 0/9] Support using MSI interrupts in ntb_transport Logan Gunthorpe
2019-01-31 18:56 ` [PATCH 1/9] iommu/vt-d: Allow interrupts from the entire bus for aliased devices Logan Gunthorpe
2019-02-01 16:44   ` Joerg Roedel
2019-02-01 17:27     ` Logan Gunthorpe
2019-02-05 19:19       ` Jacob Pan
2019-02-05 20:40         ` Logan Gunthorpe
2019-02-05 23:58           ` Jacob Pan
2019-01-31 18:56 ` [PATCH 2/9] PCI/MSI: Support allocating virtual MSI interrupts Logan Gunthorpe
2019-01-31 22:39   ` Bjorn Helgaas
2019-01-31 22:52     ` Logan Gunthorpe
2019-02-01 19:23       ` Bjorn Helgaas
2019-01-31 18:56 ` [PATCH 3/9] PCI/switchtec: Add module parameter to request more interrupts Logan Gunthorpe
2019-01-31 18:56 ` [PATCH 4/9] NTB: Introduce functions to calculate multi-port resource index Logan Gunthorpe
2019-01-31 18:56 ` [PATCH 5/9] NTB: Rename ntb.c to support multiple source files in the module Logan Gunthorpe
2019-01-31 18:56 ` [PATCH 6/9] NTB: Introduce MSI library Logan Gunthorpe
2019-01-31 18:56 ` [PATCH 7/9] NTB: Introduce NTB MSI Test Client Logan Gunthorpe
2019-01-31 18:56 ` [PATCH 8/9] NTB: Add ntb_msi_test support to ntb_test Logan Gunthorpe
2019-01-31 18:56 ` [PATCH 9/9] NTB: Add MSI interrupt support to ntb_transport Logan Gunthorpe
2019-01-31 20:20 ` [PATCH 0/9] Support using MSI interrupts in ntb_transport Dave Jiang
2019-01-31 20:48   ` Logan Gunthorpe
2019-01-31 20:58     ` Dave Jiang
2019-01-31 22:39       ` Logan Gunthorpe [this message]
2019-01-31 22:46         ` Dave Jiang
2019-01-31 23:41           ` Logan Gunthorpe
2019-01-31 23:48             ` Dave Jiang
2019-01-31 23:52               ` Logan Gunthorpe
