From: Arnd Bergmann <arnd@arndb.de>
To: linux-arm-kernel@lists.infradead.org
Cc: Sinan Kaya <okaya@codeaurora.org>,
	Abhijit Mahajan <abhijit.mahajan@avagotech.com>,
	linux-scsi@vger.kernel.org,
	Nagalakshmi Nandigama <nagalakshmi.nandigama@avagotech.com>,
	jcm@redhat.com, timur@codeaurora.org,
	"James E.J. Bottomley" <JBottomley@odin.com>,
	linux-kernel@vger.kernel.org,
	Sreekanth Reddy <sreekanth.reddy@avagotech.com>,
	Praveen Krishnamoorthy <praveen.krishnamoorthy@avagotech.com>,
	Hannes Reinecke <hare@suse.de>,
	linux-arm-msm@vger.kernel.org, agross@codeaurora.org,
	MPT-FusionLinux.pdl@avagotech.com, cov@codeaurora.org
Subject: Re: [PATCH V2 1/3] scsi: mptxsas: try 64 bit DMA when 32 bit DMA fails
Date: Tue, 10 Nov 2015 23:06:03 +0100
Message-ID: <3710077.X1fITqqHih@wuerfel>
In-Reply-To: <56425A6B.2070900@codeaurora.org>

On Tuesday 10 November 2015 15:58:19 Sinan Kaya wrote:
> 
> On 11/10/2015 2:56 PM, Arnd Bergmann wrote:
> >> The ACPI IORT table declares whether you enable IOMMU for a particular
> >> device or not. The placement of IOMMU HW is system specific. The IORT
> >> table gives the IOMMU HW topology to the operating system.
> > This sounds odd. Clearly you need to specify the IOMMU settings for each
> > possible PCI device independent of whether the OS actually uses the IOMMU
> > or not.
> 
> There are provisions to have DMA mask in the PCIe host bridge not at the 
> PCIe device level inside IORT table. This setting is specific for each 
> PCIe bus. It is not per PCIe device.

Same thing: I meant that the bootloader must provide all the information that
is needed to use the IOMMU on all PCI devices. I don't care where the IOMMU
driver gets that information. Some IOMMUs require programming a
bus/device/function specific number into the I/O page tables, and they
might not always have the same algorithm to map from the PCI numbers
into their own number space.
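
To make the mapping point concrete, here is a minimal userspace sketch, not
kernel code. The requester ID layout (8-bit bus, 5-bit device, 3-bit function)
is the standard PCI encoding; the two iommu_*_stream_id() helpers and the
firmware-provided base value are hypothetical and only show that two IOMMUs
may translate the same PCI numbers differently.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI requester ID: bus in bits 15:8, device in 7:3, function in 2:0. */
static uint16_t pci_requester_id(uint8_t bus, uint8_t dev, uint8_t fn)
{
	assert(dev < 32 && fn < 8);
	return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* Hypothetical IOMMU A: uses the requester ID unchanged as its own ID. */
static uint32_t iommu_a_stream_id(uint16_t rid)
{
	return rid;
}

/*
 * Hypothetical IOMMU B: adds a per-host-bridge base that firmware would
 * have to describe. The 'base' parameter is an assumption for illustration.
 */
static uint32_t iommu_b_stream_id(uint16_t rid, uint32_t base)
{
	return base + rid;
}
```

Whichever form the translation takes, the firmware description has to carry
enough information for the IOMMU driver to compute it for every device.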

> It is assumed that the endpoint device driver knows the hardware for 
> PCIe devices. The driver can also query the supported DMA bits by this 
> platform via DMA APIs and will request the correct DMA mask from the DMA 
> subsystem (surprise!).

I know how the negotiation works. Note that dma_get_required_mask()
will only tell you what mask the device needs to access all of memory,
while both the device and bus may have additional limitations, and
there is not always a solution.
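
As a rough model of that distinction, assuming a flat direct mapping with no
IOMMU: the sketch below is plain userspace C, not the kernel DMA API. The
function names, the bus_mask parameter and the example addresses are all
made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Smallest all-ones mask covering the highest RAM address. This models
 * what a "required mask" means; it is not the kernel implementation.
 */
static uint64_t required_mask(uint64_t highest_ram_addr)
{
	uint64_t mask = highest_ram_addr;

	/* Smear the top set bit into every lower bit position. */
	mask |= mask >> 1;
	mask |= mask >> 2;
	mask |= mask >> 4;
	mask |= mask >> 8;
	mask |= mask >> 16;
	mask |= mask >> 32;
	return mask;
}

/* Addresses usable for DMA are limited by both the device and the bus. */
static uint64_t usable_mask(uint64_t device_mask, uint64_t bus_mask)
{
	return device_mask & bus_mask;
}

/* True only if the device/bus combination can address all of RAM. */
static bool can_reach_all_ram(uint64_t device_mask, uint64_t bus_mask,
			      uint64_t highest_ram_addr)
{
	uint64_t need = required_mask(highest_ram_addr);

	return (usable_mask(device_mask, bus_mask) & need) == need;
}
```

With RAM ending at 0x23FFFFFFFF, a 64-bit capable device behind a bus limited
to 32 bits still fails the check, which is the "not always a solution" case.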

> > In a lot of cases, we want to turn it off to get better performance
> > when the driver has set a DMA mask that covers all of RAM, but you
> > also want to enable the IOMMU for debugging purposes or for device
> > assignment if you run virtual machines. The bootloader doesn't know how
> > the device is going to be used, so it cannot define the policy here.
> 
> I think we'll end up adding a virtualization option to the UEFI BIOS 
> similar to how Intel platforms work. Based on this switch, we'll end up 
> patching the ACPI table.
> 
> If I remove the IORT entry, then the device is in coherent mode with 
> device accessing the full RAM range.
> 
> If I have the IORT table, the device is in IOMMU translation mode.
> 
> Details are in the IORT spec.

I think that would suck a lot more than being slightly out of spec
regarding SBSA if you make the low PCI addresses map to the start
of RAM. Asking users to select a 'virtualization' option based on
what kind of PCI device and kernel version they have is a major
hassle.

	Arnd
