linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
@ 2019-06-11  9:21 Oded Gabbay
  2019-06-11  9:58 ` Greg KH
  0 siblings, 1 reply; 22+ messages in thread
From: Oded Gabbay @ 2019-06-11  9:21 UTC (permalink / raw)
  To: linux-kernel, gregkh

This patch enables support in the driver for 64-bit DMA mask when running
in a POWER9 machine.

POWER9 supports either 32-bit or 64-bit DMA mask. However, our ASICs
support 48-bit DMA mask. To support 64-bit, the driver needs to add a
special configuration to the ASIC's PCIe controller.

The activation of this special configuration is done in case the PCI
parent device of Goya is a PHB4 PCI device of IBM.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
---
Changes in v2:
- Remove kernel module parameter and instead read the PCI device ID of the
  parent PCI bus. If it is PHB4, do the special configuration.
 
 drivers/misc/habanalabs/goya/goya.c  |  6 +++++-
 drivers/misc/habanalabs/habanalabs.h |  4 ++++
 drivers/misc/habanalabs/pci.c        | 23 ++++++++++++++++++++++-
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index e8b3a31d211f..71dc2341dd8c 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -472,7 +472,11 @@ static int goya_early_init(struct hl_device *hdev)
 
 	prop->dram_pci_bar_size = pci_resource_len(pdev, DDR_BAR_ID);
 
-	rc = hl_pci_init(hdev, 48);
+	if (hl_pci_parent_is_phb4(hdev))
+		rc = hl_pci_init(hdev, 64);
+	else
+		rc = hl_pci_init(hdev, 48);
+
 	if (rc)
 		return rc;
 
diff --git a/drivers/misc/habanalabs/habanalabs.h b/drivers/misc/habanalabs/habanalabs.h
index 5e4a631b3d88..3ba94188884f 100644
--- a/drivers/misc/habanalabs/habanalabs.h
+++ b/drivers/misc/habanalabs/habanalabs.h
@@ -1208,6 +1208,8 @@ struct hl_device_reset_work {
  * @dma_mask: the dma mask that was set for this device
  * @in_debug: is device under debug. This, together with fd_open_cnt, enforces
  *            that only a single user is configuring the debug infrastructure.
+ * @power9_64bit_dma_enable: true to enable 64-bit DMA mask support. Relevant
+ *                           only to POWER9 machines.
  */
 struct hl_device {
 	struct pci_dev			*pdev;
@@ -1281,6 +1283,7 @@ struct hl_device {
 	u8				device_cpu_disabled;
 	u8				dma_mask;
 	u8				in_debug;
+	u8				power9_64bit_dma_enable;
 
 	/* Parameters for bring-up */
 	u8				mmu_enable;
@@ -1504,6 +1507,7 @@ int hl_pci_init_iatu(struct hl_device *hdev, u64 sram_base_address,
 int hl_pci_init(struct hl_device *hdev, u8 dma_mask);
 void hl_pci_fini(struct hl_device *hdev);
 int hl_pci_set_dma_mask(struct hl_device *hdev, u8 dma_mask);
+bool hl_pci_parent_is_phb4(struct hl_device *hdev);
 
 long hl_get_frequency(struct hl_device *hdev, u32 pll_index, bool curr);
 void hl_set_frequency(struct hl_device *hdev, u32 pll_index, u64 freq);
diff --git a/drivers/misc/habanalabs/pci.c b/drivers/misc/habanalabs/pci.c
index c98d88c7a5c6..dcb737f9677c 100644
--- a/drivers/misc/habanalabs/pci.c
+++ b/drivers/misc/habanalabs/pci.c
@@ -10,6 +10,8 @@
 
 #include <linux/pci.h>
 
+#define PCI_DEVICE_ID_IBM_PHB4		0x04C1
+
 #define HL_PLDM_PCI_ELBI_TIMEOUT_MSEC	(HL_PCI_ELBI_TIMEOUT_MSEC * 10)
 
 /**
@@ -182,6 +184,20 @@ static void hl_pci_reset_link_through_bridge(struct hl_device *hdev)
 	ssleep(3);
 }
 
+bool hl_pci_parent_is_phb4(struct hl_device *hdev)
+{
+	struct pci_dev *parent_port = hdev->pdev->bus->self;
+
+	if ((parent_port->vendor == PCI_VENDOR_ID_IBM) &&
+			(parent_port->device == PCI_DEVICE_ID_IBM_PHB4)) {
+		hdev->power9_64bit_dma_enable = 1;
+		return true;
+	}
+
+	hdev->power9_64bit_dma_enable = 0;
+	return false;
+}
+
 /**
  * hl_pci_set_dram_bar_base() - Set DDR BAR to map specific device address.
  * @hdev: Pointer to hl_device structure.
@@ -283,7 +299,12 @@ int hl_pci_init_iatu(struct hl_device *hdev, u64 sram_base_address,
 				upper_32_bits(host_phys_base_address));
 	rc |= hl_pci_iatu_write(hdev, 0x010, lower_32_bits(host_phys_end_addr));
 	rc |= hl_pci_iatu_write(hdev, 0x014, 0);
-	rc |= hl_pci_iatu_write(hdev, 0x018, 0);
+
+	if ((hdev->power9_64bit_dma_enable) && (hdev->dma_mask == 64))
+		rc |= hl_pci_iatu_write(hdev, 0x018, 0x08000000);
+	else
+		rc |= hl_pci_iatu_write(hdev, 0x018, 0);
+
 	rc |= hl_pci_iatu_write(hdev, 0x020, upper_32_bits(host_phys_end_addr));
 	/* Increase region size */
 	rc |= hl_pci_iatu_write(hdev, 0x000, 0x00002000);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11  9:21 [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9 Oded Gabbay
@ 2019-06-11  9:58 ` Greg KH
  2019-06-11 11:47   ` Oded Gabbay
  2019-06-11 15:17   ` Christoph Hellwig
  0 siblings, 2 replies; 22+ messages in thread
From: Greg KH @ 2019-06-11  9:58 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: linux-kernel

On Tue, Jun 11, 2019 at 12:21:44PM +0300, Oded Gabbay wrote:
> +bool hl_pci_parent_is_phb4(struct hl_device *hdev)
> +{
> +	struct pci_dev *parent_port = hdev->pdev->bus->self;
> +
> +	if ((parent_port->vendor == PCI_VENDOR_ID_IBM) &&
> +			(parent_port->device == PCI_DEVICE_ID_IBM_PHB4)) {
> +		hdev->power9_64bit_dma_enable = 1;
> +		return true;
> +	}
> +
> +	hdev->power9_64bit_dma_enable = 0;
> +	return false;
> +}

That feels like a big hack.  ppc doesn't have any "what arch am I
running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
ok to take this for now, but odds are you need a better fix for this
sometime...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11  9:58 ` Greg KH
@ 2019-06-11 11:47   ` Oded Gabbay
  2019-06-11 15:17   ` Christoph Hellwig
  1 sibling, 0 replies; 22+ messages in thread
From: Oded Gabbay @ 2019-06-11 11:47 UTC (permalink / raw)
  To: Greg KH; +Cc: Linux-Kernel@Vger. Kernel. Org

On Tue, Jun 11, 2019 at 12:59 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jun 11, 2019 at 12:21:44PM +0300, Oded Gabbay wrote:
> > +bool hl_pci_parent_is_phb4(struct hl_device *hdev)
> > +{
> > +     struct pci_dev *parent_port = hdev->pdev->bus->self;
> > +
> > +     if ((parent_port->vendor == PCI_VENDOR_ID_IBM) &&
> > +                     (parent_port->device == PCI_DEVICE_ID_IBM_PHB4)) {
> > +             hdev->power9_64bit_dma_enable = 1;
> > +             return true;
> > +     }
> > +
> > +     hdev->power9_64bit_dma_enable = 0;
> > +     return false;
> > +}
>
> That feels like a big hack.
I agree but so far I found no other way.

>> ppc doesn't have any "what arch am I
> running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> ok to take this for now, but odds are you need a better fix for this
> sometime...
I talked to a couple of people and they didn't know about such a thing.
I'll also dug in the code and didn't find anything.
I'll post a "formal" question to the ppc64 mailing list and dig again
and I will update if I find anything.

Thanks,
Oded

>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11  9:58 ` Greg KH
  2019-06-11 11:47   ` Oded Gabbay
@ 2019-06-11 15:17   ` Christoph Hellwig
  2019-06-11 15:26     ` Greg KH
  1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2019-06-11 15:17 UTC (permalink / raw)
  To: Greg KH; +Cc: Oded Gabbay, linux-kernel

On Tue, Jun 11, 2019 at 11:58:57AM +0200, Greg KH wrote:
> That feels like a big hack.  ppc doesn't have any "what arch am I
> running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> ok to take this for now, but odds are you need a better fix for this
> sometime...

That isn't the worst part of it.  The whole idea of checking what I'm
running to set a dma mask just doesn't make any sense at all.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 15:17   ` Christoph Hellwig
@ 2019-06-11 15:26     ` Greg KH
  2019-06-11 17:03       ` Oded Gabbay
  2019-06-15 12:12       ` Oded Gabbay
  0 siblings, 2 replies; 22+ messages in thread
From: Greg KH @ 2019-06-11 15:26 UTC (permalink / raw)
  To: Christoph Hellwig, Oded Gabbay; +Cc: linux-kernel

On Tue, Jun 11, 2019 at 08:17:53AM -0700, Christoph Hellwig wrote:
> On Tue, Jun 11, 2019 at 11:58:57AM +0200, Greg KH wrote:
> > That feels like a big hack.  ppc doesn't have any "what arch am I
> > running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> > ok to take this for now, but odds are you need a better fix for this
> > sometime...
> 
> That isn't the worst part of it.  The whole idea of checking what I'm
> running to set a dma mask just doesn't make any sense at all.

Oded, I thought I asked if there was a dma call you should be making to
keep this type of check from being needed.  What happened to that?  As
Christoph points out, none of this should be needed, which is what I
thought I originally said :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 15:26     ` Greg KH
@ 2019-06-11 17:03       ` Oded Gabbay
  2019-06-11 17:22         ` Oded Gabbay
  2019-06-15 12:12       ` Oded Gabbay
  1 sibling, 1 reply; 22+ messages in thread
From: Oded Gabbay @ 2019-06-11 17:03 UTC (permalink / raw)
  To: Greg KH; +Cc: Christoph Hellwig, Linux-Kernel@Vger. Kernel. Org

On Tue, Jun 11, 2019 at 6:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jun 11, 2019 at 08:17:53AM -0700, Christoph Hellwig wrote:
> > On Tue, Jun 11, 2019 at 11:58:57AM +0200, Greg KH wrote:
> > > That feels like a big hack.  ppc doesn't have any "what arch am I
> > > running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> > > ok to take this for now, but odds are you need a better fix for this
> > > sometime...
> >
> > That isn't the worst part of it.  The whole idea of checking what I'm
> > running to set a dma mask just doesn't make any sense at all.
>
> Oded, I thought I asked if there was a dma call you should be making to
> keep this type of check from being needed.  What happened to that?  As
> Christoph points out, none of this should be needed, which is what I
> thought I originally said :)
>
> thanks,
>
> greg k-h

I'm sorry, but it seems I can't explain what's my problem because you
and Christoph keep mentioning the pci_set_dma_mask() but it doesn't
help me.
I'll try again to explain.

The main problem specifically for Goya device, is that I can't call
this function with *the same parameter* for POWER9 and x86-64, because
x86-64 supports dma mask of 48-bits while POWER9 supports only 32-bits
or 64-bits.

The main limitation in my Goya device is that it can generate PCI
outbound transactions with addresses from 0 to (2^50 - 1).
That's why when we first integrated it in x86-64, we used a DMA mask
of 48-bits, by calling pci_set_dma_mask(pdev, 48). That way, the
kernel ensures me that all the DMA addresses are from 0 to (2^48 - 1),
and that address range is accessible by my device.

If for some reason, the x86-64 machine doesn't support 48-bits, the
standard fallback code in ALL the drivers I have seen is to set the
DMA mask to 32-bits. And that's how my current driver's code is
written.

Now, when I tried to integrate Goya into a POWER9 machine, I got a
reject from the call to pci_set_dma_mask(pdev, 48). The standard code,
as I wrote above, is to call the same function with 32-bits. That
works BUT it is not practical, as our applications require much more
memory mapped then 32-bits. In addition, once you add more cards which
are all mapped to the same range, it is simply not usable at all.

Therefore, I consulted with POWER people and they told me I can call
to pci_set_dma_mask with the mask as 64, but I must make sure that ALL
outbound transactions from Goya will be with bit 59 set in the
address.
I can achieve that with a dedicated configuration I make in Goya's
PCIe controller. That's what I did and that works.

So, to summarize:
If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
in runtime, I don't know if its POWER9 or not, so upon failure I will
call it again with 32, which makes our device pretty much unusable.
If I call pci_set_dma_mask with 64, and do the dedicated configuration
in Goya's PCIe controller, then it won't work on x86-64, because bit
59 will be set and the host won't like it (I checked it). In addition,
I might get addresses above 50 bits, which my device can't generate.

I hope this makes things more clear. Now, please explain to me how I
can call pci_set_dma_mask without any regard to whether I run on
x86-64 or POWER9, considering what I wrote above ?

Thanks,
Oded

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 17:03       ` Oded Gabbay
@ 2019-06-11 17:22         ` Oded Gabbay
  2019-06-11 22:53           ` Benjamin Herrenschmidt
  2019-06-12  6:35           ` Oliver O'Halloran
  0 siblings, 2 replies; 22+ messages in thread
From: Oded Gabbay @ 2019-06-11 17:22 UTC (permalink / raw)
  To: Greg KH, linuxppc-dev, Christoph Hellwig; +Cc: Linux-Kernel@Vger. Kernel. Org

On Tue, Jun 11, 2019 at 8:03 PM Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> On Tue, Jun 11, 2019 at 6:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Jun 11, 2019 at 08:17:53AM -0700, Christoph Hellwig wrote:
> > > On Tue, Jun 11, 2019 at 11:58:57AM +0200, Greg KH wrote:
> > > > That feels like a big hack.  ppc doesn't have any "what arch am I
> > > > running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> > > > ok to take this for now, but odds are you need a better fix for this
> > > > sometime...
> > >
> > > That isn't the worst part of it.  The whole idea of checking what I'm
> > > running to set a dma mask just doesn't make any sense at all.
> >
> > Oded, I thought I asked if there was a dma call you should be making to
> > keep this type of check from being needed.  What happened to that?  As
> > Christoph points out, none of this should be needed, which is what I
> > thought I originally said :)
> >
> > thanks,
> >
> > greg k-h
>
> I'm sorry, but it seems I can't explain what's my problem because you
> and Christoph keep mentioning the pci_set_dma_mask() but it doesn't
> help me.
> I'll try again to explain.
>
> The main problem specifically for Goya device, is that I can't call
> this function with *the same parameter* for POWER9 and x86-64, because
> x86-64 supports dma mask of 48-bits while POWER9 supports only 32-bits
> or 64-bits.
>
> The main limitation in my Goya device is that it can generate PCI
> outbound transactions with addresses from 0 to (2^50 - 1).
> That's why when we first integrated it in x86-64, we used a DMA mask
> of 48-bits, by calling pci_set_dma_mask(pdev, 48). That way, the
> kernel ensures me that all the DMA addresses are from 0 to (2^48 - 1),
> and that address range is accessible by my device.
>
> If for some reason, the x86-64 machine doesn't support 48-bits, the
> standard fallback code in ALL the drivers I have seen is to set the
> DMA mask to 32-bits. And that's how my current driver's code is
> written.
>
> Now, when I tried to integrate Goya into a POWER9 machine, I got a
> reject from the call to pci_set_dma_mask(pdev, 48). The standard code,
> as I wrote above, is to call the same function with 32-bits. That
> works BUT it is not practical, as our applications require much more
> memory mapped then 32-bits. In addition, once you add more cards which
> are all mapped to the same range, it is simply not usable at all.
>
> Therefore, I consulted with POWER people and they told me I can call
> to pci_set_dma_mask with the mask as 64, but I must make sure that ALL
> outbound transactions from Goya will be with bit 59 set in the
> address.
> I can achieve that with a dedicated configuration I make in Goya's
> PCIe controller. That's what I did and that works.
>
> So, to summarize:
> If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> in runtime, I don't know if its POWER9 or not, so upon failure I will
> call it again with 32, which makes our device pretty much unusable.
> If I call pci_set_dma_mask with 64, and do the dedicated configuration
> in Goya's PCIe controller, then it won't work on x86-64, because bit
> 59 will be set and the host won't like it (I checked it). In addition,
> I might get addresses above 50 bits, which my device can't generate.
>
> I hope this makes things more clear. Now, please explain to me how I
> can call pci_set_dma_mask without any regard to whether I run on
> x86-64 or POWER9, considering what I wrote above ?
>
> Thanks,
> Oded

Adding ppc mailing list.

Oded

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 17:22         ` Oded Gabbay
@ 2019-06-11 22:53           ` Benjamin Herrenschmidt
  2019-06-12  5:45             ` Oliver O'Halloran
  2019-06-12  6:25             ` Oded Gabbay
  2019-06-12  6:35           ` Oliver O'Halloran
  1 sibling, 2 replies; 22+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-11 22:53 UTC (permalink / raw)
  To: Oded Gabbay, Greg KH, linuxppc-dev, Christoph Hellwig
  Cc: Linux-Kernel@Vger. Kernel. Org, Oliver OHalloran, Russell Currey

On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> 
> > So, to summarize:
> > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > in runtime, I don't know if its POWER9 or not, so upon failure I will
> > call it again with 32, which makes our device pretty much unusable.
> > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > 59 will be set and the host won't like it (I checked it). In addition,
> > I might get addresses above 50 bits, which my device can't generate.
> > 
> > I hope this makes things more clear. Now, please explain to me how I
> > can call pci_set_dma_mask without any regard to whether I run on
> > x86-64 or POWER9, considering what I wrote above ?
> > 
> > Thanks,
> > Oded
> 
> Adding ppc mailing list.

You can't. Your device is broken. Devices that don't support DMAing to
the full 64-bit deserve to be added to the trash pile.

As a result, getting it to work will require hacks. Some GPUs have
similar issues and require similar hacks, it's unfortunate.

Added a couple of guys on CC who might be able to help get those hacks
right.

It's still very fishy .. the idea is to detect the case where setting a
64-bit mask will give your system memory mapped at a fixed high address
(1 << 59 in our case) and program that in your chip in the "Fixed high
bits" register that you seem to have (also make sure it doesn't affect
MSIs or it will break them).

This will only work as long as all of the system memory can be
addressed at an offset from that fixed address that itself fits your
device addressing capabilities (50 bits in this case). It may or may
not be the case but there's no way to check since the DMA mask logic
won't really apply.

You might want to consider fixing your HW in the next iteration... This
is going to bite you when x86 increases the max physical memory for
example, or on other architectures.

Cheers,
Ben.





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 22:53           ` Benjamin Herrenschmidt
@ 2019-06-12  5:45             ` Oliver O'Halloran
  2019-06-12  8:17               ` Benjamin Herrenschmidt
  2019-06-12  6:25             ` Oded Gabbay
  1 sibling, 1 reply; 22+ messages in thread
From: Oliver O'Halloran @ 2019-06-12  5:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Oded Gabbay, Greg KH, linuxppc-dev, Christoph Hellwig,
	Russell Currey, Linux-Kernel@Vger. Kernel. Org

On Wed, Jun 12, 2019 at 8:54 AM Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> >
> > > So, to summarize:
> > > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > > in runtime, I don't know if its POWER9 or not, so upon failure I will
> > > call it again with 32, which makes our device pretty much unusable.
> > > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > > 59 will be set and the host won't like it (I checked it). In addition,
> > > I might get addresses above 50 bits, which my device can't generate.
> > >
> > > I hope this makes things more clear. Now, please explain to me how I
> > > can call pci_set_dma_mask without any regard to whether I run on
> > > x86-64 or POWER9, considering what I wrote above ?
> > >
> > > Thanks,
> > > Oded
> >
> > Adding ppc mailing list.
>
> You can't. Your device is broken. Devices that don't support DMAing to
> the full 64-bit deserve to be added to the trash pile.
>
> As a result, getting it to work will require hacks. Some GPUs have
> similar issues and require similar hacks, it's unfortunate.
>
> Added a couple of guys on CC who might be able to help get those hacks
> right.

> It's still very fishy .. the idea is to detect the case where setting a
> 64-bit mask will give your system memory mapped at a fixed high address
> (1 << 59 in our case) and program that in your chip in the "Fixed high
> bits" register that you seem to have (also make sure it doesn't affect
> MSIs or it will break them).

Judging from the patch (https://lkml.org/lkml/2019/6/11/59) this is
what they're doing.

Also, are you sure about the MSI thing? The IODA3 spec says the only
important bits for a 64bit MSI are bits 61:60 (to hit the window) and
the lower bits that determine what IVE to use. Everything in between
is ignored so ORing in bit 59 shouldn't break anything.

> This will only work as long as all of the system memory can be
> addressed at an offset from that fixed address that itself fits your
> device addressing capabilities (50 bits in this case). It may or may
> not be the case but there's no way to check since the DMA mask logic
> won't really apply.
>
> You might want to consider fixing your HW in the next iteration... This
> is going to bite you when x86 increases the max physical memory for
> example, or on other architectures.

Yes, do this. The easiest way to avoid this sort of wierd hack is to
just design the PCIe interface to the spec in the first place.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 22:53           ` Benjamin Herrenschmidt
  2019-06-12  5:45             ` Oliver O'Halloran
@ 2019-06-12  6:25             ` Oded Gabbay
  2019-06-12  8:18               ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 22+ messages in thread
From: Oded Gabbay @ 2019-06-12  6:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Greg KH, linuxppc-dev, Christoph Hellwig,
	Linux-Kernel@Vger. Kernel. Org, Oliver OHalloran, Russell Currey

On Wed, Jun 12, 2019 at 1:53 AM Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> >
> > > So, to summarize:
> > > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > > in runtime, I don't know if its POWER9 or not, so upon failure I will
> > > call it again with 32, which makes our device pretty much unusable.
> > > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > > 59 will be set and the host won't like it (I checked it). In addition,
> > > I might get addresses above 50 bits, which my device can't generate.
> > >
> > > I hope this makes things more clear. Now, please explain to me how I
> > > can call pci_set_dma_mask without any regard to whether I run on
> > > x86-64 or POWER9, considering what I wrote above ?
> > >
> > > Thanks,
> > > Oded
> >
> > Adding ppc mailing list.
>
> You can't. Your device is broken. Devices that don't support DMAing to
> the full 64-bit deserve to be added to the trash pile.
>
Hmm... right know they are added to customers data-centers but what do I know ;)

> As a result, getting it to work will require hacks. Some GPUs have
> similar issues and require similar hacks, it's unfortunate.
>
> Added a couple of guys on CC who might be able to help get those hacks
> right.
Thanks :)
>
> It's still very fishy .. the idea is to detect the case where setting a
> 64-bit mask will give your system memory mapped at a fixed high address
> (1 << 59 in our case) and program that in your chip in the "Fixed high
> bits" register that you seem to have (also make sure it doesn't affect
> MSIs or it will break them).
MSI-X are working. The set of bit 59 doesn't apply to MSI-X
transactions (AFAICS from the PCIe controller spec we have).
>
> This will only work as long as all of the system memory can be
> addressed at an offset from that fixed address that itself fits your
> device addressing capabilities (50 bits in this case). It may or may
> not be the case but there's no way to check since the DMA mask logic
> won't really apply.
Understood. In the specific system we are integrated to, that is the
case - we have less then 48 bits. But, as you pointed out, it is not a
generic solution but with my H/W I can't give a generic fit-all
solution for POWER9. I'll settle for the best that I can do.

>
> You might want to consider fixing your HW in the next iteration... This
> is going to bite you when x86 increases the max physical memory for
> example, or on other architectures.
Understood and taken care of.

>
> Cheers,
> Ben.
>
>
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 17:22         ` Oded Gabbay
  2019-06-11 22:53           ` Benjamin Herrenschmidt
@ 2019-06-12  6:35           ` Oliver O'Halloran
  2019-06-12  6:53             ` Christoph Hellwig
  1 sibling, 1 reply; 22+ messages in thread
From: Oliver O'Halloran @ 2019-06-12  6:35 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: Greg KH, linuxppc-dev, Christoph Hellwig, Linux-Kernel@Vger. Kernel. Org

On Wed, Jun 12, 2019 at 3:25 AM Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> On Tue, Jun 11, 2019 at 8:03 PM Oded Gabbay <oded.gabbay@gmail.com> wrote:
> >
> > On Tue, Jun 11, 2019 at 6:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > *snip*
> >
> > Now, when I tried to integrate Goya into a POWER9 machine, I got a
> > reject from the call to pci_set_dma_mask(pdev, 48). The standard code,
> > as I wrote above, is to call the same function with 32-bits. That
> > works BUT it is not practical, as our applications require much more
> > memory mapped then 32-bits.

Setting a 48 bit DMA mask doesn't work today because we only allocate
IOMMU tables to cover the 0..2GB range of PCI bus addresses. Alexey
has some patches to expand that range so we can support devices that
can't hit the 64 bit bypass window. You need:

This fix: http://patchwork.ozlabs.org/patch/1113506/
This series: http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=110810

Give that a try and see if the IOMMU overhead is tolerable.

> >In addition, once you add more cards which
> > are all mapped to the same range, it is simply not usable at all.

Each IOMMU group should have a separate bus address space and seperate
cards shouldn't be in the same IOMMU group. If they are then there's
something up.

Oliver

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  6:35           ` Oliver O'Halloran
@ 2019-06-12  6:53             ` Christoph Hellwig
  2019-06-12 11:48               ` Oliver O'Halloran
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2019-06-12  6:53 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Oded Gabbay, Greg KH, linuxppc-dev, Christoph Hellwig,
	Linux-Kernel@Vger. Kernel. Org

On Wed, Jun 12, 2019 at 04:35:22PM +1000, Oliver O'Halloran wrote:
> Setting a 48 bit DMA mask doesn't work today because we only allocate
> IOMMU tables to cover the 0..2GB range of PCI bus addresses.

I don't think that is true upstream, and if it is we need to fix bug
in the powerpc code.  powerpc should be falling back treating a 48-bit
dma mask like a 32-bit one at least, that is use dynamic iommu mappings
instead of using the direct mapping.  And from my reding of 
arch/powerpc/kernel/dma-iommu.c that is exactly what it does.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  5:45             ` Oliver O'Halloran
@ 2019-06-12  8:17               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 22+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-12  8:17 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Oded Gabbay, Greg KH, linuxppc-dev, Christoph Hellwig,
	Russell Currey, Linux-Kernel@Vger. Kernel. Org

On Wed, 2019-06-12 at 15:45 +1000, Oliver O'Halloran wrote:
> 
> Also, are you sure about the MSI thing? The IODA3 spec says the only
> important bits for a 64bit MSI are bits 61:60 (to hit the window) and
> the lower bits that determine what IVE to use. Everything in between
> is ignored so ORing in bit 59 shouldn't break anything.

On IODA3... could be different on another system. My point is you can't
just have a fixed setting for all top bits for DMA & MSIs.

> > This will only work as long as all of the system memory can be
> > addressed at an offset from that fixed address that itself fits your
> > device addressing capabilities (50 bits in this case). It may or may
> > not be the case but there's no way to check since the DMA mask logic
> > won't really apply.
> > 
> > You might want to consider fixing your HW in the next iteration... This
> > is going to bite you when x86 increases the max physical memory for
> > example, or on other architectures.
> 
> Yes, do this. The easiest way to avoid this sort of wierd hack is to
> just design the PCIe interface to the spec in the first place.

Ben.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  6:25             ` Oded Gabbay
@ 2019-06-12  8:18               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 22+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-12  8:18 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: Greg KH, linuxppc-dev, Christoph Hellwig,
	Linux-Kernel@Vger. Kernel. Org, Oliver OHalloran, Russell Currey

On Wed, 2019-06-12 at 09:25 +0300, Oded Gabbay wrote:
> 
> > You can't. Your device is broken. Devices that don't support DMAing to
> > the full 64-bit deserve to be added to the trash pile.
> > 
> 
> Hmm... right know they are added to customers data-centers but what do I know ;)

Well, some customers don't know they are being sold a lemon :)

> > As a result, getting it to work will require hacks. Some GPUs have
> > similar issues and require similar hacks, it's unfortunate.
> > 
> > Added a couple of guys on CC who might be able to help get those hacks
> > right.
> 
> Thanks :)
> > 
> > It's still very fishy .. the idea is to detect the case where setting a
> > 64-bit mask will give your system memory mapped at a fixed high address
> > (1 << 59 in our case) and program that in your chip in the "Fixed high
> > bits" register that you seem to have (also make sure it doesn't affect
> > MSIs or it will break them).
> 
> MSI-X are working. The set of bit 59 doesn't apply to MSI-X
> transactions (AFAICS from the PCIe controller spec we have).

Ok.

> > This will only work as long as all of the system memory can be
> > addressed at an offset from that fixed address that itself fits your
> > device addressing capabilities (50 bits in this case). It may or may
> > not be the case but there's no way to check since the DMA mask logic
> > won't really apply.
> 
> Understood. In the specific system we are integrated to, that is the
> case - we have less then 48 bits. But, as you pointed out, it is not a
> generic solution but with my H/W I can't give a generic fit-all
> solution for POWER9. I'll settle for the best that I can do.
> 
> > 
> > You might want to consider fixing your HW in the next iteration... This
> > is going to bite you when x86 increases the max physical memory for
> > example, or on other architectures.
> 
> Understood and taken care of.

Cheers,
Ben.

> > 
> > Cheers,
> > Ben.
> > 
> > 
> > 
> > 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  6:53             ` Christoph Hellwig
@ 2019-06-12 11:48               ` Oliver O'Halloran
  0 siblings, 0 replies; 22+ messages in thread
From: Oliver O'Halloran @ 2019-06-12 11:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Oded Gabbay, Greg KH, linuxppc-dev,
	Linux-Kernel@Vger. Kernel. Org, Alexey Kardashevskiy

On Wed, Jun 12, 2019 at 4:53 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Jun 12, 2019 at 04:35:22PM +1000, Oliver O'Halloran wrote:
> > Setting a 48 bit DMA mask doesn't work today because we only allocate
> > IOMMU tables to cover the 0..2GB range of PCI bus addresses.
>
> I don't think that is true upstream, and if it is we need to fix bug
> in the powerpc code.  powerpc should be falling back treating a 48-bit
> dma mask like a 32-bit one at least, that is use dynamic iommu mappings
> instead of using the direct mapping.  And from my reding of
> arch/powerpc/kernel/dma-iommu.c that is exactly what it does.

This is more or less what Alexey's patches fix. The IOMMU table
allocated for the 32bit DMA window is only sized for 2GB in the
platform code, see pnv_pci_ioda2_setup_default_config().

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 15:26     ` Greg KH
  2019-06-11 17:03       ` Oded Gabbay
@ 2019-06-15 12:12       ` Oded Gabbay
  2019-06-16  9:55         ` Christoph Hellwig
  1 sibling, 1 reply; 22+ messages in thread
From: Oded Gabbay @ 2019-06-15 12:12 UTC (permalink / raw)
  To: Greg KH; +Cc: Linux-Kernel@Vger. Kernel. Org

On Tue, Jun 11, 2019 at 6:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jun 11, 2019 at 08:17:53AM -0700, Christoph Hellwig wrote:
> > On Tue, Jun 11, 2019 at 11:58:57AM +0200, Greg KH wrote:
> > > That feels like a big hack.  ppc doesn't have any "what arch am I
> > > running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> > > ok to take this for now, but odds are you need a better fix for this
> > > sometime...
> >
> > That isn't the worst part of it.  The whole idea of checking what I'm
> > running to set a dma mask just doesn't make any sense at all.
>
> Oded, I thought I asked if there was a dma call you should be making to
> keep this type of check from being needed.  What happened to that?  As
> Christoph points out, none of this should be needed, which is what I
> thought I originally said :)
>
> thanks,
>
> greg k-h

Hi Greg,
So after the dust has settled a bit, do you think it is reasonable to
add this patch upstream ?

As Benjamin and Oliver mentioned, there is no better-looking/standard
way to solve this, considering my device's limitations.
AFAICS, the only way is either this hack, or the kernel module parameter method.

I'll of course monitor the PPC code upstream and if they will manage
to push a fix to their current DMA mask limitation (that will allow
setting dma mask of 48 bits and without setting bit 59 in outbound
transactions), I will modify my code accordingly and then this hack
won't be necessary. But for now, it is what it is.

What do you think ?

Oded

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-15 12:12       ` Oded Gabbay
@ 2019-06-16  9:55         ` Christoph Hellwig
  2019-06-16 11:24           ` Oded Gabbay
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2019-06-16  9:55 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: Greg KH, Linux-Kernel@Vger. Kernel. Org

On Sat, Jun 15, 2019 at 03:12:36PM +0300, Oded Gabbay wrote:
> So after the dust has settled a bit, do you think it is reasonable to
> add this patch upstream ?

I'm not Greg, but the answer is a very clear no.  drivers have abslutely
no business adding these hacks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-16  9:55         ` Christoph Hellwig
@ 2019-06-16 11:24           ` Oded Gabbay
  2019-06-17  8:19             ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Oded Gabbay @ 2019-06-16 11:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Greg KH, Linux-Kernel@Vger. Kernel. Org

On Sun, Jun 16, 2019 at 12:55 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Sat, Jun 15, 2019 at 03:12:36PM +0300, Oded Gabbay wrote:
> > So after the dust has settled a bit, do you think it is reasonable to
> > add this patch upstream ?
>
> I'm not Greg, but the answer is a very clear no.  drivers have abslutely
> no business adding these hacks.

So the alternative is that my device won't work on POWER9. Does that make sense?
What is the reason for this logic?
I'm not adding code that will be used by other drivers/users.
I'm just doing a special configuration to my device's H/W and I
condition it upon the PCI device ID of my parent PCI device.
What is the harm in that?

Thanks,
Oded

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-16 11:24           ` Oded Gabbay
@ 2019-06-17  8:19             ` Christoph Hellwig
  2019-06-17  8:35               ` Oded Gabbay
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2019-06-17  8:19 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: Christoph Hellwig, Greg KH, Linux-Kernel@Vger. Kernel. Org

On Sun, Jun 16, 2019 at 02:24:08PM +0300, Oded Gabbay wrote:
> So the alternative is that my device won't work on POWER9.

The alternative is that we fix the powerpc code to do the right
thing, which already is in progress.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-17  8:19             ` Christoph Hellwig
@ 2019-06-17  8:35               ` Oded Gabbay
  2019-06-24  6:53                 ` Oded Gabbay
  0 siblings, 1 reply; 22+ messages in thread
From: Oded Gabbay @ 2019-06-17  8:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Greg KH, Linux-Kernel@Vger. Kernel. Org

On Mon, Jun 17, 2019 at 11:19 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Sun, Jun 16, 2019 at 02:24:08PM +0300, Oded Gabbay wrote:
> > So the alternative is that my device won't work on POWER9.
>
> The alternative is that we fix the powerpc code to do the right
> thing, which already is in progress.
Great, I agree this is the correct approach, and that's why I wrote in
my earlier email:

"I'll of course monitor the PPC code upstream and if they will manage
to push a fix to their current DMA mask limitation (that will allow
setting dma mask of 48 bits and without setting bit 59 in outbound
transactions), I will modify my code accordingly and then this hack
won't be necessary. But for now, it is what it is."

So I don't get it why you object for this temporary fix in my driver.

Thanks,
Oded

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-17  8:35               ` Oded Gabbay
@ 2019-06-24  6:53                 ` Oded Gabbay
  2019-06-24  6:58                   ` Christoph Hellwig
  0 siblings, 1 reply; 22+ messages in thread
From: Oded Gabbay @ 2019-06-24  6:53 UTC (permalink / raw)
  To: Christoph Hellwig, Greg KH; +Cc: Linux-Kernel@Vger. Kernel. Org

On Mon, Jun 17, 2019 at 11:35 AM Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> On Mon, Jun 17, 2019 at 11:19 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Sun, Jun 16, 2019 at 02:24:08PM +0300, Oded Gabbay wrote:
> > > So the alternative is that my device won't work on POWER9.
> >
> > The alternative is that we fix the powerpc code to do the right
> > thing, which already is in progress.
> Great, I agree this is the correct approach, and that's why I wrote in
> my earlier email:
>
> "I'll of course monitor the PPC code upstream and if they will manage
> to push a fix to their current DMA mask limitation (that will allow
> setting dma mask of 48 bits and without setting bit 59 in outbound
> transactions), I will modify my code accordingly and then this hack
> won't be necessary. But for now, it is what it is."
>
> So I don't get it why you object for this temporary fix in my driver.
>
> Thanks,
> Oded

Hi,
I didn't get any reply to my last email.
Can we continue this discussion ?

Thanks,
Oded

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-24  6:53                 ` Oded Gabbay
@ 2019-06-24  6:58                   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2019-06-24  6:58 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: Christoph Hellwig, Greg KH, Linux-Kernel@Vger. Kernel. Org

Because as a matter of policy the driver has no business knowing the
actual bridge.  Even if we'd agree that a driver workaround would be
the right thing it has to be discoverable by an actual interface and
not a system type or root port PCI ID.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-06-24  6:58 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-11  9:21 [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9 Oded Gabbay
2019-06-11  9:58 ` Greg KH
2019-06-11 11:47   ` Oded Gabbay
2019-06-11 15:17   ` Christoph Hellwig
2019-06-11 15:26     ` Greg KH
2019-06-11 17:03       ` Oded Gabbay
2019-06-11 17:22         ` Oded Gabbay
2019-06-11 22:53           ` Benjamin Herrenschmidt
2019-06-12  5:45             ` Oliver O'Halloran
2019-06-12  8:17               ` Benjamin Herrenschmidt
2019-06-12  6:25             ` Oded Gabbay
2019-06-12  8:18               ` Benjamin Herrenschmidt
2019-06-12  6:35           ` Oliver O'Halloran
2019-06-12  6:53             ` Christoph Hellwig
2019-06-12 11:48               ` Oliver O'Halloran
2019-06-15 12:12       ` Oded Gabbay
2019-06-16  9:55         ` Christoph Hellwig
2019-06-16 11:24           ` Oded Gabbay
2019-06-17  8:19             ` Christoph Hellwig
2019-06-17  8:35               ` Oded Gabbay
2019-06-24  6:53                 ` Oded Gabbay
2019-06-24  6:58                   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).