linuxppc-dev.lists.ozlabs.org archive mirror
* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
       [not found]       ` <CAFCwf13A73AxKzaa7Dk3tU-1NDgTFs4+xCO2os7SuSyUHZ9Z3Q@mail.gmail.com>
@ 2019-06-11 17:22         ` Oded Gabbay
  2019-06-11 22:53           ` Benjamin Herrenschmidt
  2019-06-12  6:35           ` Oliver O'Halloran
  0 siblings, 2 replies; 9+ messages in thread
From: Oded Gabbay @ 2019-06-11 17:22 UTC (permalink / raw)
  To: Greg KH, linuxppc-dev, Christoph Hellwig; +Cc: Linux-Kernel@Vger. Kernel. Org

On Tue, Jun 11, 2019 at 8:03 PM Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> On Tue, Jun 11, 2019 at 6:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Jun 11, 2019 at 08:17:53AM -0700, Christoph Hellwig wrote:
> > > On Tue, Jun 11, 2019 at 11:58:57AM +0200, Greg KH wrote:
> > > > That feels like a big hack.  ppc doesn't have any "what arch am I
> > > > running on?" runtime call?  Did you ask on the ppc64 mailing list?  I'm
> > > > ok to take this for now, but odds are you need a better fix for this
> > > > sometime...
> > >
> > > That isn't the worst part of it.  The whole idea of checking what I'm
> > > running to set a dma mask just doesn't make any sense at all.
> >
> > Oded, I thought I asked if there was a dma call you should be making to
> > keep this type of check from being needed.  What happened to that?  As
> > Christoph points out, none of this should be needed, which is what I
> > thought I originally said :)
> >
> > thanks,
> >
> > greg k-h
>
> I'm sorry, but it seems I can't explain what my problem is, because you
> and Christoph keep mentioning pci_set_dma_mask(), but it doesn't
> help me.
> I'll try again to explain.
>
> The main problem, specifically for the Goya device, is that I can't call
> this function with *the same parameter* for POWER9 and x86-64, because
> x86-64 supports a DMA mask of 48 bits while POWER9 supports only 32 bits
> or 64 bits.
>
> The main limitation in my Goya device is that it can generate PCI
> outbound transactions with addresses from 0 to (2^50 - 1).
> That's why when we first integrated it in x86-64, we used a DMA mask
> of 48 bits, by calling pci_set_dma_mask(pdev, DMA_BIT_MASK(48)). That
> way, the kernel guarantees that all the DMA addresses are from 0 to
> (2^48 - 1), and that address range is accessible by my device.
>
> If for some reason, the x86-64 machine doesn't support 48-bits, the
> standard fallback code in ALL the drivers I have seen is to set the
> DMA mask to 32-bits. And that's how my current driver's code is
> written.
>
> Now, when I tried to integrate Goya into a POWER9 machine, the call to
> pci_set_dma_mask(pdev, DMA_BIT_MASK(48)) was rejected. The standard code,
> as I wrote above, is to call the same function with 32 bits. That
> works BUT it is not practical, as our applications require much more
> memory mapped than 32 bits can address. In addition, once you add more
> cards which are all mapped to the same range, it is simply not usable at all.
>
> Therefore, I consulted with POWER people and they told me I can call
> pci_set_dma_mask with a 64-bit mask, but I must make sure that ALL
> outbound transactions from Goya will have bit 59 set in the
> address.
> I can achieve that with a dedicated configuration I make in Goya's
> PCIe controller. That's what I did and that works.
>
> So, to summarize:
> If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> at runtime, I don't know if it's POWER9 or not, so upon failure I will
> call it again with 32, which makes our device pretty much unusable.
> If I call pci_set_dma_mask with 64, and do the dedicated configuration
> in Goya's PCIe controller, then it won't work on x86-64, because bit
> 59 will be set and the host won't like it (I checked it). In addition,
> I might get addresses above 50 bits, which my device can't generate.
>
> I hope this makes things more clear. Now, please explain to me how I
> can call pci_set_dma_mask without any regard to whether I run on
> x86-64 or POWER9, considering what I wrote above ?
>
> Thanks,
> Oded
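
The try-48-then-32 fallback summarized above can be sketched in userspace
as follows. This is only an illustration, not the habanalabs driver code:
try_mask_fn, pick_dma_bits, x86_like_accepts and power9_like_accepts are
hypothetical names, and the two mock platforms model nothing more than the
behaviour described in this mail (x86-64 accepts a 48-bit mask, POWER9
accepts only 32 or 64 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the kernel's DMA_BIT_MASK(n): the low n bits set. */
static uint64_t dma_bit_mask(unsigned bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* Hypothetical stand-in for the platform's answer to a set-mask call:
 * returns 0 on success, -1 on rejection. */
typedef int (*try_mask_fn)(uint64_t mask);

/* Mock platform that accepts any mask (models the x86-64 case). */
static int x86_like_accepts(uint64_t mask)
{
	(void)mask;
	return 0;
}

/* Mock platform that accepts only 32-bit or 64-bit masks (models POWER9
 * as described in this thread). */
static int power9_like_accepts(uint64_t mask)
{
	return (mask == dma_bit_mask(32) || mask == dma_bit_mask(64)) ? 0 : -1;
}

/* Returns the mask width that was accepted, or 0 if both were rejected. */
static unsigned pick_dma_bits(try_mask_fn try_mask)
{
	if (try_mask(dma_bit_mask(48)) == 0)
		return 48;
	if (try_mask(dma_bit_mask(32)) == 0)
		return 32;
	return 0;
}
```

With these mocks, pick_dma_bits() settles on 48 for the x86-like platform
and drops to 32 for the POWER9-like one, which is the unusable outcome
described above.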

Adding ppc mailing list.

Oded


* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 17:22         ` [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9 Oded Gabbay
@ 2019-06-11 22:53           ` Benjamin Herrenschmidt
  2019-06-12  5:45             ` Oliver O'Halloran
  2019-06-12  6:25             ` Oded Gabbay
  2019-06-12  6:35           ` Oliver O'Halloran
  1 sibling, 2 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-11 22:53 UTC (permalink / raw)
  To: Oded Gabbay, Greg KH, linuxppc-dev, Christoph Hellwig
  Cc: Russell Currey, Oliver OHalloran, Linux-Kernel@Vger. Kernel. Org

On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> 
> > So, to summarize:
> > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > at runtime, I don't know if it's POWER9 or not, so upon failure I will
> > call it again with 32, which makes our device pretty much unusable.
> > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > 59 will be set and the host won't like it (I checked it). In addition,
> > I might get addresses above 50 bits, which my device can't generate.
> > 
> > I hope this makes things more clear. Now, please explain to me how I
> > can call pci_set_dma_mask without any regard to whether I run on
> > x86-64 or POWER9, considering what I wrote above ?
> > 
> > Thanks,
> > Oded
> 
> Adding ppc mailing list.

You can't. Your device is broken. Devices that don't support DMAing to
the full 64-bit deserve to be added to the trash pile.

As a result, getting it to work will require hacks. Some GPUs have
similar issues and require similar hacks, it's unfortunate.

Added a couple of guys on CC who might be able to help get those hacks
right.

It's still very fishy .. the idea is to detect the case where setting a
64-bit mask will give your system memory mapped at a fixed high address
(1 << 59 in our case) and program that in your chip in the "Fixed high
bits" register that you seem to have (also make sure it doesn't affect
MSIs or it will break them).

This will only work as long as all of the system memory can be
addressed at an offset from that fixed address that itself fits your
device addressing capabilities (50 bits in this case). It may or may
not be the case but there's no way to check since the DMA mask logic
won't really apply.
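
One way to picture that constraint is the following minimal sketch. It
assumes the fixed high address is 1 << 59 and that the device drives 50
address bits, as stated in this thread; FIXED_HIGH_BITS, DEVICE_ADDR_BITS
and device_can_reach are hypothetical names, and nothing here is a kernel
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIXED_HIGH_BITS  (1ULL << 59)  /* fixed high address from this thread */
#define DEVICE_ADDR_BITS 50            /* address bits the device can drive */

/* True if the device can emit bus_addr: the address must sit at or above
 * the fixed base, and its offset from that base must fit in the device's
 * own address width. */
static bool device_can_reach(uint64_t bus_addr)
{
	assert(DEVICE_ADDR_BITS < 64);

	if (bus_addr < FIXED_HIGH_BITS)
		return false;

	uint64_t offset = bus_addr - FIXED_HIGH_BITS;
	return offset < (1ULL << DEVICE_ADDR_BITS);
}
```

Any system memory whose offset from the fixed base reaches 2^50 is out of
range for the device, and the DMA mask logic will not report it.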

You might want to consider fixing your HW in the next iteration... This
is going to bite you when x86 increases the max physical memory for
example, or on other architectures.

Cheers,
Ben.






* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 22:53           ` Benjamin Herrenschmidt
@ 2019-06-12  5:45             ` Oliver O'Halloran
  2019-06-12  8:17               ` Benjamin Herrenschmidt
  2019-06-12  6:25             ` Oded Gabbay
  1 sibling, 1 reply; 9+ messages in thread
From: Oliver O'Halloran @ 2019-06-12  5:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Oded Gabbay, Russell Currey, Greg KH,
	Linux-Kernel@Vger. Kernel. Org, Christoph Hellwig, linuxppc-dev

On Wed, Jun 12, 2019 at 8:54 AM Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> >
> > > So, to summarize:
> > > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > > at runtime, I don't know if it's POWER9 or not, so upon failure I will
> > > call it again with 32, which makes our device pretty much unusable.
> > > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > > 59 will be set and the host won't like it (I checked it). In addition,
> > > I might get addresses above 50 bits, which my device can't generate.
> > >
> > > I hope this makes things more clear. Now, please explain to me how I
> > > can call pci_set_dma_mask without any regard to whether I run on
> > > x86-64 or POWER9, considering what I wrote above ?
> > >
> > > Thanks,
> > > Oded
> >
> > Adding ppc mailing list.
>
> You can't. Your device is broken. Devices that don't support DMAing to
> the full 64-bit deserve to be added to the trash pile.
>
> As a result, getting it to work will require hacks. Some GPUs have
> similar issues and require similar hacks, it's unfortunate.
>
> Added a couple of guys on CC who might be able to help get those hacks
> right.

> It's still very fishy .. the idea is to detect the case where setting a
> 64-bit mask will give your system memory mapped at a fixed high address
> (1 << 59 in our case) and program that in your chip in the "Fixed high
> bits" register that you seem to have (also make sure it doesn't affect
> MSIs or it will break them).

Judging from the patch (https://lkml.org/lkml/2019/6/11/59) this is
what they're doing.

Also, are you sure about the MSI thing? The IODA3 spec says the only
important bits for a 64bit MSI are bits 61:60 (to hit the window) and
the lower bits that determine what IVE to use. Everything in between
is ignored so ORing in bit 59 shouldn't break anything.
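
As a rough sketch of that argument, assuming bit 0 is the least
significant bit (so "bit 59" is 1 << 59, matching the value used earlier
in this thread): the helper names are hypothetical, and the width of the
low field that selects the IVE is not given here, so it is left as a
caller-supplied parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 61:60 of the address, which select the MSI window. */
static unsigned msi_window_bits(uint64_t addr)
{
	return (unsigned)((addr >> 60) & 0x3);
}

/* Low ive_bits bits of the address. The real field width is not stated
 * in this thread, so the caller passes a hypothetical one. */
static uint64_t msi_low_bits(uint64_t addr, unsigned ive_bits)
{
	assert(ive_bits >= 1 && ive_bits < 59);
	return addr & ((1ULL << ive_bits) - 1);
}

/* What the device's fixed-high-bits setting would do to an address. */
static uint64_t or_in_bit59(uint64_t addr)
{
	return addr | (1ULL << 59);
}
```

ORing in bit 59 changes the address but leaves both extracted fields
untouched, as long as the low field stays below bit 59.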

> This will only work as long as all of the system memory can be
> addressed at an offset from that fixed address that itself fits your
> device addressing capabilities (50 bits in this case). It may or may
> not be the case but there's no way to check since the DMA mask logic
> won't really apply.
>
> You might want to consider fixing your HW in the next iteration... This
> is going to bite you when x86 increases the max physical memory for
> example, or on other architectures.

Yes, do this. The easiest way to avoid this sort of weird hack is to
just design the PCIe interface to the spec in the first place.


* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 22:53           ` Benjamin Herrenschmidt
  2019-06-12  5:45             ` Oliver O'Halloran
@ 2019-06-12  6:25             ` Oded Gabbay
  2019-06-12  8:18               ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 9+ messages in thread
From: Oded Gabbay @ 2019-06-12  6:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Russell Currey, Oliver OHalloran, Greg KH,
	Linux-Kernel@Vger. Kernel. Org, Christoph Hellwig, linuxppc-dev

On Wed, Jun 12, 2019 at 1:53 AM Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> On Tue, 2019-06-11 at 20:22 +0300, Oded Gabbay wrote:
> >
> > > So, to summarize:
> > > If I call pci_set_dma_mask with 48, then it fails on POWER9. However,
> > > at runtime, I don't know if it's POWER9 or not, so upon failure I will
> > > call it again with 32, which makes our device pretty much unusable.
> > > If I call pci_set_dma_mask with 64, and do the dedicated configuration
> > > in Goya's PCIe controller, then it won't work on x86-64, because bit
> > > 59 will be set and the host won't like it (I checked it). In addition,
> > > I might get addresses above 50 bits, which my device can't generate.
> > >
> > > I hope this makes things more clear. Now, please explain to me how I
> > > can call pci_set_dma_mask without any regard to whether I run on
> > > x86-64 or POWER9, considering what I wrote above ?
> > >
> > > Thanks,
> > > Oded
> >
> > Adding ppc mailing list.
>
> You can't. Your device is broken. Devices that don't support DMAing to
> the full 64-bit deserve to be added to the trash pile.
>
Hmm... right now they are being added to customers' data centers, but what do I know ;)

> As a result, getting it to work will require hacks. Some GPUs have
> similar issues and require similar hacks, it's unfortunate.
>
> Added a couple of guys on CC who might be able to help get those hacks
> right.
Thanks :)
>
> It's still very fishy .. the idea is to detect the case where setting a
> 64-bit mask will give your system memory mapped at a fixed high address
> (1 << 59 in our case) and program that in your chip in the "Fixed high
> bits" register that you seem to have (also make sure it doesn't affect
> MSIs or it will break them).
MSI-X is working. Setting bit 59 doesn't apply to MSI-X
transactions (AFAICS from the PCIe controller spec we have).
>
> This will only work as long as all of the system memory can be
> addressed at an offset from that fixed address that itself fits your
> device addressing capabilities (50 bits in this case). It may or may
> not be the case but there's no way to check since the DMA mask logic
> won't really apply.
Understood. In the specific system we are integrated into, that is the
case - we have less than 48 bits. But, as you pointed out, it is not a
generic solution, and with my H/W I can't give a generic fit-all
solution for POWER9. I'll settle for the best that I can do.

>
> You might want to consider fixing your HW in the next iteration... This
> is going to bite you when x86 increases the max physical memory for
> example, or on other architectures.
Understood and taken care of.

>
> Cheers,
> Ben.
>
>
>
>


* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-11 17:22         ` [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9 Oded Gabbay
  2019-06-11 22:53           ` Benjamin Herrenschmidt
@ 2019-06-12  6:35           ` Oliver O'Halloran
  2019-06-12  6:53             ` Christoph Hellwig
  1 sibling, 1 reply; 9+ messages in thread
From: Oliver O'Halloran @ 2019-06-12  6:35 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: Christoph Hellwig, Greg KH, Linux-Kernel@Vger. Kernel. Org, linuxppc-dev

On Wed, Jun 12, 2019 at 3:25 AM Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> On Tue, Jun 11, 2019 at 8:03 PM Oded Gabbay <oded.gabbay@gmail.com> wrote:
> >
> > On Tue, Jun 11, 2019 at 6:26 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > *snip*
> >
> > Now, when I tried to integrate Goya into a POWER9 machine, the call to
> > pci_set_dma_mask(pdev, DMA_BIT_MASK(48)) was rejected. The standard code,
> > as I wrote above, is to call the same function with 32 bits. That
> > works BUT it is not practical, as our applications require much more
> > memory mapped than 32 bits can address.

Setting a 48 bit DMA mask doesn't work today because we only allocate
IOMMU tables to cover the 0..2GB range of PCI bus addresses. Alexey
has some patches to expand that range so we can support devices that
can't hit the 64 bit bypass window. You need:

This fix: http://patchwork.ozlabs.org/patch/1113506/
This series: http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=110810

Give that a try and see if the IOMMU overhead is tolerable.

> > In addition, once you add more cards which
> > are all mapped to the same range, it is simply not usable at all.

Each IOMMU group should have a separate bus address space and separate
cards shouldn't be in the same IOMMU group. If they are then there's
something up.

Oliver


* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  6:35           ` Oliver O'Halloran
@ 2019-06-12  6:53             ` Christoph Hellwig
  2019-06-12 11:48               ` Oliver O'Halloran
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2019-06-12  6:53 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Oded Gabbay, Greg KH, Christoph Hellwig,
	Linux-Kernel@Vger. Kernel. Org, linuxppc-dev

On Wed, Jun 12, 2019 at 04:35:22PM +1000, Oliver O'Halloran wrote:
> Setting a 48 bit DMA mask doesn't work today because we only allocate
> IOMMU tables to cover the 0..2GB range of PCI bus addresses.

I don't think that is true upstream, and if it is we need to fix the bug
in the powerpc code.  powerpc should at least be falling back to treating
a 48-bit dma mask like a 32-bit one, that is, use dynamic iommu mappings
instead of using the direct mapping.  And from my reading of
arch/powerpc/kernel/dma-iommu.c that is exactly what it does.


* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  5:45             ` Oliver O'Halloran
@ 2019-06-12  8:17               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-12  8:17 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Oded Gabbay, Russell Currey, Greg KH,
	Linux-Kernel@Vger. Kernel. Org, Christoph Hellwig, linuxppc-dev

On Wed, 2019-06-12 at 15:45 +1000, Oliver O'Halloran wrote:
> 
> Also, are you sure about the MSI thing? The IODA3 spec says the only
> important bits for a 64bit MSI are bits 61:60 (to hit the window) and
> the lower bits that determine what IVE to use. Everything in between
> is ignored so ORing in bit 59 shouldn't break anything.

On IODA3... could be different on another system. My point is you can't
just have a fixed setting for all top bits for DMA & MSIs.

> > This will only work as long as all of the system memory can be
> > addressed at an offset from that fixed address that itself fits your
> > device addressing capabilities (50 bits in this case). It may or may
> > not be the case but there's no way to check since the DMA mask logic
> > won't really apply.
> > 
> > You might want to consider fixing your HW in the next iteration... This
> > is going to bite you when x86 increases the max physical memory for
> > example, or on other architectures.
> 
> Yes, do this. The easiest way to avoid this sort of weird hack is to
> just design the PCIe interface to the spec in the first place.

Ben.



* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  6:25             ` Oded Gabbay
@ 2019-06-12  8:18               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-12  8:18 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: Russell Currey, Oliver OHalloran, Greg KH,
	Linux-Kernel@Vger. Kernel. Org, Christoph Hellwig, linuxppc-dev

On Wed, 2019-06-12 at 09:25 +0300, Oded Gabbay wrote:
> 
> > You can't. Your device is broken. Devices that don't support DMAing to
> > the full 64-bit deserve to be added to the trash pile.
> > 
> 
> Hmm... right now they are being added to customers' data centers, but what do I know ;)

Well, some customers don't know they are being sold a lemon :)

> > As a result, getting it to work will require hacks. Some GPUs have
> > similar issues and require similar hacks, it's unfortunate.
> > 
> > Added a couple of guys on CC who might be able to help get those hacks
> > right.
> 
> Thanks :)
> > 
> > It's still very fishy .. the idea is to detect the case where setting a
> > 64-bit mask will give your system memory mapped at a fixed high address
> > (1 << 59 in our case) and program that in your chip in the "Fixed high
> > bits" register that you seem to have (also make sure it doesn't affect
> > MSIs or it will break them).
> 
> MSI-X is working. Setting bit 59 doesn't apply to MSI-X
> transactions (AFAICS from the PCIe controller spec we have).

Ok.

> > This will only work as long as all of the system memory can be
> > addressed at an offset from that fixed address that itself fits your
> > device addressing capabilities (50 bits in this case). It may or may
> > not be the case but there's no way to check since the DMA mask logic
> > won't really apply.
> 
> Understood. In the specific system we are integrated into, that is the
> case - we have less than 48 bits. But, as you pointed out, it is not a
> generic solution, and with my H/W I can't give a generic fit-all
> solution for POWER9. I'll settle for the best that I can do.
> 
> > 
> > You might want to consider fixing your HW in the next iteration... This
> > is going to bite you when x86 increases the max physical memory for
> > example, or on other architectures.
> 
> Understood and taken care of.

Cheers,
Ben.

> > 
> > Cheers,
> > Ben.
> > 
> > 
> > 
> > 



* Re: [PATCH v2 8/8] habanalabs: enable 64-bit DMA mask in POWER9
  2019-06-12  6:53             ` Christoph Hellwig
@ 2019-06-12 11:48               ` Oliver O'Halloran
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver O'Halloran @ 2019-06-12 11:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Oded Gabbay, Greg KH, Linux-Kernel@Vger. Kernel. Org,
	Alexey Kardashevskiy, linuxppc-dev

On Wed, Jun 12, 2019 at 4:53 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Jun 12, 2019 at 04:35:22PM +1000, Oliver O'Halloran wrote:
> > Setting a 48 bit DMA mask doesn't work today because we only allocate
> > IOMMU tables to cover the 0..2GB range of PCI bus addresses.
>
> I don't think that is true upstream, and if it is we need to fix the bug
> in the powerpc code.  powerpc should at least be falling back to treating
> a 48-bit dma mask like a 32-bit one, that is, use dynamic iommu mappings
> instead of using the direct mapping.  And from my reading of
> arch/powerpc/kernel/dma-iommu.c that is exactly what it does.

This is more or less what Alexey's patches fix. The IOMMU table
allocated for the 32bit DMA window is only sized for 2GB in the
platform code, see pnv_pci_ioda2_setup_default_config().
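
To give a feel for the numbers, here is a small sketch of the arithmetic
behind a 2GB window. It is not the code in
pnv_pci_ioda2_setup_default_config(); WINDOW_BYTES, table_entries and
in_window are hypothetical names, and the IOMMU page size is left as a
parameter because it is not stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_BYTES (2ULL << 30)  /* the 2GB window discussed above */

/* Number of table entries needed to cover window_bytes when each entry
 * maps one IOMMU page of page_bytes (must be a power of two). */
static uint64_t table_entries(uint64_t window_bytes, uint64_t page_bytes)
{
	assert(page_bytes != 0 && (page_bytes & (page_bytes - 1)) == 0);
	return (window_bytes + page_bytes - 1) / page_bytes;
}

/* True if a bus address falls inside the 0..2GB window. */
static int in_window(uint64_t bus_addr)
{
	return bus_addr < WINDOW_BYTES;
}
```

Whatever mask the driver asks for, a table of this size can only hand out
bus addresses below 2GB, so at most 2GB can be mapped at any one time.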
