* SGI Octane && Bridge DMA bug
From: Joshua Kinard @ 2016-08-28 12:01 UTC
  To: Linux/MIPS

Trying to tackle the bug on SGI Octane systems where the machine misbehaves if
the amount of installed RAM is >2GB.  Reading some hints from the OpenBSD
xbridge.c driver, it seems Octane's (and maybe IP27's?) Bridge IOMMU is weird
in that it cannot translate DMA addresses above 0x7fffffff ((1ULL << 31) - 1).
This is complicated by the fact that Octane's physical memory is offset
by 512MB, so I think the real DMA limits need to be 0x20000000 to 0x9fffffff.
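
A minimal userspace sketch of the range arithmetic described above, assuming
the 2GB limit applies relative to the start of RAM.  The macro and function
names are hypothetical and are not part of any kernel or OpenBSD API; the two
constants are the ones quoted in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the message; the names are made up for illustration. */
#define OCTANE_RAM_BASE  0x20000000ULL  /* physical memory starts at 512MB */
#define BRIDGE_DMA_LIMIT 0x7fffffffULL  /* highest RAM-relative offset assumed reachable */

/* True if [phys, phys + len) lies entirely inside 0x20000000..0x9fffffff. */
static bool bridge_can_reach(uint64_t phys, uint64_t len)
{
	if (len == 0 || phys < OCTANE_RAM_BASE)
		return false;
	uint64_t off = phys - OCTANE_RAM_BASE;
	if (off > BRIDGE_DMA_LIMIT)
		return false;
	/* Compare lengths rather than end addresses to avoid overflow. */
	return len - 1 <= BRIDGE_DMA_LIMIT - off;
}

/* Physical address -> RAM-relative offset (assumed translation). */
static uint64_t phys_to_bridge_off(uint64_t phys)
{
	assert(bridge_can_reach(phys, 1));
	return phys - OCTANE_RAM_BASE;
}

/* RAM-relative offset -> physical address (assumed translation). */
static uint64_t bridge_off_to_phys(uint64_t off)
{
	assert(off <= BRIDGE_DMA_LIMIT);
	return off + OCTANE_RAM_BASE;
}
```

Under these assumptions, a page ending at 0x9fffffff is still reachable, while
anything touching 0xa0000000 or above is not.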

Been messing around in the dma-coherence.h header for Octane, and so far, with
4GB of RAM installed, it gets all the way down to bringing up the MD raid
stuff, then throws an instruction bus error for address 0xffffffffa0013ea0.  I
can't make a determination if that's a DMA address or something else.  It's
sign-extended, so it's not any valid 64-bit address (including Crosstalk or
something attached to HEART).  It's very consistent, though, as it's in the EPC
register after each crash.

The problem with Linux's DMA code is it is basically rigged to handle DMA for
PCI devices.  This includes the MIPS-specific DMA stuff.  The Impact video
board in an Octane is not a PCI device, but rather a pure Crosstalk device, and
it has no issues with DMA (as far as I know).  So I need to find a way to limit
DMA addresses for the Bridge driver only, but not mangle Impact DMA addresses.

Ideas?

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

* Re: SGI Octane && Bridge DMA bug
From: Joshua Kinard @ 2016-08-28 16:58 UTC
  To: linux-mips

On 08/28/2016 08:01, Joshua Kinard wrote:
[snip]

I think the 0xffffffffa0013ea0 address I keep hitting from multiple, unrelated
*alloc*() functions is, by virtue of being in CKSEG1 space, an exception
handler.  Or was.  Those seem to get blown away somehow when something triggers
an Oops.  The trigger looks like the disk layer (MD, XFS, or qla1280) doing a
DMA operation and probably (though not confirmed) running into that Bridge
issue of limited DMA addressing.
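
For reference, a small sketch of how the CKSEG1 claim can be checked.  On
64-bit MIPS, CKSEG1 is the unmapped, uncached compatibility segment covering
0xffffffffa0000000 through 0xffffffffbfffffff, and it maps onto the low 512MB
of physical address space.  The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CKSEG1_START 0xffffffffa0000000ULL
#define CKSEG1_END   0xffffffffbfffffffULL

/* True if vaddr falls inside the CKSEG1 compatibility segment. */
static bool in_ckseg1(uint64_t vaddr)
{
	return vaddr >= CKSEG1_START && vaddr <= CKSEG1_END;
}

/* Strip the segment bits, leaving the 29-bit physical offset. */
static uint64_t ckseg1_to_phys(uint64_t vaddr)
{
	return vaddr & 0x1fffffffULL;
}
```

With these helpers, 0xffffffffa0013ea0 is inside CKSEG1 and corresponds to
physical offset 0x13ea0.  Whether that location holds an exception handler on
Octane is not something this sketch can confirm.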

When the Oops happens, the MIPS trap code dumps the stack and registers, but
when it goes to print the code trace, it trips an instruction bus error on
0xffffffffa0013ea0, followed by one or more data bus errors.

Seems to be the only explanation that I can think of.  Is it likely I'll have
to write Octane-specific DMA alloc functions instead of using the dma-default.c
versions?  It seems dma-coherence.h is for dealing with addresses that have
already been allocated, whereas I think I'll have to intercept the DMA calls and
make sure nothing over 0x7fffffff in physmem gets allocated for Bridge.

* Re: SGI Octane && Bridge DMA bug
From: Florian Fainelli @ 2016-08-28 18:06 UTC
  To: Joshua Kinard; +Cc: Linux-MIPS

2016-08-28 9:58 GMT-07:00 Joshua Kinard <kumba@gentoo.org>:
[snip]

Regarding your first question, for all plat_dma_* operations you
should be able to inspect the struct device properties and provide the
correct implementation based on whether this device is a child of the
Bridge IOMMU or not (e.g., by looking at the name of dev->parent?)

You are right that this only works for addresses that have already
been allocated.  If you also need to make sure that the allocation falls
within a particular range, which dma-default.c does not take care of,
then either setting an appropriate dma_mask or providing a custom
implementation of dma_map_ops may be required here.
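
To illustrate what a DMA mask expresses, here is a userspace sketch.
BIT_MASK_N() has the same shape as the kernel's DMA_BIT_MASK() macro, and
fits_mask() is a hypothetical helper modelling the usual "last byte must not
exceed the mask" check.  Note that a plain 31-bit mask only bounds the address
from above, so by itself it would not account for the 512MB physical offset
mentioned earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n). */
#define BIT_MASK_N(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* True if the buffer [dma_addr, dma_addr + size) stays under the mask. */
static bool fits_mask(uint64_t dma_addr, uint64_t size, uint64_t mask)
{
	if (size == 0)
		return false;
	uint64_t last = dma_addr + size - 1;
	/* Reject wrap-around before comparing against the mask. */
	return last >= dma_addr && last <= mask;
}
```

In a real driver the mask would be requested through the kernel DMA API rather
than checked by hand; this only shows the arithmetic.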

HTH
-- 
Florian

* Re: SGI Octane && Bridge DMA bug
From: Joshua Kinard @ 2016-08-29  3:33 UTC
  To: linux-mips

On 08/28/2016 14:06, Florian Fainelli wrote:
> [snip]
> 
> Regarding your first question, for all plat_dma_* operations you
> should be able to inspect the struct device properties and provide the
> correct implementation based on whether this device is a child of the
> Bridge IOMMU or not (e.g., by looking at the name of dev->parent?)

Stan's original code used to check the struct device *dev arg for !NULL to
determine if it needed to be converted to struct pci_dev for Bridge ops;
otherwise, it treated the device as the Impact board.  But when Impact was
converted to a platform device, that check no longer worked (dev was always
set then), so I switched it to checking that dev->bus->name was "pci".  I was
worried that this code got executed a lot and that strcmp() would be
expensive.  It turns out the plat_dma_* functions are not called very often,
so a strcmp() shouldn't be too much of an issue.
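
A userspace model of the check described above, using mock structures in place
of struct device and struct bus_type (all names here are hypothetical).  It
also shows a pointer comparison as an alternative to strcmp(); the kernel's
dev_is_pci() macro works along those lines by comparing dev->bus against
pci_bus_type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock stand-ins for the kernel's bus_type and device structures. */
struct mock_bus { const char *name; };
struct mock_dev { const struct mock_bus *bus; };

static const struct mock_bus mock_pci_bus      = { .name = "pci" };
static const struct mock_bus mock_platform_bus = { .name = "platform" };

/* String comparison, as described in the message. */
static bool behind_bridge_by_name(const struct mock_dev *dev)
{
	return dev && dev->bus && dev->bus->name &&
	       strcmp(dev->bus->name, "pci") == 0;
}

/* Pointer comparison: no string walk, same answer for registered buses. */
static bool behind_bridge_by_ptr(const struct mock_dev *dev)
{
	return dev && dev->bus == &mock_pci_bus;
}
```

Either test sends PCI devices down the Bridge path and leaves a platform
device such as Impact untouched.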


> You are right that this only works for addresses that have already
> been allocated.  If you also need to make sure that the allocation falls
> within a particular range, which dma-default.c does not take care of,
> then either setting an appropriate dma_mask or providing a custom
> implementation of dma_map_ops may be required here.

OpenBSD uses a "uvm" subsystem that appears to be quite adaptable once
you set a few parameters, which is what their xbridge driver is doing, but it's
completely unlike what Linux has.  I can't tell yet whether I have to guarantee
that Bridge DMA allocations stay within 0x20000000 and 0x9fffffff (possibly
subtracting/adding 0x20000000 as needed to deal w/ the physical address
offset), or whether I just have to translate already-allocated addresses
to/from that range.  If the latter, I should be able to do that w/
dma-coherence.h.  Otherwise, it'll probably be a custom dma_ops setup.  At
least I have Loongson and Octeon to look at for examples.

Luckily, SGI appears to have imported large chunks of the original IRIX PCI
code into Linux when they were bringing up the Altix platform.  So I've been
referring back to 2.4.18 and 2.5.70 to see how the "pcibr.c" and "xtalk.c"
drivers implemented a lot of stuff in IA64.

Can't find anything specific to the Octane in the Linux code, though.  So I
can't tell if they had any workarounds in place or not for the Bridge ASIC on
this platform.  If they did, they probably removed them.

* Re: SGI Octane && Bridge DMA bug
From: Joshua Kinard @ 2016-08-29  6:05 UTC
  To: linux-mips

On 08/28/2016 23:33, Joshua Kinard wrote:
> On 08/28/2016 14:06, Florian Fainelli wrote:
>> 2016-08-28 9:58 GMT-07:00 Joshua Kinard <kumba@gentoo.org>:
>>> On 08/28/2016 08:01, Joshua Kinard wrote:

[snip]

Hrm, so it looks like qla1280 sets a 64-bit DMA mask if BITS_PER_LONG == 64.
I've tried using ZONE_DMA or ZONE_DMA32 (but not both together), with no real
luck so far.  During boot, qla1280 seems to have no issues doing DMA for the
disk probing and other actions.  MD and XFS are the drivers that are triggering
the random Oopses when they try to assemble an array or mount the root partition.

Since ZONE_DMA is for the old 24-bit DMA space (16MB), I think for Octane, I
want ZONE_DMA32, but override MAX_DMA32_PFN to be (1UL << (31 - PAGE_SHIFT)).
Still need to figure out how to handle translating between phys and DMA
addresses to handle the 512MB physical address offset imposed by the system's
design.
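
A quick sketch of the PFN arithmetic behind that override.  The helper names
are hypothetical, and the page size is left as a parameter because it depends
on the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of page frames below 2^bits bytes, i.e. 1 << (bits - PAGE_SHIFT). */
static uint64_t zone_pfn_limit(unsigned bits, unsigned page_shift)
{
	assert(bits >= page_shift && bits - page_shift < 64);
	return 1ULL << (bits - page_shift);
}

/* Page frame number of a physical address. */
static uint64_t phys_to_pfn(uint64_t phys, unsigned page_shift)
{
	assert(page_shift < 64);
	return phys >> page_shift;
}
```

With 4KB pages, the 31-bit limit is PFN 0x80000, while RAM starting at
0x20000000 begins at PFN 0x20000.  If the limit is counted from physical
address zero, the zone would therefore cover only the first 1.5GB of RAM; the
0x20000000 to 0x9fffffff range proposed earlier would instead end at PFN
0xa0000.  Which of the two is right depends on how the offset ends up being
handled.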

I think some of the confusion also arises in that Octane provides three
separate groups of "windows" into Crosstalk space via HEART, its system controller:

  - sixteen 16MB "small" windows, 0x000010000000 - 0x00001f000000
  - sixteen 2GB "medium" windows, 0x000800000000 - 0x000f80000000
  - fifteen 64GB "large" windows, 0x001000000000 - 0x00f000000000
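
The ranges above are consistent with each listed address being the base of the
first and last window of its group.  A sketch of that arithmetic, assuming
evenly spaced windows; the enum and function names are hypothetical, and idx
is simply the position within the group, not necessarily a widget number.

```c
#include <assert.h>
#include <stdint.h>

enum heart_win { HEART_WIN_SMALL, HEART_WIN_MEDIUM, HEART_WIN_LARGE };

/* Base address of the idx-th window of the given kind. */
static uint64_t heart_window_base(enum heart_win kind, unsigned idx)
{
	switch (kind) {
	case HEART_WIN_SMALL:   /* sixteen windows, 16MB apart */
		assert(idx < 16);
		return 0x000010000000ULL + idx * 0x1000000ULL;
	case HEART_WIN_MEDIUM:  /* sixteen windows, 2GB apart */
		assert(idx < 16);
		return 0x000800000000ULL + idx * 0x80000000ULL;
	case HEART_WIN_LARGE:   /* fifteen windows, 64GB apart */
		assert(idx < 15);
		return 0x001000000000ULL + idx * 0x1000000000ULL;
	}
	assert(0 && "unknown window kind");
	return 0;
}
```

The last window of each group lands exactly on the upper address in the list
(0x1f000000, 0xf80000000, and 0xf000000000 respectively).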

The existing Octane code appears to be picking a "default" window set up by the
firmware, which seems to use the large 64GB windows.  I think some kind of
translation layer would be needed to talk to the HEART to dynamically shift
between the three window types.  Although, I'm not sure why you'd need the
smaller windows at all (64GB is big enough for everyone, right?).

Then you've got the Crossbow (XBOW) that the HEART connects to as widget #8,
and that's how it accesses the other widgets, such as Bridge (widget #f) or
Impact video (widget #c).

Bridge grants you three methods of accessing PCI devices:

  - 64-bit direct-mapped DMA addressing (not affected by 31-bit window bug)
  - 32-bit direct-mapped DMA addressing (affected by 31-bit window bug)
  - 32-bit translated addresses, via a type of built-in IOMMU ("ATE")

The ATE is reportedly rather buggy and OpenBSD seems to avoid using it (it only
has 128 "internal" translation entries, and they cannot be updated while DMA is
in progress).  They go for the 32-bit direct-mapped DMA instead.
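
One way to picture the choice between the direct-mapped modes is the decision
sketch below.  It is an illustration under assumptions, not how the Bridge
driver or OpenBSD actually decides: the names are hypothetical, and ram_off is
taken to be the buffer's offset from the start of RAM.

```c
#include <stdbool.h>
#include <stdint.h>

#define BRIDGE_DIRECT32_LIMIT 0x7fffffffULL  /* the 31-bit window */

enum bridge_dma_mode {
	BRIDGE_DMA_DIRECT64,   /* 64-bit direct-mapped, not affected by the bug */
	BRIDGE_DMA_DIRECT32,   /* 32-bit direct-mapped, buffer fits the window */
	BRIDGE_DMA_NO_DIRECT,  /* neither direct mode applies */
};

static enum bridge_dma_mode
bridge_pick_mode(bool dev_64bit, uint64_t ram_off, uint64_t len)
{
	if (dev_64bit)
		return BRIDGE_DMA_DIRECT64;
	/* Overflow-safe check that the whole buffer is inside the window. */
	if (len != 0 && ram_off <= BRIDGE_DIRECT32_LIMIT &&
	    len - 1 <= BRIDGE_DIRECT32_LIMIT - ram_off)
		return BRIDGE_DMA_DIRECT32;
	return BRIDGE_DMA_NO_DIRECT;
}
```

The last case is where a 32-bit-only device with a buffer above the window
would need the ATE or some other workaround.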

On Linux, I've got the Bridge driver using direct-mapped 64-bit DMA for Octane
and Onyx2, and that seems to work OK for Onyx2, regardless of installed memory
(8GB).  Octane is where the problems begin if installed memory is >2GB.  So I
suspect this 31-bit bug is Octane-exclusive.

It would probably help if I understood PCI addressing better.  Still confused
over what a BAR is for, and why qla1280 needs three of them (#0, #1, and #6).
Additionally, if qla1280 can do 64-bit DMA using Bridge's 64-bit direct-mapped
mode, and thus dodge the 31-bit bug, I'm puzzled why it's always MD or XFS that
trigger the Oops.

Do software drivers like MD/XFS do their own DMA, or do they use the DMA
provided by the disk driver?

Goal is to at least get the base I/O devices to work right w/ >2GB RAM,
preferably as 64-bit PCI devices.  I can then go back and look at handling
additional Bridge widgets (such as the PCI shoebox or XIO shoehorn adapters).
PCI devices plugged into the shoebox/horn will probably be pure 32-bit devices,
so I'll have to defeat this 31-bit bug somehow.

