* [PATCH] TC: Set DMA masks for devices
@ 2018-10-03 12:21 Maciej W. Rozycki
2018-10-04 16:57 ` Fredrik Noring
0 siblings, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2018-10-03 12:21 UTC
To: Ralf Baechle; +Cc: linux-mips, linux-kernel
Fix a TURBOchannel support regression with commit 205e1b7f51e4
("dma-mapping: warn when there is no coherent_dma_mask") that caused
coherent DMA allocations to produce a warning such as:
defxx: v1.11 2014/07/01 Lawrence V. Stefani and others
tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 dfx_dev_register+0x670/0x678
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.0-rc6 #2
Stack : ffffffff8009ffc0 fffffffffffffec0 0000000000000000 ffffffff80647650
0000000000000000 0000000000000000 ffffffff806f5f80 ffffffffffffffff
0000000000000000 0000000000000000 0000000000000001 ffffffff8065d4e8
98000000031b6300 ffffffff80563478 ffffffff805685b0 ffffffffffffffff
0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
0000000000000000 0000000000000009 ffffffff8053efd0 ffffffff806657d0
0000000000000000 ffffffff803177f8 0000000000000000 ffffffff806d0000
9800000003078000 980000000307b9e0 000000001e900000 ffffffff80067940
0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
ffffffff805176c0 ffffffff8004dc78 0000000000000000 ffffffff80067940
...
Call Trace:
[<ffffffff8004dc78>] show_stack+0xa0/0x130
[<ffffffff80067940>] __warn+0x128/0x170
---[ end trace b1d1e094f67f3bb2 ]---
This is because the TURBOchannel bus driver fails to set the coherent
DMA mask for devices enumerated.
Set the regular and coherent DMA masks for TURBOchannel devices then,
observing that the bus protocol supports a 34-bit (16GiB) DMA address
space, by interpreting the value presented in the address cycle across
the 32 `ad' lines as a 32-bit word rather than byte address[1]. The
architectural size of the TURBOchannel DMA address space exceeds the
maximum amount of RAM any actual TURBOchannel system in existence may
have, hence both masks are the same.
This removes the warning shown above.
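As an illustration only (not part of the patch), the 34-bit figure can be checked with a small userspace sketch. The macro mirrors the kernel's DMA_BIT_MASK() definition; `tc_word_to_byte_addr' is a hypothetical helper name introduced here, and it assumes the word address on the `ad' lines is simply scaled by 4 to obtain a byte address:

```c
#include <stdint.h>

/* Same definition as DMA_BIT_MASK() in <linux/dma-mapping.h>. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: convert a 32-bit word address, as presented
 * across the 32 `ad' lines, to the byte address it designates.
 */
static inline uint64_t tc_word_to_byte_addr(uint32_t word_addr)
{
	return (uint64_t)word_addr << 2;
}
```

The highest word address, 0xffffffff, scales to byte address 0x3fffffffc, which still lies within DMA_BIT_MASK(34), i.e. below 16GiB.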
References:
[1] "TURBOchannel Hardware Specification", EK-369AA-OD-007B, Digital
Equipment Corporation, January 1993, Section "DMA", pp. 1-15 -- 1-17
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: 205e1b7f51e4 ("dma-mapping: warn when there is no coherent_dma_mask")
Cc: stable@vger.kernel.org # 4.16+
---
drivers/tc/tc.c | 8 +++++++-
include/linux/tc.h | 1 +
2 files changed, 8 insertions(+), 1 deletion(-)
linux-tc-dma-mask.patch
Index: linux-20180930-4maxp64/drivers/tc/tc.c
===================================================================
--- linux-20180930-4maxp64.orig/drivers/tc/tc.c
+++ linux-20180930-4maxp64/drivers/tc/tc.c
@@ -2,7 +2,7 @@
* TURBOchannel bus services.
*
* Copyright (c) Harald Koerfgen, 1998
- * Copyright (c) 2001, 2003, 2005, 2006 Maciej W. Rozycki
+ * Copyright (c) 2001, 2003, 2005, 2006, 2018 Maciej W. Rozycki
* Copyright (c) 2005 James Simmons
*
* This file is subject to the terms and conditions of the GNU
@@ -10,6 +10,7 @@
* directory of this archive for more details.
*/
#include <linux/compiler.h>
+#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/ioport.h>
@@ -92,6 +93,11 @@ static void __init tc_bus_add_devices(st
tdev->dev.bus = &tc_bus_type;
tdev->slot = slot;
+ /* TURBOchannel has 34-bit DMA addressing (16GiB space). */
+ tdev->dma_mask = DMA_BIT_MASK(34);
+ tdev->dev.dma_mask = &tdev->dma_mask;
+ tdev->dev.coherent_dma_mask = DMA_BIT_MASK(34);
+
for (i = 0; i < 8; i++) {
tdev->firmware[i] =
readb(module + offset + TC_FIRM_VER + 4 * i);
Index: linux-20180930-4maxp64/include/linux/tc.h
===================================================================
--- linux-20180930-4maxp64.orig/include/linux/tc.h
+++ linux-20180930-4maxp64/include/linux/tc.h
@@ -84,6 +84,7 @@ struct tc_dev {
device. */
struct device dev; /* Generic device interface. */
struct resource resource; /* Address space of this device. */
+ u64 dma_mask; /* DMA addressable range. */
char vendor[9];
char name[9];
char firmware[9];
* Re: [PATCH] TC: Set DMA masks for devices
2018-10-03 12:21 [PATCH] TC: Set DMA masks for devices Maciej W. Rozycki
@ 2018-10-04 16:57 ` Fredrik Noring
2018-10-04 17:55 ` Fredrik Noring
2018-10-04 20:09 ` Maciej W. Rozycki
0 siblings, 2 replies; 9+ messages in thread
From: Fredrik Noring @ 2018-10-04 16:57 UTC
To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban
Hi Maciej,
> Fix a TURBOchannel support regression with commit 205e1b7f51e4
> ("dma-mapping: warn when there is no coherent_dma_mask") that caused
> coherent DMA allocations to produce a warning such as:
>
> defxx: v1.11 2014/07/01 Lawrence V. Stefani and others
> tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 dfx_dev_register+0x670/0x678
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.0-rc6 #2
> Stack : ffffffff8009ffc0 fffffffffffffec0 0000000000000000 ffffffff80647650
> 0000000000000000 0000000000000000 ffffffff806f5f80 ffffffffffffffff
> 0000000000000000 0000000000000000 0000000000000001 ffffffff8065d4e8
> 98000000031b6300 ffffffff80563478 ffffffff805685b0 ffffffffffffffff
> 0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
> 0000000000000000 0000000000000009 ffffffff8053efd0 ffffffff806657d0
> 0000000000000000 ffffffff803177f8 0000000000000000 ffffffff806d0000
> 9800000003078000 980000000307b9e0 000000001e900000 ffffffff80067940
> 0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
> ffffffff805176c0 ffffffff8004dc78 0000000000000000 ffffffff80067940
> ...
> Call Trace:
> [<ffffffff8004dc78>] show_stack+0xa0/0x130
> [<ffffffff80067940>] __warn+0x128/0x170
> ---[ end trace b1d1e094f67f3bb2 ]---
>
> This is because the TURBOchannel bus driver fails to set the coherent
> DMA mask for devices enumerated.
Interesting! This warning is also triggered by the PS2 OHCI driver. Robin
Murphy proposed the patch
https://lkml.org/lkml/2018/7/3/507
that relaxed it and a related warning. Half of the patch was merged in
commit d27fb99f62af7 while the other half (related to this warning) was
rejected by Christoph Hellwig. The PS2 OHCI triggers the following trace:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 62 at ./include/linux/dma-mapping.h:516 ohci_setup+0x41c/0x424 [ohci_hcd]
Modules linked in: ohci_ps2(+) ohci_hcd usbcore usb_common sd_mod iop iop_fio iop_module iop_memory sif
CPU: 0 PID: 62 Comm: modprobe Not tainted 4.16.0+ #1533
Stack : 00000000 00000000 80747392 00000037 81c6eb0c 804f32e7 80493b24 0000003e
80743498 00000204 00000001 c01c0000 802a2fa0 10058c00 81ea5a68 804facc0
00000000 00000000 80740000 00000007 00000000 00000060 00000000 00000000
3a6d6d6f 00000000 0000005f 646f6d20 80000000 00000000 c01e66e8 c01e813c
00000009 00000204 00000001 c01c0000 00000018 80278fe0 0007579f 00000001
...
Call Trace:
[<8001d6e4>] show_stack+0x74/0x104
[<800323a8>] __warn+0x118/0x120
[<8003246c>] warn_slowpath_null+0x44/0x58
[<c01e66e8>] ohci_setup+0x41c/0x424 [ohci_hcd]
[<c01f209c>] ohci_ps2_reset+0x30/0x70 [ohci_ps2]
[<c01a8aec>] usb_add_hcd+0x2d4/0x89c [usbcore]
[<c01f2360>] ohci_hcd_ps2_probe+0x284/0x2a4 [ohci_ps2]
[<802a8a74>] platform_drv_probe+0x2c/0x68
[<802a70b4>] driver_probe_device+0x22c/0x2e4
[<802a71f0>] __driver_attach+0x84/0xc8
[<802a53fc>] bus_for_each_dev+0x60/0x90
[<802a6580>] bus_add_driver+0x1b8/0x200
[<802a7980>] driver_register+0xc0/0x100
[<800106bc>] do_one_initcall+0x17c/0x190
[<800841f4>] do_init_module+0x74/0x1f0
[<80082f30>] load_module+0x1680/0x2044
[<80083adc>] SyS_finit_module+0xa0/0xb8
[<8002190c>] syscall_common+0x34/0x58
---[ end trace e71738b5fa6bf9aa ]---
> Set the regular and coherent DMA masks for TURBOchannel devices then,
> observing that the bus protocol supports a 34-bit (16GiB) DMA address
> space, by interpreting the value presented in the address cycle across
> the 32 `ad' lines as a 32-bit word rather than byte address[1]. The
> architectural size of the TURBOchannel DMA address space exceeds the
> maximum amount of RAM any actual TURBOchannel system in existence may
> have, hence both masks are the same.
A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to
0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask
might correspond to the effective addressing capability, which would be
DMA_BIT_MASK(21), but it does not seem to be entirely clear, since his
commit message said that
A somewhat similar line of reasoning also applies at the other end for
the mask check in dma_alloc_attrs() too - indeed, a device which cannot
access anything other than its own local memory probably *shouldn't*
have a valid mask for the general coherent DMA API.
A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of
DMA bounce buffer. Are you using anything similar with your DEFTA driver?
Fredrik
* Re: [PATCH] TC: Set DMA masks for devices
2018-10-04 16:57 ` Fredrik Noring
@ 2018-10-04 17:55 ` Fredrik Noring
2018-10-04 20:09 ` Maciej W. Rozycki
1 sibling, 0 replies; 9+ messages in thread
From: Fredrik Noring @ 2018-10-04 17:55 UTC
To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban
Hi Maciej,
> > Set the regular and coherent DMA masks for TURBOchannel devices then,
> > observing that the bus protocol supports a 34-bit (16GiB) DMA address
> > space, by interpreting the value presented in the address cycle across
> > the 32 `ad' lines as a 32-bit word rather than byte address[1]. The
> > architectural size of the TURBOchannel DMA address space exceeds the
> > maximum amount of RAM any actual TURBOchannel system in existence may
> > have, hence both masks are the same.
>
> A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to
> 0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask
> might correspond to the effective addressing capability, which would be
> DMA_BIT_MASK(21), but it does not seem to be entirely clear, since his
> commit message said that
>
> A somewhat similar line of reasoning also applies at the other end for
> the mask check in dma_alloc_attrs() too - indeed, a device which cannot
> access anything other than its own local memory probably *shouldn't*
> have a valid mask for the general coherent DMA API.
>
> A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of
> DMA bounce buffer. Are you using anything similar with your DEFTA driver?
Sorry, I didn't interpret your comment properly. With the TURBOchannel DMA
address space exceeding any practical amount of RAM, bounce buffers aren't
needed for that system. The situation is the reverse with the PS2 OHCI.
Fredrik
* Re: [PATCH] TC: Set DMA masks for devices
2018-10-04 16:57 ` Fredrik Noring
2018-10-04 17:55 ` Fredrik Noring
@ 2018-10-04 20:09 ` Maciej W. Rozycki
2018-10-05 14:56 ` Fredrik Noring
1 sibling, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2018-10-04 20:09 UTC
To: Fredrik Noring; +Cc: Ralf Baechle, linux-mips, Jürgen Urban
Hi Fredrik,
> A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to
> 0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask
> might correspond to the effective addressing capability, which would be
> DMA_BIT_MASK(21),
I take it you mean 0-0x1fffff obviously; let's be accurate in a technical
discussion and avoid ambiguous cases.
Well, the need to map between the CPU and the DMA address space is not
uncommon. As I recall the Galileo/Marvell GT-64xxx system controllers
have a BAR for PCI master accesses to local DRAM (so that multiple such
controllers can coexist in a NUMA system) and any non-identity mapping has
to be taken into account with DMA of course.
And indeed e.g. `dma_map_single' does handle that and given a CPU-side
physical memory address returns a corresponding DMA-side address. And the
DMA mask has to reflect that and describe the DMA side, as it's the device
side that has an address space limitation here and any offset resulting
from a non-identity mapping does not change that limitation, although the
offset does have of course to be taken into account by `dma_map_single',
etc. in determining whether the memory area requested for use by a DMA
device can be used directly or whether a bounce buffer will be required
for that mapping.
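To make the point concrete, here is a minimal userspace sketch, not kernel code. `struct dma_window', `cpu_to_dma' and `dma_reachable' are hypothetical names, and a single linear window is assumed. The translation is applied first and the mask is then checked against the DMA-side address:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same definition as DMA_BIT_MASK() in <linux/dma-mapping.h>. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical description of one linear CPU-to-DMA window. */
struct dma_window {
	uint64_t cpu_base;	/* CPU physical address of the window start */
	uint64_t dma_base;	/* address the device uses for that location */
	uint64_t size;		/* window length in bytes */
};

/* Translate a CPU physical address; fails if it is outside the window. */
static bool cpu_to_dma(const struct dma_window *w, uint64_t cpu_addr,
		       uint64_t *dma_addr)
{
	if (cpu_addr < w->cpu_base || cpu_addr - w->cpu_base >= w->size)
		return false;
	*dma_addr = w->dma_base + (cpu_addr - w->cpu_base);
	return true;
}

/* True if every byte of [dma_addr, dma_addr + len) lies within the mask. */
static bool dma_reachable(uint64_t dma_addr, uint64_t len, uint64_t mask)
{
	return len != 0 && len - 1 <= mask && dma_addr <= mask - (len - 1);
}
```

With the PS2 figures quoted above (CPU base 0x1c000000, DMA base 0, size 0x200000), CPU address 0x1c1ffffc translates to DMA address 0x1ffffc, which fits under DMA_BIT_MASK(21), whereas the untranslated CPU address would not.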
> but it does not seem to be entirely clear, since his
> commit message said that
>
> A somewhat similar line of reasoning also applies at the other end for
> the mask check in dma_alloc_attrs() too - indeed, a device which cannot
> access anything other than its own local memory probably *shouldn't*
> have a valid mask for the general coherent DMA API.
Well, how can such a device use the DMA API in the first place? If the
device has local memory, then the driver has to manage it itself somehow
if needed, and then arrange copying it to main memory, either by a CPU or
a third-party DMA controller (data mover) if available. Of course in the
latter case a driver for the DMA controller may have to use the DMA API.
I'll be resubmitting a driver for such a device shortly, the DEFZA (the
previous submission can be found here:
<https://marc.info/?l=linux-netdev&m=139841853827404>). It is interesting
in that the FDDI engine supports host DMA on the reception side (and
consequently the driver uses the DMA API to handle that), while on the
transmission side (as well as with a couple of maintenance queues) it only
does DMA with its onboard buffer memory, the contents of which need to be
copied by the CPU. So there's no use of the DMA API on the transmission
or maintenance side. However usual DMA rings (all located in board memory
too) are used for all data moves.
The DEFTA is a follow-up and an upgrade to the DEFZA, more integrated
(the DEFZA uses a pair of PCBs while the DEFTA fits on one, of the size of
each in the former pair), and with the extra silicon space gained it was
possible to squeeze in circuitry required to do host DMA for all data
moves, and also the DMA rings.
> A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of
> DMA bounce buffer. Are you using anything similar with your DEFTA driver?
The driver does need either an IOMMU or bounce buffers in system RAM in
the case of 64-bit PCI systems, as the PFI PCI ASIC that the FDDI PDQ ASIC
interfaces on the DEFPA does not AFAIK support 64-bit addressing (be it
directly or with the use of DAC), although the PDQ itself does support
48-bit addressing (i.e. DMA descriptor addresses hold bits 47:2 of host
addresses), which would be sufficient for the usual cases.
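As a rough sketch of what holding bits 47:2 implies (the actual PDQ descriptor layout is not reproduced here, and all names below are hypothetical), an address is representable only if it is 4-byte aligned and below 2^48:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 47:2 of a byte address, i.e. a 46-bit field (hypothetical name). */
#define PDQ_ADDR_BITS_47_2 ((1ULL << 48) - (1ULL << 2))

/* True if the address has no bits set outside 47:2. */
static inline bool pdq_addr_fits(uint64_t byte_addr)
{
	return (byte_addr & ~PDQ_ADDR_BITS_47_2) == 0;
}

/* Extract the 46-bit field value from a byte address. */
static inline uint64_t pdq_addr_to_field(uint64_t byte_addr)
{
	return (byte_addr & PDQ_ADDR_BITS_47_2) >> 2;
}

/* Rebuild the byte address from the 46-bit field value. */
static inline uint64_t pdq_field_to_addr(uint64_t field)
{
	return field << 2;
}
```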
Not in the DEFTA (or for that matter DEFEA; possibly the only EISA device
using the DMA API) case though, as the most equipped TURBOchannel systems,
i.e. the DEC 3000 AXP models 500, 800 and 900 only support up to 1GiB of
memory, which is well below the 34-bit addressing limit.
The PDQ ASIC was used to interface FDDI to many host buses and in
addition to the 3 bus attachments mentioned above, all of which we have
support for in Linux, it was also used for Q-bus (the DEFQA) and FutureBus
(the DEFAA). We may have support for the DEFQA one day as I have both
such a board and a suitable system to use it with. We are unlikely to
have support for the DEFAA, as FutureBus was only used in high-end VAX and
Alpha systems, the size of a full 19" rack at the very least, but it is
there I believe only that the full PDQ addressing capability was actually
utilised.
NB I sat on this fix from 2014, well before the warning was introduced in
the first place, and it's only now that I got to unloading my patch queue.
:(
Maciej
* Re: [PATCH] TC: Set DMA masks for devices
2018-10-04 20:09 ` Maciej W. Rozycki
@ 2018-10-05 14:56 ` Fredrik Noring
2018-10-05 22:52 ` Maciej W. Rozycki
0 siblings, 1 reply; 9+ messages in thread
From: Fredrik Noring @ 2018-10-05 14:56 UTC
To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban
Hi Maciej,
> > A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to
> > 0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask
> > might correspond to the effective addressing capability, which would be
> > DMA_BIT_MASK(21),
>
> I take it you mean 0-0x1fffff obviously; let's be accurate in a technical
> discussion and avoid ambiguous cases.
That's interesting. :) 0x1fffff is not a valid DMA address due to alignment
restrictions, so if one wants to indicate a closed [inclusive] DMA address
interval it would be 0-0x1ffffc, since the 32-bit word rather than the byte
is the unit of the IOP DMA. In mathematics and programming languages it is
often convenient to work with half-open intervals denoted by "[0,0x200000)"
in this case. I think both notations are technically accurate, but they do
emphasize different aspects of addresses and memory. I can switch to your
byte-centric notation if that helps. :)
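The two notations can be put side by side in a short userspace sketch. The constants and function names are hypothetical; they assume only the half-open range [0,0x200000) and the 32-bit word unit mentioned above:

```c
#include <stdbool.h>
#include <stdint.h>

#define IOP_DMA_END	0x200000u	/* exclusive upper bound of the range */
#define IOP_DMA_ALIGN	4u		/* the 32-bit word is the DMA unit */

/* True if addr is a word-aligned address inside [0, IOP_DMA_END). */
static bool iop_dma_addr_valid(uint32_t addr)
{
	return addr < IOP_DMA_END && (addr % IOP_DMA_ALIGN) == 0;
}

/* Last valid word address: the closed interval in word terms. */
static uint32_t iop_dma_last_word(void)
{
	return IOP_DMA_END - IOP_DMA_ALIGN;
}

/* Last byte covered: the closed interval in byte terms. */
static uint32_t iop_dma_last_byte(void)
{
	return IOP_DMA_END - 1;
}
```

Here the last word address is 0x1ffffc and the last byte covered is 0x1fffff; both describe the same half-open range.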
> Well, the need to map between the CPU and the DMA address space is not
> uncommon. As I recall the Galileo/Marvell GT-64xxx system controllers
> have a BAR for PCI master accesses to local DRAM (so that multiple such
> controllers can coexist in a NUMA system) and any non-identity mapping has
> to be taken into account with DMA of course.
>
> And indeed e.g. `dma_map_single' does handle that and given a CPU-side
> physical memory address returns a corresponding DMA-side address. And the
> DMA mask has to reflect that and describe the DMA side, as it's the device
> side that has an address space limitation here and any offset resulting
> from a non-identity mapping does not change that limitation, although the
> offset does have of course to be taken into account by `dma_map_single',
> etc. in determining whether the memory area requested for use by a DMA
> device can be used directly or whether a bounce buffer will be required
> for that mapping.
Ah... memory that is known to be DMA compatible is allocated separately,
and then handed over to the DMA subsystem using dma_declare_coherent_memory.
This is done once during driver initialisation. The drivers ohci-sm501.c and
ohci-tmio.c do that too, which is why I suspect they might be broken as well.
The SM501 driver has this explanation:
/* The sm501 chip is equipped with local memory that may be used
* by on-chip devices such as the video controller and the usb host.
* This driver uses dma_declare_coherent_memory() to make sure
* usb allocations with dma_alloc_coherent() allocate from
* this local memory. The dma_handle returned by dma_alloc_coherent()
* will be an offset starting from 0 for the first local memory byte.
*
* So as long as data is allocated using dma_alloc_coherent() all is
* fine. This is however not always the case - buffers may be allocated
* using kmalloc() - so the usb core needs to be told that it must copy
* data into our local memory if the buffers happen to be placed in
* regular memory. The HCD_LOCAL_MEM flag does just that.
*/
retval = dma_declare_coherent_memory(dev, mem->start,
mem->start - mem->parent->start,
resource_size(mem),
DMA_MEMORY_EXCLUSIVE);
The corresponding code in the PS2 OHCI driver does
ps2priv->iop_dma_addr = iop_alloc(size);
if (ps2priv->iop_dma_addr == 0) {
dev_err(dev, "iop_alloc failed\n");
return -ENOMEM;
}
if (dma_declare_coherent_memory(dev,
iop_bus_to_phys(ps2priv->iop_dma_addr),
ps2priv->iop_dma_addr, size, flags)) {
dev_err(dev, "dma_declare_coherent_memory failed\n");
iop_free(ps2priv->iop_dma_addr);
ps2priv->iop_dma_addr = 0;
return -ENOMEM;
}
where iop_alloc is a special IOP memory allocation function and its return
value stored in iop_dma_addr is handed over to dma_declare_coherent_memory.
> > but it does not seem to be entirely clear, since his
> > commit message said that
> >
> > A somewhat similar line of reasoning also applies at the other end for
> > the mask check in dma_alloc_attrs() too - indeed, a device which cannot
> > access anything other than its own local memory probably *shouldn't*
> > have a valid mask for the general coherent DMA API.
>
> Well, how can such a device use the DMA API in the first place? If the
> device has local memory, then the driver has to manage it itself somehow
> if needed, and then arrange copying it to main memory, either by a CPU or
> a third-party DMA controller (data mover) if available. Of course in the
> latter case a driver for the DMA controller may have to use the DMA API.
The coherently declared memory given to the DMA subsystem is used for a
fixed sized DMA pool and no additional allocations are permitted. One could
choose a DMA mask that pretends to be reasonable, or the opposite, a mask
such as 1 that is unreasonable on purpose, as Robin writes:
Alternatively, there is perhaps some degree of argument for
deliberately picking a nonzero but useless value like 1,
although it looks like the MIPS allocator (at least the dma-
default one) never actually checks whether the page it gets
is within range of the device's coherent mask, which it
probably should do.
https://lkml.org/lkml/2018/7/6/697
> I'll be resubmitting a driver for such a device shortly, the DEFZA (the
> previous submission can be found here:
> <https://marc.info/?l=linux-netdev&m=139841853827404>). It is interesting
> in that the FDDI engine supports host DMA on the reception side (and
> consequently the driver uses the DMA API to handle that), while on the
> transmission side (as well as with a couple of maintenance queues) it only
> does DMA with its onboard buffer memory, the contents of which need to be
> copied by the CPU. So there's no use of the DMA API on the transmission
> or maintenance side. However usual DMA rings (all located in board memory
> too) are used for all data moves.
The DMA for its onboard buffer memory appears to be very similar to the
IOP and its DMA? That memory is currently copied by the EE, but there are
other DMA controllers that could handle that, possibly synchronised using
DMA chaining, which would assist the EE significantly.
Apart from USB, the IOP does networking, FireWire, harddisks, etc. Some
or all of the peripherals could be accelerated with DMA, which is an
interesting challenge.
> The DEFTA is a follow-up and an upgrade to the DEFZA, more integrated
> (the DEFZA uses a pair of PCBs while the DEFTA fits on one, of the size of
> each in the former pair), and with the extra silicon space gained it was
> possible to squeeze in circuitry required to do host DMA for all data
> moves, and also the DMA rings.
Nice. :)
> > A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of
> > DMA bounce buffer. Are you using anything similar with your DEFTA driver?
>
> The driver does need either an IOMMU or bounce buffers in system RAM in
> the case of 64-bit PCI systems, as the PFI PCI ASIC that the FDDI PDQ ASIC
> interfaces on the DEFPA does not AFAIK support 64-bit addressing (be it
> directly or with the use of DAC), although the PDQ itself does support
> 48-bit addressing (i.e. DMA descriptor addresses hold bits 47:2 of host
> addresses), which would be sufficient for the usual cases.
>
> Not in the DEFTA (or for that matter DEFEA; possibly the only EISA device
> using the DMA API) case though, as the most equipped TURBOchannel systems,
> i.e. the DEC 3000 AXP models 500, 800 and 900 only support up to 1GiB of
> memory, which is well below the 34-bit addressing limit.
>
> The PDQ ASIC was used to interface FDDI to many host buses and in
> addition to the 3 bus attachments mentioned above, all of which we have
> support for in Linux, it was also used for Q-bus (the DEFQA) and FutureBus
> (the DEFAA). We may have support for the DEFQA one day as I have both
> such a board and a suitable system to use it with. We are unlikely to
> have support for the DEFAA, as FutureBus was only used in high-end VAX and
> Alpha systems, the size of a full 19" rack at the very least, but it is
> there I believe only that the full PDQ addressing capability was actually
> utilised.
Thanks! By the way, is it possible to find spare parts for such vintage
hardware these days in case of irreparable failures?
> NB I sat on this fix from 2014, well before the warning was introduced in
> the first place, and it's only now that I got to unloading my patch queue.
> :(
Do you have the latest kernel running on your DECstation machines now? :)
Fredrik
* Re: [PATCH] TC: Set DMA masks for devices
2018-10-05 14:56 ` Fredrik Noring
@ 2018-10-05 22:52 ` Maciej W. Rozycki
2018-10-06 9:21 ` Fredrik Noring
0 siblings, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2018-10-05 22:52 UTC
To: Fredrik Noring; +Cc: Ralf Baechle, linux-mips, Jürgen Urban
Hi Fredrik,
> > I take it you mean 0-0x1fffff obviously; let's be accurate in a technical
> > discussion and avoid ambiguous cases.
>
> That's interesting. :) 0x1fffff is not a valid DMA address due to alignment
> restrictions, so if one wants to indicate a closed [inclusive] DMA address
> interval it would be 0-0x1ffffc, since the 32-bit word rather than the byte
> is the unit of the IOP DMA. In mathematics and programming languages it is
> often convenient to work with half-open intervals denoted by "[0,0x200000)"
> in this case. I think both notations are technically accurate, but they do
> emphasize different aspects of addresses and memory. I can switch to your
> byte-centric notation if that helps. :)
Well, the byte at 0x1fffff may not be individually addressable by this
particular DMA engine, but surely it is there included in DMA transfers
accessing the location that spans it. If instead you prefer to use the
mathematical notation to specify inclusive/exclusive ranges, then of
course I'm fine with that too.
> > And indeed e.g. `dma_map_single' does handle that and given a CPU-side
> > physical memory address returns a corresponding DMA-side address. And the
> > DMA mask has to reflect that and describe the DMA side, as it's the device
> > side that has an address space limitation here and any offset resulting
> > from a non-identity mapping does not change that limitation, although the
> > offset does have of course to be taken into account by `dma_map_single',
> > etc. in determining whether the memory area requested for use by a DMA
> > device can be used directly or whether a bounce buffer will be required
> > for that mapping.
>
> Ah... memory that is known to be DMA compatible is allocated separately,
> and then handed over to the DMA subsystem using dma_declare_coherent_memory.
Well, that does specify both a CPU-side and a corresponding DMA-side
address too.
> This is done once during driver initialisation. The drivers ohci-sm501.c and
> ohci-tmio.c do that too, which is why I suspect they might be broken as well.
>
> The SM501 driver has this explanation:
>
> /* The sm501 chip is equipped with local memory that may be used
> * by on-chip devices such as the video controller and the usb host.
> * This driver uses dma_declare_coherent_memory() to make sure
> * usb allocations with dma_alloc_coherent() allocate from
> * this local memory. The dma_handle returned by dma_alloc_coherent()
> * will be an offset starting from 0 for the first local memory byte.
From the description I take it it is some MMIO memory rather than host
memory. I fail to see how it is supposed to work with these calls for
non-system memory, which certainly any MMIO memory is, which surely is not
under the supervision of the kernel memory allocator.
There are calls for MMIO memory defined in the DMA API, specifically
`dma_map_resource' and `dma_unmap_resource'. I've never used them myself,
and I gather they provide you with a way for CPUs to access MMIO memory
with caching enabled and without the need to use the MMIO accessors only,
such as `readl', `writel', etc., which are expected to avoid going through
any CPU cache. Maybe these are what you're after?
But maybe I'm missing something.
> *
> * So as long as data is allocated using dma_alloc_coherent() all is
> * fine. This is however not always the case - buffers may be allocated
> * using kmalloc() - so the usb core needs to be told that it must copy
> * data into our local memory if the buffers happen to be placed in
> * regular memory. The HCD_LOCAL_MEM flag does just that.
> */
This raises a hack alert to me TBH.
> > Well, how can such a device use the DMA API in the first place? If the
> > device has local memory, then the driver has to manage it itself somehow
> > if needed, and then arrange copying it to main memory, either by a CPU or
> > a third-party DMA controller (data mover) if available. Of course in the
> > latter case a driver for the DMA controller may have to use the DMA API.
>
> The coherently declared memory given to the DMA subsystem is used for a
> fixed sized DMA pool and no additional allocations are permitted. One could
> choose a DMA mask that pretends to be reasonable, or the opposite, a mask
> such as 1 that is unreasonable on purpose, as Robin writes:
>
> Alternatively, there is perhaps some degree of argument for
> deliberately picking a nonzero but useless value like 1,
> although it looks like the MIPS allocator (at least the dma-
> default one) never actually checks whether the page it gets
> is within range of the device's coherent mask, which it
> probably should do.
>
> https://lkml.org/lkml/2018/7/6/697
It does look like an API abuse to me, as I noted above.
> > I'll be resubmitting a driver for such a device shortly, the DEFZA (the
> > previous submission can be found here:
> > <https://marc.info/?l=linux-netdev&m=139841853827404>). It is interesting
> > in that the FDDI engine supports host DMA on the reception side (and
> > consequently the driver uses the DMA API to handle that), while on the
> > transmission side (as well as with a couple of maintenance queues) it only
> > does DMA with its onboard buffer memory, the contents of which need to be
> > copied by the CPU. So there's no use of the DMA API on the transmission
> > or maintenance side. However usual DMA rings (all located in board memory
> > too) are used for all data moves.
>
> The DMA for its onboard buffer memory appears to be very similar to the
> IOP and its DMA? That memory is currently copied by the EE, but there are
> other DMA controllers that could handle that, possibly synchronised using
> DMA chaining, which would assist the EE significantly.
Mind that the DEFZA runs its own RTOS for initialization and management
support, including in particular SMT (Station Management). This is run on
an MC68000 processor. That processor is interfaced to a bus where board
memory is attached as well as the RMC (Ring Memory Controller) chip, which
acts as a DMA master on that bus, as does the host bus interface. Also
certain control register writes from the host raise interrupts to the
MC68000 for special situations to handle.
All the PDQ-based FDDI adapters also have an M68000 which runs an RTOS,
however the presence of the PDQ ASIC makes their architecture slightly
different as the FDDI chipset does host DMA via the PDQ ASIC, which acts
as a master on the host bus (possibly through a bridge chip like the PFI,
though TURBOchannel for example is interfaced directly).
These adapters went through several revisions, all using the Motorola
FDDI chipset (originally designed by DEC and then sold to Motorola for
fabrication and marketing, with DEC retaining an unlimited licence to
use), but with the PDQ (Packet Data Queue, I believe; not officially
confirmed) replacing the FSI (FDDI System Interface) block, and the CAMEL
(MAC and ELM (Media Access Controller and Elasticity Buffer and Link
Management)) and FCG (FDDI Clock Generator) blocks both retained.
> > The PDQ ASIC was used to interface FDDI to many host buses and in
> > addition to the 3 bus attachments mentioned above, all of which we have
> > support for in Linux, it was also used for Q-bus (the DEFQA) and FutureBus
> > (the DEFAA). We may have support for the DEFQA one day as I have both
> > such a board and a suitable system to use it with. We are unlikely to
> > have support for the DEFAA, as FutureBus was only used in high-end VAX and
> > Alpha systems, the size of a full 19" rack at the very least, but it is
> > there I believe only that the full PDQ addressing capability was actually
> > utilised.
>
> Thanks! By the way, is it possible to find spare parts for such vintage
> hardware these days in case of irreparable failures?
What do you mean by spare parts? ICs? Complete modules can certainly be
chased, though obviously there are the more common ones, and then there
are the exotic ones.
The biggest challenge has turned out to be electrolytic capacitor
failures in power supplies. Unfortunately, from the late 1980s to the mid
1990s several lines of low-ESR capacitors, used in output filters in
switch-mode PSUs, were made with a new electrolyte formula based on a
quaternary ammonium salt. They have all turned out to suffer from excessive
corrosion caused by that electrolyte, shortening the lifespan of those
parts well below the expectations even in the enhanced lines specifically
made with long life in mind. Consequently those parts start leaking even
if unused (or indeed never used) and then obviously cause PSU breakage if
powered up.
Those were all from reputable manufacturers, such as Chemi-con, Nichicon
or Panasonic; not to be confused with the bulged capacitor problem, aka
capacitor plague, which many ATX PSUs suffered from between the mid 1990s
and the mid 2000s, where cheap parts were used from less reputable
manufacturers.
Sadly I have ruined a couple of PSUs before I realised what the problem
was and I have been struggling since with tracking down other parts that
have failed as a result. I plan to get back to it sometime.
Some DECstation models are affected, as is other DEC (and non-DEC)
hardware:
* The 5000/200, /240 and /260 are not affected.
* The 2100 and 3100 are not affected if stored in their working
  orientation, as the capacitors are mounted leads up in their PSUs and
  corrosion only breaks the seal and not the aluminium can.
* The 5000/120, /125, /133 and /150 are all affected and are better
  recapped -- all SXF Chemi-con parts have to be replaced at the very
  least. Newer PSUs use newer LXF Chemi-con parts that haven't failed for
  me (yet?), but are expected to fail too.
* I can't speak for the 5000/20, /25, /33, /50 as I haven't got one of
  these.
* Other pieces of hardware would have to be inspected by their respective
  owners, e.g. I had a case where I had to recap the PSU of a small Cisco
  Ethernet switch with an FDDI bridge module from that era (that actually
  used a stock industrial PSU you can still buy new, although at ~£500 +
  VAT -- not exactly cheaply).
Other parts that have been failing are the usual Dallas RTC chips, whose
integrated lithium coin cell gets depleted; either the DS1287 or the
DS1287A depending on the specific model of hardware. DECstations have
these chips
located in the TURBOchannel slot area with little clearance around them.
Therefore I have been slowly converting them to a version with a discrete
coin cell embedded in the IC case instead, as photographically documented
here: <ftp://ftp.linux-mips.org/pub/linux/mips/people/macro/ds1287/>.
You can still get recently manufactured brand new DS12887 or DS12887A
parts from Maxim through the usual distribution channels, however for
reference systems, such as I consider mine, I prefer to use original parts
to avoid surprises, as the DS12887/A chips have 114 bytes of general NVRAM
as opposed to 50 bytes with the DS1287/A.
NB according to HP, end of sales for the DEFPA was only in 2004-2005 and
based on occasional enquiries I get as the maintainer it remains deployed
in production environments. These boards remain readily available on the
second-hand market; sometimes you can even get unused old stock.
Unless you look for the less common SMF variants, that is. I own a couple
of universal-PCI DEFPA boards that use the most recent PFI-3 ASIC (earlier
versions were 5V-only), some of which have HP recorded as the vendor in
the subsystem ID.
Also new TURBOchannel option hardware has been designed and manufactured
recently, see: <http://www.flxd.de/tc-usb/>. :) We'll get a Linux driver
sometime.
> > NB I sat on this fix from 2014, well before the warning was introduced in
> > the first place, and it's only now that I got to unloading my patch queue.
> > :(
>
> Do you have the latest kernel running on your DECstation machines now? :)
Yep:
Linux version 4.19.0-rc6 (macro@tp) (gcc version 4.1.2) #3 Mon Oct 1 00:22:03 BST 2018
bootconsole [prom0] enabled
This is a DECstation 5000/2x0
CPU0 revision is: 00000440 (R4400SC)
FPU revision is: 00000500
Checking for the multiply/shift bug... no.
Checking for the daddiu bug... yes, workaround... yes.
Determined physical RAM map:
memory: 0000000004000000 @ 0000000000000000 (usable)
Primary instruction cache 16kB, VIPT, direct mapped, linesize 16 bytes.
Primary data cache 16kB, direct mapped, VIPT, no aliases, linesize 16 bytes
Unified secondary cache 1024kB direct mapped, linesize 32 bytes.
Zone ranges:
Normal [mem 0x0000000000000000-0x0000000003ffffff]
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000000000000-0x0000000003ffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x0000000003ffffff]
On node 0 totalpages: 4096
Normal zone: 14 pages used for memmap
Normal zone: 0 pages reserved
Normal zone: 4096 pages, LIFO batch:0
pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
pcpu-alloc: [0] 0
Built 1 zonelists, mobility grouping off. Total pages: 4082
Kernel command line: rw console=ttyS3 debug panic=60 ip=bootp root=/dev/nfs
Dentry cache hash table entries: 8192 (order: 2, 65536 bytes)
Inode-cache hash table entries: 4096 (order: 1, 32768 bytes)
Memory: 57632K/65536K available (5279K kernel code, 338K rwdata, 1004K rodata, 272K init, 216K bss, 7904K reserved, 0K cma-reserved)
NR_IRQS: 128
I/O ASIC clock frequency 24999536Hz
clocksource: dec-ioasic: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 76451836814 ns
sched_clock: 32 bits at 24MHz, resolution 40ns, wraps every 85900940267ns
MIPS counter frequency 60000464Hz
clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 31854094440 ns
sched_clock: 32 bits at 60MHz, resolution 16ns, wraps every 35791117303ns
Console: colour dummy device 160x64
console [ttyS3] enabled
bootconsole [prom0] disabled
Calibrating delay loop... 59.33 BogoMIPS (lpj=231424)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 0, 16384 bytes)
Mountpoint-cache hash table entries: 2048 (order: 0, 16384 bytes)
Checking for the daddi bug... no.
random: get_random_u32 called from bucket_table_alloc+0xbc/0x2e8 with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 14931722236523437 ns
futex hash table entries: 256 (order: -2, 6144 bytes)
NET: Registered protocol family 16
Can't analyze schedule() prologue at (____ptrval____)
HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages
SCSI subsystem initialized
tc: TURBOchannel rev. 1 at 25.0 MHz (without parity)
tc0: DEC PMAG-AA V1.0a
tc1: DEC PMAF-FD V3.1D
tc2: DEC PMAF-AA T5.2P-
clocksource: Switched to clocksource MIPS
NET: Registered protocol family 2
tcp_listen_portaddr_hash hash table entries: 1024 (order: 0, 16384 bytes)
TCP established hash table entries: 2048 (order: 0, 16384 bytes)
TCP bind hash table entries: 2048 (order: 0, 16384 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
UDP hash table entries: 512 (order: 0, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 0, 16384 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
workingset: timestamp_bits=62 max_order=12 bucket_order=0
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
Console: switching to mono frame buffer device 160x64
fb0: PMAG-AA frame buffer device at tc0
DECstation Z85C30 serial driver version 0.10
ttyS0 at MMIO 0x1f900008 (irq = 14, base_baud = 460800) is a Z85C30 SCC
ttyS1 at MMIO 0x1f900000 (irq = 14, base_baud = 460800) is a Z85C30 SCC
ttyS2 at MMIO 0x1f980008 (irq = 15, base_baud = 460800) is a Z85C30 SCC
ttyS3 at MMIO 0x1f980000 (irq = 15, base_baud = 460800) is a Z85C30 SCC
ms02-nv.c: v.1.0.0 13 Aug 2001 Maciej W. Rozycki.
mtd0: DEC MS02-NV NVRAM at 0x07000000, size 1MiB.
declance.c: v0.011 by Linux MIPS DECstation task force
declance0: IOASIC onboard LANCE, addr = 08:00:2b:35:62:c1, irq = 16
declance0: registered as eth0.
defxx: v1.11 2014/07/01 Lawrence V. Stefani and others
random: fast init done
tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29
tc1: registered as fddi0
defza: v.1.1.4 Oct 2 2018 Maciej W. Rozycki
tc2: DEC FDDIcontroller 700 or 700-C at 0x1f000000, irq 21
tc2: resetting the board...
tc2: OK
tc2: model 700 (DEFZA-AA), MMF PMD, address 08-00-2b-2e-6d-75
tc2: ROM rev. 1.0, firmware rev. 1.2, RMC rev. A, SMT ver. 1
tc2: link unavailable
tc2: registered as fddi1
mousedev: PS/2 mouse device common for all mice
rtc_cmos rtc_cmos: registered as rtc0
rtc_cmos rtc_cmos: no alarms, 50 bytes nvram
NET: Registered protocol family 10
Segment Routing with IPv6
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
rtc_cmos rtc_cmos: setting system clock to 2018-10-01 00:45:12 UTC (1538354712)
Sending BOOTP requests . OK
IP-Config: Got BOOTP answer from xxx.xxx.xxx.xxx, my address is xxx.xxx.xxx.xxx
IP-Config: Complete:
device=eth0, hwaddr=08:00:2b:35:62:c1, ipaddr=xxx.xxx.xxx.xxx, mask=xxx.xxx.xxx.xxx, gw=xxx.xxx.xxx.xxx
fddi1: link available
host=hhh.hhh.hhh.hhh, domain=, nis-domain=(none)
bootserver=xxx.xxx.xxx.xxx, rootserver=xxx.xxx.xxx.xxx, rootpath=/ddd/ddd
nameserver0=xxx.xxx.xxx.xxx
fddi1: link unavailable
VFS: Mounted root (nfs filesystem) on device 0:11.
Freeing unused PROM memory: 112k freed
Freeing unused kernel memory: 272K
This architecture does not have kernel memory protection.
Run /sbin/init as init process
[...]
I had to revert recent changes forcing the minimum of GCC 4.6, and then
patch up the breakage that was the motivation for the version bump, as I
cannot easily upgrade my compiler (GCC 4.1.2 is the newest one I was able
to get working without NPTL); upgrading it will be a process.
Still 4.18 can be used pristine with CONFIG_32BIT, except for a recent
build breakage with the RTC driver, my small fix for which has already
been accepted. I think 4.17 will build and boot just fine out of the box,
and I expect the RTC fix to be backported to 4.18 too.
For CONFIG_64BIT a fix for memory corruption with `memset' is required
that applies to 4.17 and later versions, and is pending maintainer's
acceptance. So I think 4.16 will work just fine, but you need the
toolchain (GCC+binutils) from my site with the DADDI and DADDIU workarounds
implemented to build such a kernel. I think the workarounds will never
make it upstream due to their intrusiveness, but I mean to maintain them
indefinitely (though as I mentioned above it'll take me a little while yet
to get beyond GCC 4.1.2).
Maciej
* Re: [PATCH] TC: Set DMA masks for devices
2018-10-05 22:52 ` Maciej W. Rozycki
@ 2018-10-06 9:21 ` Fredrik Noring
2018-10-14 23:51 ` Maciej W. Rozycki
0 siblings, 1 reply; 9+ messages in thread
From: Fredrik Noring @ 2018-10-06 9:21 UTC (permalink / raw)
To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban
Hi Maciej,
> > Ah... memory that is known to be DMA compatible is allocated separately,
> > and then handed over to the DMA subsystem using dma_declare_coherent_memory.
>
> Well, that does specify both a CPU-side and a corresponding DMA-side
> address too.
Yes, side-stepping any practical use of a DMA mask, which is why it could
probably have an arbitrary value, except 0, which causes this warning.
> > This is done once during driver initialisation. The drivers ohci-sm501.c and
> > ohci-tmio.c do that too, which is why I suspect they might be broken as well.
> >
> > The SM501 driver has this explanation:
> >
> > /* The sm501 chip is equipped with local memory that may be used
> > * by on-chip devices such as the video controller and the usb host.
> > * This driver uses dma_declare_coherent_memory() to make sure
> > * usb allocations with dma_alloc_coherent() allocate from
> > * this local memory. The dma_handle returned by dma_alloc_coherent()
> > * will be an offset starting from 0 for the first local memory byte.
>
> From the description I take it it is some MMIO memory rather than host
> memory. I fail to see how it is supposed to work with these calls for
> non-system memory, which certainly any MMIO memory is, which surely is not
> under the supervision of the kernel memory allocator.
I agree, this is obscure to me too.
> There are calls for MMIO memory defined in the DMA API, specifically
> `dma_map_resource' and `dma_unmap_resource'. I've never used them myself,
> and I gather they provide you with a way for CPUs to access MMIO memory
> with caching enabled and without the need to use the MMIO accessors only,
> such as `readl', `writel', etc., which are expected to avoid going through
> any CPU cache. Maybe these are what you're after?
>
> But maybe I'm missing something.
That is handled within the USB OHCI subsystem. I don't know the details,
actually.
> > *
> > * So as long as data is allocated using dma_alloc_coherent() all is
> > * fine. This is however not always the case - buffers may be allocated
> > * using kmalloc() - so the usb core needs to be told that it must copy
> > * data into our local memory if the buffers happen to be placed in
> > * regular memory. The HCD_LOCAL_MEM flag does just that.
> > */
>
> This raises a hack alert to me TBH.
Christoph Hellwig raised concerns too, but I don't know how an OHCI driver
could do things differently given the circumstances, at least for a simple
initial implementation. For sure, the IOP has the capability and was most
likely designed for handling USB devices and other peripherals to a much
greater extent than allowed by the current PS2 OHCI driver, where the EE
manipulates the OHCI registers directly, which is quite inefficient.
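To make the arrangement described in the SM501 comment concrete, here is a minimal userspace sketch, not the kernel implementation: allocations come from a device-local pool, the DMA handle is simply the offset from the first local byte, and a buffer that lives in regular memory is copied (bounced) into the pool first. All names here (`local_pool`, `local_alloc`, `bounce_to_local`, `LOCAL_MEM_SIZE`) are hypothetical and chosen for illustration only; the real work is done by dma_declare_coherent_memory() and the USB core.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOCAL_MEM_SIZE 4096u	/* hypothetical size of the device-local memory */

struct local_pool {
	uint8_t mem[LOCAL_MEM_SIZE];	/* stands in for the device-local memory */
	size_t next;			/* bump-allocation cursor */
};

/* Allocate from local memory; the "DMA handle" is an offset from 0. */
static void *local_alloc(struct local_pool *p, size_t len, uint32_t *handle)
{
	size_t aligned = (len + 3u) & ~(size_t)3u;	/* keep 4-byte alignment */

	if (aligned < len || aligned > LOCAL_MEM_SIZE - p->next)
		return NULL;
	*handle = (uint32_t)p->next;
	p->next += aligned;
	return p->mem + *handle;
}

/*
 * Make a buffer visible to the device.  Returns 0 if the buffer already
 * lives in local memory (no copy), 1 if it had to be copied there, and
 * -1 if the pool is exhausted.
 */
static int bounce_to_local(struct local_pool *p, const void *buf,
			   size_t len, uint32_t *handle)
{
	uintptr_t base = (uintptr_t)p->mem;
	uintptr_t addr = (uintptr_t)buf;
	void *dst;

	if (addr >= base && addr - base < LOCAL_MEM_SIZE) {
		*handle = (uint32_t)(addr - base);
		return 0;
	}
	dst = local_alloc(p, len, handle);
	if (!dst)
		return -1;
	memcpy(dst, buf, len);
	return 1;
}
```

The point of the sketch is only that the handle never depends on a DMA mask: it is an offset into memory the device owns, which is why the mask value is moot in this setup.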
> > The DMA for its onboard buffer memory appears to be very similar to the
> > IOP and its DMA? That memory is currently copied by the EE, but there are
> > other DMA controllers that could handle that, possibly synchronised using
> > DMA chaining, which would assist the EE significantly.
>
> Mind that the DEFZA runs its own RTOS for initialization and management
> support, including in particular SMT (Station Management). This is run on
> an MC68000 processor. That processor is interfaced to a bus where board
> memory is attached as well as the RMC (Ring Memory Controller) chip, which
> acts as a DMA master on that bus, like does the host bus interface. Also
> certain control register writes from the host raise interrupts to the
> MC68000 for special situations to handle.
>
> All the PDQ-based FDDI adapters also have an M68000 which runs an RTOS,
> however the presence of the PDQ ASIC makes their architecture slightly
> different as the FDDI chipset does host DMA via the PDQ ASIC, which acts
> as a master on the host bus (possibly through a bridge chip like the PFI,
> though TURBOchannel for example is interfaced directly).
How is its firmware handled? The Linux MIPS wiki entry for the DECstation
firmware
https://www.linux-mips.org/wiki/DECstation#Firmware
is a TODO. :) The main reason I'm asking is that the IOP is a MIPS R3000
(apparently in later product models replaced with a PowerPC 405GP and its
DECKARD software emulator) that also needs firmware. The IOP most likely
ought to handle multiple firmware files, in the IRX format, depending on
its set of services.
Have you implemented sysfs structures to inspect the DEFZA RTOS? That is
something I would like to do for the IOP.
> The biggest challenge has turned out to be electrolytic capacitor
> failures in power supplies. Unfortunately in late 1980s to mid 1990s
> several lines of low-ESR capacitors, used in output filters in switch-mode
> PSUs, were made with a new electrolyte formula based on a quaternary
> ammonium salt. All they have turned out to suffer from excessive
> corrosion caused by that electrolyte, shortening the lifespan of those
> parts well below the expectations even in the enhanced lines specifically
> made with long life in mind. Consequently those parts start leaking even
> if unused (or indeed never used) and then obviously cause PSU breakage if
> powered up.
>
> Those were all from reputable manufacturers, such as Chemi-con, Nichicon
> or Panasonic; not to be confused with the bulged capacitor problem, aka
> capacitor plague, which many ATX PSUs have suffered from mid 1990s to mid
> 2000s where cheap parts were used from less reputable manufacturers.
Interesting!
> This is a DECstation 5000/2x0
> CPU0 revision is: 00000440 (R4400SC)
> FPU revision is: 00000500
> Checking for the multiply/shift bug... no.
> Checking for the daddiu bug... yes, workaround... yes.
> Determined physical RAM map:
> memory: 0000000004000000 @ 0000000000000000 (usable)
Considering the amount of memory, how do you compile for it?
Fredrik
* Re: [PATCH] TC: Set DMA masks for devices
@ 2018-10-14 23:51 ` Maciej W. Rozycki
0 siblings, 0 replies; 9+ messages in thread
From: Maciej W. Rozycki @ 2018-10-14 23:51 UTC (permalink / raw)
To: Fredrik Noring; +Cc: Ralf Baechle, linux-mips, Jürgen Urban
Hi Fredrik,
> > From the description I take it it is some MMIO memory rather than host
> > memory. I fail to see how it is supposed to work with these calls for
> > non-system memory, which certainly any MMIO memory is, which surely is not
> > under the supervision of the kernel memory allocator.
>
> I agree, this is obscure to me too.
I can't be bothered (sorry!) to study this code or the datasheet for the
IC to figure out what the arrangement is, but I do encourage you to do so
if you want to make any changes here.
> > Mind that the DEFZA runs its own RTOS for initialization and management
> > support, including in particular SMT (Station Management). This is run on
> > an MC68000 processor. That processor is interfaced to a bus where board
> > memory is attached as well as the RMC (Ring Memory Controller) chip, which
> > acts as a DMA master on that bus, like does the host bus interface. Also
> > certain control register writes from the host raise interrupts to the
> > MC68000 for special situations to handle.
> >
> > All the PDQ-based FDDI adapters also have an M68000 which runs an RTOS,
> > however the presence of the PDQ ASIC makes their architecture slightly
> > different as the FDDI chipset does host DMA via the PDQ ASIC, which acts
> > as a master on the host bus (possibly through a bridge chip like the PFI,
> > though TURBOchannel for example is interfaced directly).
>
> How is its firmware handled? The Linux MIPS wiki entry for the DECstation
> firmware
>
> https://www.linux-mips.org/wiki/DECstation#Firmware
>
> is a TODO. :)
I'm not sure who actually created that entry and what they had in mind.
Likely the console firmware and any of its peculiarities related to Linux.
> The main reason I'm asking is that the IOP is a MIPS R3000
> (apparently in later product models replaced with a PowerPC 405GP and its
> DECKARD software emulator) that also needs firmware. The IOP most likely
> ought to handle multiple firmware files, in the IRX format, depending on
> its set of services.
The firmware of these FDDI boards is stored in flash memory onboard, so
you don't need to do anything to load it as it boots by itself.
There is a documented way to flash a firmware image by fiddling with the
control registers appropriately, downloading the new image to board RAM
and then requesting the board to transfer the image to onboard flash.
From documentation I gather this process is done entirely by board
circuitry with no software involved on the board side, that is a failed
firmware flashing process does not preclude another attempt.
Normally to start initializing the board you just assert/deassert RESET
with one of the control registers and the board boots.
It takes the DEFZA 10s to boot (the documented amount of time for the
driver to wait for the bootstrap to complete is 30s). This is why I made
initialisation messages so verbose, so that the user is not confused and
does not conclude the kernel has hung.
You need to boot the board to retrieve its MAC address as the onboard
PROM chip holding the address is not accessible from the host side and
the address is only returned by the INIT command (NB there is no way to
override it either). There is an undocumented quicker way that the board's
console support code uses for presentation purposes in a system's console
monitor, but that's the board's internal protocol and I didn't want to risk
an incompatibility with some board revision out there.
Therefore the board driver requests its interrupt right away, sets a
timer, cycles RESET and puts the driver to sleep so that the system does
not become frozen if the driver is loaded as a module during normal Linux
operation. Then either a state change interrupt from the board or the
timer fires and the driver resumes from there accordingly.
After reboot a command has to be sent to the board to initialise the DMA
rings and it also takes a while, though not as much. My measurements
indicate 160ms, but it's obviously still too long for the driver to just
busy-wait there twiddling thumbs, so it puts itself to sleep too.
An unfortunate side effect of this design is that the IRQ handler is
called `tcX' rather than `fddiX', as observed in /proc/interrupts. Maybe
I'll propose a `rename_irq' API, however I'm not sure if it's worth it.
The board also has to be reset during normal operation if the so called
PC Trace (Physical Connection Trace) event has happened in the course of
FDDI ring fault recovery (i.e. when the token has been lost and could not
be restored with beaconing). That event causes the board to switch
into the halted state (the link status LED changes from green to red to
signify the problem) and the board has to be rebooted by the driver to
verify it's not this board that is the FDDI station having caused the ring
fault.
Then all the usual commands have to be sent to initialise the board, set
FDDI link parameters, add any CAM entries that were set before the reboot
and set the promiscuous mode if in use, and then finally join the ring.
So this is handled with an interrupt-driven state machine as otherwise
again the driver would have to freeze the system for the duration of all
this processing.
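As a rough illustration of the interrupt-driven approach described above, here is a minimal sketch of such a state machine. It is not the actual `defza' code: the state and event names are hypothetical, and the command sequence is collapsed into a few steps. Each call stands for one wakeup of the driver, caused either by a board interrupt or by the timer.

```c
/* Hypothetical driver states; each *_WAIT state means "command issued,
 * driver asleep until the board interrupts or the timer fires". */
enum bstate {
	B_RESET_WAIT,	/* RESET cycled, waiting for the board to boot */
	B_RINGS_WAIT,	/* waiting for DMA ring initialisation */
	B_PARAMS_WAIT,	/* waiting for FDDI link parameters to be set */
	B_CAM_WAIT,	/* waiting for CAM entries/promiscuous mode restore */
	B_RUNNING,	/* ring joined, normal operation */
	B_FAILED	/* the board did not respond in time */
};

/* Hypothetical wakeup causes. */
enum bevent {
	E_IRQ_DONE,	/* board signalled completion of the current step */
	E_TIMEOUT,	/* the timer fired before the board responded */
	E_PC_TRACE	/* PC Trace put the board into the halted state */
};

/* Advance the state machine by one wakeup; never busy-waits. */
static enum bstate board_step(enum bstate s, enum bevent e)
{
	if (s == B_FAILED)
		return B_FAILED;
	if (e == E_TIMEOUT)
		return s == B_RUNNING ? B_RUNNING : B_FAILED;
	if (e == E_PC_TRACE)
		return B_RESET_WAIT;	/* reboot the board, start over */

	switch (s) {
	case B_RESET_WAIT:
		return B_RINGS_WAIT;
	case B_RINGS_WAIT:
		return B_PARAMS_WAIT;
	case B_PARAMS_WAIT:
		return B_CAM_WAIT;
	case B_CAM_WAIT:
		return B_RUNNING;
	default:
		return s;
	}
}
```

Because every transition happens in response to an event, a slow step such as the 10s boot costs the system nothing while the driver sleeps.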
The PDQ-based adapters are much quicker, they boot in ~1s. However the
current `defxx' driver is flawed in that it does not handle that PC Trace
event with a state machine and it does freeze the system if that happens,
remaining in the hardirq context throughout. Also it may fail DMA buffer
allocation in the course of the reboot as it (unnecessarily) frees all the
buffers previously allocated and requests new ones instead.
I need to fix this all, modelling the solution after `defza', however I
want to upstream the latter driver first. Fortunately PC Trace events are
not that common, but earlier this year someone already complained
about this issue with `defxx' causing unacceptable latency problems with
their system, so I do need to look into it.
> Have you implemented sysfs structures to inspect the DEFZA RTOS? That is
> something I would like to do for the IOP.
There is no (documented) way to access the internals of board firmware
(except for the request to flash it). You only have access to the
onboard 1MiB of RAM and a bunch of control/status registers. Likewise
with the PDQ-based adapters, although their use of RAM is not clearly
documented (the PFI has a separate BAR for board RAM access) -- I find it
hard to believe they'd put 1MiB of RAM there only to support firmware
upgrades, so I think it is still used as a temporary packet buffer and
other operational purposes.
> > This is a DECstation 5000/2x0
> > CPU0 revision is: 00000440 (R4400SC)
> > FPU revision is: 00000500
> > Checking for the multiply/shift bug... no.
> > Checking for the daddiu bug... yes, workaround... yes.
> > Determined physical RAM map:
> > memory: 0000000004000000 @ 0000000000000000 (usable)
>
> Considering the amount of memory, how do you compile for it?
The kernel can be cross-compiled easily and with no pitfalls, so this is
what I have always been doing.
With userland builds most software packages can be cross-compiled, but I
prefer native builds indeed, as these do not require manual tweaking of
any parameters that cannot be inferred in cross-compilation (fortunately
modern versions of Autoconf are able to figure out what the sizes of data
types are even if cross-compiling, as setting these manually used to be a
real pain).
For those I usually use my Broadcom SWARM board, which is clocked at
800MHz and currently has 3200MiB of RAM (pending a firmware fix of DRAM
controller initialisation that will hopefully allow for the full 4GiB
possible with modules available on the market, out of the 8GiB theoretical
maximum).
The SWARM has switchable endianness with the line to control it at reset
wired to a PCB header used with a jumper as shipped. I have instead wired
it to an external switch mounted on a cover plate of an unused option
slot, so that I don't have to pull the system apart to change the
endianness.
I have better equipped DECstations at my remote site though; the maximum
amount of RAM the /200, /240 and /260 models accept is 480MiB. The
remaining 32MiB of space addressable via the KSEG0/KSEG1 spaces is used
for system ROM and MMIO (for onboard I/O circuitry and TURBOchannel).
TURBOchannel can also be accessed from 0x20000000 physical up (not with
the /200), for 3 slots of 512MiB of MMIO space each, however due to an API
shortcoming system firmware cannot cope with that (as documented on the
DECstation wiki).
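The address arithmetic above can be checked with a short sketch. The 512MiB figure is the physical window reachable through KSEG0/KSEG1 on MIPS; the helper names are hypothetical, and the assumption that the three high TURBOchannel slots are laid out back to back in slot order from 0x20000000 is mine for illustration, not something stated above.

```c
#include <stdint.h>

#define MIB		(UINT64_C(1) << 20)
#define KSEG_WINDOW	(512 * MIB)		/* physical space seen via KSEG0/KSEG1 */
#define MAX_RAM		(480 * MIB)		/* maximum RAM on the /200, /240, /260 */
#define TC_HIGH_BASE	UINT64_C(0x20000000)	/* start of the high TURBOchannel space */
#define TC_SLOT_SIZE	(512 * MIB)		/* MMIO space per slot */

/* Space left in the KSEG0/KSEG1 window for system ROM and MMIO. */
static uint64_t kseg_io_space(void)
{
	return KSEG_WINDOW - MAX_RAM;
}

/* Physical base of a high slot, ASSUMING contiguous in-order layout. */
static uint64_t tc_high_slot_base(unsigned int slot)
{
	return TC_HIGH_BASE + (uint64_t)slot * TC_SLOT_SIZE;
}
```

With these figures the window holds 480MiB of RAM plus 32MiB of ROM and MMIO, and three 512MiB slots starting at 0x20000000 end just below 0x80000000.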
Maciej
end of thread, other threads:[~2018-10-14 23:51 UTC | newest]
Thread overview: 9+ messages
2018-10-03 12:21 [PATCH] TC: Set DMA masks for devices Maciej W. Rozycki
2018-10-04 16:57 ` Fredrik Noring
2018-10-04 17:55 ` Fredrik Noring
2018-10-04 20:09 ` Maciej W. Rozycki
2018-10-05 14:56 ` Fredrik Noring
2018-10-05 22:52 ` Maciej W. Rozycki
2018-10-06 9:21 ` Fredrik Noring
2018-10-14 23:51 ` Maciej W. Rozycki
2018-10-14 23:51 ` Maciej W. Rozycki