* [PATCH] TC: Set DMA masks for devices
From: Maciej W. Rozycki @ 2018-10-03 12:21 UTC
To: Ralf Baechle; +Cc: linux-mips, linux-kernel

Fix a TURBOchannel support regression with commit 205e1b7f51e4
("dma-mapping: warn when there is no coherent_dma_mask") that caused
coherent DMA allocations to produce a warning such as:

defxx: v1.11 2014/07/01 Lawrence V. Stefani and others
tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 dfx_dev_register+0x670/0x678
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.0-rc6 #2
Stack : ffffffff8009ffc0 fffffffffffffec0 0000000000000000 ffffffff80647650
        0000000000000000 0000000000000000 ffffffff806f5f80 ffffffffffffffff
        0000000000000000 0000000000000000 0000000000000001 ffffffff8065d4e8
        98000000031b6300 ffffffff80563478 ffffffff805685b0 ffffffffffffffff
        0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
        0000000000000000 0000000000000009 ffffffff8053efd0 ffffffff806657d0
        0000000000000000 ffffffff803177f8 0000000000000000 ffffffff806d0000
        9800000003078000 980000000307b9e0 000000001e900000 ffffffff80067940
        0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8
        ffffffff805176c0 ffffffff8004dc78 0000000000000000 ffffffff80067940
        ...
Call Trace:
[<ffffffff8004dc78>] show_stack+0xa0/0x130
[<ffffffff80067940>] __warn+0x128/0x170
---[ end trace b1d1e094f67f3bb2 ]---

This is because the TURBOchannel bus driver fails to set the coherent
DMA mask for devices enumerated.

Set the regular and coherent DMA masks for TURBOchannel devices then,
observing that the bus protocol supports a 34-bit (16GiB) DMA address
space, by interpreting the value presented in the address cycle across
the 32 `ad' lines as a 32-bit word rather than byte address[1].  The
architectural size of the TURBOchannel DMA address space exceeds the
maximum amount of RAM any actual TURBOchannel system in existence may
have, hence both masks are the same.

This removes the warning shown above.

References:

[1] "TURBOchannel Hardware Specification", EK-369AA-OD-007B, Digital
    Equipment Corporation, January 1993, Section "DMA", pp. 1-15 -- 1-17

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: 205e1b7f51e4 ("dma-mapping: warn when there is no coherent_dma_mask")
Cc: stable@vger.kernel.org # 4.16+
---
 drivers/tc/tc.c    |    8 +++++++-
 include/linux/tc.h |    1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

linux-tc-dma-mask.patch
Index: linux-20180930-4maxp64/drivers/tc/tc.c
===================================================================
--- linux-20180930-4maxp64.orig/drivers/tc/tc.c
+++ linux-20180930-4maxp64/drivers/tc/tc.c
@@ -2,7 +2,7 @@
  *	TURBOchannel bus services.
  *
  *	Copyright (c) Harald Koerfgen, 1998
- *	Copyright (c) 2001, 2003, 2005, 2006 Maciej W. Rozycki
+ *	Copyright (c) 2001, 2003, 2005, 2006, 2018 Maciej W. Rozycki
  *	Copyright (c) 2005 James Simmons
  *
  *	This file is subject to the terms and conditions of the GNU
@@ -10,6 +10,7 @@
  *	directory of this archive for more details.
  */
 #include <linux/compiler.h>
+#include <linux/dma-mapping.h>
 #include <linux/errno.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
@@ -92,6 +93,11 @@ static void __init tc_bus_add_devices(st
 		tdev->dev.bus = &tc_bus_type;
 		tdev->slot = slot;
 
+		/* TURBOchannel has 34-bit DMA addressing (16GiB space). */
+		tdev->dma_mask = DMA_BIT_MASK(34);
+		tdev->dev.dma_mask = &tdev->dma_mask;
+		tdev->dev.coherent_dma_mask = DMA_BIT_MASK(34);
+
 		for (i = 0; i < 8; i++) {
 			tdev->firmware[i] =
 				readb(module + offset + TC_FIRM_VER + 4 * i);
Index: linux-20180930-4maxp64/include/linux/tc.h
===================================================================
--- linux-20180930-4maxp64.orig/include/linux/tc.h
+++ linux-20180930-4maxp64/include/linux/tc.h
@@ -84,6 +84,7 @@ struct tc_dev {
 					   device. */
 	struct device	dev;		/* Generic device interface. */
 	struct resource	resource;	/* Address space of this device. */
+	u64		dma_mask;	/* DMA addressable range. */
 	char		vendor[9];
 	char		name[9];
 	char		firmware[9];
* Re: [PATCH] TC: Set DMA masks for devices 2018-10-03 12:21 [PATCH] TC: Set DMA masks for devices Maciej W. Rozycki @ 2018-10-04 16:57 ` Fredrik Noring 2018-10-04 17:55 ` Fredrik Noring 2018-10-04 20:09 ` Maciej W. Rozycki 0 siblings, 2 replies; 9+ messages in thread From: Fredrik Noring @ 2018-10-04 16:57 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban Hi Maciej, > Fix a TURBOchannel support regression with commit 205e1b7f51e4 > ("dma-mapping: warn when there is no coherent_dma_mask") that caused > coherent DMA allocations to produce a warning such as: > > defxx: v1.11 2014/07/01 Lawrence V. Stefani and others > tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29 > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 dfx_dev_register+0x670/0x678 > Modules linked in: > CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.0-rc6 #2 > Stack : ffffffff8009ffc0 fffffffffffffec0 0000000000000000 ffffffff80647650 > 0000000000000000 0000000000000000 ffffffff806f5f80 ffffffffffffffff > 0000000000000000 0000000000000000 0000000000000001 ffffffff8065d4e8 > 98000000031b6300 ffffffff80563478 ffffffff805685b0 ffffffffffffffff > 0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8 > 0000000000000000 0000000000000009 ffffffff8053efd0 ffffffff806657d0 > 0000000000000000 ffffffff803177f8 0000000000000000 ffffffff806d0000 > 9800000003078000 980000000307b9e0 000000001e900000 ffffffff80067940 > 0000000000000000 ffffffff805d6720 0000000000000204 ffffffff80388df8 > ffffffff805176c0 ffffffff8004dc78 0000000000000000 ffffffff80067940 > ... > Call Trace: > [<ffffffff8004dc78>] show_stack+0xa0/0x130 > [<ffffffff80067940>] __warn+0x128/0x170 > ---[ end trace b1d1e094f67f3bb2 ]--- > > This is because the TURBOchannel bus driver fails to set the coherent > DMA mask for devices enumerated. Interesting! This warning is also triggered by the PS2 OHCI driver. 
Robin Murphy proposed the patch https://lkml.org/lkml/2018/7/3/507 that relaxed it and a related warning. Half of the patch was merged in commit d27fb99f62af7 while the other half (related to this warning) was rejected by Christoph Hellwig. The PS2 OHCI triggers the following trace: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 62 at ./include/linux/dma-mapping.h:516 ohci_setup+0x41c/0x424 [ohci_hcd] Modules linked in: ohci_ps2(+) ohci_hcd usbcore usb_common sd_mod iop iop_fio iop_module iop_memory sif CPU: 0 PID: 62 Comm: modprobe Not tainted 4.16.0+ #1533 Stack : 00000000 00000000 80747392 00000037 81c6eb0c 804f32e7 80493b24 0000003e 80743498 00000204 00000001 c01c0000 802a2fa0 10058c00 81ea5a68 804facc0 00000000 00000000 80740000 00000007 00000000 00000060 00000000 00000000 3a6d6d6f 00000000 0000005f 646f6d20 80000000 00000000 c01e66e8 c01e813c 00000009 00000204 00000001 c01c0000 00000018 80278fe0 0007579f 00000001 ... Call Trace: [<8001d6e4>] show_stack+0x74/0x104 [<800323a8>] __warn+0x118/0x120 [<8003246c>] warn_slowpath_null+0x44/0x58 [<c01e66e8>] ohci_setup+0x41c/0x424 [ohci_hcd] [<c01f209c>] ohci_ps2_reset+0x30/0x70 [ohci_ps2] [<c01a8aec>] usb_add_hcd+0x2d4/0x89c [usbcore] [<c01f2360>] ohci_hcd_ps2_probe+0x284/0x2a4 [ohci_ps2] [<802a8a74>] platform_drv_probe+0x2c/0x68 [<802a70b4>] driver_probe_device+0x22c/0x2e4 [<802a71f0>] __driver_attach+0x84/0xc8 [<802a53fc>] bus_for_each_dev+0x60/0x90 [<802a6580>] bus_add_driver+0x1b8/0x200 [<802a7980>] driver_register+0xc0/0x100 [<800106bc>] do_one_initcall+0x17c/0x190 [<800841f4>] do_init_module+0x74/0x1f0 [<80082f30>] load_module+0x1680/0x2044 [<80083adc>] SyS_finit_module+0xa0/0xb8 [<8002190c>] syscall_common+0x34/0x58 ---[ end trace e71738b5fa6bf9aa ]--- > Set the regular and coherent DMA masks for TURBOchannel devices then, > observing that the bus protocol supports a 34-bit (16GiB) DMA address > space, by interpreting the value presented in the address cycle across > the 32 `ad' lines as a 32-bit 
word rather than byte address[1]. The > architectural size of the TURBOchannel DMA address space exceeds the > maximum amount of RAM any actual TURBOchannel system in existence may > have, hence both masks are the same. A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to 0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask might correspond to the effective addressing capability, which would be DMA_BIT_MASK(21), but it does not seem to be entirely clear, since his commit message said that A somewhat similar line of reasoning also applies at the other end for the mask check in dma_alloc_attrs() too - indeed, a device which cannot access anything other than its own local memory probably *shouldn't* have a valid mask for the general coherent DMA API. A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of DMA bounce buffer. Are you using anything similar with your DEFTA driver? Fredrik ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] TC: Set DMA masks for devices
From: Fredrik Noring @ 2018-10-04 17:55 UTC
To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban

Hi Maciej,

> > Set the regular and coherent DMA masks for TURBOchannel devices then,
> > observing that the bus protocol supports a 34-bit (16GiB) DMA address
> > space, by interpreting the value presented in the address cycle across
> > the 32 `ad' lines as a 32-bit word rather than byte address[1]. The
> > architectural size of the TURBOchannel DMA address space exceeds the
> > maximum amount of RAM any actual TURBOchannel system in existence may
> > have, hence both masks are the same.
>
> A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to
> 0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask
> might correspond to the effective addressing capability, which would be
> DMA_BIT_MASK(21), but it does not seem to be entirely clear, since his
> commit message said that
>
>     A somewhat similar line of reasoning also applies at the other end for
>     the mask check in dma_alloc_attrs() too - indeed, a device which cannot
>     access anything other than its own local memory probably *shouldn't*
>     have a valid mask for the general coherent DMA API.
>
> A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of
> DMA bounce buffer. Are you using anything similar with your DEFTA driver?

Sorry, I didn't interpret your comment properly. With the TURBOchannel DMA
address space exceeding any practical amount of RAM, bounce buffers aren't
needed for that system. The situation is the reverse with the PS2 OHCI.

Fredrik
* Re: [PATCH] TC: Set DMA masks for devices 2018-10-04 16:57 ` Fredrik Noring 2018-10-04 17:55 ` Fredrik Noring @ 2018-10-04 20:09 ` Maciej W. Rozycki 2018-10-05 14:56 ` Fredrik Noring 1 sibling, 1 reply; 9+ messages in thread From: Maciej W. Rozycki @ 2018-10-04 20:09 UTC (permalink / raw) To: Fredrik Noring; +Cc: Ralf Baechle, linux-mips, Jürgen Urban Hi Fredrik, > A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to > 0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask > might correspond to the effective addressing capability, which would be > DMA_BIT_MASK(21), I take it you mean 0-0x1fffff obviously; let's be accurate in a technical discussion and avoid ambiguous cases. Well, the need to map between the CPU and the DMA address space is not uncommon. As I recall the Galileo/Marvell GT-64xxx system controllers have a BAR for PCI master accesses to local DRAM (so that multiple such controllers can coexist in a NUMA system) and any non-identity mapping has to be taken into account with DMA of course And indeed e.g. `dma_map_single' does handle that and given a CPU-side physical memory address returns a corresponding DMA-side address. And the DMA mask has to reflect that and describe the DMA side, as it's the device side that has an address space limitation here and any offset resulting from a non-identity mapping does not change that limitation, although the offset does have of course to be taken into account by `dma_map_single', etc. in determining whether the memory area requested for use by a DMA device can be used directly or whether a bounce buffer will be required for that mapping. 
> but it does not seem to be entirely clear, since his
> commit message said that
>
>     A somewhat similar line of reasoning also applies at the other end for
>     the mask check in dma_alloc_attrs() too - indeed, a device which cannot
>     access anything other than its own local memory probably *shouldn't*
>     have a valid mask for the general coherent DMA API.

 Well, how can such a device use the DMA API in the first place?  If the
device has local memory, then the driver has to manage it itself somehow
if needed, and then arrange copying it to main memory, either by a CPU or
a third-party DMA controller (data mover) if available.  Of course in the
latter case a driver for the DMA controller may have to use the DMA API.

 I'll be resubmitting a driver for such a device shortly, the DEFZA (the
previous submission can be found here:
<https://marc.info/?l=linux-netdev&m=139841853827404>).  It is interesting
in that the FDDI engine supports host DMA on the reception side (and
consequently the driver uses the DMA API to handle that), while on the
transmission side (as well as with a couple of maintenance queues) it only
does DMA with its onboard buffer memory, the contents of which need to be
copied by the CPU.  So there's no use of the DMA API on the transmission
or maintenance side.  However usual DMA rings (all located in board memory
too) are used for all data moves.

 The DEFTA is a follow-up and an upgrade to the DEFZA, more integrated
(the DEFZA uses a pair of PCBs while the DEFTA fits on one, of the size of
each in the former pair), and with the extra silicon space gained it was
possible to squeeze in circuitry required to do host DMA for all data
moves, and also the DMA rings.

> A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of
> DMA bounce buffer. Are you using anything similar with your DEFTA driver?

The driver does need either an IOMMU or bounce buffers in system RAM in the case of 64-bit PCI systems, as the PFI PCI ASIC that the FDDI PDQ ASIC interfaces on the DEFPA does not AFAIK support 64-bit addressing (be it directly or with the use of DAC), although the PDQ itself does support 48-bit addressing (i.e. DMA descriptor addresses hold bits 47:2 of host addresses), which would be sufficient for the usual cases. Not in the DEFTA (or for that matter DEFEA; possibly the only EISA device using the DMA API) case though, as the most equipped TURBOchannel systems, i.e. the DEC 3000 AXP models 500, 800 and 900 only support up to 1GiB of memory, which is well below the 34-bit addressing limit. The PDQ ASIC was used to interface FDDI to many host buses and in addition to the 3 bus attachments mentioned above, all of which we have support for in Linux, it was also used for Q-bus (the DEFQA) and FutureBus (the DEFAA). We may have support for the DEFQA one day as I have both such a board and a suitable system to use it with. We are unlikely to have support for the DEFAA, as FutureBus was only used in high-end VAX and Alpha systems, the size of a full 19" rack at the very least, but it is there I believe only that the full PDQ addressing capability was actually utilised. NB I sat on this fix from 2014, well before the warning was introduced in the first place, and it's only now that I got to unloading my patch queue. :( Maciej ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] TC: Set DMA masks for devices 2018-10-04 20:09 ` Maciej W. Rozycki @ 2018-10-05 14:56 ` Fredrik Noring 2018-10-05 22:52 ` Maciej W. Rozycki 0 siblings, 1 reply; 9+ messages in thread From: Fredrik Noring @ 2018-10-05 14:56 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban Hi Maciej, > > A complication with the PS2 OHCI is that DMA addresses 0-0x200000 map to > > 0x1c000000-0x1c200000 as seen by the kernel. Robin suggested that the mask > > might correspond to the effective addressing capability, which would be > > DMA_BIT_MASK(21), > > I take it you mean 0-0x1fffff obviously; let's be accurate in a technical > discussion and avoid ambiguous cases. That's interesting. :) 0x1fffff is not a valid DMA address due to alignment restrictions, so if one wants to indicate a closed [inclusive] DMA address interval it would be 0-0x1ffffc, since the 32-bit word rather than the byte is the unit of the IOP DMA. In mathematics and programming languages it is often convenient to work with half-open intervals denoted by "[0,0x200000)" in this case. I think both notations are technically accurate, but they do emphasize different aspects of addresses and memory. I can switch to your byte-centric notation if that helps. :) > Well, the need to map between the CPU and the DMA address space is not > uncommon. As I recall the Galileo/Marvell GT-64xxx system controllers > have a BAR for PCI master accesses to local DRAM (so that multiple such > controllers can coexist in a NUMA system) and any non-identity mapping has > to be taken into account with DMA of course > > And indeed e.g. `dma_map_single' does handle that and given a CPU-side > physical memory address returns a corresponding DMA-side address. 
And the
> DMA mask has to reflect that and describe the DMA side, as it's the device
> side that has an address space limitation here and any offset resulting
> from a non-identity mapping does not change that limitation, although the
> offset does have of course to be taken into account by `dma_map_single',
> etc. in determining whether the memory area requested for use by a DMA
> device can be used directly or whether a bounce buffer will be required
> for that mapping.

Ah... memory that is known to be DMA compatible is allocated separately,
and then handed over to the DMA subsystem using dma_declare_coherent_memory.
This is done once during driver initialisation. The drivers ohci-sm501.c and
ohci-tmio.c do that too, which is why I suspect they might be broken as well.

The SM501 driver has this explanation:

	/* The sm501 chip is equipped with local memory that may be used
	 * by on-chip devices such as the video controller and the usb host.
	 * This driver uses dma_declare_coherent_memory() to make sure
	 * usb allocations with dma_alloc_coherent() allocate from
	 * this local memory. The dma_handle returned by dma_alloc_coherent()
	 * will be an offset starting from 0 for the first local memory byte.
	 *
	 * So as long as data is allocated using dma_alloc_coherent() all is
	 * fine. This is however not always the case - buffers may be allocated
	 * using kmalloc() - so the usb core needs to be told that it must copy
	 * data into our local memory if the buffers happen to be placed in
	 * regular memory. The HCD_LOCAL_MEM flag does just that.
	 */

	retval = dma_declare_coherent_memory(dev, mem->start,
					 mem->start - mem->parent->start,
					 resource_size(mem),
					 DMA_MEMORY_EXCLUSIVE);

The corresponding code in the PS2 OHCI driver does

	ps2priv->iop_dma_addr = iop_alloc(size);
	if (ps2priv->iop_dma_addr == 0) {
		dev_err(dev, "iop_alloc failed\n");
		return -ENOMEM;
	}

	if (dma_declare_coherent_memory(dev,
			iop_bus_to_phys(ps2priv->iop_dma_addr),
			ps2priv->iop_dma_addr, size, flags)) {
		dev_err(dev, "dma_declare_coherent_memory failed\n");
		iop_free(ps2priv->iop_dma_addr);
		ps2priv->iop_dma_addr = 0;
		return -ENOMEM;
	}

where iop_alloc is a special IOP memory allocation function and its return
value stored in iop_dma_addr is handed over to dma_declare_coherent_memory.

> > but it does not seem to be entirely clear, since his
> > commit message said that
> >
> >     A somewhat similar line of reasoning also applies at the other end for
> >     the mask check in dma_alloc_attrs() too - indeed, a device which cannot
> >     access anything other than its own local memory probably *shouldn't*
> >     have a valid mask for the general coherent DMA API.
>
> Well, how can such a device use the DMA API in the first place? If the
> device has local memory, then the driver has to manage it itself somehow
> if needed, and then arrange copying it to main memory, either by a CPU or
> a third-party DMA controller (data mover) if available. Of course in the
> latter case a driver for the DMA controller may have to use the DMA API.

The coherently declared memory given to the DMA subsystem is used for a
fixed sized DMA pool and no additional allocations are permitted.
One could choose a DMA mask that pretends to be reasonable, or the opposite, a mask such as 1 that is unreasonable on purpose, as Robin writes: Alternatively, there is perhaps some degree of argument for deliberately picking a nonzero but useless value like 1, although it looks like the MIPS allocator (at least the dma- default one) never actually checks whether the page it gets is within range of the device's coherent mask, which it probably should do. https://lkml.org/lkml/2018/7/6/697 > I'll be resubmitting a driver for such a device shortly, the DEFZA (the > previous submission can be found here: > <https://marc.info/?l=linux-netdev&m=139841853827404>). It is interesting > in that the FDDI engine supports host DMA on the reception side (and > consequently the driver uses the DMA API to handle that), while on the > transmission side (as well as with a couple of maintenance queues) it only > does DMA with its onboard buffer memory, the contents of which need to be > copied by the CPU. So there's no use of the DMA API on the transmission > or maintenance side. However usual DMA rings (all located in board memory > too) are used for all data moves. The DMA for its onboard buffer memory appears to be very similar to the IOP and its DMA? That memory is currently copied by the EE, but there are other DMA controllers that could handle that, possibly synchronised using DMA chaining, which would assist the EE significantly. Apart from USB, the IOP does networking, FireWire, harddisks, etc. Some or all of the peripherals could be accelerated with DMA, which is an interesting challenge. > The DEFTA is a follow-up and an upgrade to the DEFZA, more integrated > (the DEFZA uses a pair of PCBs while the DEFTA fits on one, of the size of > each in the former pair), and with the extra silicon space gained it was > possible to squeeze in circuitry required to do host DMA for all data > moves, and also the DMA rings. Nice. 
:) > > A special circumstance here is the use of HCD_LOCAL_MEM that is a kind of > > DMA bounce buffer. Are you using anything similar with your DEFTA driver? > > The driver does need either an IOMMU or bounce buffers in system RAM in > the case of 64-bit PCI systems, as the PFI PCI ASIC that the FDDI PDQ ASIC > interfaces on the DEFPA does not AFAIK support 64-bit addressing (be it > directly or with the use of DAC), although the PDQ itself does support > 48-bit addressing (i.e. DMA descriptor addresses hold bits 47:2 of host > addresses), which would be sufficient for the usual cases. > > Not in the DEFTA (or for that matter DEFEA; possibly the only EISA device > using the DMA API) case though, as the most equipped TURBOchannel systems, > i.e. the DEC 3000 AXP models 500, 800 and 900 only support up to 1GiB of > memory, which is well below the 34-bit addressing limit. > > The PDQ ASIC was used to interface FDDI to many host buses and in > addition to the 3 bus attachments mentioned above, all of which we have > support for in Linux, it was also used for Q-bus (the DEFQA) and FutureBus > (the DEFAA). We may have support for the DEFQA one day as I have both > such a board and a suitable system to use it with. We are unlikely to > have support for the DEFAA, as FutureBus was only used in high-end VAX and > Alpha systems, the size of a full 19" rack at the very least, but it is > there I believe only that the full PDQ addressing capability was actually > utilised. Thanks! By the way, is it possible to find spare parts for such vintage hardware these days in case of irrepairable failures? > NB I sat on this fix from 2014, well before the warning was introduced in > the first place, and it's only now that I got to unloading my patch queue. > :( Do you have the latest kernel running on your DECstation machines now? :) Fredrik ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] TC: Set DMA masks for devices 2018-10-05 14:56 ` Fredrik Noring @ 2018-10-05 22:52 ` Maciej W. Rozycki 2018-10-06 9:21 ` Fredrik Noring 0 siblings, 1 reply; 9+ messages in thread From: Maciej W. Rozycki @ 2018-10-05 22:52 UTC (permalink / raw) To: Fredrik Noring; +Cc: Ralf Baechle, linux-mips, Jürgen Urban Hi Fredrik, > > I take it you mean 0-0x1fffff obviously; let's be accurate in a technical > > discussion and avoid ambiguous cases. > > That's interesting. :) 0x1fffff is not a valid DMA address due to alignment > restrictions, so if one wants to indicate a closed [inclusive] DMA address > interval it would be 0-0x1ffffc, since the 32-bit word rather than the byte > is the unit of the IOP DMA. In mathematics and programming languages it is > often convenient to work with half-open intervals denoted by "[0,0x200000)" > in this case. I think both notations are technically accurate, but they do > emphasize different aspects of addresses and memory. I can switch to your > byte-centric notation if that helps. :) Well, the byte at 0x1fffff may not be individually addressable by this particular DMA engine, but surely it is there included in DMA transfers accessing the location that spans it. If instead you prefer to use the mathematical notation to specify inclusive/exclusive ranges, then of course I'm fine with that too. > > And indeed e.g. `dma_map_single' does handle that and given a CPU-side > > physical memory address returns a corresponding DMA-side address. And the > > DMA mask has to reflect that and describe the DMA side, as it's the device > > side that has an address space limitation here and any offset resulting > > from a non-identity mapping does not change that limitation, although the > > offset does have of course to be taken into account by `dma_map_single', > > etc. in determining whether the memory area requested for use by a DMA > > device can be used directly or whether a bounce buffer will be required > > for that mapping. > > Ah... 
memory that is known to be DMA compatible is allocated separately, > and then handed over to the DMA subsystem using dma_declare_coherent_memory. Well, that does specify both a CPU-side and a corresponding DMA-side address too. > This is done once during driver initialisation. The drivers ohci-sm501.c and > ohci-tmio.c do that too, which is why I suspect they might broken as well. > > The SM501 driver has this explanation: > > /* The sm501 chip is equipped with local memory that may be used > * by on-chip devices such as the video controller and the usb host. > * This driver uses dma_declare_coherent_memory() to make sure > * usb allocations with dma_alloc_coherent() allocate from > * this local memory. The dma_handle returned by dma_alloc_coherent() > * will be an offset starting from 0 for the first local memory byte. From the description I take it it is some MMIO memory rather than host memory. I fail to see how it is supposed to work with these calls for non-system memory, which certainly any MMIO memory is, which surely is not under the supervision of the kernel memory allocator. There are calls for MMIO memory defined in the DMA API, specifically `dma_map_resource' and `dma_unmap_resource'. I've never used them myself, and I gather they provide you with a way for CPUs to access MMIO memory with caching enabled and without the need to use the MMIO accessors only, such as `readl', `writel', etc., which are expected to avoid going through any CPU cache. Maybe these are what you're after? But maybe I'm missing something. > * > * So as long as data is allocated using dma_alloc_coherent() all is > * fine. This is however not always the case - buffers may be allocated > * using kmalloc() - so the usb core needs to be told that it must copy > * data into our local memory if the buffers happen to be placed in > * regular memory. The HCD_LOCAL_MEM flag does just that. > */ This raises a hack alert to me TBH. 
> > Well, how can such a device use the DMA API in the first place? If the > > device has local memory, than the driver has to manage it itself somehow > > if needed, and then arrange copying it to main memory, either by a CPU or > > a third-party DMA controller (data mover) if available. Of course in the > > latter case a driver for the DMA controller may have to use the DMA API. > > The coherently declared memory given to the DMA subsystem is used for a > fixed sized DMA pool and no additional allocations are permitted. One could > choose a DMA mask that pretends to be reasonable, or the opposite, a mask > such as 1 that is unreasonable on purpose, as Robin writes: > > Alternatively, there is perhaps some degree of argument for > deliberately picking a nonzero but useless value like 1, > although it looks like the MIPS allocator (at least the dma- > default one) never actually checks whether the page it gets > is within range of the device's coherent mask, which it > probably should do. > > https://lkml.org/lkml/2018/7/6/697 It does look like an API abuse to me, as I noted above. > > I'll be resubmitting a driver for such a device shortly, the DEFZA (the > > previous submission can be found here: > > <https://marc.info/?l=linux-netdev&m=139841853827404>). It is interesting > > in that the FDDI engine supports host DMA on the reception side (and > > consequently the driver uses the DMA API to handle that), while on the > > transmission side (as well as with a couple of maintenance queues) it only > > does DMA with its onboard buffer memory, the contents of which need to be > > copied by the CPU. So there's no use of the DMA API on the transmission > > or maintenance side. However usual DMA rings (all located in board memory > > too) are used for all data moves. > > The DMA for its onboard buffer memory appears to be very similar to the > IOP and its DMA? 
That memory is currently copied by the EE, but there are > other DMA controllers that could handle that, possibly synchronised using > DMA chaining, which would assist the EE significantly. Mind that the DEFZA runs its own RTOS for initialization and management support, including in particular SMT (Station Management). This is run on an MC68000 processor. That processor is interfaced to a bus where board memory is attached as well as the RMC (Ring Memory Controller) chip, which acts as a DMA master on that bus, like does the host bus interface. Also certain control register writes from the host raise interrupts to the MC68000 for special situations to handle. All the PDQ-based FDDI adapters also have an M68000 which runs an RTOS, however the presence of the PDQ ASIC makes their architecture slightly different as the FDDI chipset does host DMA via the PDQ ASIC, which acts as a master on the host bus (possibly through a bridge chip like the PFI, though TURBOchannel for example is interfaced directly). These adapters went through several revisions, all using the Motorola FDDI chipset (originally designed by DEC and then sold to Motorola for fabrication and marketing, with DEC retaining an unlimited licence to use), but with the PDQ (Packet Data Queue, I believe; not officially confirmed) replacing the FSI (FDDI System Interface) block, and the CAMEL (MAC and ELM (Media Access Controller and Elasticity Buffer and Link Management)) and FCG (FDDI Clock Generator) blocks both retained. > > The PDQ ASIC was used to interface FDDI to many host buses and in > > addition to the 3 bus attachments mentioned above, all of which we have > > support for in Linux, it was also used for Q-bus (the DEFQA) and FutureBus > > (the DEFAA). We may have support for the DEFQA one day as I have both > > such a board and a suitable system to use it with. 
> > We are unlikely to have support for the DEFAA, as FutureBus was only
> > used in high-end VAX and Alpha systems, the size of a full 19" rack at
> > the very least, but it is there I believe only that the full PDQ
> > addressing capability was actually utilised.
>
> Thanks! By the way, is it possible to find spare parts for such vintage
> hardware these days in case of irreparable failures?

What do you mean by spare parts? ICs? Complete modules can certainly be
chased, though obviously there are the more common ones, and then there are
the exotic ones.

The biggest challenge has turned out to be electrolytic capacitor failures
in power supplies. Unfortunately in the late 1980s to mid 1990s several
lines of low-ESR capacitors, used in output filters in switch-mode PSUs,
were made with a new electrolyte formula based on a quaternary ammonium
salt. They have all turned out to suffer from excessive corrosion caused by
that electrolyte, shortening the lifespan of those parts well below the
expectations even in the enhanced lines specifically made with long life in
mind. Consequently those parts start leaking even if unused (or indeed
never used) and then obviously cause PSU breakage if powered up.

Those were all from reputable manufacturers, such as Chemi-con, Nichicon
or Panasonic; not to be confused with the bulged capacitor problem, aka
capacitor plague, which many ATX PSUs suffered from the mid 1990s to the
mid 2000s, where cheap parts were used from less reputable manufacturers.

Sadly I have ruined a couple of PSUs before I realised what the problem was
and I have been struggling since with tracking down other parts that have
failed as a result. I plan to get back to it sometime.

Some DECstation models are affected, as is other DEC (and non-DEC)
hardware:

* The 5000/200, /240 and /260 are not affected.
* The 2100 and 3100 are not affected if stored in their working
  orientation, as the capacitors are mounted leads up in their PSUs and
  corrosion only breaks the seal and not the aluminium can.

* The 5000/120, /125, /133 and /150 are all affected and are better
  recapped -- all SXF Chemi-con parts have to be replaced at the very
  least. Newer PSUs use newer LXF Chemi-con parts that haven't failed for
  me (yet?), but are expected to too.

* I can't speak of the 5000/20, /25, /33, /50 as I haven't got one of
  these.

* Other pieces of hardware would have to be inspected by their respective
  owners, e.g. I had a case where I had to recap the PSU of a small Cisco
  Ethernet switch with an FDDI bridge module from that era (that actually
  used a stock industrial PSU you can still buy new, although at ~£500 +
  VAT -- not exactly cheaply).

Other parts that have been failing are the usual Dallas RTC chips, whose
integrated lithium coin cell gets depleted; either the DS1287 or the
DS1287A depending on the specific model of hardware. DECstations have these
chips located in the TURBOchannel slot area with little clearance around
them. Therefore I have been slowly converting them to a version with a
discrete coin cell embedded in the IC case instead, as photographically
documented here: <ftp://ftp.linux-mips.org/pub/linux/mips/people/macro/ds1287/>.

You can still get recently manufactured brand new DS12887 or DS12887A parts
from Maxim through the usual distribution channels, however for reference
systems, such as I consider mine, I prefer to use original parts to avoid
surprises, as the DS12887/A chips have 114 bytes of general NVRAM as
opposed to 50 bytes with the DS1287/A.

NB according to HP end of sales for the DEFPA was only 2004-2005 and based
on occasional enquiries I get as the maintainer it remains deployed in
production environments. These boards remain readily available on the
second-hand market; sometimes you can even get unused old stock.
Unless you look for the less common SMF variants, that is. I own a couple of universal-PCI DEFPA boards that use the most recent PFI-3 ASIC (earlier versions were 5V-only), some of which have HP recorded as the vendor in the subsystem ID. Also new TURBOchannel option hardware has been designed and manufactured recently, see: <http://www.flxd.de/tc-usb/>. :) We'll get a Linux driver sometime. > > NB I sat on this fix from 2014, well before the warning was introduced in > > the first place, and it's only now that I got to unloading my patch queue. > > :( > > Do you have the latest kernel running on your DECstation machines now? :) Yep: Linux version 4.19.0-rc6 (macro@tp) (gcc version 4.1.2) #3 Mon Oct 1 00:22:03 BST 2018 bootconsole [prom0] enabled This is a DECstation 5000/2x0 CPU0 revision is: 00000440 (R4400SC) FPU revision is: 00000500 Checking for the multiply/shift bug... no. Checking for the daddiu bug... yes, workaround... yes. Determined physical RAM map: memory: 0000000004000000 @ 0000000000000000 (usable) Primary instruction cache 16kB, VIPT, direct mapped, linesize 16 bytes. Primary data cache 16kB, direct mapped, VIPT, no aliases, linesize 16 bytes Unified secondary cache 1024kB direct mapped, linesize 32 bytes. Zone ranges: Normal [mem 0x0000000000000000-0x0000000003ffffff] Movable zone start for each node Early memory node ranges node 0: [mem 0x0000000000000000-0x0000000003ffffff] Initmem setup node 0 [mem 0x0000000000000000-0x0000000003ffffff] On node 0 totalpages: 4096 Normal zone: 14 pages used for memmap Normal zone: 0 pages reserved Normal zone: 4096 pages, LIFO batch:0 pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768 pcpu-alloc: [0] 0 Built 1 zonelists, mobility grouping off. 
Total pages: 4082 Kernel command line: rw console=ttyS3 debug panic=60 ip=bootp root=/dev/nfs Dentry cache hash table entries: 8192 (order: 2, 65536 bytes) Inode-cache hash table entries: 4096 (order: 1, 32768 bytes) Memory: 57632K/65536K available (5279K kernel code, 338K rwdata, 1004K rodata, 272K init, 216K bss, 7904K reserved, 0K cma-reserved) NR_IRQS: 128 I/O ASIC clock frequency 24999536Hz clocksource: dec-ioasic: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 76451836814 ns sched_clock: 32 bits at 24MHz, resolution 40ns, wraps every 85900940267ns MIPS counter frequency 60000464Hz clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 31854094440 ns sched_clock: 32 bits at 60MHz, resolution 16ns, wraps every 35791117303ns Console: colour dummy device 160x64 console [ttyS3] enabled bootconsole [prom0] disabled Calibrating delay loop... 59.33 BogoMIPS (lpj=231424) pid_max: default: 32768 minimum: 301 Mount-cache hash table entries: 2048 (order: 0, 16384 bytes) Mountpoint-cache hash table entries: 2048 (order: 0, 16384 bytes) Checking for the daddi bug... no. random: get_random_u32 called from bucket_table_alloc+0xbc/0x2e8 with crng_init=0 clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 14931722236523437 ns futex hash table entries: 256 (order: -2, 6144 bytes) NET: Registered protocol family 16 Can't analyze schedule() prologue at (____ptrval____) HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages SCSI subsystem initialized tc: TURBOchannel rev. 
1 at 25.0 MHz (without parity) tc0: DEC PMAG-AA V1.0a tc1: DEC PMAF-FD V3.1D tc2: DEC PMAF-AA T5.2P- clocksource: Switched to clocksource MIPS NET: Registered protocol family 2 tcp_listen_portaddr_hash hash table entries: 1024 (order: 0, 16384 bytes) TCP established hash table entries: 2048 (order: 0, 16384 bytes) TCP bind hash table entries: 2048 (order: 0, 16384 bytes) TCP: Hash tables configured (established 2048 bind 2048) UDP hash table entries: 512 (order: 0, 16384 bytes) UDP-Lite hash table entries: 512 (order: 0, 16384 bytes) NET: Registered protocol family 1 RPC: Registered named UNIX socket transport module. RPC: Registered udp transport module. RPC: Registered tcp transport module. RPC: Registered tcp NFSv4.1 backchannel transport module. workingset: timestamp_bits=62 max_order=12 bucket_order=0 Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253) io scheduler noop registered io scheduler deadline registered io scheduler cfq registered (default) Console: switching to mono frame buffer device 160x64 fb0: PMAG-AA frame buffer device at tc0 DECstation Z85C30 serial driver version 0.10 ttyS0 at MMIO 0x1f900008 (irq = 14, base_baud = 460800) is a Z85C30 SCC ttyS1 at MMIO 0x1f900000 (irq = 14, base_baud = 460800) is a Z85C30 SCC ttyS2 at MMIO 0x1f980008 (irq = 15, base_baud = 460800) is a Z85C30 SCC ttyS3 at MMIO 0x1f980000 (irq = 15, base_baud = 460800) is a Z85C30 SCC ms02-nv.c: v.1.0.0 13 Aug 2001 Maciej W. Rozycki. mtd0: DEC MS02-NV NVRAM at 0x07000000, size 1MiB. declance.c: v0.011 by Linux MIPS DECstation task force declance0: IOASIC onboard LANCE, addr = 08:00:2b:35:62:c1, irq = 16 declance0: registered as eth0. defxx: v1.11 2014/07/01 Lawrence V. Stefani and others random: fast init done tc1: DEFTA at MMIO addr = 0x1e900000, IRQ = 20, Hardware addr = 08-00-2b-a3-a3-29 tc1: registered as fddi0 defza: v.1.1.4 Oct 2 2018 Maciej W. Rozycki tc2: DEC FDDIcontroller 700 or 700-C at 0x1f000000, irq 21 tc2: resetting the board... 
tc2: OK tc2: model 700 (DEFZA-AA), MMF PMD, address 08-00-2b-2e-6d-75 tc2: ROM rev. 1.0, firmware rev. 1.2, RMC rev. A, SMT ver. 1 tc2: link unavailable tc2: registered as fddi1 mousedev: PS/2 mouse device common for all mice rtc_cmos rtc_cmos: registered as rtc0 rtc_cmos rtc_cmos: no alarms, 50 bytes nvram NET: Registered protocol family 10 Segment Routing with IPv6 sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver NET: Registered protocol family 17 rtc_cmos rtc_cmos: setting system clock to 2018-10-01 00:45:12 UTC (1538354712) Sending BOOTP requests . OK IP-Config: Got BOOTP answer from xxx.xxx.xxx.xxx, my address is xxx.xxx.xxx.xxx IP-Config: Complete: device=eth0, hwaddr=08:00:2b:35:62:c1, ipaddr=xxx.xxx.xxx.xxx, mask=xxx.xxx.xxx.xxx, gw=xxx.xxx.xxx.xxx fddi1: link available host=hhh.hhh.hhh.hhh, domain=, nis-domain=(none) bootserver=xxx.xxx.xxx.xxx, rootserver=xxx.xxx.xxx.xxx, rootpath=/ddd/ddd nameserver0=xxx.xxx.xxx.xxx fddi1: link unavailable VFS: Mounted root (nfs filesystem) on device 0:11. Freeing unused PROM memory: 112k freed Freeing unused kernel memory: 272K This architecture does not have kernel memory protection. Run /sbin/init as init process [...] I had to revert recent changes forcing the minimum of GCC 4.6, and then patch up the breakage that was the motivation for the version bump, as I cannot easily upgrade my compiler (the newest one I was able to make working without NPTL), which will be a process. Still 4.18 can be used pristine with CONFIG_32BIT, except for a recent build breakage with the RTC driver, my small fix for which has already been accepted. I think 4.17 will build and boot just fine out of the box, and I expect the RTC fix to be backported to 4.18 too. For CONFIG_64BIT a fix for memory corruption with `memset' is required that applies to 4.17 and later versions, and is pending maintainer's acceptance. 
So I think 4.16 will work just fine, but you need the toolchain
(GCC+binutils) from my site with the DADDI and DADDIU workarounds
implemented to build such a kernel. I think the workarounds will never make
it upstream due to their intrusiveness, but I mean to maintain them
indefinitely (though as I mentioned above it'll take me a little while yet
to get beyond GCC 4.1.2).

  Maciej
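
The range check that Robin's quoted remark says the allocator "probably
should do" can be sketched in user-space C. This is only an illustration
under stated assumptions, not the MIPS allocator's code: `dma_bit_mask'
mirrors what the kernel's DMA_BIT_MASK(n) macro evaluates to, while
`dma_range_ok' is a hypothetical helper name made up for this example. The
34-bit width is the TURBOchannel figure from the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` address bits set.  This is the value the
 * kernel's DMA_BIT_MASK(n) macro yields, reimplemented for user space. */
static uint64_t dma_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Hypothetical helper: true if every byte of [addr, addr + size) can be
 * addressed by a device whose DMA mask is `mask`. */
static bool dma_range_ok(uint64_t addr, uint64_t size, uint64_t mask)
{
	uint64_t last;

	if (size == 0)
		return false;
	last = addr + size - 1;
	if (last < addr)		/* wrapped around the 64-bit space */
		return false;
	return last <= mask;
}
```

With a 34-bit mask every buffer in the 64MiB of RAM shown in the boot log
passes the check, whereas the deliberately useless mask of 1 rejects
anything that extends beyond physical address 1, which is the point of
picking it.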
* Re: [PATCH] TC: Set DMA masks for devices
  2018-10-05 22:52 ` Maciej W. Rozycki
@ 2018-10-06  9:21 ` Fredrik Noring
  2018-10-14 23:51 ` Maciej W. Rozycki
  0 siblings, 1 reply; 9+ messages in thread
From: Fredrik Noring @ 2018-10-06  9:21 UTC
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, Jürgen Urban

Hi Maciej,

> > Ah... memory that is known to be DMA compatible is allocated separately,
> > and then handed over to the DMA subsystem using
> > dma_declare_coherent_memory.
>
> Well, that does specify both a CPU-side and a corresponding DMA-side
> address too.

Yes, side-stepping any practical use of a DMA mask, which is why it could
probably have any arbitrary value except 0, which causes this warning.

> > This is done once during driver initialisation. The drivers ohci-sm501.c
> > and ohci-tmio.c do that too, which is why I suspect they might be broken
> > as well.
> >
> > The SM501 driver has this explanation:
> >
> >     /* The sm501 chip is equipped with local memory that may be used
> >      * by on-chip devices such as the video controller and the usb host.
> >      * This driver uses dma_declare_coherent_memory() to make sure
> >      * usb allocations with dma_alloc_coherent() allocate from
> >      * this local memory. The dma_handle returned by dma_alloc_coherent()
> >      * will be an offset starting from 0 for the first local memory byte.
>
> From the description I take it it is some MMIO memory rather than host
> memory. I fail to see how it is supposed to work with these calls for
> non-system memory, which certainly any MMIO memory is, which surely is not
> under the supervision of the kernel memory allocator.

I agree, this is obscure to me too.

> There are calls for MMIO memory defined in the DMA API, specifically
> `dma_map_resource' and `dma_unmap_resource'.
> I've never used them myself, and I gather they provide you with a way
> for CPUs to access MMIO memory with caching enabled and without the need
> to use the MMIO accessors only, such as `readl', `writel', etc., which
> are expected to avoid going through any CPU cache. Maybe these are what
> you're after?
>
> But maybe I'm missing something.

That is handled within the USB OHCI subsystem. I don't know the details,
actually.

> >      *
> >      * So as long as data is allocated using dma_alloc_coherent() all is
> >      * fine. This is however not always the case - buffers may be allocated
> >      * using kmalloc() - so the usb core needs to be told that it must copy
> >      * data into our local memory if the buffers happen to be placed in
> >      * regular memory. The HCD_LOCAL_MEM flag does just that.
> >      */
>
> This raises a hack alert to me TBH.

Christoph Hellwig raised concerns too, but I don't know how an OHCI driver
could do things differently given the circumstances, at least for a simple
initial implementation. For sure, the IOP has the capability and was most
likely designed for handling USB devices and other peripherals to a much
greater extent than allowed by the current PS2 OHCI driver, where the EE
manipulates the OHCI registers directly, which is quite inefficient.

> > The DMA for its onboard buffer memory appears to be very similar to the
> > IOP and its DMA? That memory is currently copied by the EE, but there are
> > other DMA controllers that could handle that, possibly synchronised using
> > DMA chaining, which would assist the EE significantly.
>
> Mind that the DEFZA runs its own RTOS for initialization and management
> support, including in particular SMT (Station Management). This is run on
> an MC68000 processor. That processor is interfaced to a bus where board
> memory is attached as well as the RMC (Ring Memory Controller) chip, which
> acts as a DMA master on that bus, as does the host bus interface.
> Also certain control register writes from the host raise interrupts to
> the MC68000 for special situations to handle.
>
> All the PDQ-based FDDI adapters also have an M68000 which runs an RTOS,
> however the presence of the PDQ ASIC makes their architecture slightly
> different as the FDDI chipset does host DMA via the PDQ ASIC, which acts
> as a master on the host bus (possibly through a bridge chip like the PFI,
> though TURBOchannel for example is interfaced directly).

How is its firmware handled? The Linux MIPS wiki entry for the DECstation
firmware

    https://www.linux-mips.org/wiki/DECstation#Firmware

is a TODO. :) The main reason I'm asking is that the IOP is a MIPS R3000
(apparently in later product models replaced with a PowerPC 405GP and its
DECKARD software emulator) that also needs firmware. The IOP most likely
ought to handle multiple firmware files, in the IRX format, depending on
its set of services.

Have you implemented sysfs structures to inspect the DEFZA RTOS? That is
something I would like to do for the IOP.

> The biggest challenge has turned out to be electrolytic capacitor
> failures in power supplies. Unfortunately in the late 1980s to mid 1990s
> several lines of low-ESR capacitors, used in output filters in switch-mode
> PSUs, were made with a new electrolyte formula based on a quaternary
> ammonium salt. They have all turned out to suffer from excessive
> corrosion caused by that electrolyte, shortening the lifespan of those
> parts well below the expectations even in the enhanced lines specifically
> made with long life in mind. Consequently those parts start leaking even
> if unused (or indeed never used) and then obviously cause PSU breakage if
> powered up.
>
> Those were all from reputable manufacturers, such as Chemi-con, Nichicon
> or Panasonic; not to be confused with the bulged capacitor problem, aka
> capacitor plague, which many ATX PSUs suffered from the mid 1990s to the
> mid 2000s, where cheap parts were used from less reputable manufacturers.

Interesting!

> This is a DECstation 5000/2x0
> CPU0 revision is: 00000440 (R4400SC)
> FPU revision is: 00000500
> Checking for the multiply/shift bug... no.
> Checking for the daddiu bug... yes, workaround... yes.
> Determined physical RAM map:
>  memory: 0000000004000000 @ 0000000000000000 (usable)

Considering the amount of memory, how do you compile for it?

Fredrik
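
The arrangement the SM501 comment describes, a fixed region with a
CPU-side address and a matching device-side address from which all
coherent allocations are served, can be modelled in plain user-space C.
This is a rough sketch only: `struct local_pool' and `pool_alloc' are
hypothetical names, and the simple bump allocation is an assumption made
for brevity, not how the kernel's DMA code manages declared memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a declared coherent region: one CPU-side base,
 * one device-side base and a fixed size that never grows. */
struct local_pool {
	unsigned char *cpu_base;	/* where the CPU sees the memory */
	uint64_t dev_base;		/* where the device sees the same memory */
	size_t size;			/* total bytes in the region */
	size_t used;			/* bytes handed out so far */
};

/* Hand out `len` bytes aligned to `align` (a power of two).  On success
 * return the CPU pointer and store the device-side address in
 * *dma_handle.  Return NULL once the region is exhausted: there is no
 * fallback to ordinary system memory. */
static void *pool_alloc(struct local_pool *p, size_t len, size_t align,
			uint64_t *dma_handle)
{
	size_t start;

	assert(align != 0 && (align & (align - 1)) == 0);
	start = (p->used + align - 1) & ~(align - 1);
	if (len == 0 || start > p->size || len > p->size - start)
		return NULL;
	p->used = start + len;
	*dma_handle = p->dev_base + start;
	return p->cpu_base + start;
}
```

With `dev_base' set to 0 the handle is simply an offset from the first
local memory byte, as the driver comment states, and no DMA mask is ever
consulted, which is why its value hardly matters here beyond being nonzero.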
* Re: [PATCH] TC: Set DMA masks for devices
@ 2018-10-14 23:51 ` Maciej W. Rozycki
  0 siblings, 0 replies; 9+ messages in thread
From: Maciej W. Rozycki @ 2018-10-14 23:51 UTC
  To: Fredrik Noring; +Cc: Ralf Baechle, linux-mips, Jürgen Urban

Hi Fredrik,

> > From the description I take it it is some MMIO memory rather than host
> > memory. I fail to see how it is supposed to work with these calls for
> > non-system memory, which certainly any MMIO memory is, which surely is not
> > under the supervision of the kernel memory allocator.
>
> I agree, this is obscure to me too.

I can't be bothered (sorry!) to study this code or the datasheet for the
IC to figure out what the arrangement is, but I do encourage you to do so
if you want to make any changes here.

> > Mind that the DEFZA runs its own RTOS for initialization and management
> > support, including in particular SMT (Station Management). This is run on
> > an MC68000 processor. That processor is interfaced to a bus where board
> > memory is attached as well as the RMC (Ring Memory Controller) chip, which
> > acts as a DMA master on that bus, as does the host bus interface. Also
> > certain control register writes from the host raise interrupts to the
> > MC68000 for special situations to handle.
> >
> > All the PDQ-based FDDI adapters also have an M68000 which runs an RTOS,
> > however the presence of the PDQ ASIC makes their architecture slightly
> > different as the FDDI chipset does host DMA via the PDQ ASIC, which acts
> > as a master on the host bus (possibly through a bridge chip like the PFI,
> > though TURBOchannel for example is interfaced directly).
>
> How is its firmware handled? The Linux MIPS wiki entry for the DECstation
> firmware
>
>     https://www.linux-mips.org/wiki/DECstation#Firmware
>
> is a TODO. :)

I'm not sure who actually created that entry and what they had in mind.
Likely the console firmware and any of its peculiarities related to Linux.
> The main reason I'm asking is that the IOP is a MIPS R3000
> (apparently in later product models replaced with a PowerPC 405GP and its
> DECKARD software emulator) that also needs firmware. The IOP most likely
> ought to handle multiple firmware files, in the IRX format, depending on
> its set of services.

The firmware of these FDDI boards is stored in flash memory onboard, so
you don't need to do anything to load it as it boots by itself. There is a
documented way to flash a firmware image by fiddling with the control
registers appropriately, downloading the new image to board RAM and then
requesting the board to transfer the image to onboard flash. From the
documentation I gather this process is done entirely by board circuitry
with no software involved on the board side, that is, a failed firmware
flashing process does not preclude another attempt.

Normally to start initializing the board you just assert/deassert RESET
with one of the control registers and the board boots. It takes the DEFZA
10s to boot (the documented amount of time for the driver to wait for the
bootstrap to complete is 30s). This is why I made initialisation messages
so verbose, so that the user is not confused and does not conclude the
kernel has hung.

You need to boot the board to retrieve its MAC address as the onboard PROM
chip holding the address is not accessible from the host side and the
address is only returned by the INIT command (NB there is no way to
override it either). There is an undocumented quicker way that the board's
console support code uses for presentation purposes in a system's console
monitor, but that's the board's internal protocol and I didn't want to risk
an incompatibility with some board revision out there.

Therefore the board driver requests its interrupt right away, sets a
timer, cycles RESET and puts the driver to sleep so that the system does
not become frozen if the driver is loaded as a module during normal Linux
operation.
Then either a state change interrupt from the board or the timer fires and
the driver resumes from there accordingly.

After reboot a command has to be sent to the board to initialise the DMA
rings and it also takes a while, though not as much. My measurements
indicate 160ms, but it's obviously still too long for the driver to just
busy-wait there twiddling thumbs, so it puts itself to sleep too.

An unfortunate side effect of this design is that the IRQ handler is
called `tcX' rather than `fddiX', as observed in /proc/interrupts. Maybe
I'll propose a `rename_irq' API, however I'm not sure if it's worth it.

The board also has to be reset during normal operation if the so-called PC
Trace (Physical Connection Trace) event has happened in the course of FDDI
ring fault recovery (i.e. when the token has been lost and could not be
restored with beaconing). That event causes the board to switch into the
halted state (the link status LED changes from green to red to signify the
problem) and the board has to be rebooted by the driver to verify it's not
this board that is the FDDI station having caused the ring fault. Then all
the usual commands have to be sent to initialise the board, set FDDI link
parameters, add any CAM entries that were set before the reboot and set
the promiscuous mode if in use, and then finally join the ring. So this is
handled with an interrupt-driven state machine as otherwise again the
driver would have to freeze the system for the duration of all this
processing.

The PDQ-based adapters are much quicker, they boot in ~1s. However the
current `defxx' driver is flawed in that it does not handle that PC Trace
event with a state machine and it does freeze the system if that happens,
remaining in the hardirq context throughout. Also it may fail DMA buffer
allocation in the course of the reboot as it (unnecessarily) frees all the
buffers previously allocated and requests new ones instead.
I need to fix this all, modelling the solution after `defza', however I
want to upstream the latter driver first. Fortunately PC Trace events are
not that common, but earlier this year someone already complained about
this issue with `defxx' causing unacceptable latency problems with their
system, so I do need to look into it.

> Have you implemented sysfs structures to inspect the DEFZA RTOS? That is
> something I would like to do for the IOP.

There is no (documented) way to access the internals of board firmware
(except for the request to flash it). You only have access to the onboard
1MiB of RAM and a bunch of control/status registers. Likewise with the
PDQ-based adapters, although their use of RAM is not clearly documented
(the PFI has a separate BAR for board RAM access) -- I find it hard to
believe they'd put 1MiB of RAM there only to support firmware upgrades, so
I think it is still used as a temporary packet buffer and for other
operational purposes.

> > This is a DECstation 5000/2x0
> > CPU0 revision is: 00000440 (R4400SC)
> > FPU revision is: 00000500
> > Checking for the multiply/shift bug... no.
> > Checking for the daddiu bug... yes, workaround... yes.
> > Determined physical RAM map:
> >  memory: 0000000004000000 @ 0000000000000000 (usable)
>
> Considering the amount of memory, how do you compile for it?

The kernel can be cross-compiled easily and with no pitfalls, so this is
what I have always been doing. With userland builds most software packages
can be cross-compiled, but I prefer native builds indeed, as these do not
require manual tweaking of any parameters that cannot be inferred in
cross-compilation (fortunately modern versions of Autoconf are able to
figure out what the sizes of data types are even if cross-compiling, as
setting these manually used to be a real pain).
For those I usually use my Broadcom SWARM board, which is clocked at
800MHz and currently has 3200MiB of RAM (pending a firmware fix of DRAM
controller initialisation that will hopefully allow for the full 4GiB
possible with modules available on the market, out of the 8GiB theoretical
maximum). The SWARM has switchable endianness with the line to control it
at reset wired to a PCB header used with a jumper as shipped. I have
instead wired it to an external switch mounted on a cover plate of an
unused option slot, so that I don't have to pull the system apart to change
the endianness.

I have better equipped DECstations at my remote site though; the maximum
amount of RAM the /200, /240 and /260 models accept is 480MiB. The
remaining 32MiB of space addressable via the KSEG0/KSEG1 spaces is used
for system ROM and MMIO (for onboard I/O circuitry and TURBOchannel).
TURBOchannel can also be accessed from 0x20000000 physical up (not with
the /200), for 3 slots of 512MiB of MMIO space each, however due to an API
shortcoming system firmware cannot cope with that (as documented on the
DECstation wiki).

  Maciej
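
The interrupt-driven reset handling described earlier in this message can
be summarised as a small state machine. The sketch below is illustrative
only and is not code from the `defza' driver: every state and event name
is hypothetical, the way command completion is signalled is an assumption,
and the sequence of commands sent after a reboot (link parameters, CAM
entries, promiscuous mode, joining the ring) is collapsed into a single
waiting state for brevity.

```c
#include <assert.h>

/* All names below are hypothetical; none is taken from the driver. */
enum board_state {
	ST_RESET_WAIT,	/* RESET cycled, timer armed, driver asleep */
	ST_INIT_WAIT,	/* board booted, initialisation commands in flight */
	ST_RUNNING,	/* rings set up, normal operation */
	ST_FAILED	/* timer fired before the board responded */
};

enum board_event {
	EV_STATE_CHANGE,	/* state change interrupt from the board */
	EV_CMD_DONE,		/* assumed completion notification */
	EV_TIMEOUT,		/* the timer set before sleeping expired */
	EV_HALTED		/* board entered the halted state (PC Trace) */
};

/* Return the next state for state `s` on event `e`.  Events that make no
 * sense in the current state leave it unchanged. */
static enum board_state board_step(enum board_state s, enum board_event e)
{
	assert(s <= ST_FAILED && e <= EV_HALTED);

	switch (s) {
	case ST_RESET_WAIT:
		if (e == EV_STATE_CHANGE)
			return ST_INIT_WAIT;
		if (e == EV_TIMEOUT)
			return ST_FAILED;
		break;
	case ST_INIT_WAIT:
		if (e == EV_CMD_DONE)
			return ST_RUNNING;
		if (e == EV_TIMEOUT)
			return ST_FAILED;
		break;
	case ST_RUNNING:
		if (e == EV_HALTED)	/* reboot the board and start over */
			return ST_RESET_WAIT;
		break;
	case ST_FAILED:
		break;
	}
	return s;
}
```

Each transition would be taken from the interrupt handler or the timer
callback, so no context ever has to busy-wait through the 10s boot or the
160ms ring initialisation.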
Thread overview: 9+ messages

2018-10-03 12:21 [PATCH] TC: Set DMA masks for devices Maciej W. Rozycki
2018-10-04 16:57 ` Fredrik Noring
2018-10-04 17:55   ` Fredrik Noring
2018-10-04 20:09   ` Maciej W. Rozycki
2018-10-05 14:56     ` Fredrik Noring
2018-10-05 22:52       ` Maciej W. Rozycki
2018-10-06  9:21         ` Fredrik Noring
2018-10-14 23:51           ` Maciej W. Rozycki
2018-10-14 23:51           ` Maciej W. Rozycki