On Tue, Aug 21, 2018 at 4:20 PM, Logan Gunthorpe <logang@deltatee.com> wrote:


> On 21/08/18 05:18 PM, Eric Pilmore wrote:
>> We have been running locally with Kit's change for dma_map_resource and
>> its incorporation in ntb_async_tx_submit for the destination address, and
>> it runs fine under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @
>> 2.40GHz) based system, regardless of whether the DMA engine being used is
>> IOAT or a PLX device sitting in the PCIe tree. However, when we go back to
>> an i7 (i7-7700K CPU @ 4.20GHz) based system it seems to run into issues,
>> specifically when put under a load. In this case, just having a load using
>> a single ping command with an interval=0, i.e. no delay between ping
>> packets, after a few thousand packets the system just hangs. No panic or
>> watchdogs. Note that in this scenario I can only use a PLX DMA engine.
>
> This is just my best guess: but it sounds to me like a bug in the PLX
> DMA driver or hardware.


The PLX DMA driver?  But the PLX driver isn't really even involved in the mapping
stage.  Are you thinking maybe the stage at which the DMA descriptor is freed and
the PLX DMA driver does a dma_descriptor_unmap?

Again, PLX did not exhibit any issues on the Xeon system.

Eric