BUG cxgb3: Check and handle the dma mapping errors

* BUG cxgb3: Check and handle the dma mapping errors
@ 2013-08-05  2:59 Alexey Kardashevskiy
  2013-08-05 18:41 ` Jay Fenlason
  0 siblings, 1 reply; 6+ messages in thread
From: Alexey Kardashevskiy @ 2013-08-05  2:59 UTC (permalink / raw)
  To: Santosh Rastapur, Divy Le Ray
  Cc: Jay Fenlason, David S. Miller, netdev, Linux Kernel Mailing List

Hi!

Recently I started getting multiple errors like this:

cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
c000001fbdaaa882 npages 1
cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
c000001fbdaaa882 npages 1
cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
c000001fbdaaa882 npages 1
cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
c000001fbdaaa882 npages 1
cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
c000001fbdaaa882 npages 1
cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
c000001fbdaaa882 npages 1
cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
c000001fbdaaa882 npages 1
... and so on

This is all happening on a PPC64 "powernv" platform machine. To trigger the
error state, it is enough to _flood_ ping CXGB3 card from another machine
(which has Emulex 10Gb NIC + Cisco switch). Just do "ping -f 172.20.1.2"
and wait 10-15 seconds.

The messages are coming from arch/powerpc/kernel/iommu.c and basically
mean that the driver requested more pages than the DMA window has which is
normally 1GB (there could be another possible source of errors -
ppc_md.tce_build callback - but on powernv platform it always succeeds).

The patch after which it broke is:
commit f83331bab149e29fa2c49cf102c0cd8c3f1ce9f9
Author: Santosh Rastapur <santosh@chelsio.com>
Date:   Tue May 21 04:21:29 2013 +0000
cxgb3: Check and handle the dma mapping errors

Any quick ideas? Thanks!

-- 
Alexey

^ permalink raw reply	[flat|nested] 6+ messages in thread