On Mon, 3 Oct 2022 15:11:44 +0100
"Russell King (Oracle)" wrote:

> On Mon, Oct 03, 2022 at 09:30:37AM +0200, Christoph Hellwig wrote:
> > On Fri, Sep 30, 2022 at 05:02:05PM +0200, Marek Behún wrote:
> > > It seems that the null pointer dereference comes from the data variable
> > > having zero value. We assign
> > >   data = (u8 *)(uintptr_t)rx_desc->buf_cookie;
> >
> > I never see any assignment to ->buf_cookie in the driver, what am
> > I missing?
>
> I think Marek's setup (like my setups) use the hardware buffer manager,
> and it's hardware that fills in the "buf_cookie", which is supposed to
> be the virtual address of the buffer.
>
> Each buffer supplied to the hardware buffer manager is supposed to
> contain the virtual address in the first 32-bit word in that buffer.
>
> This is done by mvneta_bm_construct():
>
>         /* In order to update buf_cookie field of RX descriptor properly,
>          * BM hardware expects buf virtual address to be placed in the
>          * first four bytes of mapped buffer.
>          */
>         *(u32 *)buf = (u32)buf;
>
> immediately prior to dma_map_single(..., DMA_FROM_DEVICE) is called.
>
> If I had to guess, I would suggest that this write is being lost via
> cache invalidation, and given that the hardware BM both reads and
> writes this buffer, DMA_FROM_DEVICE is not correct, it should be
> DMA_BIDIRECTIONAL.
>
> Changing that is probably going to need DMA_FROM_DEVICE also changed
> elsewhere in the mvneta_bm and mvneta driver.
>
> I'm not in a position where I could test that out. Marek?
>

Hello Russell,

thanks for your suggestion! Adding Pali, since he has some information
(see the end of this message).

The attached patch seems to solve the null-pointer dereference. I booted
into single user mode and enabled eth2. Previously this caused the NULL
pointer dereference after the link came up; now it does not happen.

But I am still encountering the freeze after booting into the system.
Maybe these are different bugs?
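
To illustrate the hypothesis quoted above (the cookie write being lost
because the buffer is mapped DMA_FROM_DEVICE), here is a toy userspace
model in C. It is only a sketch and not kernel code: struct buf_model,
model_map(), run_case(), demo_check() and the address 0xc1234000 are all
made up for illustration, and the cache behaviour is deliberately
simplified to "FROM_DEVICE invalidates without write-back, BIDIRECTIONAL
writes back first". Real behaviour depends on the architecture and the
DMA implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two DMA directions discussed above. */
enum model_dir { MODEL_FROM_DEVICE, MODEL_BIDIRECTIONAL };

/* One buffer word, modelled as a RAM copy plus a CPU-cache copy. */
struct buf_model {
	uint32_t ram;   /* what the device (hardware BM) would read */
	uint32_t cache; /* what the CPU last wrote */
	int dirty;      /* non-zero if cache holds data not yet in RAM */
};

/* CPU store: lands in the cache only, like the write of the cookie. */
static void cpu_write(struct buf_model *b, uint32_t v)
{
	b->cache = v;
	b->dirty = 1;
}

/*
 * Simplified "map for DMA" step (an assumption of this model):
 * BIDIRECTIONAL writes dirty data back to RAM before invalidating,
 * FROM_DEVICE just invalidates, so a dirty CPU write is discarded.
 */
static void model_map(struct buf_model *b, enum model_dir dir)
{
	if (dir == MODEL_BIDIRECTIONAL && b->dirty)
		b->ram = b->cache;
	b->cache = b->ram;
	b->dirty = 0;
}

/* Returns the cookie the device would see after write + map. */
static uint32_t run_case(enum model_dir dir)
{
	struct buf_model b = { 0, 0, 0 };

	cpu_write(&b, 0xc1234000u); /* hypothetical buffer virtual address */
	model_map(&b, dir);
	return b.ram;
}

/* Self-check of the two outcomes in this model. */
static void demo_check(void)
{
	assert(run_case(MODEL_FROM_DEVICE) == 0u);
	assert(run_case(MODEL_BIDIRECTIONAL) == 0xc1234000u);
}
```

In this model the device reads a zero cookie in the FROM_DEVICE case,
which would match the zero value of data seen in the NULL pointer
dereference, while the BIDIRECTIONAL case preserves the written address.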

I am wondering whether we also need something similar to 7bea67a99430
("ARM: dts integrator: Fix DMA ranges") for mvebu. I seem to remember
Pali saying that the ranges defined in some upstream mvebu-tree, using
MBUS_ID() macros, are incorrect. Pali, what do you remember about this?

Marek