* [REGRESSION, BISECTED] pci: nvme device with HMB fails on arm64 @ 2019-01-04 12:57 Liviu Dudau 2019-01-04 17:34 ` Christoph Hellwig 0 siblings, 1 reply; 3+ messages in thread From: Liviu Dudau @ 2019-01-04 12:57 UTC (permalink / raw) To: Christoph Hellwig Cc: Lorenzo Pieralisi, Bjorn Helgaas, Robin Murphy, linux-pci, LAKML, LKML Hello Christoph, As I have mentioned to you after Xmas, I think your dma-mapping series breaks NVMe drivers that use HMB on arm64 (RK3399 NanoPC T4 board with Toshiba RC100 SSD in this case). The observed behaviour is that the modprobe of the nvme module will hang due to a kernel crash, which I've sent a patch for to the mm subsystem ("mm/vmalloc.c: don't dereference possible NULL pointer in __vunmap"), but that behaviour is triggered by the failure of the NVMe drive to access host memory buffers (error 24578, flags 0x1). Now that your series has been merged into Linus' tree, I've bisected it to this: bfd56cd605219d90b210a5377fca31a644efe95c is the first bad commit commit bfd56cd605219d90b210a5377fca31a644efe95c Author: Christoph Hellwig <hch@lst.de> Date: Sun Nov 4 17:38:39 2018 +0100 dma-mapping: support highmem in the generic remap allocator By using __dma_direct_alloc_pages we can deal entirely with struct page instead of having to derive a kernel virtual address. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> :040000 040000 565ae62f55f04c11da2471bd59d1b0328273992d 40047c9ecf715f6f7e8293b335c1f16dd511a0e0 M kernel Note that the bisect was done on the mainline tree without any additional patches, as applying those makes the bisect process think that the culprit is the merging of the tracing changes. I haven't tried to revert the patch as it is clear that the following patch depends on it and also because during the bisect process one of the steps generated a kernel that failed to boot as it was missing your patch 9ab91e7c5c51 ("arm64: default to the direct mapping in get_arch_dma_ops"). I've marked that step as bad, as it was related to the series, but I might have been wrong there. The full bisect log is this: $ git bisect log git bisect start # good: [00c569b567c7f1f0da6162868fd02a9f29411805] Merge tag 'locks-v4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux git bisect good 00c569b567c7f1f0da6162868fd02a9f29411805 # bad: [645ff1e8e704c4f33ab1fcd3c87f95cb9b6d7144] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input git bisect bad 645ff1e8e704c4f33ab1fcd3c87f95cb9b6d7144 # bad: [02061181d3a9ccfe15ef6bc15fa56283acc47620] Merge tag 'staging-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging git bisect bad 02061181d3a9ccfe15ef6bc15fa56283acc47620 # bad: [f346b0becb1bc62e45495f9cdbae3eef35d0b635] Merge branch 'akpm' (patches from Andrew) git bisect bad f346b0becb1bc62e45495f9cdbae3eef35d0b635 # bad: [938edb8a31b976c9a92eb0cd4ff481e93f76c1f1] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi git bisect bad 938edb8a31b976c9a92eb0cd4ff481e93f76c1f1 # good: [00203ba40d40d7f33857416adfb18adaf0e40123] kyber: use sbitmap add_wait_queue/list_del wait helpers git bisect good 00203ba40d40d7f33857416adfb18adaf0e40123 # good: [735bcc77e6ba83e464665cea9041072190ede37e] scsi: hisi_sas: Fix warnings detected by sparse git bisect good 735bcc77e6ba83e464665cea9041072190ede37e # bad: [af7ddd8a627c62a835524b3f5b471edbbbcce025] Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping git bisect bad af7ddd8a627c62a835524b3f5b471edbbbcce025 # bad: [8ddbe5943c0b1259b5ddb6dc1729863433fc256c] dma-mapping: move dma_cache_sync out of line git bisect bad 8ddbe5943c0b1259b5ddb6dc1729863433fc256c # bad: [887712a0a5b31e0cf28087f6445de431b4722e52] x86/calgary: remove the mapping_error dma_map_ops method git bisect bad 887712a0a5b31e0cf28087f6445de431b4722e52 # bad: [b0cbeae4944924640bf550b75487729a20204c14] dma-direct: remove the mapping_error dma_map_ops method git bisect bad b0cbeae4944924640bf550b75487729a20204c14 # bad: [bfd56cd605219d90b210a5377fca31a644efe95c] dma-mapping: support highmem in the generic remap allocator git bisect bad bfd56cd605219d90b210a5377fca31a644efe95c # good: [704f2c20eaa566f6906e8812b6e2115889bd753d] dma-direct: reject highmem pages from dma_alloc_from_contiguous git bisect good 704f2c20eaa566f6906e8812b6e2115889bd753d # good: [0c3b3171ceccb8830c2bb5adff1b4e9b204c1450] dma-mapping: move the arm64 noncoherent alloc/free support to common code git bisect good 0c3b3171ceccb8830c2bb5adff1b4e9b204c1450 # first bad commit: [bfd56cd605219d90b210a5377fca31a644efe95c] dma-mapping: support highmem in the generic remap allocator Does anyone have any suggestions on what I might try as a fix? Best regards, Liviu -- ________________________________________________________ ________| |_______ \ | With enough courage, you can do without a reputation | / \ | -- Rhett Butler | / / |________________________________________________________| \ /__________) (_________\ ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [REGRESSION, BISECTED] pci: nvme device with HMB fails on arm64 2019-01-04 12:57 [REGRESSION, BISECTED] pci: nvme device with HMB fails on arm64 Liviu Dudau @ 2019-01-04 17:34 ` Christoph Hellwig 2019-01-05 0:32 ` Liviu Dudau 0 siblings, 1 reply; 3+ messages in thread From: Christoph Hellwig @ 2019-01-04 17:34 UTC (permalink / raw) To: Liviu Dudau Cc: Christoph Hellwig, Lorenzo Pieralisi, Bjorn Helgaas, Robin Murphy, linux-pci, LAKML, LKML Hi Liviu, please try the patch below. Note that this is in top of mainline, as the commit you found already needed another fixup, which has made it to Linus already. -- From a959cc1a8ee00dcb274922f9d74f6ed632709047 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig <hch@lst.de> Date: Fri, 4 Jan 2019 18:31:48 +0100 Subject: dma-direct: fix DMA_ATTR_NO_KERNEL_MAPPING for remapped allocations We need to return a dma_addr_t even if we don't have a kernel mapping. Do so by consolidating the phys_to_dma call in a single place and jump to it from all the branches that return successfully. Fixes: bfd56cd60521 ("dma-mapping: support highmem in the generic remap allocator") Reported-by: Liviu Dudau <liviu@dudau.co.uk Signed-off-by: Christoph Hellwig <hch@lst.de> --- kernel/dma/remap.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c index 18cc09fc27b9..7a723194ecbe 100644 --- a/kernel/dma/remap.c +++ b/kernel/dma/remap.c @@ -204,8 +204,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, ret = dma_alloc_from_pool(size, &page, flags); if (!ret) return NULL; - *dma_handle = phys_to_dma(dev, page_to_phys(page)); - return ret; + goto done; } page = __dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs); @@ -215,8 +214,10 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, /* remove any dirty cache lines on the kernel alias */ arch_dma_prep_coherent(page, size); - if (attrs & DMA_ATTR_NO_KERNEL_MAPPING) - return page; /* opaque cookie */ + if (attrs & DMA_ATTR_NO_KERNEL_MAPPING) { + ret = page; /* opaque cookie */ + goto done; + } /* create a coherent mapping */ ret = dma_common_contiguous_remap(page, size, VM_USERMAP, @@ -227,9 +228,9 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, return ret; } - *dma_handle = phys_to_dma(dev, page_to_phys(page)); memset(ret, 0, size); - +done: + *dma_handle = phys_to_dma(dev, page_to_phys(page)); return ret; } -- 2.20.1 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [REGRESSION, BISECTED] pci: nvme device with HMB fails on arm64 2019-01-04 17:34 ` Christoph Hellwig @ 2019-01-05 0:32 ` Liviu Dudau 0 siblings, 0 replies; 3+ messages in thread From: Liviu Dudau @ 2019-01-05 0:32 UTC (permalink / raw) To: Christoph Hellwig Cc: Lorenzo Pieralisi, Bjorn Helgaas, Robin Murphy, linux-pci, LAKML, LKML On Fri, Jan 04, 2019 at 06:34:47PM +0100, Christoph Hellwig wrote: > Hi Liviu, Hi Christoph, > > please try the patch below. Note that this is in top of mainline, > as the commit you found already needed another fixup, which has > made it to Linus already. This patch does fix the issue with NVMe HMBs. You can add my: Tested-by: Liviu Dudau <liviu@dudau.co.uk> Now I need to go and try to figure out why the rk_gmac-dwmac driver gives this warning: [ 11.277363] ------------[ cut here ]------------ [ 11.277800] DMA-API: rk_gmac-dwmac fe300000.ethernet: device driver frees DMA memory with wrong function [device address=0x00000000e7c21c02] [size=342 bytes] [mapped as page] [unmapped as single] [ 11.279348] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:1085 check_unmap+0x720/0x840 [ 11.280037] Modules linked in: pcie_rockchip_host phy_rockchip_pcie rockchip realtek dwmac_rk stmmac_platform stmmac [ 11.280989] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-10990-gda3599602d2f-dirty #1 [ 11.281708] Hardware name: FriendlyARM NanoPC-T4 (DT) [ 11.282162] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 11.282592] pc : check_unmap+0x720/0x840 [ 11.282947] lr : check_unmap+0x720/0x840 [ 11.283299] sp : ffff000010003b90 [ 11.283597] x29: ffff000010003b90 x28: ffff8000e7d74800 [ 11.284075] x27: ffff8000f66fe410 x26: ffff000010f33370 [ 11.284551] x25: 0000000000000000 x24: ffff0000111163d0 [ 11.285027] x23: ffff000010f33000 x22: ffff000010f0d818 [ 11.285504] x21: ffff000010003c48 x20: ffff00001110dd80 [ 11.285981] x19: ffff8000f6657090 x18: 0000000000000010 [ 11.286457] x17: 0000000000000001 x16: 0000000000000007 [ 11.286933] x15: ffffffffffffffff x14: 5d32306331326337 [ 11.287409] x13: 6530303030303030 x12: 3078303d73736572 [ 11.287886] x11: 6464612065636976 x10: 65645b206e6f6974 [ 11.288363] x9 : 636e756620676e6f x8 : 2073612064657070 [ 11.288840] x7 : 616d6e755b205d65 x6 : 00000000000001d4 [ 11.289316] x5 : 0000000000000001 x4 : 00008000e685f000 [ 11.289793] x3 : ffff8000f774deb8 x2 : 0000000000000007 [ 11.290269] x1 : d5f5206026126100 x0 : 0000000000000000 [ 11.290746] Call trace: [ 11.290974] check_unmap+0x720/0x840 [ 11.291301] debug_dma_unmap_page+0xc4/0xd0 [ 11.291719] stmmac_napi_poll+0x214/0x1088 [stmmac] [ 11.292160] net_rx_action+0x12c/0x3d0 [ 11.292502] __do_softirq+0x1a0/0x438 [ 11.292835] irq_exit+0xcc/0xe0 [ 11.293124] __handle_domain_irq+0x9c/0x108 [ 11.293500] gic_handle_irq+0xb8/0x15c [ 11.293839] el1_irq+0xb4/0x130 [ 11.294127] cpuidle_enter_state+0xbc/0x508 [ 11.294504] cpuidle_enter+0x34/0x48 [ 11.294830] call_cpuidle+0x44/0x68 [ 11.295148] do_idle+0x264/0x2a0 [ 11.295445] cpu_startup_entry+0x28/0x30 [ 11.295802] rest_init+0xd4/0xe0 [ 11.296101] arch_call_rest_init+0x14/0x1c [ 11.296471] start_kernel+0x48c/0x4b4 [ 11.296800] ---[ end trace c63a0054785f45a3 ]--- [ 11.297213] DMA-API: Mapped at: [ 11.297531] stmmac_xmit+0x4b8/0x1168 [stmmac] [ 11.297934] dev_hard_start_xmit+0xac/0x290 [ 11.298313] sch_direct_xmit+0x120/0x330 [ 11.298667] __qdisc_run+0x140/0x6e8 [ 11.298994] __dev_queue_xmit+0x48c/0x718 I've seen you have provided some other net driver reporter with a patch to make dma_map_single_attrs use dma_map_page_attrs, which I have applied, but that doesn't remove the WARN() for me. Many thanks, Liviu > > -- > From a959cc1a8ee00dcb274922f9d74f6ed632709047 Mon Sep 17 00:00:00 2001 > From: Christoph Hellwig <hch@lst.de> > Date: Fri, 4 Jan 2019 18:31:48 +0100 > Subject: dma-direct: fix DMA_ATTR_NO_KERNEL_MAPPING for remapped allocations > > We need to return a dma_addr_t even if we don't have a kernel mapping. > Do so by consolidating the phys_to_dma call in a single place and jump > to it from all the branches that return successfully. > > Fixes: bfd56cd60521 ("dma-mapping: support highmem in the generic remap allocator") > Reported-by: Liviu Dudau <liviu@dudau.co.uk > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > kernel/dma/remap.c | 13 +++++++------ > 1 file changed, 7 insertions(+), 6 deletions(-) > > diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c > index 18cc09fc27b9..7a723194ecbe 100644 > --- a/kernel/dma/remap.c > +++ b/kernel/dma/remap.c > @@ -204,8 +204,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, > ret = dma_alloc_from_pool(size, &page, flags); > if (!ret) > return NULL; > - *dma_handle = phys_to_dma(dev, page_to_phys(page)); > - return ret; > + goto done; > } > > page = __dma_direct_alloc_pages(dev, size, dma_handle, flags, attrs); > @@ -215,8 +214,10 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, > /* remove any dirty cache lines on the kernel alias */ > arch_dma_prep_coherent(page, size); > > - if (attrs & DMA_ATTR_NO_KERNEL_MAPPING) > - return page; /* opaque cookie */ > + if (attrs & DMA_ATTR_NO_KERNEL_MAPPING) { > + ret = page; /* opaque cookie */ > + goto done; > + } > > /* create a coherent mapping */ > ret = dma_common_contiguous_remap(page, size, VM_USERMAP, > @@ -227,9 +228,9 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, > return ret; > } > > - *dma_handle = phys_to_dma(dev, page_to_phys(page)); > memset(ret, 0, size); > - > +done: > + *dma_handle = phys_to_dma(dev, page_to_phys(page)); > return ret; > } > > -- > 2.20.1 > -- ________________________________________________________ ________| |_______ \ | With enough courage, you can do without a reputation | / \ | -- Rhett Butler | / / |________________________________________________________| \ /__________) (_________\ ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-01-05 0:32 UTC | newest] Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-01-04 12:57 [REGRESSION, BISECTED] pci: nvme device with HMB fails on arm64 Liviu Dudau 2019-01-04 17:34 ` Christoph Hellwig 2019-01-05 0:32 ` Liviu Dudau
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).