* Re: next/master bisection: baseline.login on r8a77960-ulcb [not found] <60346234.1c69fb81.cd55e.770d@mx.google.com> @ 2021-02-23 9:56 ` Guillaume Tucker 2021-02-24 21:39 ` Heiko Thiery 0 siblings, 1 reply; 7+ messages in thread From: Guillaume Tucker @ 2021-02-23 9:56 UTC (permalink / raw) To: Konrad Rzeszutek Wilk, Jianxiong Gao, Christoph Hellwig Cc: iommu, Robin Murphy, linux-kernel, kernelci-results Hi Christoph, Please see the bisection report below about a boot failure on r8a77960-ulcb on next-20210222. Reports aren't automatically sent to the public while we're trialing new bisection features on kernelci.org but this one looks valid. The log shows a kernel panic, more details can be found here: https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/ Please let us know if you need any help to debug the issue or try a fix on this platform. Best wishes, Guillaume On 23/02/2021 02:02, KernelCI bot wrote: > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > * This automated bisection report was sent to you on the basis * > * that you may be involved with the breaking commit it has * > * found. No manual investigation has been done to verify it, * > * and the root cause of the problem may be somewhere else. * > * * > * If you do send a fix, please include this trailer: * > * Reported-by: "kernelci.org bot" <bot@kernelci.org> * > * * > * Hope this helps! 
* > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > > next/master bisection: baseline.login on r8a77960-ulcb > > Summary: > Start: 37dfbfbdca66 Add linux-next specific files for 20210222 > Plain log: https://storage.kernelci.org/next/master/next-20210222/arm64/defconfig/clang-10/lab-baylibre/baseline-r8a77960-ulcb.txt > HTML log: https://storage.kernelci.org/next/master/next-20210222/arm64/defconfig/clang-10/lab-baylibre/baseline-r8a77960-ulcb.html > Result: 567d877f9a7d swiotlb: refactor swiotlb_tbl_map_single > > Checks: > revert: PASS > verify: PASS > > Parameters: > Tree: next > URL: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git > Branch: master > Target: r8a77960-ulcb > CPU arch: arm64 > Lab: lab-baylibre > Compiler: clang-10 > Config: defconfig > Test case: baseline.login > > Breaking commit found: > > ------------------------------------------------------------------------------- > commit 567d877f9a7d6bf4e4bf0ecd6de23fec8039b123 > Author: Christoph Hellwig <hch@lst.de> > Date: Thu Feb 4 11:08:35 2021 +0100 > > swiotlb: refactor swiotlb_tbl_map_single > > Split out a bunch of a self-contained helpers to make the function easier > to follow. 
> > Signed-off-by: Christoph Hellwig <hch@lst.de> > Acked-by: Jianxiong Gao <jxgao@google.com> > Tested-by: Jianxiong Gao <jxgao@google.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c > index b38b1553c466..381c24ef1ac1 100644 > --- a/kernel/dma/swiotlb.c > +++ b/kernel/dma/swiotlb.c > @@ -468,134 +468,133 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr, > } > } > > -phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr, > - size_t mapping_size, size_t alloc_size, > - enum dma_data_direction dir, unsigned long attrs) > -{ > - dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start); > - unsigned long flags; > - phys_addr_t tlb_addr; > - unsigned int nslots, stride, index, wrap; > - int i; > - unsigned long mask; > - unsigned long offset_slots; > - unsigned long max_slots; > - unsigned long tmp_io_tlb_used; > - > - if (no_iotlb_memory) > - panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer"); > - > - if (mem_encrypt_active()) > - pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n"); > +#define slot_addr(start, idx) ((start) + ((idx) << IO_TLB_SHIFT)) > > - if (mapping_size > alloc_size) { > - dev_warn_once(hwdev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)", > - mapping_size, alloc_size); > - return (phys_addr_t)DMA_MAPPING_ERROR; > - } > - > - mask = dma_get_seg_boundary(hwdev); > +/* > + * Carefully handle integer overflow which can occur when boundary_mask == ~0UL. 
> + */ > +static inline unsigned long get_max_slots(unsigned long boundary_mask) > +{ > + if (boundary_mask == ~0UL) > + return 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); > + return nr_slots(boundary_mask + 1); > +} > > - tbl_dma_addr &= mask; > +static unsigned int wrap_index(unsigned int index) > +{ > + if (index >= io_tlb_nslabs) > + return 0; > + return index; > +} > > - offset_slots = nr_slots(tbl_dma_addr); > +/* > + * Find a suitable number of IO TLB entries size that will fit this request and > + * allocate a buffer from that IO TLB pool. > + */ > +static int find_slots(struct device *dev, size_t alloc_size) > +{ > + unsigned long boundary_mask = dma_get_seg_boundary(dev); > + dma_addr_t tbl_dma_addr = > + phys_to_dma_unencrypted(dev, io_tlb_start) & boundary_mask; > + unsigned int max_slots = get_max_slots(boundary_mask); > + unsigned int nslots = nr_slots(alloc_size), stride = 1; > + unsigned int index, wrap, count = 0, i; > + unsigned long flags; > > - /* > - * Carefully handle integer overflow which can occur when mask == ~0UL. > - */ > - max_slots = mask + 1 > - ? nr_slots(mask + 1) > - : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); > + BUG_ON(!nslots); > > /* > * For mappings greater than or equal to a page, we limit the stride > * (and hence alignment) to a page size. > */ > - nslots = nr_slots(alloc_size); > if (alloc_size >= PAGE_SIZE) > - stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT)); > - else > - stride = 1; > + stride <<= (PAGE_SHIFT - IO_TLB_SHIFT); > > - BUG_ON(!nslots); > - > - /* > - * Find suitable number of IO TLB entries size that will fit this > - * request and allocate a buffer from that IO TLB pool. 
> - */ > spin_lock_irqsave(&io_tlb_lock, flags); > - > if (unlikely(nslots > io_tlb_nslabs - io_tlb_used)) > goto not_found; > > - index = ALIGN(io_tlb_index, stride); > - if (index >= io_tlb_nslabs) > - index = 0; > - wrap = index; > - > + index = wrap = wrap_index(ALIGN(io_tlb_index, stride)); > do { > - while (iommu_is_span_boundary(index, nslots, offset_slots, > - max_slots)) { > - index += stride; > - if (index >= io_tlb_nslabs) > - index = 0; > - if (index == wrap) > - goto not_found; > - } > - > /* > * If we find a slot that indicates we have 'nslots' number of > * contiguous buffers, we allocate the buffers from that slot > * and mark the entries as '0' indicating unavailable. > */ > - if (io_tlb_list[index] >= nslots) { > - int count = 0; > - > - for (i = index; i < (int) (index + nslots); i++) > - io_tlb_list[i] = 0; > - for (i = index - 1; > - io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 && > - io_tlb_list[i]; i--) > - io_tlb_list[i] = ++count; > - tlb_addr = io_tlb_start + (index << IO_TLB_SHIFT); > - > - /* > - * Update the indices to avoid searching in the next > - * round. > - */ > - io_tlb_index = ((index + nslots) < io_tlb_nslabs > - ? 
(index + nslots) : 0); > - > - goto found; > + if (!iommu_is_span_boundary(index, nslots, > + nr_slots(tbl_dma_addr), > + max_slots)) { > + if (io_tlb_list[index] >= nslots) > + goto found; > } > - index += stride; > - if (index >= io_tlb_nslabs) > - index = 0; > + index = wrap_index(index + stride); > } while (index != wrap); > > not_found: > - tmp_io_tlb_used = io_tlb_used; > - > spin_unlock_irqrestore(&io_tlb_lock, flags); > - if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) > - dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n", > - alloc_size, io_tlb_nslabs, tmp_io_tlb_used); > - return (phys_addr_t)DMA_MAPPING_ERROR; > + return -1; > + > found: > + for (i = index; i < index + nslots; i++) > + io_tlb_list[i] = 0; > + for (i = index - 1; > + io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 && > + io_tlb_list[i]; i--) > + io_tlb_list[i] = ++count; > + > + /* > + * Update the indices to avoid searching in the next round. > + */ > + if (index + nslots < io_tlb_nslabs) > + io_tlb_index = index + nslots; > + else > + io_tlb_index = 0; > io_tlb_used += nslots; > + > spin_unlock_irqrestore(&io_tlb_lock, flags); > + return index; > +} > + > +phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, > + size_t mapping_size, size_t alloc_size, > + enum dma_data_direction dir, unsigned long attrs) > +{ > + unsigned int index, i; > + phys_addr_t tlb_addr; > + > + if (no_iotlb_memory) > + panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer"); > + > + if (mem_encrypt_active()) > + pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n"); > + > + if (mapping_size > alloc_size) { > + dev_warn_once(dev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)", > + mapping_size, alloc_size); > + return (phys_addr_t)DMA_MAPPING_ERROR; > + } > + > + index = find_slots(dev, alloc_size); > + if (index == -1) { > + if (!(attrs & DMA_ATTR_NO_WARN)) > + 
dev_warn_ratelimited(dev, > + "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n", > + alloc_size, io_tlb_nslabs, io_tlb_used); > + return (phys_addr_t)DMA_MAPPING_ERROR; > + } > > /* > * Save away the mapping from the original address to the DMA address. > * This is needed when we sync the memory. Then we sync the buffer if > * needed. > */ > - for (i = 0; i < nslots; i++) > - io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT); > + for (i = 0; i < nr_slots(alloc_size); i++) > + io_tlb_orig_addr[index + i] = slot_addr(orig_addr, i); > + > + tlb_addr = slot_addr(io_tlb_start, index); > if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && > (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) > swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE); > - > return tlb_addr; > } > ------------------------------------------------------------------------------- > > > Git bisection log: > > ------------------------------------------------------------------------------- > git bisect start > # good: [d99676af540c2dc829999928fb81c58c80a1dce4] Merge tag 'drm-next-2021-02-19' of git://anongit.freedesktop.org/drm/drm > git bisect good d99676af540c2dc829999928fb81c58c80a1dce4 > # bad: [37dfbfbdca66834bc0f64ec9b35e09ac6c8898da] Add linux-next specific files for 20210222 > git bisect bad 37dfbfbdca66834bc0f64ec9b35e09ac6c8898da > # bad: [25c1843cc6b3d64ce774ce7f1dc649ca3109a4c5] Merge remote-tracking branch 'block/for-next' > git bisect bad 25c1843cc6b3d64ce774ce7f1dc649ca3109a4c5 > # good: [705552a85bfda7f2b0a3922b318d74fcc8368fd6] Merge remote-tracking branch 'btrfs/for-next' > git bisect good 705552a85bfda7f2b0a3922b318d74fcc8368fd6 > # good: [eed3cd1a28b4a41ca25b8f5fbd86449be8ac3216] Merge remote-tracking branch 'v4l-dvb-next/master' > git bisect good eed3cd1a28b4a41ca25b8f5fbd86449be8ac3216 > # bad: [366e8fe73e13244686604662ddcc70aa14a3e0e6] Merge remote-tracking branch 'rdma/for-next' > git bisect bad 366e8fe73e13244686604662ddcc70aa14a3e0e6 
> # good: [5120bf0a5fc15dec210a0fe0f39e4a256bb6e349] RDMA/rxe: Correct skb on loopback path > git bisect good 5120bf0a5fc15dec210a0fe0f39e4a256bb6e349 > # good: [ffc46af1757e05652e17c47e4aa2a01bf5aaf3ad] Merge remote-tracking branch 'thermal/thermal/linux-next' > git bisect good ffc46af1757e05652e17c47e4aa2a01bf5aaf3ad > # good: [229557230c760e25b6af79709aa85d30de4c8500] RDMA/hns: Remove unused member and variable of CMDQ > git bisect good 229557230c760e25b6af79709aa85d30de4c8500 > # good: [2b5715fc17386a6223490d5b8f08d031999b0c0b] RDMA/srp: Fix support for unpopulated and unbalanced NUMA nodes > git bisect good 2b5715fc17386a6223490d5b8f08d031999b0c0b > # bad: [e952d9a1bc204109a21f7dbedddedc110a33baf1] swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single > git bisect bad e952d9a1bc204109a21f7dbedddedc110a33baf1 > # good: [c7fbeca757fe74135d8b6a4c8ddaef76f5775d68] swiotlb: factor out an io_tlb_offset helper > git bisect good c7fbeca757fe74135d8b6a4c8ddaef76f5775d68 > # good: [ca10d0f8e530600ec63c603dbace2c30927d70b7] swiotlb: clean up swiotlb_tbl_unmap_single > git bisect good ca10d0f8e530600ec63c603dbace2c30927d70b7 > # bad: [567d877f9a7d6bf4e4bf0ecd6de23fec8039b123] swiotlb: refactor swiotlb_tbl_map_single > git bisect bad 567d877f9a7d6bf4e4bf0ecd6de23fec8039b123 > # first bad commit: [567d877f9a7d6bf4e4bf0ecd6de23fec8039b123] swiotlb: refactor swiotlb_tbl_map_single > ------------------------------------------------------------------------------- > > > -=-=-=-=-=-=-=-=-=-=-=- > Groups.io Links: You receive all messages sent to this group. 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
* Re: next/master bisection: baseline.login on r8a77960-ulcb
From: Heiko Thiery @ 2021-02-24 21:39 UTC
To: Guillaume Tucker, Konrad Rzeszutek Wilk, Jianxiong Gao, Christoph Hellwig
Cc: iommu, Robin Murphy, linux-kernel, kernelci-results

Hi Christoph and all,

On 23.02.21 10:56, Guillaume Tucker wrote:
> Hi Christoph,
>
> Please see the bisection report below about a boot failure on
> r8a77960-ulcb on next-20210222.
>
> Reports aren't automatically sent to the public while we're
> trialing new bisection features on kernelci.org but this one
> looks valid.
>
> The log shows a kernel panic, more details can be found here:
>
> https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
>
> Please let us know if you need any help to debug the issue or try
> a fix on this platform.

I am also seeing this problem on an iMX8MQ board and can help test if you have a fix.

BR
--
Heiko
* Re: next/master bisection: baseline.login on r8a77960-ulcb 2021-02-24 21:39 ` Heiko Thiery @ 2021-02-25 11:09 ` Thierry Reding 2021-02-25 11:14 ` Robin Murphy 0 siblings, 1 reply; 7+ messages in thread From: Thierry Reding @ 2021-02-25 11:09 UTC (permalink / raw) To: Heiko Thiery Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker, linux-kernel, iommu, Robin Murphy, Christoph Hellwig, Jianxiong Gao [-- Attachment #1.1: Type: text/plain, Size: 4361 bytes --] On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote: > Hi Christoph and all, > > On 23.02.21 10:56, Guillaume Tucker wrote: > > Hi Christoph, > > > > Please see the bisection report below about a boot failure on > > r8a77960-ulcb on next-20210222. > > > > Reports aren't automatically sent to the public while we're > > trialing new bisection features on kernelci.org but this one > > looks valid. > > > > The log shows a kernel panic, more details can be found here: > > > > https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/ > > > > Please let us know if you need any help to debug the issue or try > > a fix on this platform. > > I am also seeing this problem on an iMX8MQ board and can help test if you > have a fix. This is also causing boot failures on Jetson AGX Xavier. The origin is slightly different from the above kernelci.org report, but the BUG_ON is the same: [ 2.650447] ------------[ cut here ]------------ [ 2.650588] kernel BUG at include/linux/iommu-helper.h:23! 
[ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 2.654330] Modules linked in: [ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120 [ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT) [ 2.674096] Workqueue: events deferred_probe_work_func [ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--) [ 2.684949] pc : find_slots.isra.0+0x118/0x2f0 [ 2.689494] lr : find_slots.isra.0+0x88/0x2f0 [ 2.693696] sp : ffff800011faf950 [ 2.697281] x29: ffff800011faf950 x28: 0000000000000001 [ 2.702537] x27: 0000000000000001 x26: 0000000000000000 [ 2.708131] x25: 0000000000000001 x24: 0000000105f03148 [ 2.713556] x23: 0000000000000001 x22: ffff800011559000 [ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000 [ 2.724493] x19: 0000000000000000 x18: 0000000000000020 [ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068 [ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff [ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040 [ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000 [ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000 [ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000 [ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000 [ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000 [ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800 [ 2.778662] Call trace: [ 2.781136] find_slots.isra.0+0x118/0x2f0 [ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4 [ 2.789858] swiotlb_map+0x58/0x200 [ 2.793355] dma_direct_map_page+0x148/0x1c0 [ 2.797386] dma_map_page_attrs+0x2c/0x54 [ 2.801411] dw_pcie_host_init+0x40c/0x4c0 [ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4 [ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c [ 2.814185] platform_probe+0x68/0xe0 [ 2.817688] really_probe+0xe4/0x4c0 [ 2.821362] driver_probe_device+0x58/0xc0 [ 2.825386] __device_attach_driver+0xa8/0x104 [ 2.829953] bus_for_each_drv+0x78/0xd0 [ 2.833434] __device_attach+0xdc/0x17c [ 2.837631] 
device_initial_probe+0x14/0x20
[ 2.841680] bus_probe_device+0x9c/0xa4
[ 2.845160] deferred_probe_work_func+0x74/0xb0
[ 2.849734] process_one_work+0x1cc/0x350
[ 2.853822] worker_thread+0x20c/0x3ac
[ 2.858018] kthread+0x128/0x134
[ 2.860997] ret_from_fork+0x10/0x34
[ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000)
[ 2.870547] ---[ end trace e5c50bdcf12b316e ]---
[ 2.875087] note: kworker/2:1[67] exited with preempt_count 2
[ 2.880836] ------------[ cut here ]------------

I've confirmed that reverting the following commits makes the system boot again:

47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
c6f50c7719e7 ("swiotlb: respect min_align_mask")
e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")

Let me know if I can help test any fixes for this.

Thierry
* Re: next/master bisection: baseline.login on r8a77960-ulcb 2021-02-25 11:09 ` Thierry Reding @ 2021-02-25 11:14 ` Robin Murphy 2021-02-25 11:50 ` Thierry Reding 0 siblings, 1 reply; 7+ messages in thread From: Robin Murphy @ 2021-02-25 11:14 UTC (permalink / raw) To: Thierry Reding, Heiko Thiery Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker, linux-kernel, iommu, Christoph Hellwig, Jianxiong Gao On 2021-02-25 11:09, Thierry Reding wrote: > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote: >> Hi Christoph and all, >> >> On 23.02.21 10:56, Guillaume Tucker wrote: >>> Hi Christoph, >>> >>> Please see the bisection report below about a boot failure on >>> r8a77960-ulcb on next-20210222. >>> >>> Reports aren't automatically sent to the public while we're >>> trialing new bisection features on kernelci.org but this one >>> looks valid. >>> >>> The log shows a kernel panic, more details can be found here: >>> >>> https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/ >>> >>> Please let us know if you need any help to debug the issue or try >>> a fix on this platform. >> >> I am also seeing this problem on an iMX8MQ board and can help test if you >> have a fix. > > This is also causing boot failures on Jetson AGX Xavier. The origin is > slightly different from the above kernelci.org report, but the BUG_ON is > the same: > > [ 2.650447] ------------[ cut here ]------------ > [ 2.650588] kernel BUG at include/linux/iommu-helper.h:23! 
> [ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > [ 2.654330] Modules linked in: > [ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120 > [ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT) > [ 2.674096] Workqueue: events deferred_probe_work_func > [ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--) > [ 2.684949] pc : find_slots.isra.0+0x118/0x2f0 > [ 2.689494] lr : find_slots.isra.0+0x88/0x2f0 > [ 2.693696] sp : ffff800011faf950 > [ 2.697281] x29: ffff800011faf950 x28: 0000000000000001 > [ 2.702537] x27: 0000000000000001 x26: 0000000000000000 > [ 2.708131] x25: 0000000000000001 x24: 0000000105f03148 > [ 2.713556] x23: 0000000000000001 x22: ffff800011559000 > [ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000 > [ 2.724493] x19: 0000000000000000 x18: 0000000000000020 > [ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068 > [ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff > [ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040 > [ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000 > [ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000 > [ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000 > [ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000 > [ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000 > [ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800 > [ 2.778662] Call trace: > [ 2.781136] find_slots.isra.0+0x118/0x2f0 > [ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4 > [ 2.789858] swiotlb_map+0x58/0x200 > [ 2.793355] dma_direct_map_page+0x148/0x1c0 > [ 2.797386] dma_map_page_attrs+0x2c/0x54 > [ 2.801411] dw_pcie_host_init+0x40c/0x4c0 > [ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4 > [ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c > [ 2.814185] platform_probe+0x68/0xe0 > [ 2.817688] really_probe+0xe4/0x4c0 > [ 2.821362] driver_probe_device+0x58/0xc0 > [ 2.825386] __device_attach_driver+0xa8/0x104 > [ 2.829953] bus_for_each_drv+0x78/0xd0 > [ 
2.833434] __device_attach+0xdc/0x17c
> [ 2.837631] device_initial_probe+0x14/0x20
> [ 2.841680] bus_probe_device+0x9c/0xa4
> [ 2.845160] deferred_probe_work_func+0x74/0xb0
> [ 2.849734] process_one_work+0x1cc/0x350
> [ 2.853822] worker_thread+0x20c/0x3ac
> [ 2.858018] kthread+0x128/0x134
> [ 2.860997] ret_from_fork+0x10/0x34
> [ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000)
> [ 2.870547] ---[ end trace e5c50bdcf12b316e ]---
> [ 2.875087] note: kworker/2:1[67] exited with preempt_count 2
> [ 2.880836] ------------[ cut here ]------------
>
> I've confirmed that reverting the following commits makes the system
> boot again:
>
> 47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
> c6f50c7719e7 ("swiotlb: respect min_align_mask")
> e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
> 567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")
>
> Let me know if I can help test any fixes for this.

FWIW, this sounds like it's probably the same thing for which a fix should be pending:

https://lore.kernel.org/linux-iommu/20210223072514.GA18079@lst.de/T/#u

Robin.
* Re: next/master bisection: baseline.login on r8a77960-ulcb 2021-02-25 11:14 ` Robin Murphy @ 2021-02-25 11:50 ` Thierry Reding 2021-02-25 13:00 ` Heiko Thiery 0 siblings, 1 reply; 7+ messages in thread From: Thierry Reding @ 2021-02-25 11:50 UTC (permalink / raw) To: Robin Murphy Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker, linux-kernel, iommu, Heiko Thiery, Christoph Hellwig, Jianxiong Gao [-- Attachment #1.1: Type: text/plain, Size: 5165 bytes --] On Thu, Feb 25, 2021 at 11:14:57AM +0000, Robin Murphy wrote: > On 2021-02-25 11:09, Thierry Reding wrote: > > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote: > > > Hi Christoph and all, > > > > > > On 23.02.21 10:56, Guillaume Tucker wrote: > > > > Hi Christoph, > > > > > > > > Please see the bisection report below about a boot failure on > > > > r8a77960-ulcb on next-20210222. > > > > > > > > Reports aren't automatically sent to the public while we're > > > > trialing new bisection features on kernelci.org but this one > > > > looks valid. > > > > > > > > The log shows a kernel panic, more details can be found here: > > > > > > > > https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/ > > > > > > > > Please let us know if you need any help to debug the issue or try > > > > a fix on this platform. > > > > > > I am also seeing this problem on an iMX8MQ board and can help test if you > > > have a fix. > > > > This is also causing boot failures on Jetson AGX Xavier. The origin is > > slightly different from the above kernelci.org report, but the BUG_ON is > > the same: > > > > [ 2.650447] ------------[ cut here ]------------ > > [ 2.650588] kernel BUG at include/linux/iommu-helper.h:23! 
> > [ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > [ 2.654330] Modules linked in: > > [ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120 > > [ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT) > > [ 2.674096] Workqueue: events deferred_probe_work_func > > [ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--) > > [ 2.684949] pc : find_slots.isra.0+0x118/0x2f0 > > [ 2.689494] lr : find_slots.isra.0+0x88/0x2f0 > > [ 2.693696] sp : ffff800011faf950 > > [ 2.697281] x29: ffff800011faf950 x28: 0000000000000001 > > [ 2.702537] x27: 0000000000000001 x26: 0000000000000000 > > [ 2.708131] x25: 0000000000000001 x24: 0000000105f03148 > > [ 2.713556] x23: 0000000000000001 x22: ffff800011559000 > > [ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000 > > [ 2.724493] x19: 0000000000000000 x18: 0000000000000020 > > [ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068 > > [ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff > > [ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040 > > [ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000 > > [ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000 > > [ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000 > > [ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000 > > [ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000 > > [ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800 > > [ 2.778662] Call trace: > > [ 2.781136] find_slots.isra.0+0x118/0x2f0 > > [ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4 > > [ 2.789858] swiotlb_map+0x58/0x200 > > [ 2.793355] dma_direct_map_page+0x148/0x1c0 > > [ 2.797386] dma_map_page_attrs+0x2c/0x54 > > [ 2.801411] dw_pcie_host_init+0x40c/0x4c0 > > [ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4 > > [ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c > > [ 2.814185] platform_probe+0x68/0xe0 > > [ 2.817688] really_probe+0xe4/0x4c0 > > [ 2.821362] driver_probe_device+0x58/0xc0 > > [ 2.825386] 
__device_attach_driver+0xa8/0x104
> > [ 2.829953] bus_for_each_drv+0x78/0xd0
> > [ 2.833434] __device_attach+0xdc/0x17c
> > [ 2.837631] device_initial_probe+0x14/0x20
> > [ 2.841680] bus_probe_device+0x9c/0xa4
> > [ 2.845160] deferred_probe_work_func+0x74/0xb0
> > [ 2.849734] process_one_work+0x1cc/0x350
> > [ 2.853822] worker_thread+0x20c/0x3ac
> > [ 2.858018] kthread+0x128/0x134
> > [ 2.860997] ret_from_fork+0x10/0x34
> > [ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000)
> > [ 2.870547] ---[ end trace e5c50bdcf12b316e ]---
> > [ 2.875087] note: kworker/2:1[67] exited with preempt_count 2
> > [ 2.880836] ------------[ cut here ]------------
> >
> > I've confirmed that reverting the following commits makes the system
> > boot again:
> >
> > 47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
> > c6f50c7719e7 ("swiotlb: respect min_align_mask")
> > e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
> > 567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")
> >
> > Let me know if I can help test any fixes for this.
>
> FWIW, this sounds like it's probably the same thing for which a fix should
> be pending:
>
> https://lore.kernel.org/linux-iommu/20210223072514.GA18079@lst.de/T/#u

Yep, changing max_slots from unsigned int to unsigned long fixes this as well. Thanks for the pointer!

Thierry
* Re: next/master bisection: baseline.login on r8a77960-ulcb
From: Heiko Thiery @ 2021-02-25 13:00 UTC
To: Thierry Reding
Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker, linux-kernel, iommu, Robin Murphy, Christoph Hellwig, Jianxiong Gao

Hi all,

On Thu, 25 Feb 2021 at 12:50, Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Thu, Feb 25, 2021 at 11:14:57AM +0000, Robin Murphy wrote:
> > On 2021-02-25 11:09, Thierry Reding wrote:
> > > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
> > > > Hi Christoph and all,
> > > >
> > > > On 23.02.21 10:56, Guillaume Tucker wrote:
> > > > > Hi Christoph,
> > > > >
> > > > > Please see the bisection report below about a boot failure on
> > > > > r8a77960-ulcb on next-20210222.
> > > > >
> > > > > Reports aren't automatically sent to the public while we're
> > > > > trialing new bisection features on kernelci.org but this one
> > > > > looks valid.
> > > > >
> > > > > The log shows a kernel panic, more details can be found here:
> > > > >
> > > > > https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
> > > > >
> > > > > Please let us know if you need any help to debug the issue or try
> > > > > a fix on this platform.
> > > >
> > > > I am also seeing this problem on an iMX8MQ board and can help test if you
> > > > have a fix.
> > >
> > > This is also causing boot failures on Jetson AGX Xavier. The origin is
> > > slightly different from the above kernelci.org report, but the BUG_ON is
> > > the same:
> > >
> > > [ 2.650447] ------------[ cut here ]------------
> > > [ 2.650588] kernel BUG at include/linux/iommu-helper.h:23!
> > > [ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > [ 2.654330] Modules linked in: > > > [ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120 > > > [ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT) > > > [ 2.674096] Workqueue: events deferred_probe_work_func > > > [ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--) > > > [ 2.684949] pc : find_slots.isra.0+0x118/0x2f0 > > > [ 2.689494] lr : find_slots.isra.0+0x88/0x2f0 > > > [ 2.693696] sp : ffff800011faf950 > > > [ 2.697281] x29: ffff800011faf950 x28: 0000000000000001 > > > [ 2.702537] x27: 0000000000000001 x26: 0000000000000000 > > > [ 2.708131] x25: 0000000000000001 x24: 0000000105f03148 > > > [ 2.713556] x23: 0000000000000001 x22: ffff800011559000 > > > [ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000 > > > [ 2.724493] x19: 0000000000000000 x18: 0000000000000020 > > > [ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068 > > > [ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff > > > [ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040 > > > [ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000 > > > [ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000 > > > [ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000 > > > [ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000 > > > [ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000 > > > [ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800 > > > [ 2.778662] Call trace: > > > [ 2.781136] find_slots.isra.0+0x118/0x2f0 > > > [ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4 > > > [ 2.789858] swiotlb_map+0x58/0x200 > > > [ 2.793355] dma_direct_map_page+0x148/0x1c0 > > > [ 2.797386] dma_map_page_attrs+0x2c/0x54 > > > [ 2.801411] dw_pcie_host_init+0x40c/0x4c0 > > > [ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4 > > > [ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c > > > [ 2.814185] platform_probe+0x68/0xe0 > > > [ 2.817688] 
really_probe+0xe4/0x4c0 > > > [ 2.821362] driver_probe_device+0x58/0xc0 > > > [ 2.825386] __device_attach_driver+0xa8/0x104 > > > [ 2.829953] bus_for_each_drv+0x78/0xd0 > > > [ 2.833434] __device_attach+0xdc/0x17c > > > [ 2.837631] device_initial_probe+0x14/0x20 > > > [ 2.841680] bus_probe_device+0x9c/0xa4 > > > [ 2.845160] deferred_probe_work_func+0x74/0xb0 > > > [ 2.849734] process_one_work+0x1cc/0x350 > > > [ 2.853822] worker_thread+0x20c/0x3ac > > > [ 2.858018] kthread+0x128/0x134 > > > [ 2.860997] ret_from_fork+0x10/0x34 > > > [ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000) > > > [ 2.870547] ---[ end trace e5c50bdcf12b316e ]--- > > > [ 2.875087] note: kworker/2:1[67] exited with preempt_count 2 > > > [ 2.880836] ------------[ cut here ]------------ > > > > > > I've confirmed that reverting the following commits makes the system > > > boot again: > > > > > > 47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path") > > > c6f50c7719e7 ("swiotlb: respect min_align_mask") > > > e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single") > > > 567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single") > > > > > > Let me know if I can help test any fixes for this. > > > > FWIW, this sounds like it's probably the same thing for which a fix should > > be pending: > > > > https://lore.kernel.org/linux-iommu/20210223072514.GA18079@lst.de/T/#u > > Yep, changing max_slots from unsigned int to unsigned long fixes this as > well. Thanks for the pointer! I also can confirm that changing that to unsigned long fixes the issue. Thank you -- Heiko _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
* RE: next/master bisection: baseline.login on r8a77960-ulcb 2021-02-25 13:00 ` Heiko Thiery @ 2021-02-26 8:56 ` Yoshihiro Shimoda 0 siblings, 0 replies; 7+ messages in thread From: Yoshihiro Shimoda @ 2021-02-26 8:56 UTC (permalink / raw) To: Heiko Thiery, Thierry Reding Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker, linux-kernel, iommu, Robin Murphy, Christoph Hellwig, Jianxiong Gao Hi all, > From: Heiko Thiery, Sent: Thursday, February 25, 2021 10:01 PM > On Thu, Feb 25, 2021 at 12:50, Thierry Reding wrote: > > On Thu, Feb 25, 2021 at 11:14:57AM +0000, Robin Murphy wrote: > > > On 2021-02-25 11:09, Thierry Reding wrote: > > > > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote: > > > > > Hi Christoph and all, > > > > > > > > > > On 23.02.21 10:56, Guillaume Tucker wrote: > > > > > > Hi Christoph, > > > > > > > > > > > > Please see the bisection report below about a boot failure on > > > > > > r8a77960-ulcb on next-20210222. > > > > > > > > > > > > Reports aren't automatically sent to the public while we're > > > > > > trialing new bisection features on kernelci.org but this one > > > > > > looks valid. > > > > > > > > > > > > The log shows a kernel panic, more details can be found here: <snip> > > > > Yep, changing max_slots from unsigned int to unsigned long fixes this as > > well. Thanks for the pointer! > > I also can confirm that changing that to unsigned long fixes the issue. Thank you for the information! I also confirmed that changing the type of max_slots fixed the issue in my environment (r8a77951-salvator-xs.dts with defconfig). Best regards, Yoshihiro Shimoda
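For readers following the thread, here is a minimal standalone sketch of why the width of max_slots could matter. It is not the kernel code. It assumes that the slot count for a device without a segment boundary restriction is a power of two above 2^32 (here 1 << 53, derived from an assumed IO_TLB_SHIFT of 11 on a 64-bit build), and that the BUG_ON at include/linux/iommu-helper.h:23 rejects a value that is not a power of two. Neither assumption is spelled out in the messages above. All demo_* names are hypothetical; uint64_t stands in for unsigned long on arm64 and uint32_t for unsigned int.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustration only: uint64_t models the kernel's unsigned long on
 * arm64 and uint32_t models unsigned int. DEMO_IO_TLB_SHIFT = 11 is an
 * assumed value, and every demo_* name is hypothetical.
 */
#define DEMO_BITS_PER_LONG 64
#define DEMO_IO_TLB_SHIFT  11

/* Power-of-two test: exactly one bit set, so zero does not qualify. */
static bool demo_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Assumed slot count when the device has no segment boundary limit. */
static uint64_t demo_max_slots(void)
{
	return UINT64_C(1) << (DEMO_BITS_PER_LONG - DEMO_IO_TLB_SHIFT);
}

/* A 32-bit variable drops bit 53, so the stored value becomes 0. */
static bool demo_narrow_is_valid(void)
{
	uint32_t max_slots = (uint32_t)demo_max_slots();

	return demo_is_power_of_2(max_slots);
}

/* A 64-bit variable keeps the single set bit intact. */
static bool demo_wide_is_valid(void)
{
	uint64_t max_slots = demo_max_slots();

	return demo_is_power_of_2(max_slots);
}
```

Under these assumptions, demo_narrow_is_valid() returns false and demo_wide_is_valid() returns true, which would be consistent with the reports above that widening max_slots to unsigned long makes the boards boot again.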
end of thread, other threads: [~2021-02-26 9:11 UTC]

Thread overview: 7+ messages
[not found] <60346234.1c69fb81.cd55e.770d@mx.google.com>
2021-02-23 9:56 ` next/master bisection: baseline.login on r8a77960-ulcb Guillaume Tucker
2021-02-24 21:39 ` Heiko Thiery
2021-02-25 11:09 ` Thierry Reding
2021-02-25 11:14 ` Robin Murphy
2021-02-25 11:50 ` Thierry Reding
2021-02-25 13:00 ` Heiko Thiery
2021-02-26 8:56 ` Yoshihiro Shimoda