* Re: next/master bisection: baseline.login on r8a77960-ulcb
[not found] <60346234.1c69fb81.cd55e.770d@mx.google.com>
@ 2021-02-23 9:56 ` Guillaume Tucker
2021-02-24 21:39 ` Heiko Thiery
0 siblings, 1 reply; 7+ messages in thread
From: Guillaume Tucker @ 2021-02-23 9:56 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk, Jianxiong Gao, Christoph Hellwig
Cc: iommu, Robin Murphy, linux-kernel, kernelci-results
Hi Christoph,
Please see the bisection report below about a boot failure on
r8a77960-ulcb on next-20210222.
Reports aren't automatically sent to the public while we're
trialing new bisection features on kernelci.org but this one
looks valid.
The log shows a kernel panic; more details can be found here:
https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
Please let us know if you need any help to debug the issue or try
a fix on this platform.
Best wishes,
Guillaume
On 23/02/2021 02:02, KernelCI bot wrote:
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis *
> * that you may be involved with the breaking commit it has *
> * found. No manual investigation has been done to verify it, *
> * and the root cause of the problem may be somewhere else. *
> * *
> * If you do send a fix, please include this trailer: *
> * Reported-by: "kernelci.org bot" <bot@kernelci.org> *
> * *
> * Hope this helps! *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>
> next/master bisection: baseline.login on r8a77960-ulcb
>
> Summary:
> Start: 37dfbfbdca66 Add linux-next specific files for 20210222
> Plain log: https://storage.kernelci.org/next/master/next-20210222/arm64/defconfig/clang-10/lab-baylibre/baseline-r8a77960-ulcb.txt
> HTML log: https://storage.kernelci.org/next/master/next-20210222/arm64/defconfig/clang-10/lab-baylibre/baseline-r8a77960-ulcb.html
> Result: 567d877f9a7d swiotlb: refactor swiotlb_tbl_map_single
>
> Checks:
> revert: PASS
> verify: PASS
>
> Parameters:
> Tree: next
> URL: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> Branch: master
> Target: r8a77960-ulcb
> CPU arch: arm64
> Lab: lab-baylibre
> Compiler: clang-10
> Config: defconfig
> Test case: baseline.login
>
> Breaking commit found:
>
> -------------------------------------------------------------------------------
> commit 567d877f9a7d6bf4e4bf0ecd6de23fec8039b123
> Author: Christoph Hellwig <hch@lst.de>
> Date: Thu Feb 4 11:08:35 2021 +0100
>
> swiotlb: refactor swiotlb_tbl_map_single
>
> Split out a bunch of a self-contained helpers to make the function easier
> to follow.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Jianxiong Gao <jxgao@google.com>
> Tested-by: Jianxiong Gao <jxgao@google.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index b38b1553c466..381c24ef1ac1 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -468,134 +468,133 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
> }
> }
>
> -phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t orig_addr,
> - size_t mapping_size, size_t alloc_size,
> - enum dma_data_direction dir, unsigned long attrs)
> -{
> - dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(hwdev, io_tlb_start);
> - unsigned long flags;
> - phys_addr_t tlb_addr;
> - unsigned int nslots, stride, index, wrap;
> - int i;
> - unsigned long mask;
> - unsigned long offset_slots;
> - unsigned long max_slots;
> - unsigned long tmp_io_tlb_used;
> -
> - if (no_iotlb_memory)
> - panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
> -
> - if (mem_encrypt_active())
> - pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");
> +#define slot_addr(start, idx) ((start) + ((idx) << IO_TLB_SHIFT))
>
> - if (mapping_size > alloc_size) {
> - dev_warn_once(hwdev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)",
> - mapping_size, alloc_size);
> - return (phys_addr_t)DMA_MAPPING_ERROR;
> - }
> -
> - mask = dma_get_seg_boundary(hwdev);
> +/*
> + * Carefully handle integer overflow which can occur when boundary_mask == ~0UL.
> + */
> +static inline unsigned long get_max_slots(unsigned long boundary_mask)
> +{
> + if (boundary_mask == ~0UL)
> + return 1UL << (BITS_PER_LONG - IO_TLB_SHIFT);
> + return nr_slots(boundary_mask + 1);
> +}
>
> - tbl_dma_addr &= mask;
> +static unsigned int wrap_index(unsigned int index)
> +{
> + if (index >= io_tlb_nslabs)
> + return 0;
> + return index;
> +}
>
> - offset_slots = nr_slots(tbl_dma_addr);
> +/*
> + * Find a suitable number of IO TLB entries size that will fit this request and
> + * allocate a buffer from that IO TLB pool.
> + */
> +static int find_slots(struct device *dev, size_t alloc_size)
> +{
> + unsigned long boundary_mask = dma_get_seg_boundary(dev);
> + dma_addr_t tbl_dma_addr =
> + phys_to_dma_unencrypted(dev, io_tlb_start) & boundary_mask;
> + unsigned int max_slots = get_max_slots(boundary_mask);
> + unsigned int nslots = nr_slots(alloc_size), stride = 1;
> + unsigned int index, wrap, count = 0, i;
> + unsigned long flags;
>
> - /*
> - * Carefully handle integer overflow which can occur when mask == ~0UL.
> - */
> - max_slots = mask + 1
> - ? nr_slots(mask + 1)
> - : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT);
> + BUG_ON(!nslots);
>
> /*
> * For mappings greater than or equal to a page, we limit the stride
> * (and hence alignment) to a page size.
> */
> - nslots = nr_slots(alloc_size);
> if (alloc_size >= PAGE_SIZE)
> - stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
> - else
> - stride = 1;
> + stride <<= (PAGE_SHIFT - IO_TLB_SHIFT);
>
> - BUG_ON(!nslots);
> -
> - /*
> - * Find suitable number of IO TLB entries size that will fit this
> - * request and allocate a buffer from that IO TLB pool.
> - */
> spin_lock_irqsave(&io_tlb_lock, flags);
> -
> if (unlikely(nslots > io_tlb_nslabs - io_tlb_used))
> goto not_found;
>
> - index = ALIGN(io_tlb_index, stride);
> - if (index >= io_tlb_nslabs)
> - index = 0;
> - wrap = index;
> -
> + index = wrap = wrap_index(ALIGN(io_tlb_index, stride));
> do {
> - while (iommu_is_span_boundary(index, nslots, offset_slots,
> - max_slots)) {
> - index += stride;
> - if (index >= io_tlb_nslabs)
> - index = 0;
> - if (index == wrap)
> - goto not_found;
> - }
> -
> /*
> * If we find a slot that indicates we have 'nslots' number of
> * contiguous buffers, we allocate the buffers from that slot
> * and mark the entries as '0' indicating unavailable.
> */
> - if (io_tlb_list[index] >= nslots) {
> - int count = 0;
> -
> - for (i = index; i < (int) (index + nslots); i++)
> - io_tlb_list[i] = 0;
> - for (i = index - 1;
> - io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
> - io_tlb_list[i]; i--)
> - io_tlb_list[i] = ++count;
> - tlb_addr = io_tlb_start + (index << IO_TLB_SHIFT);
> -
> - /*
> - * Update the indices to avoid searching in the next
> - * round.
> - */
> - io_tlb_index = ((index + nslots) < io_tlb_nslabs
> - ? (index + nslots) : 0);
> -
> - goto found;
> + if (!iommu_is_span_boundary(index, nslots,
> + nr_slots(tbl_dma_addr),
> + max_slots)) {
> + if (io_tlb_list[index] >= nslots)
> + goto found;
> }
> - index += stride;
> - if (index >= io_tlb_nslabs)
> - index = 0;
> + index = wrap_index(index + stride);
> } while (index != wrap);
>
> not_found:
> - tmp_io_tlb_used = io_tlb_used;
> -
> spin_unlock_irqrestore(&io_tlb_lock, flags);
> - if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
> - dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
> - alloc_size, io_tlb_nslabs, tmp_io_tlb_used);
> - return (phys_addr_t)DMA_MAPPING_ERROR;
> + return -1;
> +
> found:
> + for (i = index; i < index + nslots; i++)
> + io_tlb_list[i] = 0;
> + for (i = index - 1;
> + io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
> + io_tlb_list[i]; i--)
> + io_tlb_list[i] = ++count;
> +
> + /*
> + * Update the indices to avoid searching in the next round.
> + */
> + if (index + nslots < io_tlb_nslabs)
> + io_tlb_index = index + nslots;
> + else
> + io_tlb_index = 0;
> io_tlb_used += nslots;
> +
> spin_unlock_irqrestore(&io_tlb_lock, flags);
> + return index;
> +}
> +
> +phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
> + size_t mapping_size, size_t alloc_size,
> + enum dma_data_direction dir, unsigned long attrs)
> +{
> + unsigned int index, i;
> + phys_addr_t tlb_addr;
> +
> + if (no_iotlb_memory)
> + panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
> +
> + if (mem_encrypt_active())
> + pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");
> +
> + if (mapping_size > alloc_size) {
> + dev_warn_once(dev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)",
> + mapping_size, alloc_size);
> + return (phys_addr_t)DMA_MAPPING_ERROR;
> + }
> +
> + index = find_slots(dev, alloc_size);
> + if (index == -1) {
> + if (!(attrs & DMA_ATTR_NO_WARN))
> + dev_warn_ratelimited(dev,
> + "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
> + alloc_size, io_tlb_nslabs, io_tlb_used);
> + return (phys_addr_t)DMA_MAPPING_ERROR;
> + }
>
> /*
> * Save away the mapping from the original address to the DMA address.
> * This is needed when we sync the memory. Then we sync the buffer if
> * needed.
> */
> - for (i = 0; i < nslots; i++)
> - io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
> + for (i = 0; i < nr_slots(alloc_size); i++)
> + io_tlb_orig_addr[index + i] = slot_addr(orig_addr, i);
> +
> + tlb_addr = slot_addr(io_tlb_start, index);
> if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
> swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
> -
> return tlb_addr;
> }
> -------------------------------------------------------------------------------
>
>
> Git bisection log:
>
> -------------------------------------------------------------------------------
> git bisect start
> # good: [d99676af540c2dc829999928fb81c58c80a1dce4] Merge tag 'drm-next-2021-02-19' of git://anongit.freedesktop.org/drm/drm
> git bisect good d99676af540c2dc829999928fb81c58c80a1dce4
> # bad: [37dfbfbdca66834bc0f64ec9b35e09ac6c8898da] Add linux-next specific files for 20210222
> git bisect bad 37dfbfbdca66834bc0f64ec9b35e09ac6c8898da
> # bad: [25c1843cc6b3d64ce774ce7f1dc649ca3109a4c5] Merge remote-tracking branch 'block/for-next'
> git bisect bad 25c1843cc6b3d64ce774ce7f1dc649ca3109a4c5
> # good: [705552a85bfda7f2b0a3922b318d74fcc8368fd6] Merge remote-tracking branch 'btrfs/for-next'
> git bisect good 705552a85bfda7f2b0a3922b318d74fcc8368fd6
> # good: [eed3cd1a28b4a41ca25b8f5fbd86449be8ac3216] Merge remote-tracking branch 'v4l-dvb-next/master'
> git bisect good eed3cd1a28b4a41ca25b8f5fbd86449be8ac3216
> # bad: [366e8fe73e13244686604662ddcc70aa14a3e0e6] Merge remote-tracking branch 'rdma/for-next'
> git bisect bad 366e8fe73e13244686604662ddcc70aa14a3e0e6
> # good: [5120bf0a5fc15dec210a0fe0f39e4a256bb6e349] RDMA/rxe: Correct skb on loopback path
> git bisect good 5120bf0a5fc15dec210a0fe0f39e4a256bb6e349
> # good: [ffc46af1757e05652e17c47e4aa2a01bf5aaf3ad] Merge remote-tracking branch 'thermal/thermal/linux-next'
> git bisect good ffc46af1757e05652e17c47e4aa2a01bf5aaf3ad
> # good: [229557230c760e25b6af79709aa85d30de4c8500] RDMA/hns: Remove unused member and variable of CMDQ
> git bisect good 229557230c760e25b6af79709aa85d30de4c8500
> # good: [2b5715fc17386a6223490d5b8f08d031999b0c0b] RDMA/srp: Fix support for unpopulated and unbalanced NUMA nodes
> git bisect good 2b5715fc17386a6223490d5b8f08d031999b0c0b
> # bad: [e952d9a1bc204109a21f7dbedddedc110a33baf1] swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single
> git bisect bad e952d9a1bc204109a21f7dbedddedc110a33baf1
> # good: [c7fbeca757fe74135d8b6a4c8ddaef76f5775d68] swiotlb: factor out an io_tlb_offset helper
> git bisect good c7fbeca757fe74135d8b6a4c8ddaef76f5775d68
> # good: [ca10d0f8e530600ec63c603dbace2c30927d70b7] swiotlb: clean up swiotlb_tbl_unmap_single
> git bisect good ca10d0f8e530600ec63c603dbace2c30927d70b7
> # bad: [567d877f9a7d6bf4e4bf0ecd6de23fec8039b123] swiotlb: refactor swiotlb_tbl_map_single
> git bisect bad 567d877f9a7d6bf4e4bf0ecd6de23fec8039b123
> # first bad commit: [567d877f9a7d6bf4e4bf0ecd6de23fec8039b123] swiotlb: refactor swiotlb_tbl_map_single
> -------------------------------------------------------------------------------
>
>
>
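For readers following the quoted diff, the circular search that find_slots() performs with wrap_index() can be sketched in userspace as below. This is only an illustration under stated assumptions: find_first_fit() and the free_run array are hypothetical stand-ins (free_run[i] is taken to mean "number of contiguous free slots starting at i", in the spirit of io_tlb_list), the pool size is passed as a parameter instead of the global io_tlb_nslabs, and the segment-boundary check and locking from the real function are left out.

```c
#include <assert.h>

/* Same idea as wrap_index() in the diff, with the pool size as a parameter. */
static unsigned int wrap_index(unsigned int index, unsigned int nslabs)
{
	if (index >= nslabs)
		return 0;
	return index;
}

/*
 * Hypothetical helper: walk the pool in steps of 'stride', starting at
 * 'start' rounded up to a multiple of 'stride', and return the first index
 * whose free run can hold 'nslots' slots. Returns -1 after one full lap
 * without a match.
 */
static int find_first_fit(const unsigned int *free_run, unsigned int nslabs,
			  unsigned int start, unsigned int stride,
			  unsigned int nslots)
{
	unsigned int aligned = (start + stride - 1) / stride * stride;
	unsigned int index, wrap;

	index = wrap = wrap_index(aligned, nslabs);
	do {
		if (free_run[index] >= nslots)
			return (int)index;
		index = wrap_index(index + stride, nslabs);
	} while (index != wrap);

	return -1;
}
```

Because the start index is aligned to the stride before the loop, the walk returns to its starting point after one lap, which is what lets the do/while terminate on index == wrap.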
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: next/master bisection: baseline.login on r8a77960-ulcb
2021-02-23 9:56 ` next/master bisection: baseline.login on r8a77960-ulcb Guillaume Tucker
@ 2021-02-24 21:39 ` Heiko Thiery
2021-02-25 11:09 ` Thierry Reding
0 siblings, 1 reply; 7+ messages in thread
From: Heiko Thiery @ 2021-02-24 21:39 UTC (permalink / raw)
To: Guillaume Tucker, Konrad Rzeszutek Wilk, Jianxiong Gao,
Christoph Hellwig
Cc: iommu, Robin Murphy, linux-kernel, kernelci-results
Hi Christoph and all,
On 23.02.21 10:56, Guillaume Tucker wrote:
> Hi Christoph,
>
> Please see the bisection report below about a boot failure on
> r8a77960-ulcb on next-20210222.
>
> Reports aren't automatically sent to the public while we're
> trialing new bisection features on kernelci.org but this one
> looks valid.
>
> The log shows a kernel panic, more details can be found here:
>
> https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
>
> Please let us know if you need any help to debug the issue or try
> a fix on this platform.
I am also seeing this problem on an iMX8MQ board and can help test if
you have a fix.
BR
--
Heiko
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: next/master bisection: baseline.login on r8a77960-ulcb
2021-02-24 21:39 ` Heiko Thiery
@ 2021-02-25 11:09 ` Thierry Reding
2021-02-25 11:14 ` Robin Murphy
0 siblings, 1 reply; 7+ messages in thread
From: Thierry Reding @ 2021-02-25 11:09 UTC (permalink / raw)
To: Heiko Thiery
Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker,
linux-kernel, iommu, Robin Murphy, Christoph Hellwig,
Jianxiong Gao
On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
> Hi Christoph and all,
>
> On 23.02.21 10:56, Guillaume Tucker wrote:
> > Hi Christoph,
> >
> > Please see the bisection report below about a boot failure on
> > r8a77960-ulcb on next-20210222.
> >
> > Reports aren't automatically sent to the public while we're
> > trialing new bisection features on kernelci.org but this one
> > looks valid.
> >
> > The log shows a kernel panic, more details can be found here:
> >
> > https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
> >
> > Please let us know if you need any help to debug the issue or try
> > a fix on this platform.
>
> I am also seeing this problem on an iMX8MQ board and can help test if you
> have a fix.
This is also causing boot failures on Jetson AGX Xavier. The origin is
slightly different from the above kernelci.org report, but the BUG_ON is
the same:
[ 2.650447] ------------[ cut here ]------------
[ 2.650588] kernel BUG at include/linux/iommu-helper.h:23!
[ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 2.654330] Modules linked in:
[ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120
[ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
[ 2.674096] Workqueue: events deferred_probe_work_func
[ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
[ 2.684949] pc : find_slots.isra.0+0x118/0x2f0
[ 2.689494] lr : find_slots.isra.0+0x88/0x2f0
[ 2.693696] sp : ffff800011faf950
[ 2.697281] x29: ffff800011faf950 x28: 0000000000000001
[ 2.702537] x27: 0000000000000001 x26: 0000000000000000
[ 2.708131] x25: 0000000000000001 x24: 0000000105f03148
[ 2.713556] x23: 0000000000000001 x22: ffff800011559000
[ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000
[ 2.724493] x19: 0000000000000000 x18: 0000000000000020
[ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068
[ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff
[ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040
[ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000
[ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000
[ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000
[ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000
[ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000
[ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800
[ 2.778662] Call trace:
[ 2.781136] find_slots.isra.0+0x118/0x2f0
[ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4
[ 2.789858] swiotlb_map+0x58/0x200
[ 2.793355] dma_direct_map_page+0x148/0x1c0
[ 2.797386] dma_map_page_attrs+0x2c/0x54
[ 2.801411] dw_pcie_host_init+0x40c/0x4c0
[ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4
[ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c
[ 2.814185] platform_probe+0x68/0xe0
[ 2.817688] really_probe+0xe4/0x4c0
[ 2.821362] driver_probe_device+0x58/0xc0
[ 2.825386] __device_attach_driver+0xa8/0x104
[ 2.829953] bus_for_each_drv+0x78/0xd0
[ 2.833434] __device_attach+0xdc/0x17c
[ 2.837631] device_initial_probe+0x14/0x20
[ 2.841680] bus_probe_device+0x9c/0xa4
[ 2.845160] deferred_probe_work_func+0x74/0xb0
[ 2.849734] process_one_work+0x1cc/0x350
[ 2.853822] worker_thread+0x20c/0x3ac
[ 2.858018] kthread+0x128/0x134
[ 2.860997] ret_from_fork+0x10/0x34
[ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000)
[ 2.870547] ---[ end trace e5c50bdcf12b316e ]---
[ 2.875087] note: kworker/2:1[67] exited with preempt_count 2
[ 2.880836] ------------[ cut here ]------------
I've confirmed that reverting the following commits makes the system
boot again:
47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
c6f50c7719e7 ("swiotlb: respect min_align_mask")
e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")
Let me know if I can help test any fixes for this.
Thierry
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: next/master bisection: baseline.login on r8a77960-ulcb
2021-02-25 11:09 ` Thierry Reding
@ 2021-02-25 11:14 ` Robin Murphy
2021-02-25 11:50 ` Thierry Reding
0 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2021-02-25 11:14 UTC (permalink / raw)
To: Thierry Reding, Heiko Thiery
Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker,
linux-kernel, iommu, Christoph Hellwig, Jianxiong Gao
On 2021-02-25 11:09, Thierry Reding wrote:
> On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
>> Hi Christoph and all,
>>
>> On 23.02.21 10:56, Guillaume Tucker wrote:
>>> Hi Christoph,
>>>
>>> Please see the bisection report below about a boot failure on
>>> r8a77960-ulcb on next-20210222.
>>>
>>> Reports aren't automatically sent to the public while we're
>>> trialing new bisection features on kernelci.org but this one
>>> looks valid.
>>>
>>> The log shows a kernel panic, more details can be found here:
>>>
>>> https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
>>>
>>> Please let us know if you need any help to debug the issue or try
>>> a fix on this platform.
>>
>> I am also seeing this problem on an iMX8MQ board and can help test if you
>> have a fix.
>
> This is also causing boot failures on Jetson AGX Xavier. The origin is
> slightly different from the above kernelci.org report, but the BUG_ON is
> the same:
>
> [ 2.650447] ------------[ cut here ]------------
> [ 2.650588] kernel BUG at include/linux/iommu-helper.h:23!
> [ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 2.654330] Modules linked in:
> [ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120
> [ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
> [ 2.674096] Workqueue: events deferred_probe_work_func
> [ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
> [ 2.684949] pc : find_slots.isra.0+0x118/0x2f0
> [ 2.689494] lr : find_slots.isra.0+0x88/0x2f0
> [ 2.693696] sp : ffff800011faf950
> [ 2.697281] x29: ffff800011faf950 x28: 0000000000000001
> [ 2.702537] x27: 0000000000000001 x26: 0000000000000000
> [ 2.708131] x25: 0000000000000001 x24: 0000000105f03148
> [ 2.713556] x23: 0000000000000001 x22: ffff800011559000
> [ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000
> [ 2.724493] x19: 0000000000000000 x18: 0000000000000020
> [ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068
> [ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff
> [ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040
> [ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000
> [ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000
> [ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000
> [ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000
> [ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000
> [ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800
> [ 2.778662] Call trace:
> [ 2.781136] find_slots.isra.0+0x118/0x2f0
> [ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4
> [ 2.789858] swiotlb_map+0x58/0x200
> [ 2.793355] dma_direct_map_page+0x148/0x1c0
> [ 2.797386] dma_map_page_attrs+0x2c/0x54
> [ 2.801411] dw_pcie_host_init+0x40c/0x4c0
> [ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4
> [ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c
> [ 2.814185] platform_probe+0x68/0xe0
> [ 2.817688] really_probe+0xe4/0x4c0
> [ 2.821362] driver_probe_device+0x58/0xc0
> [ 2.825386] __device_attach_driver+0xa8/0x104
> [ 2.829953] bus_for_each_drv+0x78/0xd0
> [ 2.833434] __device_attach+0xdc/0x17c
> [ 2.837631] device_initial_probe+0x14/0x20
> [ 2.841680] bus_probe_device+0x9c/0xa4
> [ 2.845160] deferred_probe_work_func+0x74/0xb0
> [ 2.849734] process_one_work+0x1cc/0x350
> [ 2.853822] worker_thread+0x20c/0x3ac
> [ 2.858018] kthread+0x128/0x134
> [ 2.860997] ret_from_fork+0x10/0x34
> [ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000)
> [ 2.870547] ---[ end trace e5c50bdcf12b316e ]---
> [ 2.875087] note: kworker/2:1[67] exited with preempt_count 2
> [ 2.880836] ------------[ cut here ]------------
>
> I've confirmed that reverting the following commits makes the system
> boot again:
>
> 47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
> c6f50c7719e7 ("swiotlb: respect min_align_mask")
> e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
> 567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")
>
> Let me know if I can help test any fixes for this.
FWIW, this sounds like it's probably the same thing for which a fix
should be pending:
https://lore.kernel.org/linux-iommu/20210223072514.GA18079@lst.de/T/#u
Robin.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: next/master bisection: baseline.login on r8a77960-ulcb
2021-02-25 11:14 ` Robin Murphy
@ 2021-02-25 11:50 ` Thierry Reding
2021-02-25 13:00 ` Heiko Thiery
0 siblings, 1 reply; 7+ messages in thread
From: Thierry Reding @ 2021-02-25 11:50 UTC (permalink / raw)
To: Robin Murphy
Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker,
linux-kernel, iommu, Heiko Thiery, Christoph Hellwig,
Jianxiong Gao
On Thu, Feb 25, 2021 at 11:14:57AM +0000, Robin Murphy wrote:
> On 2021-02-25 11:09, Thierry Reding wrote:
> > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
> > > Hi Christoph and all,
> > >
> > > On 23.02.21 10:56, Guillaume Tucker wrote:
> > > > Hi Christoph,
> > > >
> > > > Please see the bisection report below about a boot failure on
> > > > r8a77960-ulcb on next-20210222.
> > > >
> > > > Reports aren't automatically sent to the public while we're
> > > > trialing new bisection features on kernelci.org but this one
> > > > looks valid.
> > > >
> > > > The log shows a kernel panic, more details can be found here:
> > > >
> > > > https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
> > > >
> > > > Please let us know if you need any help to debug the issue or try
> > > > a fix on this platform.
> > >
> > > I am also seeing this problem on an iMX8MQ board and can help test if you
> > > have a fix.
> >
> > This is also causing boot failures on Jetson AGX Xavier. The origin is
> > slightly different from the above kernelci.org report, but the BUG_ON is
> > the same:
> >
> > [ 2.650447] ------------[ cut here ]------------
> > [ 2.650588] kernel BUG at include/linux/iommu-helper.h:23!
> > [ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > [ 2.654330] Modules linked in:
> > [ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120
> > [ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
> > [ 2.674096] Workqueue: events deferred_probe_work_func
> > [ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
> > [ 2.684949] pc : find_slots.isra.0+0x118/0x2f0
> > [ 2.689494] lr : find_slots.isra.0+0x88/0x2f0
> > [ 2.693696] sp : ffff800011faf950
> > [ 2.697281] x29: ffff800011faf950 x28: 0000000000000001
> > [ 2.702537] x27: 0000000000000001 x26: 0000000000000000
> > [ 2.708131] x25: 0000000000000001 x24: 0000000105f03148
> > [ 2.713556] x23: 0000000000000001 x22: ffff800011559000
> > [ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000
> > [ 2.724493] x19: 0000000000000000 x18: 0000000000000020
> > [ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068
> > [ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff
> > [ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040
> > [ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000
> > [ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000
> > [ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000
> > [ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000
> > [ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000
> > [ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800
> > [ 2.778662] Call trace:
> > [ 2.781136] find_slots.isra.0+0x118/0x2f0
> > [ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4
> > [ 2.789858] swiotlb_map+0x58/0x200
> > [ 2.793355] dma_direct_map_page+0x148/0x1c0
> > [ 2.797386] dma_map_page_attrs+0x2c/0x54
> > [ 2.801411] dw_pcie_host_init+0x40c/0x4c0
> > [ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4
> > [ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c
> > [ 2.814185] platform_probe+0x68/0xe0
> > [ 2.817688] really_probe+0xe4/0x4c0
> > [ 2.821362] driver_probe_device+0x58/0xc0
> > [ 2.825386] __device_attach_driver+0xa8/0x104
> > [ 2.829953] bus_for_each_drv+0x78/0xd0
> > [ 2.833434] __device_attach+0xdc/0x17c
> > [ 2.837631] device_initial_probe+0x14/0x20
> > [ 2.841680] bus_probe_device+0x9c/0xa4
> > [ 2.845160] deferred_probe_work_func+0x74/0xb0
> > [ 2.849734] process_one_work+0x1cc/0x350
> > [ 2.853822] worker_thread+0x20c/0x3ac
> > [ 2.858018] kthread+0x128/0x134
> > [ 2.860997] ret_from_fork+0x10/0x34
> > [ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000)
> > [ 2.870547] ---[ end trace e5c50bdcf12b316e ]---
> > [ 2.875087] note: kworker/2:1[67] exited with preempt_count 2
> > [ 2.880836] ------------[ cut here ]------------
> >
> > I've confirmed that reverting the following commits makes the system
> > boot again:
> >
> > 47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
> > c6f50c7719e7 ("swiotlb: respect min_align_mask")
> > e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
> > 567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")
> >
> > Let me know if I can help test any fixes for this.
>
> FWIW, this sounds like it's probably the same thing for which a fix should
> be pending:
>
> https://lore.kernel.org/linux-iommu/20210223072514.GA18079@lst.de/T/#u
Yep, changing max_slots from unsigned int to unsigned long fixes this as
well. Thanks for the pointer!
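
For anyone reading this thread later, here is a minimal userspace sketch of why the narrower type matters. It assumes a 64-bit build (BITS_PER_LONG of 64), IO_TLB_SHIFT of 11, and that the check at include/linux/iommu-helper.h:23 is a power-of-two BUG_ON on the boundary size. The nr_slots() and is_power_of_2() bodies are simplified models, not the kernel sources, and the two boundary_ok_*() helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IO_TLB_SHIFT  11                           /* assumed: 2 KiB slots */
#define IO_TLB_SIZE   (UINT64_C(1) << IO_TLB_SHIFT)
#define BITS_PER_LONG 64                           /* assumed: 64-bit kernel */

/* Simplified model of the kernel's is_power_of_2(). */
static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Simplified model of nr_slots(): slots needed to cover 'val' bytes. */
static uint64_t nr_slots(uint64_t val)
{
	return val / IO_TLB_SIZE + (val % IO_TLB_SIZE != 0);
}

/* Mirrors get_max_slots() from the quoted diff, using uint64_t for unsigned long. */
static uint64_t get_max_slots(uint64_t boundary_mask)
{
	if (boundary_mask == UINT64_MAX)
		return UINT64_C(1) << (BITS_PER_LONG - IO_TLB_SHIFT);
	return nr_slots(boundary_mask + 1);
}

/* Hypothetical: store the result in a 32-bit variable, as the refactor did. */
static bool boundary_ok_with_uint(uint64_t boundary_mask)
{
	uint32_t max_slots = (uint32_t)get_max_slots(boundary_mask);

	return is_power_of_2(max_slots);
}

/* Hypothetical: store the result in a 64-bit variable, as in the fix. */
static bool boundary_ok_with_ulong(uint64_t boundary_mask)
{
	uint64_t max_slots = get_max_slots(boundary_mask);

	return is_power_of_2(max_slots);
}
```

With a boundary mask of all ones, get_max_slots() yields 1 << 53, which truncates to 0 in a 32-bit variable and therefore fails the power-of-two check. The 64-bit variant keeps the value intact, and a 32-bit boundary mask passes in both cases.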
Thierry
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: next/master bisection: baseline.login on r8a77960-ulcb
2021-02-25 11:50 ` Thierry Reding
@ 2021-02-25 13:00 ` Heiko Thiery
2021-02-26 8:56 ` Yoshihiro Shimoda
0 siblings, 1 reply; 7+ messages in thread
From: Heiko Thiery @ 2021-02-25 13:00 UTC (permalink / raw)
To: Thierry Reding
Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker,
linux-kernel, iommu, Robin Murphy, Christoph Hellwig,
Jianxiong Gao
Hi all,
On Thu, 25 Feb 2021 at 12:50, Thierry Reding
<thierry.reding@gmail.com> wrote:
>
> On Thu, Feb 25, 2021 at 11:14:57AM +0000, Robin Murphy wrote:
> > On 2021-02-25 11:09, Thierry Reding wrote:
> > > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
> > > > Hi Christoph and all,
> > > >
> > > > On 23.02.21 10:56, Guillaume Tucker wrote:
> > > > > Hi Christoph,
> > > > >
> > > > > Please see the bisection report below about a boot failure on
> > > > > r8a77960-ulcb on next-20210222.
> > > > >
> > > > > Reports aren't automatically sent to the public while we're
> > > > > trialing new bisection features on kernelci.org but this one
> > > > > looks valid.
> > > > >
> > > > > The log shows a kernel panic, more details can be found here:
> > > > >
> > > > > https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
> > > > >
> > > > > Please let us know if you need any help to debug the issue or try
> > > > > a fix on this platform.
> > > >
> > > > I am also seeing this problem on an iMX8MQ board and can help test if you
> > > > have a fix.
> > >
> > > This is also causing boot failures on Jetson AGX Xavier. The origin is
> > > slightly different from the above kernelci.org report, but the BUG_ON is
> > > the same:
> > >
> > > [ 2.650447] ------------[ cut here ]------------
> > > [ 2.650588] kernel BUG at include/linux/iommu-helper.h:23!
> > > [ 2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > > [ 2.654330] Modules linked in:
> > > [ 2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120
> > > [ 2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
> > > [ 2.674096] Workqueue: events deferred_probe_work_func
> > > [ 2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
> > > [ 2.684949] pc : find_slots.isra.0+0x118/0x2f0
> > > [ 2.689494] lr : find_slots.isra.0+0x88/0x2f0
> > > [ 2.693696] sp : ffff800011faf950
> > > [ 2.697281] x29: ffff800011faf950 x28: 0000000000000001
> > > [ 2.702537] x27: 0000000000000001 x26: 0000000000000000
> > > [ 2.708131] x25: 0000000000000001 x24: 0000000105f03148
> > > [ 2.713556] x23: 0000000000000001 x22: ffff800011559000
> > > [ 2.718835] x21: ffff800011559a80 x20: 00000000edc00000
> > > [ 2.724493] x19: 0000000000000000 x18: 0000000000000020
> > > [ 2.729770] x17: ffff0003ffd7d160 x16: 0000000000000068
> > > [ 2.735173] x15: ffff000080b43150 x14: ffffffffffffffff
> > > [ 2.740944] x13: ffff000082b5d791 x12: 0000000000000040
> > > [ 2.746113] x11: ffff0000a0000248 x10: 0000000000000000
> > > [ 2.751882] x9 : 0000000000000000 x8 : ffff0003fed30000
> > > [ 2.757139] x7 : 0000000000000000 x6 : 0000000000000000
> > > [ 2.762818] x5 : 0000000000000000 x4 : 0000000000000000
> > > [ 2.767984] x3 : 00000001e8303148 x2 : 0000000000008000
> > > [ 2.773580] x1 : ffffffffffffffff x0 : 00000000001db800
> > > [ 2.778662] Call trace:
> > > [ 2.781136] find_slots.isra.0+0x118/0x2f0
> > > [ 2.785137] swiotlb_tbl_map_single+0x80/0x1b4
> > > [ 2.789858] swiotlb_map+0x58/0x200
> > > [ 2.793355] dma_direct_map_page+0x148/0x1c0
> > > [ 2.797386] dma_map_page_attrs+0x2c/0x54
> > > [ 2.801411] dw_pcie_host_init+0x40c/0x4c0
> > > [ 2.805633] tegra_pcie_config_rp+0x7c/0x1f4
> > > [ 2.810155] tegra_pcie_dw_probe+0x3d0/0x60c
> > > [ 2.814185] platform_probe+0x68/0xe0
> > > [ 2.817688] really_probe+0xe4/0x4c0
> > > [ 2.821362] driver_probe_device+0x58/0xc0
> > > [ 2.825386] __device_attach_driver+0xa8/0x104
> > > [ 2.829953] bus_for_each_drv+0x78/0xd0
> > > [ 2.833434] __device_attach+0xdc/0x17c
> > > [ 2.837631] device_initial_probe+0x14/0x20
> > > [ 2.841680] bus_probe_device+0x9c/0xa4
> > > [ 2.845160] deferred_probe_work_func+0x74/0xb0
> > > [ 2.849734] process_one_work+0x1cc/0x350
> > > [ 2.853822] worker_thread+0x20c/0x3ac
> > > [ 2.858018] kthread+0x128/0x134
> > > [ 2.860997] ret_from_fork+0x10/0x34
> > > [ 2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d4210000)
> > > [ 2.870547] ---[ end trace e5c50bdcf12b316e ]---
> > > [ 2.875087] note: kworker/2:1[67] exited with preempt_count 2
> > > [ 2.880836] ------------[ cut here ]------------
> > >
> > > I've confirmed that reverting the following commits makes the system
> > > boot again:
> > >
> > > 47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
> > > c6f50c7719e7 ("swiotlb: respect min_align_mask")
> > > e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
> > > 567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")
> > >
> > > Let me know if I can help test any fixes for this.
> >
> > FWIW, this sounds like it's probably the same thing for which a fix should
> > be pending:
> >
> > https://lore.kernel.org/linux-iommu/20210223072514.GA18079@lst.de/T/#u
>
> Yep, changing max_slots from unsigned int to unsigned long fixes this as
> well. Thanks for the pointer!
I can also confirm that changing max_slots to unsigned long fixes the issue.
Thank you.
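
Editor's sketch of why the type change reported above might matter. The thread
only confirms that widening max_slots from unsigned int to unsigned long makes
the BUG go away; the mechanism below is an assumption, not something stated in
these messages. It assumes that the slot count computed for an all-ones segment
boundary mask needs more than 32 bits, and that the check at
include/linux/iommu-helper.h:23 requires a power-of-two boundary size. The names
max_slots_for() and max_slots_as_uint() are hypothetical, IO_TLB_SHIFT is an
assumed value of 11, and uint64_t/uint32_t stand in for unsigned long and
unsigned int on arm64. This is user-space C, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed slot size of 2 KiB (shift of 11); not taken from this thread. */
#define IO_TLB_SHIFT 11
#define IO_TLB_SIZE  (UINT64_C(1) << IO_TLB_SHIFT)

/* Zero is not a power of two, so a value truncated to 0 fails this test. */
static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical stand-in for the slot-count computation: the number of
 * IO_TLB_SIZE slots that fit below the segment boundary, rounded up.
 * An all-ones mask is handled separately because mask + 1 would wrap.
 */
static uint64_t max_slots_for(uint64_t boundary_mask)
{
	if (boundary_mask == UINT64_MAX)
		return UINT64_C(1) << (64 - IO_TLB_SHIFT);
	return (boundary_mask + 1 + IO_TLB_SIZE - 1) / IO_TLB_SIZE;
}

/* Models storing the result in a 32-bit "unsigned int" variable. */
static uint32_t max_slots_as_uint(uint64_t boundary_mask)
{
	return (uint32_t)max_slots_for(boundary_mask);
}
```

Under these assumptions, max_slots_for(UINT64_MAX) is 1 << 53, which is a power
of two, while max_slots_as_uint(UINT64_MAX) keeps only the low 32 bits and
yields 0, which is not. A 32-bit boundary mask such as 0xffffffff still fits
after truncation, which would explain why only some devices hit the problem.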
--
Heiko
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
* RE: next/master bisection: baseline.login on r8a77960-ulcb
2021-02-25 13:00 ` Heiko Thiery
@ 2021-02-26 8:56 ` Yoshihiro Shimoda
0 siblings, 0 replies; 7+ messages in thread
From: Yoshihiro Shimoda @ 2021-02-26 8:56 UTC (permalink / raw)
To: Heiko Thiery, Thierry Reding
Cc: kernelci-results, Konrad Rzeszutek Wilk, Guillaume Tucker,
linux-kernel, iommu, Robin Murphy, Christoph Hellwig,
Jianxiong Gao
Hi all,
> From: Heiko Thiery, Sent: Thursday, February 25, 2021 10:01 PM
> Am Do., 25. Feb. 2021 um 12:50 Uhr schrieb Thierry Reding:
> > On Thu, Feb 25, 2021 at 11:14:57AM +0000, Robin Murphy wrote:
> > > On 2021-02-25 11:09, Thierry Reding wrote:
> > > > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
> > > > > Hi Christoph and all,
> > > > >
> > > > > On 23.02.21 10:56, Guillaume Tucker wrote:
> > > > > > Hi Christoph,
> > > > > >
> > > > > > Please see the bisection report below about a boot failure on
> > > > > > r8a77960-ulcb on next-20210222.
> > > > > >
> > > > > > Reports aren't automatically sent to the public while we're
> > > > > > trialing new bisection features on kernelci.org but this one
> > > > > > looks valid.
> > > > > >
> > > > > > The log shows a kernel panic, more details can be found here:
<snip>
> >
> > Yep, changing max_slots from unsigned int to unsigned long fixes this as
> > well. Thanks for the pointer!
>
> I also can confirm that changing that to unsigned long fixes the issue.
Thank you for the information! I also confirmed that changing the type of
max_slots to unsigned long fixes the issue in my environment
(r8a77951-salvator-xs.dts with defconfig).
Best regards,
Yoshihiro Shimoda
Thread overview: 7+ messages
[not found] <60346234.1c69fb81.cd55e.770d@mx.google.com>
2021-02-23 9:56 ` next/master bisection: baseline.login on r8a77960-ulcb Guillaume Tucker
2021-02-24 21:39 ` Heiko Thiery
2021-02-25 11:09 ` Thierry Reding
2021-02-25 11:14 ` Robin Murphy
2021-02-25 11:50 ` Thierry Reding
2021-02-25 13:00 ` Heiko Thiery
2021-02-26 8:56 ` Yoshihiro Shimoda