All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi
       [not found] <625c8753.1c69fb81.b232.69bb@mx.google.com>
@ 2022-04-19 13:31 ` Mark Brown
  2022-04-20  9:18   ` Mike Rapoport
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Brown @ 2022-04-19 13:31 UTC (permalink / raw)
  To: Mike Rapoport, Tony Lindgren, Mark-PK Tsai, Greg Kroah-Hartman
  Cc: kernelci-results, bot, gtucker, stable

[-- Attachment #1: Type: text/plain, Size: 8513 bytes --]

On Sun, Apr 17, 2022 at 02:32:03PM -0700, KernelCI bot wrote:

The KernelCI bisection bot found that commit 6026d4032dbbe3 ("arm:
extend pfn_valid to take into account freed memory map alignment")
triggered a regression in v5.4.x on 32 bit ARM with a qemu platform
booting UEFI firmware.  We try to dereference an invalid pointer parsing
the DMI tables:

<1>[    0.084476] 8<--- cut here ---
<1>[    0.084595] Unable to handle kernel paging request at virtual address dfb76000
<1>[    0.084938] pgd = (ptrval)
<1>[    0.085038] [dfb76000] *pgd=5f7fe801, *pte=00000000, *ppte=00000000

...

<4>[    0.093923] [<c0ed6ce8>] (memcpy) from [<c16a06f8>] (dmi_setup+0x60/0x418)
<4>[    0.094204] [<c16a06f8>] (dmi_setup) from [<c16a38d4>] (arm_dmi_init+0x8/0x10)
<4>[    0.094408] [<c16a38d4>] (arm_dmi_init) from [<c0302e9c>] (do_one_initcall+0x50/0x228)
<4>[    0.094619] [<c0302e9c>] (do_one_initcall) from [<c16011e4>] (kernel_init_freeable+0x15c/0x1f8)
<4>[    0.094841] [<c16011e4>] (kernel_init_freeable) from [<c0f028cc>] (kernel_init+0x8/0x10c)
<4>[    0.095057] [<c0f028cc>] (kernel_init) from [<c03010e8>] (ret_from_fork+0x14/0x2c)

This particular bisect is from GICv2 but GICv3 shows the same issue, and
it persists in the latest stable -rc:

    https://linux.kernelci.org/test/job/stable-rc/branch/linux-5.4.y/kernel/v5.4.189-64-gab55553793398/plan/baseline/

A quick check seems to show that other stable branches are unaffected.
I've left all the context from the report (including full boot logs and
a Reported-by tag) below:

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has      *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.      *
> *                                                               *
> * If you do send a fix, please include this trailer:            *
> *   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *
> *                                                               *
> * Hope this helps!                                              *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi
> 
> Summary:
>   Start:      e7f5213d755bc Linux 5.4.189
>   Plain log:  https://storage.kernelci.org/stable-rc/linux-5.4.y/v5.4.189/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-qemu_arm-virt-gicv2-uefi.txt
>   HTML log:   https://storage.kernelci.org/stable-rc/linux-5.4.y/v5.4.189/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-qemu_arm-virt-gicv2-uefi.html
>   Result:     6026d4032dbbe arm: extend pfn_valid to take into account freed memory map alignment
> 
> Checks:
>   revert:     PASS
>   verify:     PASS
> 
> Parameters:
>   Tree:       stable-rc
>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
>   Branch:     linux-5.4.y
>   Target:     qemu_arm-virt-gicv2-uefi
>   CPU arch:   arm
>   Lab:        lab-baylibre
>   Compiler:   gcc-10
>   Config:     multi_v7_defconfig
>   Test case:  baseline.login
> 
> Breaking commit found:
> 
> -------------------------------------------------------------------------------
> commit 6026d4032dbbe3d7f4ac2c8daa923fe74dcf41c4
> Author: Mike Rapoport <rppt@linux.ibm.com>
> Date:   Mon Dec 13 16:57:09 2021 +0800
> 
>     arm: extend pfn_valid to take into account freed memory map alignment
>     
>     commit a4d5613c4dc6d413e0733e37db9d116a2a36b9f3 upstream.
>     
>     When unused memory map is freed the preserved part of the memory map is
>     extended to match pageblock boundaries because lots of core mm
>     functionality relies on homogeneity of the memory map within pageblock
>     boundaries.
>     
>     Since pfn_valid() is used to check whether there is a valid memory map
>     entry for a PFN, make it return true also for PFNs that have memory map
>     entries even if there is no actual memory populated there.
>     
>     Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>     Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>     Tested-by: Tony Lindgren <tony@atomide.com>
>     Link: https://lore.kernel.org/lkml/20210630071211.21011-1-rppt@kernel.org/
>     Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 5635bcc419af8..ff2cd985d20e0 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -176,11 +176,22 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>  int pfn_valid(unsigned long pfn)
>  {
>  	phys_addr_t addr = __pfn_to_phys(pfn);
> +	unsigned long pageblock_size = PAGE_SIZE * pageblock_nr_pages;
>  
>  	if (__phys_to_pfn(addr) != pfn)
>  		return 0;
>  
> -	return memblock_is_map_memory(__pfn_to_phys(pfn));
> +	/*
> +	 * If address less than pageblock_size bytes away from a present
> +	 * memory chunk there still will be a memory map entry for it
> +	 * because we round freed memory map to the pageblock boundaries.
> +	 */
> +	if (memblock_overlaps_region(&memblock.memory,
> +				     ALIGN_DOWN(addr, pageblock_size),
> +				     pageblock_size))
> +		return 1;
> +
> +	return 0;
>  }
>  EXPORT_SYMBOL(pfn_valid);
>  #endif
> -------------------------------------------------------------------------------
> 
> 
> Git bisection log:
> 
> -------------------------------------------------------------------------------
> git bisect start
> # good: [7f70428f0109470aa9177d1a9e5ce02de736f480] Linux 5.4.165
> git bisect good 7f70428f0109470aa9177d1a9e5ce02de736f480
> # bad: [e7f5213d755bc34f366d36f08825c0b446117d96] Linux 5.4.189
> git bisect bad e7f5213d755bc34f366d36f08825c0b446117d96
> # bad: [902528183f4d94945a0c1ed6048d4a5d4e1e712e] mmc: block: fix read single on recovery logic
> git bisect bad 902528183f4d94945a0c1ed6048d4a5d4e1e712e
> # bad: [c7e4004b38aa7ad482fc46ab76e28879f84ec77e] batman-adv: allow netlink usage in unprivileged containers
> git bisect bad c7e4004b38aa7ad482fc46ab76e28879f84ec77e
> # bad: [db0c834abbc186bda56b1e13b4eb61f7126c12c5] rndis_host: support Hytera digital radios
> git bisect bad db0c834abbc186bda56b1e13b4eb61f7126c12c5
> # bad: [0b01c51c4f47f59ad7eb1ea5bac47fab14b188a5] qlcnic: potential dereference null pointer of rx_queue->page_ring
> git bisect bad 0b01c51c4f47f59ad7eb1ea5bac47fab14b188a5
> # bad: [e7660f9535ade84ea57aed1c55d102bfb23dd2ff] mac80211: fix lookup when adding AddBA extension element
> git bisect bad e7660f9535ade84ea57aed1c55d102bfb23dd2ff
> # bad: [802a1a8501563714a5fe8824f4ed27fec04a0719] firmware: arm_scpi: Fix string overflow in SCPI genpd driver
> git bisect bad 802a1a8501563714a5fe8824f4ed27fec04a0719
> # good: [2fb8e4267c47d69d6bada6310607ea3762f6c962] KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
> git bisect good 2fb8e4267c47d69d6bada6310607ea3762f6c962
> # good: [492f4d3cde95aadcd1d070db5dd4796ae8019165] memblock: ensure there is no overflow in memblock_overlaps_region()
> git bisect good 492f4d3cde95aadcd1d070db5dd4796ae8019165
> # bad: [e8ef940326efd17ca7fdd3cb8791c29a24b04f28] Linux 5.4.167
> git bisect bad e8ef940326efd17ca7fdd3cb8791c29a24b04f28
> # bad: [c97579584fa88df65ff6e4653b175acba154862d] arm: ioremap: don't abuse pfn_valid() to check if pfn is in RAM
> git bisect bad c97579584fa88df65ff6e4653b175acba154862d
> # bad: [6026d4032dbbe3d7f4ac2c8daa923fe74dcf41c4] arm: extend pfn_valid to take into account freed memory map alignment
> git bisect bad 6026d4032dbbe3d7f4ac2c8daa923fe74dcf41c4
> # first bad commit: [6026d4032dbbe3d7f4ac2c8daa923fe74dcf41c4] arm: extend pfn_valid to take into account freed memory map alignment
> -------------------------------------------------------------------------------
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Groups.io Links: You receive all messages sent to this group.
> View/Reply Online (#25917): https://groups.io/g/kernelci-results/message/25917
> Mute This Topic: https://groups.io/mt/90529234/1131744
> Group Owner: kernelci-results+owner@groups.io
> Unsubscribe: https://groups.io/g/kernelci-results/unsub [broonie@kernel.org]
> -=-=-=-=-=-=-=-=-=-=-=-
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi
  2022-04-19 13:31 ` stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi Mark Brown
@ 2022-04-20  9:18   ` Mike Rapoport
  2022-04-20 12:07     ` Mark Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Rapoport @ 2022-04-20  9:18 UTC (permalink / raw)
  To: Mark Brown
  Cc: Tony Lindgren, Mark-PK Tsai, Greg Kroah-Hartman,
	kernelci-results, bot, gtucker, stable

On Tue, Apr 19, 2022 at 02:31:59PM +0100, Mark Brown wrote:
> On Sun, Apr 17, 2022 at 02:32:03PM -0700, KernelCI bot wrote:
> 
> The KernelCI bisection bot found that commit 6026d4032dbbe3 ("arm:
> extend pfn_valid to take into account freed memory map alignment")
> triggered a regression in v5.4.x on 32 bit ARM with a qemu platform
> booting UEFI firmware.  We try to dereference an invalid pointer parsing
> the DMI tables:
> 
> <1>[    0.084476] 8<--- cut here ---
> <1>[    0.084595] Unable to handle kernel paging request at virtual address dfb76000
> <1>[    0.084938] pgd = (ptrval)
> <1>[    0.085038] [dfb76000] *pgd=5f7fe801, *pte=00000000, *ppte=00000000
> 
> ...
> 
> <4>[    0.093923] [<c0ed6ce8>] (memcpy) from [<c16a06f8>] (dmi_setup+0x60/0x418)
> <4>[    0.094204] [<c16a06f8>] (dmi_setup) from [<c16a38d4>] (arm_dmi_init+0x8/0x10)
> <4>[    0.094408] [<c16a38d4>] (arm_dmi_init) from [<c0302e9c>] (do_one_initcall+0x50/0x228)
> <4>[    0.094619] [<c0302e9c>] (do_one_initcall) from [<c16011e4>] (kernel_init_freeable+0x15c/0x1f8)
> <4>[    0.094841] [<c16011e4>] (kernel_init_freeable) from [<c0f028cc>] (kernel_init+0x8/0x10c)
> <4>[    0.095057] [<c0f028cc>] (kernel_init) from [<c03010e8>] (ret_from_fork+0x14/0x2c)
> 
> This particular bisect is from GICv2 but GICv3 shows the same issue, and
> it persists in the latest stable -rc:
> 
>     https://linux.kernelci.org/test/job/stable-rc/branch/linux-5.4.y/kernel/v5.4.189-64-gab55553793398/plan/baseline/

I don't know how exactly kernel-ci runs qemu with UEFI, I've tried this:

$QEMU -serial stdio -M virt-2.11,gic-version=2 -cpu cortex-a15 -m 1G \
-drive file=$QEMU_EFI,if=pflash,format=raw,unit=0,readonly=on \
-drive file=flash1.img,if=pflash,format=raw,unit=1,readonly=off \
-kernel $kernel \
-append "console=ttyAMA0" 

with stable-rc/linux-5.4.y and I've got as far as to failure to mount
rootfs and the crash in dmu_setup() didn't reproduce for me.

My understanding is that with HEAD pointing to 6026d4032dbbe3 crash happens
because ioremap uses pfn_valid() to check if a PFN is in RAM which is fixed
by c97579584fa8 ("arm: ioremap: don't abuse pfn_valid() to check if pfn is
in RAM") that comes on top of 6026d4032dbbe3.

No clues why ab55553793398 fails, though... 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi
  2022-04-20  9:18   ` Mike Rapoport
@ 2022-04-20 12:07     ` Mark Brown
  2022-04-21  6:42       ` Mike Rapoport
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Brown @ 2022-04-20 12:07 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Tony Lindgren, Mark-PK Tsai, Greg Kroah-Hartman,
	kernelci-results, bot, gtucker, stable

[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]

On Wed, Apr 20, 2022 at 12:18:04PM +0300, Mike Rapoport wrote:

> I don't know how exactly kernel-ci runs qemu with UEFI, I've tried this:

> $QEMU -serial stdio -M virt-2.11,gic-version=2 -cpu cortex-a15 -m 1G \
> -drive file=$QEMU_EFI,if=pflash,format=raw,unit=0,readonly=on \
> -drive file=flash1.img,if=pflash,format=raw,unit=1,readonly=off \
> -kernel $kernel \
> -append "console=ttyAMA0" 

> with stable-rc/linux-5.4.y and I've got as far as to failure to mount
> rootfs and the crash in dmu_setup() didn't reproduce for me.

The command should be something to the effect of:

qemu-system-aarch64 -cpu max -machine virt,gic-version=3,mte=on,accel=tcg -nographic -net nic,model=virtio,macaddr=52:54:00:12:34:58 -net user -m 512 -monitor none -bios /var/lib/lava/dispatcher/tmp/75061/deployimages-ptln1wlp/bios/QEMU_EFI.fd -kernel /var/lib/lava/dispatcher/tmp/75061/deployimages-ptln1wlp/kernel/Image -append "console=ttyAMA0,115200 root=/dev/ram0 debug verbose console_msg_format=syslog earlycon" -initrd /var/lib/lava/dispatcher/tmp/75061/deployimages-ptln1wlp/ramdisk/rootfs.cpio.gz -drive format=qcow2,file=/var/lib/lava/dispatcher/tmp/75061/apply-overlay-guest-6hyfr8ki/lava-guest.qcow2,media=disk,if=virtio,id=lavatest

(with different paths) where QEMU_EFI.fd is:

   http://storage.kernelci.org/images/uefi/edk2-stable202005/arm64/QEMU_EFI.fd

I didn't pull an exact job, sorry.  A full job showing the expected flow
(for GICv3 which shows the same thing) is at:

   https://lava.sirena.org.uk/scheduler/job/75061

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi
  2022-04-20 12:07     ` Mark Brown
@ 2022-04-21  6:42       ` Mike Rapoport
  2022-04-22 11:09         ` Mark Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Rapoport @ 2022-04-21  6:42 UTC (permalink / raw)
  To: Mark Brown
  Cc: Tony Lindgren, Mark-PK Tsai, Greg Kroah-Hartman,
	kernelci-results, bot, gtucker, stable

On Wed, Apr 20, 2022 at 01:07:03PM +0100, Mark Brown wrote:
> On Wed, Apr 20, 2022 at 12:18:04PM +0300, Mike Rapoport wrote:
> 
> > I don't know how exactly kernel-ci runs qemu with UEFI, I've tried this:
> 
> > $QEMU -serial stdio -M virt-2.11,gic-version=2 -cpu cortex-a15 -m 1G \
> > -drive file=$QEMU_EFI,if=pflash,format=raw,unit=0,readonly=on \
> > -drive file=flash1.img,if=pflash,format=raw,unit=1,readonly=off \
> > -kernel $kernel \
> > -append "console=ttyAMA0" 
> 
> > with stable-rc/linux-5.4.y and I've got as far as to failure to mount
> > rootfs and the crash in dmu_setup() didn't reproduce for me.
> 
> The command should be something to the effect of:
> 
> qemu-system-aarch64 -cpu max -machine virt,gic-version=3,mte=on,accel=tcg -nographic -net nic,model=virtio,macaddr=52:54:00:12:34:58 -net user -m 512 -monitor none -bios /var/lib/lava/dispatcher/tmp/75061/deployimages-ptln1wlp/bios/QEMU_EFI.fd -kernel /var/lib/lava/dispatcher/tmp/75061/deployimages-ptln1wlp/kernel/Image -append "console=ttyAMA0,115200 root=/dev/ram0 debug verbose console_msg_format=syslog earlycon" -initrd /var/lib/lava/dispatcher/tmp/75061/deployimages-ptln1wlp/ramdisk/rootfs.cpio.gz -drive format=qcow2,file=/var/lib/lava/dispatcher/tmp/75061/apply-overlay-guest-6hyfr8ki/lava-guest.qcow2,media=disk,if=virtio,id=lavatest

I could reproduce with this, thanks!

The problem is that memremap() uses pfn_valid() to check if RAM resource can
be accessed via linear mapping and this check is wrong.
The problem does not manifest on more recent kernels because the way ARM
registers "System RAM" resources was updated so that it skipped NOMAP
memory.

Can you please give a whirl to the patch below? If it's Ok I'll extend it
to include arm64 and will send a formal patch.


diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 7a0596fcb2e7..2df7454be649 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -144,6 +144,10 @@ extern void __iomem * (*arch_ioremap_caller)(phys_addr_t, size_t,
 	unsigned int, void *);
 extern void (*arch_iounmap)(volatile void __iomem *);
 
+extern bool arch_memremap_can_ram_remap(resource_size_t offset, size_t size,
+					unsigned long flags);
+#define arch_memremap_can_ram_remap arch_memremap_can_ram_remap
+
 /*
  * Bad read/write accesses...
  */
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index 513c26b46db3..98f90cd5a948 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -42,6 +42,13 @@
 #include <asm/mach/pci.h>
 #include "mm.h"
 
+bool arch_memremap_can_ram_remap(resource_size_t offset, size_t size,
+				 unsigned long flags)
+{
+	unsigned long pfn = PHYS_PFN(offset);
+
+	return memblock_is_map_memory(pfn);
+}
 
 LIST_HEAD(static_vmlist);
 
diff --git a/kernel/iomem.c b/kernel/iomem.c
index 62c92e43aa0d..e85bed24c0a9 100644
--- a/kernel/iomem.c
+++ b/kernel/iomem.c
@@ -33,7 +33,7 @@ static void *try_ram_remap(resource_size_t offset, size_t size,
 	unsigned long pfn = PHYS_PFN(offset);
 
 	/* In the simple case just return the existing linear address */
-	if (pfn_valid(pfn) && !PageHighMem(pfn_to_page(pfn)) &&
+	if (!PageHighMem(pfn_to_page(pfn)) &&
 	    arch_memremap_can_ram_remap(offset, size, flags))
 		return __va(offset);
 
 
> (with different paths) where QEMU_EFI.fd is:
> 
>    http://storage.kernelci.org/images/uefi/edk2-stable202005/arm64/QEMU_EFI.fd
> 
> I didn't pull an exact job, sorry.  A full job showing the expected flow
> (for GICv3 which shows the same thing) is at:
> 
>    https://lava.sirena.org.uk/scheduler/job/75061



-- 
Sincerely yours,
Mike.

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi
  2022-04-21  6:42       ` Mike Rapoport
@ 2022-04-22 11:09         ` Mark Brown
  0 siblings, 0 replies; 5+ messages in thread
From: Mark Brown @ 2022-04-22 11:09 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Tony Lindgren, Mark-PK Tsai, Greg Kroah-Hartman,
	kernelci-results, bot, gtucker, stable

[-- Attachment #1: Type: text/plain, Size: 675 bytes --]

On Thu, Apr 21, 2022 at 09:42:36AM +0300, Mike Rapoport wrote:

> The problem is that memremap() uses pfn_valid() to check if RAM resource can
> be accessed via linear mapping and this check is wrong.
> The problem does not manifest on more recent kernels because the way ARM
> registers "System RAM" resources was updated so that it skipped NOMAP
> memory.

> Can you please give a whirl to the patch below? If it's Ok I'll extend it
> to include arm64 and will send a formal patch.

Looks to have tested out OK:

https://linux.kernelci.org/test/job/broonie-misc/branch/for-kernelci/kernel/v5.4-18289-g6ebcdf39c03b/plan/baseline/

Tested-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-04-22 11:09 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <625c8753.1c69fb81.b232.69bb@mx.google.com>
2022-04-19 13:31 ` stable-rc/linux-5.4.y bisection: baseline.login on qemu_arm-virt-gicv2-uefi Mark Brown
2022-04-20  9:18   ` Mike Rapoport
2022-04-20 12:07     ` Mark Brown
2022-04-21  6:42       ` Mike Rapoport
2022-04-22 11:09         ` Mark Brown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.