From: "chenxiang (M)" via <qemu-devel@nongnu.org>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>, <will@kernel.org>,
	<mark.rutland@arm.com>,  <linux-arm-kernel@lists.infradead.org>,
	chenxiang via <qemu-devel@nongnu.org>,
	"linuxarm@huawei.com" <linuxarm@huawei.com>
Subject: Re: regression: insmod module failed in VM with nvdimm on
Date: Fri, 2 Dec 2022 10:48:06 +0800
Message-ID: <21cf7de2-27e8-8d1f-9efc-aa68cefbad50@hisilicon.com>
In-Reply-To: <CAMj1kXGF=DuQSgf8FbW98WTX94U7rB0hq_cFAc0+AfVn=HHsFg@mail.gmail.com>

Hi Ard,


On 2022/12/1 19:07, Ard Biesheuvel wrote:
> On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel <ardb@kernel.org> wrote:
>> On Thu, 1 Dec 2022 at 08:15, chenxiang (M) <chenxiang66@hisilicon.com> wrote:
>>> Hi Ard,
>>>
>>>
>>> On 2022/11/30 16:18, Ard Biesheuvel wrote:
>>>> On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <maz@kernel.org> wrote:
>>>>> On Wed, 30 Nov 2022 02:52:35 +0000,
>>>>> "chenxiang (M)" <chenxiang66@hisilicon.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We boot the VM using following commands (with nvdimm on)  (qemu
>>>>>> version 6.1.50, kernel 6.0-r4):
>>>>> How relevant is the presence of the nvdimm? Do you observe the failure
>>>>> without this?
>>>>>
>>>>>> qemu-system-aarch64 -machine
>>>>>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on  -kernel
>>>>>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios
>>>>>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m
>>>>>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0
>>>>>> ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1'
>>>>>> -object memory-backend-ram,id=ram1,size=10G -device
>>>>>> nvdimm,id=dimm1,memdev=ram1  -device ioh3420,id=root_port1,chassis=1
>>>>>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1
>>>>>>
>>>>>> Then in VM we insmod a module, vmalloc error occurs as follows (kernel
>>>>>> 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4):
>>>>>>
>>>>>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
>>>>>> [    8.186563] vmap allocation for size 20480 failed: use
>>>>>> vmalloc=<size> to increase size
>>>>> Have you tried increasing the vmalloc size to check that this is
>>>>> indeed the problem?
>>>>>
>>>>> [...]
>>>>>
>>>>>> We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr:
>>>>>> defer initialization to initcall where permitted").
>>>>> I guess you mean commit fc5a89f75d2a instead, right?
>>>>>
>>>>>> Do you have any idea about the issue?
>>>>> I sort of suspect that the nvdimm gets vmap-ed and consumes a large
>>>>> portion of the vmalloc space, but you give very little information
>>>>> that could help here...
>>>>>
>>>> Ouch. I suspect what's going on here: that patch defers the
>>>> randomization of the module region, so that we can decouple it from
>>>> the very early init code.
>>>>
>>>> Obviously, it is happening too late now, and the randomized module
>>>> region is overlapping with a vmalloc region that is in use by the time
>>>> the randomization occurs.
>>>>
>>>> Does the below fix the issue?
>>> The issue still occurs, but the change seems to decrease the
>>> probability: before, it occurred almost every time; after the change,
>>> I tried 2-3 times, and it occurred.
>>> But when I change "subsys_initcall" back to "core_initcall" and test
>>> more than 20 times, it is still OK.
>>>
>> Thank you for confirming. I will send out a patch today.
>>
> ...but before I do that, could you please check whether the change
> below fixes your issue as well?
>
> diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
> index 6ccc7ef600e7c1e1..c8c205b630da1951 100644
> --- a/arch/arm64/kernel/kaslr.c
> +++ b/arch/arm64/kernel/kaslr.c
> @@ -20,7 +20,11 @@
>   #include <asm/sections.h>
>   #include <asm/setup.h>
>
> -u64 __ro_after_init module_alloc_base;
> +/*
> + * Set a reasonable default for module_alloc_base in case
> + * we end up running with module randomization disabled.
> + */
> +u64 __ro_after_init module_alloc_base = (u64)_etext - MODULES_VSIZE;
>   u16 __initdata memstart_offset_seed;
>
>   struct arm64_ftr_override kaslr_feature_override __initdata;
> @@ -30,12 +34,6 @@ static int __init kaslr_init(void)
>          u64 module_range;
>          u32 seed;
>
> -       /*
> -        * Set a reasonable default for module_alloc_base in case
> -        * we end up running with module randomization disabled.
> -        */
> -       module_alloc_base = (u64)_etext - MODULES_VSIZE;
> -
>          if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) {
>                  pr_info("KASLR disabled on command line\n");
>                  return 0;
> .

We have tested this change; the issue still occurs, so it does not fix the problem.
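
For readers following the thread, the ordering argument quoted above
(randomizing the module region from a later initcall means some vmalloc
space may already be in use) can be illustrated with a toy user-space
model. This is only a sketch under stated assumptions, not kernel code:
toy_boot(), toy_kaslr_init() and toy_early_vmalloc_user() are
hypothetical names, and placing the early vmalloc user at the
arch_initcall level is an assumption made purely for illustration. The
only fact relied on is that the kernel runs initcall levels in the order
core, postcore, arch, subsys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Initcall levels, numbered in the order the kernel runs them. */
enum initcall_level {
	LEVEL_CORE = 1,     /* core_initcall()     */
	LEVEL_POSTCORE = 2, /* postcore_initcall() */
	LEVEL_ARCH = 3,     /* arch_initcall()     */
	LEVEL_SUBSYS = 4,   /* subsys_initcall()   */
};

/* What the toy model records during one simulated boot. */
struct toy_boot_log {
	bool vmalloc_in_use;           /* some vmalloc space is claimed */
	bool kaslr_saw_vmalloc_in_use; /* state seen when the module
					  region was chosen */
};

/* Hypothetical early user that claims part of the vmalloc space. */
static void toy_early_vmalloc_user(struct toy_boot_log *log)
{
	assert(log != NULL);
	log->vmalloc_in_use = true;
}

/* Hypothetical stand-in for kaslr_init(): picks the module region. */
static void toy_kaslr_init(struct toy_boot_log *log)
{
	assert(log != NULL);
	log->kaslr_saw_vmalloc_in_use = log->vmalloc_in_use;
}

/*
 * Simulate the initcall sequence with the module-region setup registered
 * at kaslr_level. Returns true if vmalloc space was already in use when
 * the module region was chosen.
 */
bool toy_boot(enum initcall_level kaslr_level)
{
	struct toy_boot_log log = { false, false };

	for (int level = LEVEL_CORE; level <= LEVEL_SUBSYS; level++) {
		if (level == (int)kaslr_level)
			toy_kaslr_init(&log);
		/* Assumption: the early vmalloc user runs at arch level. */
		if (level == LEVEL_ARCH)
			toy_early_vmalloc_user(&log);
	}
	return log.kaslr_saw_vmalloc_in_use;
}
```

In this model toy_boot(LEVEL_SUBSYS) returns true and
toy_boot(LEVEL_CORE) returns false, which matches the direction of the
test results reported above. It does not establish the actual root
cause.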




Thread overview: 28+ messages
2022-11-30  2:52 regression: insmod module failed in VM with nvdimm on chenxiang (M) via
2022-11-30  2:52 ` chenxiang (M)
2022-11-30  7:53 ` Marc Zyngier
2022-11-30  7:53   ` Marc Zyngier
2022-11-30  8:18   ` Ard Biesheuvel
2022-11-30  8:18     ` Ard Biesheuvel
2022-12-01  7:15     ` chenxiang (M) via
2022-12-01  7:15       ` chenxiang (M)
2022-12-01  8:07       ` Ard Biesheuvel
2022-12-01  8:07         ` Ard Biesheuvel
2022-12-01 11:07         ` Ard Biesheuvel
2022-12-01 11:07           ` Ard Biesheuvel
2022-12-01 12:06           ` chenxiang (M)
2022-12-01 12:06             ` chenxiang (M) via
2022-12-01 12:53             ` Ard Biesheuvel
2022-12-01 12:53               ` Ard Biesheuvel
2022-12-02  2:48           ` chenxiang (M) via [this message]
2022-12-02  2:48             ` chenxiang (M)
2022-12-02 13:44             ` Ard Biesheuvel
2022-12-02 13:44               ` Ard Biesheuvel
2022-12-15 17:33               ` Thorsten Leemhuis
2022-12-15 17:33                 ` Thorsten Leemhuis
2022-12-01  7:01   ` chenxiang (M) via
2022-12-01  7:01     ` chenxiang (M)
2022-11-30 10:10 ` regression: insmod module failed in VM with nvdimm on #forregzbot Thorsten Leemhuis
2022-11-30 10:10   ` Thorsten Leemhuis
2023-03-03  9:42   ` Linux regression tracking #update (Thorsten Leemhuis)
2023-03-03  9:42     ` Linux regression tracking #update (Thorsten Leemhuis)
