All of lore.kernel.org
 help / color / mirror / Atom feed
From: Masayoshi Mizuma <msys.mizuma@gmail.com>
To: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	linux-arm-kernel@lists.infradead.org, linux-efi@vger.kernel.org,
	Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	Eric Biederman <ebiederm@xmission.com>,
	d.hatayama@fujitsu.com
Subject: Re: [PATCH] efi: arm64: Introduce /sys/firmware/efi/memreserve to tell the persistent pages
Date: Mon, 2 Dec 2019 20:28:53 -0500	[thread overview]
Message-ID: <20191203012853.hwnbs6dfcbnkbtgp@gabell> (raw)
In-Reply-To: <c27b6f69-befc-0c88-24b9-7b89d4f6e5a6@gmail.com>

On Fri, Nov 29, 2019 at 01:25:36PM +0100, Matthias Brugger wrote:
> 
> 
> On 25/11/2019 19:49, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> > 
> > kexec reboot stops in early boot sequence because efi_config_parse_tables()
> > refers garbage data. We can see the log with memblock=debug kernel option:
> > 
> >   efi:  ACPI 2.0=0x9821790014  PROP=0x8757f5c0  SMBIOS 3.0=0x9820740000  MEMRESERVE=0x9820bfdc58
> >   memblock_reserve: [0x0000009820bfdc58-0x0000009820bfdc67] efi_config_parse_tables+0x228/0x278
> >   memblock_reserve: [0x0000000082760000-0x00000000324d07ff] efi_config_parse_tables+0x228/0x278
> >   memblock_reserve: [0xcc4f84ecc0511670-0x5f6e5214a7fd91f9] efi_config_parse_tables+0x244/0x278
> >   memblock_reserve: [0xd2fd4144b9af693d-0xad0c1db1086f40a2] efi_config_parse_tables+0x244/0x278
> >   memblock_reserve: [0x0c719bb159b1fadc-0x5aa6e62a1417ce12] efi_config_parse_tables+0x244/0x278
> >   ...
> > 
> > That happens because 0x82760000, struct linux_efi_memreserve, is destroyed.
> > 0x82760000 is pointed from efi.mem_reseve, and efi.mem_reserve points the
> > head page of LPI pending table and LPI property table which are allocated by
> > gic_reserve_range().
> > 
> > The destroyer is kexec. kexec locates the initrd to the area:
> > 
> >   ]# kexec -d -l /boot/vmlinuz-5.4.0-rc7 /boot/initramfs-5.4.0-rc7.img --reuse-cmdline
> >   ...
> >   initrd: base 82290000, size 388dd8ah (59301258)
> >   ...
> > 
> > From dynamic debug log. initrd is located in segment[1]:
> >   machine_kexec_prepare:70:
> >     kexec kimage info:
> >       type:        0
> >       start:       85b30680
> >       head:        0
> >       nr_segments: 4
> >         segment[0]: 0000000080480000 - 0000000082290000, 0x1e10000 bytes, 481 pages
> >         segment[1]: 0000000082290000 - 0000000085b20000, 0x3890000 bytes, 905 pages
> >         segment[2]: 0000000085b20000 - 0000000085b30000, 0x10000 bytes, 1 pages
> >         segment[3]: 0000000085b30000 - 0000000085b40000, 0x10000 bytes, 1 pages
> > 
> > kexec searches the memory region to locate initrd through
> > "System RAM" in /proc/iomem. The pending tables are included in
> > "System RAM" because they are allocated by alloc_pages(), so kexec
> > destroys the LPI pending tables.
> > 
> 
> Doesn't that mean that you haven't enough memory reserved so that you have to
> fallback to allocate it via __get_free_page()?

That's a not fallback allocation. The pending tables and also property
tables are allocated by alloc_pages() on its_allocate_prop_table() and
its_allocate_pending_table().

> 
> 
> > Introduce /sys/firmware/efi/memreserve to tell the pages pointed by
> > efi.mem_reserve so that kexec can avoid the area to locate initrd.
> > 
> 
> Doesn't that need a patch for kexec-tools to actually take this into account?

Yes, we need a patch for kexec-tools as well. I'm preparing the kexec
patch.

> 
> > Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> > ---
> >  drivers/firmware/efi/efi.c | 45 +++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 44 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> > index e98bbf8e5..0aa07cc09 100644
> > --- a/drivers/firmware/efi/efi.c
> > +++ b/drivers/firmware/efi/efi.c
> > @@ -141,6 +141,47 @@ static ssize_t systab_show(struct kobject *kobj,
> >  
> >  static struct kobj_attribute efi_attr_systab = __ATTR_RO_MODE(systab, 0400);
> >  
> > +static struct linux_efi_memreserve *efi_memreserve_root __ro_after_init;
> > +#ifdef CONFIG_KEXEC
> > +static ssize_t memreserve_show(struct kobject *kobj,
> > +			   struct kobj_attribute *attr, char *buf)
> > +{
> > +	struct linux_efi_memreserve *rsv;
> > +	phys_addr_t start, end;
> > +	unsigned long prsv;
> > +	char *str = buf;
> > +	int count, i;
> > +
> > +	if (!kobj || !buf)
> > +		return -EINVAL;
> > +
> > +	if ((efi_memreserve_root == (void *)ULONG_MAX) ||
> > +			(!efi_memreserve_root))
> > +		return -ENODEV;
> > +
> > +	for (prsv = efi_memreserve_root->next; prsv; prsv = rsv->next) {
> > +		rsv = memremap(prsv, sizeof(*rsv), MEMREMAP_WB);
> > +		if (!rsv) {
> > +			pr_err("Could not map efi_memreserve\n");
> > +			return -ENOMEM;
> > +		}
> > +		count = atomic_read(&rsv->count);
> > +		for (i = 0; i < count; i++) {
> > +			start = rsv->entry[i].base;
> > +			end = start + rsv->entry[i].size - 1;
> > +
> > +			str += sprintf(str, "%pa-%pa\n", &start, &end);
> 
> What happens if we provide a buf which is too small?

Good point.
The strings may exceed the buffer size (PAGE_SIZE) in case
efi_memreserve_root has a lot of entries.
It might be better to use seq_printf() to show efi_memreserve_root...
I'll move the file from a sysfs entry to a proc entry so that 
efi_memreserve_root can be handled by seq_printf().

Thanks,
Masa

WARNING: multiple messages have this Message-ID (diff)
From: Masayoshi Mizuma <msys.mizuma@gmail.com>
To: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>,
	linux-efi@vger.kernel.org,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	Eric Biederman <ebiederm@xmission.com>,
	d.hatayama@fujitsu.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] efi: arm64: Introduce /sys/firmware/efi/memreserve to tell the persistent pages
Date: Mon, 2 Dec 2019 20:28:53 -0500	[thread overview]
Message-ID: <20191203012853.hwnbs6dfcbnkbtgp@gabell> (raw)
In-Reply-To: <c27b6f69-befc-0c88-24b9-7b89d4f6e5a6@gmail.com>

On Fri, Nov 29, 2019 at 01:25:36PM +0100, Matthias Brugger wrote:
> 
> 
> On 25/11/2019 19:49, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> > 
> > kexec reboot stops in early boot sequence because efi_config_parse_tables()
> > refers garbage data. We can see the log with memblock=debug kernel option:
> > 
> >   efi:  ACPI 2.0=0x9821790014  PROP=0x8757f5c0  SMBIOS 3.0=0x9820740000  MEMRESERVE=0x9820bfdc58
> >   memblock_reserve: [0x0000009820bfdc58-0x0000009820bfdc67] efi_config_parse_tables+0x228/0x278
> >   memblock_reserve: [0x0000000082760000-0x00000000324d07ff] efi_config_parse_tables+0x228/0x278
> >   memblock_reserve: [0xcc4f84ecc0511670-0x5f6e5214a7fd91f9] efi_config_parse_tables+0x244/0x278
> >   memblock_reserve: [0xd2fd4144b9af693d-0xad0c1db1086f40a2] efi_config_parse_tables+0x244/0x278
> >   memblock_reserve: [0x0c719bb159b1fadc-0x5aa6e62a1417ce12] efi_config_parse_tables+0x244/0x278
> >   ...
> > 
> > That happens because 0x82760000, struct linux_efi_memreserve, is destroyed.
> > 0x82760000 is pointed from efi.mem_reseve, and efi.mem_reserve points the
> > head page of LPI pending table and LPI property table which are allocated by
> > gic_reserve_range().
> > 
> > The destroyer is kexec. kexec locates the initrd to the area:
> > 
> >   ]# kexec -d -l /boot/vmlinuz-5.4.0-rc7 /boot/initramfs-5.4.0-rc7.img --reuse-cmdline
> >   ...
> >   initrd: base 82290000, size 388dd8ah (59301258)
> >   ...
> > 
> > From dynamic debug log. initrd is located in segment[1]:
> >   machine_kexec_prepare:70:
> >     kexec kimage info:
> >       type:        0
> >       start:       85b30680
> >       head:        0
> >       nr_segments: 4
> >         segment[0]: 0000000080480000 - 0000000082290000, 0x1e10000 bytes, 481 pages
> >         segment[1]: 0000000082290000 - 0000000085b20000, 0x3890000 bytes, 905 pages
> >         segment[2]: 0000000085b20000 - 0000000085b30000, 0x10000 bytes, 1 pages
> >         segment[3]: 0000000085b30000 - 0000000085b40000, 0x10000 bytes, 1 pages
> > 
> > kexec searches the memory region to locate initrd through
> > "System RAM" in /proc/iomem. The pending tables are included in
> > "System RAM" because they are allocated by alloc_pages(), so kexec
> > destroys the LPI pending tables.
> > 
> 
> Doesn't that mean that you haven't enough memory reserved so that you have to
> fallback to allocate it via __get_free_page()?

That's a not fallback allocation. The pending tables and also property
tables are allocated by alloc_pages() on its_allocate_prop_table() and
its_allocate_pending_table().

> 
> 
> > Introduce /sys/firmware/efi/memreserve to tell the pages pointed by
> > efi.mem_reserve so that kexec can avoid the area to locate initrd.
> > 
> 
> Doesn't that need a patch for kexec-tools to actually take this into account?

Yes, we need a patch for kexec-tools as well. I'm preparing the kexec
patch.

> 
> > Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> > ---
> >  drivers/firmware/efi/efi.c | 45 +++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 44 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> > index e98bbf8e5..0aa07cc09 100644
> > --- a/drivers/firmware/efi/efi.c
> > +++ b/drivers/firmware/efi/efi.c
> > @@ -141,6 +141,47 @@ static ssize_t systab_show(struct kobject *kobj,
> >  
> >  static struct kobj_attribute efi_attr_systab = __ATTR_RO_MODE(systab, 0400);
> >  
> > +static struct linux_efi_memreserve *efi_memreserve_root __ro_after_init;
> > +#ifdef CONFIG_KEXEC
> > +static ssize_t memreserve_show(struct kobject *kobj,
> > +			   struct kobj_attribute *attr, char *buf)
> > +{
> > +	struct linux_efi_memreserve *rsv;
> > +	phys_addr_t start, end;
> > +	unsigned long prsv;
> > +	char *str = buf;
> > +	int count, i;
> > +
> > +	if (!kobj || !buf)
> > +		return -EINVAL;
> > +
> > +	if ((efi_memreserve_root == (void *)ULONG_MAX) ||
> > +			(!efi_memreserve_root))
> > +		return -ENODEV;
> > +
> > +	for (prsv = efi_memreserve_root->next; prsv; prsv = rsv->next) {
> > +		rsv = memremap(prsv, sizeof(*rsv), MEMREMAP_WB);
> > +		if (!rsv) {
> > +			pr_err("Could not map efi_memreserve\n");
> > +			return -ENOMEM;
> > +		}
> > +		count = atomic_read(&rsv->count);
> > +		for (i = 0; i < count; i++) {
> > +			start = rsv->entry[i].base;
> > +			end = start + rsv->entry[i].size - 1;
> > +
> > +			str += sprintf(str, "%pa-%pa\n", &start, &end);
> 
> What happens if we provide a buf which is too small?

Good point.
The strings may exceed the buffer size (PAGE_SIZE) in case
efi_memreserve_root has a lot of entries.
It might be better to use seq_printf() to show efi_memreserve_root...
I'll move the file from a sysfs entry to a proc entry so that 
efi_memreserve_root can be handled by seq_printf().

Thanks,
Masa

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Masayoshi Mizuma <msys.mizuma@gmail.com>
To: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>,
	linux-efi@vger.kernel.org,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	Eric Biederman <ebiederm@xmission.com>,
	d.hatayama@fujitsu.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] efi: arm64: Introduce /sys/firmware/efi/memreserve to tell the persistent pages
Date: Mon, 2 Dec 2019 20:28:53 -0500	[thread overview]
Message-ID: <20191203012853.hwnbs6dfcbnkbtgp@gabell> (raw)
In-Reply-To: <c27b6f69-befc-0c88-24b9-7b89d4f6e5a6@gmail.com>

On Fri, Nov 29, 2019 at 01:25:36PM +0100, Matthias Brugger wrote:
> 
> 
> On 25/11/2019 19:49, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> > 
> > kexec reboot stops in early boot sequence because efi_config_parse_tables()
> > refers garbage data. We can see the log with memblock=debug kernel option:
> > 
> >   efi:  ACPI 2.0=0x9821790014  PROP=0x8757f5c0  SMBIOS 3.0=0x9820740000  MEMRESERVE=0x9820bfdc58
> >   memblock_reserve: [0x0000009820bfdc58-0x0000009820bfdc67] efi_config_parse_tables+0x228/0x278
> >   memblock_reserve: [0x0000000082760000-0x00000000324d07ff] efi_config_parse_tables+0x228/0x278
> >   memblock_reserve: [0xcc4f84ecc0511670-0x5f6e5214a7fd91f9] efi_config_parse_tables+0x244/0x278
> >   memblock_reserve: [0xd2fd4144b9af693d-0xad0c1db1086f40a2] efi_config_parse_tables+0x244/0x278
> >   memblock_reserve: [0x0c719bb159b1fadc-0x5aa6e62a1417ce12] efi_config_parse_tables+0x244/0x278
> >   ...
> > 
> > That happens because 0x82760000, struct linux_efi_memreserve, is destroyed.
> > 0x82760000 is pointed from efi.mem_reseve, and efi.mem_reserve points the
> > head page of LPI pending table and LPI property table which are allocated by
> > gic_reserve_range().
> > 
> > The destroyer is kexec. kexec locates the initrd to the area:
> > 
> >   ]# kexec -d -l /boot/vmlinuz-5.4.0-rc7 /boot/initramfs-5.4.0-rc7.img --reuse-cmdline
> >   ...
> >   initrd: base 82290000, size 388dd8ah (59301258)
> >   ...
> > 
> > From dynamic debug log. initrd is located in segment[1]:
> >   machine_kexec_prepare:70:
> >     kexec kimage info:
> >       type:        0
> >       start:       85b30680
> >       head:        0
> >       nr_segments: 4
> >         segment[0]: 0000000080480000 - 0000000082290000, 0x1e10000 bytes, 481 pages
> >         segment[1]: 0000000082290000 - 0000000085b20000, 0x3890000 bytes, 905 pages
> >         segment[2]: 0000000085b20000 - 0000000085b30000, 0x10000 bytes, 1 pages
> >         segment[3]: 0000000085b30000 - 0000000085b40000, 0x10000 bytes, 1 pages
> > 
> > kexec searches the memory region to locate initrd through
> > "System RAM" in /proc/iomem. The pending tables are included in
> > "System RAM" because they are allocated by alloc_pages(), so kexec
> > destroys the LPI pending tables.
> > 
> 
> Doesn't that mean that you haven't enough memory reserved so that you have to
> fallback to allocate it via __get_free_page()?

That's a not fallback allocation. The pending tables and also property
tables are allocated by alloc_pages() on its_allocate_prop_table() and
its_allocate_pending_table().

> 
> 
> > Introduce /sys/firmware/efi/memreserve to tell the pages pointed by
> > efi.mem_reserve so that kexec can avoid the area to locate initrd.
> > 
> 
> Doesn't that need a patch for kexec-tools to actually take this into account?

Yes, we need a patch for kexec-tools as well. I'm preparing the kexec
patch.

> 
> > Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> > ---
> >  drivers/firmware/efi/efi.c | 45 +++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 44 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> > index e98bbf8e5..0aa07cc09 100644
> > --- a/drivers/firmware/efi/efi.c
> > +++ b/drivers/firmware/efi/efi.c
> > @@ -141,6 +141,47 @@ static ssize_t systab_show(struct kobject *kobj,
> >  
> >  static struct kobj_attribute efi_attr_systab = __ATTR_RO_MODE(systab, 0400);
> >  
> > +static struct linux_efi_memreserve *efi_memreserve_root __ro_after_init;
> > +#ifdef CONFIG_KEXEC
> > +static ssize_t memreserve_show(struct kobject *kobj,
> > +			   struct kobj_attribute *attr, char *buf)
> > +{
> > +	struct linux_efi_memreserve *rsv;
> > +	phys_addr_t start, end;
> > +	unsigned long prsv;
> > +	char *str = buf;
> > +	int count, i;
> > +
> > +	if (!kobj || !buf)
> > +		return -EINVAL;
> > +
> > +	if ((efi_memreserve_root == (void *)ULONG_MAX) ||
> > +			(!efi_memreserve_root))
> > +		return -ENODEV;
> > +
> > +	for (prsv = efi_memreserve_root->next; prsv; prsv = rsv->next) {
> > +		rsv = memremap(prsv, sizeof(*rsv), MEMREMAP_WB);
> > +		if (!rsv) {
> > +			pr_err("Could not map efi_memreserve\n");
> > +			return -ENOMEM;
> > +		}
> > +		count = atomic_read(&rsv->count);
> > +		for (i = 0; i < count; i++) {
> > +			start = rsv->entry[i].base;
> > +			end = start + rsv->entry[i].size - 1;
> > +
> > +			str += sprintf(str, "%pa-%pa\n", &start, &end);
> 
> What happens if we provide a buf which is too small?

Good point.
The strings may exceed the buffer size (PAGE_SIZE) in case
efi_memreserve_root has a lot of entries.
It might be better to use seq_printf() to show efi_memreserve_root...
I'll move the file from a sysfs entry to a proc entry so that 
efi_memreserve_root can be handled by seq_printf().

Thanks,
Masa

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2019-12-03  1:29 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-25 18:49 [PATCH] efi: arm64: Introduce /sys/firmware/efi/memreserve to tell the persistent pages Masayoshi Mizuma
2019-11-25 18:49 ` Masayoshi Mizuma
2019-11-25 18:49 ` Masayoshi Mizuma
2019-11-29 12:25 ` Matthias Brugger
2019-11-29 12:25   ` Matthias Brugger
2019-11-29 12:25   ` Matthias Brugger
2019-12-03  1:28   ` Masayoshi Mizuma [this message]
2019-12-03  1:28     ` Masayoshi Mizuma
2019-12-03  1:28     ` Masayoshi Mizuma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191203012853.hwnbs6dfcbnkbtgp@gabell \
    --to=msys.mizuma@gmail.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=d.hatayama@fujitsu.com \
    --cc=ebiederm@xmission.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-efi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=m.mizuma@jp.fujitsu.com \
    --cc=matthias.bgg@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.