From: James Morse <james.morse@arm.com>
To: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: catalin.marinas@arm.com, will.deacon@arm.com,
	akpm@linux-foundation.org, ard.biesheuvel@linaro.org,
	tbaicar@codeaurora.org, bhsharma@redhat.com, dyoung@redhat.com,
	mark.rutland@arm.com, al.stone@linaro.org,
	graeme.gregory@linaro.org, hanjun.guo@linaro.org,
	lorenzo.pieralisi@arm.com, sudeep.holla@arm.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kexec@lists.infradead.org
Subject: Re: [PATCH 0/3] arm64: kexec,kdump: fix boot failures on acpi-only system
Date: Fri, 15 Jun 2018 17:29:32 +0100
Message-ID: <81a9a385-3b9f-113d-96f0-379be74c19f0@arm.com>
In-Reply-To: <20180615075623.13454-1-takahiro.akashi@linaro.org>

Hi Akashi,

Thanks for putting this together,

On 15/06/18 08:56, AKASHI Takahiro wrote:
> This patch series is a set of bug fixes to address kexec/kdump
> failures which are sometimes observed on ACPI-only system and reported
> in LAK-ML before.
> 
> In short, the phenomena are:
> 1. kexec'ed kernel can fail to boot because some ACPI table is corrupted
>    by a new kernel (or other data) being loaded into System RAM. Currently
>    kexec may possibly allocate space ignoring such "reserved" regions.
>    We will see no messages after "Bye!"
> 
> 2. crash dump (kdump) kernel can fail to boot and get into panic due to
>    an alignment fault when accessing ACPI tables. This can happen because
>    those tables are not always properly aligned while they are mapped
>    non-cacheable (ioremap'ed) as they are not recognized as part of System
>    RAM under the current implementation.
> 
> After discussing several possibilities to address those issues,
> the agreed approach, in my understanding, is
> * to add resource entries for every "reserved", i.e. memblock_reserve(),
>   regions to /proc/iomem.
>   (NOMAP regions, also marked as "reserved," remains at top-level for
>   backward compatibility.)

This means user-space can tell the difference between reserved-system-ram and
reserved-address-space.
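
As an illustration of how user-space might make that distinction, here is a minimal sketch in C. It assumes the /proc/iomem text format (two spaces of indentation per nesting level, "start-end : name" with hex addresses) and that both kinds of region carry the label "reserved", as described in the cover letter. The type and function names (iomem_entry, parse_iomem_line, classify) are hypothetical and are not taken from kexec-tools.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification; the names are illustrative only. */
enum region_kind {
	REGION_OTHER,
	REGION_SYSTEM_RAM,	/* top-level "System RAM" */
	REGION_RESERVED_RAM,	/* "reserved" nested under "System RAM" */
	REGION_RESERVED_ADDR,	/* top-level "reserved" (e.g. NOMAP) */
};

struct iomem_entry {
	unsigned long long start, end;	/* inclusive range */
	int level;			/* 0 = top level, 1 = second level, ... */
	char name[64];
};

/* Remembers whether the most recent top-level entry was "System RAM". */
struct iomem_classifier {
	int in_system_ram;
};

/* Parse one /proc/iomem line. Returns 0 on success, -1 if malformed. */
static int parse_iomem_line(const char *line, struct iomem_entry *e)
{
	int indent = 0;

	assert(line && e);
	while (line[indent] == ' ')
		indent++;
	e->level = indent / 2;	/* assumed: two spaces per nesting level */

	if (sscanf(line + indent, "%llx-%llx : %63[^\n]",
		   &e->start, &e->end, e->name) != 3)
		return -1;
	return 0;
}

/* Classify entries in file order, tracking the enclosing top-level entry. */
static enum region_kind classify(struct iomem_classifier *c,
				 const struct iomem_entry *e)
{
	if (e->level == 0) {
		c->in_system_ram = (strcmp(e->name, "System RAM") == 0);
		if (c->in_system_ram)
			return REGION_SYSTEM_RAM;
		if (strcmp(e->name, "reserved") == 0)
			return REGION_RESERVED_ADDR;
		return REGION_OTHER;
	}
	if (c->in_system_ram && strcmp(e->name, "reserved") == 0)
		return REGION_RESERVED_RAM;
	return REGION_OTHER;
}
```

A caller would feed each line of /proc/iomem through parse_iomem_line() and classify() in order, starting from a zero-initialised struct iomem_classifier.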


> * For case (1), user space (kexec-tools) should rule out such regions
>   in searching for free space for loaded data.

... but doesn't today, because it fails to account for second-level entries.
We've always had second-level entries, so this is a user-space bug. We need both
the kernel and user-space fixed to resolve the issue.

Our attempts to fix this just in the kernel reached a dead end, because kdump
needs to include reserved-system-ram, whereas kexec has to avoid it. User-space
needs to be able to tell reserved-system-ram and reserved-address-space apart.
Hence we need to expose that information, and pick it up in user-space.
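
For the kexec side, "avoid it" amounts to searching a System RAM range for a hole that does not overlap any reserved sub-range. The following is a rough sketch of that search, not the kexec-tools implementation: struct range and find_hole() are hypothetical names, the reserved list is assumed to be sorted by start address, non-overlapping, and contained in the RAM range, and alignment requirements are ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Inclusive address range, as printed in /proc/iomem. */
struct range {
	unsigned long long start, end;
};

/*
 * Find the lowest address in `ram` where `size` bytes fit without
 * touching any of the `n` ranges in `resv`.
 * Assumes `resv` is sorted by start, non-overlapping and inside `ram`.
 * Returns 0 and stores the address in *out, or -1 if nothing fits.
 */
static int find_hole(struct range ram, const struct range *resv, size_t n,
		     unsigned long long size, unsigned long long *out)
{
	unsigned long long cur = ram.start;

	assert(size > 0 && ram.start <= ram.end && out);

	for (size_t i = 0; i < n; i++) {
		/* Is the gap before this reserved range large enough? */
		if (resv[i].start > cur && resv[i].start - cur >= size) {
			*out = cur;
			return 0;
		}
		/* Nothing is left once a reserved range reaches the end. */
		if (resv[i].end >= ram.end)
			return -1;
		/* Otherwise skip past the reserved range. */
		if (resv[i].end >= cur)
			cur = resv[i].end + 1;
	}

	/* Finally check the tail after the last reserved range. */
	if (cur <= ram.end && ram.end - cur >= size - 1) {
		*out = cur;
		return 0;
	}
	return -1;
}
```

The kdump side would do the opposite with the same data: reserved-system-ram ranges stay in the set of memory to be dumped, while top-level reserved address space is left out.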

A patched kernel with unpatched user-space will work the same way it does today,
as user-space ignores the additional reserved regions.

An unpatched kernel with patched user-space will also work the same way it does
today, as the additional reserved regions are missing.

I think this is the only way forwards on this issue...


> * For case (2), the kernel should access ACPI tables by mapping
>   them with appropriate memory attributes described in UEFI memory map.
>   (This means that it doesn't require any changes in /proc/iomem, and
>   hence user space.)

(this one is handled entirely in the kernel)


Thanks,

James

