All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Mike Rapoport <rppt@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Vasily Gorbik <gor@linux.ibm.com>, Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-s390@vger.kernel.org
Subject: Re: [RFC/RFT PATCH 2/5] memblock: introduce generic memblock_setup_resources()
Date: Wed, 2 Jun 2021 21:15:02 +0100	[thread overview]
Message-ID: <20210602201502.GP30436@shell.armlinux.org.uk> (raw)
In-Reply-To: <YLfRVGC+tq5L0TZ6@kernel.org>

On Wed, Jun 02, 2021 at 09:43:32PM +0300, Mike Rapoport wrote:
> Back then when __ex_table was moved from .data section, _sdata and _edata
> were part of the .data section. Today they are not. So something like the
> patch below will ensure for instance that __ex_table would be a part of
> "Kernel data" in /proc/iomem without moving it to the .data section:
> 
> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index f7f4620d59c3..2991feceab31 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -72,13 +72,6 @@ SECTIONS
>  
>  	RO_DATA(PAGE_SIZE)
>  
> -	. = ALIGN(4);
> -	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
> -		__start___ex_table = .;
> -		ARM_MMU_KEEP(*(__ex_table))
> -		__stop___ex_table = .;
> -	}
> -
>  #ifdef CONFIG_ARM_UNWIND
>  	ARM_UNWIND_SECTIONS
>  #endif
> @@ -143,6 +136,14 @@ SECTIONS
>  	__init_end = .;
>  
>  	_sdata = .;
> +
> +	. = ALIGN(4);
> +	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
> +		__start___ex_table = .;
> +		ARM_MMU_KEEP(*(__ex_table))
> +		__stop___ex_table = .;
> +	}
> +
>  	RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
>  	_edata = .;

This example has undesirable security implications. It moves the
exception table out of the read-only mappings into the read-write
mappings, thereby providing a way for an attacker to bypass the
read-only protection on the kernel and manipulate code pointers at
potentially known addresses for distro built kernels.

> I agree there is a risk but I don't think it's high. It does not look like
> the minor changes in "reserved" reporting in /proc/iomem will break kexec
> tooling.

What makes you come to that conclusion? The kexec tools architecture
backends get to decide what they do when parsing /proc/iomem.
Currently, only firmware areas are marked reserved in /proc/iomem on
32-bit ARM.

This is read by kexec, and entered into its memory_range[] table as
either RAM, or RESERVED.

kexec uses this to search for a suitable hole in the memory map to
place the kernel in physical memory. The addition of what I will call
ficticious "reserved" areas by the host kernel because the host kernel
happened to use them _will_ have an impact on this.

They _are_ ficticious, because they are purely an artifact of the host
kernel being run, and are of no consequence to tooling such as kexec.
What such tooling is interested in is which areas it needs to avoid
because of firmware.

I think what isn't helping here is that you haven't adequately
described what your overall objective actually is. Framing it in
terms of wanting the reserved memory to be consistent between the
various kernel "interfaces" such as /proc/iomem, the memblock debugfs
and firmware is very ambiguous and open to different interpretations,
whcih I think is what the problem is here.

> Anyway the amount of reserved and free memory depends on a
> particular system, kernel version, configuration and command line.
> I have no intention to report kernel boot time reservations
> to /proc/iomem on architectures that do not report them there today,
> although this also does not seem like a significant factor.

You seem to be missing the point I've tried to make. The areas in
memblock that are marked "reserved" are the areas of reserved memory
from the firmware _plus_ the areas that the kernel has made during
boot which are of no consequence to userspace.

Wanting /proc/iomem, memblock and firmware to all agree on the values
that they mark as "reserved" is IMHO unrealistic.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

WARNING: multiple messages have this Message-ID (diff)
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Mike Rapoport <rppt@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	David Hildenbrand <david@redhat.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Vasily Gorbik <gor@linux.ibm.com>, Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-s390@vger.kernel.org
Subject: Re: [RFC/RFT PATCH 2/5] memblock: introduce generic memblock_setup_resources()
Date: Wed, 2 Jun 2021 21:15:02 +0100	[thread overview]
Message-ID: <20210602201502.GP30436@shell.armlinux.org.uk> (raw)
In-Reply-To: <YLfRVGC+tq5L0TZ6@kernel.org>

On Wed, Jun 02, 2021 at 09:43:32PM +0300, Mike Rapoport wrote:
> Back then when __ex_table was moved from .data section, _sdata and _edata
> were part of the .data section. Today they are not. So something like the
> patch below will ensure for instance that __ex_table would be a part of
> "Kernel data" in /proc/iomem without moving it to the .data section:
> 
> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index f7f4620d59c3..2991feceab31 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -72,13 +72,6 @@ SECTIONS
>  
>  	RO_DATA(PAGE_SIZE)
>  
> -	. = ALIGN(4);
> -	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
> -		__start___ex_table = .;
> -		ARM_MMU_KEEP(*(__ex_table))
> -		__stop___ex_table = .;
> -	}
> -
>  #ifdef CONFIG_ARM_UNWIND
>  	ARM_UNWIND_SECTIONS
>  #endif
> @@ -143,6 +136,14 @@ SECTIONS
>  	__init_end = .;
>  
>  	_sdata = .;
> +
> +	. = ALIGN(4);
> +	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
> +		__start___ex_table = .;
> +		ARM_MMU_KEEP(*(__ex_table))
> +		__stop___ex_table = .;
> +	}
> +
>  	RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
>  	_edata = .;

This example has undesirable security implications. It moves the
exception table out of the read-only mappings into the read-write
mappings, thereby providing a way for an attacker to bypass the
read-only protection on the kernel and manipulate code pointers at
potentially known addresses for distro built kernels.

> I agree there is a risk but I don't think it's high. It does not look like
> the minor changes in "reserved" reporting in /proc/iomem will break kexec
> tooling.

What makes you come to that conclusion? The kexec tools architecture
backends get to decide what they do when parsing /proc/iomem.
Currently, only firmware areas are marked reserved in /proc/iomem on
32-bit ARM.

This is read by kexec, and entered into its memory_range[] table as
either RAM, or RESERVED.

kexec uses this to search for a suitable hole in the memory map to
place the kernel in physical memory. The addition of what I will call
ficticious "reserved" areas by the host kernel because the host kernel
happened to use them _will_ have an impact on this.

They _are_ ficticious, because they are purely an artifact of the host
kernel being run, and are of no consequence to tooling such as kexec.
What such tooling is interested in is which areas it needs to avoid
because of firmware.

I think what isn't helping here is that you haven't adequately
described what your overall objective actually is. Framing it in
terms of wanting the reserved memory to be consistent between the
various kernel "interfaces" such as /proc/iomem, the memblock debugfs
and firmware is very ambiguous and open to different interpretations,
whcih I think is what the problem is here.

> Anyway the amount of reserved and free memory depends on a
> particular system, kernel version, configuration and command line.
> I have no intention to report kernel boot time reservations
> to /proc/iomem on architectures that do not report them there today,
> although this also does not seem like a significant factor.

You seem to be missing the point I've tried to make. The areas in
memblock that are marked "reserved" are the areas of reserved memory
from the firmware _plus_ the areas that the kernel has made during
boot which are of no consequence to userspace.

Wanting /proc/iomem, memblock and firmware to all agree on the values
that they mark as "reserved" is IMHO unrealistic.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-06-02 20:15 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-31 12:29 [RFC/RFT PATCH 0/5] consolidate "System RAM" resources setup Mike Rapoport
2021-05-31 12:29 ` Mike Rapoport
2021-05-31 12:29 ` [RFC/RFT PATCH 1/5] s390: make crashk_res resource a child of "System RAM" Mike Rapoport
2021-05-31 12:29   ` Mike Rapoport
2021-06-01  8:45   ` David Hildenbrand
2021-06-01  8:45     ` David Hildenbrand
2021-06-01  9:02     ` David Hildenbrand
2021-06-01  9:02       ` David Hildenbrand
2021-06-02  6:25       ` Mike Rapoport
2021-06-02  6:25         ` Mike Rapoport
2021-06-01 13:18   ` Gerald Schaefer
2021-06-01 13:18     ` Gerald Schaefer
2021-06-02  6:54     ` Mike Rapoport
2021-06-02  6:54       ` Mike Rapoport
2021-05-31 12:29 ` [RFC/RFT PATCH 2/5] memblock: introduce generic memblock_setup_resources() Mike Rapoport
2021-05-31 12:29   ` Mike Rapoport
2021-06-01 13:54   ` Russell King (Oracle)
2021-06-01 13:54     ` Russell King (Oracle)
2021-06-02  8:33     ` Mike Rapoport
2021-06-02  8:33       ` Mike Rapoport
2021-06-02 10:15       ` Russell King (Oracle)
2021-06-02 10:15         ` Russell King (Oracle)
2021-06-02 13:54         ` Mike Rapoport
2021-06-02 13:54           ` Mike Rapoport
2021-06-02 15:51           ` Russell King (Oracle)
2021-06-02 15:51             ` Russell King (Oracle)
2021-06-02 18:43             ` Mike Rapoport
2021-06-02 18:43               ` Mike Rapoport
2021-06-02 20:15               ` Russell King (Oracle) [this message]
2021-06-02 20:15                 ` Russell King (Oracle)
2021-06-03 10:32                 ` Mike Rapoport
2021-06-03 10:32                   ` Mike Rapoport
2021-05-31 12:29 ` [RFC/RFT PATCH 3/5] arm: switch to " Mike Rapoport
2021-05-31 12:29   ` Mike Rapoport
2021-05-31 12:29 ` [RFC/RFT PATCH 4/5] MIPS: switch to generic memblock_setup_resources Mike Rapoport
2021-05-31 12:29   ` Mike Rapoport
2021-05-31 12:29 ` [RFC/RFT PATCH 5/5] arm64: switch to generic memblock_setup_resources() Mike Rapoport
2021-05-31 12:29   ` Mike Rapoport
2021-06-01 13:44 ` [RFC/RFT PATCH 0/5] consolidate "System RAM" resources setup Russell King (Oracle)
2021-06-01 13:44   ` Russell King (Oracle)
2021-06-02  7:05   ` Mike Rapoport
2021-06-02  7:05     ` Mike Rapoport
2021-06-01  1:39 [RFC/RFT PATCH 2/5] memblock: introduce generic memblock_setup_resources() kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210602201502.GP30436@shell.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=borntraeger@de.ibm.com \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=rppt@kernel.org \
    --cc=rppt@linux.ibm.com \
    --cc=tsbogend@alpha.franken.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.