linux-renesas-soc.vger.kernel.org archive mirror
* [QUERY]: Block region to mmap
@ 2023-01-25 12:30 Lad, Prabhakar
  2023-01-26 14:37 ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Lad, Prabhakar @ 2023-01-25 12:30 UTC (permalink / raw)
  To: Linux-MM, linux-riscv, device-tree, Linux-Renesas
  Cc: Palmer Dabbelt, Arnd Bergmann, Rob Herring, Krzysztof Kozlowski,
	Jessica Clarke, Geert Uytterhoeven, Fabrizio Castro, Biju Das,
	Chris Paterson

Hi All,

The Renesas RZ/Five RISC-V SoC has instruction local memory and data
local memory (ILM & DLM) mapped in the region 0x30000 - 0x4FFFF. When a
virtual address falls within this range, the MMU doesn't trigger a
page fault; it assumes the virtual address is a physical address, which
can cause undesired behaviour.

To avoid this, the ILM/DLM memory regions are now added to the root
domain region of the PMPU with permissions set to 0x0 for S/U modes, so
that any access to these regions is blocked; M-mode is granted full
access (R/W/X). This prevents any user from accessing these regions:
an attempt in S/U modes triggers an unhandled signal 11.

This works as expected, but an application that does an mmap() of this
region would still succeed, and a later read/write to that location
would cause an unhandled signal 11. To handle this case gracefully we
might want mmap() itself to fail if the addr/offset falls in this local
memory region.

Tracing through the mmap call, there is arch_mmap_check(): if an
architecture implements this callback, it gets called and can be used
as a validator to make sure mmap() to the local memory region fails.
(Note: maybe this callback can be implemented using the ALTERNATIVE_X()
macro so that other RISC-V SoCs do a nop() for this callback.) This
approach seems reasonable but isn't generic: other platforms with
similar issues would have to go through a similar implementation. If we
instead define in the device tree the memory regions that must not be
mmapped, the implementation can be generic and can be used on other
archs/platforms.
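
A minimal sketch of the validator idea, written as a user-space C model
rather than kernel code. The names overlaps_local_mem() and
model_mmap_check() are hypothetical; the real arch_mmap_check() hook and
its exact arguments are not reproduced here. Only the window bounds come
from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Local-memory window quoted in this thread (inclusive bounds). */
#define LOCAL_MEM_START 0x30000UL
#define LOCAL_MEM_END   0x4FFFFUL

/* Hypothetical helper: true if [addr, addr + len) touches the window. */
bool overlaps_local_mem(unsigned long addr, unsigned long len)
{
	unsigned long last;

	if (len == 0)
		return false;
	/* Clamp on wrap-around so a huge len cannot skip the window. */
	last = addr + len - 1;
	if (last < addr)
		last = ~0UL;
	return addr <= LOCAL_MEM_END && last >= LOCAL_MEM_START;
}

/*
 * Hypothetical model of the decision a validator hook could make:
 * 0 if the request is acceptable, -EINVAL if it would touch the window.
 */
int model_mmap_check(unsigned long addr, unsigned long len)
{
	return overlaps_local_mem(addr, len) ? -EINVAL : 0;
}

/* Usage: a page just below the window passes, a page inside it does not. */
void model_mmap_check_usage(void)
{
	assert(model_mmap_check(0x2F000UL, 0x1000UL) == 0);
	assert(model_mmap_check(0x30000UL, 0x1000UL) == -EINVAL);
}
```

A device-tree driven variant would presumably replace the two constants
with values read at boot; that part is not sketched here.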

Looking at the kernel code, the SPARC architecture (UltraSPARC T1) also
has a hole in the virtual address space (the relevant commit fixing
this issue is 8bcd17411643beb9a601e032d0cf1016909a81d3).
As this VA hole “support” was added a long time ago, simply
replicating that approach may no longer be acceptable, hence the
proposed approach.

Is there a better approach that I am missing? Any pointers or comments are welcome.

Cheers,
Prabhakar

* Re: [QUERY]: Block region to mmap
  2023-01-25 12:30 [QUERY]: Block region to mmap Lad, Prabhakar
@ 2023-01-26 14:37 ` Matthew Wilcox
  2023-01-30 10:53   ` Lad, Prabhakar
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2023-01-26 14:37 UTC (permalink / raw)
  To: Lad, Prabhakar
  Cc: Linux-MM, linux-riscv, device-tree, Linux-Renesas,
	Palmer Dabbelt, Arnd Bergmann, Rob Herring, Krzysztof Kozlowski,
	Jessica Clarke, Geert Uytterhoeven, Fabrizio Castro, Biju Das,
	Chris Paterson

On Wed, Jan 25, 2023 at 12:30:13PM +0000, Lad, Prabhakar wrote:
> Renesas RZ/Five RISC-V SoC has Instruction local memory and Data local
> memory (ILM & DLM) mapped between region 0x30000 - 0x4FFFF. When a
> virtual address falls within this range, the MMU doesn't trigger a
> page fault; it assumes the virtual address is a physical address which
> can cause undesired behaviours.

Wow.  I've never come across such broken behaviour before.

> To avoid this the ILM/DLM memory regions are now added to the root
> domain region of the PMPU with permissions set to 0x0 for S/U modes so
> that any access to these regions gets blocked and for M-mode we grant
> full access (R/W/X). This prevents any users from accessing these
> regions by triggering an unhandled signal 11 in S/U modes.

I have no idea what any of this means.

> This works as expected but for applications say for example when doing
> mmap to this region would still succeed and later down the path when
> doing a read/write to this location would cause unhandled signal 11.
> To handle this case gracefully we might want mmap() itself to fail if
> the addr/offset falls in this local memory region.

No, that's not what you want.  You want mmap to avoid allocating address
space in that virtual address range.  I don't know if we have a good
way to do that at the moment; like I said I've never seen such broken
hardware before.

I'd say the right way to solve this is to add a new special kind of VMA
to the address space that covers this range.  We'd want to make sure
it doesn't appear in /proc/*/maps and also that it can't be overridden
with MAP_FIXED.
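
A rough user-space model of "avoid allocating address space in that
range": the window is treated as a range that is always occupied, so a
first-fit search simply steps over it. The struct, the function
find_free_area() and the bottom-up policy are all assumptions made for
illustration; this is not how the kernel's VMA code is structured, and
it does not model the MAP_FIXED or /proc/*/maps requirements.

```c
#include <assert.h>
#include <stddef.h>

struct range { unsigned long start, end; };	/* half-open [start, end) */

/* Window from this thread, 0x30000..0x4FFFF inclusive, written half-open. */
const struct range reserved_hole = { 0x30000UL, 0x50000UL };

/*
 * Hypothetical bottom-up first-fit search. 'used' must be sorted by start,
 * non-overlapping, and must already contain reserved_hole, which is how
 * this model expresses "a special VMA that is always present".
 * 'low' is assumed to be non-zero; 0 is returned when nothing fits.
 */
unsigned long find_free_area(const struct range *used, size_t n,
			     unsigned long low, unsigned long high,
			     unsigned long len)
{
	unsigned long cand = low;
	size_t i;

	assert(len > 0 && low > 0 && low < high);
	for (i = 0; i < n; i++) {
		if (used[i].end <= cand)
			continue;	/* entirely below the candidate */
		if (used[i].start >= cand && used[i].start - cand >= len)
			break;		/* gap before this range is big enough */
		cand = used[i].end;	/* step past the occupied range */
	}
	return (cand < high && high - cand >= len) ? cand : 0;
}
```

With one existing mapping at [0x10000, 0x2F000) plus reserved_hole, a
0x1000-byte request searched from 0x10000 lands at 0x2F000, while a
0x2000-byte request no longer fits below the window and is pushed up to
0x50000.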

* Re: [QUERY]: Block region to mmap
  2023-01-26 14:37 ` Matthew Wilcox
@ 2023-01-30 10:53   ` Lad, Prabhakar
  2023-01-30 15:24     ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Lad, Prabhakar @ 2023-01-30 10:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linux-MM, linux-riscv, Linux-Renesas,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Palmer Dabbelt, Arnd Bergmann, Rob Herring, Krzysztof Kozlowski,
	Jessica Clarke, Geert Uytterhoeven, Fabrizio Castro, Biju Das,
	Chris Paterson

Hi Matthew,

Thank you for the feedback.

On Thu, Jan 26, 2023 at 2:37 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Jan 25, 2023 at 12:30:13PM +0000, Lad, Prabhakar wrote:
> > Renesas RZ/Five RISC-V SoC has Instruction local memory and Data local
> > memory (ILM & DLM) mapped between region 0x30000 - 0x4FFFF. When a
> > virtual address falls within this range, the MMU doesn't trigger a
> > page fault; it assumes the virtual address is a physical address which
> > can cause undesired behaviours.
>
> Wow.  I've never come across such broken behaviour before.
>
> > To avoid this the ILM/DLM memory regions are now added to the root
> > domain region of the PMPU with permissions set to 0x0 for S/U modes so
> > that any access to these regions gets blocked and for M-mode we grant
> > full access (R/W/X). This prevents any users from accessing these
> > regions by triggering an unhandled signal 11 in S/U modes.
>
> I have no idea what any of this means.
>
Basically we are making use of the memory protection unit (MPU) so
that only M-mode is allowed to access this region and S/U modes are
blocked.

> > This works as expected but for applications say for example when doing
> > mmap to this region would still succeed and later down the path when
> > doing a read/write to this location would cause unhandled signal 11.
> > To handle this case gracefully we might want mmap() itself to fail if
> > the addr/offset falls in this local memory region.
>
> No, that's not what you want.  You want mmap to avoid allocating address
> space in that virtual address range.  I don't know if we have a good
> way to do that at the moment; like I said I've never seen such broken
> hardware before.
>
> I'd say the right way to solve this is to add a new special kind of VMA
> to the address space that covers this range.
Do you have any pointers where I can look further into this?

> We'd want to make sure it doesn't appear in /proc/*/maps and also that
> it can't be overridden with MAP_FIXED.
Agreed.

Cheers,
Prabhakar

* Re: [QUERY]: Block region to mmap
  2023-01-30 10:53   ` Lad, Prabhakar
@ 2023-01-30 15:24     ` Matthew Wilcox
  2023-02-01  6:31       ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2023-01-30 15:24 UTC (permalink / raw)
  To: Lad, Prabhakar
  Cc: Linux-MM, linux-riscv, Linux-Renesas,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Palmer Dabbelt, Arnd Bergmann, Rob Herring, Krzysztof Kozlowski,
	Jessica Clarke, Geert Uytterhoeven, Fabrizio Castro, Biju Das,
	Chris Paterson

On Mon, Jan 30, 2023 at 10:53:28AM +0000, Lad, Prabhakar wrote:
> > > To avoid this the ILM/DLM memory regions are now added to the root
> > > domain region of the PMPU with permissions set to 0x0 for S/U modes so
> > > that any access to these regions gets blocked and for M-mode we grant
> > > full access (R/W/X). This prevents any users from accessing these
> > > regions by triggering an unhandled signal 11 in S/U modes.
> >
> > I have no idea what any of this means.
> >
> Basically we are making use of the memory protection unit (MPU) so
> that only M-mode is allowed to access this region and S/U modes are
> blocked.

This sounds like RISC-V terminology.  I have no idea what M, S or U
modes are (Supervisor and User, I'd guess for the last two?)

> > > This works as expected but for applications say for example when doing
> > > mmap to this region would still succeed and later down the path when
> > > doing a read/write to this location would cause unhandled signal 11.
> > > To handle this case gracefully we might want mmap() itself to fail if
> > > the addr/offset falls in this local memory region.
> >
> > No, that's not what you want.  You want mmap to avoid allocating address
> > space in that virtual address range.  I don't know if we have a good
> > way to do that at the moment; like I said I've never seen such broken
> > hardware before.
> >
> > I'd say the right way to solve this is to add a new special kind of VMA
> > to the address space that covers this range.
> Do you have any pointers where I can look further into this?

Before we go too deeply into it, how much would it cost to buy all of
these parts and feed them into a shredder?  I'm not entirely joking;
if it's less than the software engineering time it'd take to develop
and support this feature, we should do it.


* Re: [QUERY]: Block region to mmap
  2023-01-30 15:24     ` Matthew Wilcox
@ 2023-02-01  6:31       ` Christoph Hellwig
  2023-02-01  7:05         ` Jessica Clarke
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2023-02-01  6:31 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Lad, Prabhakar, Linux-MM, linux-riscv, Linux-Renesas,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Palmer Dabbelt, Arnd Bergmann, Rob Herring, Krzysztof Kozlowski,
	Jessica Clarke, Geert Uytterhoeven, Fabrizio Castro, Biju Das,
	Chris Paterson

On Mon, Jan 30, 2023 at 03:24:40PM +0000, Matthew Wilcox wrote:
> > Basically we are making use of the memory protection unit (MPU) so
> > that only M-mode is allowed to access this region and S/U modes are
> > blocked.
> 
> This sounds like RISC-V terminology.  I have no idea what M, S or U
> modes are (Supervisor and User, I'd guess for the last two?)


Yes, M = Machine, S = Supervisor, and U = User.
M mode is the absolute worst idea of RISC-V and basically a mix
of microcode and super-SMM mode.

> Before we go too deeply into it, how much would it cost to buy all of
> these parts and feed them into a shredder?  I'm not entirely joking;
> if it's less than the software engineering time it'd take to develop
> and support this feature, we should do it.

The above suggests this is in no way an actual hardware problem, but the
stupid decision is done in the M-Mode firmware.  I think it is very
reasonable to simply not support the devices in Linux until the firmware
is fixed.

* Re: [QUERY]: Block region to mmap
  2023-02-01  6:31       ` Christoph Hellwig
@ 2023-02-01  7:05         ` Jessica Clarke
  2023-02-01  8:05           ` Arnd Bergmann
  0 siblings, 1 reply; 7+ messages in thread
From: Jessica Clarke @ 2023-02-01  7:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Matthew Wilcox, Lad, Prabhakar, Linux-MM, linux-riscv,
	Linux-Renesas,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Palmer Dabbelt, Arnd Bergmann, Rob Herring, Krzysztof Kozlowski,
	Geert Uytterhoeven, Fabrizio Castro, Biju Das, Chris Paterson

On 1 Feb 2023, at 06:31, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Jan 30, 2023 at 03:24:40PM +0000, Matthew Wilcox wrote:
>>> Basically we are making use of the memory protection unit (MPU) so
>>> that only M-mode is allowed to access this region and S/U modes are
>>> blocked.
>> 
>> This sounds like RISC-V terminology.  I have no idea what M, S or U
>> modes are (Supervisor and User, I'd guess for the last two?)
> 
> 
> Yes, M = Machine, S = Supervisor, and U = User.
> M mode is the absolute worst idea of RISC-V and basically a mix
> of microcode and super-SMM mode.
> 
>> Before we go too deeply into it, how much would it cost to buy all of
>> these parts and feed them into a shredder?  I'm not entirely joking;
>> if it's less than the software engineering time it'd take to develop
>> and support this feature, we should do it.
> 
> The above suggests this is in no way an actual hardware problem, but the
> stupid decision is done in the M-Mode firmware.  I think it is very
> reasonable to simply not support the devices in Linux until the firmware
> is fixed.

No, it really is a hardware spec violation. Virtual addresses within
the magic range bypass translation with no way to turn it off. The
firmware is being (has been?) patched to block those accesses at the
physical memory protection level so any attempt to use those virtual
addresses will fault, but if Linux wants to support this cursed
hardware and its gross spec violation then it needs to forbid any
allocation of the VA range.

This magic range also overlaps with the default base address used for
both GNU ld and LLVM LLD, for added entertainment, so almost every
position-dependent binary that exists in the world for RISC-V cannot be
run on this hardware. One could change that for future binaries, but
that doesn’t seem right to me... IMO this hardware is even more “not
RISC-V” than the D1 with its page table mess, but I don’t think we’ll
ever see RISC-V International come out and say that, so it’s up to the
open-source communities to decide what they want to support and what
they view as too much of a violation to be acceptable.

Jess


* Re: [QUERY]: Block region to mmap
  2023-02-01  7:05         ` Jessica Clarke
@ 2023-02-01  8:05           ` Arnd Bergmann
  0 siblings, 0 replies; 7+ messages in thread
From: Arnd Bergmann @ 2023-02-01  8:05 UTC (permalink / raw)
  To: Jessica Clarke, Christoph Hellwig
  Cc: Matthew Wilcox, Prabhakar, Linux-MM, linux-riscv, Linux-Renesas,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
	Geert Uytterhoeven, Fabrizio Castro, Biju Das, Chris Paterson

On Wed, Feb 1, 2023, at 08:05, Jessica Clarke wrote:
> On 1 Feb 2023, at 06:31, Christoph Hellwig <hch@infradead.org> wrote:
>> On Mon, Jan 30, 2023 at 03:24:40PM +0000, Matthew Wilcox wrote:
>> 
>>> Before we go too deeply into it, how much would it cost to buy all of
>>> these parts and feed them into a shredder?  I'm not entirely joking;
>>> if it's less than the software engineering time it'd take to develop
>>> and support this feature, we should do it.
>> 
>> The above suggests this is in no way an actual hardware problem, but the
>> stupid decision is done in the M-Mode firmware.  I think it is very
>> reasonable to simply not support the devices in Linux until the firmware
>> is fixed.
>
> No, it really is a hardware spec violation. Virtual addresses within
> the magic range bypass translation with no way to turn it off. The
> firmware is being (has been?) patched to block those accesses at the
> physical memory protection level so any attempt to use those virtual
> addresses will fault, but if Linux wants to support this cursed
> hardware and its gross spec violation then it needs to forbid any
> allocation of the VA range.

For a local build of an embedded system it's probably enough to
set CONFIG_DEFAULT_MMAP_MIN_ADDR and CONFIG_LSM_MMAP_MIN_ADDR
in order to force userspace outside of the broken address
range.

If that configuration can no longer run most regular userspace
binaries, there is probably not much need to detect the hardware
that needs it and do this automatically in the kernel, beyond
perhaps some platform specific code that refuses to boot unless
the config options are set this way on the affected chip
revisions.
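
As a sketch of what "set this way" could mean in numbers: the window
ends at 0x4FFFF, so the lowest minimum address that clears it is
0x50000, which is 327680 in decimal (both Kconfig options take a
decimal integer, as far as I know). The helper names below are
hypothetical; /proc/sys/vm/mmap_min_addr is the existing sysctl file
that exposes the runtime value, which a user-space sanity check could
read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Last byte of the window quoted in this thread. */
#define LOCAL_MEM_END 0x4FFFFUL

/* True if a minimum mmap address of min_addr keeps mappings above the window. */
bool min_addr_clears_window(unsigned long min_addr)
{
	return min_addr > LOCAL_MEM_END;
}

/*
 * Reads the running kernel's vm.mmap_min_addr sysctl. Returns 0 and stores
 * the value in *out on success, or -1 if the file cannot be read or parsed.
 */
int read_mmap_min_addr(unsigned long *out)
{
	FILE *f;
	int ok;

	assert(out != NULL);
	f = fopen("/proc/sys/vm/mmap_min_addr", "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%lu", out) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A common default such as 65536 does not clear the window, whereas
327680 does. The cost noted above still applies: binaries linked at a
base address below the limit would not load with such a setting.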

     Arnd

Thread overview: 7+ messages
2023-01-25 12:30 [QUERY]: Block region to mmap Lad, Prabhakar
2023-01-26 14:37 ` Matthew Wilcox
2023-01-30 10:53   ` Lad, Prabhakar
2023-01-30 15:24     ` Matthew Wilcox
2023-02-01  6:31       ` Christoph Hellwig
2023-02-01  7:05         ` Jessica Clarke
2023-02-01  8:05           ` Arnd Bergmann