linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] arm64: add alignment fault hanling
@ 2016-02-16  4:44 EunTaik Lee
  2016-02-16 10:31 ` Will Deacon
  2016-02-16 17:11 ` Catalin Marinas
  0 siblings, 2 replies; 14+ messages in thread
From: EunTaik Lee @ 2016-02-16  4:44 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: will.deacon, vladimir.murzin, suzuki.poulose, riandrews,
	james.morse, salyzyn, Dave.Martin, linux-arm-kernel,
	linux-kernel, EunTaik Lee

Userspace memory is mapped as below:
F2A7F000--F2A7FFFF Normal Memory
F2A80000--F2A80FFF Device nGnRnE

And that userspace application makes a system call
as below:

-009 |do_strncpy_from_user(inline)
-009 |strncpy_from_user()
-010 |getname_flags()
-011 |user_path_at_empty()
-012 |user_path_at()
-013 |SYSC_faccessat(inline)
-013 |sys_faccessat()
-014 |__sys_trace(asm)
 --> |exception

The string spans from 0xF2A7FFC1 to 0xF2A7FFFB.

When do_strncpy_from_user() reads the last (unsigned long)
value, the alignement fault is triggered. The 8 byte
from 0xF2A7FFC1 spans to the next page that is mapped as
Device nGnRnE, which does not allow an unaligned access,
causes the abort.

The instruction which caused the alignment fault is registered
in the fixup table but the exception handler does not reach there.

This patch registers a alignment fault handler and fixes up the
pc if appropriate.

Signed-off-by: Eun Taik Lee <eun.taik.lee@samsung.com>
---

changes in v2 : call do_bad_area() instead of calling
 fix_up_exception directly.

 arch/arm64/mm/fault.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 19211c4..a5ebb99 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -371,6 +371,14 @@ static int __kprobes do_translation_fault(unsigned long addr,
 	return 0;
 }
 
+static int __kprobes do_alignment_fault(unsigned long addr,
+					  unsigned int esr,
+					  struct pt_regs *regs)
+{
+	do_bad_area(addr, esr, regs);
+	return 0;
+}
+
 /*
  * This abort handler always returns "fault".
  */
@@ -418,7 +426,7 @@ static struct fault_info {
 	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
 	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
 	{ do_bad,		SIGBUS,  0,		"unknown 32"			},
-	{ do_bad,		SIGBUS,  BUS_ADRALN,	"alignment fault"		},
+	{ do_alignment_fault,	SIGBUS,  BUS_ADRALN,	"alignment fault"		},
 	{ do_bad,		SIGBUS,  0,		"debug event"			},
 	{ do_bad,		SIGBUS,  0,		"unknown 35"			},
 	{ do_bad,		SIGBUS,  0,		"unknown 36"			},
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16  4:44 [PATCH v2] arm64: add alignment fault hanling EunTaik Lee
@ 2016-02-16 10:31 ` Will Deacon
  2016-02-16 10:57   ` Robin Murphy
  2016-02-16 17:11 ` Catalin Marinas
  1 sibling, 1 reply; 14+ messages in thread
From: Will Deacon @ 2016-02-16 10:31 UTC (permalink / raw)
  To: EunTaik Lee
  Cc: Catalin Marinas, vladimir.murzin, suzuki.poulose, riandrews,
	james.morse, salyzyn, Dave.Martin, linux-arm-kernel,
	linux-kernel

On Tue, Feb 16, 2016 at 04:44:35AM +0000, EunTaik Lee wrote:
> Userspace memory is mapped as below:
> F2A7F000--F2A7FFFF Normal Memory
> F2A80000--F2A80FFF Device nGnRnE
> 
> And that userspace application makes a system call
> as below:
> 
> -009 |do_strncpy_from_user(inline)
> -009 |strncpy_from_user()
> -010 |getname_flags()
> -011 |user_path_at_empty()
> -012 |user_path_at()
> -013 |SYSC_faccessat(inline)
> -013 |sys_faccessat()
> -014 |__sys_trace(asm)
>  --> |exception
> 
> The string spans from 0xF2A7FFC1 to 0xF2A7FFFB.
> 
> When do_strncpy_from_user() reads the last (unsigned long)
> value, the alignement fault is triggered. The 8 byte
> from 0xF2A7FFC1 spans to the next page that is mapped as
> Device nGnRnE, which does not allow an unaligned access,
> causes the abort.
> 
> The instruction which caused the alignment fault is registered
> in the fixup table but the exception handler does not reach there.
> 
> This patch registers a alignment fault handler and fixes up the
> pc if appropriate.

As discussed with Catalin previously, we should solve this by adding a
guard page rather than handling the fault.

Will

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 10:31 ` Will Deacon
@ 2016-02-16 10:57   ` Robin Murphy
  2016-02-16 12:21     ` Catalin Marinas
  0 siblings, 1 reply; 14+ messages in thread
From: Robin Murphy @ 2016-02-16 10:57 UTC (permalink / raw)
  To: Will Deacon, EunTaik Lee
  Cc: vladimir.murzin, suzuki.poulose, Catalin Marinas, linux-kernel,
	salyzyn, riandrews, james.morse, Dave.Martin, linux-arm-kernel

On 16/02/16 10:31, Will Deacon wrote:
> On Tue, Feb 16, 2016 at 04:44:35AM +0000, EunTaik Lee wrote:
>> Userspace memory is mapped as below:
>> F2A7F000--F2A7FFFF Normal Memory
>> F2A80000--F2A80FFF Device nGnRnE
>>
>> And that userspace application makes a system call
>> as below:
>>
>> -009 |do_strncpy_from_user(inline)
>> -009 |strncpy_from_user()
>> -010 |getname_flags()
>> -011 |user_path_at_empty()
>> -012 |user_path_at()
>> -013 |SYSC_faccessat(inline)
>> -013 |sys_faccessat()
>> -014 |__sys_trace(asm)
>>   --> |exception
>>
>> The string spans from 0xF2A7FFC1 to 0xF2A7FFFB.
>>
>> When do_strncpy_from_user() reads the last (unsigned long)
>> value, the alignement fault is triggered. The 8 byte
>> from 0xF2A7FFC1 spans to the next page that is mapped as
>> Device nGnRnE, which does not allow an unaligned access,
>> causes the abort.
>>
>> The instruction which caused the alignment fault is registered
>> in the fixup table but the exception handler does not reach there.
>>
>> This patch registers a alignment fault handler and fixes up the
>> pc if appropriate.
>
> As discussed with Catalin previously, we should solve this by adding a
> guard page rather than handling the fault.

...especially since we may not even get a fault. See "Crossing a page 
boundary with different memory types or Shareability attributes" in the 
UNPREDICTABLE spec (K1.2.10 in the latest ARMv8 ARM).

Robin.

>
> Will
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 10:57   ` Robin Murphy
@ 2016-02-16 12:21     ` Catalin Marinas
  2016-02-16 16:00       ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Catalin Marinas @ 2016-02-16 12:21 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, EunTaik Lee, vladimir.murzin, suzuki.poulose,
	linux-kernel, salyzyn, riandrews, james.morse, Dave.Martin,
	linux-arm-kernel

On Tue, Feb 16, 2016 at 10:57:49AM +0000, Robin Murphy wrote:
> On 16/02/16 10:31, Will Deacon wrote:
> >On Tue, Feb 16, 2016 at 04:44:35AM +0000, EunTaik Lee wrote:
> >>Userspace memory is mapped as below:
> >>F2A7F000--F2A7FFFF Normal Memory
> >>F2A80000--F2A80FFF Device nGnRnE
> >>
> >>And that userspace application makes a system call
> >>as below:
> >>
> >>-009 |do_strncpy_from_user(inline)
> >>-009 |strncpy_from_user()
> >>-010 |getname_flags()
> >>-011 |user_path_at_empty()
> >>-012 |user_path_at()
> >>-013 |SYSC_faccessat(inline)
> >>-013 |sys_faccessat()
> >>-014 |__sys_trace(asm)
> >>  --> |exception
> >>
> >>The string spans from 0xF2A7FFC1 to 0xF2A7FFFB.
> >>
> >>When do_strncpy_from_user() reads the last (unsigned long)
> >>value, the alignement fault is triggered. The 8 byte
> >>from 0xF2A7FFC1 spans to the next page that is mapped as
> >>Device nGnRnE, which does not allow an unaligned access,
> >>causes the abort.
> >>
> >>The instruction which caused the alignment fault is registered
> >>in the fixup table but the exception handler does not reach there.
> >>
> >>This patch registers a alignment fault handler and fixes up the
> >>pc if appropriate.
> >
> >As discussed with Catalin previously, we should solve this by adding a
> >guard page rather than handling the fault.

I don't think we can trivially add this without implementing an arm64
specific arch_get_unmapped_area().

> ...especially since we may not even get a fault. See "Crossing a page
> boundary with different memory types or Shareability attributes" in the
> UNPREDICTABLE spec (K1.2.10 in the latest ARMv8 ARM).

While this is CONSTRAINED UNPREDICTABLE, it still doesn't sound good. If
you end up with such adjacent mappings in user space, you can claim it's
the user's responsibility not to do unpredictable accesses across such
boundary. But in this case, it's the kernel do_strncpy_from_user()
assuming that it can freely go across boundaries and patch the fault
afterwards.

Still digging into this but I think we have a bug.

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 12:21     ` Catalin Marinas
@ 2016-02-16 16:00       ` Will Deacon
  2016-02-16 17:04         ` Will Deacon
  2016-02-16 17:09         ` Catalin Marinas
  0 siblings, 2 replies; 14+ messages in thread
From: Will Deacon @ 2016-02-16 16:00 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Robin Murphy, EunTaik Lee, vladimir.murzin, suzuki.poulose,
	linux-kernel, salyzyn, riandrews, james.morse, Dave.Martin,
	linux-arm-kernel

On Tue, Feb 16, 2016 at 12:21:53PM +0000, Catalin Marinas wrote:
> On Tue, Feb 16, 2016 at 10:57:49AM +0000, Robin Murphy wrote:
> > On 16/02/16 10:31, Will Deacon wrote:
> > >On Tue, Feb 16, 2016 at 04:44:35AM +0000, EunTaik Lee wrote:
> > >>Userspace memory is mapped as below:
> > >>F2A7F000--F2A7FFFF Normal Memory
> > >>F2A80000--F2A80FFF Device nGnRnE
> > >>
> > >>And that userspace application makes a system call
> > >>as below:
> > >>
> > >>-009 |do_strncpy_from_user(inline)
> > >>-009 |strncpy_from_user()
> > >>-010 |getname_flags()
> > >>-011 |user_path_at_empty()
> > >>-012 |user_path_at()
> > >>-013 |SYSC_faccessat(inline)
> > >>-013 |sys_faccessat()
> > >>-014 |__sys_trace(asm)
> > >>  --> |exception
> > >>
> > >>The string spans from 0xF2A7FFC1 to 0xF2A7FFFB.
> > >>
> > >>When do_strncpy_from_user() reads the last (unsigned long)
> > >>value, the alignement fault is triggered. The 8 byte
> > >>from 0xF2A7FFC1 spans to the next page that is mapped as
> > >>Device nGnRnE, which does not allow an unaligned access,
> > >>causes the abort.
> > >>
> > >>The instruction which caused the alignment fault is registered
> > >>in the fixup table but the exception handler does not reach there.
> > >>
> > >>This patch registers a alignment fault handler and fixes up the
> > >>pc if appropriate.
> > >
> > >As discussed with Catalin previously, we should solve this by adding a
> > >guard page rather than handling the fault.
> 
> I don't think we can trivially add this without implementing an arm64
> specific arch_get_unmapped_area().

Even overriding arch_get_unmapped_area doesn't help as much as you might
like since, in the case of something like /dev/mem, the memory is remapped
using remap_pfn_range later on, so you can't necessarily tell what the
final attributes are likely to be when you initially allocate the virtual
space.

Will

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 16:00       ` Will Deacon
@ 2016-02-16 17:04         ` Will Deacon
  2016-02-16 18:50           ` Linus Torvalds
  2016-02-16 17:09         ` Catalin Marinas
  1 sibling, 1 reply; 14+ messages in thread
From: Will Deacon @ 2016-02-16 17:04 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Robin Murphy, EunTaik Lee, vladimir.murzin, suzuki.poulose,
	linux-kernel, salyzyn, riandrews, james.morse, Dave.Martin,
	linux-arm-kernel, torvalds, hpa, arjan, peterz

[replying to self and adding some x86 people]

Background: Euntaik reports a problem where userspace has ended up with
a memory page mapped adjacent to an MMIO page (e.g. from /dev/mem or a
PCI memory bar from someplace in /sys). strncpy_from_user happens with
the word-at-a-time implementation, and we end up reading into the MMIO
page.

Question: Does x86 guarantee that this faults? (Arjan reckoned no, but
wasn't 100%).

On Tue, Feb 16, 2016 at 04:00:55PM +0000, Will Deacon wrote:
> On Tue, Feb 16, 2016 at 12:21:53PM +0000, Catalin Marinas wrote:
> > On Tue, Feb 16, 2016 at 10:57:49AM +0000, Robin Murphy wrote:
> > > On 16/02/16 10:31, Will Deacon wrote:
> > > >On Tue, Feb 16, 2016 at 04:44:35AM +0000, EunTaik Lee wrote:
> > > >>Userspace memory is mapped as below:
> > > >>F2A7F000--F2A7FFFF Normal Memory
> > > >>F2A80000--F2A80FFF Device nGnRnE
> > > >>
> > > >>And that userspace application makes a system call
> > > >>as below:
> > > >>
> > > >>-009 |do_strncpy_from_user(inline)
> > > >>-009 |strncpy_from_user()
> > > >>-010 |getname_flags()
> > > >>-011 |user_path_at_empty()
> > > >>-012 |user_path_at()
> > > >>-013 |SYSC_faccessat(inline)
> > > >>-013 |sys_faccessat()
> > > >>-014 |__sys_trace(asm)
> > > >>  --> |exception
> > > >>
> > > >>The string spans from 0xF2A7FFC1 to 0xF2A7FFFB.
> > > >>
> > > >>When do_strncpy_from_user() reads the last (unsigned long)
> > > >>value, the alignement fault is triggered. The 8 byte
> > > >>from 0xF2A7FFC1 spans to the next page that is mapped as
> > > >>Device nGnRnE, which does not allow an unaligned access,
> > > >>causes the abort.
> > > >>
> > > >>The instruction which caused the alignment fault is registered
> > > >>in the fixup table but the exception handler does not reach there.
> > > >>
> > > >>This patch registers a alignment fault handler and fixes up the
> > > >>pc if appropriate.
> > > >
> > > >As discussed with Catalin previously, we should solve this by adding a
> > > >guard page rather than handling the fault.
> > 
> > I don't think we can trivially add this without implementing an arm64
> > specific arch_get_unmapped_area().
> 
> Even overriding arch_get_unmapped_area doesn't help as much as you might
> like since, in the case of something like /dev/mem, the memory is remapped
> using remap_pfn_range later on, so you can't necessarily tell what the
> final attributes are likely to be when you initially allocate the virtual
> space.

Thinking more about this, we could spit out a guard page between every
VMA, but it's likely to hamper any VMA merging.

Will

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 16:00       ` Will Deacon
  2016-02-16 17:04         ` Will Deacon
@ 2016-02-16 17:09         ` Catalin Marinas
  1 sibling, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2016-02-16 17:09 UTC (permalink / raw)
  To: Will Deacon
  Cc: vladimir.murzin, suzuki.poulose, linux-kernel, salyzyn,
	riandrews, james.morse, EunTaik Lee, Robin Murphy, Dave.Martin,
	linux-arm-kernel

On Tue, Feb 16, 2016 at 04:00:55PM +0000, Will Deacon wrote:
> On Tue, Feb 16, 2016 at 12:21:53PM +0000, Catalin Marinas wrote:
> > On Tue, Feb 16, 2016 at 10:57:49AM +0000, Robin Murphy wrote:
> > > On 16/02/16 10:31, Will Deacon wrote:
> > > >On Tue, Feb 16, 2016 at 04:44:35AM +0000, EunTaik Lee wrote:
> > > >>Userspace memory is mapped as below:
> > > >>F2A7F000--F2A7FFFF Normal Memory
> > > >>F2A80000--F2A80FFF Device nGnRnE
> > > >>
> > > >>And that userspace application makes a system call
> > > >>as below:
> > > >>
> > > >>-009 |do_strncpy_from_user(inline)
> > > >>-009 |strncpy_from_user()
> > > >>-010 |getname_flags()
> > > >>-011 |user_path_at_empty()
> > > >>-012 |user_path_at()
> > > >>-013 |SYSC_faccessat(inline)
> > > >>-013 |sys_faccessat()
> > > >>-014 |__sys_trace(asm)
> > > >>  --> |exception
> > > >>
> > > >>The string spans from 0xF2A7FFC1 to 0xF2A7FFFB.
> > > >>
> > > >>When do_strncpy_from_user() reads the last (unsigned long)
> > > >>value, the alignement fault is triggered. The 8 byte
> > > >>from 0xF2A7FFC1 spans to the next page that is mapped as
> > > >>Device nGnRnE, which does not allow an unaligned access,
> > > >>causes the abort.
> > > >>
> > > >>The instruction which caused the alignment fault is registered
> > > >>in the fixup table but the exception handler does not reach there.
> > > >>
> > > >>This patch registers a alignment fault handler and fixes up the
> > > >>pc if appropriate.
> > > >
> > > >As discussed with Catalin previously, we should solve this by adding a
> > > >guard page rather than handling the fault.
> > 
> > I don't think we can trivially add this without implementing an arm64
> > specific arch_get_unmapped_area().
> 
> Even overriding arch_get_unmapped_area doesn't help as much as you might
> like since, in the case of something like /dev/mem, the memory is remapped
> using remap_pfn_range later on, so you can't necessarily tell what the
> final attributes are likely to be when you initially allocate the virtual
> space.

This is not about attributes but allowing a guard page on each side of
the mmap range. So do_mmap() would get an unmapped range via
get_unmapped_area() and mmap_region() would populate the vma before
calling file->f_op->mmap(). Of course, we still can't prevent user
asking for specific addresses via mmap(2), but in this case you can say
it's the user's fault.

The do_strncpy_from_user() issue apart, as long as we allow Device
mappings in user space, we need to allow the kernel to handle unaligned
accesses via get_user. This means using do_bad_area for the alignment
fault instead of do_bad.

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16  4:44 [PATCH v2] arm64: add alignment fault hanling EunTaik Lee
  2016-02-16 10:31 ` Will Deacon
@ 2016-02-16 17:11 ` Catalin Marinas
  1 sibling, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2016-02-16 17:11 UTC (permalink / raw)
  To: EunTaik Lee
  Cc: vladimir.murzin, suzuki.poulose, will.deacon, linux-kernel,
	salyzyn, riandrews, james.morse, Dave.Martin, linux-arm-kernel

On Tue, Feb 16, 2016 at 04:44:38AM +0000, EunTaik Lee wrote:
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 19211c4..a5ebb99 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -371,6 +371,14 @@ static int __kprobes do_translation_fault(unsigned long addr,
>  	return 0;
>  }
>  
> +static int __kprobes do_alignment_fault(unsigned long addr,
> +					  unsigned int esr,
> +					  struct pt_regs *regs)
> +{
> +	do_bad_area(addr, esr, regs);
> +	return 0;
> +}
> +
>  /*
>   * This abort handler always returns "fault".
>   */
> @@ -418,7 +426,7 @@ static struct fault_info {
>  	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
>  	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
>  	{ do_bad,		SIGBUS,  0,		"unknown 32"			},
> -	{ do_bad,		SIGBUS,  BUS_ADRALN,	"alignment fault"		},
> +	{ do_alignment_fault,	SIGBUS,  BUS_ADRALN,	"alignment fault"		},

Do you need a new function, can you not just add do_bad_area in the
fault_info array?

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 17:04         ` Will Deacon
@ 2016-02-16 18:50           ` Linus Torvalds
  2016-02-16 21:31             ` Arjan van de Ven
  2016-02-19 18:14             ` Catalin Marinas
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2016-02-16 18:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: Catalin Marinas, Robin Murphy, EunTaik Lee, vladimir.murzin,
	suzuki.poulose, linux-kernel, salyzyn, riandrews, james.morse,
	Dave.Martin, linux-arm-kernel, Peter Anvin, Arjan van de Ven,
	Peter Zijlstra

On Tue, Feb 16, 2016 at 9:04 AM, Will Deacon <will.deacon@arm.com> wrote:
> [replying to self and adding some x86 people]
>
> Background: Euntaik reports a problem where userspace has ended up with
> a memory page mapped adjacent to an MMIO page (e.g. from /dev/mem or a
> PCI memory bar from someplace in /sys). strncpy_from_user happens with
> the word-at-a-time implementation, and we end up reading into the MMIO
> page.
>
> Question: Does x86 guarantee that this faults? (Arjan reckoned no, but
> wasn't 100%).

x86 not only does *not* guarantee that that faults, but quite the
reverse: I'm pretty sure it would be considered an architectural bug
if it didn't silently "just work". IOW, the page-crossing unaligned
access will be split up as two accesses, and done as a regular cached
read for the normal RAM page part, and as an uncached IO read for the
MMIO page.

So if somebody passes us a string adjacent to a MMIO mapping, we'll
happily do the access on x86 and touch the MMIO space. No biggie, and
nobody sane actually does that. If you have the rights to do device
mappings, you should probably restrain from doing insane things. "With
great power comes great responsibility".

> Thinking more about this, we could spit out a guard page between every
> VMA, but it's likely to hamper any VMA merging.

More importantly, it wouldn't work for the general case. People often
pass in a specific virtual memory address to mmap because they *need*
adjacent mappings. One example of that case is doing a circular buffer
that is guaranteed to always be accessible linearly (and avoiding the
split case where it hits the end of the circular buffer) by mapping
the same buffer twice.

Of course, no actual real program will do that for mixing MMIO and
non-MMIO, and so we might obviously add code to always add a guard
page for the normal case when a specific address isn't asked for. So
as a heuristic to make sure it doesn't happen by mistake it possibly
makes sense.

You'd still want to make sure that you don't get a kernel oops if some
attacker tries to break ARM64 on purpose, obviously (although even
that is not necessarily a huge deal for somebody who already has
physical IO privileges). So I suspect that Eun Taik Lee's patch makes
perfect sense even if it's not necessarily the full solution in
itself.

                Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 18:50           ` Linus Torvalds
@ 2016-02-16 21:31             ` Arjan van de Ven
  2016-02-16 23:04               ` Catalin Marinas
       [not found]               ` <CA+55aFz+ttJoEG_WkpkwV=+Wunzxpj9NoHobq-8oFZS0HEEyeA@mail.gmail.com>
  2016-02-19 18:14             ` Catalin Marinas
  1 sibling, 2 replies; 14+ messages in thread
From: Arjan van de Ven @ 2016-02-16 21:31 UTC (permalink / raw)
  To: Linus Torvalds, Will Deacon
  Cc: Catalin Marinas, Robin Murphy, EunTaik Lee, vladimir.murzin,
	suzuki.poulose, linux-kernel, salyzyn, riandrews, james.morse,
	Dave.Martin, linux-arm-kernel, Peter Anvin, Peter Zijlstra

On 2/16/2016 10:50 AM, Linus Torvalds wrote:
> On Tue, Feb 16, 2016 at 9:04 AM, Will Deacon <will.deacon@arm.com> wrote:
>> [replying to self and adding some x86 people]
>>
>> Background: Euntaik reports a problem where userspace has ended up with
>> a memory page mapped adjacent to an MMIO page (e.g. from /dev/mem or a
>> PCI memory bar from someplace in /sys). strncpy_from_user happens with
>> the word-at-a-time implementation, and we end up reading into the MMIO
>> page.

how does this work if the adjacent page is not accessible?
or has some other magic fault handler, or is on an NFS filesystem where
the server is rebooting?

isn't the general rule for such basic functions "don't touch memory unless you KNOW it is there"



> Of course, no actual real program will do that for mixing MMIO and
> non-MMIO, and so we might obviously add code to always add a guard
> page for the normal case when a specific address isn't asked for. So
> as a heuristic to make sure it doesn't happen by mistake it possibly
> makes sense.

but what happens to the read if the page isn't present?
or is execute-only or .. or ..

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 21:31             ` Arjan van de Ven
@ 2016-02-16 23:04               ` Catalin Marinas
       [not found]               ` <CA+55aFz+ttJoEG_WkpkwV=+Wunzxpj9NoHobq-8oFZS0HEEyeA@mail.gmail.com>
  1 sibling, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2016-02-16 23:04 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linus Torvalds, Will Deacon, vladimir.murzin, suzuki.poulose,
	Peter Zijlstra, linux-kernel, salyzyn, riandrews, james.morse,
	EunTaik Lee, Peter Anvin, Robin Murphy, Dave.Martin,
	linux-arm-kernel

On Tue, Feb 16, 2016 at 01:31:36PM -0800, Arjan van de Ven wrote:
> On 2/16/2016 10:50 AM, Linus Torvalds wrote:
> >On Tue, Feb 16, 2016 at 9:04 AM, Will Deacon <will.deacon@arm.com> wrote:
> >>[replying to self and adding some x86 people]
> >>
> >>Background: Euntaik reports a problem where userspace has ended up with
> >>a memory page mapped adjacent to an MMIO page (e.g. from /dev/mem or a
> >>PCI memory bar from someplace in /sys). strncpy_from_user happens with
> >>the word-at-a-time implementation, and we end up reading into the MMIO
> >>page.
> 
> how does this work if the adjacent page is not accessible?

do_strncpy_from_user() assumes by default that it can read a word at a
time using get_user() but checks its return value in case it failed and
falls back to byte at a time. What happens on arm64 is that for
alignment faults we don't have a handler that would search the exception
table and run the get_user() fixup.

> isn't the general rule for such basic functions "don't touch memory
> unless you KNOW it is there"

Well, user access routines are in general safe with this via the
exception handling + fixup mechanism (which usually returns -EFAULT).
That's what do_strncpy_from_user() tries to do by optimising away the
boundary checks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
       [not found]               ` <CA+55aFz+ttJoEG_WkpkwV=+Wunzxpj9NoHobq-8oFZS0HEEyeA@mail.gmail.com>
@ 2016-02-17  0:28                 ` Linus Torvalds
  0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2016-02-17  0:28 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: james.morse, salyzyn, linux-arm-kernel, vladimir.murzin,
	Will Deacon, Robin Murphy, riandrews, Peter Anvin, Dave.Martin,
	EunTaik Lee, linux-kernel, suzuki.poulose, Catalin Marinas,
	Peter Zijlstra

On Tue, Feb 16, 2016 at 1:42 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Feb 16, 2016 1:31 PM, "Arjan van de Ven" <arjan@linux.intel.com> wrote:
>>
>> but what happens to the read if the page isn't present?
>> or is execute-only or .. or ..
>
> If we actually get a fault and handle the exception (not handling the
> exception was the problem on arm), the exception code will just cut off the
> pathname at the page boundary.
>
> So it will see the accessible part, and get zeroes for the inaccessible one.

Actually, looking closer, we only do that for the kernel case (where
pagealloc-debug can cause the unaligned path component in *kernel*
space to trap).

I misremembered because I considered doing it for user accesses too,
but as Catalin correctly says, there we don't actually end up being
that clever, and we just fall back to byte-at-a-time. Which means that
we do get the exact EFAULT behavior even though I'm not 100% convinced
we need to.

See the use of "load_unaligned_zeropad()"  (in the dcache handling) vs
just "get_user()" (in strncpy_from_user()).

The fault case doesn't actually ever happen in practice.

The IS_UNALIGNED() case (on architectures with inefficient unaligned
handling), which also falls back to the byte-at-a-time model, is
likely a much bigger problem. They probably need their own strncpy if
they care about performance. But the common architectures all happily
do efficient unaligneds these days.

             Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-16 18:50           ` Linus Torvalds
  2016-02-16 21:31             ` Arjan van de Ven
@ 2016-02-19 18:14             ` Catalin Marinas
  2016-02-19 22:09               ` Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Catalin Marinas @ 2016-02-19 18:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Will Deacon, vladimir.murzin, Arjan van de Ven, suzuki.poulose,
	Peter Zijlstra, linux-kernel, salyzyn, riandrews, james.morse,
	EunTaik Lee, Peter Anvin, Robin Murphy, Dave.Martin,
	linux-arm-kernel

On Tue, Feb 16, 2016 at 10:50:02AM -0800, Linus Torvalds wrote:
> On Tue, Feb 16, 2016 at 9:04 AM, Will Deacon <will.deacon@arm.com> wrote:
> > [replying to self and adding some x86 people]
> >
> > Background: Euntaik reports a problem where userspace has ended up with
> > a memory page mapped adjacent to an MMIO page (e.g. from /dev/mem or a
> > PCI memory bar from someplace in /sys). strncpy_from_user happens with
> > the word-at-a-time implementation, and we end up reading into the MMIO
> > page.
> >
> > Question: Does x86 guarantee that this faults? (Arjan reckoned no, but
> > wasn't 100%).
> 
> x86 not only does *not* guarantee that that faults, but quite the
> reverse: I'm pretty sure it would be considered an architectural bug
> if it didn't silently "just work". IOW, the page-crossing unaligned
> access will be split up as two accesses, and done as a regular cached
> read for the normal RAM page part, and as an uncached IO read for the
> MMIO page.
> 
> So if somebody passes us a string adjacent to a MMIO mapping, we'll
> happily do the access on x86 and touch the MMIO space. No biggie, and
> nobody sane actually does that. If you have the rights to do device
> mappings, you should probably restrain from doing insane things. "With
> great power comes great responsibility".

Unless I misunderstand it, I don't think the user is even aware of this
potential problem. Let's say it mmap's a file (script etc.) which
contains a string (file name) towards the end of the last page. Such
pointer gets passed to sys_open() and it eventually ends up in
strncpy_from_user() with count == EMBEDDED_NAME_MAX (so well beyond the
end of the page).

The user also maps a device with mmap(NULL, ...) and there is a slight
chance that the kernel allocates this VA space without any gap from the
previous mmap().

With the above scenario, I don't see what the user could control here,
other than using MAP_FIXED and force a guard from user space.

> > Thinking more about this, we could spit out a guard page between every
> > VMA, but it's likely to hamper any VMA merging.
> 
> More importantly, it wouldn't work for the general case. People often
> pass in a specific virtual memory address to mmap because they *need*
> adjacent mappings. One example of that case is doing a circular buffer
> that is guaranteed to always be accessible linearly (and avoiding the
> split case where it hits the end of the circular buffer) by mapping
> the same buffer twice.
> 
> Of course, no actual real program will do that for mixing MMIO and
> non-MMIO, and so we might obviously add code to always add a guard
> page for the normal case when a specific address isn't asked for. So
> as a heuristic to make sure it doesn't happen by mistake it possibly
> makes sense.

I agree, a guard page for the general case (!MAP_FIXED) is the best
option and it would be better to have it in core code (maybe hidden
behind some Kconfig if it's not desirable for all architectures using
the generic implementation).

Alternatively, aligning the source pointer reads (but not the
destination) in strncpy_from_user() is another option, though I'm pretty
sure we would notice some performance penalty.

-- 
Catalin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] arm64: add alignment fault hanling
  2016-02-19 18:14             ` Catalin Marinas
@ 2016-02-19 22:09               ` Linus Torvalds
  0 siblings, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2016-02-19 22:09 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, vladimir.murzin, Arjan van de Ven, suzuki.poulose,
	Peter Zijlstra, linux-kernel, salyzyn, riandrews, james.morse,
	EunTaik Lee, Peter Anvin, Robin Murphy, Dave.Martin,
	linux-arm-kernel

On Fri, Feb 19, 2016 at 10:14 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> Unless I misunderstand it, I don't think the user is even aware of this
> potential problem. Let's say it mmap's a file (script etc.) which
> contains a string (file name) towards the end of the last page. Such
> pointer gets passed to sys_open() and it eventually ends up in
> strncpy_from_user() with count == EMBEDDED_NAME_MAX (so well beyond the
> end of the page).
>
> The user also maps a device with mmap(NULL, ...) and there is a slight
> chance that the kernel allocates this VA space without any gap from the
> previous mmap().

Quite frankly, if the user mmap's MMIO - and particularly MMIO that
has side effects on speculative reads - the user had better be damn
sure they separate that MMIO well. Such a user should make sure they
have redzones around it on their own.

First off, it just doesn't happen. Basically zero applications map
MMIO to begin with, and the ones that do do it do it for frame buffers
etc that don't have side effects on reads.

I can't think of a single case of anybody actually having side effects
on reads on any sane hardware (except the obvious side effect of
"that's slow") any more, simply because it's horrible hardware design
and causes way worse problems than the whole page-crosser issue. Yeah,
I'm sure it exists somewhere, but I just wouldn't worry about it.

> Alternatively, aligning the source pointer reads (but not the
> destination) in strncpy_from_user() is another option, though I'm pretty
> sure we would notice some performance penalty.

It definitely is the destination pointer that needs alignment, simply
because that's the case that matters. There's a reason CPU's look
alike, and generally have patterns like "two read ports, one write
port" in their memory subsystem.

But more importantly, the kernel policy has always been to make the
cases that matter fast, and make sure we don't break anything.

If we can actually find a case where this matters and the dcache
access model actually breaks something, I'll care _deeply_, because it
will be the whole "no regressions" rule. But that "no regressions"
rule has always been about reality, not about theory.

We could make our strncpy try to be more careful about alignment. The
problem with that tends to be that keeping both source and destination
aligned is really complicated to do (a) efficiently and (b) portably.
There may well be things like vector extensions etc that are literally
*designed* for doing string copies and do the byte shifting in
hardware so that you can do things like this really well, but I'd like
to keep a (reasonably) simple generic strncpy_from_user() at least
until somebody actually sees it as a problem in real life.

          Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-02-19 22:09 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-16  4:44 [PATCH v2] arm64: add alignment fault hanling EunTaik Lee
2016-02-16 10:31 ` Will Deacon
2016-02-16 10:57   ` Robin Murphy
2016-02-16 12:21     ` Catalin Marinas
2016-02-16 16:00       ` Will Deacon
2016-02-16 17:04         ` Will Deacon
2016-02-16 18:50           ` Linus Torvalds
2016-02-16 21:31             ` Arjan van de Ven
2016-02-16 23:04               ` Catalin Marinas
     [not found]               ` <CA+55aFz+ttJoEG_WkpkwV=+Wunzxpj9NoHobq-8oFZS0HEEyeA@mail.gmail.com>
2016-02-17  0:28                 ` Linus Torvalds
2016-02-19 18:14             ` Catalin Marinas
2016-02-19 22:09               ` Linus Torvalds
2016-02-16 17:09         ` Catalin Marinas
2016-02-16 17:11 ` Catalin Marinas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).