BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code
@ 2021-04-07 15:11 Liam Howlett
  2021-04-12 17:43 ` Catalin Marinas
  0 siblings, 1 reply; 5+ messages in thread
From: Liam Howlett @ 2021-04-07 15:11 UTC (permalink / raw)
  To: Catalin Marinas, Andre Przywara, Will Deacon,
	Peter Collingbourne, linux-arm-kernel, linux-kernel, netdev, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Amit Daniel Kachhap

find_vma() will continue to search upwards until the end of the virtual
memory space.  This means the si_code would almost never be set to
SEGV_MAPERR even when the address falls outside of any VMA.  The result
is that the si_code is not reliable as it may or may not be set to the
correct result, depending on where the address falls in the address
space.

Using find_vma_intersection() allows for what is intended by only
returning a VMA if it falls within the range provided, in this case a
window of 1.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
---
 arch/arm64/kernel/traps.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index a05d34f0e82a..a44007904a64 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
 void arm64_notify_segfault(unsigned long addr)
 {
 	int code;
+	unsigned long ut_addr = untagged_addr(addr);
 
 	mmap_read_lock(current->mm);
-	if (find_vma(current->mm, untagged_addr(addr)) == NULL)
+	if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
 		code = SEGV_MAPERR;
 	else
 		code = SEGV_ACCERR;
-- 
2.30.0

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code
  2021-04-07 15:11 [PATCH] arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code Liam Howlett
@ 2021-04-12 17:43 ` Catalin Marinas
  2021-04-13 16:52   ` Liam Howlett
  0 siblings, 1 reply; 5+ messages in thread
From: Catalin Marinas @ 2021-04-12 17:43 UTC (permalink / raw)
  To: Liam Howlett
  Cc: Andre Przywara, Will Deacon, Peter Collingbourne,
	linux-arm-kernel, linux-kernel, netdev, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Amit Daniel Kachhap

On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> find_vma() will continue to search upwards until the end of the virtual
> memory space.  This means the si_code would almost never be set to
> SEGV_MAPERR even when the address falls outside of any VMA.  The result
> is that the si_code is not reliable as it may or may not be set to the
> correct result, depending on where the address falls in the address
> space.
> 
> Using find_vma_intersection() allows for what is intended by only
> returning a VMA if it falls within the range provided, in this case a
> window of 1.
> 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> ---
>  arch/arm64/kernel/traps.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index a05d34f0e82a..a44007904a64 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
>  void arm64_notify_segfault(unsigned long addr)
>  {
>  	int code;
> +	unsigned long ut_addr = untagged_addr(addr);
>  
>  	mmap_read_lock(current->mm);
> -	if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> +	if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
>  		code = SEGV_MAPERR;
>  	else
>  		code = SEGV_ACCERR;

I don't think your change is entirely correct either. We can have a
fault below the vma of a stack (with VM_GROWSDOWN) and
find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
instead.

Maybe this should employ similar checks as __do_page_fault() (with
expand_stack() and VM_GROWSDOWN).

-- 
Catalin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code
  2021-04-12 17:43 ` Catalin Marinas
@ 2021-04-13 16:52   ` Liam Howlett
  2021-04-13 18:00     ` Catalin Marinas
  0 siblings, 1 reply; 5+ messages in thread
From: Liam Howlett @ 2021-04-13 16:52 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andre Przywara, Will Deacon, Peter Collingbourne,
	linux-arm-kernel, linux-kernel, netdev, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Amit Daniel Kachhap

* Catalin Marinas <catalin.marinas@arm.com> [210412 13:44]:
> On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> > find_vma() will continue to search upwards until the end of the virtual
> > memory space.  This means the si_code would almost never be set to
> > SEGV_MAPERR even when the address falls outside of any VMA.  The result
> > is that the si_code is not reliable as it may or may not be set to the
> > correct result, depending on where the address falls in the address
> > space.
> > 
> > Using find_vma_intersection() allows for what is intended by only
> > returning a VMA if it falls within the range provided, in this case a
> > window of 1.
> > 
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > ---
> >  arch/arm64/kernel/traps.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index a05d34f0e82a..a44007904a64 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
> >  void arm64_notify_segfault(unsigned long addr)
> >  {
> >  	int code;
> > +	unsigned long ut_addr = untagged_addr(addr);
> >  
> >  	mmap_read_lock(current->mm);
> > -	if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> > +	if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
> >  		code = SEGV_MAPERR;
> >  	else
> >  		code = SEGV_ACCERR;


Thank you for taking the time to thoroughly review this patch.

> 
> I don't think your change is entirely correct either. We can have a
> fault below the vma of a stack (with VM_GROWSDOWN) and
> find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
> instead.

I'm pretty sure I am missing something.  From what you said above, I
think this means that there can be a user cache fault below the stack
which should notify the user application that they are not allowed to
expand the stack by sending a SIGV_ACCERR in the si_code?  Is this
expected behaviour or am I missing a code path to this function?

> 
> Maybe this should employ similar checks as __do_page_fault() (with
> expand_stack() and VM_GROWSDOWN).

You mean the code needs to detect endianness and to check if this is an
attempt to expand the stack for both cases?

Thanks,
Liam

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code
  2021-04-13 16:52   ` Liam Howlett
@ 2021-04-13 18:00     ` Catalin Marinas
  2021-04-14 16:30       ` Liam Howlett
  0 siblings, 1 reply; 5+ messages in thread
From: Catalin Marinas @ 2021-04-13 18:00 UTC (permalink / raw)
  To: Liam Howlett
  Cc: Andre Przywara, Will Deacon, Peter Collingbourne,
	linux-arm-kernel, linux-kernel, netdev, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Amit Daniel Kachhap

On Tue, Apr 13, 2021 at 04:52:34PM +0000, Liam Howlett wrote:
> * Catalin Marinas <catalin.marinas@arm.com> [210412 13:44]:
> > On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> > > find_vma() will continue to search upwards until the end of the virtual
> > > memory space.  This means the si_code would almost never be set to
> > > SEGV_MAPERR even when the address falls outside of any VMA.  The result
> > > is that the si_code is not reliable as it may or may not be set to the
> > > correct result, depending on where the address falls in the address
> > > space.
> > > 
> > > Using find_vma_intersection() allows for what is intended by only
> > > returning a VMA if it falls within the range provided, in this case a
> > > window of 1.
> > > 
> > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > > ---
> > >  arch/arm64/kernel/traps.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > index a05d34f0e82a..a44007904a64 100644
> > > --- a/arch/arm64/kernel/traps.c
> > > +++ b/arch/arm64/kernel/traps.c
> > > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
> > >  void arm64_notify_segfault(unsigned long addr)
> > >  {
> > >  	int code;
> > > +	unsigned long ut_addr = untagged_addr(addr);
> > >  
> > >  	mmap_read_lock(current->mm);
> > > -	if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> > > +	if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
> > >  		code = SEGV_MAPERR;
> > >  	else
> > >  		code = SEGV_ACCERR;
[...]
> > I don't think your change is entirely correct either. We can have a
> > fault below the vma of a stack (with VM_GROWSDOWN) and
> > find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
> > instead.
> 
> I'm pretty sure I am missing something.  From what you said above, I
> think this means that there can be a user cache fault below the stack
> which should notify the user application that they are not allowed to
> expand the stack by sending a SIGV_ACCERR in the si_code?  Is this
> expected behaviour or am I missing a code path to this function?

My point was that find_vma() may return a valid vma where addr < vm_end
but also addr < vm_addr. It's the responsibility of the caller to check
that that vma can be expanded (VM_GROWSDOWN) and we do something like
this in __do_page_fault(). find_vma_intersection(), OTOH, requires addr
>= vm_start.

If we hit this case (addr < vm_start), normally we'd first need to check
whether it's expandable and, if not, return MAPERR. If it's expandable,
it should be ACCERR since something else caused the fault.

Now, I think at least for user_cache_maint_handler(), we can assume that
__do_page_fault() handled any expansion already, so we don't need to
check it here. In this case, your find_vma_intersection() check should
work.

Are there other cases where we invoke arm64_notify_segfault() without a
prior fault? I think in swp_handler() we can bail out early before we
even attempted the access so we may report MAPERR but ACCERR is a better
indication. Also in sys_rt_sigreturn() we always call it as
arm64_notify_segfault(regs->sp). I'm not sure that's correct in all
cases, see restore_altstack().

I guess this code needs some tidying up.

> > Maybe this should employ similar checks as __do_page_fault() (with
> > expand_stack() and VM_GROWSDOWN).
> 
> You mean the code needs to detect endianness and to check if this is an
> attempt to expand the stack for both cases?

Nothing to do with endianness, just the relation between the address and
the vma->vm_start and whether the vma can be expanded down.

-- 
Catalin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code
  2021-04-13 18:00     ` Catalin Marinas
@ 2021-04-14 16:30       ` Liam Howlett
  0 siblings, 0 replies; 5+ messages in thread
From: Liam Howlett @ 2021-04-14 16:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andre Przywara, Will Deacon, Peter Collingbourne,
	linux-arm-kernel, linux-kernel, netdev, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Amit Daniel Kachhap

* Catalin Marinas <catalin.marinas@arm.com> [210413 14:00]:
> On Tue, Apr 13, 2021 at 04:52:34PM +0000, Liam Howlett wrote:
> > * Catalin Marinas <catalin.marinas@arm.com> [210412 13:44]:
> > > On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> > > > find_vma() will continue to search upwards until the end of the virtual
> > > > memory space.  This means the si_code would almost never be set to
> > > > SEGV_MAPERR even when the address falls outside of any VMA.  The result
> > > > is that the si_code is not reliable as it may or may not be set to the
> > > > correct result, depending on where the address falls in the address
> > > > space.
> > > > 
> > > > Using find_vma_intersection() allows for what is intended by only
> > > > returning a VMA if it falls within the range provided, in this case a
> > > > window of 1.
> > > > 
> > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > > > ---
> > > >  arch/arm64/kernel/traps.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > index a05d34f0e82a..a44007904a64 100644
> > > > --- a/arch/arm64/kernel/traps.c
> > > > +++ b/arch/arm64/kernel/traps.c
> > > > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
> > > >  void arm64_notify_segfault(unsigned long addr)
> > > >  {
> > > >  	int code;
> > > > +	unsigned long ut_addr = untagged_addr(addr);
> > > >  
> > > >  	mmap_read_lock(current->mm);
> > > > -	if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> > > > +	if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
> > > >  		code = SEGV_MAPERR;
> > > >  	else
> > > >  		code = SEGV_ACCERR;
> [...]
> > > I don't think your change is entirely correct either. We can have a
> > > fault below the vma of a stack (with VM_GROWSDOWN) and
> > > find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
> > > instead.
> > 
> > I'm pretty sure I am missing something.  From what you said above, I
> > think this means that there can be a user cache fault below the stack
> > which should notify the user application that they are not allowed to
> > expand the stack by sending a SIGV_ACCERR in the si_code?  Is this
> > expected behaviour or am I missing a code path to this function?
> 
> My point was that find_vma() may return a valid vma where addr < vm_end
> but also addr < vm_addr. It's the responsibility of the caller to check
> that that vma can be expanded (VM_GROWSDOWN) and we do something like
> this in __do_page_fault(). find_vma_intersection(), OTOH, requires addr
> >= vm_start.

Right.  The find_vma() interface is not clear by the function name;
returning a VMA that doesn't include the address of interest is unclear.
I think this is why we ended up with the bug in the first place.

> 
> If we hit this case (addr < vm_start), normally we'd first need to check
> whether it's expandable and, if not, return MAPERR. If it's expandable,
> it should be ACCERR since something else caused the fault.
> 
> Now, I think at least for user_cache_maint_handler(), we can assume that
> __do_page_fault() handled any expansion already, so we don't need to
> check it here. In this case, your find_vma_intersection() check should
> work.
> 
> Are there other cases where we invoke arm64_notify_segfault() without a
> prior fault? I think in swp_handler() we can bail out early before we
> even attempted the access so we may report MAPERR but ACCERR is a better
> indication.

swp_handler() is also buggy.  It is currently getting the ACCERR as long
as the address being checked is > mm->highest_vm_end.  If access_ok()
fails, it should return ACCERR and not search VMAs for the address at
all.

...

>Also in sys_rt_sigreturn() we always call it as
> arm64_notify_segfault(regs->sp). I'm not sure that's correct in all
> cases, see restore_altstack().

Ditto for sys_rt_sigreturn() and sys_sigreturn(), they both suffer the
same bug as swp_handler() outlined above.

In the case of restore_sigframe() or restore_altstack() failing, it
seems that the signal shouldn't be dependent on where the address falls
within the VMA at all.  Should the signal still be SIGSEGV or something
else?  By the comments, I would have thought SIGBUS, si_code of
BUS_ADRALN?

> 
> I guess this code needs some tidying up.

Indeed, there seems to be a few code paths that need to skip this
function all together and just set the code to ACCERR - especially since
the access_ok() just validates the number itself.

I don't think the right answer is to rewrite the function to check
VM_GROWSDOWN, as I cannot see a way to reach this function when trying
to expand the stack which should report back ACCERR.  Do you agree?

I also see that my fix would expose other bugs which need to be
addressed at the same time.

> 
> > > Maybe this should employ similar checks as __do_page_fault() (with
> > > expand_stack() and VM_GROWSDOWN).
> > 
> > You mean the code needs to detect endianness and to check if this is an
> > attempt to expand the stack for both cases?
> 
> Nothing to do with endianness, just the relation between the address and
> the vma->vm_start and whether the vma can be expanded down.

Okay, thanks for clarifying.


Cheers,
Liam

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, back to index

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-07 15:11 [PATCH] arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code Liam Howlett
2021-04-12 17:43 ` Catalin Marinas
2021-04-13 16:52   ` Liam Howlett
2021-04-13 18:00     ` Catalin Marinas
2021-04-14 16:30       ` Liam Howlett

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git