stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails
@ 2022-09-30  0:38 Carlos Llamas
  2022-09-30  9:59 ` Catalin Marinas
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Carlos Llamas @ 2022-09-30  0:38 UTC (permalink / raw)
  To: Andrew Morton, Catalin Marinas
  Cc: Michal Hocko, Christian Brauner, linux-mm, bpf, kernel-team,
	Carlos Llamas, Liam Howlett, Suren Baghdasaryan, stable

Commit c462ac288f2c ("mm: Introduce arch_validate_flags()") added a late
check in mmap_region() to let architectures validate vm_flags. The check
needs to happen after calling ->mmap() as the flags can potentially be
modified during this callback.

If arch_validate_flags() check fails we unmap and free the vma. However,
the error path fails to undo the ->mmap() call that previously succeeded
and depending on the specific ->mmap() implementation this translates to
reference increments, memory allocations and other operations what will
not be cleaned up.

There are several places (mainly device drivers) where this is an issue.
However, one specific example is bpf_map_mmap() which keeps count of the
mappings in map->writecnt. The count is incremented on ->mmap() and then
decremented on vm_ops->close(). When arch_validate_flags() fails this
count is off since bpf_map_mmap_close() is never called.

One can reproduce this issue in arm64 devices with MTE support. Here the
vm_flags are checked to only allow VM_MTE if VM_MTE_ALLOWED has been set
previously. From userspace then is enough to pass the PROT_MTE flag to
mmap() syscall to trigger the arch_validate_flags() failure.

The following program reproduces this issue:
---
  #include <stdio.h>
  #include <unistd.h>
  #include <linux/unistd.h>
  #include <linux/bpf.h>
  #include <sys/mman.h>

  int main(void)
  {
	union bpf_attr attr = {
		.map_type = BPF_MAP_TYPE_ARRAY,
		.key_size = sizeof(int),
		.value_size = sizeof(long long),
		.max_entries = 256,
		.map_flags = BPF_F_MMAPABLE,
	};
	int fd;

	fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
	mmap(NULL, 4096, PROT_WRITE | PROT_MTE, MAP_SHARED, fd, 0);

	return 0;
  }
---

By manually adding some log statements to the vm_ops callbacks we can
confirm that when passing PROT_MTE to mmap() the map->writecnt is off
upon ->release():

With PROT_MTE flag:
  root@debian:~# ./bpf-test
  [  111.263874] bpf_map_write_active_inc: map=9 writecnt=1
  [  111.288763] bpf_map_release: map=9 writecnt=1

Without PROT_MTE flag:
  root@debian:~# ./bpf-test
  [  157.816912] bpf_map_write_active_inc: map=10 writecnt=1
  [  157.830442] bpf_map_write_active_dec: map=10 writecnt=0
  [  157.832396] bpf_map_release: map=10 writecnt=0

This patch fixes the above issue by calling vm_ops->close() when the
arch_validate_flags() check fails, after this we can proceed to unmap
and free the vma on the error path.

Fixes: c462ac288f2c ("mm: Introduce arch_validate_flags()")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 mm/mmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 9d780f415be3..36c08e2c78da 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1797,7 +1797,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	if (!arch_validate_flags(vma->vm_flags)) {
 		error = -EINVAL;
 		if (file)
-			goto unmap_and_free_vma;
+			goto close_and_free_vma;
 		else
 			goto free_vma;
 	}
@@ -1844,6 +1844,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 
 	return addr;
 
+close_and_free_vma:
+	if (vma->vm_ops && vma->vm_ops->close)
+		vma->vm_ops->close(vma);
 unmap_and_free_vma:
 	fput(vma->vm_file);
 	vma->vm_file = NULL;
-- 
2.38.0.rc1.362.ged0d419d3c-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails
  2022-09-30  0:38 [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails Carlos Llamas
@ 2022-09-30  9:59 ` Catalin Marinas
  2022-09-30 22:34 ` Andrii Nakryiko
  2022-10-11 16:20 ` Liam Howlett
  2 siblings, 0 replies; 4+ messages in thread
From: Catalin Marinas @ 2022-09-30  9:59 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Andrew Morton, Michal Hocko, Christian Brauner, linux-mm, bpf,
	kernel-team, Liam Howlett, Suren Baghdasaryan, stable

On Fri, Sep 30, 2022 at 12:38:43AM +0000, Carlos Llamas wrote:
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9d780f415be3..36c08e2c78da 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1797,7 +1797,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	if (!arch_validate_flags(vma->vm_flags)) {
>  		error = -EINVAL;
>  		if (file)
> -			goto unmap_and_free_vma;
> +			goto close_and_free_vma;
>  		else
>  			goto free_vma;
>  	}
> @@ -1844,6 +1844,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  
>  	return addr;
>  
> +close_and_free_vma:
> +	if (vma->vm_ops && vma->vm_ops->close)
> +		vma->vm_ops->close(vma);
>  unmap_and_free_vma:
>  	fput(vma->vm_file);
>  	vma->vm_file = NULL;

The fix looks right to me but I'm not an mm expert. FWIW:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails
  2022-09-30  0:38 [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails Carlos Llamas
  2022-09-30  9:59 ` Catalin Marinas
@ 2022-09-30 22:34 ` Andrii Nakryiko
  2022-10-11 16:20 ` Liam Howlett
  2 siblings, 0 replies; 4+ messages in thread
From: Andrii Nakryiko @ 2022-09-30 22:34 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Andrew Morton, Catalin Marinas, Michal Hocko, Christian Brauner,
	linux-mm, bpf, kernel-team, Liam Howlett, Suren Baghdasaryan,
	stable

On Thu, Sep 29, 2022 at 5:51 PM Carlos Llamas <cmllamas@google.com> wrote:
>
> Commit c462ac288f2c ("mm: Introduce arch_validate_flags()") added a late
> check in mmap_region() to let architectures validate vm_flags. The check
> needs to happen after calling ->mmap() as the flags can potentially be
> modified during this callback.
>
> If arch_validate_flags() check fails we unmap and free the vma. However,
> the error path fails to undo the ->mmap() call that previously succeeded
> and depending on the specific ->mmap() implementation this translates to
> reference increments, memory allocations and other operations what will
> not be cleaned up.
>
> There are several places (mainly device drivers) where this is an issue.
> However, one specific example is bpf_map_mmap() which keeps count of the
> mappings in map->writecnt. The count is incremented on ->mmap() and then
> decremented on vm_ops->close(). When arch_validate_flags() fails this
> count is off since bpf_map_mmap_close() is never called.
>
> One can reproduce this issue in arm64 devices with MTE support. Here the
> vm_flags are checked to only allow VM_MTE if VM_MTE_ALLOWED has been set
> previously. From userspace then is enough to pass the PROT_MTE flag to
> mmap() syscall to trigger the arch_validate_flags() failure.
>
> The following program reproduces this issue:
> ---
>   #include <stdio.h>
>   #include <unistd.h>
>   #include <linux/unistd.h>
>   #include <linux/bpf.h>
>   #include <sys/mman.h>
>
>   int main(void)
>   {
>         union bpf_attr attr = {
>                 .map_type = BPF_MAP_TYPE_ARRAY,
>                 .key_size = sizeof(int),
>                 .value_size = sizeof(long long),
>                 .max_entries = 256,
>                 .map_flags = BPF_F_MMAPABLE,
>         };
>         int fd;
>
>         fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
>         mmap(NULL, 4096, PROT_WRITE | PROT_MTE, MAP_SHARED, fd, 0);
>
>         return 0;
>   }
> ---
>
> By manually adding some log statements to the vm_ops callbacks we can
> confirm that when passing PROT_MTE to mmap() the map->writecnt is off
> upon ->release():
>
> With PROT_MTE flag:
>   root@debian:~# ./bpf-test
>   [  111.263874] bpf_map_write_active_inc: map=9 writecnt=1
>   [  111.288763] bpf_map_release: map=9 writecnt=1
>
> Without PROT_MTE flag:
>   root@debian:~# ./bpf-test
>   [  157.816912] bpf_map_write_active_inc: map=10 writecnt=1
>   [  157.830442] bpf_map_write_active_dec: map=10 writecnt=0
>   [  157.832396] bpf_map_release: map=10 writecnt=0
>
> This patch fixes the above issue by calling vm_ops->close() when the
> arch_validate_flags() check fails, after this we can proceed to unmap
> and free the vma on the error path.
>
> Fixes: c462ac288f2c ("mm: Introduce arch_validate_flags()")
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Liam Howlett <liam.howlett@oracle.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: <stable@vger.kernel.org> # v5.10+
> Signed-off-by: Carlos Llamas <cmllamas@google.com>
> ---

Makes sense to me, open/close callbacks should be symmetrical. From
BPF-side of things:

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  mm/mmap.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9d780f415be3..36c08e2c78da 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1797,7 +1797,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>         if (!arch_validate_flags(vma->vm_flags)) {
>                 error = -EINVAL;
>                 if (file)
> -                       goto unmap_and_free_vma;
> +                       goto close_and_free_vma;
>                 else
>                         goto free_vma;
>         }
> @@ -1844,6 +1844,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>
>         return addr;
>
> +close_and_free_vma:
> +       if (vma->vm_ops && vma->vm_ops->close)
> +               vma->vm_ops->close(vma);
>  unmap_and_free_vma:
>         fput(vma->vm_file);
>         vma->vm_file = NULL;
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails
  2022-09-30  0:38 [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails Carlos Llamas
  2022-09-30  9:59 ` Catalin Marinas
  2022-09-30 22:34 ` Andrii Nakryiko
@ 2022-10-11 16:20 ` Liam Howlett
  2 siblings, 0 replies; 4+ messages in thread
From: Liam Howlett @ 2022-10-11 16:20 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Andrew Morton, Catalin Marinas, Michal Hocko, Christian Brauner,
	linux-mm, bpf, kernel-team, Suren Baghdasaryan, stable

> ---
>  mm/mmap.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9d780f415be3..36c08e2c78da 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1797,7 +1797,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	if (!arch_validate_flags(vma->vm_flags)) {
>  		error = -EINVAL;
>  		if (file)
> -			goto unmap_and_free_vma;
> +			goto close_and_free_vma;
>  		else
>  			goto free_vma;
>  	}
> @@ -1844,6 +1844,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  
>  	return addr;
>  
> +close_and_free_vma:
> +	if (vma->vm_ops && vma->vm_ops->close)
> +		vma->vm_ops->close(vma);
>  unmap_and_free_vma:
>  	fput(vma->vm_file);
>  	vma->vm_file = NULL;
> -- 
> 2.38.0.rc1.362.ged0d419d3c-goog
> 

Sorry for the late reply, I was out of the office.

The analysis looks good and I agree that open() should have a matching
close() call in the unwinding.

Reviewed-by: Liam Howlett <liam.howlett@oracle.com>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-10-11 16:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-30  0:38 [PATCH] mm/mmap: undo ->mmap() when arch_validate_flags() fails Carlos Llamas
2022-09-30  9:59 ` Catalin Marinas
2022-09-30 22:34 ` Andrii Nakryiko
2022-10-11 16:20 ` Liam Howlett

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).