stable.vger.kernel.org archive mirror
* FAILED: patch "[PATCH] mm: prevent vm_area_struct::anon_name refcount saturation" failed to apply to 5.15-stable tree
@ 2022-03-06  9:37 gregkh
  2022-03-06 21:50 ` Sasha Levin
  0 siblings, 1 reply; 4+ messages in thread
From: gregkh @ 2022-03-06  9:37 UTC
  To: surenb, akpm, brauner, caoxiaofeng, ccross, chris.hyser,
	dave.hansen, dave, david, ebiederm, gorcunov, hannes, keescook,
	kirill.shutemov, legion, mhocko, pcc, sashal, sumit.semwal,
	torvalds, vbabka, willy
  Cc: stable


The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 96403e11283def1d1c465c8279514c9a504d8630 Mon Sep 17 00:00:00 2001
From: Suren Baghdasaryan <surenb@google.com>
Date: Fri, 4 Mar 2022 20:28:55 -0800
Subject: [PATCH] mm: prevent vm_area_struct::anon_name refcount saturation

In a deep process chain with many vmas, the anon_vma_name refcount could
grow very high.  With default sysctl_max_map_count (64k) and default
pid_max (32k) the max number of vmas in the system is 2147450880, which
leaves the refcounter headroom of 1073774592 before it reaches
REFCOUNT_SATURATED (3221225472).

Therefore it's unlikely that an anonymous name refcounter will overflow
with these defaults.  Currently the max for pid_max is PID_MAX_LIMIT
(4194304) and for sysctl_max_map_count it's INT_MAX (2147483647).  In
this configuration anon_vma_name refcount overflow becomes theoretically
possible (though it still requires heavy sharing of that anon_vma_name
between processes).

The kref refcounting interface used in the anon_vma_name structure will
detect a counter overflow when it reaches the REFCOUNT_SATURATED value,
but will only generate a warning and freeze the ref counter.  This would
lead to the refcounted object never being freed.  A determined attacker
could leak memory like that, but it would be a rather expensive and
inefficient way to do so.

To ensure anon_vma_name refcount does not overflow, stop anon_vma_name
sharing when the refcount reaches REFCOUNT_MAX (2147483647), which still
leaves INT_MAX/2 (1073741823) values before the counter reaches
REFCOUNT_SATURATED.  This should provide enough headroom for raising the
refcounts temporarily.
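
The figures quoted above can be re-derived with a small userspace check.
This is only a sketch: the helper names are hypothetical, the two limits
are copied from the numbers in this message, and the "64k" default is
read as 65535 because that is the value that reproduces the quoted total
(an assumption, not something the message states).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Limits copied from the commit message above (not pulled from kernel headers). */
#define REFCOUNT_MAX_VAL        ((uint64_t)INT_MAX)      /* 2147483647 */
#define REFCOUNT_SATURATED_VAL  ((uint64_t)0xC0000000u)  /* 3221225472 */

/* Hypothetical helper: worst case where every process maps the maximum
 * number of vmas. */
static uint64_t max_vmas(uint64_t max_map_count, uint64_t pid_max)
{
	/* Guard against 64-bit overflow of the product. */
	assert(pid_max == 0 || max_map_count <= UINT64_MAX / pid_max);
	return max_map_count * pid_max;
}

/* Hypothetical helper: references left before saturation if every vma
 * shared a single name.  A negative result means saturation is reachable. */
static int64_t headroom(uint64_t vmas)
{
	return (int64_t)REFCOUNT_SATURATED_VAL - (int64_t)vmas;
}
```

With the defaults, max_vmas(65535, 32768) gives 2147450880 and headroom()
of that gives 1073774592, matching the text.  With PID_MAX_LIMIT and
INT_MAX as inputs the headroom is negative, which is the theoretical
overflow case described above.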

Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexey Gladkov <legion@kernel.org>
Cc: Chris Hyser <chris.hyser@oracle.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Colin Cross <ccross@google.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xiaofeng Cao <caoxiaofeng@yulong.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index dd3accaa4e6d..cf90b1fa2c60 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -161,15 +161,25 @@ static inline void anon_vma_name_put(struct anon_vma_name *anon_name)
 		kref_put(&anon_name->kref, anon_vma_name_free);
 }
 
+static inline
+struct anon_vma_name *anon_vma_name_reuse(struct anon_vma_name *anon_name)
+{
+	/* Prevent anon_name refcount saturation early on */
+	if (kref_read(&anon_name->kref) < REFCOUNT_MAX) {
+		anon_vma_name_get(anon_name);
+		return anon_name;
+
+	}
+	return anon_vma_name_alloc(anon_name->name);
+}
+
 static inline void dup_anon_vma_name(struct vm_area_struct *orig_vma,
 				     struct vm_area_struct *new_vma)
 {
 	struct anon_vma_name *anon_name = anon_vma_name(orig_vma);
 
-	if (anon_name) {
-		anon_vma_name_get(anon_name);
-		new_vma->anon_name = anon_name;
-	}
+	if (anon_name)
+		new_vma->anon_name = anon_vma_name_reuse(anon_name);
 }
 
 static inline void free_anon_vma_name(struct vm_area_struct *vma)
diff --git a/mm/madvise.c b/mm/madvise.c
index 081b1cded21e..1f2693dccf7b 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -113,8 +113,7 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
 	if (anon_vma_name_eq(orig_name, anon_name))
 		return 0;
 
-	anon_vma_name_get(anon_name);
-	vma->anon_name = anon_name;
+	vma->anon_name = anon_vma_name_reuse(anon_name);
 	anon_vma_name_put(orig_name);
 
 	return 0;
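
For readers who want to try the "share until the limit, then copy" policy
outside the kernel, here is a minimal userspace sketch.  It is not the
kernel code: struct name_ref, name_alloc(), name_reuse() and name_put()
are hypothetical stand-ins, the counter is a plain unsigned int rather
than a kref, and nothing here is atomic or thread-safe.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define SHARE_LIMIT ((unsigned int)INT_MAX) /* stands in for REFCOUNT_MAX */

/* Hypothetical userspace analogue of a refcounted name. */
struct name_ref {
	unsigned int refs; /* plain counter, not a kref */
	char name[];       /* NUL-terminated copy of the name */
};

/* Allocate a fresh name object holding one reference. */
static struct name_ref *name_alloc(const char *name)
{
	size_t len = strlen(name) + 1;
	struct name_ref *n = malloc(sizeof(*n) + len);

	if (!n)
		return NULL;
	n->refs = 1;
	memcpy(n->name, name, len);
	return n;
}

/* Share the existing object while below the limit, otherwise hand back
 * a private copy so the shared counter stops growing. */
static struct name_ref *name_reuse(struct name_ref *n)
{
	assert(n != NULL);
	if (n->refs < SHARE_LIMIT) {
		n->refs++;
		return n;
	}
	return name_alloc(n->name);
}

/* Drop one reference and free the object when the last one goes away. */
static void name_put(struct name_ref *n)
{
	if (n && --n->refs == 0)
		free(n);
}
```

Below the limit name_reuse() returns the same pointer with the count
bumped by one.  At the limit it returns a new object with its own count
of 1 and leaves the original counter untouched, which is the behaviour
the patch aims for.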



* Re: FAILED: patch "[PATCH] mm: prevent vm_area_struct::anon_name refcount saturation" failed to apply to 5.15-stable tree
  2022-03-06  9:37 FAILED: patch "[PATCH] mm: prevent vm_area_struct::anon_name refcount saturation" failed to apply to 5.15-stable tree gregkh
@ 2022-03-06 21:50 ` Sasha Levin
  2022-03-06 21:58   ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2022-03-06 21:50 UTC
  To: gregkh
  Cc: surenb, akpm, brauner, caoxiaofeng, ccross, chris.hyser,
	dave.hansen, dave, david, ebiederm, gorcunov, hannes, keescook,
	kirill.shutemov, legion, mhocko, pcc, sumit.semwal, torvalds,
	vbabka, willy, stable

On Sun, Mar 06, 2022 at 10:37:21AM +0100, gregkh@linuxfoundation.org wrote:
>
>The patch below does not apply to the 5.15-stable tree.
>If someone wants it applied there, or to any other stable or longterm
>tree, then please email the backport, including the original git commit
>id to <stable@vger.kernel.org>.
>
>thanks,
>
>greg k-h
>
>------------------ original commit in Linus's tree ------------------
>
>From 96403e11283def1d1c465c8279514c9a504d8630 Mon Sep 17 00:00:00 2001
>From: Suren Baghdasaryan <surenb@google.com>
>Date: Fri, 4 Mar 2022 20:28:55 -0800
>Subject: [PATCH] mm: prevent vm_area_struct::anon_name refcount saturation

I think that this patch depends on 78db3412833d ("mm: add anonymous vma
name refcounting") which we don't have in any of the stable trees. (is
this why it wasn't tagged for stable?).

-- 
Thanks,
Sasha


* Re: FAILED: patch "[PATCH] mm: prevent vm_area_struct::anon_name refcount saturation" failed to apply to 5.15-stable tree
  2022-03-06 21:50 ` Sasha Levin
@ 2022-03-06 21:58   ` Greg KH
  2022-03-06 22:53     ` Suren Baghdasaryan
  0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2022-03-06 21:58 UTC
  To: Sasha Levin
  Cc: surenb, akpm, brauner, caoxiaofeng, ccross, chris.hyser,
	dave.hansen, dave, david, ebiederm, gorcunov, hannes, keescook,
	kirill.shutemov, legion, mhocko, pcc, sumit.semwal, torvalds,
	vbabka, willy, stable

On Sun, Mar 06, 2022 at 04:50:42PM -0500, Sasha Levin wrote:
> I think that this patch depends on 78db3412833d ("mm: add anonymous vma
> name refcounting") which we don't have in any of the stable trees. (is
> this why it wasn't tagged for stable?).

Suren said he would provide a backport on Monday, so let's see what he
comes up with :)

thanks,

greg k-h


* Re: FAILED: patch "[PATCH] mm: prevent vm_area_struct::anon_name refcount saturation" failed to apply to 5.15-stable tree
  2022-03-06 21:58   ` Greg KH
@ 2022-03-06 22:53     ` Suren Baghdasaryan
  0 siblings, 0 replies; 4+ messages in thread
From: Suren Baghdasaryan @ 2022-03-06 22:53 UTC
  To: Greg KH
  Cc: Sasha Levin, Andrew Morton, Christian Brauner, caoxiaofeng,
	Colin Cross, Chris Hyser, Dave Hansen, Davidlohr Bueso,
	David Hildenbrand, Eric W. Biederman, Cyrill Gorcunov,
	Johannes Weiner, Kees Cook, Kirill A . Shutemov, legion,
	Michal Hocko, Peter Collingbourne, Sumit Semwal, Linus Torvalds,
	Vlastimil Babka, Matthew Wilcox, stable

On Sun, Mar 6, 2022 at 1:59 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Sun, Mar 06, 2022 at 04:50:42PM -0500, Sasha Levin wrote:
> > I think that this patch depends on 78db3412833d ("mm: add anonymous vma
> > name refcounting") which we don't have in any of the stable trees. (is
> > this why it wasn't tagged for stable?).
>
> Suren said he would provide a backport on Monday, so let's see what he
> comes up with :)

Ah, right. We don't have anonymous vma name support in 5.16 or earlier
kernels, so this patch is indeed not needed there.

>
> thanks,
>
> greg k-h
