All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Vegard Nossum <vegard.nossum@oracle.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Robert Moore <robert.moore@intel.com>,
	Erik Kaneda <erik.kaneda@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Marco Elver <elver@google.com>, Waiman Long <longman@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	Len Brown <lenb@kernel.org>, Steven Rostedt <rostedt@goodmis.org>,
	Roman Gushchin <guro@fb.com>
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018
Date: Fri, 5 Jun 2020 11:46:36 -0700	[thread overview]
Message-ID: <202006051053.A61A42374C@keescook> (raw)
In-Reply-To: <92d994be-e4f5-b186-4ad7-21828de44967@suse.cz>

On Fri, Jun 05, 2020 at 06:55:27PM +0200, Vlastimil Babka wrote:
> 
> On 6/5/20 5:44 PM, Kees Cook wrote:
> > On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
> >> On 2020-06-05 16:08, Vlastimil Babka wrote:
> >> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> >> > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <vegard.nossum@oracle.com> wrote:
> >> > > > 
> >> > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> >> > > > > 
> >> > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> >> > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even more
> >> > > > > > specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> >> > > > > > wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> >> > > > > > further in history. So it's not new at all, and likely very specific to
> >> > > > > > your config+QEMU? (and related to the ACPI error messages that precede
> >> > > > > > it?).
> >> > [...]
> >> > [    0.140408] ------------[ cut here ]------------
> >> > [    0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> >> > [    0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> > 
> > Ah yes! Good. I had improved this check recently too, and I was worried
> > the freelist pointer patch was somehow blocking it, but I see now that
> > the failing config didn't have CONFIG_SLAB_FREELIST_HARDENED=y. Once
> > SLAB_CONSISTENCY_CHECKS was enabled ("slub_debug=F"), it started
> > tripping. Whew.
> > 
> > I wonder if that entire test block should just be removed from
> > cache_from_obj():
> > 
> >         if (!memcg_kmem_enabled() &&
> >             !IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
> >             !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
> >                 return s;
> > 
> > and make this test unconditional? It's mostly only called during free(),
> > and shouldn't be too expensive to be made unconditional. Hmm.
> 
> Hmm I have a different idea. The whole cache_from_obj() was added because of
> kmemcg (commit b9ce5ef49f00d) where per-memcg cache can be different from the
> root one. And I just realized this usecase can go away with Roman's series [1].
> But cache_from_obj() also kept the original SLUB consistency check case, and you
> added the freelist hardening case. If kmemcg use case went away it would be nice
> to avoid the virt_to_cache() and check completely again, unless in debugging or
> hardened kernel.

Is it that expensive? (I'm fine with it staying behind debug/hardening,
but if we can make it on by default, that'd be safer.)

> Furthermore, the original SLUB debugging case was an unconditional pr_err() plus
> WARN_ON_ONCE(1), which was kept by commit b9ce5ef49f00d.  With freelist
> hardening this all changed to WARN_ONCE. So the second and later cases are not
> reported at all for hardening and also not for explicitly enabled debugging like
> in this case, which is IMHO not ideal.

Oh, I have no problem with WARN vs WARN_ONCE -- there's no reason to
split this. And I'd love the hardening side to gain the tracking call
too, if it's available.

I had just used WARN_ONCE() since sometimes it can be very noisy to keep
warning for some condition that might not be correctable.

> So I propose the following - the freelist hardening case keeps the WARN_ONCE,
> but also a one-line pr_err() for each case so they are not silent. The SLUB
> debugging case is always a full warning, and printing the tracking info if
> enabled and available. Pure kmemcg case does virt_to_cache() for now (until
> hopefully removed by Roman's series) but no checking at all. Would that work for
> everyone?
> [...]
> @@ -520,9 +528,18 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
>  		return s;
>  
>  	cachep = virt_to_cache(x);
> -	WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
> -		  "%s: Wrong slab cache. %s but object is from %s\n",
> -		  __func__, s->name, cachep->name);
> +	if (unlikely(s->flags & SLAB_CONSISTENCY_CHECKS)) {
> +		if (WARN(cachep && !slab_equal_or_root(cachep, s),
> +			  "%s: Wrong slab cache. %s but object is from %s\n",
> +			  __func__, s->name, cachep->name))
> +			slab_print_tracking(cachep, x);
> +	} else if (IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED)) {
> +		if (unlikely(cachep && !slab_equal_or_root(cachep, s))) {
> +			pr_err("%s: Wrong slab cache. %s but object is from %s\n",
> +				  __func__, s->name, cachep->name);
> +			WARN_ON_ONCE(1);
> +		}
> +	}

How about just this (in addition to your slab_print_tracking() refactor):

diff --git a/mm/slab.h b/mm/slab.h
index 207c83ef6e06..107b7f6db3c3 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -520,9 +520,10 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 		return s;
 
 	cachep = virt_to_cache(x);
-	WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
+	if (WARN(cachep && !slab_equal_or_root(cachep, s),
 		  "%s: Wrong slab cache. %s but object is from %s\n",
-		  __func__, s->name, cachep->name);
+		  __func__, s->name, cachep->name))
+		slab_print_tracking(cachep, x);
 	return cachep;
 }
 

-- 
Kees Cook

  reply	other threads:[~2020-06-05 18:46 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-04 17:14 slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018 Vegard Nossum
2020-06-04 17:18 ` Vlastimil Babka
2020-06-04 17:20   ` Vegard Nossum
2020-06-04 17:51     ` Kees Cook
2020-06-04 17:57     ` Kees Cook
2020-06-04 18:46       ` Vlastimil Babka
2020-06-05  9:11         ` Vlastimil Babka
2020-06-05  9:36           ` Vegard Nossum
2020-06-05 12:47             ` Vegard Nossum
2020-06-05 13:12               ` Rafael J. Wysocki
2020-06-05 13:12                 ` Rafael J. Wysocki
2020-06-05 14:08                 ` Vlastimil Babka
2020-06-05 14:24                   ` Rafael J. Wysocki
2020-06-05 14:24                     ` Rafael J. Wysocki
2020-06-05 14:44                   ` Vegard Nossum
2020-06-05 15:44                     ` Kees Cook
2020-06-05 16:37                       ` Vegard Nossum
2020-06-05 17:51                         ` Kees Cook
2020-06-05 16:55                       ` Vlastimil Babka
2020-06-05 18:46                         ` Kees Cook [this message]
2020-06-08 10:51                           ` Vlastimil Babka
2020-06-06  6:46                       ` Rafael J. Wysocki
2020-06-06  6:46                         ` Rafael J. Wysocki
2020-06-05 21:45                     ` Kaneda, Erik
2020-06-11  1:40                     ` Kaneda, Erik
2020-06-11 10:54                       ` Vlastimil Babka
2020-06-12 12:26                       ` Rafael J. Wysocki
2021-03-23 18:32                         ` Kirill A. Shutemov
2021-03-23 18:58                           ` Vegard Nossum
2021-03-23 19:03                           ` Rafael J. Wysocki
2021-03-23 19:03                             ` Rafael J. Wysocki
2021-03-23 21:54                             ` Kaneda, Erik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202006051053.A61A42374C@keescook \
    --to=keescook@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=elver@google.com \
    --cc=erik.kaneda@intel.com \
    --cc=guro@fb.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longman@redhat.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rafael@kernel.org \
    --cc=robert.moore@intel.com \
    --cc=rostedt@goodmis.org \
    --cc=vbabka@suse.cz \
    --cc=vegard.nossum@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.