netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thibaut Sautereau <thibaut.sautereau@clip-os.org>
To: Laura Abbott <labbott@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	netdev@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	Kees Cook <keescook@chromium.org>,
	Alexander Potapenko <glider@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	clipos@ssi.gouv.fr
Subject: Re: Double free of struct sk_buff reported by SLAB_CONSISTENCY_CHECKS with init_on_free
Date: Wed, 6 Nov 2019 10:29:19 +0100	[thread overview]
Message-ID: <20191106092919.GD1006@gandi.net> (raw)
In-Reply-To: <af8174be-848e-5f00-d6eb-caa956e8fd71@redhat.com>

On Tue, Nov 05, 2019 at 12:01:15PM -0500, Laura Abbott wrote:
> On 11/5/19 10:02 AM, Vlastimil Babka wrote:
> > On 11/5/19 3:32 PM, Thibaut Sautereau wrote:
> > > On Tue, Nov 05, 2019 at 10:00:39AM +0100, Vlastimil Babka wrote:
> > > > On 11/4/19 6:03 PM, Thibaut Sautereau wrote:
> > > > > The BUG only happens when using `slub_debug=F` on the command-line (to
> > > > > enable SLAB_CONSISTENCY_CHECKS), otherwise the double free is not
> > > > > reported and the system keeps running.
> > > > 
> > > > You could change slub_debug parameter to:
> > > > slub_debug=FU,skbuff_head_cache
> > > > 
> > > > That will also print out who previously allocated and freed the double
> > > > freed object. And limit all the tracking just to the affected cache.
> > > 
> > > Thanks, I did not know about that.
> > > 
> > > However, as kind of expected, I get a BUG due to a NULL pointer
> > > dereference in print_track():
> > 
> > Ah, I didn't read properly your initial mail, that there's a null
> > pointer deference during the consistency check.
> > 
> > ...
> > 
> > > > > 
> > > > > Bisection points to the following commit: 1b7e816fc80e ("mm: slub: Fix
> > > > > slab walking for init_on_free"), and indeed the BUG is not triggered
> > > > > when init_on_free is disabled.
> > > > 
> > > > That could be either buggy SLUB code, or the commit somehow exposed a
> > > > real bug in skbuff users.
> > > 
> > > Right. At first I thought about some incompatibility between
> > > init_on_free and SLAB_CONSISTENCY_CHECKS, but in that case why would it
> > > only happen with skbuff_head_cache?
> > 
> > That's curious, yeah.
> > 
> > > On the other hand, if it's a bug in
> > > skbuff users, why is the on_freelist() check in free_consistency_check()
> > > not detecting anything when init_on_free is disabled?
> > 
> > I vaguely suspect the code in the commit 1b7e816fc80e you bisected,
> > where in slab_free_freelist_hook() in the first iteration, we have void
> > *p = NULL; and set_freepointer(s, object, p); will thus write NULL into
> > the freelist. Is is the NULL we are crashing on? The code seems to
> > assume that the freelist is rewritten later in the function, but that
> > part is only active with some CONFIG_ option(s), none of which might be
> > enabled in your case?
> > But I don't really understand what exactly this function is supposed to
> > do. Laura, does my theory make sense?
> > 
> > Thanks,
> > Vlastimil
> > 
> 
> The note about getting re-written is referring to the fact that the trick
> with the bulk free is that we build the detached freelist and then when
> we do the cmpxchg it's getting (correctly) updated there.

Thank you Laura for this clarification.

> But looking at this again, I realize this function still has a more
> fundamental problem: walking the freelist like this means we actually
> end up reversing the list so head and tail are no longer pointing
> to the correct blocks. I was able to reproduce the issue by writing a
> simple kmem_cache_bulk_alloc/kmem_cache_bulk_free function. I'm
> guessing that the test of ping with an unusual size was enough to
> regularly trigger a non-trivial bulk alloc/free.
> 
> The fix I gave before fixed part of the problem but not all of it.
> At this point we're basically duplicating the work of the loop
> below so I think we can just combine it. Was there a reason this
> wasn't just combined in the first place?

Good catch about the freelist inversion, I was actually starting to
wonder whether it was really intended. Anyway, I tested your patch (see
one tiny inline comment below) and it seems to run fine for now, without
me being able to reproduce the bug anymore.

> diff --git a/mm/slub.c b/mm/slub.c
> index dac41cf0b94a..1510b86b2e7e 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1431,13 +1431,17 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>  	void *next = *head;
>  	void *old_tail = *tail ? *tail : *head;
>  	int rsize;
> +	next = *head;

`next` is already set a few lines above.

> -	if (slab_want_init_on_free(s)) {
> -		void *p = NULL;
> +	/* Head and tail of the reconstructed freelist */
> +	*head = NULL;
> +	*tail = NULL;
> -		do {
> -			object = next;
> -			next = get_freepointer(s, object);
> +	do {
> +		object = next;
> +		next = get_freepointer(s, object);
> +
> +		if (slab_want_init_on_free(s)) {
>  			/*
>  			 * Clear the object and the metadata, but don't touch
>  			 * the redzone.
> @@ -1447,29 +1451,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>  							   : 0;
>  			memset((char *)object + s->inuse, 0,
>  			       s->size - s->inuse - rsize);
> -			set_freepointer(s, object, p);
> -			p = object;
> -		} while (object != old_tail);
> -	}
> -
> -/*
> - * Compiler cannot detect this function can be removed if slab_free_hook()
> - * evaluates to nothing.  Thus, catch all relevant config debug options here.
> - */
> -#if defined(CONFIG_LOCKDEP)	||		\
> -	defined(CONFIG_DEBUG_KMEMLEAK) ||	\
> -	defined(CONFIG_DEBUG_OBJECTS_FREE) ||	\
> -	defined(CONFIG_KASAN)
> -	next = *head;
> -
> -	/* Head and tail of the reconstructed freelist */
> -	*head = NULL;
> -	*tail = NULL;
> -
> -	do {
> -		object = next;
> -		next = get_freepointer(s, object);
> +		}
>  		/* If object's reuse doesn't have to be delayed */
>  		if (!slab_free_hook(s, object)) {
>  			/* Move object to the new freelist */
> @@ -1484,9 +1467,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>  		*tail = NULL;
>  	return *head != NULL;
> -#else
> -	return true;
> -#endif
>  }
>  static void *setup_object(struct kmem_cache *s, struct page *page,
> 

-- 
Thibaut Sautereau
CLIP OS developer

      reply	other threads:[~2019-11-06  9:29 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-04 17:03 Double free of struct sk_buff reported by SLAB_CONSISTENCY_CHECKS with init_on_free Thibaut Sautereau
2019-11-04 17:33 ` Eric Dumazet
2019-11-05  8:05   ` Thibaut Sautereau
2019-11-05 10:19     ` Alexander Potapenko
2019-11-05 15:04       ` Thibaut Sautereau
2019-11-05  9:00 ` Vlastimil Babka
2019-11-05 14:32   ` Thibaut Sautereau
2019-11-05 15:02     ` Vlastimil Babka
2019-11-05 17:01       ` Laura Abbott
2019-11-06  9:29         ` Thibaut Sautereau [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191106092919.GD1006@gandi.net \
    --to=thibaut.sautereau@clip-os.org \
    --cc=akpm@linux-foundation.org \
    --cc=clipos@ssi.gouv.fr \
    --cc=davem@davemloft.net \
    --cc=glider@google.com \
    --cc=keescook@chromium.org \
    --cc=labbott@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).