linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Vegard Nossum <vegard.nossum@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Rik van Riel <riel@redhat.com>,
	Matthew Wilcox <mawilcox@microsoft.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>, Ingo Molnar <mingo@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 4/4] [RFC!] mm: 'struct mm_struct' reference counting debugging
Date: Fri, 16 Dec 2016 10:01:57 +0100	[thread overview]
Message-ID: <20161216090157.GA13940@dhcp22.suse.cz> (raw)
In-Reply-To: <20161216082202.21044-4-vegard.nossum@oracle.com>

On Fri 16-12-16 09:22:02, Vegard Nossum wrote:
> Reference counting bugs are hard to debug by their nature since the actual
> manifestation of one can occur very far from where the error is introduced
> (e.g. a missing get() only manifest as a use-after-free when the reference
> count prematurely drops to 0, which could be arbitrarily long after where
> the get() should have happened if there are other users). I wrote this patch
> to try to track down a suspected 'mm_struct' reference counting bug.

I definitely agree that hunting these bugs is a royal PITA, no question
about that. I am just wondering whether this has been motivated by any
particular bug recently. I do not seem to remember any such an issue for
quite some time.

> The basic idea is to keep track of all references, not just with a reference
> counter, but with an actual reference _list_. Whenever you get() or put() a
> reference, you also add or remove yourself, respectively, from the reference
> list. This really helps debugging because (for example) you always put a
> specific reference, meaning that if that reference was not yours to put, you
> will notice it immediately (rather than when the reference counter goes to 0
> and you still have an active reference).

But who is the owner of the reference? A function/task? It is not all
that uncommon to take an mm reference from one context and release it
from a different one. But I might be missing your point here.

> The main interface is in <linux/mm_ref_types.h> and <linux/mm_ref.h>, while
> the implementation lives in mm/mm_ref.c. Since 'struct mm_struct' has both
> ->mm_users and ->mm_count, we introduce helpers for both of them, but use
> the same data structure for each (struct mm_ref). The low-level rules (i.e.
> the ones we have to follow, but which nobody else should really have to
> care about since they use the higher-level interface) are:
> 
>  - after incrementing ->mm_count you also have to call get_mm_ref()
> 
>  - before decrementing ->mm_count you also have to call put_mm_ref()
> 
>  - after incrementing ->mm_users you also have to call get_mm_users_ref()
> 
>  - before decrementing ->mm_users you also have to call put_mm_users_ref()
> 
> The rules that most of the rest of the kernel will care about are:
> 
>  - functions that acquire and return a mm_struct should take a
>    'struct mm_ref *' which it can pass on to mmget()/mmgrab()/etc.
> 
>  - functions that release an mm_struct passed as a parameter should also
>    take a 'struct mm_ref *' which it can pass on to mmput()/mmdrop()/etc.
> 
>  - any function that temporarily acquires a mm_struct reference should
>    use MM_REF() to define an on-stack reference and pass it on to
>    mmget()/mmput()/mmgrab()/mmdrop()/etc.
> 
>  - any structure that holds an mm_struct pointer must also include a
>    'struct mm_ref' member; when the mm_struct pointer is modified you
>    would typically also call mmget()/mmgrab()/mmput()/mmdrop() and they
>    should be called with this mm_ref
> 
>  - you can convert (for example) an on-stack reference to an in-struct
>    reference using move_mm_ref(). This is semantically equivalent to
>    (atomically) taking the new reference and dropping the old one, but
>    doesn't actually need to modify the reference count

This all sounds way too intrusive to me so I am not really sure this is
something we really want. A nice thing for debugging for sure but I am
somehow skeptical whether it is really worth it considering how many
those ref. count bugs we've had.

[...]
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2016-12-16  9:02 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-16  8:21 [PATCH 1/4] mm: add new mmgrab() helper Vegard Nossum
2016-12-16  8:22 ` [PATCH 2/4] mm: add new mmget() helper Vegard Nossum
2016-12-16  9:26   ` Michal Hocko
2016-12-16  8:22 ` [PATCH 3/4] mm: use mmget_not_zero() helper Vegard Nossum
2016-12-16  9:27   ` Michal Hocko
2016-12-16  8:22 ` [PATCH 4/4] [RFC!] mm: 'struct mm_struct' reference counting debugging Vegard Nossum
2016-12-16  9:01   ` Michal Hocko [this message]
2016-12-16  9:43     ` Vegard Nossum
2016-12-16 10:11       ` crash during oom reaper (was: Re: [PATCH 4/4] [RFC!] mm: 'struct mm_struct' reference counting debugging) Michal Hocko
2016-12-16 10:44         ` Kirill A. Shutemov
2016-12-16 11:42           ` crash during oom reaper Michal Hocko
2016-12-16 12:12             ` Michal Hocko
2016-12-16 12:35             ` Kirill A. Shutemov
2016-12-16 12:56               ` Michal Hocko
2016-12-16 13:07                 ` Kirill A. Shutemov
2016-12-16 13:14                   ` Michal Hocko
2016-12-18 13:47                     ` Tetsuo Handa
2016-12-18 16:06                       ` Michal Hocko
2016-12-16 13:14         ` Vegard Nossum
2016-12-16 14:00           ` Michal Hocko
2016-12-16 14:25             ` Vegard Nossum
2016-12-16 14:32               ` Michal Hocko
2016-12-16 14:53                 ` Vegard Nossum
2016-12-16 14:04           ` Vegard Nossum
2016-12-16  9:14 ` [PATCH 1/4] mm: add new mmgrab() helper Michal Hocko
2016-12-16  9:56 ` Peter Zijlstra
2016-12-16 10:19   ` Kirill A. Shutemov
2016-12-16 10:20     ` Vegard Nossum
2016-12-16 10:36       ` Michal Hocko
2016-12-16 11:14   ` Vegard Nossum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161216090157.GA13940@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mawilcox@microsoft.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vegard.nossum@oracle.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).