From: Linus Torvalds <torvalds@linux-foundation.org>
To: Aaron Conole <aconole@redhat.com>
Cc: Florian Westphal <fw@strlen.de>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@fb.com>, "Ted Ts'o" <tytso@mit.edu>,
	Christoph Lameter <cl@linux.com>,
	David Miller <davem@davemloft.net>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	NetFilter <netfilter-devel@vger.kernel.org>
Subject: Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))
Date: Sun, 9 Oct 2016 20:41:17 -0700
Message-ID: <CA+55aFwS1jfnNzU=8CzRyxC79qKrStwmTXYrBHQYcftAg4LT3Q@mail.gmail.com>
In-Reply-To: <CA+55aFzntXuUyL85A68ghS-D-t3VdgKo9-FiqQBG3efW2xns3A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2080 bytes --]

On Sun, Oct 9, 2016 at 7:49 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> There is one *correct* way to remove an entry from a singly linked
> list, and it looks like this:
>
>     struct entry **pp, *p;
>
>     pp = &head;
>     while ((p = *pp) != NULL) {
>         if (right_entry(p)) {
>             *pp = p->next;
>             break;
>         }
>         pp = &p->next;
>     }
>
> and that's it. Nothing else.

This COMPLETELY UNTESTED patch tries to fix the nf_hook_entry code to do this.

I repeat: it's ENTIRELY UNTESTED. I just converted the insertion and
deletion to the proper pattern, but I could easily have gotten the
insertion priority test the wrong way around entirely, for example. Or
it could simply have some other completely broken bug in it. It
compiles for me, but that's all I actually checked.

Note that the "correct way" of doing list operations also almost
inevitably is the shortest way by far, since it gets rid of all the
special cases. So the patch looks nice. It gets rid of the magic
"nf_set_hooks_head()" thing too, because once you do list following
right, the head is no different from any other pointer in the list.
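
To make the "no special cases" point concrete, here is a minimal
userspace sketch of priority-ordered insertion and of removal using the
pointer-to-pointer walk. It is only an illustration: the struct entry
type, its integer "priority" field and the entry_insert() and
entry_remove() helpers are hypothetical names, not netfilter code, and
there is no RCU or locking here at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; "priority" stands in for orig_ops->priority. */
struct entry {
	int priority;
	struct entry *next;
};

/* Insert "e" before the first entry with a strictly higher priority. */
static void entry_insert(struct entry **pp, struct entry *e)
{
	struct entry *p;

	assert(e != NULL);
	while ((p = *pp) != NULL) {
		if (e->priority < p->priority)
			break;
		pp = &p->next;
	}
	e->next = p;
	*pp = e;	/* same store whether pp is the head or some ->next */
}

/* Unlink "victim" if it is on the list; return it, or NULL if not found. */
static struct entry *entry_remove(struct entry **pp, struct entry *victim)
{
	struct entry *p;

	while ((p = *pp) != NULL) {
		if (p == victim) {
			*pp = p->next;
			break;
		}
		pp = &p->next;
	}
	return p;
}
```

Callers pass the address of the head pointer, for example
entry_insert(&head, &e), so neither helper needs a "first entry" branch
or a separate set-the-head function.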

So the patch stats look good:

 net/netfilter/core.c | 108 ++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 75 deletions(-)

but again, it's entirely *entirely* untested. Please consider this
just a "this is generally how list insert/delete operations should be
done, avoiding special cases for the first entry".

ALSO NOTE! The code assumes that the "nf_hook_mutex" locking only
protects the actual *lists*, and that the address of the list head can
be looked up without holding the lock. That's generally how things are
done, and it simplifies error handling (because you can do the "there
is no such list at all" test before you do anything else). But again, I
don't actually know the code, and if there is something that actually
expands the number of lists etc that depends on that mutex, then the
list head lookup may need to be inside the lock too.
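
The locking assumption can be sketched in userspace roughly as follows.
This is only an illustration under stated assumptions: a fixed-size
array of list heads that never grows, a pthread mutex standing in for
nf_hook_mutex, and hypothetical names (lists, list_mutex, list_head(),
list_add_tail()) that do not exist in the kernel source.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define NR_LISTS 4	/* hypothetical: the number of lists is fixed */

struct entry {
	int id;
	struct entry *next;
};

static struct entry *lists[NR_LISTS];
static pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* No lock needed here: the array of heads never moves or grows. */
static struct entry **list_head(int which)
{
	if (which < 0 || which >= NR_LISTS)
		return NULL;
	return &lists[which];
}

static int list_add_tail(int which, struct entry *e)
{
	struct entry **pp = list_head(which);
	struct entry *p;

	assert(e != NULL);
	if (!pp)
		return -EINVAL;	/* "no such list" is handled before locking */

	/* Only the list contents are protected by the mutex. */
	pthread_mutex_lock(&list_mutex);
	while ((p = *pp) != NULL)
		pp = &p->next;
	e->next = NULL;
	*pp = e;
	pthread_mutex_unlock(&list_mutex);
	return 0;
}
```

If the set of heads could change under the mutex, the list_head() call
would have to move inside the locked region, and the early error return
would then need a matching unlock.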

           Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 5008 bytes --]

 net/netfilter/core.c | 108 ++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 75 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index c9d90eb64046..814258641fcc 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -65,49 +65,24 @@ static DEFINE_MUTEX(nf_hook_mutex);
 #define nf_entry_dereference(e) \
 	rcu_dereference_protected(e, lockdep_is_held(&nf_hook_mutex))
 
-static struct nf_hook_entry *nf_hook_entry_head(struct net *net,
-						const struct nf_hook_ops *reg)
+static struct nf_hook_entry __rcu **nf_hook_entry_head(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hook_head = NULL;
-
 	if (reg->pf != NFPROTO_NETDEV)
-		hook_head = nf_entry_dereference(net->nf.hooks[reg->pf]
-						 [reg->hooknum]);
-	else if (reg->hooknum == NF_NETDEV_INGRESS) {
+		return net->nf.hooks[reg->pf]+reg->hooknum;
+
 #ifdef CONFIG_NETFILTER_INGRESS
+	if (reg->hooknum == NF_NETDEV_INGRESS) {
 		if (reg->dev && dev_net(reg->dev) == net)
-			hook_head =
-				nf_entry_dereference(
-					reg->dev->nf_hooks_ingress);
-#endif
+			return &reg->dev->nf_hooks_ingress;
 	}
-	return hook_head;
-}
-
-/* must hold nf_hook_mutex */
-static void nf_set_hooks_head(struct net *net, const struct nf_hook_ops *reg,
-			      struct nf_hook_entry *entry)
-{
-	switch (reg->pf) {
-	case NFPROTO_NETDEV:
-#ifdef CONFIG_NETFILTER_INGRESS
-		/* We already checked in nf_register_net_hook() that this is
-		 * used from ingress.
-		 */
-		rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
 #endif
-		break;
-	default:
-		rcu_assign_pointer(net->nf.hooks[reg->pf][reg->hooknum],
-				   entry);
-		break;
-	}
+	return NULL;
 }
 
 int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hooks_entry;
-	struct nf_hook_entry *entry;
+	struct nf_hook_entry __rcu **pp;
+	struct nf_hook_entry *entry, *p;
 
 	if (reg->pf == NFPROTO_NETDEV) {
 #ifndef CONFIG_NETFILTER_INGRESS
@@ -119,6 +94,10 @@ int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 			return -EINVAL;
 	}
 
+	pp = nf_hook_entry_head(net, reg);
+	if (!pp)
+		return -EINVAL;
+
 	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
 		return -ENOMEM;
@@ -128,26 +107,15 @@ int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 	entry->next	= NULL;
 
 	mutex_lock(&nf_hook_mutex);
-	hooks_entry = nf_hook_entry_head(net, reg);
-
-	if (hooks_entry && hooks_entry->orig_ops->priority > reg->priority) {
-		/* This is the case where we need to insert at the head */
-		entry->next = hooks_entry;
-		hooks_entry = NULL;
-	}
-
-	while (hooks_entry &&
-		reg->priority >= hooks_entry->orig_ops->priority &&
-		nf_entry_dereference(hooks_entry->next)) {
-		hooks_entry = nf_entry_dereference(hooks_entry->next);
-	}
 
-	if (hooks_entry) {
-		entry->next = nf_entry_dereference(hooks_entry->next);
-		rcu_assign_pointer(hooks_entry->next, entry);
-	} else {
-		nf_set_hooks_head(net, reg, entry);
+	/* Find the spot in the list */
+	while ((p = nf_entry_dereference(*pp)) != NULL) {
+		if (reg->priority < p->orig_ops->priority)
+			break;
+		pp = &p->next;
 	}
+	rcu_assign_pointer(entry->next, p);
+	rcu_assign_pointer(*pp, entry);
 
 	mutex_unlock(&nf_hook_mutex);
 #ifdef CONFIG_NETFILTER_INGRESS
@@ -163,33 +131,23 @@ EXPORT_SYMBOL(nf_register_net_hook);
 
 void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hooks_entry;
+	struct nf_hook_entry __rcu **pp;
+	struct nf_hook_entry *p;
 
-	mutex_lock(&nf_hook_mutex);
-	hooks_entry = nf_hook_entry_head(net, reg);
-	if (hooks_entry && hooks_entry->orig_ops == reg) {
-		nf_set_hooks_head(net, reg,
-				  nf_entry_dereference(hooks_entry->next));
-		goto unlock;
-	}
-	while (hooks_entry && nf_entry_dereference(hooks_entry->next)) {
-		struct nf_hook_entry *next =
-			nf_entry_dereference(hooks_entry->next);
-		struct nf_hook_entry *nnext;
+	pp = nf_hook_entry_head(net, reg);
+	if (WARN_ON_ONCE(!pp))
+		return;
 
-		if (next->orig_ops != reg) {
-			hooks_entry = next;
-			continue;
+	mutex_lock(&nf_hook_mutex);
+	while ((p = nf_entry_dereference(*pp)) != NULL) {
+		if (p->orig_ops == reg) {
+			rcu_assign_pointer(*pp, p->next);
+			break;
 		}
-		nnext = nf_entry_dereference(next->next);
-		rcu_assign_pointer(hooks_entry->next, nnext);
-		hooks_entry = next;
-		break;
+		pp = &p->next;
 	}
-
-unlock:
 	mutex_unlock(&nf_hook_mutex);
-	if (!hooks_entry) {
+	if (!p) {
 		WARN(1, "nf_unregister_net_hook: hook not found!\n");
 		return;
 	}
@@ -201,10 +159,10 @@ void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 	static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
 #endif
 	synchronize_net();
-	nf_queue_nf_hook_drop(net, hooks_entry);
+	nf_queue_nf_hook_drop(net, p);
 	/* other cpu might still process nfqueue verdict that used reg */
 	synchronize_net();
-	kfree(hooks_entry);
+	kfree(p);
 }
 EXPORT_SYMBOL(nf_unregister_net_hook);
 

