All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting
@ 2007-02-12  0:39 Michał Mirosław
  2007-02-13 12:48 ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Michał Mirosław @ 2007-02-12  0:39 UTC (permalink / raw)
  To: netfilter-devel; +Cc: linux-kernel

Count module references correctly: after instance_destroy() there
might be timer pending and holding a reference for this netlink instance.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>

--- linux-2.6.20/net/netfilter/nfnetlink_log.c.5	2007-02-11 22:24:56.000000000 +0100
+++ linux-2.6.20/net/netfilter/nfnetlink_log.c	2007-02-11 22:31:19.000000000 +0100
@@ -133,6 +133,7 @@ instance_put(struct nfulnl_instance *ins
 	if (inst && atomic_dec_and_test(&inst->use)) {
 		UDEBUG("kfree(inst=%p)\n", inst);
 		kfree(inst);
+		module_put(THIS_MODULE);
 	}
 }
 
@@ -146,9 +147,13 @@ instance_create(u_int16_t group_num, int
 	UDEBUG("entering (group_num=%u, pid=%d)\n", group_num,
 		pid);
 
+	if (!try_module_get(THIS_MODULE)) {
+		UDEBUG("aborting, could not reference own module (module unloading?)\n");
+		goto out_modunload;
+	}
+
 	write_lock_bh(&instances_lock);	
 	if (__instance_lookup(group_num)) {
-		inst = NULL;
 		UDEBUG("aborting, instance already exists\n");
 		goto out_unlock;
 	}
@@ -176,9 +181,6 @@ instance_create(u_int16_t group_num, int
 	inst->copy_mode 	= NFULNL_COPY_PACKET;
 	inst->copy_range 	= 0xffff;
 
-	if (!try_module_get(THIS_MODULE))
-		goto out_free;
-
 	hlist_add_head(&inst->hlist, 
 		       &instance_table[instance_hashfn(group_num)]);
 
@@ -189,10 +191,10 @@ instance_create(u_int16_t group_num, int
 
 	return inst;
 
-out_free:
-	instance_put(inst);
 out_unlock:
 	write_unlock_bh(&instances_lock);
+	module_put(THIS_MODULE);
+out_modunload:
 	return NULL;
 }
 
@@ -228,8 +230,6 @@ _instance_destroy2(struct nfulnl_instanc
 
 	/* and finally put the refcount */
 	instance_put(inst);
-
-	module_put(THIS_MODULE);
 }
 
 static inline void

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting
  2007-02-12  0:39 [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting Michał Mirosław
@ 2007-02-13 12:48 ` Patrick McHardy
  2007-02-13 16:25   ` Michał Mirosław
  0 siblings, 1 reply; 6+ messages in thread
From: Patrick McHardy @ 2007-02-13 12:48 UTC (permalink / raw)
  To: Michał Mirosław; +Cc: netfilter-devel, linux-kernel

Micha³ Miros³aw wrote:
> Count module references correctly: after instance_destroy() there
> might be timer pending and holding a reference for this netlink instance.

I think we should just cancel the timer on destruction.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting
  2007-02-13 12:48 ` Patrick McHardy
@ 2007-02-13 16:25   ` Michał Mirosław
  2007-02-14 16:46     ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Michał Mirosław @ 2007-02-13 16:25 UTC (permalink / raw)
  To: netfilter-devel; +Cc: linux-kernel

On Tue, Feb 13, 2007 at 01:48:03PM +0100, Patrick McHardy wrote:
> Micha³ Miros³aw wrote:
> > Count module references correctly: after instance_destroy() there
> > might be timer pending and holding a reference for this netlink instance.
> I think we should just cancel the timer on destruction.

It won't solve a race between destroying the instance and logging a
packet. It could happen that:

CPU1				CPU2
				nfulnl_log_packet()
				 -> instance_lookup_get()
				 -> lock inst
				 ** got lock on inst
_instance_destroy(inst)
 -> remove inst from list
 -> remove timer and unref inst
 -> lock inst
 				 -> start a timer
				 -> unlock inst
 ** got lock on inst
 -> clear the skb
 -> unlock inst
 -> unref inst
 -> unref module... but we still have a timer pending!

To solve that, we would need to wait until use count reached 1 somewhere
before instance_put() to be sure that no packet is being logged. That
would mean something like:

while (atomic_get(inst->use) > 1)
	del_timer_sync(&inst->timer);

I don't think it's worth the cpu cycles. The only problem I see is that
some user process could set up a long queue timeout and that would keep
the module from unloading until the timeout went off if the right packet
arrived in just the right moment. Do we need to care about this case?

BTW, in the nfnetlinkg_log.c version before my patches the timer is
cancelled already in __nfulnl_send().

 -- Michal Miroslaw

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting
  2007-02-13 16:25   ` Michał Mirosław
@ 2007-02-14 16:46     ` Patrick McHardy
  2007-02-14 17:07       ` Michał Mirosław
  0 siblings, 1 reply; 6+ messages in thread
From: Patrick McHardy @ 2007-02-14 16:46 UTC (permalink / raw)
  To: mirq-linux; +Cc: netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 1184 bytes --]

Micha³ Miros³aw wrote:
> On Tue, Feb 13, 2007 at 01:48:03PM +0100, Patrick McHardy wrote:
> 
>>I think we should just cancel the timer on destruction.
> 
> 
> It won't solve a race between destroying the instance and logging a
> packet. It could happen that:
> 
> CPU1				CPU2
> 				nfulnl_log_packet()
> 				 -> instance_lookup_get()
> 				 -> lock inst
> 				 ** got lock on inst
> _instance_destroy(inst)
>  -> remove inst from list
>  -> remove timer and unref inst
>  -> lock inst
>  				 -> start a timer
> 				 -> unlock inst
>  ** got lock on inst
>  -> clear the skb
>  -> unlock inst
>  -> unref inst
>  -> unref module... but we still have a timer pending!


This is easily fixable by adding a synchronize_rcu() call
after removing the instance from the global list.
nfulnl_log_packet() is called within a RCU read-side
critical section, so once synchronize_rcu() returns we're
guaranteed no CPU is within nfulnl_log_packet anymore with
this instance. And I think it is really preferable to having
the timer armed after destroying the instance.

This patch should take care of both the module reference
problem and the refcount leak.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 833 bytes --]

diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index 8a460b1..1d45ff6 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -213,6 +213,9 @@ _instance_destroy2(struct nfulnl_instanc
 	if (lock)
 		write_unlock_bh(&instances_lock);
 
+	sychronize_rcu();
+	if (del_timer_sync(&inst->timer))
+		instance_put(inst);
 	/* then flush all pending packets from skb */
 
 	spin_lock_bh(&inst->lock);
@@ -363,9 +366,6 @@ __nfulnl_send(struct nfulnl_instance *in
 {
 	int status;
 
-	if (timer_pending(&inst->timer))
-		del_timer(&inst->timer);
-
 	if (!inst->skb)
 		return 0;
 
@@ -676,6 +676,9 @@ #endif
 		 * enough room in the skb left. flush to userspace. */
 		UDEBUG("flushing old skb\n");
 
+		if (del_timer(&inst->timer))
+			instance_put(inst);
+
 		__nfulnl_send(inst);
 	}
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting
  2007-02-14 16:46     ` Patrick McHardy
@ 2007-02-14 17:07       ` Michał Mirosław
  2007-02-14 17:22         ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Michał Mirosław @ 2007-02-14 17:07 UTC (permalink / raw)
  To: netfilter-devel

On Wed, Feb 14, 2007 at 05:46:38PM +0100, Patrick McHardy wrote:
> Micha³ Miros³aw wrote:
>> On Tue, Feb 13, 2007 at 01:48:03PM +0100, Patrick McHardy wrote:
>>>I think we should just cancel the timer on destruction.
>> It won't solve a race between destroying the instance and logging a
>> packet. It could happen that:
>>[cut]
> This is easily fixable by adding a synchronize_rcu() call
> after removing the instance from the global list.
> nfulnl_log_packet() is called within a RCU read-side
> critical section, so once synchronize_rcu() returns we're
> guaranteed no CPU is within nfulnl_log_packet anymore with
> this instance. And I think it is really preferable to having
> the timer armed after destroying the instance.

That looks better indeed. I'll start on reading what's RCU and
how to use it properly. :)

Thanks,
Michal Miroslaw

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting
  2007-02-14 17:07       ` Michał Mirosław
@ 2007-02-14 17:22         ` Patrick McHardy
  0 siblings, 0 replies; 6+ messages in thread
From: Patrick McHardy @ 2007-02-14 17:22 UTC (permalink / raw)
  To: mirq-linux; +Cc: netfilter-devel

Micha³ Miros³aw wrote:
> On Wed, Feb 14, 2007 at 05:46:38PM +0100, Patrick McHardy wrote:
> 
>>Micha³ Miros³aw wrote:
>>
>>>On Tue, Feb 13, 2007 at 01:48:03PM +0100, Patrick McHardy wrote:
>>>
>>>>I think we should just cancel the timer on destruction.
>>>
>>>It won't solve a race between destroying the instance and logging a
>>>packet. It could happen that:
>>>[cut]
>>
>>This is easily fixable by adding a synchronize_rcu() call
>>after removing the instance from the global list.
>>nfulnl_log_packet() is called within a RCU read-side
>>critical section, so once synchronize_rcu() returns we're
>>guaranteed no CPU is within nfulnl_log_packet anymore with
>>this instance. And I think it is really preferable to having
>>the timer armed after destroying the instance.
> 
> 
> That looks better indeed. I'll start on reading what's RCU and
> how to use it properly. :)

Unfortunately the patch is still broken, we might hold
instances_lock and thus may not sleep (and especially
not call synchronize_rcu() since it will never return).
I'll look into this again, probably the conditional
locking should be removed as a first step.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2007-02-14 17:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-12  0:39 [PATCH 2.6.20 07/10] nfnetlink_log: fix module reference counting Michał Mirosław
2007-02-13 12:48 ` Patrick McHardy
2007-02-13 16:25   ` Michał Mirosław
2007-02-14 16:46     ` Patrick McHardy
2007-02-14 17:07       ` Michał Mirosław
2007-02-14 17:22         ` Patrick McHardy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.