All of lore.kernel.org
 help / color / mirror / Atom feed
* request_module while holding rtnl semaphore
@ 2004-11-04  3:11 Patrick McHardy
  2004-11-05 23:35 ` Herbert Xu
  0 siblings, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2004-11-04  3:11 UTC (permalink / raw)
  To: netdev

There are several instances of request_module beeing called while
holding the rtnl semaphore in net/sched. A pratical problem with
this is the teql scheduler which deadlocks when calling register_netdev
from its init function. A more far-fetched problem would be some crazy
person with their modules in a nfs-mounted directory on a server
reachable over a dial-on-demand link. I couldn't come up with a
solution except for refusing to autoload teql, maybe someone else has
an idea.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-04  3:11 request_module while holding rtnl semaphore Patrick McHardy
@ 2004-11-05 23:35 ` Herbert Xu
  2004-11-10  0:11   ` David S. Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Herbert Xu @ 2004-11-05 23:35 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev

Patrick McHardy <kaber@trash.net> wrote:
> There are several instances of request_module beeing called while
> holding the rtnl semaphore in net/sched. A pratical problem with
> this is the teql scheduler which deadlocks when calling register_netdev
> from its init function. A more far-fetched problem would be some crazy
> person with their modules in a nfs-mounted directory on a server
> reachable over a dial-on-demand link. I couldn't come up with a
> solution except for refusing to autoload teql, maybe someone else has
> an idea.

There are a couple of causes for this problem:

1) Abuse of the rtnl.  It's being used for too many things.  It's
basically the networking system's BKL.  If the locking were more
granular then this shouldn't occur.

2) Hooking random net/sched requests into rtnetlink.  By being an
rtnetlink user you pay the price of taking the rtnl.  Most of the
net/sched stuff has nothing to do with rtnetlink.  You know it
because they all live in AF_UNSPEC :)

Tackling either problem would lead to a solution to the dead-lock.
However, neither is trivial to solve.

On a related note, I'm working on making it easier to add new netlink
families which could lead to a solution to 2.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-05 23:35 ` Herbert Xu
@ 2004-11-10  0:11   ` David S. Miller
  2004-11-10  0:38     ` Patrick McHardy
                       ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: David S. Miller @ 2004-11-10  0:11 UTC (permalink / raw)
  To: Herbert Xu; +Cc: kaber, netdev

On Sat, 06 Nov 2004 10:35:45 +1100
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> 1) Abuse of the rtnl.  It's being used for too many things.  It's
> basically the networking system's BKL.  If the locking were more
> granular then this shouldn't occur.

I totally disagree.  The use of rtnl for all networking configuration
changes is a virtue of our current setup.

All of these actions want to make sure devices don't disappear
from underneath them while verifying a configuration change.  In
fact one of the first things the packet scheduler config change
code does is:

	if ((dev = __dev_get_by_index(tcm->tcm_ifindex)) == NULL)
		return -ENODEV;

and it expects the state of that device to not change throughout
the rest of the config change verification and decision making.

> 2) Hooking random net/sched requests into rtnetlink.  By being an
> rtnetlink user you pay the price of taking the rtnl.  Most of the
> net/sched stuff has nothing to do with rtnetlink.

I believe this is a false statement.  Every single networking
config change depends upon device existence and state in some
way.  By virtue of that, they really depend upon the RTNL being
held during the duration of their execution.

Only the actual implementations know when and where such a
dependency on device state et al. does not exist, and therefore
where RTNL holding is not necessary.

Therefore I suggest we just implement the fix for this inside of
the packet scheduler layer itself.  Simply by dropping the RTNL
semaphore during the module request, and then regrabbing the RTNL
semaphore and replaying the request from the beginning.

The net/sched/sch_api.c version of the fix would look like the
following.  The act_api.c case would require a bit more surgery,
but with the right restructuring it can be done too.

===== net/sched/sch_api.c 1.41 vs edited =====
--- 1.41/net/sched/sch_api.c	2004-11-05 16:34:45 -08:00
+++ edited/net/sched/sch_api.c	2004-11-09 15:57:19 -08:00
@@ -396,17 +396,30 @@
 	struct Qdisc_ops *ops;
 	int size;
 
+	err = -EINVAL;
 	ops = qdisc_lookup_ops(kind);
 #ifdef CONFIG_KMOD
 	if (ops==NULL && tca[TCA_KIND-1] != NULL) {
 		if (RTA_PAYLOAD(kind) <= IFNAMSIZ) {
+			rtnl_unlock();
 			request_module("sch_%s", (char*)RTA_DATA(kind));
+			rtnl_lock();
+
 			ops = qdisc_lookup_ops(kind);
+
+			/* We dropped the RTNL semaphore in order to
+			 * perform the module load.  So, even if we
+			 * succeeded in loading the module we have to
+			 * tell the caller to replay the request.  We
+			 * indicate this using -EAGAIN.
+			 */
+			if (ops != NULL)
+				err = -EAGAIN;
+			goto err_out;
 		}
 	}
 #endif
 
-	err = -EINVAL;
 	if (ops == NULL)
 		goto err_out;
 	err = -EBUSY;
@@ -600,14 +613,19 @@
 
 static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
 {
-	struct tcmsg *tcm = NLMSG_DATA(n);
-	struct rtattr **tca = arg;
+	struct tcmsg *tcm;
+	struct rtattr **tca;
 	struct net_device *dev;
-	u32 clid = tcm->tcm_parent;
-	struct Qdisc *q = NULL;
-	struct Qdisc *p = NULL;
+	u32 clid;
+	struct Qdisc *q, *p;
 	int err;
 
+replay:
+	tcm = NLMSG_DATA(n);
+	tca = arg;
+	clid = tcm->tcm_parent;
+	q = p = NULL;
+
 	if ((dev = __dev_get_by_index(tcm->tcm_ifindex)) == NULL)
 		return -ENODEV;
 
@@ -701,8 +719,14 @@
 		q = qdisc_create(dev, tcm->tcm_parent, tca, &err);
         else
 		q = qdisc_create(dev, tcm->tcm_handle, tca, &err);
-	if (q == NULL)
+	if (q == NULL) {
+		if (err == -EAGAIN) {
+			/* Replay the request. */
+			dev_put(dev);
+			goto replay;
+		}
 		return err;
+	}
 
 graft:
 	if (1) {

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  0:11   ` David S. Miller
@ 2004-11-10  0:38     ` Patrick McHardy
  2004-11-10  1:01     ` Thomas Graf
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Patrick McHardy @ 2004-11-10  0:38 UTC (permalink / raw)
  To: David S. Miller; +Cc: Herbert Xu, netdev

David S. Miller wrote:

>On Sat, 06 Nov 2004 10:35:45 +1100
>Herbert Xu <herbert@gondor.apana.org.au> wrote:
>  
>
>>2) Hooking random net/sched requests into rtnetlink.  By being an
>>rtnetlink user you pay the price of taking the rtnl.  Most of the
>>net/sched stuff has nothing to do with rtnetlink.
>>    
>>
>
>I believe this is a false statement.  Every single networking
>config change depends upon device existence and state in some
>way.  By virtue of that, they really depend upon the RTNL being
>held during the duration of their execution.
>
>Only the actual implementations know when and where such a
>dependency on device state et al. does not exist, and therefore
>where RTNL holding is not necessary.
>
Agreed. Restructuring only net/sched to be independant
of the rtnl requires alot of non-trivial work, with not
much to gain.

>Therefore I suggest we just implement the fix for this inside of
>the packet scheduler layer itself.  Simply by dropping the RTNL
>semaphore during the module request, and then regrabbing the RTNL
>semaphore and replaying the request from the beginning.
>
Nicely done. I feel smarter just by looking at it :)

>
>The net/sched/sch_api.c version of the fix would look like the
>following.  The act_api.c case would require a bit more surgery,
>but with the right restructuring it can be done too.
>

I'll have a look at this while cleaning up the code.

Regards
Patrick

>===== net/sched/sch_api.c 1.41 vs edited =====
>--- 1.41/net/sched/sch_api.c	2004-11-05 16:34:45 -08:00
>+++ edited/net/sched/sch_api.c	2004-11-09 15:57:19 -08:00
>@@ -396,17 +396,30 @@
> 	struct Qdisc_ops *ops;
> 	int size;
> 
>+	err = -EINVAL;
> 	ops = qdisc_lookup_ops(kind);
> #ifdef CONFIG_KMOD
> 	if (ops==NULL && tca[TCA_KIND-1] != NULL) {
> 		if (RTA_PAYLOAD(kind) <= IFNAMSIZ) {
>+			rtnl_unlock();
> 			request_module("sch_%s", (char*)RTA_DATA(kind));
>+			rtnl_lock();
>+
> 			ops = qdisc_lookup_ops(kind);
>+
>+			/* We dropped the RTNL semaphore in order to
>+			 * perform the module load.  So, even if we
>+			 * succeeded in loading the module we have to
>+			 * tell the caller to replay the request.  We
>+			 * indicate this using -EAGAIN.
>+			 */
>+			if (ops != NULL)
>+				err = -EAGAIN;
>+			goto err_out;
> 		}
> 	}
> #endif
> 
>-	err = -EINVAL;
> 	if (ops == NULL)
> 		goto err_out;
> 	err = -EBUSY;
>@@ -600,14 +613,19 @@
> 
> static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
> {
>-	struct tcmsg *tcm = NLMSG_DATA(n);
>-	struct rtattr **tca = arg;
>+	struct tcmsg *tcm;
>+	struct rtattr **tca;
> 	struct net_device *dev;
>-	u32 clid = tcm->tcm_parent;
>-	struct Qdisc *q = NULL;
>-	struct Qdisc *p = NULL;
>+	u32 clid;
>+	struct Qdisc *q, *p;
> 	int err;
> 
>+replay:
>+	tcm = NLMSG_DATA(n);
>+	tca = arg;
>+	clid = tcm->tcm_parent;
>+	q = p = NULL;
>+
> 	if ((dev = __dev_get_by_index(tcm->tcm_ifindex)) == NULL)
> 		return -ENODEV;
> 
>@@ -701,8 +719,14 @@
> 		q = qdisc_create(dev, tcm->tcm_parent, tca, &err);
>         else
> 		q = qdisc_create(dev, tcm->tcm_handle, tca, &err);
>-	if (q == NULL)
>+	if (q == NULL) {
>+		if (err == -EAGAIN) {
>+			/* Replay the request. */
>+			dev_put(dev);
>+			goto replay;
>+		}
> 		return err;
>+	}
> 
> graft:
> 	if (1) {
>
>  
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  0:11   ` David S. Miller
  2004-11-10  0:38     ` Patrick McHardy
@ 2004-11-10  1:01     ` Thomas Graf
  2004-11-10  1:10       ` Patrick McHardy
  2004-11-10  1:15     ` request_module while holding rtnl semaphore Herbert Xu
  2005-01-11  3:04     ` Patrick McHardy
  3 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2004-11-10  1:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: Herbert Xu, kaber, netdev

* David S. Miller <20041109161126.376f755c.davem@davemloft.net> 2004-11-09 16:11
> On Sat, 06 Nov 2004 10:35:45 +1100
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > 2) Hooking random net/sched requests into rtnetlink.  By being an
> > rtnetlink user you pay the price of taking the rtnl.  Most of the
> > net/sched stuff has nothing to do with rtnetlink.
> 
> I believe this is a false statement.  Every single networking
> config change depends upon device existence and state in some
> way.  By virtue of that, they really depend upon the RTNL being
> held during the duration of their execution.
> 
> Only the actual implementations know when and where such a
> dependency on device state et al. does not exist, and therefore
> where RTNL holding is not necessary.
> 
> Therefore I suggest we just implement the fix for this inside of
> the packet scheduler layer itself.  Simply by dropping the RTNL
> semaphore during the module request, and then regrabbing the RTNL
> semaphore and replaying the request from the beginning.

Sounds reasonable for the sch_api.c and cls_api.c case but
will get very nasty for act_api as you noted and any further sub
layers that might be added in the future. (The classifier I'm
working on is such a case.)

I suggest to add TCA_PRELOAD being an array of module names to
be preloaded so userspace can tell us what modules might be
needed. Those modules would get loaded at the same spot while
rtnl semaphore is dropped. I think the fix is fine for cls/qdisc
to fix it for older iproute2 versions but it might be worth
to make the action code use it right away.

Thoughts?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:01     ` Thomas Graf
@ 2004-11-10  1:10       ` Patrick McHardy
  2004-11-10  1:22         ` Thomas Graf
  0 siblings, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2004-11-10  1:10 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

Thomas Graf wrote:

>Sounds reasonable for the sch_api.c and cls_api.c case but
>will get very nasty for act_api as you noted and any further sub
>layers that might be added in the future. (The classifier I'm
>working on is such a case.)
>
Can you please show an example of a case where this is needed ?
Just to help me understand.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  0:11   ` David S. Miller
  2004-11-10  0:38     ` Patrick McHardy
  2004-11-10  1:01     ` Thomas Graf
@ 2004-11-10  1:15     ` Herbert Xu
  2005-01-11  3:04     ` Patrick McHardy
  3 siblings, 0 replies; 29+ messages in thread
From: Herbert Xu @ 2004-11-10  1:15 UTC (permalink / raw)
  To: David S. Miller; +Cc: kaber, netdev

On Tue, Nov 09, 2004 at 04:11:26PM -0800, David S. Miller wrote:
> 
> I totally disagree.  The use of rtnl for all networking configuration
> changes is a virtue of our current setup.

I agree with that.  However, my point is that the RTNL is used for
a lot more than that.
 
> Only the actual implementations know when and where such a
> dependency on device state et al. does not exist, and therefore
> where RTNL holding is not necessary.

You could do that if the RTNL was exclusively used for
protecting net_device.  As it is it protects RTNETLINK
and other things too.

> Therefore I suggest we just implement the fix for this inside of
> the packet scheduler layer itself.  Simply by dropping the RTNL
> semaphore during the module request, and then regrabbing the RTNL
> semaphore and replaying the request from the beginning.

I agree that this will probably the simplest fix.  However,
I'd like to see some evidence that this is actually safe.

For example, rtnetlink_rcv relies on the RTNL to ensure ordering
between requests.  Dropping the RTNL may cause the receive queue
to be processed by another process.

Now as it is I don't think it's a huge problem since we process
requests as soon as they come in.  But if we ever allow requests
to be queued for RTNETLINK this may come back to bite us.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:10       ` Patrick McHardy
@ 2004-11-10  1:22         ` Thomas Graf
  2004-11-10  1:29           ` Patrick McHardy
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2004-11-10  1:22 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

* Patrick McHardy <41916A91.3080107@trash.net> 2004-11-10 02:10
> Thomas Graf wrote:
> 
> >Sounds reasonable for the sch_api.c and cls_api.c case but
> >will get very nasty for act_api as you noted and any further sub
> >layers that might be added in the future. (The classifier I'm
> >working on is such a case.)
> >
> Can you please show an example of a case where this is needed ?
> Just to help me understand.

The action code might load modules in the middle of a classifier
configuration and it will be very hard to reverse those changes.
Right now we could move it to the top of all configurations and it
would probably be possible to get back where we fetch the device
but it will get impossible once a classifier requires module
loading which is not unlikely.

Well, it's not impossible but it would mean to have the action
code parse the TLV and just try to load the module, then report
to the classifier so it can try to load its own modules and then do
the actual action and classifier configuration. I don't even want to
think of how nasty it gets once a action module requests modules itself ;->

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:22         ` Thomas Graf
@ 2004-11-10  1:29           ` Patrick McHardy
  2004-11-10  1:39             ` Thomas Graf
  0 siblings, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2004-11-10  1:29 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

Thomas Graf wrote:

>* Patrick McHardy <41916A91.3080107@trash.net> 2004-11-10 02:10
>  
>
>>Can you please show an example of a case where this is needed ?
>>Just to help me understand.
>>    
>>
>
>The action code might load modules in the middle of a classifier
>configuration and it will be very hard to reverse those changes.
>Right now we could move it to the top of all configurations and it
>would probably be possible to get back where we fetch the device
>but it will get impossible once a classifier requires module
>loading which is not unlikely.
>  
>
Assuming all error-paths do proper cleanup, returning -EAGAIN
should always result in the same configuration state as before.
Module-loading as a side-effect is exactly why this is done,
the replay will have all modules available.

>Well, it's not impossible but it would mean to have the action
>code parse the TLV and just try to load the module, then report
>to the classifier so it can try to load its own modules and then do
>the actual action and classifier configuration. I don't even want to
>think of how nasty it gets once a action module requests modules itself ;->
>  
>
I still don't understand the problem.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:29           ` Patrick McHardy
@ 2004-11-10  1:39             ` Thomas Graf
  2004-11-10  1:41               ` Herbert Xu
  2004-11-10  1:47               ` Patrick McHardy
  0 siblings, 2 replies; 29+ messages in thread
From: Thomas Graf @ 2004-11-10  1:39 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

* Patrick McHardy <41916F0B.5010809@trash.net> 2004-11-10 02:29
> Thomas Graf wrote:
> >The action code might load modules in the middle of a classifier
> >configuration and it will be very hard to reverse those changes.
> >Right now we could move it to the top of all configurations and it
> >would probably be possible to get back where we fetch the device
> >but it will get impossible once a classifier requires module
> >loading which is not unlikely.
> > 
> >
> Assuming all error-paths do proper cleanup, returning -EAGAIN
> should always result in the same configuration state as before.

I agree but this assumption is wrong, at least for u32. I agree
that once this is true it would work perfectly fine, however I 
think it would be inefficient to parse the whole TLV tree multiple
times.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:39             ` Thomas Graf
@ 2004-11-10  1:41               ` Herbert Xu
  2004-11-10 11:32                 ` Thomas Graf
  2004-11-10  1:47               ` Patrick McHardy
  1 sibling, 1 reply; 29+ messages in thread
From: Herbert Xu @ 2004-11-10  1:41 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Patrick McHardy, David S. Miller, netdev

On Wed, Nov 10, 2004 at 02:39:41AM +0100, Thomas Graf wrote:
> 
> that once this is true it would work perfectly fine, however I 
> think it would be inefficient to parse the whole TLV tree multiple
> times.

Well it's only going to happen once for each module so that's no
big deal.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:39             ` Thomas Graf
  2004-11-10  1:41               ` Herbert Xu
@ 2004-11-10  1:47               ` Patrick McHardy
  2004-12-12 17:57                 ` Thomas Graf
  1 sibling, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2004-11-10  1:47 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

Thomas Graf wrote:

>* Patrick McHardy <41916F0B.5010809@trash.net> 2004-11-10 02:29
>  
>
>>Assuming all error-paths do proper cleanup, returning -EAGAIN
>>should always result in the same configuration state as before.
>>    
>>
>
>I agree but this assumption is wrong, at least for u32.
>
It will be true soon :) Anything else is a bug, and a nice
side-effect of this change is that all those dusty error-paths
actually get used.

>I agree
>that once this is true it would work perfectly fine, however I 
>think it would be inefficient to parse the whole TLV tree multiple
>times.
>  
>
It will only happen once for every module, so I don't think
it's a big deal.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:41               ` Herbert Xu
@ 2004-11-10 11:32                 ` Thomas Graf
  2004-11-10 11:42                   ` Herbert Xu
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2004-11-10 11:32 UTC (permalink / raw)
  To: Herbert Xu, Patrick McHardy; +Cc: David S. Miller, netdev

* Herbert Xu <20041110014125.GA7302@gondor.apana.org.au> 2004-11-10 12:41
> On Wed, Nov 10, 2004 at 02:39:41AM +0100, Thomas Graf wrote:
> > that once this is true it would work perfectly fine, however I 
> > think it would be inefficient to parse the whole TLV tree multiple
> > times.
> 
> Well it's only going to happen once for each module so that's no
> big deal.

* Patrick McHardy <41917330.6090002@trash.net> 2004-11-10 02:47
> It will only happen once for every module, so I don't think
> it's a big deal.

Agreed. I'm used to huge filter configurations up to 100 MiB per
netlink message and therefore I do care whether such as message is
parsed 1 times or 5 times since it's a matter of having the
network blocked for 3 or 15 seconds. However, if it helps to clean
up the error paths then I'll be very happy and do a workaround
for my special case if I ever need loadable modules.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10 11:32                 ` Thomas Graf
@ 2004-11-10 11:42                   ` Herbert Xu
  2004-11-10 11:56                     ` Thomas Graf
  0 siblings, 1 reply; 29+ messages in thread
From: Herbert Xu @ 2004-11-10 11:42 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Patrick McHardy, David S. Miller, netdev

On Wed, Nov 10, 2004 at 12:32:36PM +0100, Thomas Graf wrote:
> 
> Agreed. I'm used to huge filter configurations up to 100 MiB per
> netlink message and therefore I do care whether such as message is

100MiB!? How did you manage to stuff that into one linear skb?
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10 11:42                   ` Herbert Xu
@ 2004-11-10 11:56                     ` Thomas Graf
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Graf @ 2004-11-10 11:56 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Patrick McHardy, David S. Miller, netdev

* Herbert Xu <20041110114209.GA11336@gondor.apana.org.au> 2004-11-10 22:42
> On Wed, Nov 10, 2004 at 12:32:36PM +0100, Thomas Graf wrote:
> > 
> > Agreed. I'm used to huge filter configurations up to 100 MiB per
> > netlink message and therefore I do care whether such as message is
> 
> 100MiB!? How did you manage to stuff that into one linear skb?

It's splitted over multiple netlink messages. My classifier
keeps appending the messages in a hash table referenced by the
sequence number until a flag is set marking the end and then
processes the whole message. It's ugly but it works and is atomic
without locking over multiple netlink messages.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  1:47               ` Patrick McHardy
@ 2004-12-12 17:57                 ` Thomas Graf
  2004-12-12 18:04                   ` Patrick McHardy
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2004-12-12 17:57 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

> >>Assuming all error-paths do proper cleanup, returning -EAGAIN
> >>should always result in the same configuration state as before.
> >I agree but this assumption is wrong, at least for u32.
> >
> It will be true soon :) Anything else is a bug, and a nice
> side-effect of this change is that all those dusty error-paths
> actually get used.

I started working on a patchset to clean up all the error paths,
allow changing all pararmeters for those not supporting it yet
and to add the action bits for tcindex, route, and rsvp. Finishing
it and writing all the test cases to do proper testing will take some
time but I hope to have it ready for early 2.6.11 inclusion.

Just to let you know so we don't do redundant work.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-12-12 17:57                 ` Thomas Graf
@ 2004-12-12 18:04                   ` Patrick McHardy
  2004-12-13 16:53                     ` [RFC] tcf_bind_filter failure handling Thomas Graf
  0 siblings, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2004-12-12 18:04 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

Thomas Graf wrote:

>I started working on a patchset to clean up all the error paths,
>allow changing all pararmeters for those not supporting it yet
>and to add the action bits for tcindex, route, and rsvp. Finishing
>it and writing all the test cases to do proper testing will take some
>time but I hope to have it ready for early 2.6.11 inclusion.
>
>Just to let you know so we don't do redundant work.
>
Thanks for the information. Next thing I want to do is clean up the
init paths and add proper error propagation.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [RFC] tcf_bind_filter failure handling
  2004-12-12 18:04                   ` Patrick McHardy
@ 2004-12-13 16:53                     ` Thomas Graf
  2004-12-13 18:11                       ` Patrick McHardy
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2004-12-13 16:53 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

The handling of a failure in tcf_bind_filter is inconsistent.

u32: ignore
fw: ignore
route: ignore
rsvp: ignore
tcindex: error

It might be a good idea to make this consistent. So in order to validate
the classid before making any changes we could simply lock it via get
(see patch below), return an error if it fails  and put it back in case
of an error further in the path or after binding the filter.

Bindings not only locks the class from removal while a filter is
pointing to it. It speeds up classyfing by saving a lookup for every
tc_classify call. It's not really a problem if the class is not locked,
the qdisc will look it up and falls back to a default class if it
doesn't exists so it's rather a cosmetic/policy thing.

Note: The patch below is not intented for inclusion yet and will not
apply since I made local changes to pkt_cls.h.

--- linux-2.6.10-rc3-bk6.orig/include/net/pkt_cls.h	2004-12-12 21:54:03.000000000 +0100
+++ linux-2.6.10-rc3-bk6/include/net/pkt_cls.h	2004-12-13 16:40:06.000000000 +0100
@@ -62,6 +62,18 @@
 		tp->q->ops->cl_ops->unbind_tcf(tp->q, cl);
 }
 
+static inline unsigned long
+tcf_class_get(struct tcf_proto *tp, u32 classid)
+{
+	return tp->q->ops->cl_ops->get(tp->q, classid);
+}
+
+static inline void
+tcf_class_put(struct tcf_proto *tp, unsigned long cl)
+{
+	tp->q->ops->cl_ops->put(tp->q, cl);
+}
+
 #ifdef CONFIG_NET_CLS_ACT
 static inline int
 tcf_validate_act_police(struct rtattr *act_police_tlv, struct rtattr *rate_tlv,

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] tcf_bind_filter failure handling
  2004-12-13 16:53                     ` [RFC] tcf_bind_filter failure handling Thomas Graf
@ 2004-12-13 18:11                       ` Patrick McHardy
  2004-12-13 18:52                         ` Thomas Graf
  0 siblings, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2004-12-13 18:11 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

Thomas Graf wrote:

>The handling of a failure in tcf_bind_filter is inconsistent.
>
>u32: ignore
>fw: ignore
>route: ignore
>rsvp: ignore
>tcindex: error
>
>It might be a good idea to make this consistent. So in order to validate
>the classid before making any changes we could simply lock it via get
>(see patch below), return an error if it fails  and put it back in case
>of an error further in the path or after binding the filter.
>
>Bindings not only locks the class from removal while a filter is
>pointing to it. It speeds up classyfing by saving a lookup for every
>tc_classify call. It's not really a problem if the class is not locked,
>the qdisc will look it up and falls back to a default class if it
>doesn't exists so it's rather a cosmetic/policy thing.
>
You should just fix tcindex not to care about errors in tcf_bind_filter.
bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter
to get, but others (HTB, HFSC, CBQ) use a seperate counter because it is
legal to end up with a refcnt > 0 after delete. When a class with filters
pointing to it is tried to destroy they return -EBUSY, which can't be done
by looking at the refcnt.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] tcf_bind_filter failure handling
  2004-12-13 18:11                       ` Patrick McHardy
@ 2004-12-13 18:52                         ` Thomas Graf
  2004-12-13 19:12                           ` Patrick McHardy
  2004-12-13 19:23                           ` jamal
  0 siblings, 2 replies; 29+ messages in thread
From: Thomas Graf @ 2004-12-13 18:52 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

* Patrick McHardy <41BDDB5A.9000907@trash.net> 2004-12-13 19:11
> Thomas Graf wrote:
> 
> >The handling of a failure in tcf_bind_filter is inconsistent.
> >
> >u32: ignore
> >fw: ignore
> >route: ignore
> >rsvp: ignore
> >tcindex: error
> >
> >It might be a good idea to make this consistent. So in order to validate
> >the classid before making any changes we could simply lock it via get
> >(see patch below), return an error if it fails  and put it back in case
> >of an error further in the path or after binding the filter.
> >
> >Bindings not only locks the class from removal while a filter is
> >pointing to it. It speeds up classyfing by saving a lookup for every
> >tc_classify call. It's not really a problem if the class is not locked,
> >the qdisc will look it up and falls back to a default class if it
> >doesn't exists so it's rather a cosmetic/policy thing.
> >
> You should just fix tcindex not to care about errors in tcf_bind_filter.
> bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter
> to get, but others (HTB, HFSC, CBQ) use a seperate counter because it is
> legal to end up with a refcnt > 0 after delete. When a class with filters
> pointing to it is tried to destroy they return -EBUSY, which can't be done
> by looking at the refcnt.

Little misunderstanding here. I'm not aiming at replacing tcf_bind_filter
with get.  My question is rather whether to regard tcf_bind_filter not setting
tcf_result->class as an error or ignore it. 

I'm all for ignoring it in tcindex, it requires some changes because
it checks tcf_result.class field to see if hash bucket is non-empty if
perfect hash is used but is not a problem at all.

The tcf_class_get/put would be required to ensure proper locking during
validation of parameters if validating the classid being last before
changing things doesn't make sense due to the need to undo expensive
operations required before binding.

I will fix tcindex, since you also agree on simply ignoring it and regard
the binding as an ptional locking and performance increase possibility
given to userspace.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] tcf_bind_filter failure handling
  2004-12-13 18:52                         ` Thomas Graf
@ 2004-12-13 19:12                           ` Patrick McHardy
  2004-12-13 19:23                           ` jamal
  1 sibling, 0 replies; 29+ messages in thread
From: Patrick McHardy @ 2004-12-13 19:12 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

Thomas Graf wrote:

>* Patrick McHardy <41BDDB5A.9000907@trash.net> 2004-12-13 19:11
>  
>
>>You should just fix tcindex not to care about errors in tcf_bind_filter.
>>bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter
>>to get, but others (HTB, HFSC, CBQ) use a seperate counter because it is
>>legal to end up with a refcnt > 0 after delete. When a class with filters
>>pointing to it is tried to destroy they return -EBUSY, which can't be done
>>by looking at the refcnt.
>>    
>>
>
>Little misunderstanding here. I'm not aiming at replacing tcf_bind_filter
>with get.  My question is rather whether to regard tcf_bind_filter not setting
>tcf_result->class as an error or ignore it. 
>
>I'm all for ignoring it in tcindex, it requires some changes because
>it checks tcf_result.class field to see if hash bucket is non-empty if
>perfect hash is used but is not a problem at all.
>
>The tcf_class_get/put would be required to ensure proper locking during
>validation of parameters if validating the classid being last before
>changing things doesn't make sense due to the need to undo expensive
>operations required before binding.
>
>I will fix tcindex, since you also agree on simply ignoring it and regard
>the binding as an ptional locking and performance increase possibility
>given to userspace.
>
Yes, it should be ignored, otherwise you can't point a filter to a class
you will add later.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] tcf_bind_filter failure handling
  2004-12-13 18:52                         ` Thomas Graf
  2004-12-13 19:12                           ` Patrick McHardy
@ 2004-12-13 19:23                           ` jamal
  2004-12-13 19:32                             ` Thomas Graf
  1 sibling, 1 reply; 29+ messages in thread
From: jamal @ 2004-12-13 19:23 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Patrick McHardy, David S. Miller, Herbert Xu, netdev


When/why would binding fail? tcindex is an exception.
I dont see binding as having any contribution to the error path.
Additional locking is not advisable. The binding could happen while
traffic is running.

cheers,
jamal 

On Mon, 2004-12-13 at 13:52, Thomas Graf wrote:
> * Patrick McHardy <41BDDB5A.9000907@trash.net> 2004-12-13 19:11
> > Thomas Graf wrote:
> > 
> > >The handling of a failure in tcf_bind_filter is inconsistent.
> > >
> > >u32: ignore
> > >fw: ignore
> > >route: ignore
> > >rsvp: ignore
> > >tcindex: error
> > >
> > >It might be a good idea to make this consistent. So in order to validate
> > >the classid before making any changes we could simply lock it via get
> > >(see patch below), return an error if it fails  and put it back in case
> > >of an error further in the path or after binding the filter.
> > >
> > >Bindings not only locks the class from removal while a filter is
> > >pointing to it. It speeds up classyfing by saving a lookup for every
> > >tc_classify call. It's not really a problem if the class is not locked,
> > >the qdisc will look it up and falls back to a default class if it
> > >doesn't exists so it's rather a cosmetic/policy thing.
> > >
> > You should just fix tcindex not to care about errors in tcf_bind_filter.
> > bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter
> > to get, but others (HTB, HFSC, CBQ) use a seperate counter because it is
> > legal to end up with a refcnt > 0 after delete. When a class with filters
> > pointing to it is tried to destroy they return -EBUSY, which can't be done
> > by looking at the refcnt.
> 
> Little misunderstanding here. I'm not aiming at replacing tcf_bind_filter
> with get.  My question is rather whether to regard tcf_bind_filter not setting
> tcf_result->class as an error or ignore it. 
> 
> I'm all for ignoring it in tcindex, it requires some changes because
> it checks tcf_result.class field to see if hash bucket is non-empty if
> perfect hash is used but is not a problem at all.
> 
> The tcf_class_get/put would be required to ensure proper locking during
> validation of parameters if validating the classid being last before
> changing things doesn't make sense due to the need to undo expensive
> operations required before binding.
> 
> I will fix tcindex, since you also agree on simply ignoring it and regard
> the binding as an ptional locking and performance increase possibility
> given to userspace.
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC] tcf_bind_filter failure handling
  2004-12-13 19:23                           ` jamal
@ 2004-12-13 19:32                             ` Thomas Graf
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Graf @ 2004-12-13 19:32 UTC (permalink / raw)
  To: jamal; +Cc: Patrick McHardy, David S. Miller, Herbert Xu, netdev

* jamal <1102965823.1075.14.camel@jzny.localdomain> 2004-12-13 14:23
> 
> When/why would binding fail? tcindex is an exception.
> I dont see binding as having any contribution to the error path.
> Additional locking is not advisable. The binding could happen while
> traffic is running.

The binding fails if the class does not exist at the time the classifier
is loaded. The original implementation regarded binding optional to do
the classid -> class lookup only once while loading instead of everytime
classify() returns. tcindex does not do so because it uses the class
field to determine whether a perfect hash bucket is used or not. I
changed it to check classid || police || action because one of them is
definitely defined and enforced as a requirement during validation.

The locking i mentioned was not a spinlock but rather a refcnt++ in the
class intended to be bound later during change, i.e. to ensure a qdisc
destroy rcu callback can't destroy the class while we're about to bind
to it. But since we agree on changing tcindex this is not an issue and
nothing will change.

Sorry for the confusion, my word choice was rather bad for such a
simple issue and produced more confusion than results.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2004-11-10  0:11   ` David S. Miller
                       ` (2 preceding siblings ...)
  2004-11-10  1:15     ` request_module while holding rtnl semaphore Herbert Xu
@ 2005-01-11  3:04     ` Patrick McHardy
  2005-01-11  9:47       ` Thomas Graf
  3 siblings, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2005-01-11  3:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: Herbert Xu, netdev

David S. Miller wrote:

>Therefore I suggest we just implement the fix for this inside of
>the packet scheduler layer itself.  Simply by dropping the RTNL
>semaphore during the module request, and then regrabbing the RTNL
>semaphore and replaying the request from the beginning.
>
>The net/sched/sch_api.c version of the fix would look like the
>following.  The act_api.c case would require a bit more surgery,
>but with the right restructuring it can be done too.
>  
>
This patch got lost somehow. The act_api.c changes are actually
even more complicated because besides the action init path, changes
can also be made from classifiers in a deep call-chain. I hope
Thomas's recent changes make it easier to fix this, but I think
this patch should go in now anyway.

Regards
Patrick

>===== net/sched/sch_api.c 1.41 vs edited =====
>--- 1.41/net/sched/sch_api.c	2004-11-05 16:34:45 -08:00
>+++ edited/net/sched/sch_api.c	2004-11-09 15:57:19 -08:00
>@@ -396,17 +396,30 @@
> 	struct Qdisc_ops *ops;
> 	int size;
> 
>+	err = -EINVAL;
> 	ops = qdisc_lookup_ops(kind);
> #ifdef CONFIG_KMOD
> 	if (ops==NULL && tca[TCA_KIND-1] != NULL) {
> 		if (RTA_PAYLOAD(kind) <= IFNAMSIZ) {
>+			rtnl_unlock();
> 			request_module("sch_%s", (char*)RTA_DATA(kind));
>+			rtnl_lock();
>+
> 			ops = qdisc_lookup_ops(kind);
>+
>+			/* We dropped the RTNL semaphore in order to
>+			 * perform the module load.  So, even if we
>+			 * succeeded in loading the module we have to
>+			 * tell the caller to replay the request.  We
>+			 * indicate this using -EAGAIN.
>+			 */
>+			if (ops != NULL)
>+				err = -EAGAIN;
>+			goto err_out;
> 		}
> 	}
> #endif
> 
>-	err = -EINVAL;
> 	if (ops == NULL)
> 		goto err_out;
> 	err = -EBUSY;
>@@ -600,14 +613,19 @@
> 
> static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
> {
>-	struct tcmsg *tcm = NLMSG_DATA(n);
>-	struct rtattr **tca = arg;
>+	struct tcmsg *tcm;
>+	struct rtattr **tca;
> 	struct net_device *dev;
>-	u32 clid = tcm->tcm_parent;
>-	struct Qdisc *q = NULL;
>-	struct Qdisc *p = NULL;
>+	u32 clid;
>+	struct Qdisc *q, *p;
> 	int err;
> 
>+replay:
>+	tcm = NLMSG_DATA(n);
>+	tca = arg;
>+	clid = tcm->tcm_parent;
>+	q = p = NULL;
>+
> 	if ((dev = __dev_get_by_index(tcm->tcm_ifindex)) == NULL)
> 		return -ENODEV;
> 
>@@ -701,8 +719,14 @@
> 		q = qdisc_create(dev, tcm->tcm_parent, tca, &err);
>         else
> 		q = qdisc_create(dev, tcm->tcm_handle, tca, &err);
>-	if (q == NULL)
>+	if (q == NULL) {
>+		if (err == -EAGAIN) {
>+			/* Replay the request. */
>+			dev_put(dev);
>+			goto replay;
>+		}
> 		return err;
>+	}
> 
> graft:
> 	if (1) {
>
>  
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2005-01-11  3:04     ` Patrick McHardy
@ 2005-01-11  9:47       ` Thomas Graf
  2005-01-11 21:05         ` Patrick McHardy
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2005-01-11  9:47 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

* Patrick McHardy <41E34252.504@trash.net> 2005-01-11 04:04
> David S. Miller wrote:
> 
> >Therefore I suggest we just implement the fix for this inside of
> >the packet scheduler layer itself.  Simply by dropping the RTNL
> >semaphore during the module request, and then regrabbing the RTNL
> >semaphore and replaying the request from the beginning.
> >
> >The net/sched/sch_api.c version of the fix would look like the
> >following.  The act_api.c case would require a bit more surgery,
> >but with the right restructuring it can be done too.
> > 
> >
> This patch got lost somehow. The act_api.c changes are actually
> even more complicated because besides the action init path, changes
> can also be made from classifiers in a deep call-chain. I hope
> Thomas's recent changes make it easier to fix this, but I think
> this patch should go in now anyway.

The action initialization is now done first and no classifier
data is changed except for tp->root modifications which must not
be undone so you can safely return EBUSY. rsvp might be an exception,
I haven't looked too closely into it yet.

tcindex returns EBUSY when a filter is changed which does not fit
into the hashtable, so this must be changed.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2005-01-11  9:47       ` Thomas Graf
@ 2005-01-11 21:05         ` Patrick McHardy
  2005-01-11 21:47           ` Thomas Graf
  0 siblings, 1 reply; 29+ messages in thread
From: Patrick McHardy @ 2005-01-11 21:05 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

Thomas Graf wrote:

>* Patrick McHardy <41E34252.504@trash.net> 2005-01-11 04:04
>
>>This patch got lost somehow. The act_api.c changes are actually
>>even more complicated because besides the action init path, changes
>>can also be made from classifiers in a deep call-chain. I hope
>>Thomas's recent changes make it easier to fix this, but I think
>>this patch should go in now anyway.
>>
>
>The action initialization is now done first and no classifier
>data is changed except for tp->root modifications which must not
>be undone so you can safely return EBUSY. rsvp might be an exception,
>I haven't looked too closely into it yet.
>
>tcindex returns EBUSY when a filter is changed which does not fit
>into the hashtable, so this must be changed.
>
It's -EAGAIN, so no problem, nothing returns this currently. I'm going to
send a patch later.

Regards
Patrick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2005-01-11 21:05         ` Patrick McHardy
@ 2005-01-11 21:47           ` Thomas Graf
  2005-01-11 21:50             ` Thomas Graf
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2005-01-11 21:47 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

* Patrick McHardy <41E43F83.1090503@trash.net> 2005-01-11 22:05
> Thomas Graf wrote:
> >tcindex returns EBUSY when a filter is changed which does not fit
> >into the hashtable, so this must be changed.
> >
> It's -EAGAIN, so no problem, nothing returns this currently. I'm going to
> send a patch later.

Of course, then there is no problem. Will you fix the request_module in
cls_api.c as well?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2005-01-11 21:47           ` Thomas Graf
@ 2005-01-11 21:50             ` Thomas Graf
  2005-01-11 22:18               ` Patrick McHardy
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Graf @ 2005-01-11 21:50 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David S. Miller, Herbert Xu, netdev

* Thomas Graf <20050111214720.GF26856@postel.suug.ch> 2005-01-11 22:47
> * Patrick McHardy <41E43F83.1090503@trash.net> 2005-01-11 22:05
> > Thomas Graf wrote:
> > >tcindex returns EBUSY when a filter is changed which does not fit
> > >into the hashtable, so this must be changed.
> > >
> > It's -EAGAIN, so no problem, nothing returns this currently. I'm going to
> > send a patch later.
> 
> Of course, then there is no problem. Will you fix the request_module in
> cls_api.c as well?

Just saw your patch, never mind.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: request_module while holding rtnl semaphore
  2005-01-11 21:50             ` Thomas Graf
@ 2005-01-11 22:18               ` Patrick McHardy
  0 siblings, 0 replies; 29+ messages in thread
From: Patrick McHardy @ 2005-01-11 22:18 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David S. Miller, Herbert Xu, netdev

[-- Attachment #1: Type: text/plain, Size: 404 bytes --]

Thomas Graf wrote:

>>Of course, then there is no problem. Will you fix the request_module in
>>cls_api.c as well?
>>
>
>Just saw your patch, never mind.
>

Oops, embarassing :) The cls_api.c changes in my patch only deal with
EAGAIN from act_api.c, but I didn't fix module loading in cls_api.c
itself. Patch on top of the previous ones attached, I will also update
the bitkeeper tree.

Regards
Patrick


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1576 bytes --]

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2005/01/11 23:15:28+01:00 kaber@coreworks.de 
#   [PKT_SCHED]: cls_api.c: drop rtnl for loading modules
#   
#   Signed-off-by: Patrick McHardy <kaber@trash.net>
# 
# net/sched/cls_api.c
#   2005/01/11 23:15:20+01:00 kaber@coreworks.de +14 -5
#   [PKT_SCHED]: cls_api.c: drop rtnl for loading modules
#   
#   Signed-off-by: Patrick McHardy <kaber@trash.net>
# 
diff -Nru a/net/sched/cls_api.c b/net/sched/cls_api.c
--- a/net/sched/cls_api.c	2005-01-11 23:17:58 +01:00
+++ b/net/sched/cls_api.c	2005-01-11 23:17:58 +01:00
@@ -219,20 +219,29 @@
 		err = -ENOBUFS;
 		if ((tp = kmalloc(sizeof(*tp), GFP_KERNEL)) == NULL)
 			goto errout;
+		err = -EINVAL;
 		tp_ops = tcf_proto_lookup_ops(tca[TCA_KIND-1]);
+		if (tp_ops == NULL) {
 #ifdef CONFIG_KMOD
-		if (tp_ops==NULL && tca[TCA_KIND-1] != NULL) {
 			struct rtattr *kind = tca[TCA_KIND-1];
 			char name[IFNAMSIZ];
 
-			if (rtattr_strlcpy(name, kind, IFNAMSIZ) < IFNAMSIZ) {
+			if (kind != NULL &&
+			    rtattr_strlcpy(name, kind, IFNAMSIZ) < IFNAMSIZ) {
+				rtnl_unlock();
 				request_module("cls_%s", name);
+				rtnl_lock();
 				tp_ops = tcf_proto_lookup_ops(kind);
+				/* We dropped the RTNL semaphore in order to
+				 * perform the module load.  So, even if we
+				 * succeeded in loading the module we have to
+				 * replay the request.  We indicate this using
+				 * -EAGAIN.
+				 */
+				if (tp_ops != NULL)
+					err = -EAGAIN;
 			}
-		}
 #endif
-		if (tp_ops == NULL) {
-			err = -EINVAL;
 			kfree(tp);
 			goto errout;
 		}

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2005-01-11 22:18 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-04  3:11 request_module while holding rtnl semaphore Patrick McHardy
2004-11-05 23:35 ` Herbert Xu
2004-11-10  0:11   ` David S. Miller
2004-11-10  0:38     ` Patrick McHardy
2004-11-10  1:01     ` Thomas Graf
2004-11-10  1:10       ` Patrick McHardy
2004-11-10  1:22         ` Thomas Graf
2004-11-10  1:29           ` Patrick McHardy
2004-11-10  1:39             ` Thomas Graf
2004-11-10  1:41               ` Herbert Xu
2004-11-10 11:32                 ` Thomas Graf
2004-11-10 11:42                   ` Herbert Xu
2004-11-10 11:56                     ` Thomas Graf
2004-11-10  1:47               ` Patrick McHardy
2004-12-12 17:57                 ` Thomas Graf
2004-12-12 18:04                   ` Patrick McHardy
2004-12-13 16:53                     ` [RFC] tcf_bind_filter failure handling Thomas Graf
2004-12-13 18:11                       ` Patrick McHardy
2004-12-13 18:52                         ` Thomas Graf
2004-12-13 19:12                           ` Patrick McHardy
2004-12-13 19:23                           ` jamal
2004-12-13 19:32                             ` Thomas Graf
2004-11-10  1:15     ` request_module while holding rtnl semaphore Herbert Xu
2005-01-11  3:04     ` Patrick McHardy
2005-01-11  9:47       ` Thomas Graf
2005-01-11 21:05         ` Patrick McHardy
2005-01-11 21:47           ` Thomas Graf
2005-01-11 21:50             ` Thomas Graf
2005-01-11 22:18               ` Patrick McHardy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.