* Scalability of interface creation and deletion
@ 2011-05-07 11:08 Alex Bligh
  2011-05-07 12:22 ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-07 11:08 UTC (permalink / raw)
  To: netdev; +Cc: Alex Bligh

I am trying to track down why interface creation slows down badly with
large numbers of interfaces (~1,000 interfaces) and why deletion is so
slow. Use case: restarting routers needs to be fast; some failover methods
require interface up/down; some routers need lots of interfaces.

I have written a small shell script to create and delete a number of
interfaces supplied on the command line (script appended below). It
is important to run this with udev, udev-bridge etc. disabled. In my
environment (Ubuntu Lucid, 2.6.32-28-generic) I did this with:
 * service upstart-udev-bridge stop
 * service udev stop
 * unshare -n bash
If you don't do this, you are simply timing your distro's interface
scripts.

Note that the "-n" parameter gives the number of veth pairs to create;
as these are pairs, twice that many interfaces are actually created.

So, the results, which are pretty repeatable, are as follows (average
time per pair):

                            100 pairs      500 pairs
Interface creation               14ms          110ms
Interface deletion              160ms          148ms

Now, I don't think interface deletion has in fact got faster: it's simply
that the fixed overhead of loading the script is spread over more
operations. But there are two obvious conclusions:

1. Interface creation slows down hugely with more interfaces
2. Interface deletion is normally much slower than interface creation

strace -T -ttt on the "ip" command used to do this does not show the delay
where I thought it would be - cataloguing the existing interfaces. Instead,
it's the final send() to the netlink socket (the one which performs the
relevant action) that appears to be slow, for both addition and deletion.
Adding the last interface takes 200ms in that syscall, whereas the first is
quick (symptomatic of a slowdown); for deletion even the last send()
syscall is quick.

Poking about in net/core/dev.c, I see that interface names are hashed into
a table with a maximum of 256 entries. However, these seem to be hash
buckets supporting multiple entries, so I can't imagine a chain of about 4
entries is problematic.
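
(Purely for illustration - a toy userspace sketch of a 256-bucket chained
hash of interface names, not the kernel's actual code - showing that
~1,000 names give chains of only a handful of entries:)

/* toy_name_hash.c - illustrative only, not the kernel's dev_name_hash().
 * Hash 1000 interface names into 256 chained buckets and report the
 * longest chain, to show why a ~4-entry chain is not a scaling problem. */
#include <stdio.h>
#include <stdlib.h>

#define HASHBITS    8
#define HASHENTRIES (1 << HASHBITS)     /* 256 buckets */

struct node { char name[16]; struct node *next; };
static struct node *buckets[HASHENTRIES];

static unsigned name_hash(const char *s)
{
        unsigned h = 0;
        while (*s)
                h = h * 31 + (unsigned char)*s++;   /* any decent string hash */
        return h & (HASHENTRIES - 1);
}

int main(void)
{
        int i, n = 1000, longest = 0;

        for (i = 0; i < n; i++) {
                struct node *e = calloc(1, sizeof(*e));
                unsigned h;

                snprintf(e->name, sizeof(e->name), "veth%d", i);
                h = name_hash(e->name);
                e->next = buckets[h];               /* chain into the bucket */
                buckets[h] = e;
        }
        for (i = 0; i < HASHENTRIES; i++) {
                int len = 0;
                struct node *e;

                for (e = buckets[i]; e; e = e->next)
                        len++;
                if (len > longest)
                        longest = len;
        }
        printf("%d names over %d buckets, longest chain: %d\n",
               n, HASHENTRIES, longest);
        return 0;
}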

I am having difficulty seeing what might be the issue in interface
creation. Any ideas?

In interface deletion, my attention is drawn to netdev_wait_allrefs,
which does this:
        refcnt = netdev_refcnt_read(dev);

        while (refcnt != 0) {
                ...
                msleep(250);

                refcnt = netdev_refcnt_read(dev);
		....
        }

I am guessing that this hits the msleep() roughly 50% of the time, which
would explain 125ms of the observed time. How would people react to
exponential backoff instead (untested):

        int backoff = 10;
        refcnt = netdev_refcnt_read(dev);

        while (refcnt != 0) {
                ...
                msleep(backoff);
                if ((backoff *= 2) > 250)
                        backoff = 250;

                refcnt = netdev_refcnt_read(dev);
                ...
        }


-- 
Alex Bligh



#!/bin/bash

# Usage:
#   ifaceseq [options]
#
# Options:
#   -n NUM : use NUM interfaces
#   -t TYPE : use TYPE of interfaces (supported: veth, vlan)

numifs=10
itype=veth

while getopts n:t: flag; do
    case ${flag} in
	n) numifs=${OPTARG} ;;
	t) itype=${OPTARG} ;;
    esac
done

shift $((OPTIND-1))

createifs ()
{
    echo `date` creating $numifs interfaces
    case ${itype} in
	vlan)
	    for i in `seq 1 $numifs` ; do
		ip link add link eth0 name vlan${i} type vlan id ${i}
	    done
	    ;;
	*)
	    for i in `seq 1 $numifs` ; do
		ip link add testa${i} type veth peer name testb${i}
	    done
    esac
    echo `date` done
}

deleteifs ()
{
    echo `date` deleting $numifs interfaces
    case ${itype} in
	vlan)
	    for i in `seq 1 $numifs` ; do
		ip link delete dev vlan${i}
	    done
	    ;;
	*)
	    for i in `seq 1 $numifs` ; do
		ip link delete testa${i}
	    done
    esac
    echo `date` done
}

time createifs;
time deleteifs;





* Re: Scalability of interface creation and deletion
  2011-05-07 11:08 Scalability of interface creation and deletion Alex Bligh
@ 2011-05-07 12:22 ` Eric Dumazet
  2011-05-07 15:26   ` Alex Bligh
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 12:22 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev

On Saturday 07 May 2011 at 12:08 +0100, Alex Bligh wrote:
> I am trying to track down why interface creation slows down badly with
> large numbers of interfaces (~1,000 interfaces) and why deletion is so
> slow. Use case: restarting routers needs to be fast; some failover methods
> require interface up/down; some routers need lots of interfaces.
> 
> I have written a small shell script to create and delete a number of
> interfaces supplied on the command line (script appended below). It
> is important to run this with udev, udev-bridge etc. disabled. In
> my environment
> (Ubuntu 2.6.32-28-generic, Lucid). I did this by
>  * service upstart-udev-bridge stop
>  * service udev stop
>  * unshare -n bash
> If you don't do this, you are simply timing your distro's interface
> scripts.
> 
> Note the "-n" parameter creates the supplied number of veth pair
> interfaces. As these are pairs, there are twice as many interfaces actually
> created.
> 
> So, the results which are pretty repeatable are as follows:
> 
>                             100 pairs      500 pairs
> Interface creation               14ms          110ms
> Interface deletion              160ms          148ms
> 
> Now I don't think interface deletion has in fact got faster: simply
> the overhead of loading the script is spread over more processes.
> But there are two obvious conclusions:
> 
> 1. Interface creation slows down hugely with more interfaces

sysfs is the problem, a very well known one (see sysfs_refresh_inode()).

Try:

$ time ls /sys/class/net >/dev/null

real	0m0.002s
user	0m0.000s
sys	0m0.001s
$ modprobe dummy numdummies=1000
$ time ls /sys/class/net >/dev/null

real	0m0.041s
user	0m0.003s
sys	0m0.002s


> 2. Interface deletion is normally much slower than interface creation
> 
> strace -T -ttt on the "ip" command used to do this does not show the delay
> where I thought it would be - cataloguing the existing interfaces. Instead,
> it's the final send() to the netlink socket which does the relevant action
> which appears to be slow, for both addition and deletion. Adding the last
> interface takes 200ms in that syscall, the first is quick (symptomatic of a
> slowdown); for deletion the last send syscall is quick.
> 
> Poking about in net/core/dev.c, I see that interface names are hashed using
> a hash with a maximum of 256 entries. However, these seem to be hash
> buckets supporting multiple entries so I can't imagine a chain of 4 entries
> is problematic.

It's not.

> 
> I am having difficulty seeing what might be the issue in interface
> creation. Any ideas?
> 

Actually quite a lot; just run

git log net/core/dev.c

and you'll see many commits that make this faster.

> In interface deletion, my attention is drawn to netdev_wait_allrefs,
> which does this:
>         refcnt = netdev_refcnt_read(dev);
> 

Here refcnt is normally already 0, or there is a bug somewhere.
(It happens; we fix bugs once in a while.)

>         while (refcnt != 0) {
>                 ...
>                 msleep(250);
> 
>                 refcnt = netdev_refcnt_read(dev);
> 		....
>         }
> 
> I am guessing that this is going to do the msleep 50% of the time,
> explaining 125ms of the observed time. How would people react to
> exponential backoff instead (untested):
> 
> 	int backoff = 10;
>         refcnt = netdev_refcnt_read(dev);
> 
>         while (refcnt != 0) {
>                 ...
>                 msleep(backoff);
>                 if ((backoff *= 2) > 250)
>                   backoff = 250;
> 		
>                 refcnt = netdev_refcnt_read(dev);
> 		....
>         }
> 
> 

Welcome to the club. This has been discussed on netdev for many years,
and a lot of work has been done to make it better.

Interface deletion needs several RCU synchronization calls, and they are
very expensive. This is the price to pay for a lockless network stack in
the fast paths.
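
(To illustrate where that price is paid, here is a minimal userspace
sketch - not kernel code - using liburcu, which follows the same pattern:
readers stay lockless, so the writer must wait out a grace period before
freeing anything. Assumes the userspace RCU library is installed; build
with something like "gcc rcu_demo.c -lurcu -lpthread".)

/* rcu_demo.c - illustrative sketch only, not kernel code.
 * A reader thread reads a shared pointer locklessly under rcu_read_lock();
 * the writer swaps the pointer and must call synchronize_rcu() (wait a
 * full grace period) before it may free the old object. That wait is the
 * cost being discussed. */
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <urcu.h>               /* userspace RCU library (liburcu) */

struct cfg { int value; };
static struct cfg *global_cfg;

static void *reader(void *arg)
{
        int i;

        (void)arg;
        rcu_register_thread();                  /* reader threads must register */
        for (i = 0; i < 1000000; i++) {
                struct cfg *c;

                rcu_read_lock();                /* cheap, never blocks */
                c = rcu_dereference(global_cfg);
                if (c)
                        (void)c->value;
                rcu_read_unlock();
        }
        rcu_unregister_thread();
        return NULL;
}

int main(void)
{
        pthread_t t;
        struct cfg *old, *replacement;

        global_cfg = calloc(1, sizeof(*global_cfg));
        pthread_create(&t, NULL, reader, NULL);

        replacement = calloc(1, sizeof(*replacement));
        replacement->value = 1;
        old = global_cfg;
        rcu_assign_pointer(global_cfg, replacement); /* publish replacement */
        synchronize_rcu();      /* the expensive part: wait for every reader
                                 * that might still hold the old pointer */
        free(old);              /* only now is freeing safe */

        pthread_join(t, NULL);
        return 0;
}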





* Re: Scalability of interface creation and deletion
  2011-05-07 12:22 ` Eric Dumazet
@ 2011-05-07 15:26   ` Alex Bligh
  2011-05-07 15:54     ` Eric Dumazet
  2011-05-07 16:26     ` Eric Dumazet
  0 siblings, 2 replies; 53+ messages in thread
From: Alex Bligh @ 2011-05-07 15:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Alex Bligh

Eric,

>> 1. Interface creation slows down hugely with more interfaces
>
> sysfs is the problem, a very well known one.
> (sysfs_refresh_inode(),

Thanks

>> 2. Interface deletion is normally much slower than interface creation
>>
>> strace -T -ttt on the "ip" command used to do this does not show the
>> delay where I thought it would be - cataloguing the existing interfaces.
>> Instead, it's the final send() to the netlink socket which does the
>> relevant action which appears to be slow, for both addition and deletion.
>> Adding the last interface takes 200ms in that syscall, the first is
>> quick (symptomatic of a slowdown); for deletion the last send syscall is
>> quick.
>
>> I am having difficulty seeing what might be the issue in interface
>> creation. Any ideas?
>>
>
> Actually a lot, just make
>
> git log net/core/dev.c
>
> and you'll see many commits to make this faster.

OK. I am up to 2.6.38.2 and see no improvement by then. I will
try something bleeding edge in a bit.

>> I am guessing that this is going to do the msleep 50% of the time,
>> explaining 125ms of the observed time. How would people react to
>> exponential backoff instead (untested):
>>
>> 	int backoff = 10;
>>         refcnt = netdev_refcnt_read(dev);
>>
>>         while (refcnt != 0) {
>>                 ...
>>                 msleep(backoff);
>>                 if ((backoff *= 2) > 250)
>>                   backoff = 250;
>> 		
>>                 refcnt = netdev_refcnt_read(dev);
>> 		....
>>         }
>>
>>
>
> Welcome to the club. This is what is discussed on netdev since many
> years. Lot of work had been done to make it better.

Well, I patched it (patch attached for what it's worth) and it made
no difference in this case. I would suggest however that it might
be the right thing to do anyway.

> Interface deletion needs several rcu synch calls, they are very
> expensive. This is the price to pay to have lockless network stack in
> fast paths.

On the current 8 core box I am testing, I see 280ms per interface
delete **even with only 10 interfaces**. I see 260ms with one
interface. I know doing lots of rcu sync stuff can be slow, but
260ms to remove one veth pair sounds like more than rcu sync going
on. It sounds like a sleep (though I may not have found the
right one). I see no CPU load.

Equally, with one interface (remember I'm doing this in unshare -n
so there is only a loopback interface there), this bit surely
can't be sysfs.

-- 
Alex Bligh

Signed-off-by: Alex Bligh <alex@alex.org.uk>
diff --git a/net/core/dev.c b/net/core/dev.c
index 6561021..f55c95c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5429,6 +5429,7 @@ static void netdev_wait_allrefs(struct net_device *dev)
 {
        unsigned long rebroadcast_time, warning_time;
        int refcnt;
+       int backoff = 5;

        linkwatch_forget_dev(dev);

@@ -5460,7 +5461,9 @@ static void netdev_wait_allrefs(struct net_device *dev)
                        rebroadcast_time = jiffies;
                }

-               msleep(250);
+               msleep(backoff);
+               if ((backoff *= 2) > 250)
+                 backoff = 250;

                refcnt = netdev_refcnt_read(dev);






* Re: Scalability of interface creation and deletion
  2011-05-07 15:26   ` Alex Bligh
@ 2011-05-07 15:54     ` Eric Dumazet
  2011-05-07 16:23       ` Ben Greear
  2011-05-09 21:46       ` Octavian Purdila
  2011-05-07 16:26     ` Eric Dumazet
  1 sibling, 2 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 15:54 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev

On Saturday 07 May 2011 at 16:26 +0100, Alex Bligh wrote:
> Well, I patched it (patch attached for what it's worth) and it made
> no difference in this case. I would suggest however that it might
> be the right thing to do anyway.
> 

As I said, this code should not be entered in normal situations.

You are not the first to suggest a change, but it won't help you at all.




> On the current 8 core box I am testing, I see 280ms per interface
> delete **even with only 10 interfaces**. I see 260ms with one
> interface. I know doing lots of rcu sync stuff can be slow, but
> 260ms to remove one veth pair sounds like more than rcu sync going
> on. It sounds like a sleep (though I may not have found the
> right one). I see no CPU load.
> 
> Equally, with one interface (remember I'm doing this in unshare -n
> so there is only a loopback interface there), this bit surely
> can't be sysfs.
> 

synchronize_rcu() calls do not consume CPU; they just _wait_ for an RCU
grace period to elapse.

I suggest you read the Documentation/RCU files if you really want to :)

If you want to check how expensive it is, it's quite easy:
add a trace in synchronize_net():

diff --git a/net/core/dev.c b/net/core/dev.c
index 856b6ee..70f3c46 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5915,8 +5915,10 @@ EXPORT_SYMBOL(free_netdev);
  */
 void synchronize_net(void)
 {
+	pr_err("begin synchronize_net()\n");
 	might_sleep();
 	synchronize_rcu();
+	pr_err("end synchronize_net()\n");
 }
 EXPORT_SYMBOL(synchronize_net);
 







* Re: Scalability of interface creation and deletion
  2011-05-07 15:54     ` Eric Dumazet
@ 2011-05-07 16:23       ` Ben Greear
  2011-05-07 16:37         ` Eric Dumazet
  2011-05-09 21:46       ` Octavian Purdila
  1 sibling, 1 reply; 53+ messages in thread
From: Ben Greear @ 2011-05-07 16:23 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Alex Bligh, netdev

On 05/07/2011 08:54 AM, Eric Dumazet wrote:
> On Saturday 07 May 2011 at 16:26 +0100, Alex Bligh wrote:
>> Well, I patched it (patch attached for what it's worth) and it made
>> no difference in this case. I would suggest however that it might
>> be the right thing to do anyway.
>>
>
> As I said, this code should not be entered in normal situations.
>
> You are not the first to suggest a change, but it won't help you at all.
>
>
>
>
>> On the current 8 core box I am testing, I see 280ms per interface
>> delete **even with only 10 interfaces**. I see 260ms with one
>> interface. I know doing lots of rcu sync stuff can be slow, but
>> 260ms to remove one veth pair sounds like more than rcu sync going
>> on. It sounds like a sleep (though I may not have found the
>> right one). I see no CPU load.
>>
>> Equally, with one interface (remember I'm doing this in unshare -n
>> so there is only a loopback interface there), this bit surely
>> can't be sysfs.
>>
>
> synchronize_rcu() calls are not consuming cpu, they just _wait_
> rcu grace period.
>
> I suggest you read Documentation/RCU files if you really want to :)
>
> If you want to check how expensive it is, its quite easy:
> add a trace in synchronize_net()
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 856b6ee..70f3c46 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5915,8 +5915,10 @@ EXPORT_SYMBOL(free_netdev);
>    */
>   void synchronize_net(void)
>   {
> +	pr_err("begin synchronize_net()\n");
>   	might_sleep();
>   	synchronize_rcu();
> +	pr_err("end synchronize_net()\n");
>   }
>   EXPORT_SYMBOL(synchronize_net);

I wonder if it would be worth having a 'delete me soon'
method to delete interfaces that would not block on the
RCU code.

The controlling programs could use netlink messages to
know exactly when an interface was truly gone.

That should allow some batching in the sync-net logic
too, if user-space code deletes 1000 interfaces very
quickly, for instance...
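
(A minimal, illustrative sketch of that - not code from this thread - is
below: subscribe an rtnetlink socket to the link multicast group and treat
RTM_DELLINK as "the interface is truly gone".)

/* dellink_watch.c - illustrative sketch of watching for link deletion.
 * Subscribes to the RTMGRP_LINK multicast group on an rtnetlink socket
 * and prints the ifindex of every RTM_DELLINK notification. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
        struct sockaddr_nl sa = {
                .nl_family = AF_NETLINK,
                .nl_groups = RTMGRP_LINK,       /* link add/del/change events */
        };
        char buf[8192];
        int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

        if (fd < 0 || bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
                perror("netlink");
                return 1;
        }
        for (;;) {
                int len = recv(fd, buf, sizeof(buf), 0);
                struct nlmsghdr *nh;

                if (len <= 0)
                        break;
                for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
                     nh = NLMSG_NEXT(nh, len)) {
                        if (nh->nlmsg_type == RTM_DELLINK) {
                                struct ifinfomsg *ifi = NLMSG_DATA(nh);

                                printf("ifindex %d is gone\n", ifi->ifi_index);
                        }
                }
        }
        close(fd);
        return 0;
}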

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Scalability of interface creation and deletion
  2011-05-07 15:26   ` Alex Bligh
  2011-05-07 15:54     ` Eric Dumazet
@ 2011-05-07 16:26     ` Eric Dumazet
  2011-05-07 18:24       ` Alex Bligh
  2011-05-07 18:38       ` Alex Bligh
  1 sibling, 2 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 16:26 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev

On Saturday 07 May 2011 at 16:26 +0100, Alex Bligh wrote:

> On the current 8 core box I am testing, I see 280ms per interface
> delete **even with only 10 interfaces**. I see 260ms with one
> interface. I know doing lots of rcu sync stuff can be slow, but
> 260ms to remove one veth pair sounds like more than rcu sync going
> on. It sounds like a sleep (though I may not have found the
> right one). I see no CPU load.
> 

Here, on 2.6.38 kernel (Ubuntu 11.04 provided, on my 2 core laptop)

# time rmmod dummy

real	0m0.111s
user	0m0.000s
sys	0m0.000s


This removed my two dummy0/dummy1 devices.

On another machine with a very recent kernel :
$ modprobe dummy numdummies=1
$ ifconfig dummy0 192.168.46.46 up
$ time rmmod dummy

real	0m0.032s
user	0m0.000s
sys	0m0.001s
$ uname -a
Linux svivoipvnx001 2.6.39-rc6-00097-g6ac1576-dirty #550 SMP Sat May 7
00:12:26 CEST 2011 i686 i686 i386 GNU/Linux


So 260ms is a bit too much, maybe you hit yet another bug.






* Re: Scalability of interface creation and deletion
  2011-05-07 16:23       ` Ben Greear
@ 2011-05-07 16:37         ` Eric Dumazet
  2011-05-07 16:44           ` Ben Greear
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 16:37 UTC (permalink / raw)
  To: Ben Greear; +Cc: Alex Bligh, netdev

On Saturday 07 May 2011 at 09:23 -0700, Ben Greear wrote:

> I wonder if it would be worth having a 'delete me soon'
> method to delete interfaces that would not block on the
> RCU code.
> 
> The controlling programs could use netlink messages to
> know exactly when an interface was truly gone.
> 
> That should allow some batching in the sync-net logic
> too, if user-space code deletes 1000 interfaces very
> quickly, for instance...
> 

In the past I suggested an extension of the batch capabilities, so that
one kthread could keep 3 separate lists of devices being destroyed in
parallel.

This daemon would basically loop on one call to synchronize_rcu(), then
transfer list3 to deletion, list2 to list3, and list1 to list2, and loop
again, eventually releasing RTNL while blocked in synchronize_rcu().

This would need to allow, as you suggest, an asynchronous deletion method,
or use a callback to wake the process blocked on the device delete.

Right now, we hold RTNL for the whole 3-step process, so we cannot use
any parallelism.
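
(Rough pseudocode of that loop, only to make the shape of the idea
concrete - the helpers are made up and this is not a real patch:)

	/* hypothetical deletion kthread: each pass pays for ONE
	 * synchronize_rcu() and advances every batch one stage, instead
	 * of every device paying for its own grace periods in series. */
	for (;;) {
		pull_new_requests(&list1);	/* devices just unlisted (made-up helper) */

		synchronize_rcu();		/* one grace period covers all lists */

		finish_and_free(&list3);	/* stage 3: refs gone, free them (made-up helper) */
		list_splice_init(&list2, &list3);	/* stage 2 -> stage 3 */
		list_splice_init(&list1, &list2);	/* stage 1 -> stage 2 */
	}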





* Re: Scalability of interface creation and deletion
  2011-05-07 16:37         ` Eric Dumazet
@ 2011-05-07 16:44           ` Ben Greear
  2011-05-07 16:51             ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Ben Greear @ 2011-05-07 16:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Alex Bligh, netdev

On 05/07/2011 09:37 AM, Eric Dumazet wrote:
> On Saturday 07 May 2011 at 09:23 -0700, Ben Greear wrote:
>
>> I wonder if it would be worth having a 'delete me soon'
>> method to delete interfaces that would not block on the
>> RCU code.
>>
>> The controlling programs could use netlink messages to
>> know exactly when an interface was truly gone.
>>
>> That should allow some batching in the sync-net logic
>> too, if user-space code deletes 1000 interfaces very
>> quickly, for instance...
>>
>
> I suggested in the past to have an extension of batch capabilities, so
> that one kthread could have 3 separate lists of devices being destroyed
> in parallel.
>
> This daemon would basically loop on one call to synchronize_rcu(), and
> transfer list3 to deletion, list2 to list3, list1 to list2, loop,
> eventually releasing RTNL while blocked in synchronize_rcu()
>
> This would need to allow as you suggest an asynchronous deletion method,
> or use a callback to wake the process blocked on device delete.

I'd want to at least have the option to not block the calling
process...otherwise, it would be a lot more difficult to
quickly delete 1000 interfaces.  You'd need 1000 threads, or
sockets, or something to parallelize it otherwise, eh?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Scalability of interface creation and deletion
  2011-05-07 16:44           ` Ben Greear
@ 2011-05-07 16:51             ` Eric Dumazet
  2011-05-08  3:45               ` Ben Greear
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 16:51 UTC (permalink / raw)
  To: Ben Greear; +Cc: Alex Bligh, netdev

On Saturday 07 May 2011 at 09:44 -0700, Ben Greear wrote:
> On 05/07/2011 09:37 AM, Eric Dumazet wrote:
> > On Saturday 07 May 2011 at 09:23 -0700, Ben Greear wrote:
> >
> >> I wonder if it would be worth having a 'delete me soon'
> >> method to delete interfaces that would not block on the
> >> RCU code.
> >>
> >> The controlling programs could use netlink messages to
> >> know exactly when an interface was truly gone.
> >>
> >> That should allow some batching in the sync-net logic
> >> too, if user-space code deletes 1000 interfaces very
> >> quickly, for instance...
> >>
> >
> > I suggested in the past to have an extension of batch capabilities, so
> > that one kthread could have 3 separate lists of devices being destroyed
> > in parallel.
> >
> > This daemon would basically loop on one call to synchronize_rcu(), and
> > transfer list3 to deletion, list2 to list3, list1 to list2, loop,
> > eventually releasing RTNL while blocked in synchronize_rcu()
> >
> > This would need to allow as you suggest an asynchronous deletion method,
> > or use a callback to wake the process blocked on device delete.
> 
> I'd want to at least have the option to not block the calling
> process...otherwise, it would be a lot more difficult to
> quickly delete 1000 interfaces.  You'd need 1000 threads, or
> sockets, or something to parallelize it otherwise, eh?

Yes, if you can afford not to receive a final notification of the device
being fully freed, it should be possible.




* Re: Scalability of interface creation and deletion
  2011-05-07 16:26     ` Eric Dumazet
@ 2011-05-07 18:24       ` Alex Bligh
  2011-05-07 18:32         ` Eric Dumazet
  2011-05-07 18:38       ` Alex Bligh
  1 sibling, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-07 18:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Alex Bligh

Eric,

--On 7 May 2011 18:26:29 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Here, on 2.6.38 kernel (Ubuntu 11.04 provided, on my 2 core laptop)
># time rmmod dummy
> real	0m0.111s
...
> On another machine with a very recent kernel :
> $ modprobe dummy numdummies=1
> $ ifconfig dummy0 192.168.46.46 up
> $ time rmmod dummy
>
> real	0m0.032s

I know it's different machines, but that's a pretty significant
difference. So I compiled from 2.6.39-rc6 head (i.e. a kernel
less than an hour old), with only your suggested change in,
so that (a) I could eliminate old kernels, and (b) I could
instrument it.

> synchronize_rcu() calls are not consuming cpu, they just _wait_
> rcu grace period.
>
> I suggest you read Documentation/RCU files if you really want to :)

I understand the basic point: it needs to wait for all readers
to drop their references. It's sort of hard to understand why
on a machine with an idle network there would be reader(s) holding
references for 250ms. And indeed the analysis below shows that
isn't the case (it's more like 44 ms).

> If you want to check how expensive it is, its quite easy:
> add a trace in synchronize_net()

At least for veth devices, I see the same on 2.6.39-rc6 - if anything
it's worse:

# ./ifseq -n 100
Sat May 7 17:50:53 UTC 2011 creating 100 interfaces
Sat May 7 17:50:54 UTC 2011 done

real	0m1.549s
user	0m0.060s
sys	0m0.990s
Sat May 7 17:50:54 UTC 2011 deleting 100 interfaces
Sat May 7 17:51:22 UTC 2011 done

real	0m27.917s
user	0m0.420s
sys	0m0.060s

Performing that operation produced exactly 200 calls to synchronize_net().
The timestamps indicate that's 2 per veth pair deletion, and zero
per veth pair creation.

Analysing the resultant logs shows only 31% of the problem is
time spent within synchronize_net() (perl script below).

$ ./analyse.pl < syncnet | tail -2
Total 18.98515 Usage 199 Average 0.09540 elsewhere
Total 8.77581 Usage 200 Average 0.04388 synchronizing

So *something* is spending more than twice as much time as
synchronize_net().

I've attached the log below as well.

-- 
Alex Bligh


$ cat analyse.pl
#!/usr/bin/perl

use strict;
use warnings;

my $lastuptime;
my $uptime;
my $diff;
my $area;
my %time;
my %usage;

while (<>)
{
    chomp;
    if (m/\[\s*([0-9.]+)\].*synchronize_net/)
    {
	$uptime = $1;
	if (defined($lastuptime))
	{
	    $area = (m/end/)?"synchronizing":"elsewhere";
	    $diff = $uptime - $lastuptime;
	    printf "%5.5f $area\n", $diff;
	    $time{$area}+=$diff;
	    $usage{$area}++;
	}
	$lastuptime = $uptime;
    }
}

print "\n";

my $k;
foreach $k (sort keys %time)
{
    printf "Total %5.5f Usage %d Average %5.5f %s\n", $time{$k}, 
$usage{$k}, $time{$k}/$usage{$k}, $k;
}



May  7 17:50:55 nattytest kernel: [  127.490142] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  127.560084] end synchronize_net()
May  7 17:50:55 nattytest kernel: [  127.610350] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  127.610932] end synchronize_net()
May  7 17:50:55 nattytest kernel: [  127.740078] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  127.820071] end synchronize_net()
May  7 17:50:55 nattytest kernel: [  127.870300] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  127.871050] end synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.000079] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.070070] end synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.140085] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.140960] end synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.260082] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.380072] end synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.430296] begin synchronize_net()
May  7 17:50:55 nattytest kernel: [  128.431135] end synchronize_net()
May  7 17:50:56 nattytest kernel: [  128.550087] begin synchronize_net()
May  7 17:50:56 nattytest kernel: [  128.640057] end synchronize_net()
May  7 17:50:56 nattytest kernel: [  128.710191] begin synchronize_net()
May  7 17:50:56 nattytest kernel: [  128.730085] end synchronize_net()
May  7 17:50:56 nattytest kernel: [  128.880074] begin synchronize_net()
May  7 17:50:56 nattytest kernel: [  128.990123] end synchronize_net()
May  7 17:50:56 nattytest kernel: [  129.060087] begin synchronize_net()
May  7 17:50:56 nattytest kernel: [  129.070128] end synchronize_net()
May  7 17:50:56 nattytest kernel: [  129.220079] begin synchronize_net()
May  7 17:50:56 nattytest kernel: [  129.310070] end synchronize_net()
May  7 17:50:56 nattytest kernel: [  129.370280] begin synchronize_net()
May  7 17:50:56 nattytest kernel: [  129.390099] end synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.540174] begin synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.620063] end synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.690196] begin synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.710098] end synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.850084] begin synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.930070] end synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.980314] begin synchronize_net()
May  7 17:50:57 nattytest kernel: [  129.990225] end synchronize_net()
May  7 17:50:57 nattytest kernel: [  130.110086] begin synchronize_net()
May  7 17:50:57 nattytest kernel: [  130.200078] end synchronize_net()
May  7 17:50:57 nattytest kernel: [  130.270187] begin synchronize_net()
May  7 17:50:57 nattytest kernel: [  130.280159] end synchronize_net()
May  7 17:50:57 nattytest kernel: [  130.420133] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.500075] end synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.550344] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.550998] end synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.680075] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.750071] end synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.800333] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.801105] end synchronize_net()
May  7 17:50:58 nattytest kernel: [  130.930071] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  131.010064] end synchronize_net()
May  7 17:50:58 nattytest kernel: [  131.080171] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  131.090129] end synchronize_net()
May  7 17:50:58 nattytest kernel: [  131.240078] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  131.320068] end synchronize_net()
May  7 17:50:58 nattytest kernel: [  131.390216] begin synchronize_net()
May  7 17:50:58 nattytest kernel: [  131.403088] end synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.540081] begin synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.610072] end synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.660314] begin synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.661094] end synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.790076] begin synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.860082] end synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.910294] begin synchronize_net()
May  7 17:50:59 nattytest kernel: [  131.911061] end synchronize_net()
May  7 17:50:59 nattytest kernel: [  132.030075] begin synchronize_net()
May  7 17:50:59 nattytest kernel: [  132.110079] end synchronize_net()
May  7 17:50:59 nattytest kernel: [  132.160319] begin synchronize_net()
May  7 17:50:59 nattytest kernel: [  132.161101] end synchronize_net()
May  7 17:50:59 nattytest kernel: [  132.280075] begin synchronize_net()
May  7 17:50:59 nattytest kernel: [  132.400066] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.450321] begin synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.451134] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.570078] begin synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.650080] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.700288] begin synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.701073] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.830066] begin synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.910069] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.990128] begin synchronize_net()
May  7 17:51:00 nattytest kernel: [  132.990893] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  133.120076] begin synchronize_net()
May  7 17:51:00 nattytest kernel: [  133.200071] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  133.250313] begin synchronize_net()
May  7 17:51:00 nattytest kernel: [  133.251088] end synchronize_net()
May  7 17:51:00 nattytest kernel: [  133.370082] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.450147] end synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.500354] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.505508] end synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.630085] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.710090] end synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.760399] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.770396] end synchronize_net()
May  7 17:51:01 nattytest kernel: [  133.920090] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  134.010083] end synchronize_net()
May  7 17:51:01 nattytest kernel: [  134.060292] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  134.080082] end synchronize_net()
May  7 17:51:01 nattytest kernel: [  134.220082] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  134.310064] end synchronize_net()
May  7 17:51:01 nattytest kernel: [  134.380176] begin synchronize_net()
May  7 17:51:01 nattytest kernel: [  134.390127] end synchronize_net()
May  7 17:51:02 nattytest kernel: [  134.550205] begin synchronize_net()
May  7 17:51:02 nattytest kernel: [  134.630133] end synchronize_net()
May  7 17:51:02 nattytest kernel: [  134.830065] begin synchronize_net()
May  7 17:51:02 nattytest kernel: [  134.880091] end synchronize_net()
May  7 17:51:02 nattytest kernel: [  135.040194] begin synchronize_net()
May  7 17:51:02 nattytest kernel: [  135.120072] end synchronize_net()
May  7 17:51:02 nattytest kernel: [  135.190202] begin synchronize_net()
May  7 17:51:02 nattytest kernel: [  135.210084] end synchronize_net()
May  7 17:51:02 nattytest kernel: [  135.370081] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.440076] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.490331] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.491147] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.620079] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.700064] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.757709] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.770095] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.920101] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  135.990068] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  136.050068] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  136.050825] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  136.160081] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  136.240067] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  136.290431] begin synchronize_net()
May  7 17:51:03 nattytest kernel: [  136.291020] end synchronize_net()
May  7 17:51:03 nattytest kernel: [  136.420094] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.500062] end synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.550388] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.550988] end synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.670101] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.750063] end synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.800297] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.801081] end synchronize_net()
May  7 17:51:04 nattytest kernel: [  136.920070] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  137.000069] end synchronize_net()
May  7 17:51:04 nattytest kernel: [  137.050252] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  137.051042] end synchronize_net()
May  7 17:51:04 nattytest kernel: [  137.180076] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  137.260065] end synchronize_net()
May  7 17:51:04 nattytest kernel: [  137.320191] begin synchronize_net()
May  7 17:51:04 nattytest kernel: [  137.340087] end synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.490082] begin synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.570071] end synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.620314] begin synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.621084] end synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.740083] begin synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.830071] end synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.890264] begin synchronize_net()
May  7 17:51:05 nattytest kernel: [  137.910087] end synchronize_net()
May  7 17:51:05 nattytest kernel: [  138.060074] begin synchronize_net()
May  7 17:51:05 nattytest kernel: [  138.140070] end synchronize_net()
May  7 17:51:05 nattytest kernel: [  138.210094] begin synchronize_net()
May  7 17:51:05 nattytest kernel: [  138.210940] end synchronize_net()
May  7 17:51:05 nattytest kernel: [  138.340089] begin synchronize_net()
May  7 17:51:05 nattytest kernel: [  138.410088] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.470306] begin synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.471080] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.590082] begin synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.670131] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.720268] begin synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.721034] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.850077] begin synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.920071] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.970305] begin synchronize_net()
May  7 17:51:06 nattytest kernel: [  138.971074] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  139.090082] begin synchronize_net()
May  7 17:51:06 nattytest kernel: [  139.170068] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  139.230297] begin synchronize_net()
May  7 17:51:06 nattytest kernel: [  139.231115] end synchronize_net()
May  7 17:51:06 nattytest kernel: [  139.350083] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.440057] end synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.510174] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.520124] end synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.680077] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.750087] end synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.840093] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.840867] end synchronize_net()
May  7 17:51:07 nattytest kernel: [  139.970079] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  140.080094] end synchronize_net()
May  7 17:51:07 nattytest kernel: [  140.130284] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  140.131056] end synchronize_net()
May  7 17:51:07 nattytest kernel: [  140.260076] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  140.330076] end synchronize_net()
May  7 17:51:07 nattytest kernel: [  140.380332] begin synchronize_net()
May  7 17:51:07 nattytest kernel: [  140.381364] end synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.510081] begin synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.590074] end synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.650289] begin synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.670086] end synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.800064] begin synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.880073] end synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.930267] begin synchronize_net()
May  7 17:51:08 nattytest kernel: [  140.931048] end synchronize_net()
May  7 17:51:08 nattytest kernel: [  141.050072] begin synchronize_net()
May  7 17:51:08 nattytest kernel: [  141.140067] end synchronize_net()
May  7 17:51:08 nattytest kernel: [  141.190328] begin synchronize_net()
May  7 17:51:08 nattytest kernel: [  141.200119] end synchronize_net()
May  7 17:51:08 nattytest kernel: [  141.360077] begin synchronize_net()
May  7 17:51:08 nattytest kernel: [  141.430074] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.480312] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.481110] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.600080] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.680081] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.730335] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.731136] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.860079] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.930070] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.980317] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  141.981114] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  142.110097] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  142.180074] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  142.230335] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  142.231113] end synchronize_net()
May  7 17:51:09 nattytest kernel: [  142.350080] begin synchronize_net()
May  7 17:51:09 nattytest kernel: [  142.430087] end synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.480310] begin synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.481084] end synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.600090] begin synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.720091] end synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.770310] begin synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.771130] end synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.900080] begin synchronize_net()
May  7 17:51:10 nattytest kernel: [  142.980070] end synchronize_net()
May  7 17:51:10 nattytest kernel: [  143.060109] begin synchronize_net()
May  7 17:51:10 nattytest kernel: [  143.080079] end synchronize_net()
May  7 17:51:10 nattytest kernel: [  143.230095] begin synchronize_net()
May  7 17:51:10 nattytest kernel: [  143.300044] end synchronize_net()
May  7 17:51:10 nattytest kernel: [  143.350319] begin synchronize_net()
May  7 17:51:10 nattytest kernel: [  143.351119] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.470076] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.560083] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.620209] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.640090] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.780078] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.860076] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.910284] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  143.911142] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.030075] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.110085] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.160313] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.161104] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.280080] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.360074] end synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.410294] begin synchronize_net()
May  7 17:51:11 nattytest kernel: [  144.411096] end synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.530092] begin synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.620072] end synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.680355] begin synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.700081] end synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.860073] begin synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.930075] end synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.980325] begin synchronize_net()
May  7 17:51:12 nattytest kernel: [  144.981155] end synchronize_net()
May  7 17:51:12 nattytest kernel: [  145.110079] begin synchronize_net()
May  7 17:51:12 nattytest kernel: [  145.180084] end synchronize_net()
May  7 17:51:12 nattytest kernel: [  145.230324] begin synchronize_net()
May  7 17:51:12 nattytest kernel: [  145.231098] end synchronize_net()
May  7 17:51:12 nattytest kernel: [  145.350077] begin synchronize_net()
May  7 17:51:12 nattytest kernel: [  145.430081] end synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.480300] begin synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.481050] end synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.610079] begin synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.690071] end synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.750370] begin synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.760378] end synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.900074] begin synchronize_net()
May  7 17:51:13 nattytest kernel: [  145.970077] end synchronize_net()
May  7 17:51:13 nattytest kernel: [  146.020298] begin synchronize_net()
May  7 17:51:13 nattytest kernel: [  146.021548] end synchronize_net()
May  7 17:51:13 nattytest kernel: [  146.150081] begin synchronize_net()
May  7 17:51:13 nattytest kernel: [  146.230073] end synchronize_net()
May  7 17:51:13 nattytest kernel: [  146.299959] begin synchronize_net()
May  7 17:51:13 nattytest kernel: [  146.310076] end synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.440080] begin synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.520065] end synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.580285] begin synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.590266] end synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.750063] begin synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.830084] end synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.890233] begin synchronize_net()
May  7 17:51:14 nattytest kernel: [  146.910088] end synchronize_net()
May  7 17:51:14 nattytest kernel: [  147.060081] begin synchronize_net()
May  7 17:51:14 nattytest kernel: [  147.140061] end synchronize_net()
May  7 17:51:14 nattytest kernel: [  147.200277] begin synchronize_net()
May  7 17:51:14 nattytest kernel: [  147.220089] end synchronize_net()
May  7 17:51:14 nattytest kernel: [  147.360081] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  147.450084] end synchronize_net()
May  7 17:51:15 nattytest kernel: [  147.510283] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  147.530135] end synchronize_net()
May  7 17:51:15 nattytest kernel: [  147.680075] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  147.760066] end synchronize_net()
May  7 17:51:15 nattytest kernel: [  147.830172] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  147.870065] end synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.000075] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.070065] end synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.120327] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.121099] end synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.240073] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.320061] end synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.370253] begin synchronize_net()
May  7 17:51:15 nattytest kernel: [  148.371001] end synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.500082] begin synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.580073] end synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.650192] begin synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.670095] end synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.820077] begin synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.910058] end synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.980223] begin synchronize_net()
May  7 17:51:16 nattytest kernel: [  148.990280] end synchronize_net()
May  7 17:51:16 nattytest kernel: [  149.130076] begin synchronize_net()
May  7 17:51:16 nattytest kernel: [  149.220104] end synchronize_net()
May  7 17:51:16 nattytest kernel: [  149.270373] begin synchronize_net()
May  7 17:51:16 nattytest kernel: [  149.300076] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.450078] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.530063] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.580337] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.581173] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.700072] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.780071] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.830307] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.831075] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  149.960083] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  150.030067] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  150.080323] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  150.081113] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  150.200076] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  150.280065] end synchronize_net()
May  7 17:51:17 nattytest kernel: [  150.330341] begin synchronize_net()
May  7 17:51:17 nattytest kernel: [  150.331164] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.460081] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.540064] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.600267] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.620084] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.780067] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.850065] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.900315] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  150.901093] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.020083] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.100069] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.150282] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.151067] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.280079] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.360070] end synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.410316] begin synchronize_net()
May  7 17:51:18 nattytest kernel: [  151.411133] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.540178] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.610071] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.660313] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.661125] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.780072] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.860074] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.910295] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  151.911124] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.040109] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.110066] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.160295] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.161049] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.280073] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.360085] end synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.410306] begin synchronize_net()
May  7 17:51:19 nattytest kernel: [  152.411060] end synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.530075] begin synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.610064] end synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.660270] begin synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.661284] end synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.830095] begin synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.910065] end synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.960314] begin synchronize_net()
May  7 17:51:20 nattytest kernel: [  152.961131] end synchronize_net()
May  7 17:51:20 nattytest kernel: [  153.090076] begin synchronize_net()
May  7 17:51:20 nattytest kernel: [  153.160083] end synchronize_net()
May  7 17:51:20 nattytest kernel: [  153.210293] begin synchronize_net()
May  7 17:51:20 nattytest kernel: [  153.211113] end synchronize_net()
May  7 17:51:20 nattytest kernel: [  153.340081] begin synchronize_net()
May  7 17:51:20 nattytest kernel: [  153.420067] end synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.470317] begin synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.471164] end synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.590082] begin synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.680063] end synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.740238] begin synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.750127] end synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.900077] begin synchronize_net()
May  7 17:51:21 nattytest kernel: [  153.980077] end synchronize_net()
May  7 17:51:21 nattytest kernel: [  154.080091] begin synchronize_net()
May  7 17:51:21 nattytest kernel: [  154.080872] end synchronize_net()
May  7 17:51:21 nattytest kernel: [  154.210077] begin synchronize_net()
May  7 17:51:21 nattytest kernel: [  154.290061] end synchronize_net()
May  7 17:51:21 nattytest kernel: [  154.340327] begin synchronize_net()
May  7 17:51:21 nattytest kernel: [  154.360089] end synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.510085] begin synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.580075] end synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.630300] begin synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.631065] end synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.770074] begin synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.880081] end synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.940349] begin synchronize_net()
May  7 17:51:22 nattytest kernel: [  154.960085] end synchronize_net()
May  7 17:51:22 nattytest kernel: [  155.130080] begin synchronize_net()
May  7 17:51:22 nattytest kernel: [  155.200079] end synchronize_net()
May  7 17:51:22 nattytest kernel: [  155.250334] begin synchronize_net()
May  7 17:51:22 nattytest kernel: [  155.251105] end synchronize_net()






* Re: Scalability of interface creation and deletion
  2011-05-07 18:24       ` Alex Bligh
@ 2011-05-07 18:32         ` Eric Dumazet
  2011-05-07 18:39           ` Eric Dumazet
                             ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 18:32 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev

On Saturday 07 May 2011 at 19:24 +0100, Alex Bligh wrote:
> Eric,
> 
> --On 7 May 2011 18:26:29 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > Here, on 2.6.38 kernel (Ubuntu 11.04 provided, on my 2 core laptop)
> ># time rmmod dummy
> > real	0m0.111s
> ...
> > On another machine with a very recent kernel :
> > $ modprobe dummy numdummies=1
> > $ ifconfig dummy0 192.168.46.46 up
> > $ time rmmod dummy
> >
> > real	0m0.032s
> 
> I know it's different machines, but that's a pretty significant
> difference. So I compiled from 2.6.39-rc6 head (i.e. a kernel
> less than an hour old), with only your suggested change in,
> so that (a) I could eliminate old kernels, and (b) I could
> instrument it.
> 
> > synchronize_rcu() calls are not consuming cpu, they just _wait_
> > rcu grace period.
> >
> > I suggest you read Documentation/RCU files if you really want to :)
> 
> I understand the basic point: it needs to wait for all readers
> to drop their references. It's sort of hard to understand why
> on a machine with an idle network there would be reader(s) holding
> references for 250ms. And indeed the analysis below shows that
> isn't the case (it's more like 44 ms).
> 
> > If you want to check how expensive it is, its quite easy:
> > add a trace in synchronize_net()
> 
> At least for veth devices, I see the same on 2.6.39-rc6 - if anything
> it's worse:
> 
> # ./ifseq -n 100
> Sat May 7 17:50:53 UTC 2011 creating 100 interfaces
> Sat May 7 17:50:54 UTC 2011 done
> 
> real	0m1.549s
> user	0m0.060s
> sys	0m0.990s
> Sat May 7 17:50:54 UTC 2011 deleting 100 interfaces
> Sat May 7 17:51:22 UTC 2011 done
> 
> real	0m27.917s
> user	0m0.420s
> sys	0m0.060s
> 
> Performing that operation produced exactly 200 calls to synchronize_net().
> The timestamps indicate that's 2 per veth pair deletion, and zero
> per veth pair creation.
> 
> Analysing the resultant logs shows only 31% of the problem is
> time spent within synchronize_net() (perl script below).
> 
> $ ./analyse.pl < syncnet | tail -2
> Total 18.98515 Usage 199 Average 0.09540 elsewhere
> Total 8.77581 Usage 200 Average 0.04388 synchronizing
> 
> So *something* is spending more than twice as much time as
> synchronize_net().
> 
> I've attached the log below as well.
> 
> -- 
> Alex Bligh
> 
> 
> $ cat analyse.pl
> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> 
> my $lastuptime;
> my $uptime;
> my $diff;
> my $area;
> my %time;
> my %usage;
> 
> while (<>)
> {
>     chomp;
>     if (m/\[\s*([0-9.]+)\].*synchronize_net/)
>     {
> 	$uptime = $1;
> 	if (defined($lastuptime))
> 	{
> 	    $area = (m/end/)?"synchronizing":"elsewhere";
> 	    $diff = $uptime - $lastuptime;
> 	    printf "%5.5f $area\n", $diff;
> 	    $time{$area}+=$diff;
> 	    $usage{$area}++;
> 	}
> 	$lastuptime = $uptime;
>     }
> }
> 
> print "\n";
> 
> my $k;
> foreach $k (sort keys %time)
> {
>     printf "Total %5.5f Usage %d Average %5.5f %s\n", $time{$k}, 
> $usage{$k}, $time{$k}/$usage{$k}, $k;
> }
> 
> 
> 
> May  7 17:50:55 nattytest kernel: [  127.490142] begin synchronize_net()
> May  7 17:50:55 nattytest kernel: [  127.560084] end synchronize_net()
> May  7 17:50:55 nattytest kernel: [  127.610350] begin synchronize_net()
> May  7 17:50:55 nattytest kernel: [  127.610932] end synchronize_net()
> May  7 17:50:55 nattytest kernel: [  127.740078] begin synchronize_net()
> May  7 17:50:55 nattytest kernel: [  127.820071] end synchronize_net()

Well, there is also one rcu_barrier() call that is expensive.
(It was changed from a synchronize_rcu() to an rcu_barrier() recently,
in commit ef885afb, in the 2.6.36 kernel.)

net/core/dev.c line 5167


http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ef885afbf8a37689afc1d9d545e2f3e7a8276c17

netdev_wait_allrefs() waits until all references to a device have vanished.

It currently uses a _very_ pessimistic 250 ms delay between each probe.
Some users reported that no more than 4 devices can be dismantled per
second, this is a pretty serious problem for some setups.

Most of the time, a refcount is about to be released by an RCU callback,
that is still in flight because rollback_registered_many() uses a
synchronize_rcu() call instead of rcu_barrier(). Problem is visible if
number of online cpus is one, because synchronize_rcu() is then a no op.

time to remove 50 ipip tunnels on a UP machine :

before patch : real 11.910s
after patch : real 1.250s

Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reported-by: Octavian Purdila <opurdila@ixiacom.com>
Reported-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
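
To spell out the distinction the quoted commit relies on, here is a sketch
of the reasoning (my own illustration of the usual RCU semantics, not the
commit's actual diff):

/*
 * synchronize_rcu() only waits for a grace period; it does NOT wait for
 * call_rcu() callbacks that are already queued.  rcu_barrier() does, so
 * any dev_put() performed from such a callback has completed before
 * netdev_wait_allrefs() starts polling the refcount.
 */
#include <linux/rcupdate.h>

static void dismantle_wait_sketch(void)
{
	synchronize_rcu();	/* pre-existing readers are done        */
	rcu_barrier();		/* queued call_rcu() callbacks have run */
	/* any refcount still held after this really is a bug */
}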




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 16:26     ` Eric Dumazet
  2011-05-07 18:24       ` Alex Bligh
@ 2011-05-07 18:38       ` Alex Bligh
  2011-05-07 18:44         ` Eric Dumazet
  1 sibling, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-07 18:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Alex Bligh



--On 7 May 2011 18:26:29 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:

># time rmmod dummy
>
> real	0m0.111s
> user	0m0.000s
> sys	0m0.000s
>
>
> This removed my two dummy0/dummy1 devices.

rmmod dummy even with numdummies=100 does only one synchronize_net() and
is quick (0.8ms).

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:32         ` Eric Dumazet
@ 2011-05-07 18:39           ` Eric Dumazet
  2011-05-08 10:09             ` Alex Bligh
  2011-05-07 18:42           ` Eric Dumazet
  2011-05-07 18:51           ` Alex Bligh
  2 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 18:39 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev

On Saturday, 7 May 2011 at 20:32 +0200, Eric Dumazet wrote:

Also you could patch synchronize_sched() itself instead of
synchronize_net()

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index dd4aea8..4af6e10 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1518,6 +1518,7 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
 void synchronize_sched(void)
 {
 	struct rcu_synchronize rcu;
+	ktime_t time_start = ktime_get();
 
 	if (rcu_blocking_is_gp())
 		return;
@@ -1529,6 +1530,7 @@ void synchronize_sched(void)
 	/* Wait for it. */
 	wait_for_completion(&rcu.completion);
 	destroy_rcu_head_on_stack(&rcu.head);
+	pr_err("synchronize_rcu() in %lld us\n", ktime_us_delta(ktime_get(), time_start));
 }
 EXPORT_SYMBOL_GPL(synchronize_sched);
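
The same trick can be used to time rcu_barrier(); a minimal sketch (my
assumption of how one might wrap the call site, e.g. in net/core/dev.c;
not a patch from this thread):

#include <linux/kernel.h>
#include <linux/ktime.h>
#include <linux/rcupdate.h>

static void timed_rcu_barrier(void)
{
	ktime_t t0 = ktime_get();

	rcu_barrier();
	pr_err("rcu_barrier() in %lld us\n",
	       ktime_us_delta(ktime_get(), t0));
}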
 



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:32         ` Eric Dumazet
  2011-05-07 18:39           ` Eric Dumazet
@ 2011-05-07 18:42           ` Eric Dumazet
  2011-05-07 18:50             ` Alex Bligh
  2011-05-08  7:12             ` Eric Dumazet
  2011-05-07 18:51           ` Alex Bligh
  2 siblings, 2 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 18:42 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev


Here is my trace for one device deletion on an 8-core machine:

[  800.447012] synchronize_rcu() in 15787 us
[  800.455013] synchronize_rcu() in 7682 us
[  800.464019] rcu_barrier() in 8487 us

Not that bad.

$ grep RCU .config
# RCU Subsystem
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
CONFIG_RCU_TRACE=y
CONFIG_RCU_FANOUT=32
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_RCU_FAST_NO_HZ is not set
CONFIG_TREE_RCU_TRACE=y



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:38       ` Alex Bligh
@ 2011-05-07 18:44         ` Eric Dumazet
  0 siblings, 0 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 18:44 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev

On Saturday, 7 May 2011 at 19:38 +0100, Alex Bligh wrote:
> 
> --On 7 May 2011 18:26:29 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> ># time rmmod dummy
> >
> > real	0m0.111s
> > user	0m0.000s
> > sys	0m0.000s
> >
> >
> > This removed my two dummy0/dummy1 devices.
> 
> rmmod dummy even with numdummies=100 does only one synchronize_net() and
> is quick (0.8ms).
> 

Yes, thanks to the batching we added some time ago to speed up module unload.

And because you didn't set up IP addresses on them ;)

for i in `seq 0 99`
do
ifconfig dummy$i 192.168.$i.1 up
done




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:42           ` Eric Dumazet
@ 2011-05-07 18:50             ` Alex Bligh
  2011-05-08  7:12             ` Eric Dumazet
  1 sibling, 0 replies; 53+ messages in thread
From: Alex Bligh @ 2011-05-07 18:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Alex Bligh



--On 7 May 2011 20:42:29 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Here is my trace here for one device deletion on one 8 core machine
>
> [  800.447012] synchronize_rcu() in 15787 us
> [  800.455013] synchronize_rcu() in 7682 us
> [  800.464019] rcu_barrier() in 8487 us

Would you mind trying it with my script to create veth devices? Kill udev
and do unshare -n first.

I've done this on two different sets of hardware now, with three kernels
spanning 18 months.

$ grep RCU .config
# RCU Subsystem
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=64
# CONFIG_RCU_FANOUT_EXACT is not set
CONFIG_RCU_FAST_NO_HZ=y
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set


-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:32         ` Eric Dumazet
  2011-05-07 18:39           ` Eric Dumazet
  2011-05-07 18:42           ` Eric Dumazet
@ 2011-05-07 18:51           ` Alex Bligh
  2011-05-07 19:24             ` Eric Dumazet
  2 siblings, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-07 18:51 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Alex Bligh



--On 7 May 2011 20:32:54 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Well, there is also one rcu_barrier() call that is expensive.
> (It was changed from one synchronize_rcu() to one rcu_barrier() lately
> in commit ef885afb , in 2.6.36 kernel)

I think you are saying it may be waiting in rcu_barrier(). I'll
instrument that later plus synchronize_sched().

> http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi
> tdiff;h=ef885afbf8a37689afc1d9d545e2f3e7a8276c17

OK, so in head, which I am using, rollback_registered_many(), which
previously had two calls to synchronize_net(), now has one, followed
by a call to rcu_barrier() at the bottom.

> netdev_wait_allrefs() waits that all references to a device vanishes.
>
> It currently uses a _very_ pessimistic 250 ms delay between each probe.
> Some users reported that no more than 4 devices can be dismantled per
> second, this is a pretty serious problem for some setups.

Right, that's what I patched before (see the patch attached to my
message from earlier today) to do an exponential backoff, i.e. a 5ms
sleep, then 10ms, then 20ms, but never more than 250ms. It made no
difference.

> time to remove 50 ipip tunnels on a UP machine :
>
> before patch : real 11.910s
> after patch : real 1.250s

Sadly I don't see that improvement!

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:51           ` Alex Bligh
@ 2011-05-07 19:24             ` Eric Dumazet
  0 siblings, 0 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-07 19:24 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev

On Saturday, 7 May 2011 at 19:51 +0100, Alex Bligh wrote:
> 
> --On 7 May 2011 20:32:54 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > Well, there is also one rcu_barrier() call that is expensive.
> > (It was changed from one synchronize_rcu() to one rcu_barrier() lately
> > in commit ef885afb , in 2.6.36 kernel)
> 
> I think you are saying it may be waiting in rcu_barrier(). I'll
> instrument that later plus synchronize_sched().
> 
> > http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi
> > tdiff;h=ef885afbf8a37689afc1d9d545e2f3e7a8276c17
> 
> OK, so in head, which I am using, rollback_registered_many(), which
> previously had two calls to synchronize_net(), now has one, followed
> by a call to rcu_barrier() at the bottom.
> 

Each device dismantle needs two synchronize_rcu() calls and one rcu_barrier() call.


> Right, that's what I patched before (see patch attached to
> message from earlier today) to do an exponential backoff (see
> previous entry), i.e. do a 5ms sleep, then a 10ms, then a 20ms, but
> never more than 250ms. It made no difference.
> 

Oh well. How many times are you going to tell us about this?

We suggested waiting no more than 1 ms, or even shouting asap.

If, after the synchronize_rcu() and rcu_barrier() calls, there are still
references to the device, then there is a BUG somewhere.

Since these bugs are usually not fatal, we just wait a bit.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 16:51             ` Eric Dumazet
@ 2011-05-08  3:45               ` Ben Greear
  2011-05-08  8:08                 ` Alex Bligh
  0 siblings, 1 reply; 53+ messages in thread
From: Ben Greear @ 2011-05-08  3:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Alex Bligh, netdev

On 05/07/2011 09:51 AM, Eric Dumazet wrote:
> On Saturday, 7 May 2011 at 09:44 -0700, Ben Greear wrote:
>> On 05/07/2011 09:37 AM, Eric Dumazet wrote:
>>> On Saturday, 7 May 2011 at 09:23 -0700, Ben Greear wrote:
>>>
>>>> I wonder if it would be worth having a 'delete me soon'
>>>> method to delete interfaces that would not block on the
>>>> RCU code.
>>>>
>>>> The controlling programs could use netlink messages to
>>>> know exactly when an interface was truly gone.
>>>>
>>>> That should allow some batching in the sync-net logic
>>>> too, if user-space code deletes 1000 interfaces very
>>>> quickly, for instance...
>>>>
>>>
>>> I suggested in the past to have an extension of batch capabilities, so
>>> that one kthread could have 3 separate lists of devices being destroyed
>>> in parallel.
>>>
>>> This daemon would basically loop on one call to synchronize_rcu(), and
>>> transfer list3 to deletion, list2 to list3, list1 to list2, loop,
>>> eventually releasing RTNL while blocked in synchronize_rcu().
>>>
>>> This would need to allow, as you suggest, an asynchronous deletion
>>> method, or use a callback to wake the process blocked on device delete.
>>
>> I'd want to at least have the option to not block the calling
>> process...otherwise, it would be a lot more difficult to
>> quickly delete 1000 interfaces.  You'd need 1000 threads, or
>> sockets, or something to parallelize it otherwise, eh?
>
> Yes, if you can afford not to receive a final notification of the device
> being fully freed, it should be possible.

Well, I'd hope to get a netlink message about the device being deleted, and
after that, be able to create another one with the same name, etc.

Whether the memory is actually freed in the kernel or not wouldn't matter
to me...
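
For what it's worth, a very rough sketch of the pipelined destroy Eric
describes above (all names here, destroy_stageN, dismantle_worker,
free_devices, are hypothetical, and locking/wakeups are omitted; the point
is only that one synchronize_rcu() per pass services a whole batch):

#include <linux/kthread.h>
#include <linux/list.h>
#include <linux/rcupdate.h>

static LIST_HEAD(destroy_stage1);	/* just unlinked, still visible    */
static LIST_HEAD(destroy_stage2);	/* one grace period has elapsed    */
static LIST_HEAD(destroy_stage3);	/* two have elapsed, ready to free */

static void free_devices(struct list_head *list);	/* hypothetical helper */

static int dismantle_worker(void *unused)
{
	while (!kthread_should_stop()) {
		synchronize_rcu();		/* one grace period per pass */

		free_devices(&destroy_stage3);
		list_splice_init(&destroy_stage2, &destroy_stage3);
		list_splice_init(&destroy_stage1, &destroy_stage2);
	}
	return 0;
}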

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:42           ` Eric Dumazet
  2011-05-07 18:50             ` Alex Bligh
@ 2011-05-08  7:12             ` Eric Dumazet
  2011-05-08  8:06               ` Alex Bligh
                                 ` (2 more replies)
  1 sibling, 3 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-08  7:12 UTC (permalink / raw)
  To: Alex Bligh; +Cc: netdev, Paul E. McKenney

On Saturday, 7 May 2011 at 20:42 +0200, Eric Dumazet wrote:
> Here is my trace here for one device deletion on one 8 core machine
> 
> [  800.447012] synchronize_rcu() in 15787 us
> [  800.455013] synchronize_rcu() in 7682 us
> [  800.464019] rcu_barrier() in 8487 us
> 
> Not that bad.
> 
> $ grep RCU .config
> # RCU Subsystem
> CONFIG_TREE_RCU=y
> # CONFIG_PREEMPT_RCU is not set
> CONFIG_RCU_TRACE=y
> CONFIG_RCU_FANOUT=32
> # CONFIG_RCU_FANOUT_EXACT is not set
> # CONFIG_RCU_FAST_NO_HZ is not set
> CONFIG_TREE_RCU_TRACE=y
> 

By the way, if I change HZ from 1000 to 100 I now get a result that is ten
times slower:

# ip link add link eth0 eth0.103 type vlan id 103
# time ip link del eth0.103

real	0m0.430s
user	0m0.000s
sys	0m0.000s

So all this is related to your HZ value, even in a CONFIG_NO_HZ=y
kernel. Alex, I guess you have HZ=250?

# uname -a
Linux svivoipvnx021 2.6.39-rc6-00214-g5511a34-dirty #574 SMP Sun May 8
08:44:14 CEST 2011 x86_64 x86_64 x86_64 GNU/Linux
# cat /proc/cmdline

I enabled CONFIG_RCU_FAST_NO_HZ and got worse results (but not
always... it's very variable).

# time ip link del eth0.103

real	0m0.544s
user	0m0.000s
sys	0m0.000s


# time ip link del eth0.103

real	0m0.414s
user	0m0.000s
sys	0m0.000s



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08  7:12             ` Eric Dumazet
@ 2011-05-08  8:06               ` Alex Bligh
  2011-05-08  9:35               ` Alex Bligh
  2011-05-08 12:32               ` Paul E. McKenney
  2 siblings, 0 replies; 53+ messages in thread
From: Alex Bligh @ 2011-05-08  8:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Paul E. McKenney, Alex Bligh

Eric,

> By the way, if I change HZ from 1000 to 100 I now have ten times slower
> result :
>
># ip link add link eth0 eth0.103 type vlan id 103
># time ip link del eth0.103
>
> real	0m0.430s
> user	0m0.000s
> sys	0m0.000s
>
> So all this is related to your HZ value, even in a CONFIG_NO_HZ=y
> kernel.

That's very mysterious.

> Alex, I guess you have HZ=250 ?

I have HZ=100. I am basically using the Ubuntu default with
localmodconfig to make compile times sensible.

amb@nattytest:~$ cd kernel/linux-2.6/
amb@nattytest:~/kernel/linux-2.6$ fgrep HZ .config
CONFIG_RCU_FAST_NO_HZ=y
CONFIG_NO_HZ=y
CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100
# CONFIG_MACHZ_WDT is not set



-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08  3:45               ` Ben Greear
@ 2011-05-08  8:08                 ` Alex Bligh
  0 siblings, 0 replies; 53+ messages in thread
From: Alex Bligh @ 2011-05-08  8:08 UTC (permalink / raw)
  To: Ben Greear, Eric Dumazet; +Cc: netdev, Alex Bligh



--On 7 May 2011 20:45:07 -0700 Ben Greear <greearb@candelatech.com> wrote:

> Well, I'd hope to get a netlink message about the device being deleted,
> and
> after that, be able to create another one with the same name, etc.
>
> Whether the memory is actually freed in the kernel or not wouldn't matter
> to me...

Provided the former para is always done, I can't actually think of a case
where the caller would /ever/ care about the latter (save perhaps
a final shutdown of the whole net subsystem).

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08  7:12             ` Eric Dumazet
  2011-05-08  8:06               ` Alex Bligh
@ 2011-05-08  9:35               ` Alex Bligh
  2011-05-08 12:18                 ` Alex Bligh
  2011-05-08 12:44                 ` Paul E. McKenney
  2011-05-08 12:32               ` Paul E. McKenney
  2 siblings, 2 replies; 53+ messages in thread
From: Alex Bligh @ 2011-05-08  9:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Paul E. McKenney, Alex Bligh

Eric,

--On 8 May 2011 09:12:22 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> By the way, if I change HZ from 1000 to 100 I now have ten times slower
> result :

I repeated that test here. With HZ set to 1000 I got a total time of
4.022 seconds to remove 100 interfaces, of which:

 Total 3.03808 Usage 199 Average 0.01527 elsewhere
 Total 0.93992 Usage 200 Average 0.00470 synchronizing

as opposed to a total of 27.917 seconds with HZ set to 100, of which

 Total 18.98515 Usage 199 Average 0.09540 elsewhere
 Total 8.77581 Usage 200 Average 0.04388 synchronizing

Not quite a factor of 10 improvement, but nearly.

I have CONFIG_RCU_FAST_NO_HZ=y

I suspect this may just mean an RCU reader holds rcu_read_lock()
for a jiffies-related time, though I'm having difficulty seeing
what that might be on a system where the net is essentially idle.

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-07 18:39           ` Eric Dumazet
@ 2011-05-08 10:09             ` Alex Bligh
  0 siblings, 0 replies; 53+ messages in thread
From: Alex Bligh @ 2011-05-08 10:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Alex Bligh



--On 7 May 2011 20:39:13 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Saturday, 7 May 2011 at 20:32 +0200, Eric Dumazet wrote:
>
> Also you could patch synchronize_sched() itself instead of
> synchronize_net()

OK, I did this, plus instrumented the call to rcu_barrier()
you mentioned:

Looking at the synchronize_net() and rcu_barrier() calls:
 Total 8.43935 Usage 399 Average 0.02115 elsewhere
 Total 10.65050 Usage 200 Average 0.05325 rcu_barrier
 Total 9.28948 Usage 200 Average 0.04645 synchronize_net

it's spending about 1/3 of its time in that rcu_barrier(), 1/3
in synchronize_sched(), and 1/3 elsewhere.

Turning now to the synchronize_sched() (per your patch), I see

  Total 16.36852 Usage 400 Average 0.04092 synchronize_sched()

Note "Usage 400". That's because precisely half the calls to
synchronize_sched() occur outside of synchronize_net(), and
half occur within synchronize_net() (per logs)

A typical interface being removed looks like this:
May  8 09:47:31 nattytest kernel: [  177.030197] synchronize_sched() in 66921 us
May  8 09:47:31 nattytest kernel: [  177.030957] begin synchronize_net()
May  8 09:47:31 nattytest kernel: [  177.120085] synchronize_sched() in 89080 us
May  8 09:47:31 nattytest kernel: [  177.120819] end synchronize_net()
May  8 09:47:31 nattytest kernel: [  177.121698] begin rcu_barrier()
May  8 09:47:31 nattytest kernel: [  177.190152] end rcu_barrier()


So for every interface being destroyed (I'm doing 200 as veths
come in pairs), we do 2 synchronize_sched() calls and 1 rcu_barrier().
Each of these takes roughly 42ms with CONFIG_HZ set to 100,
leading to ~125ms per interface destroyed, and ~250ms per veth
pair destroyed.

It may be a naive question but why would we need to do
2 synchronize_sched() and 1 rcu_barrier() to remove an
interface?

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08  9:35               ` Alex Bligh
@ 2011-05-08 12:18                 ` Alex Bligh
  2011-05-08 12:50                   ` Paul E. McKenney
  2011-05-08 12:44                 ` Paul E. McKenney
  1 sibling, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-08 12:18 UTC (permalink / raw)
  To: Alex Bligh, Eric Dumazet; +Cc: netdev, Paul E. McKenney, Alex Bligh



--On 8 May 2011 10:35:02 +0100 Alex Bligh <alex@alex.org.uk> wrote:

> I suspect this may just mean an rcu reader holds the rcu_read_lock
> for a jiffies related time. Though I'm having difficulty seeing
> what that might be on a system where the net is in essence idle.

Having read the RCU docs, this can't be right, because blocking
is not legal inside an rcu_read_lock() critical section.

The system concerned is an 8 cpu system but I get comparable
results on a 2 cpu system.

I am guessing that when the synchronize_sched() happens, all cores
but the cpu on which it is executing are idle (at least on
the vast majority of calls), as the machine itself is idle.
As I understand it, RCU synchronization (in the absence of lots
of callbacks etc.) is meant to wait until it knows that all RCU
read-side critical sections which were running on entry have
been left. It exploits the fact that RCU read-side critical sections
cannot block, by waiting for a context switch on each cpu, OR
for that cpu to be in the idle state or running user code (also
incompatible with a read-side critical section).
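
(For reference, the reader-side constraint being relied on here is just the
canonical pattern; a minimal sketch of my own, not code from this thread:

#include <linux/rcupdate.h>

struct foo { int x; };
static struct foo __rcu *gp;		/* some RCU-protected pointer */

static int reader(void)
{
	struct foo *p;
	int val = -1;

	rcu_read_lock();
	p = rcu_dereference(gp);
	if (p)
		val = p->x;	/* may not sleep inside the section */
	rcu_read_unlock();
	return val;
}

Once every cpu has passed through a quiescent state, i.e. a context switch,
the idle loop, or user code, no reader that started before the grace period
can still be inside such a section.)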

The fact that increasing HZ masks the problem seems to imply that
synchronize_sched() is waiting when it shouldn't be, as it suggests
it's waiting for a context switch. But surely it shouldn't be
waiting for a context switch if all other cpu cores are idle?
It knows that it (the caller) doesn't hold rcu_read_lock(),
and presumably can see the other cpus are in the idle state,
in which case surely it should return immediately? The distribution
of latency in synchronize_sched() looks like this:

 20-49 us 110 instances (27.500%)
 50-99 us 45 instances (11.250%)
 5000-9999 us 5 instances (1.250%)
 10000-19999 us 33 instances (8.250%)
 20000-49999 us 4 instances (1.000%)
 50000-99999 us 191 instances (47.750%)
 100000-199999 us 12 instances (3.000%)

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08  7:12             ` Eric Dumazet
  2011-05-08  8:06               ` Alex Bligh
  2011-05-08  9:35               ` Alex Bligh
@ 2011-05-08 12:32               ` Paul E. McKenney
  2 siblings, 0 replies; 53+ messages in thread
From: Paul E. McKenney @ 2011-05-08 12:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Alex Bligh, netdev

On Sun, May 08, 2011 at 09:12:22AM +0200, Eric Dumazet wrote:
> On Saturday, 7 May 2011 at 20:42 +0200, Eric Dumazet wrote:
> > Here is my trace here for one device deletion on one 8 core machine
> > 
> > [  800.447012] synchronize_rcu() in 15787 us
> > [  800.455013] synchronize_rcu() in 7682 us
> > [  800.464019] rcu_barrier() in 8487 us
> > 
> > Not that bad.

There is always synchronize_rcu_expedited() if you need lower latency
and can tolerate a bit higher CPU overhead.
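
A minimal sketch of what that could look like, should someone want to try
it (an assumption on my part, not something proposed or merged in this
thread; synchronize_rcu_expedited() trades extra CPU work and IPIs for a
much shorter wait than a normal grace period):

#include <linux/kernel.h>
#include <linux/rcupdate.h>

/* hypothetical expedited counterpart to synchronize_net() */
void synchronize_net_expedited(void)
{
	might_sleep();
	synchronize_rcu_expedited();
}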

							Thanx, Paul

> > $ grep RCU .config
> > # RCU Subsystem
> > CONFIG_TREE_RCU=y
> > # CONFIG_PREEMPT_RCU is not set
> > CONFIG_RCU_TRACE=y
> > CONFIG_RCU_FANOUT=32
> > # CONFIG_RCU_FANOUT_EXACT is not set
> > # CONFIG_RCU_FAST_NO_HZ is not set
> > CONFIG_TREE_RCU_TRACE=y
> > 
> 
> By the way, if I change HZ from 1000 to 100 I now have ten times slower
> result :
> 
> # ip link add link eth0 eth0.103 type vlan id 103
> # time ip link del eth0.103
> 
> real	0m0.430s
> user	0m0.000s
> sys	0m0.000s
> 
> So all this is related to your HZ value, even in a CONFIG_NO_HZ=y
> kernel. Alex, I guess you have HZ=250 ?
> 
> # uname -a
> Linux svivoipvnx021 2.6.39-rc6-00214-g5511a34-dirty #574 SMP Sun May 8
> 08:44:14 CEST 2011 x86_64 x86_64 x86_64 GNU/Linux
> # cat /proc/cmdline
> 
> I enabled CONFIG_RCU_FAST_NO_HZ and got worse results (but not
> alsways... its very variable)
> 
> # time ip link del eth0.103
> 
> real	0m0.544s
> user	0m0.000s
> sys	0m0.000s
> 
> 
> # time ip link del eth0.103
> 
> real	0m0.414s
> user	0m0.000s
> sys	0m0.000s
> 
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08  9:35               ` Alex Bligh
  2011-05-08 12:18                 ` Alex Bligh
@ 2011-05-08 12:44                 ` Paul E. McKenney
  2011-05-08 13:06                   ` Alex Bligh
  1 sibling, 1 reply; 53+ messages in thread
From: Paul E. McKenney @ 2011-05-08 12:44 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Eric Dumazet, netdev

On Sun, May 08, 2011 at 10:35:02AM +0100, Alex Bligh wrote:
> Eric,
> 
> --On 8 May 2011 09:12:22 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> >By the way, if I change HZ from 1000 to 100 I now have ten times slower
> >result :
> 
> I repeated that test here. With HZ set to 1000 I got a total time of
> 4.022 seconds to remove 100 interfaces, of which:
> 
> Total 3.03808 Usage 199 Average 0.01527 elsewhere
> Total 0.93992 Usage 200 Average 0.00470 synchronizing
> 
> as opposed to a total of 27.917 seconds with HZ set to 100, of which
> 
> Total 18.98515 Usage 199 Average 0.09540 elsewhere
> Total 8.77581 Usage 200 Average 0.04388 synchronizing
> 
> Not quite a factor of 10 improvement, but nearly.
> 
> I have CONFIG_RCU_FAST_NO_HZ=y
> 
> I suspect this may just mean an rcu reader holds the rcu_read_lock
> for a jiffies related time. Though I'm having difficulty seeing
> what that might be on a system where the net is in essence idle.

OK, let's break it out...

4.022 seconds for 100 interfaces means about 40 milliseconds per interface.

My guess is that you have CONFIG_NO_HZ=y, which means that RCU needs to
figure out that various CPUs are in dyntick-idle state, which is a minimum
of 6 jiffies.  It could be longer if a given CPU happens to be in IRQ
when RCU checks, so call it 9 jiffies.  If you are doing the interfaces
synchronously, you will likely have to wait for a prior grace period (due
to background activity).  So I can easily imagine 18 milliseconds for
HZ=1000.  40 milliseconds sounds a bit high, but perhaps not impossible.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 12:18                 ` Alex Bligh
@ 2011-05-08 12:50                   ` Paul E. McKenney
  2011-05-08 13:13                     ` Alex Bligh
  0 siblings, 1 reply; 53+ messages in thread
From: Paul E. McKenney @ 2011-05-08 12:50 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Eric Dumazet, netdev

On Sun, May 08, 2011 at 01:18:55PM +0100, Alex Bligh wrote:
> 
> 
> --On 8 May 2011 10:35:02 +0100 Alex Bligh <alex@alex.org.uk> wrote:
> 
> >I suspect this may just mean an rcu reader holds the rcu_read_lock
> >for a jiffies related time. Though I'm having difficulty seeing
> >what that might be on a system where the net is in essence idle.
> 
> Having read the RCU docs, this can't be right, because blocking
> is not legal when in the rcu_read_lock critical section.
> 
> The system concerned is an 8 cpu system but I get comparable
> results on a 2 cpu system.
> 
> I am guessing that when the synchronize_sched() happens, all cores
> but the cpu on which that is executing are idle (at least on
> the vast majority of calls) as the machine itself is idle.
> As I understand, RCU synchronization (in the absence of lots
> of callbacks etc.) is meant to wait until it knows all RCU
> read critical sections which are running on entry have
> been left. It exploits the fact that RCU read critical sections
> cannot block by waiting for a context switch on each cpu, OR
> for that cpu to be in the idle state or running user code (also
> incompatible with a read critical section).
> 
> The fact that increasing HZ masks the problem seems to imply that
> sychronize_sched() is waiting when it shouldn't be, as it suggests
> it's waiting for a context switch. But surely it shouldn't be
> waiting for context switch if all other cpu cores are idle?
> It knows that it (the caller) doesn't hold an rcu_read_lock,
> and presumably can see the other cpus are in the idle state,
> in which case surely it should return immediately? Distribution
> of latency in synchronize_sched() looks like this:
> 
> 20-49 us 110 instances (27.500%)
> 50-99 us 45 instances (11.250%)

Really?  I am having a hard time believing the above two.  Is this really
2000-4999 us and 5000-9999 us?  That would be much more believable,
and expected on a busy system with lots of context switching.  Or on a
system with CONFIG_NO_HZ=n.

> 5000-9999 us 5 instances (1.250%)

This makes sense for a mostly-idle system with frequent short bursts
of work.

> 10000-19999 us 33 instances (8.250%)

This makes sense for a CONFIG_NO_HZ system that is idle, where there
is some amount of background work that is also using RCU grace periods.

> 20000-49999 us 4 instances (1.000%)
> 50000-99999 us 191 instances (47.750%)
> 100000-199999 us 12 instances (3.000%)

These last involve additional delays.  Possibilities include long-running
irq handlers, SMIs, or NMIs.

								Thanx, Paul

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 12:44                 ` Paul E. McKenney
@ 2011-05-08 13:06                   ` Alex Bligh
  2011-05-08 13:14                     ` Alex Bligh
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-08 13:06 UTC (permalink / raw)
  To: paulmck; +Cc: Eric Dumazet, netdev, Alex Bligh



--On 8 May 2011 05:44:28 -0700 "Paul E. McKenney" 
<paulmck@linux.vnet.ibm.com> wrote:

> OK, let's break it out...
>
> 4.022 seconds for 100 interfaces means about 40 milliseconds per
> interface.

It's 200 interfaces as they are veth pairs, so 20ms.

> My guess is that you have CONFIG_NO_HZ=y, which means that RCU needs to
> figure out that various CPUs are in dyntick-idle state, which is a minimum
> of 6 jiffies.

That sounds like the main cause.

> It could be longer if a given CPU happens to be in IRQ
> when RCU checks, so call it 9 jiffies.

That sounds unlikely to happen much of the time. Looking at /proc/interrupts
not much is going on.

> If you are doing the interfaces
> synchronously, you will likely have to wait for a prior grace period (due
> to background activity).

Not sure I understand that (there's nothing else going on), but...

> So I can easily imagine 18 milliseconds for
> HZ=1000.  40 milliseconds sounds a bit high, but perhaps not impossible.

Add to that the fact that there are 2 x synchronize_sched() and one
rcu_barrier() per interface removal. If I ignore your IRQ idea, that's
3 x 6 = 18 jiffies per interface removal: 18ms at CONFIG_HZ=1000, and
180ms at CONFIG_HZ=100.

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 12:50                   ` Paul E. McKenney
@ 2011-05-08 13:13                     ` Alex Bligh
  2011-05-08 13:44                       ` Paul E. McKenney
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-08 13:13 UTC (permalink / raw)
  To: paulmck; +Cc: Eric Dumazet, netdev, Alex Bligh


>> The fact that increasing HZ masks the problem seems to imply that
>> sychronize_sched() is waiting when it shouldn't be, as it suggests
>> it's waiting for a context switch. But surely it shouldn't be
>> waiting for context switch if all other cpu cores are idle?
>> It knows that it (the caller) doesn't hold an rcu_read_lock,
>> and presumably can see the other cpus are in the idle state,
>> in which case surely it should return immediately? Distribution
>> of latency in synchronize_sched() looks like this:
>>
>> 20-49 us 110 instances (27.500%)
>> 50-99 us 45 instances (11.250%)
>
> Really?  I am having a hard time believing this above two.  Is this really
> 2000-4999 us and 5000-9999 us?  That would be much more believable,
> and expected on a busy system with lots of context switching.  Or on a
> system with CONFIG_NO_HZ=n.

Yes, really 20-49us and 50-99us, not ms. Raw data attached :-)

I'm guessing there are circumstances where there is an early exit.

$ fgrep HZ .config
CONFIG_RCU_FAST_NO_HZ=y
CONFIG_NO_HZ=y
CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100
# CONFIG_MACHZ_WDT is not set

>
>> 5000-9999 us 5 instances (1.250%)
>
> This makes sense for a mostly-idle system with frequent short bursts
> of work.
>
>> 10000-19999 us 33 instances (8.250%)
>
> This makes sense for a CONFIG_NO_HZ system that is idle, where there
> is some amount of background work that is also using RCU grace periods.
>
>> 20000-49999 us 4 instances (1.000%)
>> 50000-99999 us 191 instances (47.750%)
>> 100000-199999 us 12 instances (3.000%)
>
> These last involve additional delays.  Possibilities include long-running
> irq handlers, SMIs, or NMIs.

There is nothing much going on these systems (idle, no other users,
just normal system daemons).

Note this is with CONFIG_HZ=100 (which is the Ubuntu default). I wonder
if that means you need to scale your expectations by a factor of 10 (i.e.
you'd expect 50000-99999 us for "a mostly-idle system with frequent short
bursts of work").

Interestingly there are quite a few of these on boot up. If this
is a CONFIG_HZ thing, then setting it to 1000 would probably take
half a second off bootup time.


-- 
Alex Bligh


May  8 09:47:31 nattytest kernel: [  177.030197] synchronize_sched() in 
66921 us
May  8 09:47:31 nattytest kernel: [  177.120085] synchronize_sched() in 
89080 us
May  8 09:47:31 nattytest kernel: [  177.190910] synchronize_sched() in 46 
us
May  8 09:47:31 nattytest kernel: [  177.210087] synchronize_sched() in 
18686 us
May  8 09:47:32 nattytest kernel: [  177.400080] synchronize_sched() in 
110609 us
May  8 09:47:32 nattytest kernel: [  177.480071] synchronize_sched() in 
78550 us
May  8 09:47:32 nattytest kernel: [  177.550119] synchronize_sched() in 
19087 us
May  8 09:47:32 nattytest kernel: [  177.551761] synchronize_sched() in 79 
us
May  8 09:47:32 nattytest kernel: [  177.670085] synchronize_sched() in 
60616 us
May  8 09:47:32 nattytest kernel: [  177.760075] synchronize_sched() in 
88465 us
May  8 09:47:32 nattytest kernel: [  177.820999] synchronize_sched() in 45 
us
May  8 09:47:32 nattytest kernel: [  177.840086] synchronize_sched() in 
17579 us
May  8 09:47:32 nattytest kernel: [  177.970084] synchronize_sched() in 
70925 us
May  8 09:47:32 nattytest kernel: [  178.050092] synchronize_sched() in 
78477 us
May  8 09:47:32 nattytest kernel: [  178.101046] synchronize_sched() in 39 
us
May  8 09:47:32 nattytest kernel: [  178.102574] synchronize_sched() in 48 
us
May  8 09:47:33 nattytest kernel: [  178.230082] synchronize_sched() in 
70966 us
May  8 09:47:33 nattytest kernel: [  178.310071] synchronize_sched() in 
78503 us
May  8 09:47:33 nattytest kernel: [  178.371056] synchronize_sched() in 43 
us
May  8 09:47:33 nattytest kernel: [  178.390094] synchronize_sched() in 
17553 us
May  8 09:47:33 nattytest kernel: [  178.540084] synchronize_sched() in 
70533 us
May  8 09:47:33 nattytest kernel: [  178.620074] synchronize_sched() in 
79172 us
May  8 09:47:33 nattytest kernel: [  178.671124] synchronize_sched() in 57 
us
May  8 09:47:33 nattytest kernel: [  178.672645] synchronize_sched() in 49 
us
May  8 09:47:33 nattytest kernel: [  178.830106] synchronize_sched() in 
100955 us
May  8 09:47:33 nattytest kernel: [  178.910084] synchronize_sched() in 
78408 us
May  8 09:47:33 nattytest kernel: [  178.961090] synchronize_sched() in 43 
us
May  8 09:47:33 nattytest kernel: [  178.962642] synchronize_sched() in 50 
us
May  8 09:47:33 nattytest kernel: [  179.090088] synchronize_sched() in 
71354 us
May  8 09:47:33 nattytest kernel: [  179.220071] synchronize_sched() in 
128463 us
May  8 09:47:34 nattytest kernel: [  179.300867] synchronize_sched() in 52 
us
May  8 09:47:34 nattytest kernel: [  179.302079] synchronize_sched() in 48 
us
May  8 09:47:34 nattytest kernel: [  179.420085] synchronize_sched() in 
61562 us
May  8 09:47:34 nattytest kernel: [  179.500093] synchronize_sched() in 
78506 us
May  8 09:47:34 nattytest kernel: [  179.551048] synchronize_sched() in 42 
us
May  8 09:47:34 nattytest kernel: [  179.552637] synchronize_sched() in 51 
us
May  8 09:47:34 nattytest kernel: [  179.670088] synchronize_sched() in 
61469 us
May  8 09:47:34 nattytest kernel: [  179.760078] synchronize_sched() in 
88430 us
May  8 09:47:34 nattytest kernel: [  179.830918] synchronize_sched() in 44 
us
May  8 09:47:34 nattytest kernel: [  179.850173] synchronize_sched() in 
17783 us
May  8 09:47:34 nattytest kernel: [  180.080085] synchronize_sched() in 
150585 us
May  8 09:47:34 nattytest kernel: [  180.160082] synchronize_sched() in 
78381 us
May  8 09:47:34 nattytest kernel: [  180.211027] synchronize_sched() in 39 
us
May  8 09:47:34 nattytest kernel: [  180.212186] synchronize_sched() in 52 
us
May  8 09:47:35 nattytest kernel: [  180.320091] synchronize_sched() in 
50520 us
May  8 09:47:35 nattytest kernel: [  180.400074] synchronize_sched() in 
78525 us
May  8 09:47:35 nattytest kernel: [  180.451028] synchronize_sched() in 50 
us
May  8 09:47:35 nattytest kernel: [  180.452100] synchronize_sched() in 30 
us
May  8 09:47:35 nattytest kernel: [  180.570086] synchronize_sched() in 
61211 us
May  8 09:47:35 nattytest kernel: [  180.650073] synchronize_sched() in 
79145 us
May  8 09:47:35 nattytest kernel: [  180.701376] synchronize_sched() in 38 
us
May  8 09:47:35 nattytest kernel: [  180.703490] synchronize_sched() in 57 
us
May  8 09:47:35 nattytest kernel: [  180.820087] synchronize_sched() in 
61483 us
May  8 09:47:35 nattytest kernel: [  180.910081] synchronize_sched() in 
88485 us
May  8 09:47:35 nattytest kernel: [  180.965970] synchronize_sched() in 49 
us
May  8 09:47:35 nattytest kernel: [  180.990091] synchronize_sched() in 
22425 us
May  8 09:47:35 nattytest kernel: [  181.150084] synchronize_sched() in 
71317 us
May  8 09:47:36 nattytest kernel: [  181.230074] synchronize_sched() in 
78470 us
May  8 09:47:36 nattytest kernel: [  181.300099] synchronize_sched() in 
18988 us
May  8 09:47:36 nattytest kernel: [  181.301637] synchronize_sched() in 46 
us
May  8 09:47:36 nattytest kernel: [  181.420086] synchronize_sched() in 
61240 us
May  8 09:47:36 nattytest kernel: [  181.500072] synchronize_sched() in 
77682 us
May  8 09:47:36 nattytest kernel: [  181.551059] synchronize_sched() in 45 
us
May  8 09:47:36 nattytest kernel: [  181.552546] synchronize_sched() in 29 
us
May  8 09:47:36 nattytest kernel: [  181.670084] synchronize_sched() in 
61612 us
May  8 09:47:36 nattytest kernel: [  181.750076] synchronize_sched() in 
78513 us
May  8 09:47:36 nattytest kernel: [  181.801195] synchronize_sched() in 52 
us
May  8 09:47:36 nattytest kernel: [  181.802805] synchronize_sched() in 69 
us
May  8 09:47:36 nattytest kernel: [  181.920088] synchronize_sched() in 
61998 us
May  8 09:47:36 nattytest kernel: [  182.000091] synchronize_sched() in 
78539 us
May  8 09:47:36 nattytest kernel: [  182.051053] synchronize_sched() in 42 
us
May  8 09:47:36 nattytest kernel: [  182.052595] synchronize_sched() in 29 
us
May  8 09:47:36 nattytest kernel: [  182.170103] synchronize_sched() in 
61607 us
May  8 09:47:37 nattytest kernel: [  182.260072] synchronize_sched() in 
88460 us
May  8 09:47:37 nattytest kernel: [  182.331028] synchronize_sched() in 44 
us
May  8 09:47:37 nattytest kernel: [  182.350096] synchronize_sched() in 
17563 us
May  8 09:47:37 nattytest kernel: [  182.500085] synchronize_sched() in 
71793 us
May  8 09:47:37 nattytest kernel: [  182.620085] synchronize_sched() in 
118343 us
May  8 09:47:37 nattytest kernel: [  182.671024] synchronize_sched() in 39 
us
May  8 09:47:37 nattytest kernel: [  182.672511] synchronize_sched() in 29 
us
May  8 09:47:37 nattytest kernel: [  182.830096] synchronize_sched() in 
101470 us
May  8 09:47:37 nattytest kernel: [  182.910081] synchronize_sched() in 
78413 us
May  8 09:47:37 nattytest kernel: [  182.971056] synchronize_sched() in 43 
us
May  8 09:47:37 nattytest kernel: [  182.972568] synchronize_sched() in 44 
us
May  8 09:47:37 nattytest kernel: [  183.100084] synchronize_sched() in 
60854 us
May  8 09:47:37 nattytest kernel: [  183.170097] synchronize_sched() in 
68450 us
May  8 09:47:37 nattytest kernel: [  183.221081] synchronize_sched() in 40 
us
May  8 09:47:37 nattytest kernel: [  183.222580] synchronize_sched() in 29 
us
May  8 09:47:38 nattytest kernel: [  183.340082] synchronize_sched() in 
61422 us
May  8 09:47:38 nattytest kernel: [  183.420078] synchronize_sched() in 
79154 us
May  8 09:47:38 nattytest kernel: [  183.471003] synchronize_sched() in 41 
us
May  8 09:47:38 nattytest kernel: [  183.472469] synchronize_sched() in 29 
us
May  8 09:47:38 nattytest kernel: [  183.590095] synchronize_sched() in 
61591 us
May  8 09:47:38 nattytest kernel: [  183.670069] synchronize_sched() in 
78533 us
May  8 09:47:38 nattytest kernel: [  183.721106] synchronize_sched() in 43 
us
May  8 09:47:38 nattytest kernel: [  183.722663] synchronize_sched() in 49 
us
May  8 09:47:38 nattytest kernel: [  183.850094] synchronize_sched() in 
71060 us
May  8 09:47:38 nattytest kernel: [  183.930080] synchronize_sched() in 
78522 us
May  8 09:47:38 nattytest kernel: [  183.981040] synchronize_sched() in 44 
us
May  8 09:47:38 nattytest kernel: [  183.990375] synchronize_sched() in 
7913 us
May  8 09:47:38 nattytest kernel: [  184.150080] synchronize_sched() in 
81568 us
May  8 09:47:38 nattytest kernel: [  184.220088] synchronize_sched() in 
68481 us
May  8 09:47:39 nattytest kernel: [  184.281161] synchronize_sched() in 65 
us
May  8 09:47:39 nattytest kernel: [  184.282733] synchronize_sched() in 60 
us
May  8 09:47:39 nattytest kernel: [  184.400079] synchronize_sched() in 
61791 us
May  8 09:47:39 nattytest kernel: [  184.480104] synchronize_sched() in 
79181 us
May  8 09:47:39 nattytest kernel: [  184.550103] synchronize_sched() in 
19063 us
May  8 09:47:39 nattytest kernel: [  184.551621] synchronize_sched() in 30 
us
May  8 09:47:39 nattytest kernel: [  184.680083] synchronize_sched() in 
71800 us
May  8 09:47:39 nattytest kernel: [  184.750084] synchronize_sched() in 
69162 us
May  8 09:47:39 nattytest kernel: [  184.801153] synchronize_sched() in 51 
us
May  8 09:47:39 nattytest kernel: [  184.802802] synchronize_sched() in 53 
us
May  8 09:47:39 nattytest kernel: [  184.920091] synchronize_sched() in 
61217 us
May  8 09:47:39 nattytest kernel: [  185.000074] synchronize_sched() in 
79141 us
May  8 09:47:39 nattytest kernel: [  185.050985] synchronize_sched() in 40 
us
May  8 09:47:39 nattytest kernel: [  185.052727] synchronize_sched() in 47 
us
May  8 09:47:39 nattytest kernel: [  185.170087] synchronize_sched() in 
62802 us
May  8 09:47:40 nattytest kernel: [  185.250074] synchronize_sched() in 
78418 us
May  8 09:47:40 nattytest kernel: [  185.311022] synchronize_sched() in 40 
us
May  8 09:47:40 nattytest kernel: [  185.312542] synchronize_sched() in 63 
us
May  8 09:47:40 nattytest kernel: [  185.430111] synchronize_sched() in 
61894 us
May  8 09:47:40 nattytest kernel: [  185.510073] synchronize_sched() in 
78410 us
May  8 09:47:40 nattytest kernel: [  185.561055] synchronize_sched() in 58 
us
May  8 09:47:40 nattytest kernel: [  185.562589] synchronize_sched() in 45 
us
May  8 09:47:40 nattytest kernel: [  185.680091] synchronize_sched() in 
62023 us
May  8 09:47:40 nattytest kernel: [  185.760076] synchronize_sched() in 
78438 us
May  8 09:47:40 nattytest kernel: [  185.811063] synchronize_sched() in 45 
us
May  8 09:47:40 nattytest kernel: [  185.812580] synchronize_sched() in 52 
us
May  8 09:47:40 nattytest kernel: [  185.930089] synchronize_sched() in 
61864 us
May  8 09:47:40 nattytest kernel: [  186.010072] synchronize_sched() in 
78465 us
May  8 09:47:40 nattytest kernel: [  186.061141] synchronize_sched() in 44 
us
May  8 09:47:40 nattytest kernel: [  186.062631] synchronize_sched() in 29 
us
May  8 09:47:40 nattytest kernel: [  186.180078] synchronize_sched() in 
61806 us
May  8 09:47:41 nattytest kernel: [  186.260074] synchronize_sched() in 
78415 us
May  8 09:47:41 nattytest kernel: [  186.311073] synchronize_sched() in 43 
us
May  8 09:47:41 nattytest kernel: [  186.312573] synchronize_sched() in 29 
us
May  8 09:47:41 nattytest kernel: [  186.440083] synchronize_sched() in 
72358 us
May  8 09:47:41 nattytest kernel: [  186.520071] synchronize_sched() in 
78395 us
May  8 09:47:41 nattytest kernel: [  186.573068] synchronize_sched() in 66 
us
May  8 09:47:41 nattytest kernel: [  186.590105] synchronize_sched() in 
15551 us
May  8 09:47:41 nattytest kernel: [  186.740092] synchronize_sched() in 
71884 us
May  8 09:47:41 nattytest kernel: [  186.830069] synchronize_sched() in 
89123 us
May  8 09:47:41 nattytest kernel: [  186.890967] synchronize_sched() in 41 
us
May  8 09:47:41 nattytest kernel: [  186.910091] synchronize_sched() in 
18639 us
May  8 09:47:41 nattytest kernel: [  187.070078] synchronize_sched() in 
82847 us
May  8 09:47:41 nattytest kernel: [  187.140090] synchronize_sched() in 
68499 us
May  8 09:47:41 nattytest kernel: [  187.191148] synchronize_sched() in 57 
us
May  8 09:47:41 nattytest kernel: [  187.192743] synchronize_sched() in 69 
us
May  8 09:47:42 nattytest kernel: [  187.310087] synchronize_sched() in 
63069 us
May  8 09:47:42 nattytest kernel: [  187.400076] synchronize_sched() in 
88432 us
May  8 09:47:42 nattytest kernel: [  187.461521] synchronize_sched() in 57 
us
May  8 09:47:42 nattytest kernel: [  187.480099] synchronize_sched() in 
16493 us
May  8 09:47:42 nattytest kernel: [  187.640081] synchronize_sched() in 
82269 us
May  8 09:47:42 nattytest kernel: [  187.710079] synchronize_sched() in 
68389 us
May  8 09:47:42 nattytest kernel: [  187.761071] synchronize_sched() in 38 
us
May  8 09:47:42 nattytest kernel: [  187.762623] synchronize_sched() in 67 
us
May  8 09:47:42 nattytest kernel: [  187.880077] synchronize_sched() in 
61847 us
May  8 09:47:42 nattytest kernel: [  187.960096] synchronize_sched() in 
78450 us
May  8 09:47:42 nattytest kernel: [  188.011147] synchronize_sched() in 62 
us
May  8 09:47:42 nattytest kernel: [  188.012617] synchronize_sched() in 29 
us
May  8 09:47:42 nattytest kernel: [  188.130083] synchronize_sched() in 
62431 us
May  8 09:47:42 nattytest kernel: [  188.210090] synchronize_sched() in 
78447 us
May  8 09:47:43 nattytest kernel: [  188.261015] synchronize_sched() in 38 
us
May  8 09:47:43 nattytest kernel: [  188.262491] synchronize_sched() in 29 
us
May  8 09:47:43 nattytest kernel: [  188.380097] synchronize_sched() in 
61923 us
May  8 09:47:43 nattytest kernel: [  188.470131] synchronize_sched() in 
88521 us
May  8 09:47:43 nattytest kernel: [  188.531024] synchronize_sched() in 46 
us
May  8 09:47:43 nattytest kernel: [  188.550087] synchronize_sched() in 
17676 us
May  8 09:47:43 nattytest kernel: [  188.690085] synchronize_sched() in 
71911 us
May  8 09:47:43 nattytest kernel: [  188.820071] synchronize_sched() in 
128473 us
May  8 09:47:43 nattytest kernel: [  188.881032] synchronize_sched() in 42 
us
May  8 09:47:43 nattytest kernel: [  188.900085] synchronize_sched() in 
17588 us
May  8 09:47:43 nattytest kernel: [  189.080084] synchronize_sched() in 
111945 us
May  8 09:47:43 nattytest kernel: [  189.170068] synchronize_sched() in 
88449 us
May  8 09:47:43 nattytest kernel: [  189.221066] synchronize_sched() in 41 
us
May  8 09:47:44 nattytest kernel: [  189.230257] synchronize_sched() in 
7463 us
May  8 09:47:44 nattytest kernel: [  189.380093] synchronize_sched() in 
71896 us
May  8 09:47:44 nattytest kernel: [  189.470093] synchronize_sched() in 
88481 us
May  8 09:47:44 nattytest kernel: [  189.550883] synchronize_sched() in 43 
us
May  8 09:47:44 nattytest kernel: [  189.552391] synchronize_sched() in 29 
us
May  8 09:47:44 nattytest kernel: [  189.670085] synchronize_sched() in 
61850 us
May  8 09:47:44 nattytest kernel: [  189.760080] synchronize_sched() in 
88549 us
May  8 09:47:44 nattytest kernel: [  189.821014] synchronize_sched() in 44 
us
May  8 09:47:44 nattytest kernel: [  189.850080] synchronize_sched() in 
27628 us
May  8 09:47:44 nattytest kernel: [  189.990087] synchronize_sched() in 
72251 us
May  8 09:47:44 nattytest kernel: [  190.080074] synchronize_sched() in 
88486 us
May  8 09:47:44 nattytest kernel: [  190.150968] synchronize_sched() in 71 
us
May  8 09:47:44 nattytest kernel: [  190.170092] synchronize_sched() in 
17672 us
May  8 09:47:45 nattytest kernel: [  190.320088] synchronize_sched() in 
71937 us
May  8 09:47:45 nattytest kernel: [  190.410068] synchronize_sched() in 
89155 us
May  8 09:47:45 nattytest kernel: [  190.490882] synchronize_sched() in 40 
us
May  8 09:47:45 nattytest kernel: [  190.492405] synchronize_sched() in 29 
us
May  8 09:47:45 nattytest kernel: [  190.620103] synchronize_sched() in 
62808 us
May  8 09:47:45 nattytest kernel: [  190.700108] synchronize_sched() in 
78486 us
May  8 09:47:45 nattytest kernel: [  190.820119] synchronize_sched() in 
69075 us
May  8 09:47:45 nattytest kernel: [  190.870096] synchronize_sched() in 
48447 us
May  8 09:47:45 nattytest kernel: [  191.000082] synchronize_sched() in 
62795 us
May  8 09:47:45 nattytest kernel: [  191.080082] synchronize_sched() in 
78489 us
May  8 09:47:45 nattytest kernel: [  191.141021] synchronize_sched() in 61 
us
May  8 09:47:45 nattytest kernel: [  191.160086] synchronize_sched() in 
17596 us
May  8 09:47:46 nattytest kernel: [  191.310097] synchronize_sched() in 
82089 us
May  8 09:47:46 nattytest kernel: [  191.390076] synchronize_sched() in 
78427 us
May  8 09:47:46 nattytest kernel: [  191.451045] synchronize_sched() in 49 
us
May  8 09:47:46 nattytest kernel: [  191.470088] synchronize_sched() in 
17643 us
May  8 09:47:46 nattytest kernel: [  191.620095] synchronize_sched() in 
72871 us
May  8 09:47:46 nattytest kernel: [  191.700072] synchronize_sched() in 
79137 us
May  8 09:47:46 nattytest kernel: [  191.751087] synchronize_sched() in 40 
us
May  8 09:47:46 nattytest kernel: [  191.752600] synchronize_sched() in 48 
us
May  8 09:47:46 nattytest kernel: [  191.880092] synchronize_sched() in 
72297 us
May  8 09:47:46 nattytest kernel: [  191.950076] synchronize_sched() in 
69142 us
May  8 09:47:46 nattytest kernel: [  192.005349] synchronize_sched() in 54 
us
May  8 09:47:46 nattytest kernel: [  192.006834] synchronize_sched() in 30 
us
May  8 09:47:46 nattytest kernel: [  192.130091] synchronize_sched() in 
72839 us
May  8 09:47:46 nattytest kernel: [  192.210069] synchronize_sched() in 
78535 us
May  8 09:47:47 nattytest kernel: [  192.270973] synchronize_sched() in 43 
us
May  8 09:47:47 nattytest kernel: [  192.300094] synchronize_sched() in 
27592 us
May  8 09:47:47 nattytest kernel: [  192.450079] synchronize_sched() in 
83072 us
May  8 09:47:47 nattytest kernel: [  192.520073] synchronize_sched() in 
68499 us
May  8 09:47:47 nattytest kernel: [  192.571106] synchronize_sched() in 44 
us
May  8 09:47:47 nattytest kernel: [  192.572585] synchronize_sched() in 29 
us
May  8 09:47:47 nattytest kernel: [  192.700080] synchronize_sched() in 
62670 us
May  8 09:47:47 nattytest kernel: [  192.820084] synchronize_sched() in 
118509 us
May  8 09:47:47 nattytest kernel: [  192.871107] synchronize_sched() in 69 
us
May  8 09:47:47 nattytest kernel: [  192.872611] synchronize_sched() in 29 
us
May  8 09:47:47 nattytest kernel: [  192.990110] synchronize_sched() in 
62742 us
May  8 09:47:47 nattytest kernel: [  193.080070] synchronize_sched() in 
89147 us
May  8 09:47:47 nattytest kernel: [  193.150962] synchronize_sched() in 65 
us
May  8 09:47:47 nattytest kernel: [  193.160285] synchronize_sched() in 
8879 us
May  8 09:47:48 nattytest kernel: [  193.300097] synchronize_sched() in 
72753 us
May  8 09:47:48 nattytest kernel: [  193.380075] synchronize_sched() in 
78472 us
May  8 09:47:48 nattytest kernel: [  193.431029] synchronize_sched() in 41 
us
May  8 09:47:48 nattytest kernel: [  193.432518] synchronize_sched() in 30 
us
May  8 09:47:48 nattytest kernel: [  193.560081] synchronize_sched() in 
72448 us
May  8 09:47:48 nattytest kernel: [  193.670073] synchronize_sched() in 
108452 us
May  8 09:47:48 nattytest kernel: [  193.721098] synchronize_sched() in 49 
us
May  8 09:47:48 nattytest kernel: [  193.722821] synchronize_sched() in 30 
us
May  8 09:47:48 nattytest kernel: [  193.840075] synchronize_sched() in 
62317 us
May  8 09:47:48 nattytest kernel: [  193.920070] synchronize_sched() in 
78472 us
May  8 09:47:48 nattytest kernel: [  193.971073] synchronize_sched() in 43 
us
May  8 09:47:48 nattytest kernel: [  193.972550] synchronize_sched() in 29 
us
May  8 09:47:48 nattytest kernel: [  194.100088] synchronize_sched() in 
72391 us
May  8 09:47:48 nattytest kernel: [  194.170082] synchronize_sched() in 
69202 us
May  8 09:47:48 nattytest kernel: [  194.221053] synchronize_sched() in 40 
us
May  8 09:47:48 nattytest kernel: [  194.222543] synchronize_sched() in 29 
us
May  8 09:47:49 nattytest kernel: [  194.350082] synchronize_sched() in 
72874 us
May  8 09:47:49 nattytest kernel: [  194.430068] synchronize_sched() in 
78536 us
May  8 09:47:49 nattytest kernel: [  194.481167] synchronize_sched() in 70 
us
May  8 09:47:49 nattytest kernel: [  194.500091] synchronize_sched() in 
17486 us
May  8 09:47:49 nattytest kernel: [  194.640088] synchronize_sched() in 
73012 us
May  8 09:47:49 nattytest kernel: [  194.730069] synchronize_sched() in 
88519 us
May  8 09:47:49 nattytest kernel: [  194.781117] synchronize_sched() in 46 
us
May  8 09:47:49 nattytest kernel: [  194.792950] synchronize_sched() in 
10384 us
May  8 09:47:49 nattytest kernel: [  194.910126] synchronize_sched() in 
63131 us
May  8 09:47:49 nattytest kernel: [  195.000088] synchronize_sched() in 
88453 us
May  8 09:47:49 nattytest kernel: [  195.160080] synchronize_sched() in 
79582 us
May  8 09:47:50 nattytest kernel: [  195.240073] synchronize_sched() in 
79132 us
May  8 09:47:50 nattytest kernel: [  195.360124] synchronize_sched() in 
62368 us
May  8 09:47:50 nattytest kernel: [  195.440096] synchronize_sched() in 
79148 us
May  8 09:47:50 nattytest kernel: [  195.491063] synchronize_sched() in 53 
us
May  8 09:47:50 nattytest kernel: [  195.492182] synchronize_sched() in 29 
us
May  8 09:47:50 nattytest kernel: [  195.610085] synchronize_sched() in 
62444 us
May  8 09:47:50 nattytest kernel: [  195.700078] synchronize_sched() in 
89170 us
May  8 09:47:50 nattytest kernel: [  195.761011] synchronize_sched() in 57 
us
May  8 09:47:50 nattytest kernel: [  195.780137] synchronize_sched() in 
18587 us
May  8 09:47:50 nattytest kernel: [  195.920080] synchronize_sched() in 
73194 us
May  8 09:47:50 nattytest kernel: [  196.000073] synchronize_sched() in 
78456 us
May  8 09:47:50 nattytest kernel: [  196.051067] synchronize_sched() in 42 
us
May  8 09:47:50 nattytest kernel: [  196.052613] synchronize_sched() in 44 
us
May  8 09:47:50 nattytest kernel: [  196.170080] synchronize_sched() in 
63040 us
May  8 09:47:51 nattytest kernel: [  196.250075] synchronize_sched() in 
78420 us
May  8 09:47:51 nattytest kernel: [  196.301051] synchronize_sched() in 46 
us
May  8 09:47:51 nattytest kernel: [  196.302527] synchronize_sched() in 29 
us
May  8 09:47:51 nattytest kernel: [  196.420095] synchronize_sched() in 
62701 us
May  8 09:47:51 nattytest kernel: [  196.500064] synchronize_sched() in 
78448 us
May  8 09:47:51 nattytest kernel: [  196.551104] synchronize_sched() in 48 
us
May  8 09:47:51 nattytest kernel: [  196.552631] synchronize_sched() in 49 
us
May  8 09:47:51 nattytest kernel: [  196.670084] synchronize_sched() in 
63111 us
May  8 09:47:51 nattytest kernel: [  196.750095] synchronize_sched() in 
78464 us
May  8 09:47:51 nattytest kernel: [  196.801082] synchronize_sched() in 60 
us
May  8 09:47:51 nattytest kernel: [  196.802616] synchronize_sched() in 43 
us
May  8 09:47:51 nattytest kernel: [  196.930124] synchronize_sched() in 
73010 us
May  8 09:47:51 nattytest kernel: [  197.010084] synchronize_sched() in 
78359 us
May  8 09:47:51 nattytest kernel: [  197.071137] synchronize_sched() in 79 
us
May  8 09:47:51 nattytest kernel: [  197.080320] synchronize_sched() in 
7773 us
May  8 09:47:52 nattytest kernel: [  197.230086] synchronize_sched() in 
83245 us
May  8 09:47:52 nattytest kernel: [  197.310075] synchronize_sched() in 
78509 us
May  8 09:47:52 nattytest kernel: [  197.361050] synchronize_sched() in 44 
us
May  8 09:47:52 nattytest kernel: [  197.380084] synchronize_sched() in 
17591 us
May  8 09:47:52 nattytest kernel: [  197.530084] synchronize_sched() in 
82999 us
May  8 09:47:52 nattytest kernel: [  197.610080] synchronize_sched() in 
79162 us
May  8 09:47:52 nattytest kernel: [  197.680937] synchronize_sched() in 65 
us
May  8 09:47:52 nattytest kernel: [  197.700096] synchronize_sched() in 
18620 us
May  8 09:47:52 nattytest kernel: [  197.850141] synchronize_sched() in 
73601 us
May  8 09:47:52 nattytest kernel: [  197.930091] synchronize_sched() in 
78393 us
May  8 09:47:52 nattytest kernel: [  197.981050] synchronize_sched() in 39 
us
May  8 09:47:52 nattytest kernel: [  197.982788] synchronize_sched() in 29 
us
May  8 09:47:52 nattytest kernel: [  198.110084] synchronize_sched() in 
72893 us
May  8 09:47:52 nattytest kernel: [  198.180072] synchronize_sched() in 
69151 us
May  8 09:47:53 nattytest kernel: [  198.231098] synchronize_sched() in 50 
us
May  8 09:47:53 nattytest kernel: [  198.232679] synchronize_sched() in 46 
us
May  8 09:47:53 nattytest kernel: [  198.370084] synchronize_sched() in 
72935 us
May  8 09:47:53 nattytest kernel: [  198.450068] synchronize_sched() in 
78461 us
May  8 09:47:53 nattytest kernel: [  198.511077] synchronize_sched() in 77 
us
May  8 09:47:53 nattytest kernel: [  198.530094] synchronize_sched() in 
17512 us
May  8 09:47:53 nattytest kernel: [  198.670095] synchronize_sched() in 
83122 us
May  8 09:47:53 nattytest kernel: [  198.740079] synchronize_sched() in 
68453 us
May  8 09:47:53 nattytest kernel: [  198.801121] synchronize_sched() in 48 
us
May  8 09:47:53 nattytest kernel: [  198.802600] synchronize_sched() in 29 
us
May  8 09:47:53 nattytest kernel: [  198.920100] synchronize_sched() in 
63392 us
May  8 09:47:53 nattytest kernel: [  199.000088] synchronize_sched() in 
78482 us
May  8 09:47:53 nattytest kernel: [  199.070099] synchronize_sched() in 
19013 us
May  8 09:47:53 nattytest kernel: [  199.071626] synchronize_sched() in 48 
us
May  8 09:47:53 nattytest kernel: [  199.190083] synchronize_sched() in 
63245 us
May  8 09:47:54 nattytest kernel: [  199.270075] synchronize_sched() in 
78491 us
May  8 09:47:54 nattytest kernel: [  199.321088] synchronize_sched() in 42 
us
May  8 09:47:54 nattytest kernel: [  199.322589] synchronize_sched() in 29 
us
May  8 09:47:54 nattytest kernel: [  199.440101] synchronize_sched() in 
63063 us
May  8 09:47:54 nattytest kernel: [  199.520073] synchronize_sched() in 
78463 us
May  8 09:47:54 nattytest kernel: [  199.571056] synchronize_sched() in 49 
us
May  8 09:47:54 nattytest kernel: [  199.572488] synchronize_sched() in 29 
us
May  8 09:47:54 nattytest kernel: [  199.700083] synchronize_sched() in 
73220 us
May  8 09:47:54 nattytest kernel: [  199.770096] synchronize_sched() in 
68508 us
May  8 09:47:54 nattytest kernel: [  199.821090] synchronize_sched() in 50 
us
May  8 09:47:54 nattytest kernel: [  199.822540] synchronize_sched() in 29 
us
May  8 09:47:54 nattytest kernel: [  199.940084] synchronize_sched() in 
63681 us
May  8 09:47:54 nattytest kernel: [  200.020067] synchronize_sched() in 
78451 us
May  8 09:47:54 nattytest kernel: [  200.071095] synchronize_sched() in 41 
us
May  8 09:47:54 nattytest kernel: [  200.072566] synchronize_sched() in 29 
us
May  8 09:47:54 nattytest kernel: [  200.190089] synchronize_sched() in 
63584 us
May  8 09:47:55 nattytest kernel: [  200.280076] synchronize_sched() in 
89123 us
May  8 09:47:55 nattytest kernel: [  200.341011] synchronize_sched() in 61 
us
May  8 09:47:55 nattytest kernel: [  200.360094] synchronize_sched() in 
18527 us
May  8 09:47:55 nattytest kernel: [  200.520085] synchronize_sched() in 
94129 us
May  8 09:47:55 nattytest kernel: [  200.670072] synchronize_sched() in 
148512 us
May  8 09:47:55 nattytest kernel: [  200.730983] synchronize_sched() in 41 
us
May  8 09:47:55 nattytest kernel: [  200.750355] synchronize_sched() in 
17917 us
May  8 09:47:55 nattytest kernel: [  200.890082] synchronize_sched() in 
83406 us
May  8 09:47:55 nattytest kernel: [  200.970072] synchronize_sched() in 
78439 us
May  8 09:47:55 nattytest kernel: [  201.031034] synchronize_sched() in 45 
us
May  8 09:47:55 nattytest kernel: [  201.050090] synchronize_sched() in 
17555 us
May  8 09:47:55 nattytest kernel: [  201.200090] synchronize_sched() in 
83318 us
May  8 09:47:56 nattytest kernel: [  201.270075] synchronize_sched() in 
68468 us
May  8 09:47:56 nattytest kernel: [  201.330983] synchronize_sched() in 41 
us
May  8 09:47:56 nattytest kernel: [  201.332487] synchronize_sched() in 62 
us
May  8 09:47:56 nattytest kernel: [  201.450088] synchronize_sched() in 
53772 us
May  8 09:47:56 nattytest kernel: [  201.530075] synchronize_sched() in 
78437 us
May  8 09:47:56 nattytest kernel: [  201.590986] synchronize_sched() in 46 
us
May  8 09:47:56 nattytest kernel: [  201.592134] synchronize_sched() in 29 
us
May  8 09:47:56 nattytest kernel: [  201.710085] synchronize_sched() in 
63551 us
May  8 09:47:56 nattytest kernel: [  201.830146] synchronize_sched() in 
118553 us
May  8 09:47:56 nattytest kernel: [  201.891143] synchronize_sched() in 50 
us
May  8 09:47:56 nattytest kernel: [  201.892712] synchronize_sched() in 55 
us
May  8 09:47:56 nattytest kernel: [  202.010092] synchronize_sched() in 
63446 us
May  8 09:47:56 nattytest kernel: [  202.090126] synchronize_sched() in 
78477 us
May  8 09:47:56 nattytest kernel: [  202.141120] synchronize_sched() in 45 
us
May  8 09:47:56 nattytest kernel: [  202.142867] synchronize_sched() in 30 
us
May  8 09:47:57 nattytest kernel: [  202.260092] synchronize_sched() in 
63271 us
May  8 09:47:57 nattytest kernel: [  202.340071] synchronize_sched() in 
78474 us
May  8 09:47:57 nattytest kernel: [  202.391094] synchronize_sched() in 44 
us
May  8 09:47:57 nattytest kernel: [  202.392581] synchronize_sched() in 29 
us
May  8 09:47:57 nattytest kernel: [  202.510110] synchronize_sched() in 
63718 us
May  8 09:47:57 nattytest kernel: [  202.600092] synchronize_sched() in 
88434 us
May  8 09:47:57 nattytest kernel: [  202.661075] synchronize_sched() in 46 
us
May  8 09:47:57 nattytest kernel: [  202.680091] synchronize_sched() in 
17516 us
May  8 09:47:57 nattytest kernel: [  202.820083] synchronize_sched() in 
83733 us
May  8 09:47:57 nattytest kernel: [  202.900067] synchronize_sched() in 
78478 us
May  8 09:47:57 nattytest kernel: [  202.951062] synchronize_sched() in 43 
us
May  8 09:47:57 nattytest kernel: [  202.960445] synchronize_sched() in 
7921 us
May  8 09:47:57 nattytest kernel: [  203.100080] synchronize_sched() in 
83496 us
May  8 09:47:57 nattytest kernel: [  203.170084] synchronize_sched() in 
68474 us
May  8 09:47:57 nattytest kernel: [  203.221083] synchronize_sched() in 41 
us
May  8 09:47:57 nattytest kernel: [  203.222692] synchronize_sched() in 68 
us
May  8 09:47:58 nattytest kernel: [  203.350083] synchronize_sched() in 
74062 us
May  8 09:47:58 nattytest kernel: [  203.430100] synchronize_sched() in 
78468 us
May  8 09:47:58 nattytest kernel: [  203.491028] synchronize_sched() in 52 
us
May  8 09:47:58 nattytest kernel: [  203.510086] synchronize_sched() in 
17607 us
May  8 09:47:58 nattytest kernel: [  203.670080] synchronize_sched() in 
83798 us
May  8 09:47:58 nattytest kernel: [  203.750080] synchronize_sched() in 
78382 us
May  8 09:47:58 nattytest kernel: [  203.821032] synchronize_sched() in 48 
us
May  8 09:47:58 nattytest kernel: [  203.840093] synchronize_sched() in 
17608 us
May  8 09:47:58 nattytest kernel: [  203.990084] synchronize_sched() in 
73664 us
May  8 09:47:58 nattytest kernel: [  204.080078] synchronize_sched() in 
88391 us
May  8 09:47:58 nattytest kernel: [  204.141021] synchronize_sched() in 44 
us
May  8 09:47:58 nattytest kernel: [  204.160083] synchronize_sched() in 
17671 us
May  8 09:47:59 nattytest kernel: [  204.310086] synchronize_sched() in 
83288 us
May  8 09:47:59 nattytest kernel: [  204.380074] synchronize_sched() in 
69168 us
May  8 09:47:59 nattytest kernel: [  204.431059] synchronize_sched() in 67 
us
May  8 09:47:59 nattytest kernel: [  204.432205] synchronize_sched() in 30 
us
May  8 09:47:59 nattytest kernel: [  204.550147] synchronize_sched() in 
54218 us
May  8 09:47:59 nattytest kernel: [  204.640079] synchronize_sched() in 
88401 us
May  8 09:47:59 nattytest kernel: [  204.701062] synchronize_sched() in 48 
us
May  8 09:47:59 nattytest kernel: [  204.720093] synchronize_sched() in 
17508 us
May  8 09:47:59 nattytest kernel: [  204.850108] synchronize_sched() in 
73968 us
May  8 09:47:59 nattytest kernel: [  204.940074] synchronize_sched() in 
88489 us
May  8 09:47:59 nattytest kernel: [  205.011002] synchronize_sched() in 44 
us
May  8 09:47:59 nattytest kernel: [  205.030104] synchronize_sched() in 
17658 us
May  8 09:47:59 nattytest kernel: [  205.190091] synchronize_sched() in 
83651 us
May  8 09:48:00 nattytest kernel: [  205.270094] synchronize_sched() in 
78501 us
May  8 09:48:00 nattytest kernel: [  205.341000] synchronize_sched() in 65 
us
May  8 09:48:00 nattytest kernel: [  205.360081] synchronize_sched() in 
17640 us





^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 13:06                   ` Alex Bligh
@ 2011-05-08 13:14                     ` Alex Bligh
  0 siblings, 0 replies; 53+ messages in thread
From: Alex Bligh @ 2011-05-08 13:14 UTC (permalink / raw)
  To: Alex Bligh, paulmck; +Cc: Eric Dumazet, netdev, Alex Bligh



--On 8 May 2011 14:06:07 +0100 Alex Bligh <alex@alex.org.uk> wrote:

> Add to that the fact that there are 2 x synchronize_sched() and one
> rcu_barrier() per interface removal. If I ignore your IRQ idea, that's
> 6 x 3 = 18ms per interface removal at CONFIG_HZ=10, 180ms at
> CONFIG_HZ=100.

Aargh I mean:

6 x 3 = 18ms per interface removal at CONFIG_HZ=1000, 180ms at
CONFIG_HZ=100.
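
(That is, 3 grace-period waits x ~6 jiffies each = 18 jiffies: 18 ms at
CONFIG_HZ=1000 where a jiffy is 1 ms, 180 ms at CONFIG_HZ=100 where a
jiffy is 10 ms.)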


-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 13:13                     ` Alex Bligh
@ 2011-05-08 13:44                       ` Paul E. McKenney
  2011-05-08 14:27                         ` Alex Bligh
  0 siblings, 1 reply; 53+ messages in thread
From: Paul E. McKenney @ 2011-05-08 13:44 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Eric Dumazet, netdev

On Sun, May 08, 2011 at 02:13:30PM +0100, Alex Bligh wrote:
> 
> >>The fact that increasing HZ masks the problem seems to imply that
> >>synchronize_sched() is waiting when it shouldn't be, as it suggests
> >>it's waiting for a context switch. But surely it shouldn't be
> >>waiting for context switch if all other cpu cores are idle?
> >>It knows that it (the caller) doesn't hold an rcu_read_lock,
> >>and presumably can see the other cpus are in the idle state,
> >>in which case surely it should return immediately? Distribution
> >>of latency in synchronize_sched() looks like this:
> >>
> >>20-49 us 110 instances (27.500%)
> >>50-99 us 45 instances (11.250%)
> >
> >Really?  I am having a hard time believing this above two.  Is this really
> >2000-4999 us and 5000-9999 us?  That would be much more believable,
> >and expected on a busy system with lots of context switching.  Or on a
> >system with CONFIG_NO_HZ=n.
> 
> Yes, really 20-49us and 50-99us, not ms. Raw data attached :-)
> 
> I'm guessing there are circumstances where there is an early exit.

Well, if you were onlining and offlining CPUs, then if there was only
one CPU online, this could happen.  And there really is only one CPU
online during boot, so if your measurements included early boot time,
this could easily explain these very short timings.
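
Roughly, the early-exit path being described is (a sketch only, assuming
rcu_blocking_is_gp() reduces to a single-online-CPU check in this tree):

    /* Sketch, not verbatim kernel source: with only one CPU online
     * there can be no sched-RCU readers running elsewhere, so the
     * grace period is trivially complete.
     */
    static inline int rcu_blocking_is_gp(void)
    {
            return num_online_cpus() == 1;
    }

    void synchronize_sched(void)
    {
            struct rcu_synchronize rcu;

            if (rcu_blocking_is_gp())
                    return;  /* immediate return: microseconds, not jiffies */

            /* ... otherwise register a callback and wait for a full
             * grace period, typically several jiffies ... */
    }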

> $ fgrep HZ .config
> CONFIG_RCU_FAST_NO_HZ=y
> CONFIG_NO_HZ=y
> CONFIG_HZ_100=y
> # CONFIG_HZ_250 is not set
> # CONFIG_HZ_300 is not set
> # CONFIG_HZ_1000 is not set
> CONFIG_HZ=100
> # CONFIG_MACHZ_WDT is not set
> 
> >
> >>5000-9999 us 5 instances (1.250%)
> >
> >This makes sense for a mostly-idle system with frequent short bursts
> >of work.
> >
> >>10000-19999 us 33 instances (8.250%)
> >
> >This makes sense for a CONFIG_NO_HZ system that is idle, where there
> >is some amount of background work that is also using RCU grace periods.
> >
> >>20000-49999 us 4 instances (1.000%)
> >>50000-99999 us 191 instances (47.750%)
> >>100000-199999 us 12 instances (3.000%)
> >
> >These last involve additional delays.  Possibilities include long-running
> >irq handlers, SMIs, or NMIs.
> 
> There is nothing much going on these systems (idle, no other users,
> just normal system daemons).

And normal system daemons might cause this, right?

> Note this is with CONFIG_HZ=100 (which is the Ubuntu default). I wonder
> if that means you need to scale your expectations by a factor of 10 (i.e.
> you'd expect 50000-99999 for "a mostly-idle system with frequent
> short bursts of work").

Indeed, I was thinking in terms of HZ=1000.

							Thanx, Paul

> Interestingly there are quite a few of these on boot up. If this
> is a CONFIG_HZ thing, then setting it to 1000 would probably take
> half a second off bootup time.
> 
> 
> -- 
> Alex Bligh
> 
> 
> [raw synchronize_sched() timing data, quoted verbatim from the previous message, snipped]
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 13:44                       ` Paul E. McKenney
@ 2011-05-08 14:27                         ` Alex Bligh
  2011-05-08 14:47                           ` Paul E. McKenney
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-08 14:27 UTC (permalink / raw)
  To: paulmck; +Cc: Eric Dumazet, netdev, Alex Bligh

Paul,

>> Yes, really 20-49us and 50-99us, not ms. Raw data attached :-)
>>
>> I'm guessing there are circumstances where there is an early exit.
>
> Well, if you were onlining and offlining CPUs, then if there was only
> one CPU online, this could happen.

No, I wasn't doing that.

>  And there really is only one CPU
> online during boot, so if your measurements included early boot time,
> this could easily explain these very short timings.

No, I waited a few minutes after boot for the system to stabilize, and
all CPUs were definitely online.

The patch to the kernel I am running is below.

>> There is nothing much going on these systems (idle, no other users,
>> just normal system daemons).
>
> And normal system daemons might cause this, right?

Yes. Everything is normal, except I did
 service udev stop
 unshare -n bash
which together stop the system running interface scripts when
interfaces are created (as upstart and upstart-udev-bridge are
now integrated, you can't kill upstart, so you have to rely on
unshare -n to stop the events being propagated). That's just
to avoid measuring the time it takes to execute the scripts.

-- 
Alex Bligh

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index dd4aea8..e401018 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1518,6 +1518,7 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
 void synchronize_sched(void)
 {
        struct rcu_synchronize rcu;
+       ktime_t time_start = ktime_get();

        if (rcu_blocking_is_gp())
                return;
@@ -1529,6 +1530,7 @@ void synchronize_sched(void)
        /* Wait for it. */
        wait_for_completion(&rcu.completion);
        destroy_rcu_head_on_stack(&rcu.head);
+       pr_err("synchronize_sched() in %lld us\n", ktime_us_delta(ktime_get(), time_start));
 }
 EXPORT_SYMBOL_GPL(synchronize_sched);

diff --git a/net/core/dev.c b/net/core/dev.c
index 856b6ee..013f627 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5164,7 +5164,9 @@ static void rollback_registered_many(struct list_head *head)
        dev = list_first_entry(head, struct net_device, unreg_list);
        call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev);

+       pr_err("begin rcu_barrier()\n");
        rcu_barrier();
+       pr_err("end rcu_barrier()\n");

        list_for_each_entry(dev, head, unreg_list)
                dev_put(dev);
@@ -5915,8 +5917,10 @@ EXPORT_SYMBOL(free_netdev);
  */
 void synchronize_net(void)
 {
+       pr_err("begin synchronize_net()\n");
        might_sleep();
        synchronize_rcu();
+       pr_err("end synchronize_net()\n");
 }
 EXPORT_SYMBOL(synchronize_net);



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 14:27                         ` Alex Bligh
@ 2011-05-08 14:47                           ` Paul E. McKenney
  2011-05-08 15:17                             ` Alex Bligh
  0 siblings, 1 reply; 53+ messages in thread
From: Paul E. McKenney @ 2011-05-08 14:47 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Eric Dumazet, netdev

On Sun, May 08, 2011 at 03:27:07PM +0100, Alex Bligh wrote:
> Paul,
> 
> >>Yes, really 20-49us and 50-99us, not ms. Raw data attached :-)
> >>
> >>I'm guessing there are circumstances where there is an early exit.
> >
> >Well, if you were onlining and offlining CPUs, then if there was only
> >one CPU online, this could happen.
> 
> No, I wasn't doing that.

OK.

> > And there really is only one CPU
> >online during boot, so if your measurements included early boot time,
> >this could easily explain these very short timings.
> 
> No, I waited a few minutes after boot for the system to stabilize, and
> all CPUs were definitely online.
> 
> The patch to the kernel I am running is below.

OK, interesting...

My guess is that you need to be using ktime_get_ts().  Isn't ktime_get()
subject to various sorts of adjustment?
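
For example, a hypothetical variant of the instrumentation using
ktime_get_ts() might look like this (illustrative only; both
ktime_get() and ktime_get_ts() should exist in this tree):

    struct timespec ts_start, ts_end, delta;

    ktime_get_ts(&ts_start);
    /* ... wait_for_completion(&rcu.completion); ... */
    ktime_get_ts(&ts_end);
    delta = timespec_sub(ts_end, ts_start);
    pr_err("synchronize_sched() in %lld us\n",
           timespec_to_ns(&delta) / 1000);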

> >>There is nothing much going on these systems (idle, no other users,
> >>just normal system daemons).
> >
> >And normal system daemons might cause this, right?
> 
> Yes. Everything is normal, except I did
> service udev stop
> unshare -n bash
> which together stop the system running interface scripts when
> interfaces are created (as upstart and upstart-udev-bridge are
> now integrated, you can't kill upstart, so you have to rely on
> unshare -n to stop the events being propagated). That's just
> to avoid measuring the time it takes to execute the scripts.

OK, so you really could be seeing grace periods started by these system
daemons.

							Thanx, Paul

> -- 
> Alex Bligh
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index dd4aea8..e401018 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1518,6 +1518,7 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
> void synchronize_sched(void)
> {
>        struct rcu_synchronize rcu;
> +       ktime_t time_start = ktime_get();
> 
>        if (rcu_blocking_is_gp())
>                return;
> @@ -1529,6 +1530,7 @@ void synchronize_sched(void)
>        /* Wait for it. */
>        wait_for_completion(&rcu.completion);
>        destroy_rcu_head_on_stack(&rcu.head);
> +       pr_err("synchronize_sched() in %lld us\n", ktime_us_delta(ktime_get(), time_start));
> }
> EXPORT_SYMBOL_GPL(synchronize_sched);
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 856b6ee..013f627 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5164,7 +5164,9 @@ static void rollback_registered_many(struct list_head *head)
>        dev = list_first_entry(head, struct net_device, unreg_list);
>        call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev);
> 
> +       pr_err("begin rcu_barrier()\n");
>        rcu_barrier();
> +       pr_err("end rcu_barrier()\n");
> 
>        list_for_each_entry(dev, head, unreg_list)
>                dev_put(dev);
> @@ -5915,8 +5917,10 @@ EXPORT_SYMBOL(free_netdev);
>  */
> void synchronize_net(void)
> {
> +       pr_err("begin synchronize_net()\n");
>        might_sleep();
>        synchronize_rcu();
> +       pr_err("end synchronize_net()\n");
> }
> EXPORT_SYMBOL(synchronize_net);
> 
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Scalability of interface creation and deletion
  2011-05-08 14:47                           ` Paul E. McKenney
@ 2011-05-08 15:17                             ` Alex Bligh
  2011-05-08 15:48                               ` Paul E. McKenney
  0 siblings, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-08 15:17 UTC (permalink / raw)
  To: paulmck; +Cc: Eric Dumazet, netdev, Alex Bligh

Paul,

>> No, I waited a few minutes after boot for the system to stabilize, and
>> all CPUs were definitely online.
>>
>> The patch to the kernel I am running is below.
>
> OK, interesting...
>
> My guess is that you need to be using ktime_get_ts().  Isn't ktime_get()
> subject to various sorts of adjustment?

It's Eric's code, not mine, but:

kernel/time/timekeeping.c suggests they do the same thing
(adjust xtime by wall_to_monotonic), just one returns a
struct timespec and the other returns a ktime_t.

>> >> There is nothing much going on these systems (idle, no other users,
>> >> just normal system daemons).
>> >
>> > And normal system daemons might cause this, right?
>>
>> Yes. Everything is normal, except I did
>> service udev stop
>> unshare -n bash
>> which together stop the system running interface scripts when
>> interfaces are created (as upstart and upstart-udev-bridge are
>> now integrated, you can't kill upstart, so you have to rely on
>> unshare -n to stop the events being propagated). That's just
>> to avoid measuring the time it takes to execute the scripts.
>
> OK, so you really could be seeing grace periods started by these system
> daemons.

In 50% of 200 calls? That seems pretty unlikely. I think it's more
likely to be the 6 jiffies per call to ensure cpus are idle,
plus the 3 calls per interface destroy.

If 6 jiffies per call to ensure cpus are idle is a fact of life,
then the question goes back to why interface removal is waiting
for rcu readers to be released synchronously, as opposed to
doing the update bits synchronously, then doing the reclaim
element (freeing the memory) afterwards using call_rcu.
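
Something along these lines, in other words (a sketch only, with
hypothetical structure and function names rather than the actual
net_device teardown path):

    struct foo {
            struct list_head list;
            struct rcu_head  rcu;
            /* ... */
    };

    /* Reclaim step: runs after a grace period, nobody blocks for it */
    static void foo_free_rcu(struct rcu_head *head)
    {
            struct foo *f = container_of(head, struct foo, rcu);

            kfree(f);
    }

    static void foo_unlink(struct foo *f)
    {
            list_del_rcu(&f->list);            /* update step, immediate */
            call_rcu(&f->rcu, foo_free_rcu);   /* defer the free */
    }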

-- 
Alex Bligh


* Re: Scalability of interface creation and deletion
  2011-05-08 15:17                             ` Alex Bligh
@ 2011-05-08 15:48                               ` Paul E. McKenney
  2011-05-08 21:00                                 ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Paul E. McKenney @ 2011-05-08 15:48 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Eric Dumazet, netdev

On Sun, May 08, 2011 at 04:17:42PM +0100, Alex Bligh wrote:
> Paul,
> 
> >>No, I waited a few minutes after boot for the system to stabilize, and
> >>all CPUs were definitely online.
> >>
> >>The patch to the kernel I am running is below.
> >
> >OK, interesting...
> >
> >My guess is that you need to be using ktime_get_ts().  Isn't ktime_get()
> >subject to various sorts of adjustment?
> 
> It's Eric's code, not mine, but:
> 
> kernel/time/timekeeping.c suggests they do the same thing
> (adjust xtime by wall_to_monotonic), just one returns a
> struct timespec and the other returns a ktime_t.
> 
> >>>> There is nothing much going on these systems (idle, no other users,
> >>>> just normal system daemons).
> >>>
> >>> And normal system daemons might cause this, right?
> >>
> >>Yes. Everything is normal, except I did
> >>service udev stop
> >>unshare -n bash
> >>which together stop the system running interface scripts when
> >>interfaces are created (as upstart and upstart-udev-bridge are
> >>now integrated, you can't kill upstart, so you have to rely on
> >>unshare -n to stop the events being propagated). That's just
> >>to avoid measuring the time it takes to execute the scripts.
> >
> >OK, so you really could be seeing grace periods started by these system
> >daemons.
> 
> In 50% of 200 calls? That seems pretty unlikely. I think it's more
> likely to be the 6 jiffies per call to ensure cpus are idle,
> plus the 3 calls per interface destroy.
> 
> If 6 jiffies per call to ensure cpus are idle is a fact of life,
> then the question goes back to why interface removal is waiting
> for rcu readers to be released synchronously, as opposed to
> doing the update bits synchronously, then doing the reclaim
> element (freeing the memory) afterwards using call_rcu.

This would speed things up considerably, assuming that there is no
other reason to block for an RCU grace period.

							Thanx, Paul


* Re: Scalability of interface creation and deletion
  2011-05-08 15:48                               ` Paul E. McKenney
@ 2011-05-08 21:00                                 ` Eric Dumazet
  2011-05-09  4:44                                   ` [PATCH] veth: use batched device unregister Eric Dumazet
                                                     ` (3 more replies)
  0 siblings, 4 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-08 21:00 UTC (permalink / raw)
  To: paulmck; +Cc: Alex Bligh, netdev, Jesse Gross

On Sunday 08 May 2011 at 08:48 -0700, Paul E. McKenney wrote:
> On Sun, May 08, 2011 at 04:17:42PM +0100, Alex Bligh wrote:
> > 
> > If 6 jiffies per call to ensure cpus are idle is a fact of life,
> > then the question goes back to why interface removal is waiting
> > for rcu readers to be released synchronously, as opposed to
> > doing the update bits synchronously, then doing the reclaim
> > element (freeing the memory) afterwards using call_rcu.
> 
> This would speed things up considerably, assuming that there is no
> other reason to block for an RCU grace period.
> 

That's not so simple... Things are modular, and it's better to be safe than
to crash on a very rare event (device dismantles are not something we expect
to do very often; only special workloads might need to perform hundreds of
them per minute...)

For example, in the VLAN dismantle phase (ip link del eth0.103)
we have 3 calls to synchronize_rcu() and one call to rcu_barrier()

[ the 'extra' synchronize_rcu() call comes from unregister_vlan_dev() ]

Maybe with the new VLAN model we could now remove this synchronize_net()
call from the vlan code. Jesse, what do you think?
Once vlan_group_set_device(grp, vlan_id, NULL) has been called, why
should we respect an RCU grace period at all, given that dev is queued to
unregister_netdevice_queue() [ which has its own couple of
synchronize_net() / rcu_barrier() calls ]?


The real scalability problem with device dismantles comes from the fact
that all these waits are done under the RTNL mutex. This is the real killer,
because you cannot use your eight CPUs even if you are willing to.

We can probably speed things up, but we should consider the following
sequence of user actions:

ip link add link eth0 vlan103 type vlan id 103
ip link del vlan103
ip link add link eth1 vlan103 type vlan id 103

The "link del" command should return to user only if the minimum things
had been done, to make sure the following "link add" wont fail
mysteriously.





* [PATCH] veth: use batched device unregister
  2011-05-08 21:00                                 ` Eric Dumazet
@ 2011-05-09  4:44                                   ` Eric Dumazet
  2011-05-09  6:56                                     ` Michał Mirosław
  2011-05-09  7:45                                     ` [PATCH v2 net-next-2.6] veth: use batched device unregister Eric Dumazet
  2011-05-09  5:37                                   ` Scalability of interface creation and deletion Alex Bligh
                                                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09  4:44 UTC (permalink / raw)
  To: David Miller
  Cc: Alex Bligh, netdev, Jesse Gross, Paul E. McKenney, Ben Greear

veth devices don't use the batched device unregisters yet.

Since veth devices come in pairs, it makes sense to use a batch of two
unregisters; this roughly divides the dismantle time by two.

Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ben Greear <greearb@candelatech.com>
---
 drivers/net/veth.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 3b99f64..77c4679 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -451,8 +451,16 @@ static void veth_dellink(struct net_device *dev, struct list_head *head)
 	priv = netdev_priv(dev);
 	peer = priv->peer;
 
-	unregister_netdevice_queue(dev, head);
-	unregister_netdevice_queue(peer, head);
+	if (head == NULL) {
+		LIST_HEAD(list);
+		/* make a batch of two devices to speedup unregister */
+		unregister_netdevice_queue(dev, &list);
+		unregister_netdevice_queue(peer, &list);
+		unregister_netdevice_many(&list);
+	} else {
+		unregister_netdevice_queue(dev, head);
+		unregister_netdevice_queue(peer, head);
+	}
 }
 
 static const struct nla_policy veth_policy[VETH_INFO_MAX + 1];




* Re: Scalability of interface creation and deletion
  2011-05-08 21:00                                 ` Eric Dumazet
  2011-05-09  4:44                                   ` [PATCH] veth: use batched device unregister Eric Dumazet
@ 2011-05-09  5:37                                   ` Alex Bligh
  2011-05-09  6:37                                     ` Eric Dumazet
  2011-05-09  7:11                                   ` Paul E. McKenney
  2011-05-09 17:30                                   ` Jesse Gross
  3 siblings, 1 reply; 53+ messages in thread
From: Alex Bligh @ 2011-05-09  5:37 UTC (permalink / raw)
  To: Eric Dumazet, paulmck; +Cc: netdev, Jesse Gross, Alex Bligh



--On 8 May 2011 23:00:47 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> We can probably speed things, but we should consider the following user
> actions :

How about

> ip link add link eth0 vlan103 type vlan id 103
> ip link del vlan103

Removes and unlinks structures, including making the name available and
sending out netlink messages, but doesn't free anything.

> ip link add link eth1 vlan103 type vlan id 103

creates new interface

[some time later] original zombie i/f freed

> The "link del" command should return to user only if the minimum things
> had been done, to make sure the following "link add" wont fail
> mysteriously.

Are you worried about failure through name collision (already
dealt with), vlan tag collision (ditto) or what?

-- 
Alex Bligh


* Re: Scalability of interface creation and deletion
  2011-05-09  5:37                                   ` Scalability of interface creation and deletion Alex Bligh
@ 2011-05-09  6:37                                     ` Eric Dumazet
  0 siblings, 0 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09  6:37 UTC (permalink / raw)
  To: Alex Bligh; +Cc: paulmck, netdev, Jesse Gross, Ben Greear

On Monday 09 May 2011 at 06:37 +0100, Alex Bligh wrote:
> 
> --On 8 May 2011 23:00:47 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > We can probably speed things, but we should consider the following user
> > actions :
> 
> How about
> 
> > ip link add link eth0 vlan103 type vlan id 103
> > ip link del vlan103
> 
> Removes and unlinks structures, including making name available, sending
> out netlink messages, but doesn't free things

Most of the cleanup work has to be done with RTNL held, and this might be
because of transaction atomicity requirements.

In your test you dismantle idle devices. Now think a bit about what happens
when you have traffic both in and out, sockets with destinations still
pointing to the device, and in-flight ARP requests, all of this using RCU
of course.

When you dismantle one device (or several, in the case of a module unload),
this can have implications for other devices (see the veth case for an
obvious example: it automatically removes the peer device), but also for
routes, neighbours, cached routes, various protocol cleanups, ... and so
on. Few people even on netdev understand the whole picture.

Given that 99.99% of machines set up net devices only at boot time and
hardly ever consider dismantles, we netdev guys were pragmatic and safe.
Two or three synchronize_rcu() calls were considered a non-issue.

It seems there is interest in improving things now.

One way is to allow more batching and delegation, and I am working on that
right now, using a kthread, so that we don't block the requester for the
whole device dismantle.

This kthread might use a call_rcu()-driven state machine, but that is an
implementation detail, since only the kthread would be impacted.

I am pretty busy at work these days, so don't expect patches for some
time :)
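
Just to give a flavour of what I mean, a very rough sketch; the names below
are invented, and a plain workqueue is used only to keep the example short,
so this is not necessarily what the real design will look like:

#include <linux/workqueue.h>
#include <linux/netdevice.h>
#include <linux/slab.h>

/* hypothetical: none of these names exist in the kernel today */
struct dismantle_work {
	struct work_struct work;
	struct net_device *dev;
};

static void dismantle_workfn(struct work_struct *work)
{
	struct dismantle_work *dw =
		container_of(work, struct dismantle_work, work);

	synchronize_rcu();	/* grace-period waits happen here, not under RTNL */
	free_netdev(dw->dev);
	kfree(dw);
}

/* the caller queues the slow part and returns immediately */
static int dismantle_async(struct net_device *dev)
{
	struct dismantle_work *dw = kmalloc(sizeof(*dw), GFP_KERNEL);

	if (!dw)
		return -ENOMEM;
	dw->dev = dev;
	INIT_WORK(&dw->work, dismantle_workfn);
	schedule_work(&dw->work);
	return 0;
}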





* Re: [PATCH] veth: use batched device unregister
  2011-05-09  4:44                                   ` [PATCH] veth: use batched device unregister Eric Dumazet
@ 2011-05-09  6:56                                     ` Michał Mirosław
  2011-05-09  8:20                                       ` Eric Dumazet
  2011-05-09  7:45                                     ` [PATCH v2 net-next-2.6] veth: use batched device unregister Eric Dumazet
  1 sibling, 1 reply; 53+ messages in thread
From: Michał Mirosław @ 2011-05-09  6:56 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Alex Bligh, netdev, Jesse Gross, Paul E. McKenney,
	Ben Greear

2011/5/9 Eric Dumazet <eric.dumazet@gmail.com>:
> veth devices dont use the batched device unregisters yet.
>
> Since veth are a pair of devices, it makes sense to use a batch of two
> unregisters, this roughly divide dismantle time by two.
[...]
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -451,8 +451,16 @@ static void veth_dellink(struct net_device *dev, struct list_head *head)
>        priv = netdev_priv(dev);
>        peer = priv->peer;
>
> -       unregister_netdevice_queue(dev, head);
> -       unregister_netdevice_queue(peer, head);
> +       if (head == NULL) {
> +               LIST_HEAD(list);
> +               /* make a batch of two devices to speedup unregister */
> +               unregister_netdevice_queue(dev, &list);
> +               unregister_netdevice_queue(peer, &list);
> +               unregister_netdevice_many(&list);
> +       } else {
> +               unregister_netdevice_queue(dev, head);
> +               unregister_netdevice_queue(peer, head);
> +       }

You could change dellink callers to always pass head != NULL. As a
side effect, unregister_netdevice_queue() would do just what its name
suggests.

Best Regards,
Michał Mirosław


* Re: Scalability of interface creation and deletion
  2011-05-08 21:00                                 ` Eric Dumazet
  2011-05-09  4:44                                   ` [PATCH] veth: use batched device unregister Eric Dumazet
  2011-05-09  5:37                                   ` Scalability of interface creation and deletion Alex Bligh
@ 2011-05-09  7:11                                   ` Paul E. McKenney
  2011-05-09 17:30                                   ` Jesse Gross
  3 siblings, 0 replies; 53+ messages in thread
From: Paul E. McKenney @ 2011-05-09  7:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Alex Bligh, netdev, Jesse Gross

On Sun, May 08, 2011 at 11:00:47PM +0200, Eric Dumazet wrote:
> On Sunday 08 May 2011 at 08:48 -0700, Paul E. McKenney wrote:
> > On Sun, May 08, 2011 at 04:17:42PM +0100, Alex Bligh wrote:
> > > 
> > > If 6 jiffies per call to ensure cpus are idle is a fact of life,
> > > then the question goes back to why interface removal is waiting
> > > for rcu readers to be released synchronously, as opposed to
> > > doing the update bits synchronously, then doing the reclaim
> > > element (freeing the memory) afterwards using call_rcu.
> > 
> > This would speed things up considerably, assuming that there is no
> > other reason to block for an RCU grace period.
> 
> Thats not so simple... Things are modular and better be safe than crash,
> on a very rare event (device dismantles are not the thing we expect to
> do very often. Only special needs might need to perform hundred of them
> per minute...)

I was afraid of that, but had to ask...

> For example, in the VLAN dismantle phase (ip link del eth0.103)
> we have 3 calls to synchronize_rcu() and one call to rcu_barrier()
> 
> [ the 'extra' synchronize_rcu() call comes from unregister_vlan_dev() ]
> 
> Maybe with new VLAN model, we could now remove this synchronize_net()
> call from vlan code. Jesse what do you think ?
> Once vlan_group_set_device(grp, vlan_id, NULL) had been called, why
> should we respect one rcu grace period at all, given dev is queued to
> unregister_netdevice_queue() [ which has its own couples of
> synchronize_net() / rcu_barrier() ]
> 
> 
> The real scalability problem of device dismantles comes from the fact
> that all these waits are done under RTNL mutex. This is the real killer
> because you cannot use your eight cpus, even if you are willing to.
> 
> We can probably speed things, but we should consider the following user
> actions :
> 
> ip link add link eth0 vlan103 type vlan id 103
> ip link del vlan103
> ip link add link eth1 vlan103 type vlan id 103
> 
> The "link del" command should return to user only if the minimum things
> had been done, to make sure the following "link add" wont fail
> mysteriously.

Hmmm...  One approach would be to use synchronize_rcu_expedited(), though
that is a bit of a big hammer.
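
For illustration only, the sort of one-line substitution that would mean,
taking synchronize_net() as an example call site (this is a sketch, not a
patch posted in this thread):

/* sketch only: synchronize_net() with the expedited primitive */
void synchronize_net(void)
{
	might_sleep();
	synchronize_rcu_expedited();
}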

							Thanx, Paul


* [PATCH v2 net-next-2.6] veth: use batched device unregister
  2011-05-09  4:44                                   ` [PATCH] veth: use batched device unregister Eric Dumazet
  2011-05-09  6:56                                     ` Michał Mirosław
@ 2011-05-09  7:45                                     ` Eric Dumazet
  2011-05-09  9:22                                       ` Eric Dumazet
  1 sibling, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09  7:45 UTC (permalink / raw)
  To: David Miller
  Cc: Alex Bligh, netdev, Jesse Gross, Paul E. McKenney, Ben Greear

veth devices don't use the batched device unregisters yet.

Since veth devices come in pairs, it makes sense to use a batch of two
unregisters; this roughly divides the dismantle time by two.

Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ben Greear <greearb@candelatech.com>
---
v2: added a list_del(&list) for safety (see commit ceaaec98)
 drivers/net/veth.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 3b0151a..b41d6a9 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -416,8 +416,17 @@ static void veth_dellink(struct net_device *dev, struct list_head *head)
 	priv = netdev_priv(dev);
 	peer = priv->peer;
 
-	unregister_netdevice_queue(dev, head);
-	unregister_netdevice_queue(peer, head);
+	if (head == NULL) {
+		LIST_HEAD(list);
+		/* make a batch of two devices to speedup unregister */
+		unregister_netdevice_queue(dev, &list);
+		unregister_netdevice_queue(peer, &list);
+		unregister_netdevice_many(&list);
+		list_del(&list);
+	} else {
+		unregister_netdevice_queue(dev, head);
+		unregister_netdevice_queue(peer, head);
+	}
 }
 
 static const struct nla_policy veth_policy[VETH_INFO_MAX + 1];




* Re: [PATCH] veth: use batched device unregister
  2011-05-09  6:56                                     ` Michał Mirosław
@ 2011-05-09  8:20                                       ` Eric Dumazet
  2011-05-09  9:17                                         ` [PATCH net-next-2.6] net: use batched device unregister in veth and macvlan Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09  8:20 UTC (permalink / raw)
  To: Michał Mirosław
  Cc: David Miller, Alex Bligh, netdev, Jesse Gross, Paul E. McKenney,
	Ben Greear

On Monday 09 May 2011 at 08:56 +0200, Michał Mirosław wrote:

> You could change dellink callers to always pass head != NULL. As a
> side effect, unregister_netdevice_queue() would do just what its name
> suggests.

Good idea. At first glance, macvlan and rtnetlink.c would need a change.

This would let macvlan_device_event(event=NETDEV_UNREGISTER) use batching
as well.

And yes, unregister_netdevice_queue(dev, head) would only make a

list_move_tail(&dev->unreg_list, head);
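
For reference, this is roughly what unregister_netdevice_queue() already
does (simplified; the real function also asserts RTNL and schedules the
unregister todo work in the head == NULL case):

void unregister_netdevice_queue(struct net_device *dev, struct list_head *head)
{
	if (head)
		list_move_tail(&dev->unreg_list, head);	/* just queue it */
	else
		rollback_registered(dev);		/* immediate single-device path */
}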

Will submit a patch soon, thanks !




* [PATCH net-next-2.6] net: use batched device unregister in veth and macvlan
  2011-05-09  8:20                                       ` Eric Dumazet
@ 2011-05-09  9:17                                         ` Eric Dumazet
  2011-05-09 18:42                                           ` David Miller
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09  9:17 UTC (permalink / raw)
  To: Michał Mirosław, David Miller
  Cc: Alex Bligh, netdev, Jesse Gross, Paul E. McKenney, Ben Greear

veth devices don't use the batched device unregisters yet.

Since veth devices come in pairs, it makes sense to use a batch of two
unregisters; this roughly divides the dismantle time by two.

Fix this by changing dellink() callers to always provide a non-NULL
head (idea from Michał Mirosław).

This patch also handles macvlan case : We now dismantle all macvlans on
top of a lower dev at once.

Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Michał Mirosław <mirqus@gmail.com>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ben Greear <greearb@candelatech.com>
---
v3: Michał Mirosław dellink idea

 drivers/net/macvlan.c |    5 ++++-
 net/core/rtnetlink.c  |    5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 3ad5425..d7c0bc62 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -785,6 +785,7 @@ static int macvlan_device_event(struct notifier_block *unused,
 	struct net_device *dev = ptr;
 	struct macvlan_dev *vlan, *next;
 	struct macvlan_port *port;
+	LIST_HEAD(list_kill);
 
 	if (!macvlan_port_exists(dev))
 		return NOTIFY_DONE;
@@ -810,7 +811,9 @@ static int macvlan_device_event(struct notifier_block *unused,
 			break;
 
 		list_for_each_entry_safe(vlan, next, &port->vlans, list)
-			vlan->dev->rtnl_link_ops->dellink(vlan->dev, NULL);
+			vlan->dev->rtnl_link_ops->dellink(vlan->dev, &list_kill);
+		unregister_netdevice_many(&list_kill);
+		list_del(&list_kill);
 		break;
 	case NETDEV_PRE_TYPE_CHANGE:
 		/* Forbid underlaying device to change its type. */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 5a160f4..d2ba259 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1501,6 +1501,7 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	char ifname[IFNAMSIZ];
 	struct nlattr *tb[IFLA_MAX+1];
 	int err;
+	LIST_HEAD(list_kill);
 
 	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
 	if (err < 0)
@@ -1524,7 +1525,9 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	if (!ops)
 		return -EOPNOTSUPP;
 
-	ops->dellink(dev, NULL);
+	ops->dellink(dev, &list_kill);
+	unregister_netdevice_many(&list_kill);
+	list_del(&list_kill);
 	return 0;
 }
 




* Re: [PATCH v2 net-next-2.6] veth: use batched device unregister
  2011-05-09  7:45                                     ` [PATCH v2 net-next-2.6] veth: use batched device unregister Eric Dumazet
@ 2011-05-09  9:22                                       ` Eric Dumazet
  0 siblings, 0 replies; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09  9:22 UTC (permalink / raw)
  To: David Miller
  Cc: Alex Bligh, netdev, Jesse Gross, Paul E. McKenney, Ben Greear

On Monday 09 May 2011 at 09:45 +0200, Eric Dumazet wrote:
> veth devices dont use the batched device unregisters yet.
> 
> Since veth are a pair of devices, it makes sense to use a batch of two
> unregisters, this roughly divide dismantle time by two.
> 
> Reported-by: Alex Bligh <alex@alex.org.uk>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Jesse Gross <jesse@nicira.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Ben Greear <greearb@candelatech.com>
> ---
> v2: added a list_del(&list) for safety (see commit ceaaec98)

Just to make things clear, please don't apply this patch, since I posted
another version including Michał's idea.

thanks




* Re: Scalability of interface creation and deletion
  2011-05-08 21:00                                 ` Eric Dumazet
                                                     ` (2 preceding siblings ...)
  2011-05-09  7:11                                   ` Paul E. McKenney
@ 2011-05-09 17:30                                   ` Jesse Gross
  3 siblings, 0 replies; 53+ messages in thread
From: Jesse Gross @ 2011-05-09 17:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: paulmck, Alex Bligh, netdev

On Sun, May 8, 2011 at 2:00 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> For example, in the VLAN dismantle phase (ip link del eth0.103)
> we have 3 calls to synchronize_rcu() and one call to rcu_barrier()
>
> [ the 'extra' synchronize_rcu() call comes from unregister_vlan_dev() ]
>
> Maybe with new VLAN model, we could now remove this synchronize_net()
> call from vlan code. Jesse what do you think ?
> Once vlan_group_set_device(grp, vlan_id, NULL) had been called, why
> should we respect one rcu grace period at all, given dev is queued to
> unregister_netdevice_queue() [ which has its own couples of
> synchronize_net() / rcu_barrier() ]

Yes, I agree that the extra call to synchronize_net() provides no
value, though I think that's actually been true for a while.


* Re: [PATCH net-next-2.6] net: use batched device unregister in veth and macvlan
  2011-05-09  9:17                                         ` [PATCH net-next-2.6] net: use batched device unregister in veth and macvlan Eric Dumazet
@ 2011-05-09 18:42                                           ` David Miller
  2011-05-09 19:05                                             ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: David Miller @ 2011-05-09 18:42 UTC (permalink / raw)
  To: eric.dumazet; +Cc: mirqus, alex, netdev, jesse, paulmck, greearb

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 09 May 2011 11:17:57 +0200

> veth devices dont use the batched device unregisters yet.
> 
> Since veth are a pair of devices, it makes sense to use a batch of two
> unregisters, this roughly divides dismantle time by two.
> 
> Fix this by changing dellink() callers to always provide a non NULL
> head. (Idea from Michał Mirosław)
> 
> This patch also handles macvlan case : We now dismantle all macvlans on
> top of a lower dev at once.
> 
> Reported-by: Alex Bligh <alex@alex.org.uk>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.


* Re: [PATCH net-next-2.6] net: use batched device unregister in veth and macvlan
  2011-05-09 18:42                                           ` David Miller
@ 2011-05-09 19:05                                             ` Eric Dumazet
  2011-05-09 20:17                                               ` Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09 19:05 UTC (permalink / raw)
  To: David Miller
  Cc: mirqus, alex, netdev, jesse, paulmck, greearb, Patrick McHardy

On Monday 09 May 2011 at 11:42 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 09 May 2011 11:17:57 +0200
> 
> > veth devices dont use the batched device unregisters yet.
> > 
> > Since veth are a pair of devices, it makes sense to use a batch of two
> > unregisters, this roughly divides dismantle time by two.
> > 
> > Fix this by changing dellink() callers to always provide a non NULL
> > head. (Idea from Michał Mirosław)
> > 
> > This patch also handles macvlan case : We now dismantle all macvlans on
> > top of a lower dev at once.
> > 
> > Reported-by: Alex Bligh <alex@alex.org.uk>
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> Applied.

Thanks !

I believe there is one problem with this patch and
unregister_vlan_dev(), I'll have to find a solution fast ;)


ip link add link eth2 eth2.103 type vlan id 103 gvrp on
ip link add link eth2 eth2.104 type vlan id 104 gvrp on
ip link set eth2.103 up
ip link set eth2.104 up
ip link del eth2.103
ip link del eth2.104   <<<BUG>>>


[  372.573591] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  372.573738] IP: [<ffffffffa014ecde>] garp_request_leave+0x2e/0x88 [garp]
[  372.573835] PGD 7a7d0067 PUD 7c9b1067 PMD 0 
[  372.573995] Oops: 0000 [#1] SMP 
[  372.574119] last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
[  372.574180] CPU 3 
[  372.574221] Modules linked in: 8021q garp stp llc nfsd lockd sunrpc tg3 libphy sg [last unloaded: x_tables]
[  372.574765] 
[  372.574817] Pid: 5656, comm: ip Tainted: G        W   2.6.39-rc2-01916-g0e21eae-dirty #696 HP ProLiant BL460c G6
[  372.574967] RIP: 0010:[<ffffffffa014ecde>]  [<ffffffffa014ecde>] garp_request_leave+0x2e/0x88 [garp]
[  372.575083] RSP: 0018:ffff8801168697c8  EFLAGS: 00010282
[  372.577084] RAX: 0000000000000000 RBX: ffff880116869816 RCX: 0000000000000002
[  372.577146] RDX: 0000000000000000 RSI: ffffffffa01594c0 RDI: ffff880117bc0000
[  372.577208] RBP: ffff8801168697f8 R08: 0000000000000001 R09: ffff88007a190800
[  372.577269] R10: ffff88007a17da00 R11: 0000000000000000 R12: ffff880117bc0000
[  372.577331] R13: ffff8801168699d8 R14: 0000000000000001 R15: 0000000000000002
[  372.577393] FS:  0000000000000000(0000) GS:ffff88007fc40000(0063) knlGS:00000000f779f6c0
[  372.577494] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  372.577553] CR2: 0000000000000000 CR3: 000000007af08000 CR4: 00000000000006e0
[  372.577615] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  372.577677] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  372.577739] Process ip (pid: 5656, threadinfo ffff880116868000, task ffff88011a388000)
[  372.577816] Stack:
[  372.577868]  ffff8801168697e8 ffff88007a74c800 ffff880117bc0000 ffff8801168699d8
[  372.578083]  ffff880116869868 0000000000000000 ffff880116869818 ffffffffa0158226
[  372.578297]  0000000316869818 6800880116869938 ffff880116869838 ffffffffa0157467
[  372.578511] Call Trace:
[  372.578579]  [<ffffffffa0158226>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
[  372.578642]  [<ffffffffa0157467>] vlan_dev_stop+0xb7/0xc0 [8021q]
[  372.578703]  [<ffffffff81398b87>] __dev_close_many+0x87/0xe0
[  372.578763]  [<ffffffff81398c67>] dev_close_many+0x87/0x110
[  372.578823]  [<ffffffff81398d90>] rollback_registered_many+0xa0/0x240
[  372.578884]  [<ffffffff81398f49>] unregister_netdevice_many+0x19/0x60
[  372.578946]  [<ffffffff813a7e62>] rtnl_dellink+0xc2/0xf0
[  372.579005]  [<ffffffff813a5ae7>] rtnetlink_rcv_msg+0x247/0x250
[  372.579066]  [<ffffffff813a58a0>] ? rtnetlink_net_init+0x40/0x40
[  372.579126]  [<ffffffff813cb529>] netlink_rcv_skb+0x99/0xc0
[  372.579185]  [<ffffffff813a7690>] rtnetlink_rcv+0x20/0x30
[  372.579244]  [<ffffffff813cb296>] netlink_unicast+0x296/0x2a0
[  372.579304]  [<ffffffff8139052f>] ? memcpy_fromiovec+0x5f/0x80
[  372.579364]  [<ffffffff813cc1c7>] netlink_sendmsg+0x227/0x370


unregister_vlan_dev() does :

vlan_group_set_device(grp, vlan_id, NULL); 

unregister_netdevice_queue(dev, head);
/* If the group is now empty, kill off the group. */
if (grp->nr_vlans == 0) {
	vlan_gvrp_uninit_applicant(real_dev);


Now that 'head' is no longer NULL, we no longer immediately release the
dev in unregister_netdevice_queue() but queue it instead.

So vlan_gvrp_uninit_applicant() now frees the garp structure _before_
vlan_gvrp_request_leave() is called from vlan_dev_stop(), and we
dereference a NULL pointer in garp_request_leave().

I suspect we should move the 'group freeing' out of unregister_vlan_dev()
into vlan_dev_stop()?

Patrick, David, any ideas before I cook a patch?

BTW, the bug must be present in net-2.6 if we unload the vlan module (since
in that case we also have a non-NULL head).

Thanks




* Re: [PATCH net-next-2.6] net: use batched device unregister in veth and macvlan
  2011-05-09 19:05                                             ` Eric Dumazet
@ 2011-05-09 20:17                                               ` Eric Dumazet
  2011-05-10  6:40                                                 ` [PATCH net-2.6] vlan: fix GVRP at dismantle time Eric Dumazet
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-09 20:17 UTC (permalink / raw)
  To: David Miller
  Cc: mirqus, alex, netdev, jesse, paulmck, greearb, Patrick McHardy

On Monday 09 May 2011 at 21:05 +0200, Eric Dumazet wrote:

> BTW, bug must be present in net-2.6, if we unload vlan module (since in this
> case we also had a non NULL head )

Yes, I confirm we have the bug in linux-2.6

Here eth2 provided by tg3 module

ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link add link eth2 eth2.104 type vlan id 104 gvrp on loose_binding on
ip link set eth2.103 up
ip link set eth2.104 up
sync
sleep 5
sync
rmmod tg3

<<<BUG>>>

[ 1848.669338] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1848.669388] IP: [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
[ 1848.669421] PGD 11d251067 PUD 11b9e0067 PMD 0 
[ 1848.669454] Oops: 0000 [#1] SMP 
[ 1848.669482] last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
[ 1848.669512] CPU 0 
[ 1848.669518] Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]
[ 1848.669613] 
[ 1848.669633] Pid: 11494, comm: rmmod Tainted: G        W   2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
[ 1848.669688] RIP: 0010:[<ffffffffa0030c9e>]  [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
[ 1848.669740] RSP: 0018:ffff88007a19bae8  EFLAGS: 00010286
[ 1848.669766] RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
[ 1848.669796] RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
[ 1848.669826] RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
[ 1848.669860] R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
[ 1848.669890] R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
[ 1848.669920] FS:  0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
[ 1848.669966] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 1848.669993] CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
[ 1848.670023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1848.670063] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1848.670115] Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
[ 1848.672098] Stack:
[ 1848.672118]  ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
[ 1848.672172]  ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
[ 1848.672235]  ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
[ 1848.672293] Call Trace:
[ 1848.672317]  [<ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
[ 1848.672348]  [<ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
[ 1848.672377]  [<ffffffff8137e427>] __dev_close_many+0x87/0xe0
[ 1848.672405]  [<ffffffff8137e507>] dev_close_many+0x87/0x110
[ 1848.672433]  [<ffffffff8137e630>] rollback_registered_many+0xa0/0x240
[ 1848.672462]  [<ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
[ 1848.672492]  [<ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
[ 1848.672522]  [<ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
[ 1848.672551]  [<ffffffff81479d03>] notifier_call_chain+0x53/0x80
[ 1848.672583]  [<ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
[ 1848.672612]  [<ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
[ 1848.672641]  [<ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
[ 1848.672671]  [<ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
[ 1848.672700]  [<ffffffff8137e85f>] rollback_registered+0x2f/0x40
[ 1848.672729]  [<ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
[ 1848.672759]  [<ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
[ 1848.672788]  [<ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]





* Re: Scalability of interface creation and deletion
  2011-05-07 15:54     ` Eric Dumazet
  2011-05-07 16:23       ` Ben Greear
@ 2011-05-09 21:46       ` Octavian Purdila
  1 sibling, 0 replies; 53+ messages in thread
From: Octavian Purdila @ 2011-05-09 21:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Alex Bligh, netdev

On Sat, May 7, 2011 at 6:54 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

>
> synchronize_rcu() calls are not consuming cpu, they just _wait_
> rcu grace period.
>
> I suggest you read Documentation/RCU files if you really want to :)
>
> If you want to check how expensive it is, its quite easy:
> add a trace in synchronize_net()
>
<snip>

I proposed adding a "wait" software counter to perf [1] a while ago,
which would allow people to identify sync_rcu hotspots:

http://marc.info/?l=linux-kernel&m=129188584110162

I don't know how much visibility it got, so given this context, I
thought of bringing it up again :)


* [PATCH net-2.6] vlan: fix GVRP at dismantle time
  2011-05-09 20:17                                               ` Eric Dumazet
@ 2011-05-10  6:40                                                 ` Eric Dumazet
  2011-05-10 19:23                                                   ` David Miller
  0 siblings, 1 reply; 53+ messages in thread
From: Eric Dumazet @ 2011-05-10  6:40 UTC (permalink / raw)
  To: David Miller; +Cc: mirqus, alex, netdev, jesse, greearb, Patrick McHardy

On Monday 09 May 2011 at 22:17 +0200, Eric Dumazet wrote:
> On Monday 09 May 2011 at 21:05 +0200, Eric Dumazet wrote:
> 
> > BTW, bug must be present in net-2.6, if we unload vlan module (since in this
> > case we also had a non NULL head )
> 
> Yes, I confirm we have the bug in linux-2.6
> 

Here is a patch to address this problem.

Thanks !

[PATCH net-2.6] vlan: fix GVRP at dismantle time

ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link set eth2.103 up
rmmod tg3    # driver providing eth2

 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 PGD 11d251067 PUD 11b9e0067 PMD 0 
 Oops: 0000 [#1] SMP 
 last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
 CPU 0 
 Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]
 
 Pid: 11494, comm: rmmod Tainted: G        W   2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
 RIP: 0010:[<ffffffffa0030c9e>]  [<ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
 RSP: 0018:ffff88007a19bae8  EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
 RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
 RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
 R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
 R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
 CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
 Stack:
  ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
  ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
  ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
 Call Trace:
  [<ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
  [<ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
  [<ffffffff8137e427>] __dev_close_many+0x87/0xe0
  [<ffffffff8137e507>] dev_close_many+0x87/0x110
  [<ffffffff8137e630>] rollback_registered_many+0xa0/0x240
  [<ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
  [<ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
  [<ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
  [<ffffffff81479d03>] notifier_call_chain+0x53/0x80
  [<ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
  [<ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
  [<ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
  [<ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
  [<ffffffff8137e85f>] rollback_registered+0x2f/0x40
  [<ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
  [<ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
  [<ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]

We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant() 
is called right after unregister_netdevice_queue(). In batch mode,
unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
 net/8021q/vlan.c     |    3 +++
 net/8021q/vlan_dev.c |    3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 7850412..0eb1a88 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -124,6 +124,9 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 
 	grp->nr_vlans--;
 
+	if (vlan->flags & VLAN_FLAG_GVRP)
+		vlan_gvrp_request_leave(dev);
+
 	vlan_group_set_device(grp, vlan_id, NULL);
 	if (!grp->killall)
 		synchronize_net();
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index e34ea9e..b2ff6c8 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -487,9 +487,6 @@ static int vlan_dev_stop(struct net_device *dev)
 	struct vlan_dev_info *vlan = vlan_dev_info(dev);
 	struct net_device *real_dev = vlan->real_dev;
 
-	if (vlan->flags & VLAN_FLAG_GVRP)
-		vlan_gvrp_request_leave(dev);
-
 	dev_mc_unsync(real_dev, dev);
 	dev_uc_unsync(real_dev, dev);
 	if (dev->flags & IFF_ALLMULTI)




* Re: [PATCH net-2.6] vlan: fix GVRP at dismantle time
  2011-05-10  6:40                                                 ` [PATCH net-2.6] vlan: fix GVRP at dismantle time Eric Dumazet
@ 2011-05-10 19:23                                                   ` David Miller
  0 siblings, 0 replies; 53+ messages in thread
From: David Miller @ 2011-05-10 19:23 UTC (permalink / raw)
  To: eric.dumazet; +Cc: mirqus, alex, netdev, jesse, greearb, kaber

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 10 May 2011 08:40:36 +0200

> [PATCH net-2.6] vlan: fix GVRP at dismantle time

Applied and queued up for -stable, thanks!


end of thread

Thread overview: 53+ messages
2011-05-07 11:08 Scalability of interface creation and deletion Alex Bligh
2011-05-07 12:22 ` Eric Dumazet
2011-05-07 15:26   ` Alex Bligh
2011-05-07 15:54     ` Eric Dumazet
2011-05-07 16:23       ` Ben Greear
2011-05-07 16:37         ` Eric Dumazet
2011-05-07 16:44           ` Ben Greear
2011-05-07 16:51             ` Eric Dumazet
2011-05-08  3:45               ` Ben Greear
2011-05-08  8:08                 ` Alex Bligh
2011-05-09 21:46       ` Octavian Purdila
2011-05-07 16:26     ` Eric Dumazet
2011-05-07 18:24       ` Alex Bligh
2011-05-07 18:32         ` Eric Dumazet
2011-05-07 18:39           ` Eric Dumazet
2011-05-08 10:09             ` Alex Bligh
2011-05-07 18:42           ` Eric Dumazet
2011-05-07 18:50             ` Alex Bligh
2011-05-08  7:12             ` Eric Dumazet
2011-05-08  8:06               ` Alex Bligh
2011-05-08  9:35               ` Alex Bligh
2011-05-08 12:18                 ` Alex Bligh
2011-05-08 12:50                   ` Paul E. McKenney
2011-05-08 13:13                     ` Alex Bligh
2011-05-08 13:44                       ` Paul E. McKenney
2011-05-08 14:27                         ` Alex Bligh
2011-05-08 14:47                           ` Paul E. McKenney
2011-05-08 15:17                             ` Alex Bligh
2011-05-08 15:48                               ` Paul E. McKenney
2011-05-08 21:00                                 ` Eric Dumazet
2011-05-09  4:44                                   ` [PATCH] veth: use batched device unregister Eric Dumazet
2011-05-09  6:56                                     ` Michał Mirosław
2011-05-09  8:20                                       ` Eric Dumazet
2011-05-09  9:17                                         ` [PATCH net-next-2.6] net: use batched device unregister in veth and macvlan Eric Dumazet
2011-05-09 18:42                                           ` David Miller
2011-05-09 19:05                                             ` Eric Dumazet
2011-05-09 20:17                                               ` Eric Dumazet
2011-05-10  6:40                                                 ` [PATCH net-2.6] vlan: fix GVRP at dismantle time Eric Dumazet
2011-05-10 19:23                                                   ` David Miller
2011-05-09  7:45                                     ` [PATCH v2 net-next-2.6] veth: use batched device unregister Eric Dumazet
2011-05-09  9:22                                       ` Eric Dumazet
2011-05-09  5:37                                   ` Scalability of interface creation and deletion Alex Bligh
2011-05-09  6:37                                     ` Eric Dumazet
2011-05-09  7:11                                   ` Paul E. McKenney
2011-05-09 17:30                                   ` Jesse Gross
2011-05-08 12:44                 ` Paul E. McKenney
2011-05-08 13:06                   ` Alex Bligh
2011-05-08 13:14                     ` Alex Bligh
2011-05-08 12:32               ` Paul E. McKenney
2011-05-07 18:51           ` Alex Bligh
2011-05-07 19:24             ` Eric Dumazet
2011-05-07 18:38       ` Alex Bligh
2011-05-07 18:44         ` Eric Dumazet
