netdev.vger.kernel.org archive mirror
* network regression: cannot rename netdev twice
       [not found] ` <CAPXgP12Sr2KzGJ9RA13QBOCkctb-z3O4+1uHOjANgMDDv2pxaQ@mail.gmail.com>
@ 2012-01-31 10:41   ` Jiri Slaby
  2012-01-31 10:52     ` Kay Sievers
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2012-01-31 10:41 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Eric W. Biederman, Greg KH, LKML, ML netdev

On 01/30/2012 11:52 PM, Kay Sievers wrote:
> 2012/1/30 Jiri Slaby <jslaby@suse.cz>:
>> I cannot boot properly with this commit:
>> commit 524b6c5b39b931311dfe5a2f5abae2f5c9731676
>> Author: Eric W. Biederman <ebiederm@xmission.com>
>> Date:   Sun Dec 18 20:09:31 2011 -0800
>>
>>    sysfs: Kill nlink counting.
>>
>>
>> 1) network systemd rule doesn't start network
> 
> What does that mean? What's a network systemd rule?

Oh, perhaps you call it a service file, not rule file?

Anyway this is a different bug. Revert of the patch above does not help.

The bug lies in the network layer. udev is unable to perform persistent
eth naming:
# ip link set eth0 name eth1    -- this one is OK
# ip link set eth1 name eth0
RTNETLINK answers: No such file or directory

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: network regression: cannot rename netdev twice
  2012-01-31 10:41   ` network regression: cannot rename netdev twice Jiri Slaby
@ 2012-01-31 10:52     ` Kay Sievers
  2012-01-31 11:00       ` Jiri Slaby
  2012-02-04  2:14       ` network regression: cannot rename netdev twice Henrique de Moraes Holschuh
  0 siblings, 2 replies; 19+ messages in thread
From: Kay Sievers @ 2012-01-31 10:52 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Eric W. Biederman, Greg KH, LKML, ML netdev

On Tue, Jan 31, 2012 at 11:41, Jiri Slaby <jslaby@suse.cz> wrote:
> On 01/30/2012 11:52 PM, Kay Sievers wrote:
>> 2012/1/30 Jiri Slaby <jslaby@suse.cz>:
>>> I cannot boot properly with this commit:
>>> commit 524b6c5b39b931311dfe5a2f5abae2f5c9731676
>>> Author: Eric W. Biederman <ebiederm@xmission.com>
>>> Date:   Sun Dec 18 20:09:31 2011 -0800
>>>
>>>    sysfs: Kill nlink counting.
>>>
>>> 1) network systemd rule doesn't start network
>>
>> What does that mean? What's a network systemd rule?
>
> Oh, perhaps you call it a service file, not rule file?
>
> Anyway this is a different bug. Revert of the patch above does not help.

Ok, fine. I checked too, and systemd does not play any silly games
with link counts.

> The bug lies in the network layer. udev is unable to perform persistent
> eth naming:
> # ip link set eth0 name eth1    -- this one is OK
> # ip link set eth1 name eth0
> RTNETLINK answers: No such file or directory

Please make sure nothing tries to swap netif names in userspace. We
have given up that approach, because it is far too fragile to
temporarily rename devices to be able to swap the names, and race
against the loading of new kernel network drivers at the same time.

This might be a new kernel problem here, but in general that approach
is just broken; we have given up fiddling around here. Udev does
not do that anymore, and the code that currently *can* be used to
do this will be removed from udev in the future.

Network devices can only be renamed to a namespace that isn't ethX,
and which does not race against kernel names.

Does it work if you rename the devices to something other than ethX?

Kay


* Re: network regression: cannot rename netdev twice
  2012-01-31 10:52     ` Kay Sievers
@ 2012-01-31 11:00       ` Jiri Slaby
  2012-01-31 11:13         ` Kay Sievers
  2012-02-04  2:14       ` network regression: cannot rename netdev twice Henrique de Moraes Holschuh
  1 sibling, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2012-01-31 11:00 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Eric W. Biederman, Greg KH, LKML, ML netdev

On 01/31/2012 11:52 AM, Kay Sievers wrote:
> On Tue, Jan 31, 2012 at 11:41, Jiri Slaby <jslaby@suse.cz> wrote:
>> On 01/30/2012 11:52 PM, Kay Sievers wrote:
>>> 2012/1/30 Jiri Slaby <jslaby@suse.cz>:
>>>> I cannot boot properly with this commit:
>>>> commit 524b6c5b39b931311dfe5a2f5abae2f5c9731676
>>>> Author: Eric W. Biederman <ebiederm@xmission.com>
>>>> Date:   Sun Dec 18 20:09:31 2011 -0800
>>>>
>>>>    sysfs: Kill nlink counting.
>>>>
>>>> 1) network systemd rule doesn't start network
>>>
>>> What does that mean? What's a network systemd rule?
>>
>> Oh, perhaps you call it a service file, not rule file?
>>
>> Anyway this is a different bug. Revert of the patch above does not help.
> 
> Ok, fine. I checked too, and systemd does not play any silly games
> with link counts.
> 
>> The bug lies in the network layer. udev is unable to perform persistent
>> eth naming:
>> # ip link set eth0 name eth1    -- this one is OK
>> # ip link set eth1 name eth0
>> RTNETLINK answers: No such file or directory
> 
> Please make sure nothing tries to swap netif names in userspace. We
> have given up that approach, because it is far too fragile to
> temporarily rename devices to be able to swap the names, and race
> against the loading of new kernel network drivers at the same time.
> 
> This might be a new kernel problem here, but in general that approach
> is just broken; we have given up fiddling around here. Udev does
> not do that anymore, and the code that currently *can* be used to
> do this will be removed from udev in the future.
> 
> Network devices can only be renamed to a namespace that isn't ethX,
> and which does not race against kernel names.

I have two eth interfaces. The one on the motherboard is named eth0, an
added PCI card is eth1. But kernel enumerates them in the opposite order.

So udev does this sequence:
eth1 -> rename3
eth0 -> eth1
rename3 -> eth0

How can it do it differently? (This is openSUSE Factory.)

> Does it work if you rename the devices to something other than ethX?

Negative:
# ip link set eth0 name krtek
# ip link set krtek name jezek
RTNETLINK answers: No such file or directory

thanks,
-- 
js
suse labs


* Re: network regression: cannot rename netdev twice
  2012-01-31 11:00       ` Jiri Slaby
@ 2012-01-31 11:13         ` Kay Sievers
  2012-01-31 11:17           ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-01-31 11:13 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Eric W. Biederman, Greg KH, LKML, ML netdev

On Tue, Jan 31, 2012 at 12:00, Jiri Slaby <jslaby@suse.cz> wrote:
> On 01/31/2012 11:52 AM, Kay Sievers wrote:
>> On Tue, Jan 31, 2012 at 11:41, Jiri Slaby <jslaby@suse.cz> wrote:
>>> On 01/30/2012 11:52 PM, Kay Sievers wrote:
>>>> 2012/1/30 Jiri Slaby <jslaby@suse.cz>:
>>>>> I cannot boot properly with this commit:
>>>>> commit 524b6c5b39b931311dfe5a2f5abae2f5c9731676
>>>>> Author: Eric W. Biederman <ebiederm@xmission.com>
>>>>> Date:   Sun Dec 18 20:09:31 2011 -0800
>>>>>
>>>>>    sysfs: Kill nlink counting.
>>>>>
>>>>> 1) network systemd rule doesn't start network
>>>>
>>>> What does that mean? What's a network systemd rule?
>>>
>>> Oh, perhaps you call it a service file, not rule file?
>>>
>>> Anyway this is a different bug. Revert of the patch above does not help.
>>
>> Ok, fine. I checked too, and systemd does not play any silly games
>> with link counts.
>>
>>> The bug lies in the network layer. udev is unable to perform persistent
>>> eth naming:
>>> # ip link set eth0 name eth1    -- this one is OK
>>> # ip link set eth1 name eth0
>>> RTNETLINK answers: No such file or directory
>>
>> Please make sure nothing tries to swap netif names in userspace. We
>> have given up that approach, because it is far too fragile to
>> temporarily rename devices to be able to swap the names, and race
>> against the loading of new kernel network drivers at the same time.
>>
>> This might be a new kernel problem here, but in general that approach
>> is just broken; we have given up fiddling around here. Udev does
>> not do that anymore, and the code that currently *can* be used to
>> do this will be removed from udev in the future.
>>
>> Network devices can only be renamed to a namespace that isn't ethX,
>> and which does not race against kernel names.
>
> I have two eth interfaces. The one on the motherboard is named eth0, an
> added PCI card is eth1. But kernel enumerates them in the opposite order.
>
> So udev does this sequence:
> eth1 -> rename3
> eth0 -> eth1
> rename3 -> eth0
>
> How can it do it differently? (This is openSUSE Factory.)

A future udev will not help you do that. We have given up
supporting this approach. Renaming is done during booting, at the same
time we load new kernel drivers, and it all breaks in non-interesting
ways, apart from all the other unsolvable problems with this model.

Pretending we are able to rename netif names in the same namespace in
which the kernel is allocating new names is just plain wrong. There are
races you can't control. The entire approach creates far more problems
than it solves. We just have to admit it was wrong to do that.
Custom/to-rename netif names just cannot be ethX.

>> Does it work if you rename the devices to something other than ethX?
>
> Negative:
> # ip link set eth0 name krtek
> # ip link set krtek name jezek
> RTNETLINK answers: No such file or directory

This is a command sequence you type manually?

You are sure that userspace is not working in the background,
triggered by uevents, and gets in your way here?

Kay


* Re: network regression: cannot rename netdev twice
  2012-01-31 11:13         ` Kay Sievers
@ 2012-01-31 11:17           ` Jiri Slaby
  2012-01-31 11:58             ` Kay Sievers
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2012-01-31 11:17 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Eric W. Biederman, Greg KH, LKML, ML netdev

On 01/31/2012 12:13 PM, Kay Sievers wrote:
>>> Does it work if you rename the devices to something other than ethX?
>>
>> Negative:
>> # ip link set eth0 name krtek
>> # ip link set krtek name jezek
>> RTNETLINK answers: No such file or directory
> 
> This is a command sequence you type manually?

Yea, and it is working with 3.3.0-rc1-next-20120124_64+. Not with
3.3.0-rc1-next-20120131_64+.

> You are sure that userspace is not working in the background,
> triggered by uevents, and gets in your way here?

Note that krtek exists after the first command. But cannot be renamed
further.

thanks,
-- 
js
suse labs


* Re: network regression: cannot rename netdev twice
  2012-01-31 11:17           ` Jiri Slaby
@ 2012-01-31 11:58             ` Kay Sievers
  2012-01-31 14:18               ` Eric W. Biederman
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-01-31 11:58 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Eric W. Biederman, Greg KH, LKML, ML netdev

On Tue, Jan 31, 2012 at 12:17, Jiri Slaby <jslaby@suse.cz> wrote:
> On 01/31/2012 12:13 PM, Kay Sievers wrote:

>> This is a command sequence you type manually?
>
> Yea, and it is working with 3.3.0-rc1-next-20120124_64+. Not with
> 3.3.0-rc1-next-20120131_64+.
>
>> You are sure that userspace is not working in the background,
>> triggered by uevents, and gets in your way here?
>
> Note that krtek exists after the first command. But cannot be renamed
> further.

Yeah, I can confirm the problem here. It works fine with earlier
kernels and fails with the latest -next:

# uname -r
3.3.0-rc1-next-20120131+
# modprobe dummy
# ip link set dummy0 name foo0
# ip link set foo0 name bar0
RTNETLINK answers: No such file or directory

Kay


* Re: network regression: cannot rename netdev twice
  2012-01-31 11:58             ` Kay Sievers
@ 2012-01-31 14:18               ` Eric W. Biederman
  2012-01-31 14:40                 ` [PATCH] sysfs: Update the name hash when renaming sysfs entries Eric W. Biederman
  0 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2012-01-31 14:18 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Jiri Slaby, Greg KH, LKML, ML netdev

Kay Sievers <kay.sievers@vrfy.org> writes:

> On Tue, Jan 31, 2012 at 12:17, Jiri Slaby <jslaby@suse.cz> wrote:
>> On 01/31/2012 12:13 PM, Kay Sievers wrote:
>
>>> This is a command sequence you type manually?
>>
>> Yea, and it is working with 3.3.0-rc1-next-20120124_64+. Not with
>> 3.3.0-rc1-next-20120131_64+.
>>
>>> You are sure that userspace is not working in the background,
>>> triggered by uevents, and gets in your way here?
>>
>> Note that krtek exists after the first command. But cannot be renamed
>> further.
>
> Yeah, I can confirm the problem here. It works fine with earlier
> kernels and fails with the latest -next:
>
> # uname -r
> 3.3.0-rc1-next-20120131+
> # modprobe dummy
> # ip link set dummy0 name foo0
> # ip link set foo0 name bar0
> RTNETLINK answers: No such file or directory

There is something weird going on when sysfs directories and symlinks
are renamed.  My guess is that I fat fingered something with one of my
last sysfs patches.  I will look more deeply once I have slept some
more.

The second network device rename fails because the first rename did not
work properly.  ls -l /sys/class/net/ /sys/devices/virtual/net/ will let
you see what I mean.

Eric
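
The ls -l check suggested above can also be scripted. What follows is a
minimal sketch in C, not code from this thread: the function name and the
sysfs_net_dir parameter are hypothetical, and it only assumes that an
interface normally appears as a directory (or a symlink to one) under
/sys/class/net/.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Return 1 if <sysfs_net_dir>/<ifname> resolves to a directory, else 0.
 * sysfs_net_dir would normally be "/sys/class/net"; it is a parameter
 * here only so the check can be pointed at another tree.
 */
static int netdev_sysfs_visible(const char *sysfs_net_dir, const char *ifname)
{
	char path[512];
	struct stat st;
	int len;

	len = snprintf(path, sizeof(path), "%s/%s", sysfs_net_dir, ifname);
	if (len < 0 || (size_t)len >= sizeof(path))
		return 0;
	/* stat() follows the class symlink to the real device directory. */
	if (stat(path, &st) != 0)
		return 0;
	return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

Calling netdev_sysfs_visible("/sys/class/net", "foo0") right after the first
rename would be one way to see whether the renamed entry is still reachable.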


* [PATCH] sysfs:  Update the name hash when renaming sysfs entries
  2012-01-31 14:18               ` Eric W. Biederman
@ 2012-01-31 14:40                 ` Eric W. Biederman
  2012-01-31 14:41                   ` Jiri Slaby
  2012-01-31 14:55                   ` Greg KH
  0 siblings, 2 replies; 19+ messages in thread
From: Eric W. Biederman @ 2012-01-31 14:40 UTC (permalink / raw)
  To: Greg KH, Greg Kroah-Hartman; +Cc: Jiri Slaby, LKML, ML netdev, Kay Sievers


This fixes a bug introduced with sysfs name hashes where renaming a
network device appears to succeed but silently makes the sysfs files for
that network device inaccessible.

In at least one configuration this bug has stopped networking from
coming up during boot.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 fs/sysfs/dir.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index ea64d01..dd3779c 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -872,6 +872,7 @@ int sysfs_rename(struct sysfs_dirent *sd,
 
 		dup_name = sd->s_name;
 		sd->s_name = new_name;
+		sd->s_hash = sysfs_name_hash(sd->s_ns, sd->s_name);
 	}
 
 	/* Move to the appropriate place in the appropriate directories rbtree. */
-- 
1.7.2.5
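
For readers who want to see why a stale hash makes a renamed entry
unreachable, here is a minimal user-space sketch. It is not the kernel
code: the toy_* names, the array-backed directory, and the djb2-style hash
are all assumptions made for illustration, whereas sysfs keeps siblings in
an rbtree and uses its own hash function. The only thing it models is a
lookup that compares the cached hash before the name.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical stand-in for a sysfs dirent. */
struct toy_dirent {
	const char *name;
	unsigned int hash;	/* cached hash of name, like sd->s_hash */
};

/* Hypothetical stand-in for the parent directory. */
struct toy_dir {
	struct toy_dirent *child[TOY_MAX];
	size_t count;
};

/* Simple djb2-style string hash; the kernel uses a different function. */
static unsigned int toy_name_hash(const char *name)
{
	unsigned int h = 5381;

	while (*name)
		h = h * 33 + (unsigned char)*name++;
	return h;
}

/* Add sd to dir under name, caching the hash of that name. */
static int toy_add(struct toy_dir *dir, struct toy_dirent *sd, const char *name)
{
	if (dir->count == TOY_MAX)
		return -1;
	sd->name = name;
	sd->hash = toy_name_hash(name);
	dir->child[dir->count++] = sd;
	return 0;
}

/* Lookup checks the cached hash first, so a stale hash hides the entry. */
static struct toy_dirent *toy_find(struct toy_dir *dir, const char *name)
{
	unsigned int hash = toy_name_hash(name);
	size_t i;

	for (i = 0; i < dir->count; i++)
		if (dir->child[i]->hash == hash &&
		    strcmp(dir->child[i]->name, name) == 0)
			return dir->child[i];
	return NULL;
}

/* Rename sd; update_hash == 0 models the bug, 1 models the fix. */
static void toy_rename(struct toy_dirent *sd, const char *new_name,
		       int update_hash)
{
	sd->name = new_name;
	if (update_hash)
		sd->hash = toy_name_hash(new_name);
}
```

Adding an entry as dummy0 and then calling toy_rename(&sd, "foo0", 0)
leaves toy_find() returning NULL for both dummy0 and foo0, which resembles
the second ip link set failing. With update_hash set to 1 the entry is
found under foo0 again.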


* Re: [PATCH] sysfs:  Update the name hash when renaming sysfs entries
  2012-01-31 14:40                 ` [PATCH] sysfs: Update the name hash when renaming sysfs entries Eric W. Biederman
@ 2012-01-31 14:41                   ` Jiri Slaby
  2012-01-31 14:55                   ` Greg KH
  1 sibling, 0 replies; 19+ messages in thread
From: Jiri Slaby @ 2012-01-31 14:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Greg Kroah-Hartman, LKML, ML netdev, Kay Sievers

On 01/31/2012 03:40 PM, Eric W. Biederman wrote:
> 
> This fixes a bug introduced with sysfs name hashes where renaming a
> network device appears to succeed but silently makes the sysfs files for
> that network device inaccessible.
> 
> In at least one configuration this bug has stopped networking from
> coming up during boot.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

It works for me. Thanks.

Tested-by: Jiri Slaby <jslaby@suse.cz>

> ---
>  fs/sysfs/dir.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index ea64d01..dd3779c 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -872,6 +872,7 @@ int sysfs_rename(struct sysfs_dirent *sd,
>  
>  		dup_name = sd->s_name;
>  		sd->s_name = new_name;
> +		sd->s_hash = sysfs_name_hash(sd->s_ns, sd->s_name);
>  	}
>  
>  	/* Move to the appropriate place in the appropriate directories rbtree. */


-- 
js
suse labs


* Re: [PATCH] sysfs:  Update the name hash when renaming sysfs entries
  2012-01-31 14:40                 ` [PATCH] sysfs: Update the name hash when renaming sysfs entries Eric W. Biederman
  2012-01-31 14:41                   ` Jiri Slaby
@ 2012-01-31 14:55                   ` Greg KH
  1 sibling, 0 replies; 19+ messages in thread
From: Greg KH @ 2012-01-31 14:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, ML netdev, Kay Sievers

On Tue, Jan 31, 2012 at 06:40:26AM -0800, Eric W. Biederman wrote:
> 
> This fixes a bug introduced with sysfs name hashes where renaming a
> network device appears to succeed but silently makes the sysfs files for
> that network device inaccessible.
> 
> In at least one configuration this bug has stopped networking from
> coming up during boot.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  fs/sysfs/dir.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)

Thanks for this, I'll queue it up later today.

greg k-h


* Re: network regression: cannot rename netdev twice
  2012-01-31 10:52     ` Kay Sievers
  2012-01-31 11:00       ` Jiri Slaby
@ 2012-02-04  2:14       ` Henrique de Moraes Holschuh
  2012-02-06 20:03         ` Kay Sievers
  1 sibling, 1 reply; 19+ messages in thread
From: Henrique de Moraes Holschuh @ 2012-02-04  2:14 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Jiri Slaby, Eric W. Biederman, Greg KH, LKML, ML netdev

On Tue, 31 Jan 2012, Kay Sievers wrote:
> Please make sure nothing tries to swap netif names in userspace. We
> have given up that approach, because it is far too fragile to
> temporarily rename devices to be able to swap the names, and race
> against the loading of new kernel network drivers at the same time.

That's a damn fair reason, but the loss of that functionality could cause
trouble.  In fact, at first glance, to me it looks like this has a large
potential for unleashing untold pain and suffering in the sysadmin ranks
unless early userspace can emulate it somehow.

Is it possible to configure the kernel to use something other than "eth#" as
its initial namespace for netif names?  Or is there some other way to get
eth1 to be what you need eth1 to be during userland boot?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: network regression: cannot rename netdev twice
  2012-02-04  2:14       ` network regression: cannot rename netdev twice Henrique de Moraes Holschuh
@ 2012-02-06 20:03         ` Kay Sievers
  2012-02-08  2:00           ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-02-06 20:03 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Jiri Slaby, Eric W. Biederman, Greg KH, LKML, ML netdev

On Sat, Feb 4, 2012 at 03:14, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Tue, 31 Jan 2012, Kay Sievers wrote:
>> Please make sure nothing tries to swap netif names in userspace. We
>> have given up that approach, because it is far too fragile to
>> temporarily rename devices to be able to swap the names, and race
>> against the loading of new kernel network drivers at the same time.
>
> That's a damn fair reason, but the loss of that functionality could cause
> trouble.  In fact, at first glance, to me it looks like this has a large
> potential for unleashing untold pain and suffering in the sysadmin ranks
> unless early userspace can emulate it somehow.
>
> Is it possible to configure the kernel to use something other than "eth#" as
> its initial namespace for netif names?  Or is there some other way to get
> eth1 to be what you need eth1 to be during userland boot?

I don't think there is a sane way to do that. Someone could add a
kernel command line parameter to switch ethX in the kernel to
something else, and create custom udev rules which match on device
properties and apply configured names which are ethX again. But for
all that, there will be no generally available support in common base
system tools, and we absolutely do not recommend anybody doing that.

Udev will not provide any help for that any more, not for automatic
device name reservation from a hotplug path, not for device name swaps
in the kernel namespace. It will only be allowed to rename devices to
a namespace that does not clash with the kernel's one.

People should use biosdevname's pci-slot names, or the on-board label
names like Dell does for configuration-less stable names, or use
manually configured names like 'internal', 'external', 'dmz', 'vpn' and
so on.

I think we should stop pretending we can solve problems resulting
from simple enumeration that depends on device-discovery order. These
numbers can never be stable and can never work reliably in the reality
we are working with.

It's time to leave these false promises behind us and move on, and that
means no stable ethX names anymore.

Kay


* Re: network regression: cannot rename netdev twice
  2012-02-06 20:03         ` Kay Sievers
@ 2012-02-08  2:00           ` Henrique de Moraes Holschuh
  2012-02-08  3:50             ` Kay Sievers
  0 siblings, 1 reply; 19+ messages in thread
From: Henrique de Moraes Holschuh @ 2012-02-08  2:00 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Jiri Slaby, Eric W. Biederman, Greg KH, LKML, ML netdev

On Mon, 06 Feb 2012, Kay Sievers wrote:
> On Sat, Feb 4, 2012 at 03:14, Henrique de Moraes Holschuh
> <hmh@hmh.eng.br> wrote:
> > Is it possible to configure the kernel to use something other than "eth#" as
> > its initial namespace for netif names?  Or is there some other way to get
> > eth1 to be what you need eth1 to be during userland boot?
> 
> I don't think there is a sane way to do that. Someone could add a
> kernel command line parameter to switch ethX in the kernel to
> something else, and create custom udev rules which match on device
> properties and apply configured names which are ethX again. But for
> all that, there will be no generally available support in common base
> system tools, and we absolutely do not recommend anybody doing that.

What sort of impact analysis on userspace was done about this change?

Nobody in his right mind would go back to the dark ages of uncontrolled
ifnames.  You're effectively forcing everybody with a clue away from the
eth# namespace.

Just to be very clear: the impact of this is the need to change the
interface names on potentially millions of lines of firewall rules and
scripts out there, as well as tracking down stuff (mostly scripts) that
special-cases the eth prefix.

Is there a really good reason why we cannot have a way to move the
kernel away from the eth# namespace at boot (through a kernel parameter,
maybe with the default namespace set at compile time), AND keep the
"common base system tools" support to assign ifname based on MAC
addresses that we have right now?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: network regression: cannot rename netdev twice
  2012-02-08  2:00           ` Henrique de Moraes Holschuh
@ 2012-02-08  3:50             ` Kay Sievers
  2012-02-08  6:42               ` Valdis.Kletnieks
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-02-08  3:50 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Jiri Slaby, Eric W. Biederman, Greg KH, LKML, ML netdev

On Wed, Feb 8, 2012 at 03:00, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Mon, 06 Feb 2012, Kay Sievers wrote:
>> On Sat, Feb 4, 2012 at 03:14, Henrique de Moraes Holschuh
>> <hmh@hmh.eng.br> wrote:
>> > Is it possible to configure the kernel to use something other than "eth#" as
>> > its initial namespace for netif names?  Or is there some other way to get
>> > eth1 to be what you need eth1 to be during userland boot?
>>
>> I don't think there is a sane way to do that. Someone could add a
>> kernel command line parameter to switch ethX in the kernel to
>> something else, and create custom udev rules which match on device
>> properties and apply configured names which are ethX again. But for
>> all that, there will be no generally available support in common base
>> system tools, and we absolutely do not recommend anybody doing that.
>
> What sort of impact analysis on userspace was done about this change?

None. It will just not be supported for new setups. Existing ones will
do what they always did.

> Nobody in his right mind would go back to the dark ages of uncontrolled
> ifnames.  You're effectively forcing everybody with a clue away from the
> eth# namespace.

Yes. It's a game we have lost and we will not win in the future. I
gave up, and I warn everybody who thinks it's simple to manage.

> Just to be very clear: the impact of this is the need to change the
> interface names on potentially millions of lines of firewall rules and
> scripts out there, as well as tracking down stuff (mostly scripts) that
> special-cases the eth prefix.

Yeah, and for good, ethX is a pretty much random kernel name, and I
personally will no longer work on conceptually broken infrastructure
that can never deliver what it seems to promise. In the longer run,
tools need to be fixed to automatically handle changing names, or not
care about the names at all, or names need to be explicitly set up
outside the ethX namespace to be predictable.

After years of working in that area I will stop working on these hacks
that promise stable ethX names. It was just wrong, like enumerations
always are in hotplug setups.

> Is there a really good reason why we cannot have a way to move the
> kernel away from the eth# namespace at boot (through a kernel parameter,
> maybe with the default namespace set at compile time),

Could work, but I don't think it is worth it. Simple enumeration, and
automatic persistent on-disk device name reservation in a flat
number-range is just a very flawed concept. I'm not interested in
working on that, but that surely should not stop anybody from trying
and providing tools that can do that.

> AND keep the
> "common base system tools" support to assign ifname based on MAC
> addresses that we have right now?

Not provided by udev's default setup, which did persistent name
reservation in the device hotplug path. It is already disabled and
will be entirely removed from the source tree some day. Other tools
can still try to provide that. But I declare that model as officially
failed and udev will not even try anything like that anymore.

People who need predictable interface names should just manually
configure custom/descriptive names, or names which are reliably
derived from the hardware, like firmware-provided names or the pci
slot number.

Kay


* Re: network regression: cannot rename netdev twice
  2012-02-08  3:50             ` Kay Sievers
@ 2012-02-08  6:42               ` Valdis.Kletnieks
  2012-02-08 10:57                 ` Kay Sievers
  0 siblings, 1 reply; 19+ messages in thread
From: Valdis.Kletnieks @ 2012-02-08  6:42 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Henrique de Moraes Holschuh, Jiri Slaby, Eric W. Biederman,
	Greg KH, LKML, ML netdev


On Wed, 08 Feb 2012 04:50:15 +0100, Kay Sievers said:

> After years of working in that area I will stop working on these hacks
> that promise stable ethX names. It was just wrong, like enumerations
> always are in hotplug setups.

So (real world case) I've got a server that's got a 1G ethernet connected to
the public net, a 1G ethernet that's a cluster management network, and
a 10G ethernet that connects to our HPC clusters.

And I want to add iptables rules that distinguish based on interface. Currently
I can nail the management net to eth0, the public net to eth1, and the 10G to
eth2, and then just add "-i eth1" or whatever in the iptables ruleset.

I really don't care if the 0/1/2 move around - but if we're not having nailed-down
interface names, what will take the place of '-i ethN' in iptables?

> People who need predictable interface names should just manually
> configure custom/descriptive names, or names which are reliably
> derived from the hardware, like firmware-provided names or the pci
> slot number.

Or is this sort of thing in /etc/udev/rules.d/70-persistent-net.rules
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:0b:f2:80", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
what you are trying to move to, and my systems are already onboard and
I should just move along, nothing to see here? ;)



* Re: network regression: cannot rename netdev twice
  2012-02-08  6:42               ` Valdis.Kletnieks
@ 2012-02-08 10:57                 ` Kay Sievers
  2012-02-08 20:06                   ` Valdis.Kletnieks
  0 siblings, 1 reply; 19+ messages in thread
From: Kay Sievers @ 2012-02-08 10:57 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Henrique de Moraes Holschuh, Jiri Slaby, Eric W. Biederman,
	Greg KH, LKML, ML netdev

On Wed, Feb 8, 2012 at 07:42,  <Valdis.Kletnieks@vt.edu> wrote:
> On Wed, 08 Feb 2012 04:50:15 +0100, Kay Sievers said:
>
>> After years of working in that area I will stop working on these hacks
>> that promise stable ethX names. It was just wrong, like enumerations
>> always are in hotplug setups.
>
> So (real world case) I've got a server that's got a 1G ethernet connected to
> the public net, a 1G ethernet that's a cluster management network, and
> a 10G ethernet that connects to our HPC clusters.
>
> And I want to add iptables rules that distinguish based on interface. Currently
> I can nail the management net to eth0, the public net to eth1, and the 10G to
> eth2, and then just add "-i eth1" or whatever in the iptables ruleset.
>
> I really don't care if the 0/1/2 move around - but if we're not having nailed-down
> interface names, what will take the place of '-i ethN' in iptables?
>
>> People who need predictable interface names should just manually
>> configure custom/descriptive names, or names which are reliably
>> derived from the hardware, like firmware-provided names or the pci
>> slot number.
>
> Or is this sort of thing in /etc/udev/rules.d/70-persistent-net.rules
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:0b:f2:80", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
> what you are trying to move to, and my systems are already onboard and
> I should just move along, nothing to see here? ;)

Yeah, that's what we did in the past. It works fine if you never have
to swap names like eth0 and eth1, with need to free one of the the
names with a temporary rename.

If another device is added by a different kernel module, or just a USB
network device is already plugged-in at bootup time, the parallel
loading of drivers might cause the kernel to create a new eth0 or eth1
just in the moment we have the temporary rename active and we want to
swap the names.

That model is just entirely flawed and will never work reliably
without creating an even bigger mess than we already have, by requiring
complex retry loops across multiple devices, or global locks that
include the kernel's device name allocation logic.

Let's just move on and stop pretending we want to, or can, solve these
problems. Simple device enumerations in hotplug setups cannot, by their
very definition, work in a predictable way. We should never have
tried to mess around here, and should have just moved on to something
that has at least the potential to work.

Kay


* Re: network regression: cannot rename netdev twice
  2012-02-08 10:57                 ` Kay Sievers
@ 2012-02-08 20:06                   ` Valdis.Kletnieks
  2012-02-08 20:27                     ` Stephen Hemminger
  2012-02-08 23:48                     ` Kay Sievers
  0 siblings, 2 replies; 19+ messages in thread
From: Valdis.Kletnieks @ 2012-02-08 20:06 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Henrique de Moraes Holschuh, Jiri Slaby, Eric W. Biederman,
	Greg KH, LKML, ML netdev


On Wed, 08 Feb 2012 11:57:18 +0100, Kay Sievers said:
> On Wed, Feb 8, 2012 at 07:42,  <Valdis.Kletnieks@vt.edu> wrote:

> > Or is this sort of thing in /etc/udev/rules.d/70-persistent-net.rules
> > SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:0b:f2:80", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
> > what you are trying to move to, and my systems are already onboard and
> > I should just move along, nothing to see here? ;)
>
> Yeah, that's what we did in the past. It works fine as long as you
> never have to swap names like eth0 and eth1, which requires freeing one
> of the names with a temporary rename.

Well, if I had my druthers, I'd stick name="net-mgt", "net-pub", and "net-10g"
in the udev rules, and not care about 1/2/3 and race conditions, because
meaningful names are easier to not screw up (just last week found a system that
had eth1 and eth2 reversed in some iptables rules, wouldn't have happened if
they were -mgt and -pub).

Only thing stopping me is getting iptables to accept '-i net-10g', and the
distro /etc/sysconfig/network scripts like ifup and ifdown playing nice....

So it sounds like what I want as a sysadmin is the same thing you want
as a maintainer...
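
A rough sketch of what the descriptive-name approach could look like,
written as a small Python helper that emits rules in the style of the
70-persistent-net.rules line quoted above. The function names are made
up, and the validity check only approximates the kernel's basic rules
as I understand them (non-empty, shorter than IFNAMSIZ, not "." or "..",
no "/" or whitespace). The first MAC address is the one from the quoted
rule; the other two are placeholders.

```python
IFNAMSIZ = 16  # kernel buffer size for an interface name, including the NUL


def is_valid_ifname(name):
    """Approximate the kernel's basic checks on an interface name."""
    if not name or len(name) >= IFNAMSIZ:
        return False
    if name in (".", ".."):
        return False
    # Names show up as sysfs path components, so "/" and whitespace are out.
    return not any(ch == "/" or ch.isspace() for ch in name)


def udev_rule(mac, name):
    """Format one rule in the style of 70-persistent-net.rules."""
    if not is_valid_ifname(name):
        raise ValueError(f"not usable as an interface name: {name!r}")
    return (
        'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
        f'ATTR{{address}}=="{mac.lower()}", ATTR{{type}}=="1", '
        f'NAME="{name}"'
    )


# Hypothetical mapping; only the first address appears in this thread.
interfaces = {
    "00:25:90:0b:f2:80": "net-mgt",
    "00:00:5e:00:53:01": "net-pub",
    "00:00:5e:00:53:02": "net-10g",
}

for mac, name in interfaces.items():
    print(udev_rule(mac, name))
```

Because none of these names live in the kernel's ethN namespace, a
newly probed device cannot collide with them, so no temporary rename
is needed.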





* Re: network regression: cannot rename netdev twice
  2012-02-08 20:06                   ` Valdis.Kletnieks
@ 2012-02-08 20:27                     ` Stephen Hemminger
  2012-02-08 23:48                     ` Kay Sievers
  1 sibling, 0 replies; 19+ messages in thread
From: Stephen Hemminger @ 2012-02-08 20:27 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Kay Sievers, Henrique de Moraes Holschuh, Jiri Slaby,
	Eric W. Biederman, Greg KH, LKML, ML netdev


Our customers would prefer network device names of the form
"Ethernet0/0" or "ge-0/0/0" (but I said no...)




* Re: network regression: cannot rename netdev twice
  2012-02-08 20:06                   ` Valdis.Kletnieks
  2012-02-08 20:27                     ` Stephen Hemminger
@ 2012-02-08 23:48                     ` Kay Sievers
  1 sibling, 0 replies; 19+ messages in thread
From: Kay Sievers @ 2012-02-08 23:48 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Henrique de Moraes Holschuh, Jiri Slaby, Eric W. Biederman,
	Greg KH, LKML, ML netdev

On Wed, Feb 8, 2012 at 21:06,  <Valdis.Kletnieks@vt.edu> wrote:
> On Wed, 08 Feb 2012 11:57:18 +0100, Kay Sievers said:
>> On Wed, Feb 8, 2012 at 07:42,  <Valdis.Kletnieks@vt.edu> wrote:
>
>> > Or is this sort of thing in /etc/udev/rules.d/70-persistent-net.rules
>> > SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:90:0b:f2:80", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
>> > what you are trying to move to, and my systems are already onboard and
>> > I should just move along, nothing to see here? ;)
>>
>> Yeah, that's what we did in the past. It works fine as long as you
>> never have to swap names like eth0 and eth1, which requires freeing one
>> of the names with a temporary rename.
>
> Well, if I had my druthers, I'd stick name="net-mgt", "net-pub", and "net-10g"
> in the udev rules, and not care about 1/2/3 and race conditions, because
> meaningful names are easier to not screw up (just last week found a system that
> had eth1 and eth2 reversed in some iptables rules, wouldn't have happened if
> they were -mgt and -pub).
>
> Only thing stopping me is getting iptables to accept '-i net-10g', and the
> distro /etc/sysconfig/network scripts like ifup and ifdown playing nice....
>
> So it sounds like what I want as a sysadmin is the same thing you want
> as a maintainer...

Yeah, that sounds very much like it is.

I want to push some responsibility to the admin, do less automagic,
and personally want to be less responsible for all the unintended
screw-ups the automagic is causing everywhere.

Sure, the intention to keep the names like they always have been was
good, but pairing a good intention with a broken model to deliver it,
and continuing to pretend we can solve it, is the worst thing we can
do. :)

Kay


Thread overview: 19+ messages
     [not found] <4F27120A.4040106@suse.cz>
     [not found] ` <CAPXgP12Sr2KzGJ9RA13QBOCkctb-z3O4+1uHOjANgMDDv2pxaQ@mail.gmail.com>
2012-01-31 10:41   ` network regression: cannot rename netdev twice Jiri Slaby
2012-01-31 10:52     ` Kay Sievers
2012-01-31 11:00       ` Jiri Slaby
2012-01-31 11:13         ` Kay Sievers
2012-01-31 11:17           ` Jiri Slaby
2012-01-31 11:58             ` Kay Sievers
2012-01-31 14:18               ` Eric W. Biederman
2012-01-31 14:40                 ` [PATCH] sysfs: Update the name hash when renaming sysfs entries Eric W. Biederman
2012-01-31 14:41                   ` Jiri Slaby
2012-01-31 14:55                   ` Greg KH
2012-02-04  2:14       ` network regression: cannot rename netdev twice Henrique de Moraes Holschuh
2012-02-06 20:03         ` Kay Sievers
2012-02-08  2:00           ` Henrique de Moraes Holschuh
2012-02-08  3:50             ` Kay Sievers
2012-02-08  6:42               ` Valdis.Kletnieks
2012-02-08 10:57                 ` Kay Sievers
2012-02-08 20:06                   ` Valdis.Kletnieks
2012-02-08 20:27                     ` Stephen Hemminger
2012-02-08 23:48                     ` Kay Sievers
