linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] Tell linkwatch about new interfaces
@ 2009-04-01 15:40 Andrew Lutomirski
  2009-04-05  0:05 ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Lutomirski @ 2009-04-01 15:40 UTC (permalink / raw)
  To: netdev, LKML

From: Andrew Lutomirski <amluto@gmail.com>

When a network driver registers a new interface, linkwatch will not notice,
and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
If the new interface has no carrier when it is connected, then a status of
"unknown" is reported to userspace, which confuses various tools
(NetworkManager, for example).

This fires a linkwatch event for all new interfaces, so that operstate
gets set reasonably quickly.

Signed-off-by: Andrew Lutomirski <amluto@gmail.com>

---

This looks like the root cause of the bug I reported here:

http://lkml.org/lkml/2009/3/24/499

Without knowing all the locking and ordering constraints imposed on network
drivers, this seemed like the safest way to fix the bug.  Alternative
approaches would be to call rfc2863_policy directly or to initialize the
relevent fields to sane values (for the carrier off state) in alloc_netdev.

This applies to 2.6.29.  I can rediff it for any other tree, but I didn't see
any changes that would conflict.

diff --git a/net/core/dev.c b/net/core/dev.c
index e438f54..45911fd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4445,6 +4445,9 @@ int register_netdevice(struct net_device *dev)
 		dev->reg_state = NETREG_UNREGISTERED;
 	}

+	/* Update link state. */
+	linkwatch_fire_event(dev);
+
 out:
 	return ret;

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-04-01 15:40 [PATCH 1/1] Tell linkwatch about new interfaces Andrew Lutomirski
@ 2009-04-05  0:05 ` David Miller
  2009-04-05  4:00   ` Andrew Lutomirski
  2009-07-14 17:17   ` Sergio Luis
  0 siblings, 2 replies; 11+ messages in thread
From: David Miller @ 2009-04-05  0:05 UTC (permalink / raw)
  To: amluto; +Cc: netdev, linux-kernel

From: Andrew Lutomirski <amluto@gmail.com>
Date: Wed, 1 Apr 2009 11:40:06 -0400

> When a network driver registers a new interface, linkwatch will not notice,
> and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
> If the new interface has no carrier when it is connected, then a status of
> "unknown" is reported to userspace, which confuses various tools
> (NetworkManager, for example).
> 
> This fires a linkwatch event for all new interfaces, so that operstate
> gets set reasonably quickly.
> 
> Signed-off-by: Andrew Lutomirski <amluto@gmail.com>

The default assumed state for a freshly registered network
device is that the link is up.

If that disagrees from reality, the driver should make the
appropriate netif_carrier_off() call.

I'm sure you'll find that the e1000 driver is not doing this
and that is what causes the bug you are seeing.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-04-05  0:05 ` David Miller
@ 2009-04-05  4:00   ` Andrew Lutomirski
  2009-04-10  0:48     ` Brandeburg, Jesse
  2009-07-14 17:17   ` Sergio Luis
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Lutomirski @ 2009-04-05  4:00 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

On Sat, Apr 4, 2009 at 8:05 PM, David Miller <davem@davemloft.net> wrote:
> From: Andrew Lutomirski <amluto@gmail.com>
> Date: Wed, 1 Apr 2009 11:40:06 -0400
>
>> When a network driver registers a new interface, linkwatch will not notice,
>> and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
>> If the new interface has no carrier when it is connected, then a status of
>> "unknown" is reported to userspace, which confuses various tools
>> (NetworkManager, for example).
>>
>> This fires a linkwatch event for all new interfaces, so that operstate
>> gets set reasonably quickly.
>>
>> Signed-off-by: Andrew Lutomirski <amluto@gmail.com>
>
> The default assumed state for a freshly registered network
> device is that the link is up.
>
> If that disagrees from reality, the driver should make the
> appropriate netif_carrier_off() call.
>
> I'm sure you'll find that the e1000 driver is not doing this
> and that is what causes the bug you are seeing.
>

Nope.  The end of e1000_probe (in e1000e) is:

        /* tell the stack to leave us alone until e1000_open() is called */
        netif_carrier_off(netdev);
        netif_tx_stop_all_queues(netdev);

        strcpy(netdev->name, "eth%d");
        err = register_netdev(netdev);
        if (err)
                goto err_register;

        e1000_print_device_info(adapter);

netif_carrier_off does:

void netif_carrier_off(struct net_device *dev)
{
        if (!test_and_set_bit(__LINK_STATE_NOCARRIER, &dev->state)) {
                if (dev->reg_state == NETREG_UNINITIALIZED)
                        return;
                linkwatch_fire_event(dev);
        }
}

So, either it should be illegal to call netif_carrier_off on an
unregistered net_device (and there should be a WARN() in
netif_carrier_off), or it should work correctly, and
register_netdevice should do the right thing when there is no carrier
(i.e. something like my patch).

--Andy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-04-05  4:00   ` Andrew Lutomirski
@ 2009-04-10  0:48     ` Brandeburg, Jesse
  2009-04-11 15:46       ` Andrew Lutomirski
  2009-04-13 23:22       ` David Miller
  0 siblings, 2 replies; 11+ messages in thread
From: Brandeburg, Jesse @ 2009-04-10  0:48 UTC (permalink / raw)
  To: David Miller, Andrew Lutomirski; +Cc: netdev, linux-kernel

On Sat, 4 Apr 2009, Andrew Lutomirski wrote:
> On Sat, Apr 4, 2009 at 8:05 PM, David Miller <davem@davemloft.net> wrote:
> > From: Andrew Lutomirski <amluto@gmail.com>
> > Date: Wed, 1 Apr 2009 11:40:06 -0400
> >
> >> When a network driver registers a new interface, linkwatch will not notice,
> >> and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
> >> If the new interface has no carrier when it is connected, then a status of
> >> "unknown" is reported to userspace, which confuses various tools
> >> (NetworkManager, for example).
> >>
> >> This fires a linkwatch event for all new interfaces, so that operstate
> >> gets set reasonably quickly.
> >>
> >> Signed-off-by: Andrew Lutomirski <amluto@gmail.com>
> >
> > The default assumed state for a freshly registered network
> > device is that the link is up.
> >
> > If that disagrees from reality, the driver should make the
> > appropriate netif_carrier_off() call.
> >
> > I'm sure you'll find that the e1000 driver is not doing this
> > and that is what causes the bug you are seeing.
> >

Dave, if we move the netif_carrier_off call to our dev->open like tg3 has, 
do you think this is sufficient?

I note that we were calling netif_tx_stop_all_queues here, but I'm curious 
if davem thinks we need to lock out our tx routine until dev->open 
completes or whether the starting state of the netdev is sufficient.

> Nope.  The end of e1000_probe (in e1000e) is:
> 
>         /* tell the stack to leave us alone until e1000_open() is called */
>         netif_carrier_off(netdev);
>         netif_tx_stop_all_queues(netdev);
> 
>         strcpy(netdev->name, "eth%d");
>         err = register_netdev(netdev);
>         if (err)
>                 goto err_register;
> 
>         e1000_print_device_info(adapter);
> 
> netif_carrier_off does:
> 
> void netif_carrier_off(struct net_device *dev)
> {
>         if (!test_and_set_bit(__LINK_STATE_NOCARRIER, &dev->state)) {
>                 if (dev->reg_state == NETREG_UNINITIALIZED)
>                         return;
>                 linkwatch_fire_event(dev);
>         }
> }
> 
> So, either it should be illegal to call netif_carrier_off on an
> unregistered net_device (and there should be a WARN() in
> netif_carrier_off), or it should work correctly, and
> register_netdevice should do the right thing when there is no carrier
> (i.e. something like my patch).

does this patch also fix the issue?

===== begin =====

e1000e: indicate link down at load

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

same kind of patch as e1000, let driver explicitly push link state
once link comes up.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---

 drivers/net/e1000e/netdev.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index fb78278..6a0411e 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -3072,6 +3072,8 @@ static int e1000_open(struct net_device *netdev)
 	if (test_bit(__E1000_TESTING, &adapter->state))
 		return -EBUSY;
 
+	netif_carrier_off(netdev);
+
 	/* allocate transmit descriptors */
 	err = e1000e_setup_tx_resources(adapter);
 	if (err)
@@ -5006,10 +5008,6 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 	if (!(adapter->flags & FLAG_HAS_AMT))
 		e1000_get_hw_control(adapter);
 
-	/* tell the stack to leave us alone until e1000_open() is called */
-	netif_carrier_off(netdev);
-	netif_tx_stop_all_queues(netdev);
-
 	strcpy(netdev->name, "eth%d");
 	err = register_netdev(netdev);
 	if (err)

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-04-10  0:48     ` Brandeburg, Jesse
@ 2009-04-11 15:46       ` Andrew Lutomirski
  2009-04-13 23:22       ` David Miller
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Lutomirski @ 2009-04-11 15:46 UTC (permalink / raw)
  To: Brandeburg, Jesse; +Cc: David Miller, netdev, linux-kernel

On Thu, Apr 9, 2009 at 8:48 PM, Brandeburg, Jesse
<jesse.brandeburg@intel.com> wrote:
>
> does this patch also fix the issue?

Yes.

2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast
state DOWN qlen 1000

Tested-by: Andy Lutomirski <amluto@gmail.com>

The link operational states are still funny, though:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
3: wmaster0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
pfifo_fast state UNKNOWN qlen 1000

It looks like all interfaces that don't call netif_carrier_down after
registration end up in UNKNOWN until something happens.  The case of
carrier_on but UNKNOWN doesn't seem to confuse my userspace, but it's
still odd.

--Andy

>
> ===== begin =====
>
> e1000e: indicate link down at load
>
> From: Jesse Brandeburg <jesse.brandeburg@intel.com>
>
> same kind of patch as e1000, let driver explicitly push link state
> once link comes up.
>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> ---
>
>  drivers/net/e1000e/netdev.c |    6 ++----
>  1 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
> index fb78278..6a0411e 100644
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
> @@ -3072,6 +3072,8 @@ static int e1000_open(struct net_device *netdev)
>        if (test_bit(__E1000_TESTING, &adapter->state))
>                return -EBUSY;
>
> +       netif_carrier_off(netdev);
> +
>        /* allocate transmit descriptors */
>        err = e1000e_setup_tx_resources(adapter);
>        if (err)
> @@ -5006,10 +5008,6 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
>        if (!(adapter->flags & FLAG_HAS_AMT))
>                e1000_get_hw_control(adapter);
>
> -       /* tell the stack to leave us alone until e1000_open() is called */
> -       netif_carrier_off(netdev);
> -       netif_tx_stop_all_queues(netdev);
> -
>        strcpy(netdev->name, "eth%d");
>        err = register_netdev(netdev);
>        if (err)
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-04-10  0:48     ` Brandeburg, Jesse
  2009-04-11 15:46       ` Andrew Lutomirski
@ 2009-04-13 23:22       ` David Miller
  2009-04-14  7:43         ` Jeff Kirsher
  1 sibling, 1 reply; 11+ messages in thread
From: David Miller @ 2009-04-13 23:22 UTC (permalink / raw)
  To: jesse.brandeburg; +Cc: amluto, netdev, linux-kernel

From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
Date: Thu, 9 Apr 2009 17:48:04 -0700 (Pacific Daylight Time)

> Dave, if we move the netif_carrier_off call to our dev->open like tg3 has, 
> do you think this is sufficient?

Yes.

> e1000e: indicate link down at load
> 
> From: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> same kind of patch as e1000, let driver explicitly push link state
> once link comes up.
> 
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

I'll wait to get this via Jeff Kirsher.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-04-13 23:22       ` David Miller
@ 2009-04-14  7:43         ` Jeff Kirsher
  0 siblings, 0 replies; 11+ messages in thread
From: Jeff Kirsher @ 2009-04-14  7:43 UTC (permalink / raw)
  To: David Miller; +Cc: jesse.brandeburg, amluto, netdev, linux-kernel

On Mon, Apr 13, 2009 at 4:22 PM, David Miller <davem@davemloft.net> wrote:
> From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
> Date: Thu, 9 Apr 2009 17:48:04 -0700 (Pacific Daylight Time)
>
>> Dave, if we move the netif_carrier_off call to our dev->open like tg3 has,
>> do you think this is sufficient?
>
> Yes.
>
>> e1000e: indicate link down at load
>>
>> From: Jesse Brandeburg <jesse.brandeburg@intel.com>
>>
>> same kind of patch as e1000, let driver explicitly push link state
>> once link comes up.
>>
>> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
>
> I'll wait to get this via Jeff Kirsher.
>
> Thanks.
> --

I have added this to my queue and will be sent out with the other
e1000 patch I have,in the next day or two.

-- 
Cheers,
Jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-04-05  0:05 ` David Miller
  2009-04-05  4:00   ` Andrew Lutomirski
@ 2009-07-14 17:17   ` Sergio Luis
  2009-07-14 18:33     ` David Miller
  1 sibling, 1 reply; 11+ messages in thread
From: Sergio Luis @ 2009-07-14 17:17 UTC (permalink / raw)
  To: David Miller; +Cc: amluto, netdev, linux-kernel

Hello Dave,

On Sat, Apr 4, 2009 at 9:05 PM, David Miller<davem@davemloft.net> wrote:
> From: Andrew Lutomirski <amluto@gmail.com>
> Date: Wed, 1 Apr 2009 11:40:06 -0400
>
>> When a network driver registers a new interface, linkwatch will not notice,
>> and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
>> If the new interface has no carrier when it is connected, then a status of
>> "unknown" is reported to userspace, which confuses various tools
>> (NetworkManager, for example).
>>
>> This fires a linkwatch event for all new interfaces, so that operstate
>> gets set reasonably quickly.
>>
>> Signed-off-by: Andrew Lutomirski <amluto@gmail.com>
>
> The default assumed state for a freshly registered network
> device is that the link is up.
>
> If that disagrees from reality, the driver should make the
> appropriate netif_carrier_off() call.
>
> I'm sure you'll find that the e1000 driver is not doing this
> and that is what causes the bug you are seeing.
> --

is this patch incorrect, though? with the linkwatch_fire_event() call,
the rfc2863 operstate will be set for everyone at device register
time.
in here I am having the interface operstate as 'unknown', but I do
ifconfig down and up or unplug/plug the cable again it will finally
set the correct rfc2863 operstate.

or should this be fixed on a per-driver basis, like it apparently was
in this case, for his e1000? (drivers/net/skge.c in here).

thanks,
sergio

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-07-14 17:17   ` Sergio Luis
@ 2009-07-14 18:33     ` David Miller
  2009-07-14 18:37       ` Sergio Luis
  2009-07-14 18:58       ` Andrew Lutomirski
  0 siblings, 2 replies; 11+ messages in thread
From: David Miller @ 2009-07-14 18:33 UTC (permalink / raw)
  To: eeeesti; +Cc: amluto, netdev, linux-kernel

From: Sergio Luis <eeeesti@gmail.com>
Date: Tue, 14 Jul 2009 14:17:21 -0300

> is this patch incorrect, though? with the linkwatch_fire_event() call,
> the rfc2863 operstate will be set for everyone at device register
> time.

The issue is dumb drivers that do not manage their link state
at all.  We want them to always have their links up, from the
moment they are registered.

This is especially important for virtual devices.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-07-14 18:33     ` David Miller
@ 2009-07-14 18:37       ` Sergio Luis
  2009-07-14 18:58       ` Andrew Lutomirski
  1 sibling, 0 replies; 11+ messages in thread
From: Sergio Luis @ 2009-07-14 18:37 UTC (permalink / raw)
  To: David Miller; +Cc: amluto, netdev, linux-kernel

On Tue, Jul 14, 2009 at 3:33 PM, David Miller<davem@davemloft.net> wrote:
> From: Sergio Luis <eeeesti@gmail.com>
> Date: Tue, 14 Jul 2009 14:17:21 -0300
>
>> is this patch incorrect, though? with the linkwatch_fire_event() call,
>> the rfc2863 operstate will be set for everyone at device register
>> time.
>
> The issue is dumb drivers that do not manage their link state
> at all.  We want them to always have their links up, from the
> moment they are registered.
>
> This is especially important for virtual devices.
>

I see. I will try to take a look at the driver in question, then. Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] Tell linkwatch about new interfaces
  2009-07-14 18:33     ` David Miller
  2009-07-14 18:37       ` Sergio Luis
@ 2009-07-14 18:58       ` Andrew Lutomirski
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Lutomirski @ 2009-07-14 18:58 UTC (permalink / raw)
  To: David Miller; +Cc: eeeesti, netdev, linux-kernel

On Tue, Jul 14, 2009 at 2:33 PM, David Miller<davem@davemloft.net> wrote:
> From: Sergio Luis <eeeesti@gmail.com>
> Date: Tue, 14 Jul 2009 14:17:21 -0300
>
>> is this patch incorrect, though? with the linkwatch_fire_event() call,
>> the rfc2863 operstate will be set for everyone at device register
>> time.
>
> The issue is dumb drivers that do not manage their link state
> at all.  We want them to always have their links up, from the
> moment they are registered.

Such dumb drivers still end up with bogus operstate.

>
> This is especially important for virtual devices.

$ ip link show lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

I've never noticed this causing a problem, but it seems a little
silly.  Presumably lo should be "UP."

--Andy

>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-07-14 18:58 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-01 15:40 [PATCH 1/1] Tell linkwatch about new interfaces Andrew Lutomirski
2009-04-05  0:05 ` David Miller
2009-04-05  4:00   ` Andrew Lutomirski
2009-04-10  0:48     ` Brandeburg, Jesse
2009-04-11 15:46       ` Andrew Lutomirski
2009-04-13 23:22       ` David Miller
2009-04-14  7:43         ` Jeff Kirsher
2009-07-14 17:17   ` Sergio Luis
2009-07-14 18:33     ` David Miller
2009-07-14 18:37       ` Sergio Luis
2009-07-14 18:58       ` Andrew Lutomirski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).