b.a.t.m.a.n.lists.open-mesh.org archive mirror
 help / color / mirror / Atom feed
* [B.A.T.M.A.N.] [PATCH-maint v2] batman-adv: Avoid endless loop in bat-on-bat netdevice check
@ 2016-02-11 21:15 Sven Eckelmann
  2016-02-12 15:25 ` Andrew Lunn
  2016-02-16  9:19 ` Marek Lindner
  0 siblings, 2 replies; 4+ messages in thread
From: Sven Eckelmann @ 2016-02-11 21:15 UTC (permalink / raw)
  To: b.a.t.m.a.n

From: Andrew Lunn <andrew@lunn.ch>

batman-adv checks in different situation if a new device is already on top
of a different batman-adv device. This is done by getting the iflink of a
device and all its parent. It assumes that this iflink is always a parent
device in an acyclic graph. But this assumption is broken by devices like
veth which are actually a pair of two devices linked to each other. The
recursive check would therefore get veth0 when calling dev_get_iflink on
veth1. And it gets veth0 when calling dev_get_iflink with veth1.

Creating a veth pair and loading batman-adv freezes parts of the system

    ip link add veth0 type veth peer name veth1
    modprobe batman-adv

An RCU stall will be detected on the system which cannot be fixed.

    INFO: rcu_sched self-detected stall on CPU
            1: (5264 ticks this GP) idle=3e9/140000000000001/0
    softirq=144683/144686 fqs=5249
             (t=5250 jiffies g=46 c=45 q=43)
    Task dump for CPU 1:
    insmod          R  running task        0   247    245 0x00000008
     ffffffff8151f140 ffffffff8107888e ffff88000fd141c0 ffffffff8151f140
     0000000000000000 ffffffff81552df0 ffffffff8107b420 0000000000000001
     ffff88000e3fa700 ffffffff81540b00 ffffffff8107d667 0000000000000001
    Call Trace:
     <IRQ>  [<ffffffff8107888e>] ? rcu_dump_cpu_stacks+0x7e/0xd0
     [<ffffffff8107b420>] ? rcu_check_callbacks+0x3f0/0x6b0
     [<ffffffff8107d667>] ? hrtimer_run_queues+0x47/0x180
     [<ffffffff8107cf9d>] ? update_process_times+0x2d/0x50
     [<ffffffff810873fb>] ? tick_handle_periodic+0x1b/0x60
     [<ffffffff810290ae>] ? smp_trace_apic_timer_interrupt+0x5e/0x90
     [<ffffffff813bbae2>] ? apic_timer_interrupt+0x82/0x90
     <EOI>  [<ffffffff812c3fd7>] ? __dev_get_by_index+0x37/0x40
     [<ffffffffa0031f3e>] ? batadv_hard_if_event+0xee/0x3a0 [batman_adv]
     [<ffffffff812c5801>] ? register_netdevice_notifier+0x81/0x1a0
    [...]

This can be avoided by checking if two devices are each others parent and
stopping the check in this situation.

Fixes: 3d48811b27f5 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven@narfation.org: rewritten description, extracted fix]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
---
v2:
 - fix kernel-doc warning

 net/batman-adv/hard-interface.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index aea4d06..730cfa8 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -79,6 +79,28 @@ out:
 }
 
 /**
+ * batadv_mutual_parents - check if two devices are each others parent
+ * @dev1: 1st net_device
+ * @dev2: 2nd net_device
+ *
+ * veth devices come in pairs and each is the parent of the other!
+ *
+ * Return: true if the devices are each others parent, otherwise false
+ */
+static bool batadv_mutual_parents(const struct net_device *dev1,
+				  const struct net_device *dev2)
+{
+	int dev1_parent_iflink = dev_get_iflink(dev1);
+	int dev2_parent_iflink = dev_get_iflink(dev2);
+
+	if (!dev1_parent_iflink || !dev2_parent_iflink)
+		return false;
+
+	return (dev1_parent_iflink == dev2->ifindex) &&
+	       (dev2_parent_iflink == dev1->ifindex);
+}
+
+/**
  * batadv_is_on_batman_iface - check if a device is a batman iface descendant
  * @net_dev: the device to check
  *
@@ -111,6 +133,9 @@ static bool batadv_is_on_batman_iface(const struct net_device *net_dev)
 	if (WARN(!parent_dev, "Cannot find parent device"))
 		return false;
 
+	if (batadv_mutual_parents(net_dev, parent_dev))
+		return false;
+
 	ret = batadv_is_on_batman_iface(parent_dev);
 
 	return ret;
-- 
2.7.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [B.A.T.M.A.N.] [PATCH-maint v2] batman-adv: Avoid endless loop in bat-on-bat netdevice check
  2016-02-11 21:15 [B.A.T.M.A.N.] [PATCH-maint v2] batman-adv: Avoid endless loop in bat-on-bat netdevice check Sven Eckelmann
@ 2016-02-12 15:25 ` Andrew Lunn
  2016-02-12 15:49   ` Sven Eckelmann
  2016-02-16  9:19 ` Marek Lindner
  1 sibling, 1 reply; 4+ messages in thread
From: Andrew Lunn @ 2016-02-12 15:25 UTC (permalink / raw)
  To: Sven Eckelmann; +Cc: b.a.t.m.a.n

>  /**
> + * batadv_mutual_parents - check if two devices are each others parent
> + * @dev1: 1st net_device
> + * @dev2: 2nd net_device
> + *
> + * veth devices come in pairs and each is the parent of the other!
> + *
> + * Return: true if the devices are each others parent, otherwise false
> + */
> +static bool batadv_mutual_parents(const struct net_device *dev1,
> +				  const struct net_device *dev2)
> +{
> +	int dev1_parent_iflink = dev_get_iflink(dev1);
> +	int dev2_parent_iflink = dev_get_iflink(dev2);
> +
> +	if (!dev1_parent_iflink || !dev2_parent_iflink)
> +		return false;
> +
> +	return (dev1_parent_iflink == dev2->ifindex) &&
> +	       (dev2_parent_iflink == dev1->ifindex);
> +}

Hi Sven, et al,

So this is fine for the non netns case.

But what about the netns case? This is mainly a backward compatible
issue. It sounds like some of the older kernels you have via compat.h
are going to have issues with netns support. What do the maintainers
what to do about this? NACK my patches, drop support for some of the
older kernels? Something else?

Thanks
	Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [B.A.T.M.A.N.] [PATCH-maint v2] batman-adv: Avoid endless loop in bat-on-bat netdevice check
  2016-02-12 15:25 ` Andrew Lunn
@ 2016-02-12 15:49   ` Sven Eckelmann
  0 siblings, 0 replies; 4+ messages in thread
From: Sven Eckelmann @ 2016-02-12 15:49 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: b.a.t.m.a.n

[-- Attachment #1: Type: text/plain, Size: 1196 bytes --]

On Friday 12 February 2016 16:25:36 Andrew Lunn wrote:
> >  /**
> > 
> > + * batadv_mutual_parents - check if two devices are each others parent
> > + * @dev1: 1st net_device
> > + * @dev2: 2nd net_device
> > + *
> > + * veth devices come in pairs and each is the parent of the other!
> > + *
> > + * Return: true if the devices are each others parent, otherwise false
> > + */
> > +static bool batadv_mutual_parents(const struct net_device *dev1,
> > +				  const struct net_device *dev2)
> > +{
> > +	int dev1_parent_iflink = dev_get_iflink(dev1);
> > +	int dev2_parent_iflink = dev_get_iflink(dev2);
> > +
> > +	if (!dev1_parent_iflink || !dev2_parent_iflink)
> > +		return false;
> > +
> > +	return (dev1_parent_iflink == dev2->ifindex) &&
> > +	       (dev2_parent_iflink == dev1->ifindex);
> > +}
> 
> Hi Sven, et al,
> 
> So this is fine for the non netns case.

This basically has nothing to do with your original patchset. It is only to 
fix the problem reported some time ago and nothing else. I have extracted this 
part of the patch only because you suggested this approach and Antonio wanted 
to submit it to net.git

Lets discuss the rest in the correct thread.

Kind regards,
	Sven

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [B.A.T.M.A.N.] [PATCH-maint v2] batman-adv: Avoid endless loop in bat-on-bat netdevice check
  2016-02-11 21:15 [B.A.T.M.A.N.] [PATCH-maint v2] batman-adv: Avoid endless loop in bat-on-bat netdevice check Sven Eckelmann
  2016-02-12 15:25 ` Andrew Lunn
@ 2016-02-16  9:19 ` Marek Lindner
  1 sibling, 0 replies; 4+ messages in thread
From: Marek Lindner @ 2016-02-16  9:19 UTC (permalink / raw)
  To: b.a.t.m.a.n

[-- Attachment #1: Type: text/plain, Size: 2643 bytes --]

On Thursday, February 11, 2016 22:15:57 Sven Eckelmann wrote:
> From: Andrew Lunn <andrew@lunn.ch>
> 
> batman-adv checks in different situation if a new device is already on top
> of a different batman-adv device. This is done by getting the iflink of a
> device and all its parent. It assumes that this iflink is always a parent
> device in an acyclic graph. But this assumption is broken by devices like
> veth which are actually a pair of two devices linked to each other. The
> recursive check would therefore get veth0 when calling dev_get_iflink on
> veth1. And it gets veth0 when calling dev_get_iflink with veth1.
> 
> Creating a veth pair and loading batman-adv freezes parts of the system
> 
>     ip link add veth0 type veth peer name veth1
>     modprobe batman-adv
> 
> An RCU stall will be detected on the system which cannot be fixed.
> 
>     INFO: rcu_sched self-detected stall on CPU
>             1: (5264 ticks this GP) idle=3e9/140000000000001/0
>     softirq=144683/144686 fqs=5249
>              (t=5250 jiffies g=46 c=45 q=43)
>     Task dump for CPU 1:
>     insmod          R  running task        0   247    245 0x00000008
>      ffffffff8151f140 ffffffff8107888e ffff88000fd141c0 ffffffff8151f140
>      0000000000000000 ffffffff81552df0 ffffffff8107b420 0000000000000001
>      ffff88000e3fa700 ffffffff81540b00 ffffffff8107d667 0000000000000001
>     Call Trace:
>      <IRQ>  [<ffffffff8107888e>] ? rcu_dump_cpu_stacks+0x7e/0xd0
>      [<ffffffff8107b420>] ? rcu_check_callbacks+0x3f0/0x6b0
>      [<ffffffff8107d667>] ? hrtimer_run_queues+0x47/0x180
>      [<ffffffff8107cf9d>] ? update_process_times+0x2d/0x50
>      [<ffffffff810873fb>] ? tick_handle_periodic+0x1b/0x60
>      [<ffffffff810290ae>] ? smp_trace_apic_timer_interrupt+0x5e/0x90
>      [<ffffffff813bbae2>] ? apic_timer_interrupt+0x82/0x90
>      <EOI>  [<ffffffff812c3fd7>] ? __dev_get_by_index+0x37/0x40
>      [<ffffffffa0031f3e>] ? batadv_hard_if_event+0xee/0x3a0 [batman_adv]
>      [<ffffffff812c5801>] ? register_netdevice_notifier+0x81/0x1a0
>     [...]
> 
> This can be avoided by checking if two devices are each others parent and
> stopping the check in this situation.
> 
> Fixes: 3d48811b27f5 ("batman-adv: prevent using any virtual device created
> on batman-adv as hard-interface") Signed-off-by: Andrew Lunn
> <andrew@lunn.ch>
> [sven@narfation.org: rewritten description, extracted fix]
> Signed-off-by: Sven Eckelmann <sven@narfation.org>
> ---
> v2:
>  - fix kernel-doc warning
> 
>  net/batman-adv/hard-interface.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)

Applied in revision 2808f92.

Thanks,
Marek

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-02-16  9:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-11 21:15 [B.A.T.M.A.N.] [PATCH-maint v2] batman-adv: Avoid endless loop in bat-on-bat netdevice check Sven Eckelmann
2016-02-12 15:25 ` Andrew Lunn
2016-02-12 15:49   ` Sven Eckelmann
2016-02-16  9:19 ` Marek Lindner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).