netdev.vger.kernel.org archive mirror
* Bonding driver unexpected behaviour
@ 2020-07-15 12:46 pbl
  2020-07-16  3:41 ` Zhu Yanjun
  2020-07-16 15:50 ` Jay Vosburgh
  0 siblings, 2 replies; 10+ messages in thread
From: pbl @ 2020-07-15 12:46 UTC (permalink / raw)
  To: netdev

I'm attempting to set up the bonding driver on two gretap interfaces, gretap15 and gretap16
but I'm observing unexpected (to me) behaviour.
The underlying interfaces for those two are respectively intra15 (ipv4: 10.88.15.100/24) and
intra16 (ipv4: 10.88.16.100/24). These two are e1000 virtual network cards, connected through
virtual cables. As such, I would exclude any hardware issues. As a peer, I have another Linux
system configured similarly (ipv4s: 10.88.15.200 on intra15, 10.88.16.200 on intra16).

The gretap tunnels work as expected. They have the following ipv4 addresses:
          host           peer
gretap15  10.188.15.100  10.188.15.200
gretap16  10.188.16.100  10.188.16.200

When not enslaved by the bond interface, I'm able to exchange packets in the tunnel using the
internal ip addresses.
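
For reference, the tunnels are set up roughly like this (a sketch; gretap16 is analogous, with
16 in place of 15):
# ip link add gretap15 type gretap local 10.88.15.100 remote 10.88.15.200
# ip addr add 10.188.15.100/24 dev gretap15
# ip link set gretap15 up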

I then set up the bonding driver as follows:
# ip link add bond-15-16 type bond
# ip link set bond-15-16 type bond mode active-backup
# ip link set gretap15 down
# ip link set gretap16 down
# ip link set gretap15 master bond-15-16
# ip link set gretap16 master bond-15-16
# ip link set bond-15-16 mtu 1462
# ip addr add 10.42.42.100/24 dev bond-15-16
# ip link set bond-15-16 type bond arp_interval 100 arp_ip_target 10.42.42.200
# ip link set bond-15-16 up

I do the same on the peer system, inverting the interface and ARP target IP addresses.
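
While testing, I keep an eye on the bond state via /proc, which shows the currently active
slave, the ARP polling interval and targets, and the per-slave link status:
# cat /proc/net/bonding/bond-15-16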

At this point, IP communication using the addresses on the bond interfaces works as expected.
E.g.
# ping 10.42.42.200
gets responses from the other peer.
Using tcpdump on the other peer shows the GRE packets coming into intra15, and identical ICMP
packets coming through gretap15 and bond-15-16.

If I then disconnect the (virtual) network cable of intra15, the bonding driver switches to
gretap16, as the GRE tunnel can no longer pass packets. However, despite having primary_reselect=0,
when I reconnect the network cable of intra15, the driver doesn't switch back to gretap15. In fact,
it doesn't even attempt sending any probes through it.

Fiddling with the cables (e.g. reconnecting intra15 and then disconnecting intra16) and/or bringing
the bond interface down and up usually results in the driver ping-ponging a bit between gretap15
and gretap16, before eventually settling on gretap16 (but never on gretap15, it seems). Or,
sometimes, it results in the driver marking both slaves down and not doing anything ever again
until manual intervention (e.g. manually selecting a new active_slave, or down -> up).
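
For reference, the manual interventions that unstick it are along these lines:
# echo gretap15 > /sys/class/net/bond-15-16/bonding/active_slave
or simply:
# ip link set bond-15-16 down && ip link set bond-15-16 up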

Trying to ping the gretap15 address of the peer (10.188.15.200) from the host while gretap16 is the
active slave results in ARP traffic being temporarily exchanged on gretap15. I'm not sure whether
it originates from the bonding driver, as it seems like the generated requests are the Cartesian
product of all address pairs on the network segments of gretap15 and bond-15-16 (e.g. who-has
10.188.15.100 tell 10.188.15.100, who-has 10.188.15.100 tell 10.188.15.200, ..., who-has
10.42.42.200 tell 10.42.42.200).

uname -a:
Linux fo-gw 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux
(same on peer system)

Am I misunderstanding how the driver works? Have I made any mistakes in the configuration?

Best regards,
Riccardo P. Bestetti



* Re: Bonding driver unexpected behaviour
  2020-07-15 12:46 Bonding driver unexpected behaviour pbl
@ 2020-07-16  3:41 ` Zhu Yanjun
  2020-07-16  7:08   ` Riccardo Paolo Bestetti
  2020-07-16 15:50 ` Jay Vosburgh
  1 sibling, 1 reply; 10+ messages in thread
From: Zhu Yanjun @ 2020-07-16  3:41 UTC (permalink / raw)
  To: pbl; +Cc: netdev

On Wed, Jul 15, 2020 at 8:49 PM pbl@bestov.io <pbl@bestov.io> wrote:
>
> I'm attempting to set up the bonding driver on two gretap interfaces, gretap15 and gretap16
> but I'm observing unexpected (to me) behaviour.
> The underlying interfaces for those two are respectively intra15 (ipv4: 10.88.15.100/24) and
> intra16 (ipv4: 10.88.16.100/24). These two are e1000 virtual network cards, connected through
> virtual cables. As such, I would exclude any hardware issues. As a peer, I have another Linux
> system configured similarly (ipv4s: 10.88.15.200 on intra15, 10.88.16.200 on intra16).
>
> The gretap tunnels work as expected. They have the following ipv4 addresses:
>           host           peer
> gretap15  10.188.15.100  10.188.15.200
> gretap16  10.188.16.100  10.188.16.200
>
> When not enslaved by the bond interface, I'm able to exchange packets in the tunnel using the
> internal ip addresses.
>
> I then set up the bonding driver as follows:
> # ip link add bond-15-16 type bond
> # ip link set bond-15-16 type bond mode active-backup
> # ip link set gretap15 down
> # ip link set gretap16 down
> # ip link set gretap15 master bond-15-16
> # ip link set gretap16 master bond-15-16
> # ip link set bond-15-16 mtu 1462
> # ip addr add 10.42.42.100/24 dev bond-15-16
> # ip link set bond-15-16 type bond arp_interval 100 arp_ip_target 10.42.42.200
> # ip link set bond-15-16 up
>
> I do the same on the peer system, inverting the interface and ARP target IP addresses.
>
> At this point, IP communication using the addresses on the bond interfaces works as expected.
> E.g.
> # ping 10.42.42.200
> gets responses from the other peer.
> Using tcpdump on the other peer shows the GRE packets coming into intra15, and identical ICMP
> packets coming through gretap15 and bond-15-16.
>
> If I then disconnect the (virtual) network cable of intra15, the bonding driver switches to
> gretap16, as the GRE tunnel can no longer pass packets. However, despite having primary_reselect=0,
> when I reconnect the network cable of intra15, the driver doesn't switch back to gretap15. In fact,
> it doesn't even attempt sending any probes through it.
>
> Fiddling with the cables (e.g. reconnecting intra15 and then disconnecting intra16) and/or bringing
> the bond interface down and up usually results in the driver ping-ponging a bit between gretap15
> and gretap16, before eventually settling on gretap16 (but never on gretap15, it seems). Or,
> sometimes, it results in the driver marking both slaves down and not doing anything ever again
> until manual intervention (e.g. manually selecting a new active_slave, or down -> up).
>
> Trying to ping the gretap15 address of the peer (10.188.15.200) from the host while gretap16 is the
> active slave results in ARP traffic being temporarily exchanged on gretap15. I'm not sure whether
> it originates from the bonding driver, as it seems like the generated requests are the Cartesian
> product of all address pairs on the network segments of gretap15 and bond-15-16 (e.g. who-has
> 10.188.15.100 tell 10.188.15.100, who-has 10.188.15.100 tell 10.188.15.200, ..., who-has
> 10.42.42.200 tell 10.42.42.200).

Please check this
https://developers.redhat.com/blog/2019/05/17/an-introduction-to-linux-virtual-interfaces-tunnels/#gre

Perhaps gretap only forwards IP packets (with an L2 header).

Possibly "arp -s" could help to workaround this.

Zhu Yanjun
>
> uname -a:
> Linux fo-gw 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux
> (same on peer system)
>
> Am I misunderstanding how the driver works? Have I made any mistakes in the configuration?
>
> Best regards,
> Riccardo P. Bestetti
>


* Re: Bonding driver unexpected behaviour
  2020-07-16  3:41 ` Zhu Yanjun
@ 2020-07-16  7:08   ` Riccardo Paolo Bestetti
  2020-07-16  7:45     ` Zhu Yanjun
  0 siblings, 1 reply; 10+ messages in thread
From: Riccardo Paolo Bestetti @ 2020-07-16  7:08 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: netdev

Hello Zhu Yanjun,
 
On Thursday, July 16, 2020 05:41 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote: 

> 
> Please check this
> https://developers.redhat.com/blog/2019/05/17/an-introduction-to-linux-virtual-interfaces-tunnels/#gre
> 
> Perhaps gretap only forwards IP packets (with an L2 header).

That does not seem to be the case.
E.g.
root@fo-exit:/home/user# tcpdump -i intra16
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on intra16, link-type EN10MB (Ethernet), capture size 262144 bytes
09:05:12.619206 IP 10.88.16.100 > 10.88.16.200: GREv0, length 46: ARP, Request who-has 10.42.42.200 tell 10.42.42.100, length 28
09:05:12.619278 IP 10.88.16.200 > 10.88.16.100: GREv0, length 46: ARP, Reply 10.42.42.200 is-at da:9d:34:64:cb:8d (oui Unknown), length 28
09:05:14.054026 IP 10.88.16.200 > 10.88.16.100: GREv0, length 46: ARP, Request who-has 10.42.42.100 tell 10.42.42.200, length 28
09:05:14.107143 IP 10.88.16.100 > 10.88.16.200: GREv0, length 46: ARP, Reply 10.42.42.100 is-at d6:49:e5:19:52:16 (oui Unknown), length 28
^C

> 
> Possibly "arp -s" could help to work around this.

Riccardo P. Bestetti



* Re: Bonding driver unexpected behaviour
  2020-07-16  7:08   ` Riccardo Paolo Bestetti
@ 2020-07-16  7:45     ` Zhu Yanjun
  2020-07-16  8:08       ` Riccardo Paolo Bestetti
  0 siblings, 1 reply; 10+ messages in thread
From: Zhu Yanjun @ 2020-07-16  7:45 UTC (permalink / raw)
  To: Riccardo Paolo Bestetti; +Cc: netdev

On Thu, Jul 16, 2020 at 3:08 PM Riccardo Paolo Bestetti <pbl@bestov.io> wrote:
>
> Hello Zhu Yanjun,
>
> On Thursday, July 16, 2020 05:41 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
>
> >
> > Please check this
> > https://developers.redhat.com/blog/2019/05/17/an-introduction-to-linux-virtual-interfaces-tunnels/#gre
> >
> > Perhaps gretap only forwards IP packets (with an L2 header).
>
> That does not seem to be the case.
> E.g.
> root@fo-exit:/home/user# tcpdump -i intra16
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on intra16, link-type EN10MB (Ethernet), capture size 262144 bytes
> 09:05:12.619206 IP 10.88.16.100 > 10.88.16.200: GREv0, length 46: ARP, Request who-has 10.42.42.200 tell 10.42.42.100, length 28
> 09:05:12.619278 IP 10.88.16.200 > 10.88.16.100: GREv0, length 46: ARP, Reply 10.42.42.200 is-at da:9d:34:64:cb:8d (oui Unknown), length 28
> 09:05:14.054026 IP 10.88.16.200 > 10.88.16.100: GREv0, length 46: ARP, Request who-has 10.42.42.100 tell 10.42.42.200, length 28
> 09:05:14.107143 IP 10.88.16.100 > 10.88.16.200: GREv0, length 46: ARP, Reply 10.42.42.100 is-at d6:49:e5:19:52:16 (oui Unknown), length 28

Interesting problem. You can run tests with the team driver.

Zhu Yanjun

> ^C
>
> >
> > Possibly "arp -s" could help to work around this.
>
> Riccardo P. Bestetti
>


* Re: Bonding driver unexpected behaviour
  2020-07-16  7:45     ` Zhu Yanjun
@ 2020-07-16  8:08       ` Riccardo Paolo Bestetti
  2020-07-16  9:45         ` Zhu Yanjun
  0 siblings, 1 reply; 10+ messages in thread
From: Riccardo Paolo Bestetti @ 2020-07-16  8:08 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: netdev

Hello Zhu Yanjun, 

On Thursday, July 16, 2020 09:45 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote: 
> You can run tests with the team driver.
I'm not sure I understand what you mean. Could you point me to relevant documentation?

Riccardo P. Bestetti



* Re: Bonding driver unexpected behaviour
  2020-07-16  8:08       ` Riccardo Paolo Bestetti
@ 2020-07-16  9:45         ` Zhu Yanjun
  2020-07-16 10:20           ` Riccardo Paolo Bestetti
  0 siblings, 1 reply; 10+ messages in thread
From: Zhu Yanjun @ 2020-07-16  9:45 UTC (permalink / raw)
  To: Riccardo Paolo Bestetti; +Cc: netdev

On Thu, Jul 16, 2020 at 4:08 PM Riccardo Paolo Bestetti <pbl@bestov.io> wrote:
>
> Hello Zhu Yanjun,
>
> On Thursday, July 16, 2020 09:45 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
> > You can run tests with the team driver.
> I'm not sure I understand what you mean. Could you point me to relevant documentation?

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-comparison_of_network_teaming_to_bonding

Use the team driver instead of bonding for your tests.
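
A rough sketch of an equivalent active-backup team setup (untested; see teamd.conf(5) for the
exact option names):
# cat > team-15-16.conf <<'EOF'
{
  "device": "team-15-16",
  "runner": { "name": "activebackup" },
  "link_watch": {
    "name": "arp_ping",
    "interval": 100,
    "source_host": "10.42.42.100",
    "target_host": "10.42.42.200"
  },
  "ports": { "gretap15": {}, "gretap16": {} }
}
EOF
# teamd -f team-15-16.conf -d
# ip addr add 10.42.42.100/24 dev team-15-16
# ip link set team-15-16 up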

>
> Riccardo P. Bestetti
>


* Re: Bonding driver unexpected behaviour
  2020-07-16  9:45         ` Zhu Yanjun
@ 2020-07-16 10:20           ` Riccardo Paolo Bestetti
  2020-07-16 14:31             ` Zhu Yanjun
  0 siblings, 1 reply; 10+ messages in thread
From: Riccardo Paolo Bestetti @ 2020-07-16 10:20 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: netdev

Hello Zhu Yanjun,

On Thursday, July 16, 2020 11:45 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote: 
 
> On Thu, Jul 16, 2020 at 4:08 PM Riccardo Paolo Bestetti <pbl@bestov.io> wrote:
> >
> > 
> >
> > On Thursday, July 16, 2020 09:45 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
> > > You can run tests with the team driver.
> > I'm not sure I understand what you mean. Could you point me to relevant documentation?
> 
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-comparison_of_network_teaming_to_bonding
> 
> Use the team driver instead of bonding for your tests.
That seems like a Red Hat-specific feature. Unfortunately, I am not familiar with Red Hat.
Nor would I have the possibility of using Red Hat in production, even if I could get teaming to work instead of bonding.

Riccardo P. Bestetti



* Re: Bonding driver unexpected behaviour
  2020-07-16 10:20           ` Riccardo Paolo Bestetti
@ 2020-07-16 14:31             ` Zhu Yanjun
  0 siblings, 0 replies; 10+ messages in thread
From: Zhu Yanjun @ 2020-07-16 14:31 UTC (permalink / raw)
  To: Riccardo Paolo Bestetti; +Cc: netdev

On Thu, Jul 16, 2020 at 6:20 PM Riccardo Paolo Bestetti <pbl@bestov.io> wrote:
>
> Hello Zhu Yanjun,
>
> On Thursday, July 16, 2020 11:45 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
>
> > On Thu, Jul 16, 2020 at 4:08 PM Riccardo Paolo Bestetti <pbl@bestov.io> wrote:
> > >
> > >
> > >
> > > On Thursday, July 16, 2020 09:45 CEST, Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
> > > > You can run tests with the team driver.
> > > I'm not sure I understand what you mean. Could you point me to relevant documentation?
> >
> > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-comparison_of_network_teaming_to_bonding
> >
> > Use the team driver instead of bonding for your tests.
> That seems like a Red Hat-specific feature. Unfortunately, I am not familiar with Red Hat.

It would just be for testing.
The team driver does not belong to Red Hat; it is in the mainline kernel.
I am also not a Red Hat employee.

You can run tests with the team driver to find the root cause, then fix it.

IMHO, you can build the bonding and gretap drivers, run tests with
them to find out where the packets are dropped, and from there find
the root cause.
This is a direct method.

It is up to you how to find the root cause.

Zhu Yanjun
> Nor would I have the possibility of using Red Hat in production, even if I could get teaming to work instead of bonding.
>
> Riccardo P. Bestetti
>


* Re: Bonding driver unexpected behaviour
  2020-07-15 12:46 Bonding driver unexpected behaviour pbl
  2020-07-16  3:41 ` Zhu Yanjun
@ 2020-07-16 15:50 ` Jay Vosburgh
  2020-07-19 10:07   ` Riccardo Paolo Bestetti
  1 sibling, 1 reply; 10+ messages in thread
From: Jay Vosburgh @ 2020-07-16 15:50 UTC (permalink / raw)
  To: pbl; +Cc: netdev

pbl@bestov.io <pbl@bestov.io> wrote:

>I'm attempting to set up the bonding driver on two gretap interfaces,
>gretap15 and gretap16 but I'm observing unexpected (to me) behaviour.
>The underlying interfaces for those two are respectively intra15 (ipv4:
>10.88.15.100/24) and intra16 (ipv4: 10.88.16.100/24). These two are
>e1000 virtual network cards, connected through virtual cables. As such,
>I would exclude any hardware issues. As a peer, I have another Linux
>system configured similarly (ipv4s: 10.88.15.200 on intra15,
>10.88.16.200 on intra16).
>
>The gretap tunnels work as expected. They have the following ipv4 addresses:
>          host           peer
>gretap15  10.188.15.100  10.188.15.200
>gretap16  10.188.16.100  10.188.16.200
>
>When not enslaved by the bond interface, I'm able to exchange packets
>in the tunnel using the internal ip addresses.
>
>I then set up the bonding driver as follows:
># ip link add bond-15-16 type bond
># ip link set bond-15-16 type bond mode active-backup
># ip link set gretap15 down
># ip link set gretap16 down
># ip link set gretap15 master bond-15-16
># ip link set gretap16 master bond-15-16
># ip link set bond-15-16 mtu 1462
># ip addr add 10.42.42.100/24 dev bond-15-16
># ip link set bond-15-16 type bond arp_interval 100 arp_ip_target 10.42.42.200
># ip link set bond-15-16 up
>
>I do the same on the peer system, inverting the interface and ARP
>target IP addresses.
>
>At this point, IP communication using the addresses on the bond
>interfaces works as expected.
>E.g.
># ping 10.42.42.200
>gets responses from the other peer.
>Using tcpdump on the other peer shows the GRE packets coming into
>intra15, and identical ICMP packets coming through gretap15 and
>bond-15-16.
>
>If I then disconnect the (virtual) network cable of intra15, the
>bonding driver switches to gretap16, as the GRE tunnel can no longer
>pass packets. However, despite having primary_reselect=0, when I
>reconnect the network cable of intra15, the driver doesn't switch back
>to gretap15. In fact, it doesn't even attempt sending any probes
>through it.

	Based on your configuration above, I believe the lack of
fail-back to gretap15 is the expected behavior.  The primary_reselect
value only matters if some interface has been specified as "primary",
and your configuration does not do so.  Specifying something like

ip link set bond-15-16 type bond primary gretap15

	would likely result in the fail-back behavior you describe.
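
	For example, making the reselect policy explicit as well (untested
sketch; "always" corresponds to the primary_reselect=0 you mention):

ip link set bond-15-16 type bond primary gretap15 primary_reselect always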

>Fiddling with the cables (e.g. reconnecting intra15 and then
>disconnecting intra16) and/or bringing the bond interface down and up
>usually results in the driver ping-ponging a bit between gretap15 and
>gretap16, before eventually settling on gretap16 (but never on gretap15,
>it seems). Or, sometimes, it results in the driver marking both slaves
>down and not doing anything ever again until manual intervention
>(e.g. manually selecting a new active_slave, or down -> up).
>
>Trying to ping the gretap15 address of the peer (10.188.15.200) from
>the host while gretap16 is the active slave results in ARP traffic
>being temporarily exchanged on gretap15. I'm not sure whether it
>originates from the bonding driver, as it seems like the generated
>requests are the Cartesian product of all address pairs on the
>network segments of gretap15 and bond-15-16 (e.g. who-has 10.188.15.100
>tell 10.188.15.100, who-has 10.188.15.100 tell 10.188.15.200, ...,
>who-has 10.42.42.200 tell 10.42.42.200).

	Do these ARP requests receive appropriate ARP replies?  Do you
see entries for the addresses in the output of "ip neigh show", and are
they FAILED or REACHABLE / STALE?
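
	For example:

ip neigh show dev bond-15-16
ip neigh show dev gretap15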

	I have not tested, nor am I familiar with anyone using, IP tunnel
interfaces with bonding as you're doing, so it's possible that some
aspect of that is interfering with the function of the ARP monitor.

	-J

>uname -a:
>Linux fo-gw 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux
>(same on peer system)
>
>Am I misunderstanding how the driver works? Have I made any mistakes in
>the configuration?
>
>Best regards,
>Riccardo P. Bestetti
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com


* Re: Bonding driver unexpected behaviour
  2020-07-16 15:50 ` Jay Vosburgh
@ 2020-07-19 10:07   ` Riccardo Paolo Bestetti
  0 siblings, 0 replies; 10+ messages in thread
From: Riccardo Paolo Bestetti @ 2020-07-19 10:07 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

On Thursday, July 16, 2020 17:50 CEST, Jay Vosburgh <jay.vosburgh@canonical.com> wrote: 
 
> pbl@bestov.io <pbl@bestov.io> wrote:
> 	[...] I believe the lack of
> fail-back to gretap15 is the expected behavior.  The primary_reselect
> value only matters if some interface has been specified as "primary",
> and your configuration does not do so.  Specifying something like
> 
> ip link set bond-15-16 type bond primary gretap15
> 
> 	would likely result in the fail-back behavior you describe.
You are absolutely right! I had assumed that the first device added to the bond would become the
primary device, but that is of course not the documented behaviour, and a quick read of the source
code confirms it. I have now set gretap15 as the primary interface on both hosts.

> >Fiddling with the cables (e.g. reconnecting intra15 and then
> >disconnecting intra16) and/or bringing the bond interface down and up
> >usually results in the driver ping-ponging a bit between gretap15 and
> >gretap16, before eventually settling on gretap16 (but never on gretap15,
> >it seems). Or, sometimes, it results in the driver marking both slaves
> >down and not doing anything ever again until manual intervention
> >(e.g. manually selecting a new active_slave, or down -> up).
> >
> >Trying to ping the gretap15 address of the peer (10.188.15.200) from
> >the host while gretap16 is the active slave results in ARP traffic
> >being temporarily exchanged on gretap15. I'm not sure whether it
> >originates from the bonding driver, as it seems like the generated
> >requests are the Cartesian product of all address pairs on the
> >network segments of gretap15 and bond-15-16 (e.g. who-has 10.188.15.100
> >tell 10.188.15.100, who-has 10.188.15.100 tell 10.188.15.200, ...,
> >who-has 10.42.42.200 tell 10.42.42.200).
> 
> 	Do these ARP requests receive appropriate ARP replies?  Do you
> see entries for the addresses in the output of "ip neigh show", and are
> they FAILED or REACHABLE / STALE?
Yes, it seems that the ARP requests always get proper replies. This morning, I checked the
neighbour table and 10.88.15.200 (gretap15's peer on intra15) was marked as STALE. I assumed that
was because the bonding driver sent no ARP probes over gretap15, and thus no ARP requests went out
over intra15 to refresh the entry for 10.88.15.200 after it went STALE.
My assumption seems to be confirmed: I tried to ping 10.188.15.200 (the peer's address inside
gretap15) and, while it of course didn't respond - as the interface is enslaved to bond-15-16 - it
did create egress packets over gretap15, which in turn caused routing to send ARP requests over
intra15 to 10.88.15.200 in order to send out the encapsulating GRE packets.

Also,
# arping -I gretap15 10.42.42.200
gets proper replies and... it causes gretap15 to be immediately reselected. So it does indeed look
like the driver is simply not sending the probes, for no apparent reason (i.e. if it did send them,
the expected behaviour would follow).
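
To double-check, I left a capture running on the backup slave while the bond was otherwise idle:
# tcpdump -ni gretap15 arp
and nothing showed up there until the arping above was issued.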

I'm really not set up to poke around in the kernel. I would nonetheless like to attempt it, and to
prepare a patch if I succeed. Could you or anyone with netdev expertise point me to the right
places to look in the source code?
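
From a naive grep of the tree, I would guess the active-backup ARP monitor lives around
bond_ab_arp_monitor(), bond_ab_arp_probe() and bond_arp_send_all() in
drivers/net/bonding/bond_main.c, but that is just my guess of where to start:
# git grep -n 'bond_ab_arp' -- drivers/net/bonding/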

Riccardo P. Bestetti



end of thread

Thread overview: 10+ messages
2020-07-15 12:46 Bonding driver unexpected behaviour pbl
2020-07-16  3:41 ` Zhu Yanjun
2020-07-16  7:08   ` Riccardo Paolo Bestetti
2020-07-16  7:45     ` Zhu Yanjun
2020-07-16  8:08       ` Riccardo Paolo Bestetti
2020-07-16  9:45         ` Zhu Yanjun
2020-07-16 10:20           ` Riccardo Paolo Bestetti
2020-07-16 14:31             ` Zhu Yanjun
2020-07-16 15:50 ` Jay Vosburgh
2020-07-19 10:07   ` Riccardo Paolo Bestetti
