All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-07 17:42 ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-07 17:42 UTC (permalink / raw)
  To: Vlad Yasevich, Neil Horman; +Cc: netdev, linux-sctp

Hi folks,

This is an attempt to better choose a src address for sctp packets as
peers with rp_filter could be dropping our packets in some situations.
With this patch, we try to respect and use a src address that belongs to
the interface we are putting the packet out.

I have that feeling that there is be a better way to do this, but I
just couldn't see it.

This patch has been tested with and without gateways between the peers
and also just two peers connected via two subnets and results were
pretty good.

One could think that this limits the address combination we can use, but
such combinations probably are just bogus anyway. Like, if you have an
host with address A1 and B1 and another with A2 and B2, you cannot
expect that A can use A1 to reach B2 through subnet B, because the
return path would be via the other link which, when this switch happens,
we are thinking it's broken.

Thanks,
Marcelo

---8<---

In short, sctp is likely to incorrectly choose src address if socket is
bound to secondary addresses. This patch fixes it by adding a new check
that tries to anticipate if the src address would be expected by the
next hop/peer on this interface by doing reverse routing.

Also took the shot to reduce the indentation level on this code.

Details:

Currently, sctp will do a routing attempt without specifying the src
address and compare the returned value (preferred source) with the
addresses that the socket is bound to. When using secondary addresses,
this will not match.

Then it will try specifying each of the addresses that the socket is
bound to and re-routing, checking if that address is valid as src for
that dst. Thing is, this check alone is weak:

# ip r l
192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147

# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
       valid_lft 2160sec preferred_lft 2160sec
    inet 192.168.122.148/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe15:186a/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
       valid_lft 2162sec preferred_lft 2162sec
    inet 192.168.100.148/24 scope global secondary eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb3:9146/64 scope link
       valid_lft forever preferred_lft forever
4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5054:ff:fe05:47ee/64 scope link
       valid_lft forever preferred_lft forever

# ip r g 192.168.100.193 from 192.168.122.148
192.168.100.193 from 192.168.122.148 dev eth1
    cache

Even if you specify an interface:

# ip r g 192.168.100.193 from 192.168.122.148 oif eth1
192.168.100.193 from 192.168.122.148 dev eth1
    cache

Although this would be valid, peers using rp_filter will drop such
packets as their src doesn't match the routes for that interface.

So we fix this by adding an extra check, we try to do the reverse
routing and check if the interface used would be the same. If not, we
skip such address. If yes, we use it.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -53,6 +53,7 @@
 #include <net/net_namespace.h>
 #include <net/protocol.h>
 #include <net/ip.h>
+#include <net/ip_fib.h>
 #include <net/ipv6.h>
 #include <net/route.h>
 #include <net/sctp/sctp.h>
@@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
 	 */
 	rcu_read_lock();
 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
+		struct flowi4 in;
+		struct fib_result res;
+
 		if (!laddr->valid)
 			continue;
-		if ((laddr->state == SCTP_ADDR_SRC) &&
-		    (AF_INET == laddr->a.sa.sa_family)) {
-			fl4->fl4_sport = laddr->a.v4.sin_port;
-			flowi4_update_output(fl4,
-					     asoc->base.sk->sk_bound_dev_if,
-					     RT_CONN_FLAGS(asoc->base.sk),
-					     daddr->v4.sin_addr.s_addr,
-					     laddr->a.v4.sin_addr.s_addr);
-
-			rt = ip_route_output_key(sock_net(sk), fl4);
-			if (!IS_ERR(rt)) {
-				dst = &rt->dst;
-				goto out_unlock;
-			}
+		if (laddr->state != SCTP_ADDR_SRC ||
+		    AF_INET != laddr->a.sa.sa_family)
+			continue;
+
+		fl4->fl4_sport = laddr->a.v4.sin_port;
+		flowi4_update_output(fl4,
+				     asoc->base.sk->sk_bound_dev_if,
+				     RT_CONN_FLAGS(asoc->base.sk),
+				     daddr->v4.sin_addr.s_addr,
+				     laddr->a.v4.sin_addr.s_addr);
+
+		rt = ip_route_output_key(sock_net(sk), fl4);
+		if (IS_ERR(rt))
+			continue;
+
+		dst = &rt->dst;
+
+		/* Double check if this is really expected. We simulate
+		 * rp_filter by swapping src and dst.  If interfaces are
+		 * different, means that this src wouldn't be expected
+		 * by the other host on this interface.
+		 */
+		memcpy(&in, fl4, sizeof(in));
+		in.daddr = fl4->saddr;
+		in.saddr = fl4->daddr;
+		in.flowi4_iif = fl4->flowi4_oif;
+		in.flowi4_oif = 0;
+
+		if (fib_lookup(sock_net(sk), &in, &res) ||
+		    res.type != RTN_LOCAL ||
+		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
+			/* Failed, so this was a false hit */
+			dst_release(dst);
+			dst = NULL;
+			continue;
 		}
+
+		break;
 	}
 
 out_unlock:
-- 
2.4.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-07 17:42 ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-07 17:42 UTC (permalink / raw)
  To: Vlad Yasevich, Neil Horman; +Cc: netdev, linux-sctp

Hi folks,

This is an attempt to better choose a src address for sctp packets as
peers with rp_filter could be dropping our packets in some situations.
With this patch, we try to respect and use a src address that belongs to
the interface we are putting the packet out.

I have that feeling that there is be a better way to do this, but I
just couldn't see it.

This patch has been tested with and without gateways between the peers
and also just two peers connected via two subnets and results were
pretty good.

One could think that this limits the address combination we can use, but
such combinations probably are just bogus anyway. Like, if you have an
host with address A1 and B1 and another with A2 and B2, you cannot
expect that A can use A1 to reach B2 through subnet B, because the
return path would be via the other link which, when this switch happens,
we are thinking it's broken.

Thanks,
Marcelo

---8<---

In short, sctp is likely to incorrectly choose src address if socket is
bound to secondary addresses. This patch fixes it by adding a new check
that tries to anticipate if the src address would be expected by the
next hop/peer on this interface by doing reverse routing.

Also took the shot to reduce the indentation level on this code.

Details:

Currently, sctp will do a routing attempt without specifying the src
address and compare the returned value (preferred source) with the
addresses that the socket is bound to. When using secondary addresses,
this will not match.

Then it will try specifying each of the addresses that the socket is
bound to and re-routing, checking if that address is valid as src for
that dst. Thing is, this check alone is weak:

# ip r l
192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147

# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
       valid_lft 2160sec preferred_lft 2160sec
    inet 192.168.122.148/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe15:186a/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
       valid_lft 2162sec preferred_lft 2162sec
    inet 192.168.100.148/24 scope global secondary eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb3:9146/64 scope link
       valid_lft forever preferred_lft forever
4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5054:ff:fe05:47ee/64 scope link
       valid_lft forever preferred_lft forever

# ip r g 192.168.100.193 from 192.168.122.148
192.168.100.193 from 192.168.122.148 dev eth1
    cache

Even if you specify an interface:

# ip r g 192.168.100.193 from 192.168.122.148 oif eth1
192.168.100.193 from 192.168.122.148 dev eth1
    cache

Although this would be valid, peers using rp_filter will drop such
packets as their src doesn't match the routes for that interface.

So we fix this by adding an extra check, we try to do the reverse
routing and check if the interface used would be the same. If not, we
skip such address. If yes, we use it.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -53,6 +53,7 @@
 #include <net/net_namespace.h>
 #include <net/protocol.h>
 #include <net/ip.h>
+#include <net/ip_fib.h>
 #include <net/ipv6.h>
 #include <net/route.h>
 #include <net/sctp/sctp.h>
@@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
 	 */
 	rcu_read_lock();
 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
+		struct flowi4 in;
+		struct fib_result res;
+
 		if (!laddr->valid)
 			continue;
-		if ((laddr->state = SCTP_ADDR_SRC) &&
-		    (AF_INET = laddr->a.sa.sa_family)) {
-			fl4->fl4_sport = laddr->a.v4.sin_port;
-			flowi4_update_output(fl4,
-					     asoc->base.sk->sk_bound_dev_if,
-					     RT_CONN_FLAGS(asoc->base.sk),
-					     daddr->v4.sin_addr.s_addr,
-					     laddr->a.v4.sin_addr.s_addr);
-
-			rt = ip_route_output_key(sock_net(sk), fl4);
-			if (!IS_ERR(rt)) {
-				dst = &rt->dst;
-				goto out_unlock;
-			}
+		if (laddr->state != SCTP_ADDR_SRC ||
+		    AF_INET != laddr->a.sa.sa_family)
+			continue;
+
+		fl4->fl4_sport = laddr->a.v4.sin_port;
+		flowi4_update_output(fl4,
+				     asoc->base.sk->sk_bound_dev_if,
+				     RT_CONN_FLAGS(asoc->base.sk),
+				     daddr->v4.sin_addr.s_addr,
+				     laddr->a.v4.sin_addr.s_addr);
+
+		rt = ip_route_output_key(sock_net(sk), fl4);
+		if (IS_ERR(rt))
+			continue;
+
+		dst = &rt->dst;
+
+		/* Double check if this is really expected. We simulate
+		 * rp_filter by swapping src and dst.  If interfaces are
+		 * different, means that this src wouldn't be expected
+		 * by the other host on this interface.
+		 */
+		memcpy(&in, fl4, sizeof(in));
+		in.daddr = fl4->saddr;
+		in.saddr = fl4->daddr;
+		in.flowi4_iif = fl4->flowi4_oif;
+		in.flowi4_oif = 0;
+
+		if (fib_lookup(sock_net(sk), &in, &res) ||
+		    res.type != RTN_LOCAL ||
+		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
+			/* Failed, so this was a false hit */
+			dst_release(dst);
+			dst = NULL;
+			continue;
 		}
+
+		break;
 	}
 
 out_unlock:
-- 
2.4.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-07 17:42 ` Marcelo Ricardo Leitner
@ 2015-07-09 16:54   ` Marcelo Ricardo Leitner
  -1 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-09 16:54 UTC (permalink / raw)
  To: netdev, linux-sctp; +Cc: Vlad Yasevich, Neil Horman, tuexen

Cc'ing Michael too.

On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
> Hi folks,
> 
> This is an attempt to better choose a src address for sctp packets as
> peers with rp_filter could be dropping our packets in some situations.
> With this patch, we try to respect and use a src address that belongs to
> the interface we are putting the packet out.
> 
> I have that feeling that there is be a better way to do this, but I
> just couldn't see it.
> 
> This patch has been tested with and without gateways between the peers
> and also just two peers connected via two subnets and results were
> pretty good.
> 
> One could think that this limits the address combination we can use, but
> such combinations probably are just bogus anyway. Like, if you have an
> host with address A1 and B1 and another with A2 and B2, you cannot
> expect that A can use A1 to reach B2 through subnet B, because the
> return path would be via the other link which, when this switch happens,
> we are thinking it's broken.
> 
> Thanks,
> Marcelo
> 
> ---8<---
> 
> In short, sctp is likely to incorrectly choose src address if socket is
> bound to secondary addresses. This patch fixes it by adding a new check
> that tries to anticipate if the src address would be expected by the
> next hop/peer on this interface by doing reverse routing.
> 
> Also took the shot to reduce the indentation level on this code.
> 
> Details:
> 
> Currently, sctp will do a routing attempt without specifying the src
> address and compare the returned value (preferred source) with the
> addresses that the socket is bound to. When using secondary addresses,
> this will not match.
> 
> Then it will try specifying each of the addresses that the socket is
> bound to and re-routing, checking if that address is valid as src for
> that dst. Thing is, this check alone is weak:
> 
> # ip r l
> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
> 
> # ip a l
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>     link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
>     inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
>        valid_lft 2160sec preferred_lft 2160sec
>     inet 192.168.122.148/24 scope global secondary eth0
>        valid_lft forever preferred_lft forever
>     inet6 fe80::5054:ff:fe15:186a/64 scope link
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>     link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
>        valid_lft 2162sec preferred_lft 2162sec
>     inet 192.168.100.148/24 scope global secondary eth1
>        valid_lft forever preferred_lft forever
>     inet6 fe80::5054:ff:feb3:9146/64 scope link
>        valid_lft forever preferred_lft forever
> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>     link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5054:ff:fe05:47ee/64 scope link
>        valid_lft forever preferred_lft forever
> 
> # ip r g 192.168.100.193 from 192.168.122.148
> 192.168.100.193 from 192.168.122.148 dev eth1
>     cache
> 
> Even if you specify an interface:
> 
> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
> 192.168.100.193 from 192.168.122.148 dev eth1
>     cache
> 
> Although this would be valid, peers using rp_filter will drop such
> packets as their src doesn't match the routes for that interface.
> 
> So we fix this by adding an extra check, we try to do the reverse
> routing and check if the interface used would be the same. If not, we
> skip such address. If yes, we use it.
> 
> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
>  net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 41 insertions(+), 14 deletions(-)
> 
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -53,6 +53,7 @@
>  #include <net/net_namespace.h>
>  #include <net/protocol.h>
>  #include <net/ip.h>
> +#include <net/ip_fib.h>
>  #include <net/ipv6.h>
>  #include <net/route.h>
>  #include <net/sctp/sctp.h>
> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>  	 */
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> +		struct flowi4 in;
> +		struct fib_result res;
> +
>  		if (!laddr->valid)
>  			continue;
> -		if ((laddr->state == SCTP_ADDR_SRC) &&
> -		    (AF_INET == laddr->a.sa.sa_family)) {
> -			fl4->fl4_sport = laddr->a.v4.sin_port;
> -			flowi4_update_output(fl4,
> -					     asoc->base.sk->sk_bound_dev_if,
> -					     RT_CONN_FLAGS(asoc->base.sk),
> -					     daddr->v4.sin_addr.s_addr,
> -					     laddr->a.v4.sin_addr.s_addr);
> -
> -			rt = ip_route_output_key(sock_net(sk), fl4);
> -			if (!IS_ERR(rt)) {
> -				dst = &rt->dst;
> -				goto out_unlock;
> -			}
> +		if (laddr->state != SCTP_ADDR_SRC ||
> +		    AF_INET != laddr->a.sa.sa_family)
> +			continue;
> +
> +		fl4->fl4_sport = laddr->a.v4.sin_port;
> +		flowi4_update_output(fl4,
> +				     asoc->base.sk->sk_bound_dev_if,
> +				     RT_CONN_FLAGS(asoc->base.sk),
> +				     daddr->v4.sin_addr.s_addr,
> +				     laddr->a.v4.sin_addr.s_addr);
> +
> +		rt = ip_route_output_key(sock_net(sk), fl4);
> +		if (IS_ERR(rt))
> +			continue;
> +
> +		dst = &rt->dst;
> +
> +		/* Double check if this is really expected. We simulate
> +		 * rp_filter by swapping src and dst.  If interfaces are
> +		 * different, means that this src wouldn't be expected
> +		 * by the other host on this interface.
> +		 */
> +		memcpy(&in, fl4, sizeof(in));
> +		in.daddr = fl4->saddr;
> +		in.saddr = fl4->daddr;
> +		in.flowi4_iif = fl4->flowi4_oif;
> +		in.flowi4_oif = 0;
> +
> +		if (fib_lookup(sock_net(sk), &in, &res) ||
> +		    res.type != RTN_LOCAL ||
> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
> +			/* Failed, so this was a false hit */
> +			dst_release(dst);
> +			dst = NULL;
> +			continue;
>  		}
> +
> +		break;
>  	}
>  
>  out_unlock:
> -- 
> 2.4.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-09 16:54   ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-09 16:54 UTC (permalink / raw)
  To: netdev, linux-sctp; +Cc: Vlad Yasevich, Neil Horman, tuexen

Cc'ing Michael too.

On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
> Hi folks,
> 
> This is an attempt to better choose a src address for sctp packets as
> peers with rp_filter could be dropping our packets in some situations.
> With this patch, we try to respect and use a src address that belongs to
> the interface we are putting the packet out.
> 
> I have that feeling that there is be a better way to do this, but I
> just couldn't see it.
> 
> This patch has been tested with and without gateways between the peers
> and also just two peers connected via two subnets and results were
> pretty good.
> 
> One could think that this limits the address combination we can use, but
> such combinations probably are just bogus anyway. Like, if you have an
> host with address A1 and B1 and another with A2 and B2, you cannot
> expect that A can use A1 to reach B2 through subnet B, because the
> return path would be via the other link which, when this switch happens,
> we are thinking it's broken.
> 
> Thanks,
> Marcelo
> 
> ---8<---
> 
> In short, sctp is likely to incorrectly choose src address if socket is
> bound to secondary addresses. This patch fixes it by adding a new check
> that tries to anticipate if the src address would be expected by the
> next hop/peer on this interface by doing reverse routing.
> 
> Also took the shot to reduce the indentation level on this code.
> 
> Details:
> 
> Currently, sctp will do a routing attempt without specifying the src
> address and compare the returned value (preferred source) with the
> addresses that the socket is bound to. When using secondary addresses,
> this will not match.
> 
> Then it will try specifying each of the addresses that the socket is
> bound to and re-routing, checking if that address is valid as src for
> that dst. Thing is, this check alone is weak:
> 
> # ip r l
> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
> 
> # ip a l
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>     link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
>     inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
>        valid_lft 2160sec preferred_lft 2160sec
>     inet 192.168.122.148/24 scope global secondary eth0
>        valid_lft forever preferred_lft forever
>     inet6 fe80::5054:ff:fe15:186a/64 scope link
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>     link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
>     inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
>        valid_lft 2162sec preferred_lft 2162sec
>     inet 192.168.100.148/24 scope global secondary eth1
>        valid_lft forever preferred_lft forever
>     inet6 fe80::5054:ff:feb3:9146/64 scope link
>        valid_lft forever preferred_lft forever
> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>     link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::5054:ff:fe05:47ee/64 scope link
>        valid_lft forever preferred_lft forever
> 
> # ip r g 192.168.100.193 from 192.168.122.148
> 192.168.100.193 from 192.168.122.148 dev eth1
>     cache
> 
> Even if you specify an interface:
> 
> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
> 192.168.100.193 from 192.168.122.148 dev eth1
>     cache
> 
> Although this would be valid, peers using rp_filter will drop such
> packets as their src doesn't match the routes for that interface.
> 
> So we fix this by adding an extra check, we try to do the reverse
> routing and check if the interface used would be the same. If not, we
> skip such address. If yes, we use it.
> 
> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
>  net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 41 insertions(+), 14 deletions(-)
> 
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -53,6 +53,7 @@
>  #include <net/net_namespace.h>
>  #include <net/protocol.h>
>  #include <net/ip.h>
> +#include <net/ip_fib.h>
>  #include <net/ipv6.h>
>  #include <net/route.h>
>  #include <net/sctp/sctp.h>
> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>  	 */
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> +		struct flowi4 in;
> +		struct fib_result res;
> +
>  		if (!laddr->valid)
>  			continue;
> -		if ((laddr->state = SCTP_ADDR_SRC) &&
> -		    (AF_INET = laddr->a.sa.sa_family)) {
> -			fl4->fl4_sport = laddr->a.v4.sin_port;
> -			flowi4_update_output(fl4,
> -					     asoc->base.sk->sk_bound_dev_if,
> -					     RT_CONN_FLAGS(asoc->base.sk),
> -					     daddr->v4.sin_addr.s_addr,
> -					     laddr->a.v4.sin_addr.s_addr);
> -
> -			rt = ip_route_output_key(sock_net(sk), fl4);
> -			if (!IS_ERR(rt)) {
> -				dst = &rt->dst;
> -				goto out_unlock;
> -			}
> +		if (laddr->state != SCTP_ADDR_SRC ||
> +		    AF_INET != laddr->a.sa.sa_family)
> +			continue;
> +
> +		fl4->fl4_sport = laddr->a.v4.sin_port;
> +		flowi4_update_output(fl4,
> +				     asoc->base.sk->sk_bound_dev_if,
> +				     RT_CONN_FLAGS(asoc->base.sk),
> +				     daddr->v4.sin_addr.s_addr,
> +				     laddr->a.v4.sin_addr.s_addr);
> +
> +		rt = ip_route_output_key(sock_net(sk), fl4);
> +		if (IS_ERR(rt))
> +			continue;
> +
> +		dst = &rt->dst;
> +
> +		/* Double check if this is really expected. We simulate
> +		 * rp_filter by swapping src and dst.  If interfaces are
> +		 * different, means that this src wouldn't be expected
> +		 * by the other host on this interface.
> +		 */
> +		memcpy(&in, fl4, sizeof(in));
> +		in.daddr = fl4->saddr;
> +		in.saddr = fl4->daddr;
> +		in.flowi4_iif = fl4->flowi4_oif;
> +		in.flowi4_oif = 0;
> +
> +		if (fib_lookup(sock_net(sk), &in, &res) ||
> +		    res.type != RTN_LOCAL ||
> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
> +			/* Failed, so this was a false hit */
> +			dst_release(dst);
> +			dst = NULL;
> +			continue;
>  		}
> +
> +		break;
>  	}
>  
>  out_unlock:
> -- 
> 2.4.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-09 16:54   ` Marcelo Ricardo Leitner
@ 2015-07-09 19:55     ` Michael Tuexen
  -1 siblings, 0 replies; 22+ messages in thread
From: Michael Tuexen @ 2015-07-09 19:55 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: netdev, linux-sctp, Vlad Yasevich, Neil Horman

> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> 
> Cc'ing Michael too.
I'm not familiar with the Linux kernel code, so I can't comment on it.
But making sure to use a source address belonging to the emitting
interface makes sense for me.

Best regards
Michael
> 
> On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
>> Hi folks,
>> 
>> This is an attempt to better choose a src address for sctp packets as
>> peers with rp_filter could be dropping our packets in some situations.
>> With this patch, we try to respect and use a src address that belongs to
>> the interface we are putting the packet out.
>> 
>> I have that feeling that there is be a better way to do this, but I
>> just couldn't see it.
>> 
>> This patch has been tested with and without gateways between the peers
>> and also just two peers connected via two subnets and results were
>> pretty good.
>> 
>> One could think that this limits the address combination we can use, but
>> such combinations probably are just bogus anyway. Like, if you have an
>> host with address A1 and B1 and another with A2 and B2, you cannot
>> expect that A can use A1 to reach B2 through subnet B, because the
>> return path would be via the other link which, when this switch happens,
>> we are thinking it's broken.
>> 
>> Thanks,
>> Marcelo
>> 
>> ---8<---
>> 
>> In short, sctp is likely to incorrectly choose src address if socket is
>> bound to secondary addresses. This patch fixes it by adding a new check
>> that tries to anticipate if the src address would be expected by the
>> next hop/peer on this interface by doing reverse routing.
>> 
>> Also took the shot to reduce the indentation level on this code.
>> 
>> Details:
>> 
>> Currently, sctp will do a routing attempt without specifying the src
>> address and compare the returned value (preferred source) with the
>> addresses that the socket is bound to. When using secondary addresses,
>> this will not match.
>> 
>> Then it will try specifying each of the addresses that the socket is
>> bound to and re-routing, checking if that address is valid as src for
>> that dst. Thing is, this check alone is weak:
>> 
>> # ip r l
>> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
>> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
>> 
>> # ip a l
>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
>>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>    inet 127.0.0.1/8 scope host lo
>>       valid_lft forever preferred_lft forever
>>    inet6 ::1/128 scope host
>>       valid_lft forever preferred_lft forever
>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
>>    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
>>       valid_lft 2160sec preferred_lft 2160sec
>>    inet 192.168.122.148/24 scope global secondary eth0
>>       valid_lft forever preferred_lft forever
>>    inet6 fe80::5054:ff:fe15:186a/64 scope link
>>       valid_lft forever preferred_lft forever
>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
>>    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
>>       valid_lft 2162sec preferred_lft 2162sec
>>    inet 192.168.100.148/24 scope global secondary eth1
>>       valid_lft forever preferred_lft forever
>>    inet6 fe80::5054:ff:feb3:9146/64 scope link
>>       valid_lft forever preferred_lft forever
>> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
>>    inet6 fe80::5054:ff:fe05:47ee/64 scope link
>>       valid_lft forever preferred_lft forever
>> 
>> # ip r g 192.168.100.193 from 192.168.122.148
>> 192.168.100.193 from 192.168.122.148 dev eth1
>>    cache
>> 
>> Even if you specify an interface:
>> 
>> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
>> 192.168.100.193 from 192.168.122.148 dev eth1
>>    cache
>> 
>> Although this would be valid, peers using rp_filter will drop such
>> packets as their src doesn't match the routes for that interface.
>> 
>> So we fix this by adding an extra check, we try to do the reverse
>> routing and check if the interface used would be the same. If not, we
>> skip such address. If yes, we use it.
>> 
>> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>> ---
>> net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
>> 1 file changed, 41 insertions(+), 14 deletions(-)
>> 
>> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
>> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
>> --- a/net/sctp/protocol.c
>> +++ b/net/sctp/protocol.c
>> @@ -53,6 +53,7 @@
>> #include <net/net_namespace.h>
>> #include <net/protocol.h>
>> #include <net/ip.h>
>> +#include <net/ip_fib.h>
>> #include <net/ipv6.h>
>> #include <net/route.h>
>> #include <net/sctp/sctp.h>
>> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>> 	 */
>> 	rcu_read_lock();
>> 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
>> +		struct flowi4 in;
>> +		struct fib_result res;
>> +
>> 		if (!laddr->valid)
>> 			continue;
>> -		if ((laddr->state == SCTP_ADDR_SRC) &&
>> -		    (AF_INET == laddr->a.sa.sa_family)) {
>> -			fl4->fl4_sport = laddr->a.v4.sin_port;
>> -			flowi4_update_output(fl4,
>> -					     asoc->base.sk->sk_bound_dev_if,
>> -					     RT_CONN_FLAGS(asoc->base.sk),
>> -					     daddr->v4.sin_addr.s_addr,
>> -					     laddr->a.v4.sin_addr.s_addr);
>> -
>> -			rt = ip_route_output_key(sock_net(sk), fl4);
>> -			if (!IS_ERR(rt)) {
>> -				dst = &rt->dst;
>> -				goto out_unlock;
>> -			}
>> +		if (laddr->state != SCTP_ADDR_SRC ||
>> +		    AF_INET != laddr->a.sa.sa_family)
>> +			continue;
>> +
>> +		fl4->fl4_sport = laddr->a.v4.sin_port;
>> +		flowi4_update_output(fl4,
>> +				     asoc->base.sk->sk_bound_dev_if,
>> +				     RT_CONN_FLAGS(asoc->base.sk),
>> +				     daddr->v4.sin_addr.s_addr,
>> +				     laddr->a.v4.sin_addr.s_addr);
>> +
>> +		rt = ip_route_output_key(sock_net(sk), fl4);
>> +		if (IS_ERR(rt))
>> +			continue;
>> +
>> +		dst = &rt->dst;
>> +
>> +		/* Double check if this is really expected. We simulate
>> +		 * rp_filter by swapping src and dst.  If interfaces are
>> +		 * different, means that this src wouldn't be expected
>> +		 * by the other host on this interface.
>> +		 */
>> +		memcpy(&in, fl4, sizeof(in));
>> +		in.daddr = fl4->saddr;
>> +		in.saddr = fl4->daddr;
>> +		in.flowi4_iif = fl4->flowi4_oif;
>> +		in.flowi4_oif = 0;
>> +
>> +		if (fib_lookup(sock_net(sk), &in, &res) ||
>> +		    res.type != RTN_LOCAL ||
>> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
>> +			/* Failed, so this was a false hit */
>> +			dst_release(dst);
>> +			dst = NULL;
>> +			continue;
>> 		}
>> +
>> +		break;
>> 	}
>> 
>> out_unlock:
>> -- 
>> 2.4.1
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-09 19:55     ` Michael Tuexen
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Tuexen @ 2015-07-09 19:55 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: netdev, linux-sctp, Vlad Yasevich, Neil Horman

> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> 
> Cc'ing Michael too.
I'm not familiar with the Linux kernel code, so I can't comment on it.
But making sure to use a source address belonging to the emitting
interface makes sense for me.

Best regards
Michael
> 
> On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
>> Hi folks,
>> 
>> This is an attempt to better choose a src address for sctp packets as
>> peers with rp_filter could be dropping our packets in some situations.
>> With this patch, we try to respect and use a src address that belongs to
>> the interface we are putting the packet out.
>> 
>> I have that feeling that there is be a better way to do this, but I
>> just couldn't see it.
>> 
>> This patch has been tested with and without gateways between the peers
>> and also just two peers connected via two subnets and results were
>> pretty good.
>> 
>> One could think that this limits the address combination we can use, but
>> such combinations probably are just bogus anyway. Like, if you have an
>> host with address A1 and B1 and another with A2 and B2, you cannot
>> expect that A can use A1 to reach B2 through subnet B, because the
>> return path would be via the other link which, when this switch happens,
>> we are thinking it's broken.
>> 
>> Thanks,
>> Marcelo
>> 
>> ---8<---
>> 
>> In short, sctp is likely to incorrectly choose src address if socket is
>> bound to secondary addresses. This patch fixes it by adding a new check
>> that tries to anticipate if the src address would be expected by the
>> next hop/peer on this interface by doing reverse routing.
>> 
>> Also took the shot to reduce the indentation level on this code.
>> 
>> Details:
>> 
>> Currently, sctp will do a routing attempt without specifying the src
>> address and compare the returned value (preferred source) with the
>> addresses that the socket is bound to. When using secondary addresses,
>> this will not match.
>> 
>> Then it will try specifying each of the addresses that the socket is
>> bound to and re-routing, checking if that address is valid as src for
>> that dst. Thing is, this check alone is weak:
>> 
>> # ip r l
>> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
>> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
>> 
>> # ip a l
>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
>>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>    inet 127.0.0.1/8 scope host lo
>>       valid_lft forever preferred_lft forever
>>    inet6 ::1/128 scope host
>>       valid_lft forever preferred_lft forever
>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
>>    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
>>       valid_lft 2160sec preferred_lft 2160sec
>>    inet 192.168.122.148/24 scope global secondary eth0
>>       valid_lft forever preferred_lft forever
>>    inet6 fe80::5054:ff:fe15:186a/64 scope link
>>       valid_lft forever preferred_lft forever
>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
>>    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
>>       valid_lft 2162sec preferred_lft 2162sec
>>    inet 192.168.100.148/24 scope global secondary eth1
>>       valid_lft forever preferred_lft forever
>>    inet6 fe80::5054:ff:feb3:9146/64 scope link
>>       valid_lft forever preferred_lft forever
>> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
>>    inet6 fe80::5054:ff:fe05:47ee/64 scope link
>>       valid_lft forever preferred_lft forever
>> 
>> # ip r g 192.168.100.193 from 192.168.122.148
>> 192.168.100.193 from 192.168.122.148 dev eth1
>>    cache
>> 
>> Even if you specify an interface:
>> 
>> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
>> 192.168.100.193 from 192.168.122.148 dev eth1
>>    cache
>> 
>> Although this would be valid, peers using rp_filter will drop such
>> packets as their src doesn't match the routes for that interface.
>> 
>> So we fix this by adding an extra check, we try to do the reverse
>> routing and check if the interface used would be the same. If not, we
>> skip such address. If yes, we use it.
>> 
>> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>> ---
>> net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
>> 1 file changed, 41 insertions(+), 14 deletions(-)
>> 
>> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
>> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
>> --- a/net/sctp/protocol.c
>> +++ b/net/sctp/protocol.c
>> @@ -53,6 +53,7 @@
>> #include <net/net_namespace.h>
>> #include <net/protocol.h>
>> #include <net/ip.h>
>> +#include <net/ip_fib.h>
>> #include <net/ipv6.h>
>> #include <net/route.h>
>> #include <net/sctp/sctp.h>
>> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>> 	 */
>> 	rcu_read_lock();
>> 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
>> +		struct flowi4 in;
>> +		struct fib_result res;
>> +
>> 		if (!laddr->valid)
>> 			continue;
>> -		if ((laddr->state = SCTP_ADDR_SRC) &&
>> -		    (AF_INET = laddr->a.sa.sa_family)) {
>> -			fl4->fl4_sport = laddr->a.v4.sin_port;
>> -			flowi4_update_output(fl4,
>> -					     asoc->base.sk->sk_bound_dev_if,
>> -					     RT_CONN_FLAGS(asoc->base.sk),
>> -					     daddr->v4.sin_addr.s_addr,
>> -					     laddr->a.v4.sin_addr.s_addr);
>> -
>> -			rt = ip_route_output_key(sock_net(sk), fl4);
>> -			if (!IS_ERR(rt)) {
>> -				dst = &rt->dst;
>> -				goto out_unlock;
>> -			}
>> +		if (laddr->state != SCTP_ADDR_SRC ||
>> +		    AF_INET != laddr->a.sa.sa_family)
>> +			continue;
>> +
>> +		fl4->fl4_sport = laddr->a.v4.sin_port;
>> +		flowi4_update_output(fl4,
>> +				     asoc->base.sk->sk_bound_dev_if,
>> +				     RT_CONN_FLAGS(asoc->base.sk),
>> +				     daddr->v4.sin_addr.s_addr,
>> +				     laddr->a.v4.sin_addr.s_addr);
>> +
>> +		rt = ip_route_output_key(sock_net(sk), fl4);
>> +		if (IS_ERR(rt))
>> +			continue;
>> +
>> +		dst = &rt->dst;
>> +
>> +		/* Double check if this is really expected. We simulate
>> +		 * rp_filter by swapping src and dst.  If interfaces are
>> +		 * different, means that this src wouldn't be expected
>> +		 * by the other host on this interface.
>> +		 */
>> +		memcpy(&in, fl4, sizeof(in));
>> +		in.daddr = fl4->saddr;
>> +		in.saddr = fl4->daddr;
>> +		in.flowi4_iif = fl4->flowi4_oif;
>> +		in.flowi4_oif = 0;
>> +
>> +		if (fib_lookup(sock_net(sk), &in, &res) ||
>> +		    res.type != RTN_LOCAL ||
>> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
>> +			/* Failed, so this was a false hit */
>> +			dst_release(dst);
>> +			dst = NULL;
>> +			continue;
>> 		}
>> +
>> +		break;
>> 	}
>> 
>> out_unlock:
>> -- 
>> 2.4.1
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-09 19:55     ` Michael Tuexen
@ 2015-07-10 11:53       ` Marcelo Ricardo Leitner
  -1 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-10 11:53 UTC (permalink / raw)
  To: Michael Tuexen; +Cc: netdev, linux-sctp, Vlad Yasevich, Neil Horman

On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
> > On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> > 
> > Cc'ing Michael too.
> I'm not familiar with the Linux kernel code, so I can't comment on it.
> But making sure to use a source address belonging to the emitting
> interface makes sense for me.
> 
> Best regards
> Michael

That's pretty much what I was looking for, in case I missed something on
SCTP RFCs. 

Thanks Michael!

> > On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
> >> Hi folks,
> >> 
> >> This is an attempt to better choose a src address for sctp packets as
> >> peers with rp_filter could be dropping our packets in some situations.
> >> With this patch, we try to respect and use a src address that belongs to
> >> the interface we are putting the packet out.
> >> 
> >> I have that feeling that there is be a better way to do this, but I
> >> just couldn't see it.
> >> 
> >> This patch has been tested with and without gateways between the peers
> >> and also just two peers connected via two subnets and results were
> >> pretty good.
> >> 
> >> One could think that this limits the address combination we can use, but
> >> such combinations probably are just bogus anyway. Like, if you have an
> >> host with address A1 and B1 and another with A2 and B2, you cannot
> >> expect that A can use A1 to reach B2 through subnet B, because the
> >> return path would be via the other link which, when this switch happens,
> >> we are thinking it's broken.
> >> 
> >> Thanks,
> >> Marcelo
> >> 
> >> ---8<---
> >> 
> >> In short, sctp is likely to incorrectly choose src address if socket is
> >> bound to secondary addresses. This patch fixes it by adding a new check
> >> that tries to anticipate if the src address would be expected by the
> >> next hop/peer on this interface by doing reverse routing.
> >> 
> >> Also took the shot to reduce the indentation level on this code.
> >> 
> >> Details:
> >> 
> >> Currently, sctp will do a routing attempt without specifying the src
> >> address and compare the returned value (preferred source) with the
> >> addresses that the socket is bound to. When using secondary addresses,
> >> this will not match.
> >> 
> >> Then it will try specifying each of the addresses that the socket is
> >> bound to and re-routing, checking if that address is valid as src for
> >> that dst. Thing is, this check alone is weak:
> >> 
> >> # ip r l
> >> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
> >> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
> >> 
> >> # ip a l
> >> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
> >>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >>    inet 127.0.0.1/8 scope host lo
> >>       valid_lft forever preferred_lft forever
> >>    inet6 ::1/128 scope host
> >>       valid_lft forever preferred_lft forever
> >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >>    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
> >>    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
> >>       valid_lft 2160sec preferred_lft 2160sec
> >>    inet 192.168.122.148/24 scope global secondary eth0
> >>       valid_lft forever preferred_lft forever
> >>    inet6 fe80::5054:ff:fe15:186a/64 scope link
> >>       valid_lft forever preferred_lft forever
> >> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >>    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
> >>    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
> >>       valid_lft 2162sec preferred_lft 2162sec
> >>    inet 192.168.100.148/24 scope global secondary eth1
> >>       valid_lft forever preferred_lft forever
> >>    inet6 fe80::5054:ff:feb3:9146/64 scope link
> >>       valid_lft forever preferred_lft forever
> >> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >>    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
> >>    inet6 fe80::5054:ff:fe05:47ee/64 scope link
> >>       valid_lft forever preferred_lft forever
> >> 
> >> # ip r g 192.168.100.193 from 192.168.122.148
> >> 192.168.100.193 from 192.168.122.148 dev eth1
> >>    cache
> >> 
> >> Even if you specify an interface:
> >> 
> >> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
> >> 192.168.100.193 from 192.168.122.148 dev eth1
> >>    cache
> >> 
> >> Although this would be valid, peers using rp_filter will drop such
> >> packets as their src doesn't match the routes for that interface.
> >> 
> >> So we fix this by adding an extra check, we try to do the reverse
> >> routing and check if the interface used would be the same. If not, we
> >> skip such address. If yes, we use it.
> >> 
> >> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >> ---
> >> net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
> >> 1 file changed, 41 insertions(+), 14 deletions(-)
> >> 
> >> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> >> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
> >> --- a/net/sctp/protocol.c
> >> +++ b/net/sctp/protocol.c
> >> @@ -53,6 +53,7 @@
> >> #include <net/net_namespace.h>
> >> #include <net/protocol.h>
> >> #include <net/ip.h>
> >> +#include <net/ip_fib.h>
> >> #include <net/ipv6.h>
> >> #include <net/route.h>
> >> #include <net/sctp/sctp.h>
> >> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
> >> 	 */
> >> 	rcu_read_lock();
> >> 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> >> +		struct flowi4 in;
> >> +		struct fib_result res;
> >> +
> >> 		if (!laddr->valid)
> >> 			continue;
> >> -		if ((laddr->state == SCTP_ADDR_SRC) &&
> >> -		    (AF_INET == laddr->a.sa.sa_family)) {
> >> -			fl4->fl4_sport = laddr->a.v4.sin_port;
> >> -			flowi4_update_output(fl4,
> >> -					     asoc->base.sk->sk_bound_dev_if,
> >> -					     RT_CONN_FLAGS(asoc->base.sk),
> >> -					     daddr->v4.sin_addr.s_addr,
> >> -					     laddr->a.v4.sin_addr.s_addr);
> >> -
> >> -			rt = ip_route_output_key(sock_net(sk), fl4);
> >> -			if (!IS_ERR(rt)) {
> >> -				dst = &rt->dst;
> >> -				goto out_unlock;
> >> -			}
> >> +		if (laddr->state != SCTP_ADDR_SRC ||
> >> +		    AF_INET != laddr->a.sa.sa_family)
> >> +			continue;
> >> +
> >> +		fl4->fl4_sport = laddr->a.v4.sin_port;
> >> +		flowi4_update_output(fl4,
> >> +				     asoc->base.sk->sk_bound_dev_if,
> >> +				     RT_CONN_FLAGS(asoc->base.sk),
> >> +				     daddr->v4.sin_addr.s_addr,
> >> +				     laddr->a.v4.sin_addr.s_addr);
> >> +
> >> +		rt = ip_route_output_key(sock_net(sk), fl4);
> >> +		if (IS_ERR(rt))
> >> +			continue;
> >> +
> >> +		dst = &rt->dst;
> >> +
> >> +		/* Double check if this is really expected. We simulate
> >> +		 * rp_filter by swapping src and dst.  If interfaces are
> >> +		 * different, means that this src wouldn't be expected
> >> +		 * by the other host on this interface.
> >> +		 */
> >> +		memcpy(&in, fl4, sizeof(in));
> >> +		in.daddr = fl4->saddr;
> >> +		in.saddr = fl4->daddr;
> >> +		in.flowi4_iif = fl4->flowi4_oif;
> >> +		in.flowi4_oif = 0;
> >> +
> >> +		if (fib_lookup(sock_net(sk), &in, &res) ||
> >> +		    res.type != RTN_LOCAL ||
> >> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
> >> +			/* Failed, so this was a false hit */
> >> +			dst_release(dst);
> >> +			dst = NULL;
> >> +			continue;
> >> 		}
> >> +
> >> +		break;
> >> 	}
> >> 
> >> out_unlock:
> >> -- 
> >> 2.4.1
> >> 
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> 
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-10 11:53       ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-10 11:53 UTC (permalink / raw)
  To: Michael Tuexen; +Cc: netdev, linux-sctp, Vlad Yasevich, Neil Horman

On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
> > On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> > 
> > Cc'ing Michael too.
> I'm not familiar with the Linux kernel code, so I can't comment on it.
> But making sure to use a source address belonging to the emitting
> interface makes sense for me.
> 
> Best regards
> Michael

That's pretty much what I was looking for, in case I missed something on
SCTP RFCs. 

Thanks Michael!

> > On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
> >> Hi folks,
> >> 
> >> This is an attempt to better choose a src address for sctp packets as
> >> peers with rp_filter could be dropping our packets in some situations.
> >> With this patch, we try to respect and use a src address that belongs to
> >> the interface we are putting the packet out.
> >> 
> >> I have that feeling that there is be a better way to do this, but I
> >> just couldn't see it.
> >> 
> >> This patch has been tested with and without gateways between the peers
> >> and also just two peers connected via two subnets and results were
> >> pretty good.
> >> 
> >> One could think that this limits the address combination we can use, but
> >> such combinations probably are just bogus anyway. Like, if you have an
> >> host with address A1 and B1 and another with A2 and B2, you cannot
> >> expect that A can use A1 to reach B2 through subnet B, because the
> >> return path would be via the other link which, when this switch happens,
> >> we are thinking it's broken.
> >> 
> >> Thanks,
> >> Marcelo
> >> 
> >> ---8<---
> >> 
> >> In short, sctp is likely to incorrectly choose src address if socket is
> >> bound to secondary addresses. This patch fixes it by adding a new check
> >> that tries to anticipate if the src address would be expected by the
> >> next hop/peer on this interface by doing reverse routing.
> >> 
> >> Also took the shot to reduce the indentation level on this code.
> >> 
> >> Details:
> >> 
> >> Currently, sctp will do a routing attempt without specifying the src
> >> address and compare the returned value (preferred source) with the
> >> addresses that the socket is bound to. When using secondary addresses,
> >> this will not match.
> >> 
> >> Then it will try specifying each of the addresses that the socket is
> >> bound to and re-routing, checking if that address is valid as src for
> >> that dst. Thing is, this check alone is weak:
> >> 
> >> # ip r l
> >> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
> >> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
> >> 
> >> # ip a l
> >> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
> >>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >>    inet 127.0.0.1/8 scope host lo
> >>       valid_lft forever preferred_lft forever
> >>    inet6 ::1/128 scope host
> >>       valid_lft forever preferred_lft forever
> >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >>    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
> >>    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
> >>       valid_lft 2160sec preferred_lft 2160sec
> >>    inet 192.168.122.148/24 scope global secondary eth0
> >>       valid_lft forever preferred_lft forever
> >>    inet6 fe80::5054:ff:fe15:186a/64 scope link
> >>       valid_lft forever preferred_lft forever
> >> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >>    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
> >>    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
> >>       valid_lft 2162sec preferred_lft 2162sec
> >>    inet 192.168.100.148/24 scope global secondary eth1
> >>       valid_lft forever preferred_lft forever
> >>    inet6 fe80::5054:ff:feb3:9146/64 scope link
> >>       valid_lft forever preferred_lft forever
> >> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >>    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
> >>    inet6 fe80::5054:ff:fe05:47ee/64 scope link
> >>       valid_lft forever preferred_lft forever
> >> 
> >> # ip r g 192.168.100.193 from 192.168.122.148
> >> 192.168.100.193 from 192.168.122.148 dev eth1
> >>    cache
> >> 
> >> Even if you specify an interface:
> >> 
> >> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
> >> 192.168.100.193 from 192.168.122.148 dev eth1
> >>    cache
> >> 
> >> Although this would be valid, peers using rp_filter will drop such
> >> packets as their src doesn't match the routes for that interface.
> >> 
> >> So we fix this by adding an extra check, we try to do the reverse
> >> routing and check if the interface used would be the same. If not, we
> >> skip such address. If yes, we use it.
> >> 
> >> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >> ---
> >> net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
> >> 1 file changed, 41 insertions(+), 14 deletions(-)
> >> 
> >> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> >> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
> >> --- a/net/sctp/protocol.c
> >> +++ b/net/sctp/protocol.c
> >> @@ -53,6 +53,7 @@
> >> #include <net/net_namespace.h>
> >> #include <net/protocol.h>
> >> #include <net/ip.h>
> >> +#include <net/ip_fib.h>
> >> #include <net/ipv6.h>
> >> #include <net/route.h>
> >> #include <net/sctp/sctp.h>
> >> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
> >> 	 */
> >> 	rcu_read_lock();
> >> 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> >> +		struct flowi4 in;
> >> +		struct fib_result res;
> >> +
> >> 		if (!laddr->valid)
> >> 			continue;
> >> -		if ((laddr->state = SCTP_ADDR_SRC) &&
> >> -		    (AF_INET = laddr->a.sa.sa_family)) {
> >> -			fl4->fl4_sport = laddr->a.v4.sin_port;
> >> -			flowi4_update_output(fl4,
> >> -					     asoc->base.sk->sk_bound_dev_if,
> >> -					     RT_CONN_FLAGS(asoc->base.sk),
> >> -					     daddr->v4.sin_addr.s_addr,
> >> -					     laddr->a.v4.sin_addr.s_addr);
> >> -
> >> -			rt = ip_route_output_key(sock_net(sk), fl4);
> >> -			if (!IS_ERR(rt)) {
> >> -				dst = &rt->dst;
> >> -				goto out_unlock;
> >> -			}
> >> +		if (laddr->state != SCTP_ADDR_SRC ||
> >> +		    AF_INET != laddr->a.sa.sa_family)
> >> +			continue;
> >> +
> >> +		fl4->fl4_sport = laddr->a.v4.sin_port;
> >> +		flowi4_update_output(fl4,
> >> +				     asoc->base.sk->sk_bound_dev_if,
> >> +				     RT_CONN_FLAGS(asoc->base.sk),
> >> +				     daddr->v4.sin_addr.s_addr,
> >> +				     laddr->a.v4.sin_addr.s_addr);
> >> +
> >> +		rt = ip_route_output_key(sock_net(sk), fl4);
> >> +		if (IS_ERR(rt))
> >> +			continue;
> >> +
> >> +		dst = &rt->dst;
> >> +
> >> +		/* Double check if this is really expected. We simulate
> >> +		 * rp_filter by swapping src and dst.  If interfaces are
> >> +		 * different, means that this src wouldn't be expected
> >> +		 * by the other host on this interface.
> >> +		 */
> >> +		memcpy(&in, fl4, sizeof(in));
> >> +		in.daddr = fl4->saddr;
> >> +		in.saddr = fl4->daddr;
> >> +		in.flowi4_iif = fl4->flowi4_oif;
> >> +		in.flowi4_oif = 0;
> >> +
> >> +		if (fib_lookup(sock_net(sk), &in, &res) ||
> >> +		    res.type != RTN_LOCAL ||
> >> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
> >> +			/* Failed, so this was a false hit */
> >> +			dst_release(dst);
> >> +			dst = NULL;
> >> +			continue;
> >> 		}
> >> +
> >> +		break;
> >> 	}
> >> 
> >> out_unlock:
> >> -- 
> >> 2.4.1
> >> 
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> 
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-10 11:53       ` Marcelo Ricardo Leitner
@ 2015-07-10 15:35         ` Vlad Yasevich
  -1 siblings, 0 replies; 22+ messages in thread
From: Vlad Yasevich @ 2015-07-10 15:35 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner, Michael Tuexen; +Cc: netdev, linux-sctp, Neil Horman

On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
> On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
>>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>>>
>>> Cc'ing Michael too.
>> I'm not familiar with the Linux kernel code, so I can't comment on it.
>> But making sure to use a source address belonging to the emitting
>> interface makes sense for me.
>>
>> Best regards
>> Michael
> 
> That's pretty much what I was looking for, in case I missed something on
> SCTP RFCs. 
> 

Well, the RFCs do not really specify what the source address should be, and there
have been numerous times where I've seen weak host model in use on the wire
even with a BSD peer.

This also puts a very big nail through many suggestions we've had over the years
to allow source based path multihoming in addition to destination based multihoming
we currently support.

It might be a good idea to make rp-filter like behavior best effort, and have
the old behavior as fallback.  I am still trying to think up different scenarios
where rp-filter behavior will cause things to fail prematurely...

-vlad

> Thanks Michael!
> 
>>> On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
>>>> Hi folks,
>>>>
>>>> This is an attempt to better choose a src address for sctp packets as
>>>> peers with rp_filter could be dropping our packets in some situations.
>>>> With this patch, we try to respect and use a src address that belongs to
>>>> the interface we are putting the packet out.
>>>>
>>>> I have that feeling that there is be a better way to do this, but I
>>>> just couldn't see it.
>>>>
>>>> This patch has been tested with and without gateways between the peers
>>>> and also just two peers connected via two subnets and results were
>>>> pretty good.
>>>>
>>>> One could think that this limits the address combination we can use, but
>>>> such combinations probably are just bogus anyway. Like, if you have an
>>>> host with address A1 and B1 and another with A2 and B2, you cannot
>>>> expect that A can use A1 to reach B2 through subnet B, because the
>>>> return path would be via the other link which, when this switch happens,
>>>> we are thinking it's broken.
>>>>
>>>> Thanks,
>>>> Marcelo
>>>>
>>>> ---8<---
>>>>
>>>> In short, sctp is likely to incorrectly choose src address if socket is
>>>> bound to secondary addresses. This patch fixes it by adding a new check
>>>> that tries to anticipate if the src address would be expected by the
>>>> next hop/peer on this interface by doing reverse routing.
>>>>
>>>> Also took the shot to reduce the indentation level on this code.
>>>>
>>>> Details:
>>>>
>>>> Currently, sctp will do a routing attempt without specifying the src
>>>> address and compare the returned value (preferred source) with the
>>>> addresses that the socket is bound to. When using secondary addresses,
>>>> this will not match.
>>>>
>>>> Then it will try specifying each of the addresses that the socket is
>>>> bound to and re-routing, checking if that address is valid as src for
>>>> that dst. Thing is, this check alone is weak:
>>>>
>>>> # ip r l
>>>> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
>>>> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
>>>>
>>>> # ip a l
>>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
>>>>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>>    inet 127.0.0.1/8 scope host lo
>>>>       valid_lft forever preferred_lft forever
>>>>    inet6 ::1/128 scope host
>>>>       valid_lft forever preferred_lft forever
>>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>>>    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
>>>>    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
>>>>       valid_lft 2160sec preferred_lft 2160sec
>>>>    inet 192.168.122.148/24 scope global secondary eth0
>>>>       valid_lft forever preferred_lft forever
>>>>    inet6 fe80::5054:ff:fe15:186a/64 scope link
>>>>       valid_lft forever preferred_lft forever
>>>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>>>    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
>>>>    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
>>>>       valid_lft 2162sec preferred_lft 2162sec
>>>>    inet 192.168.100.148/24 scope global secondary eth1
>>>>       valid_lft forever preferred_lft forever
>>>>    inet6 fe80::5054:ff:feb3:9146/64 scope link
>>>>       valid_lft forever preferred_lft forever
>>>> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>>>    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
>>>>    inet6 fe80::5054:ff:fe05:47ee/64 scope link
>>>>       valid_lft forever preferred_lft forever
>>>>
>>>> # ip r g 192.168.100.193 from 192.168.122.148
>>>> 192.168.100.193 from 192.168.122.148 dev eth1
>>>>    cache
>>>>
>>>> Even if you specify an interface:
>>>>
>>>> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
>>>> 192.168.100.193 from 192.168.122.148 dev eth1
>>>>    cache
>>>>
>>>> Although this would be valid, peers using rp_filter will drop such
>>>> packets as their src doesn't match the routes for that interface.
>>>>
>>>> So we fix this by adding an extra check, we try to do the reverse
>>>> routing and check if the interface used would be the same. If not, we
>>>> skip such address. If yes, we use it.
>>>>
>>>> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>>>> ---
>>>> net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
>>>> 1 file changed, 41 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
>>>> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
>>>> --- a/net/sctp/protocol.c
>>>> +++ b/net/sctp/protocol.c
>>>> @@ -53,6 +53,7 @@
>>>> #include <net/net_namespace.h>
>>>> #include <net/protocol.h>
>>>> #include <net/ip.h>
>>>> +#include <net/ip_fib.h>
>>>> #include <net/ipv6.h>
>>>> #include <net/route.h>
>>>> #include <net/sctp/sctp.h>
>>>> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>>>> 	 */
>>>> 	rcu_read_lock();
>>>> 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
>>>> +		struct flowi4 in;
>>>> +		struct fib_result res;
>>>> +
>>>> 		if (!laddr->valid)
>>>> 			continue;
>>>> -		if ((laddr->state == SCTP_ADDR_SRC) &&
>>>> -		    (AF_INET == laddr->a.sa.sa_family)) {
>>>> -			fl4->fl4_sport = laddr->a.v4.sin_port;
>>>> -			flowi4_update_output(fl4,
>>>> -					     asoc->base.sk->sk_bound_dev_if,
>>>> -					     RT_CONN_FLAGS(asoc->base.sk),
>>>> -					     daddr->v4.sin_addr.s_addr,
>>>> -					     laddr->a.v4.sin_addr.s_addr);
>>>> -
>>>> -			rt = ip_route_output_key(sock_net(sk), fl4);
>>>> -			if (!IS_ERR(rt)) {
>>>> -				dst = &rt->dst;
>>>> -				goto out_unlock;
>>>> -			}
>>>> +		if (laddr->state != SCTP_ADDR_SRC ||
>>>> +		    AF_INET != laddr->a.sa.sa_family)
>>>> +			continue;
>>>> +
>>>> +		fl4->fl4_sport = laddr->a.v4.sin_port;
>>>> +		flowi4_update_output(fl4,
>>>> +				     asoc->base.sk->sk_bound_dev_if,
>>>> +				     RT_CONN_FLAGS(asoc->base.sk),
>>>> +				     daddr->v4.sin_addr.s_addr,
>>>> +				     laddr->a.v4.sin_addr.s_addr);
>>>> +
>>>> +		rt = ip_route_output_key(sock_net(sk), fl4);
>>>> +		if (IS_ERR(rt))
>>>> +			continue;
>>>> +
>>>> +		dst = &rt->dst;
>>>> +
>>>> +		/* Double check if this is really expected. We simulate
>>>> +		 * rp_filter by swapping src and dst.  If interfaces are
>>>> +		 * different, means that this src wouldn't be expected
>>>> +		 * by the other host on this interface.
>>>> +		 */
>>>> +		memcpy(&in, fl4, sizeof(in));
>>>> +		in.daddr = fl4->saddr;
>>>> +		in.saddr = fl4->daddr;
>>>> +		in.flowi4_iif = fl4->flowi4_oif;
>>>> +		in.flowi4_oif = 0;
>>>> +
>>>> +		if (fib_lookup(sock_net(sk), &in, &res) ||
>>>> +		    res.type != RTN_LOCAL ||
>>>> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
>>>> +			/* Failed, so this was a false hit */
>>>> +			dst_release(dst);
>>>> +			dst = NULL;
>>>> +			continue;
>>>> 		}
>>>> +
>>>> +		break;
>>>> 	}
>>>>
>>>> out_unlock:
>>>> -- 
>>>> 2.4.1
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-10 15:35         ` Vlad Yasevich
  0 siblings, 0 replies; 22+ messages in thread
From: Vlad Yasevich @ 2015-07-10 15:35 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner, Michael Tuexen; +Cc: netdev, linux-sctp, Neil Horman

On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
> On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
>>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>>>
>>> Cc'ing Michael too.
>> I'm not familiar with the Linux kernel code, so I can't comment on it.
>> But making sure to use a source address belonging to the emitting
>> interface makes sense for me.
>>
>> Best regards
>> Michael
> 
> That's pretty much what I was looking for, in case I missed something on
> SCTP RFCs. 
> 

Well, the RFCs do not really specify what the source address should be, and there
have been numerous times where I've seen weak host model in use on the wire
even with a BSD peer.

This also puts a very big nail through many suggestions we've had over the years
to allow source based path multihoming in addition to destination based multihoming
we currently support.

It might be a good idea to make rp-filter like behavior best effort, and have
the old behavior as fallback.  I am still trying to think up different scenarios
where rp-filter behavior will cause things to fail prematurely...

-vlad

> Thanks Michael!
> 
>>> On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
>>>> Hi folks,
>>>>
>>>> This is an attempt to better choose a src address for sctp packets as
>>>> peers with rp_filter could be dropping our packets in some situations.
>>>> With this patch, we try to respect and use a src address that belongs to
>>>> the interface we are putting the packet out.
>>>>
>>>> I have that feeling that there is be a better way to do this, but I
>>>> just couldn't see it.
>>>>
>>>> This patch has been tested with and without gateways between the peers
>>>> and also just two peers connected via two subnets and results were
>>>> pretty good.
>>>>
>>>> One could think that this limits the address combination we can use, but
>>>> such combinations probably are just bogus anyway. Like, if you have an
>>>> host with address A1 and B1 and another with A2 and B2, you cannot
>>>> expect that A can use A1 to reach B2 through subnet B, because the
>>>> return path would be via the other link which, when this switch happens,
>>>> we are thinking it's broken.
>>>>
>>>> Thanks,
>>>> Marcelo
>>>>
>>>> ---8<---
>>>>
>>>> In short, sctp is likely to incorrectly choose src address if socket is
>>>> bound to secondary addresses. This patch fixes it by adding a new check
>>>> that tries to anticipate if the src address would be expected by the
>>>> next hop/peer on this interface by doing reverse routing.
>>>>
>>>> Also took the shot to reduce the indentation level on this code.
>>>>
>>>> Details:
>>>>
>>>> Currently, sctp will do a routing attempt without specifying the src
>>>> address and compare the returned value (preferred source) with the
>>>> addresses that the socket is bound to. When using secondary addresses,
>>>> this will not match.
>>>>
>>>> Then it will try specifying each of the addresses that the socket is
>>>> bound to and re-routing, checking if that address is valid as src for
>>>> that dst. Thing is, this check alone is weak:
>>>>
>>>> # ip r l
>>>> 192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
>>>> 192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147
>>>>
>>>> # ip a l
>>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
>>>>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>>    inet 127.0.0.1/8 scope host lo
>>>>       valid_lft forever preferred_lft forever
>>>>    inet6 ::1/128 scope host
>>>>       valid_lft forever preferred_lft forever
>>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>>>    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
>>>>    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
>>>>       valid_lft 2160sec preferred_lft 2160sec
>>>>    inet 192.168.122.148/24 scope global secondary eth0
>>>>       valid_lft forever preferred_lft forever
>>>>    inet6 fe80::5054:ff:fe15:186a/64 scope link
>>>>       valid_lft forever preferred_lft forever
>>>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>>>    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
>>>>    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
>>>>       valid_lft 2162sec preferred_lft 2162sec
>>>>    inet 192.168.100.148/24 scope global secondary eth1
>>>>       valid_lft forever preferred_lft forever
>>>>    inet6 fe80::5054:ff:feb3:9146/64 scope link
>>>>       valid_lft forever preferred_lft forever
>>>> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
>>>>    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
>>>>    inet6 fe80::5054:ff:fe05:47ee/64 scope link
>>>>       valid_lft forever preferred_lft forever
>>>>
>>>> # ip r g 192.168.100.193 from 192.168.122.148
>>>> 192.168.100.193 from 192.168.122.148 dev eth1
>>>>    cache
>>>>
>>>> Even if you specify an interface:
>>>>
>>>> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
>>>> 192.168.100.193 from 192.168.122.148 dev eth1
>>>>    cache
>>>>
>>>> Although this would be valid, peers using rp_filter will drop such
>>>> packets as their src doesn't match the routes for that interface.
>>>>
>>>> So we fix this by adding an extra check, we try to do the reverse
>>>> routing and check if the interface used would be the same. If not, we
>>>> skip such address. If yes, we use it.
>>>>
>>>> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>>>> ---
>>>> net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
>>>> 1 file changed, 41 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
>>>> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
>>>> --- a/net/sctp/protocol.c
>>>> +++ b/net/sctp/protocol.c
>>>> @@ -53,6 +53,7 @@
>>>> #include <net/net_namespace.h>
>>>> #include <net/protocol.h>
>>>> #include <net/ip.h>
>>>> +#include <net/ip_fib.h>
>>>> #include <net/ipv6.h>
>>>> #include <net/route.h>
>>>> #include <net/sctp/sctp.h>
>>>> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>>>> 	 */
>>>> 	rcu_read_lock();
>>>> 	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
>>>> +		struct flowi4 in;
>>>> +		struct fib_result res;
>>>> +
>>>> 		if (!laddr->valid)
>>>> 			continue;
>>>> -		if ((laddr->state = SCTP_ADDR_SRC) &&
>>>> -		    (AF_INET = laddr->a.sa.sa_family)) {
>>>> -			fl4->fl4_sport = laddr->a.v4.sin_port;
>>>> -			flowi4_update_output(fl4,
>>>> -					     asoc->base.sk->sk_bound_dev_if,
>>>> -					     RT_CONN_FLAGS(asoc->base.sk),
>>>> -					     daddr->v4.sin_addr.s_addr,
>>>> -					     laddr->a.v4.sin_addr.s_addr);
>>>> -
>>>> -			rt = ip_route_output_key(sock_net(sk), fl4);
>>>> -			if (!IS_ERR(rt)) {
>>>> -				dst = &rt->dst;
>>>> -				goto out_unlock;
>>>> -			}
>>>> +		if (laddr->state != SCTP_ADDR_SRC ||
>>>> +		    AF_INET != laddr->a.sa.sa_family)
>>>> +			continue;
>>>> +
>>>> +		fl4->fl4_sport = laddr->a.v4.sin_port;
>>>> +		flowi4_update_output(fl4,
>>>> +				     asoc->base.sk->sk_bound_dev_if,
>>>> +				     RT_CONN_FLAGS(asoc->base.sk),
>>>> +				     daddr->v4.sin_addr.s_addr,
>>>> +				     laddr->a.v4.sin_addr.s_addr);
>>>> +
>>>> +		rt = ip_route_output_key(sock_net(sk), fl4);
>>>> +		if (IS_ERR(rt))
>>>> +			continue;
>>>> +
>>>> +		dst = &rt->dst;
>>>> +
>>>> +		/* Double check if this is really expected. We simulate
>>>> +		 * rp_filter by swapping src and dst.  If interfaces are
>>>> +		 * different, means that this src wouldn't be expected
>>>> +		 * by the other host on this interface.
>>>> +		 */
>>>> +		memcpy(&in, fl4, sizeof(in));
>>>> +		in.daddr = fl4->saddr;
>>>> +		in.saddr = fl4->daddr;
>>>> +		in.flowi4_iif = fl4->flowi4_oif;
>>>> +		in.flowi4_oif = 0;
>>>> +
>>>> +		if (fib_lookup(sock_net(sk), &in, &res) ||
>>>> +		    res.type != RTN_LOCAL ||
>>>> +		    fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
>>>> +			/* Failed, so this was a false hit */
>>>> +			dst_release(dst);
>>>> +			dst = NULL;
>>>> +			continue;
>>>> 		}
>>>> +
>>>> +		break;
>>>> 	}
>>>>
>>>> out_unlock:
>>>> -- 
>>>> 2.4.1
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-10 15:35         ` Vlad Yasevich
@ 2015-07-10 16:17           ` Marcelo Ricardo Leitner
  -1 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-10 16:17 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
> > On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
> >>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> >>>
> >>> Cc'ing Michael too.
> >> I'm not familiar with the Linux kernel code, so I can't comment on it.
> >> But making sure to use a source address belonging to the emitting
> >> interface makes sense for me.
> >>
> >> Best regards
> >> Michael
> > 
> > That's pretty much what I was looking for, in case I missed something on
> > SCTP RFCs. 
> > 
> 
> Well, the RFCs do not really specify what the source address should be, and there

That's why I was afraid of having missed something ;)

> have been numerous times where I've seen weak host model in use on the wire
> even with a BSD peer.
> 
> This also puts a very big nail through many suggestions we've had over the years
> to allow source based path multihoming in addition to destination based multihoming
> we currently support.
> 
> It might be a good idea to make rp-filter like behavior best effort, and have
> the old behavior as fallback.  I am still trying to think up different scenarios
> where rp-filter behavior will cause things to fail prematurely...

The old behavior is like "if we don't have a src yet and can't find a
preferred src for this dst, use the 1st bound address". We can add it
but as I said, I'm afraid it is just doing wrong and not worth. If such
randomly src addressed packet is meant to be routed, the router will
likely drop it as it is seen as a spoof. And if it reaches the peer, it
will probably come back through a different path.

I'm tempted to say that current usual use cases are handled by the first
check on this function, which returns the preferred/primary address for
the interface and checks against bound addresses. Whenever you reach the
second check, it just allows you to use that 1st bound address that is
checked. I mean, I can't see use cases that we would be breaking with
this change. 

But yeah, it impacts source based routing, and I'm not aware of previous
discussions on it. I'll try to dig some up but if possible, please share
some pointers on it.

  Marcelo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-10 16:17           ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-10 16:17 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
> > On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
> >>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> >>>
> >>> Cc'ing Michael too.
> >> I'm not familiar with the Linux kernel code, so I can't comment on it.
> >> But making sure to use a source address belonging to the emitting
> >> interface makes sense for me.
> >>
> >> Best regards
> >> Michael
> > 
> > That's pretty much what I was looking for, in case I missed something on
> > SCTP RFCs. 
> > 
> 
> Well, the RFCs do not really specify what the source address should be, and there

That's why I was afraid of having missed something ;)

> have been numerous times where I've seen weak host model in use on the wire
> even with a BSD peer.
> 
> This also puts a very big nail through many suggestions we've had over the years
> to allow source based path multihoming in addition to destination based multihoming
> we currently support.
> 
> It might be a good idea to make rp-filter like behavior best effort, and have
> the old behavior as fallback.  I am still trying to think up different scenarios
> where rp-filter behavior will cause things to fail prematurely...

The old behavior is like "if we don't have a src yet and can't find a
preferred src for this dst, use the 1st bound address". We can add it
but as I said, I'm afraid it is just doing wrong and not worth. If such
randomly src addressed packet is meant to be routed, the router will
likely drop it as it is seen as a spoof. And if it reaches the peer, it
will probably come back through a different path.

I'm tempted to say that current usual use cases are handled by the first
check on this function, which returns the preferred/primary address for
the interface and checks against bound addresses. Whenever you reach the
second check, it just allows you to use that 1st bound address that is
checked. I mean, I can't see use cases that we would be breaking with
this change. 

But yeah, it impacts source based routing, and I'm not aware of previous
discussions on it. I'll try to dig some up but if possible, please share
some pointers on it.

  Marcelo


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-10 16:17           ` Marcelo Ricardo Leitner
@ 2015-07-10 17:14             ` Vlad Yasevich
  -1 siblings, 0 replies; 22+ messages in thread
From: Vlad Yasevich @ 2015-07-10 17:14 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
>> On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
>>> On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
>>>>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>>>>>
>>>>> Cc'ing Michael too.
>>>> I'm not familiar with the Linux kernel code, so I can't comment on it.
>>>> But making sure to use a source address belonging to the emitting
>>>> interface makes sense for me.
>>>>
>>>> Best regards
>>>> Michael
>>>
>>> That's pretty much what I was looking for, in case I missed something on
>>> SCTP RFCs. 
>>>
>>
>> Well, the RFCs do not really specify what the source address should be, and there
> 
> That's why I was afraid of having missed something ;)
> 
>> have been numerous times where I've seen weak host model in use on the wire
>> even with a BSD peer.
>>
>> This also puts a very big nail through many suggestions we've had over the years
>> to allow source based path multihoming in addition to destination based multihoming
>> we currently support.
>>
>> It might be a good idea to make rp-filter like behavior best effort, and have
>> the old behavior as fallback.  I am still trying to think up different scenarios
>> where rp-filter behavior will cause things to fail prematurely...
> 
> The old behavior is like "if we don't have a src yet and can't find a
> preferred src for this dst, use the 1st bound address". We can add it
> but as I said, I'm afraid it is just doing wrong and not worth. If such
> randomly src addressed packet is meant to be routed, the router will
> likely drop it as it is seen as a spoof. And if it reaches the peer, it
> will probably come back through a different path.
> 
> I'm tempted to say that current usual use cases are handled by the first
> check on this function, which returns the preferred/primary address for
> the interface and checks against bound addresses. Whenever you reach the
> second check, it just allows you to use that 1st bound address that is
> checked. I mean, I can't see use cases that we would be breaking with
> this change. 

Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
days (when sctp was introduced).  I've made attempts over the years to
try to make it stricter, but that never amounted to anything that worked well.

> 
> But yeah, it impacts source based routing, and I'm not aware of previous
> discussions on it. I'll try to dig some up but if possible, please share
> some pointers on it.

It's been suggested a few times that we should support source based multihoming
particularly for the case where one peer has only 1 address.
We've always punted on this, but people still ask every now and then.

I do have a question about the code though.. Have you tried with mutlipath routing
enabled.  I see rp_filter checks have special code to handle that.  Seem like we
might get false negatives in sctp.

-vlad

> 
>   Marcelo
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-10 17:14             ` Vlad Yasevich
  0 siblings, 0 replies; 22+ messages in thread
From: Vlad Yasevich @ 2015-07-10 17:14 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
>> On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
>>> On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
>>>>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
>>>>>
>>>>> Cc'ing Michael too.
>>>> I'm not familiar with the Linux kernel code, so I can't comment on it.
>>>> But making sure to use a source address belonging to the emitting
>>>> interface makes sense for me.
>>>>
>>>> Best regards
>>>> Michael
>>>
>>> That's pretty much what I was looking for, in case I missed something on
>>> SCTP RFCs. 
>>>
>>
>> Well, the RFCs do not really specify what the source address should be, and there
> 
> That's why I was afraid of having missed something ;)
> 
>> have been numerous times where I've seen weak host model in use on the wire
>> even with a BSD peer.
>>
>> This also puts a very big nail through many suggestions we've had over the years
>> to allow source based path multihoming in addition to destination based multihoming
>> we currently support.
>>
>> It might be a good idea to make rp-filter like behavior best effort, and have
>> the old behavior as fallback.  I am still trying to think up different scenarios
>> where rp-filter behavior will cause things to fail prematurely...
> 
> The old behavior is like "if we don't have a src yet and can't find a
> preferred src for this dst, use the 1st bound address". We can add it
> but as I said, I'm afraid it is just doing wrong and not worth. If such
> randomly src addressed packet is meant to be routed, the router will
> likely drop it as it is seen as a spoof. And if it reaches the peer, it
> will probably come back through a different path.
> 
> I'm tempted to say that current usual use cases are handled by the first
> check on this function, which returns the preferred/primary address for
> the interface and checks against bound addresses. Whenever you reach the
> second check, it just allows you to use that 1st bound address that is
> checked. I mean, I can't see use cases that we would be breaking with
> this change. 

Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
days (when sctp was introduced).  I've made attempts over the years to
try to make it stricter, but that never amounted to anything that worked well.

> 
> But yeah, it impacts source based routing, and I'm not aware of previous
> discussions on it. I'll try to dig some up but if possible, please share
> some pointers on it.

It's been suggested a few times that we should support source based multihoming
particularly for the case where one peer has only 1 address.
We've always punted on this, but people still ask every now and then.

I do have a question about the code though.. Have you tried with mutlipath routing
enabled.  I see rp_filter checks have special code to handle that.  Seem like we
might get false negatives in sctp.

-vlad

> 
>   Marcelo
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-10 17:14             ` Vlad Yasevich
@ 2015-07-10 18:27               ` Marcelo Ricardo Leitner
  -1 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-10 18:27 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
> On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> > On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> >> On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
> >>> On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
> >>>>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> >>>>>
> >>>>> Cc'ing Michael too.
> >>>> I'm not familiar with the Linux kernel code, so I can't comment on it.
> >>>> But making sure to use a source address belonging to the emitting
> >>>> interface makes sense for me.
> >>>>
> >>>> Best regards
> >>>> Michael
> >>>
> >>> That's pretty much what I was looking for, in case I missed something on
> >>> SCTP RFCs. 
> >>>
> >>
> >> Well, the RFCs do not really specify what the source address should be, and there
> > 
> > That's why I was afraid of having missed something ;)
> > 
> >> have been numerous times where I've seen weak host model in use on the wire
> >> even with a BSD peer.
> >>
> >> This also puts a very big nail through many suggestions we've had over the years
> >> to allow source based path multihoming in addition to destination based multihoming
> >> we currently support.
> >>
> >> It might be a good idea to make rp-filter like behavior best effort, and have
> >> the old behavior as fallback.  I am still trying to think up different scenarios
> >> where rp-filter behavior will cause things to fail prematurely...
> > 
> > The old behavior is like "if we don't have a src yet and can't find a
> > preferred src for this dst, use the 1st bound address". We can add it
> > but as I said, I'm afraid it is just doing wrong and not worth. If such
> > randomly src addressed packet is meant to be routed, the router will
> > likely drop it as it is seen as a spoof. And if it reaches the peer, it
> > will probably come back through a different path.
> > 
> > I'm tempted to say that current usual use cases are handled by the first
> > check on this function, which returns the preferred/primary address for
> > the interface and checks against bound addresses. Whenever you reach the
> > second check, it just allows you to use that 1st bound address that is
> > checked. I mean, I can't see use cases that we would be breaking with
> > this change. 
> 
> Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
> days (when sctp was introduced).  I've made attempts over the years to
> try to make it stricter, but that never amounted to anything that worked well.
> 
> > 
> > But yeah, it impacts source based routing, and I'm not aware of previous
> > discussions on it. I'll try to dig some up but if possible, please share
> > some pointers on it.
> 
> It's been suggested a few times that we should support source based multihoming
> particularly for the case where one peer has only 1 address.
> We've always punted on this, but people still ask every now and then.

Ah okay, now I see it.
 
> I do have a question about the code though.. Have you tried with mutlipath routing
> enabled.  I see rp_filter checks have special code to handle that.  Seem like we
> might get false negatives in sctp.

In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
routes were simple ones, either 2 peers attaches to 2 local subnets, or
with a gateway in the middle (with 2 subnets on each side, but mapped
1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
(subnet1<->subnet4 or subnet3<->subnet2).

Note that this is not rp_filter strictly speaking, as it's mirrored.
rp_filter needs to calculate all possible output routes (actually until
it finds a valid one) for finding one that would match the one used for
incoming. 

This check already has an output path, and it's calculating if such
input would be acceptable. We can't really expect/check for other hits
because it invalidates the chosen output path.

Hmmm... but we could support multipath in the output selection, ie in
the outputs of ip_route_output_key(), probably in another patch then?

  Marcelo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-10 18:27               ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-10 18:27 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
> On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> > On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> >> On 07/10/2015 07:53 AM, Marcelo Ricardo Leitner wrote:
> >>> On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
> >>>>> On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@redhat.com> wrote:
> >>>>>
> >>>>> Cc'ing Michael too.
> >>>> I'm not familiar with the Linux kernel code, so I can't comment on it.
> >>>> But making sure to use a source address belonging to the emitting
> >>>> interface makes sense for me.
> >>>>
> >>>> Best regards
> >>>> Michael
> >>>
> >>> That's pretty much what I was looking for, in case I missed something on
> >>> SCTP RFCs. 
> >>>
> >>
> >> Well, the RFCs do not really specify what the source address should be, and there
> > 
> > That's why I was afraid of having missed something ;)
> > 
> >> have been numerous times where I've seen weak host model in use on the wire
> >> even with a BSD peer.
> >>
> >> This also puts a very big nail through many suggestions we've had over the years
> >> to allow source based path multihoming in addition to destination based multihoming
> >> we currently support.
> >>
> >> It might be a good idea to make rp-filter like behavior best effort, and have
> >> the old behavior as fallback.  I am still trying to think up different scenarios
> >> where rp-filter behavior will cause things to fail prematurely...
> > 
> > The old behavior is like "if we don't have a src yet and can't find a
> > preferred src for this dst, use the 1st bound address". We can add it
> > but as I said, I'm afraid it is just doing wrong and not worth. If such
> > randomly src addressed packet is meant to be routed, the router will
> > likely drop it as it is seen as a spoof. And if it reaches the peer, it
> > will probably come back through a different path.
> > 
> > I'm tempted to say that current usual use cases are handled by the first
> > check on this function, which returns the preferred/primary address for
> > the interface and checks against bound addresses. Whenever you reach the
> > second check, it just allows you to use that 1st bound address that is
> > checked. I mean, I can't see use cases that we would be breaking with
> > this change. 
> 
> Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
> days (when sctp was introduced).  I've made attempts over the years to
> try to make it stricter, but that never amounted to anything that worked well.
> 
> > 
> > But yeah, it impacts source based routing, and I'm not aware of previous
> > discussions on it. I'll try to dig some up but if possible, please share
> > some pointers on it.
> 
> It's been suggested a few times that we should support source based multihoming
> particularly for the case where one peer has only 1 address.
> We've always punted on this, but people still ask every now and then.

Ah okay, now I see it.
 
> I do have a question about the code though.. Have you tried with mutlipath routing
> enabled.  I see rp_filter checks have special code to handle that.  Seem like we
> might get false negatives in sctp.

In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
routes were simple ones, either 2 peers attaches to 2 local subnets, or
with a gateway in the middle (with 2 subnets on each side, but mapped
1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
(subnet1<->subnet4 or subnet3<->subnet2).

Note that this is not rp_filter strictly speaking, as it's mirrored.
rp_filter needs to calculate all possible output routes (actually until
it finds a valid one) for finding one that would match the one used for
incoming. 

This check already has an output path, and it's calculating if such
input would be acceptable. We can't really expect/check for other hits
because it invalidates the chosen output path.

Hmmm... but we could support multipath in the output selection, ie in
the outputs of ip_route_output_key(), probably in another patch then?

  Marcelo


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-10 18:27               ` Marcelo Ricardo Leitner
@ 2015-07-15 19:03                 ` Marcelo Ricardo Leitner
  -1 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-15 19:03 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On Fri, Jul 10, 2015 at 03:27:02PM -0300, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
> > On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> > > On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
...
> > >> have been numerous times where I've seen weak host model in use on the wire
> > >> even with a BSD peer.
> > >>
> > >> This also puts a very big nail through many suggestions we've had over the years
> > >> to allow source based path multihoming in addition to destination based multihoming
> > >> we currently support.
> > >>
> > >> It might be a good idea to make rp-filter like behavior best effort, and have
> > >> the old behavior as fallback.  I am still trying to think up different scenarios
> > >> where rp-filter behavior will cause things to fail prematurely...
> > > 
> > > The old behavior is like "if we don't have a src yet and can't find a
> > > preferred src for this dst, use the 1st bound address". We can add it
> > > but as I said, I'm afraid it is just doing wrong and not worth. If such
> > > randomly src addressed packet is meant to be routed, the router will
> > > likely drop it as it is seen as a spoof. And if it reaches the peer, it
> > > will probably come back through a different path.
> > > 
> > > I'm tempted to say that current usual use cases are handled by the first
> > > check on this function, which returns the preferred/primary address for
> > > the interface and checks against bound addresses. Whenever you reach the
> > > second check, it just allows you to use that 1st bound address that is
> > > checked. I mean, I can't see use cases that we would be breaking with
> > > this change. 
> > 
> > Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
> > days (when sctp was introduced).  I've made attempts over the years to
> > try to make it stricter, but that never amounted to anything that worked well.
> > 
> > > 
> > > But yeah, it impacts source based routing, and I'm not aware of previous
> > > discussions on it. I'll try to dig some up but if possible, please share
> > > some pointers on it.
> > 
> > It's been suggested a few times that we should support source based multihoming
> > particularly for the case where one peer has only 1 address.
> > We've always punted on this, but people still ask every now and then.
> 
> Ah okay, now I see it.
>  
> > I do have a question about the code though.. Have you tried with mutlipath routing
> > enabled.  I see rp_filter checks have special code to handle that.  Seem like we
> > might get false negatives in sctp.
> 
> In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
> routes were simple ones, either 2 peers attaches to 2 local subnets, or
> with a gateway in the middle (with 2 subnets on each side, but mapped
> 1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
> (subnet1<->subnet4 or subnet3<->subnet2).
> 
> Note that this is not rp_filter strictly speaking, as it's mirrored.
> rp_filter needs to calculate all possible output routes (actually until
> it finds a valid one) for finding one that would match the one used for
> incoming. 
> 
> This check already has an output path, and it's calculating if such
> input would be acceptable. We can't really expect/check for other hits
> because it invalidates the chosen output path.
> 
> Hmmm... but we could support multipath in the output selection, ie in
> the outputs of ip_route_output_key(), probably in another patch then?

Thinking further.. we could just compare it with the addresses assigned to the
interface instead of doing a whole new routing. Cheaper/faster, provides the
results I'm looking for and the consequences are easier to see.

Something like (not tested, just illustrating the idea):

--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -489,22 +489,33 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
        list_for_each_entry_rcu(laddr, &bp->address_list, list) {
                if (!laddr->valid)
                        continue;
                if ((laddr->state == SCTP_ADDR_SRC) &&
                    (AF_INET == laddr->a.sa.sa_family)) {
+                       struct net_device *odev;
+
                        fl4->fl4_sport = laddr->a.v4.sin_port;
                        flowi4_update_output(fl4,
                                             asoc->base.sk->sk_bound_dev_if,
                                             RT_CONN_FLAGS(asoc->base.sk),
                                             daddr->v4.sin_addr.s_addr,
                                             laddr->a.v4.sin_addr.s_addr);
 
                        rt = ip_route_output_key(sock_net(sk), fl4);
-                       if (!IS_ERR(rt)) {
-                               dst = &rt->dst;
-                               goto out_unlock;
-                       }
+                       if (IS_ERR(rt))
+                               continue;
+
+                       /* Ensure the src address belongs to the output
+                        * interface.
+                        */
+                       odev = __ip_dev_find(net, laddr->a.v4.sin_addr.s_addr,
+                                            false);
+                       if (odev->if_index != fl4->flowi4_oif)
+                               continue;
+
+                       dst = &rt->dst;
+                       goto out_unlock;
                }
        }
 
 out_unlock:
        rcu_read_unlock();


I like this better than my 1st attempt. What do you think?

I'll split the refactoring from this fix on v2, so it's easier to review.

  Marcelo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-15 19:03                 ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-15 19:03 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On Fri, Jul 10, 2015 at 03:27:02PM -0300, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
> > On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> > > On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
...
> > >> have been numerous times where I've seen weak host model in use on the wire
> > >> even with a BSD peer.
> > >>
> > >> This also puts a very big nail through many suggestions we've had over the years
> > >> to allow source based path multihoming in addition to destination based multihoming
> > >> we currently support.
> > >>
> > >> It might be a good idea to make rp-filter like behavior best effort, and have
> > >> the old behavior as fallback.  I am still trying to think up different scenarios
> > >> where rp-filter behavior will cause things to fail prematurely...
> > > 
> > > The old behavior is like "if we don't have a src yet and can't find a
> > > preferred src for this dst, use the 1st bound address". We can add it
> > > but as I said, I'm afraid it is just doing wrong and not worth. If such
> > > randomly src addressed packet is meant to be routed, the router will
> > > likely drop it as it is seen as a spoof. And if it reaches the peer, it
> > > will probably come back through a different path.
> > > 
> > > I'm tempted to say that current usual use cases are handled by the first
> > > check on this function, which returns the preferred/primary address for
> > > the interface and checks against bound addresses. Whenever you reach the
> > > second check, it just allows you to use that 1st bound address that is
> > > checked. I mean, I can't see use cases that we would be breaking with
> > > this change. 
> > 
> > Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
> > days (when sctp was introduced).  I've made attempts over the years to
> > try to make it stricter, but that never amounted to anything that worked well.
> > 
> > > 
> > > But yeah, it impacts source based routing, and I'm not aware of previous
> > > discussions on it. I'll try to dig some up but if possible, please share
> > > some pointers on it.
> > 
> > It's been suggested a few times that we should support source based multihoming
> > particularly for the case where one peer has only 1 address.
> > We've always punted on this, but people still ask every now and then.
> 
> Ah okay, now I see it.
>  
> > I do have a question about the code though.. Have you tried with mutlipath routing
> > enabled.  I see rp_filter checks have special code to handle that.  Seem like we
> > might get false negatives in sctp.
> 
> In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
> routes were simple ones, either 2 peers attaches to 2 local subnets, or
> with a gateway in the middle (with 2 subnets on each side, but mapped
> 1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
> (subnet1<->subnet4 or subnet3<->subnet2).
> 
> Note that this is not rp_filter strictly speaking, as it's mirrored.
> rp_filter needs to calculate all possible output routes (actually until
> it finds a valid one) for finding one that would match the one used for
> incoming. 
> 
> This check already has an output path, and it's calculating if such
> input would be acceptable. We can't really expect/check for other hits
> because it invalidates the chosen output path.
> 
> Hmmm... but we could support multipath in the output selection, ie in
> the outputs of ip_route_output_key(), probably in another patch then?

Thinking further.. we could just compare it with the addresses assigned to the
interface instead of doing a whole new routing. Cheaper/faster, provides the
results I'm looking for and the consequences are easier to see.

Something like (not tested, just illustrating the idea):

--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -489,22 +489,33 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
        list_for_each_entry_rcu(laddr, &bp->address_list, list) {
                if (!laddr->valid)
                        continue;
                if ((laddr->state = SCTP_ADDR_SRC) &&
                    (AF_INET = laddr->a.sa.sa_family)) {
+                       struct net_device *odev;
+
                        fl4->fl4_sport = laddr->a.v4.sin_port;
                        flowi4_update_output(fl4,
                                             asoc->base.sk->sk_bound_dev_if,
                                             RT_CONN_FLAGS(asoc->base.sk),
                                             daddr->v4.sin_addr.s_addr,
                                             laddr->a.v4.sin_addr.s_addr);
 
                        rt = ip_route_output_key(sock_net(sk), fl4);
-                       if (!IS_ERR(rt)) {
-                               dst = &rt->dst;
-                               goto out_unlock;
-                       }
+                       if (IS_ERR(rt))
+                               continue;
+
+                       /* Ensure the src address belongs to the output
+                        * interface.
+                        */
+                       odev = __ip_dev_find(net, laddr->a.v4.sin_addr.s_addr,
+                                            false);
+                       if (odev->if_index != fl4->flowi4_oif)
+                               continue;
+
+                       dst = &rt->dst;
+                       goto out_unlock;
                }
        }
 
 out_unlock:
        rcu_read_unlock();


I like this better than my 1st attempt. What do you think?

I'll split the refactoring from this fix on v2, so it's easier to review.

  Marcelo


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-15 19:03                 ` Marcelo Ricardo Leitner
@ 2015-07-16 13:09                   ` Vlad Yasevich
  -1 siblings, 0 replies; 22+ messages in thread
From: Vlad Yasevich @ 2015-07-16 13:09 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On 07/15/2015 03:03 PM, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 10, 2015 at 03:27:02PM -0300, Marcelo Ricardo Leitner wrote:
>> On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
>>> On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
>>>> On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> ...
>>>>> have been numerous times where I've seen weak host model in use on the wire
>>>>> even with a BSD peer.
>>>>>
>>>>> This also puts a very big nail through many suggestions we've had over the years
>>>>> to allow source based path multihoming in addition to destination based multihoming
>>>>> we currently support.
>>>>>
>>>>> It might be a good idea to make rp-filter like behavior best effort, and have
>>>>> the old behavior as fallback.  I am still trying to think up different scenarios
>>>>> where rp-filter behavior will cause things to fail prematurely...
>>>>
>>>> The old behavior is like "if we don't have a src yet and can't find a
>>>> preferred src for this dst, use the 1st bound address". We can add it
>>>> but as I said, I'm afraid it is just doing wrong and not worth. If such
>>>> randomly src addressed packet is meant to be routed, the router will
>>>> likely drop it as it is seen as a spoof. And if it reaches the peer, it
>>>> will probably come back through a different path.
>>>>
>>>> I'm tempted to say that current usual use cases are handled by the first
>>>> check on this function, which returns the preferred/primary address for
>>>> the interface and checks against bound addresses. Whenever you reach the
>>>> second check, it just allows you to use that 1st bound address that is
>>>> checked. I mean, I can't see use cases that we would be breaking with
>>>> this change. 
>>>
>>> Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
>>> days (when sctp was introduced).  I've made attempts over the years to
>>> try to make it stricter, but that never amounted to anything that worked well.
>>>
>>>>
>>>> But yeah, it impacts source based routing, and I'm not aware of previous
>>>> discussions on it. I'll try to dig some up but if possible, please share
>>>> some pointers on it.
>>>
>>> It's been suggested a few times that we should support source based multihoming
>>> particularly for the case where one peer has only 1 address.
>>> We've always punted on this, but people still ask every now and then.
>>
>> Ah okay, now I see it.
>>  
>>> I do have a question about the code though.. Have you tried with mutlipath routing
>>> enabled.  I see rp_filter checks have special code to handle that.  Seem like we
>>> might get false negatives in sctp.
>>
>> In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
>> routes were simple ones, either 2 peers attaches to 2 local subnets, or
>> with a gateway in the middle (with 2 subnets on each side, but mapped
>> 1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
>> (subnet1<->subnet4 or subnet3<->subnet2).
>>
>> Note that this is not rp_filter strictly speaking, as it's mirrored.
>> rp_filter needs to calculate all possible output routes (actually until
>> it finds a valid one) for finding one that would match the one used for
>> incoming. 
>>
>> This check already has an output path, and it's calculating if such
>> input would be acceptable. We can't really expect/check for other hits
>> because it invalidates the chosen output path.
>>
>> Hmmm... but we could support multipath in the output selection, ie in
>> the outputs of ip_route_output_key(), probably in another patch then?
> 
> Thinking further.. we could just compare it with the addresses assigned to the
> interface instead of doing a whole new routing. Cheaper/faster, provides the
> results I'm looking for and the consequences are easier to see.
> 
> Something like (not tested, just illustrating the idea):
> 
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -489,22 +489,33 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>         list_for_each_entry_rcu(laddr, &bp->address_list, list) {
>                 if (!laddr->valid)
>                         continue;
>                 if ((laddr->state == SCTP_ADDR_SRC) &&
>                     (AF_INET == laddr->a.sa.sa_family)) {
> +                       struct net_device *odev;
> +
>                         fl4->fl4_sport = laddr->a.v4.sin_port;
>                         flowi4_update_output(fl4,
>                                              asoc->base.sk->sk_bound_dev_if,
>                                              RT_CONN_FLAGS(asoc->base.sk),
>                                              daddr->v4.sin_addr.s_addr,
>                                              laddr->a.v4.sin_addr.s_addr);
>  
>                         rt = ip_route_output_key(sock_net(sk), fl4);
> -                       if (!IS_ERR(rt)) {
> -                               dst = &rt->dst;
> -                               goto out_unlock;
> -                       }
> +                       if (IS_ERR(rt))
> +                               continue;
> +
> +                       /* Ensure the src address belongs to the output
> +                        * interface.
> +                        */
> +                       odev = __ip_dev_find(net, laddr->a.v4.sin_addr.s_addr,
> +                                            false);
> +                       if (odev->if_index != fl4->flowi4_oif)
> +                               continue;
> +
> +                       dst = &rt->dst;
> +                       goto out_unlock;
>                 }
>         }
>  
>  out_unlock:
>         rcu_read_unlock();
> 
> 
> I like this better than my 1st attempt. What do you think?

Looks better.  Have to drop the ref on the dev since __ip_dev_find takes one.

> 
> I'll split the refactoring from this fix on v2, so it's easier to review.
> 

Sounds good.

-vlad

>   Marcelo
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-16 13:09                   ` Vlad Yasevich
  0 siblings, 0 replies; 22+ messages in thread
From: Vlad Yasevich @ 2015-07-16 13:09 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: Michael Tuexen, netdev, linux-sctp, Neil Horman

On 07/15/2015 03:03 PM, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 10, 2015 at 03:27:02PM -0300, Marcelo Ricardo Leitner wrote:
>> On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
>>> On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
>>>> On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> ...
>>>>> have been numerous times where I've seen weak host model in use on the wire
>>>>> even with a BSD peer.
>>>>>
>>>>> This also puts a very big nail through many suggestions we've had over the years
>>>>> to allow source based path multihoming in addition to destination based multihoming
>>>>> we currently support.
>>>>>
>>>>> It might be a good idea to make rp-filter like behavior best effort, and have
>>>>> the old behavior as fallback.  I am still trying to think up different scenarios
>>>>> where rp-filter behavior will cause things to fail prematurely...
>>>>
>>>> The old behavior is like "if we don't have a src yet and can't find a
>>>> preferred src for this dst, use the 1st bound address". We can add it
>>>> but as I said, I'm afraid it is just doing wrong and not worth. If such
>>>> randomly src addressed packet is meant to be routed, the router will
>>>> likely drop it as it is seen as a spoof. And if it reaches the peer, it
>>>> will probably come back through a different path.
>>>>
>>>> I'm tempted to say that current usual use cases are handled by the first
>>>> check on this function, which returns the preferred/primary address for
>>>> the interface and checks against bound addresses. Whenever you reach the
>>>> second check, it just allows you to use that 1st bound address that is
>>>> checked. I mean, I can't see use cases that we would be breaking with
>>>> this change. 
>>>
>>> Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
>>> days (when sctp was introduced).  I've made attempts over the years to
>>> try to make it stricter, but that never amounted to anything that worked well.
>>>
>>>>
>>>> But yeah, it impacts source based routing, and I'm not aware of previous
>>>> discussions on it. I'll try to dig some up but if possible, please share
>>>> some pointers on it.
>>>
>>> It's been suggested a few times that we should support source based multihoming
>>> particularly for the case where one peer has only 1 address.
>>> We've always punted on this, but people still ask every now and then.
>>
>> Ah okay, now I see it.
>>  
>>> I do have a question about the code though.. Have you tried with mutlipath routing
>>> enabled.  I see rp_filter checks have special code to handle that.  Seem like we
>>> might get false negatives in sctp.
>>
>> In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
>> routes were simple ones, either 2 peers attaches to 2 local subnets, or
>> with a gateway in the middle (with 2 subnets on each side, but mapped
>> 1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
>> (subnet1<->subnet4 or subnet3<->subnet2).
>>
>> Note that this is not rp_filter strictly speaking, as it's mirrored.
>> rp_filter needs to calculate all possible output routes (actually until
>> it finds a valid one) for finding one that would match the one used for
>> incoming. 
>>
>> This check already has an output path, and it's calculating if such
>> input would be acceptable. We can't really expect/check for other hits
>> because it invalidates the chosen output path.
>>
>> Hmmm... but we could support multipath in the output selection, ie in
>> the outputs of ip_route_output_key(), probably in another patch then?
> 
> Thinking further.. we could just compare it with the addresses assigned to the
> interface instead of doing a whole new routing. Cheaper/faster, provides the
> results I'm looking for and the consequences are easier to see.
> 
> Something like (not tested, just illustrating the idea):
> 
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -489,22 +489,33 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>         list_for_each_entry_rcu(laddr, &bp->address_list, list) {
>                 if (!laddr->valid)
>                         continue;
>                 if ((laddr->state = SCTP_ADDR_SRC) &&
>                     (AF_INET = laddr->a.sa.sa_family)) {
> +                       struct net_device *odev;
> +
>                         fl4->fl4_sport = laddr->a.v4.sin_port;
>                         flowi4_update_output(fl4,
>                                              asoc->base.sk->sk_bound_dev_if,
>                                              RT_CONN_FLAGS(asoc->base.sk),
>                                              daddr->v4.sin_addr.s_addr,
>                                              laddr->a.v4.sin_addr.s_addr);
>  
>                         rt = ip_route_output_key(sock_net(sk), fl4);
> -                       if (!IS_ERR(rt)) {
> -                               dst = &rt->dst;
> -                               goto out_unlock;
> -                       }
> +                       if (IS_ERR(rt))
> +                               continue;
> +
> +                       /* Ensure the src address belongs to the output
> +                        * interface.
> +                        */
> +                       odev = __ip_dev_find(net, laddr->a.v4.sin_addr.s_addr,
> +                                            false);
> +                       if (odev->if_index != fl4->flowi4_oif)
> +                               continue;
> +
> +                       dst = &rt->dst;
> +                       goto out_unlock;
>                 }
>         }
>  
>  out_unlock:
>         rcu_read_unlock();
> 
> 
> I like this better than my 1st attempt. What do you think?

Looks better.  Have to drop the ref on the dev since __ip_dev_find takes one.

> 
> I'll split the refactoring from this fix on v2, so it's easier to review.
> 

Sounds good.

-vlad

>   Marcelo
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
  2015-07-16 13:09                   ` Vlad Yasevich
@ 2015-07-16 14:06                     ` Marcelo Ricardo Leitner
  -1 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-16 14:06 UTC (permalink / raw)
  To: Vlad Yasevich
  Cc: Marcelo Ricardo Leitner, Michael Tuexen, netdev, linux-sctp, Neil Horman

On Thu, Jul 16, 2015 at 09:09:57AM -0400, Vlad Yasevich wrote:
> On 07/15/2015 03:03 PM, Marcelo Ricardo Leitner wrote:
> > On Fri, Jul 10, 2015 at 03:27:02PM -0300, Marcelo Ricardo Leitner wrote:
> >> On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
> >>> On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> >>>> On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> > ...
> >>>>> have been numerous times where I've seen weak host model in use on the wire
> >>>>> even with a BSD peer.
> >>>>>
> >>>>> This also puts a very big nail through many suggestions we've had over the years
> >>>>> to allow source based path multihoming in addition to destination based multihoming
> >>>>> we currently support.
> >>>>>
> >>>>> It might be a good idea to make rp-filter like behavior best effort, and have
> >>>>> the old behavior as fallback.  I am still trying to think up different scenarios
> >>>>> where rp-filter behavior will cause things to fail prematurely...
> >>>>
> >>>> The old behavior is like "if we don't have a src yet and can't find a
> >>>> preferred src for this dst, use the 1st bound address". We can add it
> >>>> but as I said, I'm afraid it is just doing wrong and not worth. If such
> >>>> randomly src addressed packet is meant to be routed, the router will
> >>>> likely drop it as it is seen as a spoof. And if it reaches the peer, it
> >>>> will probably come back through a different path.
> >>>>
> >>>> I'm tempted to say that current usual use cases are handled by the first
> >>>> check on this function, which returns the preferred/primary address for
> >>>> the interface and checks against bound addresses. Whenever you reach the
> >>>> second check, it just allows you to use that 1st bound address that is
> >>>> checked. I mean, I can't see use cases that we would be breaking with
> >>>> this change. 
> >>>
> >>> Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
> >>> days (when sctp was introduced).  I've made attempts over the years to
> >>> try to make it stricter, but that never amounted to anything that worked well.
> >>>
> >>>>
> >>>> But yeah, it impacts source based routing, and I'm not aware of previous
> >>>> discussions on it. I'll try to dig some up but if possible, please share
> >>>> some pointers on it.
> >>>
> >>> It's been suggested a few times that we should support source based multihoming
> >>> particularly for the case where one peer has only 1 address.
> >>> We've always punted on this, but people still ask every now and then.
> >>
> >> Ah okay, now I see it.
> >>  
> >>> I do have a question about the code though.. Have you tried with mutlipath routing
> >>> enabled.  I see rp_filter checks have special code to handle that.  Seem like we
> >>> might get false negatives in sctp.
> >>
> >> In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
> >> routes were simple ones, either 2 peers attaches to 2 local subnets, or
> >> with a gateway in the middle (with 2 subnets on each side, but mapped
> >> 1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
> >> (subnet1<->subnet4 or subnet3<->subnet2).
> >>
> >> Note that this is not rp_filter strictly speaking, as it's mirrored.
> >> rp_filter needs to calculate all possible output routes (actually until
> >> it finds a valid one) for finding one that would match the one used for
> >> incoming. 
> >>
> >> This check already has an output path, and it's calculating if such
> >> input would be acceptable. We can't really expect/check for other hits
> >> because it invalidates the chosen output path.
> >>
> >> Hmmm... but we could support multipath in the output selection, ie in
> >> the outputs of ip_route_output_key(), probably in another patch then?
> > 
> > Thinking further.. we could just compare it with the addresses assigned to the
> > interface instead of doing a whole new routing. Cheaper/faster, provides the
> > results I'm looking for and the consequences are easier to see.
> > 
> > Something like (not tested, just illustrating the idea):
> > 
> > --- a/net/sctp/protocol.c
> > +++ b/net/sctp/protocol.c
> > @@ -489,22 +489,33 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
> >         list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> >                 if (!laddr->valid)
> >                         continue;
> >                 if ((laddr->state == SCTP_ADDR_SRC) &&
> >                     (AF_INET == laddr->a.sa.sa_family)) {
> > +                       struct net_device *odev;
> > +
> >                         fl4->fl4_sport = laddr->a.v4.sin_port;
> >                         flowi4_update_output(fl4,
> >                                              asoc->base.sk->sk_bound_dev_if,
> >                                              RT_CONN_FLAGS(asoc->base.sk),
> >                                              daddr->v4.sin_addr.s_addr,
> >                                              laddr->a.v4.sin_addr.s_addr);
> >  
> >                         rt = ip_route_output_key(sock_net(sk), fl4);
> > -                       if (!IS_ERR(rt)) {
> > -                               dst = &rt->dst;
> > -                               goto out_unlock;
> > -                       }
> > +                       if (IS_ERR(rt))
> > +                               continue;
> > +
> > +                       /* Ensure the src address belongs to the output
> > +                        * interface.
> > +                        */
> > +                       odev = __ip_dev_find(net, laddr->a.v4.sin_addr.s_addr,
> > +                                            false);
> > +                       if (odev->if_index != fl4->flowi4_oif)
> > +                               continue;
> > +
> > +                       dst = &rt->dst;
> > +                       goto out_unlock;
> >                 }
> >         }
> >  
> >  out_unlock:
> >         rcu_read_unlock();
> > 
> > 
> > I like this better than my 1st attempt. What do you think?
> 
> Looks better.  Have to drop the ref on the dev since __ip_dev_find takes one.

Cool. I'll go that way then.

Regarding the ref, not really, because above code is under
rcu_read_lock() already and thne I passed false on its 3rd argument,
avoiding that ref.

Thanks,
Marcelo

> > 
> > I'll split the refactoring from this fix on v2, so it's easier to review.
> > 
> 
> Sounds good.
> 
> -vlad
> 
> >   Marcelo
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses
@ 2015-07-16 14:06                     ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 22+ messages in thread
From: Marcelo Ricardo Leitner @ 2015-07-16 14:06 UTC (permalink / raw)
  To: Vlad Yasevich
  Cc: Marcelo Ricardo Leitner, Michael Tuexen, netdev, linux-sctp, Neil Horman

On Thu, Jul 16, 2015 at 09:09:57AM -0400, Vlad Yasevich wrote:
> On 07/15/2015 03:03 PM, Marcelo Ricardo Leitner wrote:
> > On Fri, Jul 10, 2015 at 03:27:02PM -0300, Marcelo Ricardo Leitner wrote:
> >> On Fri, Jul 10, 2015 at 01:14:21PM -0400, Vlad Yasevich wrote:
> >>> On 07/10/2015 12:17 PM, Marcelo Ricardo Leitner wrote:
> >>>> On Fri, Jul 10, 2015 at 11:35:28AM -0400, Vlad Yasevich wrote:
> > ...
> >>>>> have been numerous times where I've seen weak host model in use on the wire
> >>>>> even with a BSD peer.
> >>>>>
> >>>>> This also puts a very big nail through many suggestions we've had over the years
> >>>>> to allow source based path multihoming in addition to destination based multihoming
> >>>>> we currently support.
> >>>>>
> >>>>> It might be a good idea to make rp-filter like behavior best effort, and have
> >>>>> the old behavior as fallback.  I am still trying to think up different scenarios
> >>>>> where rp-filter behavior will cause things to fail prematurely...
> >>>>
> >>>> The old behavior is like "if we don't have a src yet and can't find a
> >>>> preferred src for this dst, use the 1st bound address". We can add it
> >>>> but as I said, I'm afraid it is just doing wrong and not worth. If such
> >>>> randomly src addressed packet is meant to be routed, the router will
> >>>> likely drop it as it is seen as a spoof. And if it reaches the peer, it
> >>>> will probably come back through a different path.
> >>>>
> >>>> I'm tempted to say that current usual use cases are handled by the first
> >>>> check on this function, which returns the preferred/primary address for
> >>>> the interface and checks against bound addresses. Whenever you reach the
> >>>> second check, it just allows you to use that 1st bound address that is
> >>>> checked. I mean, I can't see use cases that we would be breaking with
> >>>> this change. 
> >>>
> >>> Yes,  the secondary check didn't amount to much, but we've kept it since 2.5
> >>> days (when sctp was introduced).  I've made attempts over the years to
> >>> try to make it stricter, but that never amounted to anything that worked well.
> >>>
> >>>>
> >>>> But yeah, it impacts source based routing, and I'm not aware of previous
> >>>> discussions on it. I'll try to dig some up but if possible, please share
> >>>> some pointers on it.
> >>>
> >>> It's been suggested a few times that we should support source based multihoming
> >>> particularly for the case where one peer has only 1 address.
> >>> We've always punted on this, but people still ask every now and then.
> >>
> >> Ah okay, now I see it.
> >>  
> >>> I do have a question about the code though.. Have you tried with mutlipath routing
> >>> enabled.  I see rp_filter checks have special code to handle that.  Seem like we
> >>> might get false negatives in sctp.
> >>
> >> In the sense of CONFIG_IP_ROUTE_MULTIPATH=y, yes, but just that. My
> >> routes were simple ones, either 2 peers attaches to 2 local subnets, or
> >> with a gateway in the middle (with 2 subnets on each side, but mapped
> >> 1-1, no crossing. Aka subnet1<->subnet2 and subnet3<->subnet4 while not
> >> (subnet1<->subnet4 or subnet3<->subnet2).
> >>
> >> Note that this is not rp_filter strictly speaking, as it's mirrored.
> >> rp_filter needs to calculate all possible output routes (actually until
> >> it finds a valid one) for finding one that would match the one used for
> >> incoming. 
> >>
> >> This check already has an output path, and it's calculating if such
> >> input would be acceptable. We can't really expect/check for other hits
> >> because it invalidates the chosen output path.
> >>
> >> Hmmm... but we could support multipath in the output selection, ie in
> >> the outputs of ip_route_output_key(), probably in another patch then?
> > 
> > Thinking further.. we could just compare it with the addresses assigned to the
> > interface instead of doing a whole new routing. Cheaper/faster, provides the
> > results I'm looking for and the consequences are easier to see.
> > 
> > Something like (not tested, just illustrating the idea):
> > 
> > --- a/net/sctp/protocol.c
> > +++ b/net/sctp/protocol.c
> > @@ -489,22 +489,33 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
> >         list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> >                 if (!laddr->valid)
> >                         continue;
> >                 if ((laddr->state = SCTP_ADDR_SRC) &&
> >                     (AF_INET = laddr->a.sa.sa_family)) {
> > +                       struct net_device *odev;
> > +
> >                         fl4->fl4_sport = laddr->a.v4.sin_port;
> >                         flowi4_update_output(fl4,
> >                                              asoc->base.sk->sk_bound_dev_if,
> >                                              RT_CONN_FLAGS(asoc->base.sk),
> >                                              daddr->v4.sin_addr.s_addr,
> >                                              laddr->a.v4.sin_addr.s_addr);
> >  
> >                         rt = ip_route_output_key(sock_net(sk), fl4);
> > -                       if (!IS_ERR(rt)) {
> > -                               dst = &rt->dst;
> > -                               goto out_unlock;
> > -                       }
> > +                       if (IS_ERR(rt))
> > +                               continue;
> > +
> > +                       /* Ensure the src address belongs to the output
> > +                        * interface.
> > +                        */
> > +                       odev = __ip_dev_find(net, laddr->a.v4.sin_addr.s_addr,
> > +                                            false);
> > +                       if (odev->if_index != fl4->flowi4_oif)
> > +                               continue;
> > +
> > +                       dst = &rt->dst;
> > +                       goto out_unlock;
> >                 }
> >         }
> >  
> >  out_unlock:
> >         rcu_read_unlock();
> > 
> > 
> > I like this better than my 1st attempt. What do you think?
> 
> Looks better.  Have to drop the ref on the dev since __ip_dev_find takes one.

Cool. I'll go that way then.

Regarding the ref, not really, because above code is under
rcu_read_lock() already and thne I passed false on its 3rd argument,
avoiding that ref.

Thanks,
Marcelo

> > 
> > I'll split the refactoring from this fix on v2, so it's easier to review.
> > 
> 
> Sounds good.
> 
> -vlad
> 
> >   Marcelo
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2015-07-16 14:06 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-07 17:42 [RFC PATCH net-next] sctp: fix src address selection if using secondary addresses Marcelo Ricardo Leitner
2015-07-07 17:42 ` Marcelo Ricardo Leitner
2015-07-09 16:54 ` Marcelo Ricardo Leitner
2015-07-09 16:54   ` Marcelo Ricardo Leitner
2015-07-09 19:55   ` Michael Tuexen
2015-07-09 19:55     ` Michael Tuexen
2015-07-10 11:53     ` Marcelo Ricardo Leitner
2015-07-10 11:53       ` Marcelo Ricardo Leitner
2015-07-10 15:35       ` Vlad Yasevich
2015-07-10 15:35         ` Vlad Yasevich
2015-07-10 16:17         ` Marcelo Ricardo Leitner
2015-07-10 16:17           ` Marcelo Ricardo Leitner
2015-07-10 17:14           ` Vlad Yasevich
2015-07-10 17:14             ` Vlad Yasevich
2015-07-10 18:27             ` Marcelo Ricardo Leitner
2015-07-10 18:27               ` Marcelo Ricardo Leitner
2015-07-15 19:03               ` Marcelo Ricardo Leitner
2015-07-15 19:03                 ` Marcelo Ricardo Leitner
2015-07-16 13:09                 ` Vlad Yasevich
2015-07-16 13:09                   ` Vlad Yasevich
2015-07-16 14:06                   ` Marcelo Ricardo Leitner
2015-07-16 14:06                     ` Marcelo Ricardo Leitner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.