* SCTP Multihoming Heartbeat ACK Behavior
@ 2014-06-28 11:04 Winston V. Tizon
  2014-06-30  8:24 ` Daniel Borkmann
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Winston V. Tizon @ 2014-06-28 11:04 UTC (permalink / raw)
  To: linux-sctp

Hello everyone!

We are having some issues with SCTP multihoming
and we would like to ask your opinion on this matter.
We have two RHEL 6.4 machines (2.6.32-358.el6.x86_64,
lksctp-tools-1.0.10-5, 64 bit) connected by two L2 switches
and an L3 switch (please see "Environment Setup" below).
When we establish an SCTP connection (using multihoming) between
the two machines, the following behavior occurs:

      CLIENT           L2 and L3         SERVER      
Secondary Primary           |      Primary Secondary 
    |        |              |          |       |     
    |        |              |          |       |     
    |        |INIT-INIT_ACK |          |       |     
    |        |<-------------|--------->|       |     
    |        |COOKIE_ECHO-COOKIE_ACK   |       |     
    |        |              |          |       |     
    |        |<-------------|--------->|       |     
    |        |HB/HB_ACK     |          |       |     
    |        |              |          |       |     
    |<-------|--------------|----------|       |     
    |        |              |       HB |       |     
    |        |--------------|--------->|       |     
    |        |HB_ACK        |          |       |     
    |        |        :     |          |       |     
    |        |        :     |          |       |     

The INIT/INIT_ACK handshake occurred on the primary
path of both machines, which is expected. When a
Primary path sent a HEARTBEAT to the other Primary,
the HEARTBEAT_ACK was returned to the sender. But
when a Primary path sent a HEARTBEAT to a
Secondary path, the HEARTBEAT_ACK chunk was sent
by the Primary path. We expected that the
HEARTBEAT_ACK would come from the Secondary.

[Questions]
1. Is this normal behavior for SCTP multihoming?
2. Does the SCTP kernel module have something to do with this behavior?
3. Is there a way to force the Client to use its Secondary path when
   sending the HEARTBEAT_ACK chunk to the Server's primary?

Environment Setup:

    CLIENT                        SERVER     
 172.168.39.91                172.168.40.93  
 +-----------+    +------+    +-----------+  
 |       eth0|----|  L2  |----|eth0       |  
 |           |    +------+    |           |  
 |           |       |        |           |  
 |           |  +----------+  |           |  
 |           |  |    L3    |  |           |  
 |           |  +----------+  |           |  
 |           |       |        |           |  
 |           |    +------+    |           |  
 |       eth1|----|  L2  |----|eth1       |  
 +-----------+    +------+    +-----------+  
 172.168.39.92                172.168.40.94  

Route Setup:

---------- CLIENT ----------
# ip rule show
0:	from all lookup local 
197:	from all to 172.168.39.91 lookup rt2 
198:	from 172.168.39.91 lookup rt2 
199:	from all to 172.168.39.92 lookup rt3 
200:	from 172.168.39.92 lookup rt3 
32766:	from all lookup main 
32767:	from all lookup default 

# ip route show table rt2
172.168.40.0/24 dev eth0  scope link  src 172.168.39.91 
172.168.39.0/24 dev eth0  scope link  src 172.168.39.91 

# ip route show table rt3
172.168.40.0/24 dev eth1  scope link  src 172.168.39.92 
172.168.39.0/24 dev eth1  scope link  src 172.168.39.92 

# ip route show table main
192.168.40.0/24 dev eth1  proto kernel  scope link  src 192.168.40.212 

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.40.0    0.0.0.0         255.255.255.0   U     0      0        0 eth1

---------- SERVER ----------
# ip rule show
0:	from all lookup local 
197:	from all to 172.168.40.93 lookup rt2 
198:	from 172.168.40.93 lookup rt2 
199:	from all to 172.168.40.94 lookup rt3 
200:	from 172.168.40.94 lookup rt3 
32766:	from all lookup main 
32767:	from all lookup default 

# ip route show table rt2
172.168.40.0/24 dev eth0  scope link  src 172.168.40.93 
172.168.39.0/24 dev eth0  scope link  src 172.168.40.93 

# ip route show table rt3
172.168.40.0/24 dev eth1  scope link  src 172.168.40.94 
172.168.39.0/24 dev eth1  scope link  src 172.168.40.94 

# ip route show table main
192.168.40.0/24 dev eth1  proto kernel  scope link  src 192.168.40.127 

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.40.0    0.0.0.0         255.255.255.0   U     0      0        0 eth1


We are hoping for your kind response and thank you in advance!

Wins



* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
@ 2014-06-30  8:24 ` Daniel Borkmann
  2014-06-30  9:27 ` Daniel Borkmann
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Daniel Borkmann @ 2014-06-30  8:24 UTC (permalink / raw)
  To: linux-sctp

On 06/28/2014 01:04 PM, Winston V. Tizon wrote:
> Hello everyone!
>
> We are having some issues regarding SCTP multihoming
> and we would like to ask your opinion on this matter.
> We have two RHEL6.4 (2.6.32-358.el6.x86_64, lksctp-tools
> -1.0.10-5(64 bit)) machines connected by two L2 switch
> and a L3 switch (please see "Environment Setup" below).
> When we execute SCTP connection (using multihoming) between
> the two machines, the following behavior occurred:
>
>        CLIENT           L2 and L3         SERVER
> Secondary Primary           |      Primary Secondary
>      |        |              |          |       |
>      |        |              |          |       |
>      |        |INIT-INIT_ACK |          |       |
>      |        |<-------------|--------->|       |
>      |        |COOKIE_ECHO-COOKIE_ACK   |       |
>      |        |              |          |       |
>      |        |<-------------|--------->|       |
>      |        |HB/HB_ACK     |          |       |
>      |        |              |          |       |
>      |<-------|--------------|----------|       |
>      |        |              |       HB |       |
>      |        |--------------|--------->|       |
>      |        |HB_ACK        |          |       |
>      |        |        :     |          |       |
>      |        |        :     |          |       |
>
> INIT/INIT_ACK handshake occurred in the primary
> path of both machines which is expected. When a
> Primary path sends HEARTBEAT to another Primary,
> HEARTBEAT_ACK was returned to the sender. But
> when a Primary path sends HEARTBEAT to a
> Secondary path, the HEARTBEAT_ACK chunk was sent
> by the Primary path. We expect that the
> HEARTBEAT_ACK would come from the Secondary.
>
> [Questions]
> 1. Is this a normal behavior with regards to SCTP multihoming?

So looking at the RFC, it says (RFC4960, 3.3.6. + 6.4.) ...

  An endpoint should send this chunk to its peer endpoint as a
  response to a HEARTBEAT chunk (see Section 8.3). A HEARTBEAT ACK
  is always sent to the source IP address of the IP datagram
  containing the HEARTBEAT chunk to which this ack is responding.

  [...]

  An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
  etc.) to the same destination transport address from which it
  received the DATA or control chunk to which it is replying. This
  rule should also be followed if the endpoint is bundling DATA chunks
  together with the reply chunk.

... it would be more correct to reply via the same transport, imho.
I just checked upstream kernel with multihoming on 2 machines with
5 interfaces each, and HB, HB-ACK replies seem to be fine there,
that is, HB-ACKs go via same transports.


* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
  2014-06-30  8:24 ` Daniel Borkmann
@ 2014-06-30  9:27 ` Daniel Borkmann
  2014-06-30 12:30 ` Neil Horman
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Daniel Borkmann @ 2014-06-30  9:27 UTC (permalink / raw)
  To: linux-sctp

On 06/30/2014 10:58 AM, Winston V. Tizon wrote:
...
> Thank you very much for your quick reply.

[ Please don't top-post. ]

> I need to confirm if your environment there with
> 2 machines with 5 interfaces each have the same RHEL and
> LKSCTP versions with the ones we are using?

Note that I said I was using the latest upstream kernel; I haven't
checked with RHEL yet.


* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
  2014-06-30  8:24 ` Daniel Borkmann
  2014-06-30  9:27 ` Daniel Borkmann
@ 2014-06-30 12:30 ` Neil Horman
  2014-07-01 11:20 ` Neil Horman
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2014-06-30 12:30 UTC (permalink / raw)
  To: linux-sctp

On Mon, Jun 30, 2014 at 04:58:36PM +0800, Winston V. Tizon wrote:
> TO: Mr. Borkmann
> 
> Good day!
> 
> Thank you very much for your quick reply.
> 
> I need to confirm if your environment there with
> 2 machines with 5 interfaces each have the same RHEL and
> LKSCTP versions with the ones we are using?
> 
> If you are using the latest RHEL and LKSCTP versions
> and HB-ACKS are doing fine there, we might think that 
> our older LKSCTP version has something to do with the 
> abnormal HB-ACK behavior.
> 
> Thanks,
> 
> W. Tizon
> 
How are you determining which transport is getting used to send the HEARTBEAT
ACK?  Are you looking at the chunk's destination IP address?  Looking at
sctp_outq_flush in RHEL6, it appears we should always use the inbound transport
to send the response, which is the correct thing to do. Sometimes, however, while
the destination IP address is correct, funny routing tables can lead to a single
source address getting selected.

Neil

> On Mon, 30 Jun 2014 10:24:09 +0200
> Daniel Borkmann <dborkman@redhat.com> wrote:
> 
> > On 06/28/2014 01:04 PM, Winston V. Tizon wrote:
> > > Hello everyone!
> > >
> > > We are having some issues regarding SCTP multihoming
> > > and we would like to ask your opinion on this matter.
> > > We have two RHEL6.4 (2.6.32-358.el6.x86_64, lksctp-tools
> > > -1.0.10-5(64 bit)) machines connected by two L2 switch
> > > and a L3 switch (please see "Environment Setup" below).
> > > When we execute SCTP connection (using multihoming) between
> > > the two machines, the following behavior occurred:
> > >
> > >        CLIENT           L2 and L3         SERVER
> > > Secondary Primary           |      Primary Secondary
> > >      |        |              |          |       |
> > >      |        |              |          |       |
> > >      |        |INIT-INIT_ACK |          |       |
> > >      |        |<-------------|--------->|       |
> > >      |        |COOKIE_ECHO-COOKIE_ACK   |       |
> > >      |        |              |          |       |
> > >      |        |<-------------|--------->|       |
> > >      |        |HB/HB_ACK     |          |       |
> > >      |        |              |          |       |
> > >      |<-------|--------------|----------|       |
> > >      |        |              |       HB |       |
> > >      |        |--------------|--------->|       |
> > >      |        |HB_ACK        |          |       |
> > >      |        |        :     |          |       |
> > >      |        |        :     |          |       |
> > >
> > > INIT/INIT_ACK handshake occurred in the primary
> > > path of both machines which is expected. When a
> > > Primary path sends HEARTBEAT to another Primary,
> > > HEARTBEAT_ACK was returned to the sender. But
> > > when a Primary path sends HEARTBEAT to a
> > > Secondary path, the HEARTBEAT_ACK chunk was sent
> > > by the Primary path. We expect that the
> > > HEARTBEAT_ACK would come from the Secondary.
> > >
> > > [Questions]
> > > 1. Is this a normal behavior with regards to SCTP multihoming?
> > 
> > So looking at the RFC, it says (RFC4960, 3.3.6. + 6.4.) ...
> > 
> >   An endpoint should send this chunk to its peer endpoint as a
> >   response to a HEARTBEAT chunk (see Section 8.3). A HEARTBEAT ACK
> >   is always sent to the source IP address of the IP datagram
> >   containing the HEARTBEAT chunk to which this ack is responding.
> > 
> >   [...]
> > 
> >   An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
> >   etc.) to the same destination transport address from which it
> >   received the DATA or control chunk to which it is replying. This
> >   rule should also be followed if the endpoint is bundling DATA chunks
> >   together with the reply chunk.
> > 
> > ... it would be more correct to reply via the same transport, imho.
> > I just checked upstream kernel with multihoming on 2 machines with
> > 5 interfaces each, and HB, HB-ACK replies seem to be fine there,
> > that is, HB-ACKs go via same transports.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
                   ` (2 preceding siblings ...)
  2014-06-30 12:30 ` Neil Horman
@ 2014-07-01 11:20 ` Neil Horman
  2014-07-01 11:38 ` Daniel Borkmann
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2014-07-01 11:20 UTC (permalink / raw)
  To: linux-sctp

On Tue, Jul 01, 2014 at 03:48:32PM +0800, Winston V. Tizon wrote:
> On Mon, 30 Jun 2014 08:30:20 -0400
> Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> > How are you determining which transport is getting used to send the HEARTBEAT
> > ACK?  Are you looking at the chunks destination IP address?  Looking at
> > sctp_outq_flush in RHEL6 it appears we should always use the inbound transport
> > to send the response, which is the correct thing to do. Sometimes however, while
> > the destination ip address is correct, funny routing tables can lead to a single
> > source address getting selected.
> > 
> > Neil
> 
> Thanks for the reply Neil.
> 
> We've just performed a test to check whether the Multihoming Heartbeat ACK
> behavior I mentioned is related to the routing table configuration or to the
> SCTP Primary IP address setting. Please see the test information and results below.
> 
> "Test case1" (Set the primary in "172.168.39.2")
> 
> [client]
> # sctp_darn -H 172.168.39.2 -B 172.168.39.3 -P 9099 -h 172.168.39.4 -p 9099 -s
> 
> [server]
> # sctp_darn -H 172.168.39.4 -B 172.168.39.5 -P 9099 -l
> 
> [result]
> 172.168.39.2(client) 172.168.39.4(server) INIT
> 172.168.39.4(server) 172.168.39.2(client) INIT_ACK
> 172.168.39.2(client) 172.168.39.4(server) COOKIE_ECHO
> 172.168.39.4(server) 172.168.39.2(client) COOKIE_ACK
> 172.168.39.4(server) 172.168.39.2(client) HB
> 172.168.39.2(client) 172.168.39.4(server) HB_ACK
> 172.168.39.4(server) 172.168.39.3(client) HB
> 172.168.39.2(client) 172.168.39.4(server) HB_ACK     (***)
> 172.168.39.2(client) 172.168.39.4(server) DATA
> 172.168.39.4(server) 172.168.39.2(client) SACK
> 
> 
> "Test case2" (Set the primary in "172.168.39.3")
> 
> [client]
> # sctp_darn -H 172.168.39.3 -B 172.168.39.2 -P 9099 -h 172.168.39.4 -p 9099 -s
> 
> [server]
> # sctp_darn -H 172.168.39.4 -B 172.168.39.5 -P 9099 -l
> 
> [result]
> 172.168.39.3(client) 172.168.39.4(server) INIT
> 172.168.39.4(server) 172.168.39.3(client) INIT_ACK
> 172.168.39.3(client) 172.168.39.4(server) COOKIE_ECHO
> 172.168.39.4(server) 172.168.39.3(client) COOKIE_ACK
> 172.168.39.4(server) 172.168.39.3(client) HB
> 172.168.39.3(client) 172.168.39.4(server) HB_ACK
> 172.168.39.4(server) 172.168.39.2(client) HB
> 172.168.39.3(client) 172.168.39.4(server) HB_ACK     (***)
> 172.168.39.3(client) 172.168.39.4(server) DATA 
> 172.168.39.4(server) 172.168.39.3(client) SACK
> 
> 
> Based on the results, the SCTP kernel always chooses the Primary IP address,
> so I think this is not related to a routing table configuration problem.
> 
> (***) -> Based on SCTP RFC4960, the expected behavior is that the secondary
>          IP address should be used as the path for sending the HB_ACK.
> 
Actually, it's quite the opposite: this confirms that the SCTP protocol is
functioning normally.  RFC 4960 says this about HB_ACKs:

3.3.6.  Heartbeat Acknowledgement (HEARTBEAT ACK) (5)

   An endpoint should send this chunk to its peer endpoint as a response
   to a HEARTBEAT chunk (see Section 8.3).  A HEARTBEAT ACK is always
   sent to the source IP address of the IP datagram containing the
   HEARTBEAT chunk to which this ack is responding.

The only thing that a peer has to do regarding an HB frame is send an HB_ACK to
the source IP address of the corresponding HB frame (in this case 172.168.39.4),
which we do by recording the inbound transport that the HB frame arrived on.
The source address selection is made during L3 packet routing, which, according
to sctp_transport_route, is made by querying the route tables.

The only thing that's a bit odd is the fact that you're not taking what is likely
the selected source address from the default route.  That usually indicates that
you're using path MTU features, and you're routing based on a locally selected
source and destination address.

The point, however, is that a transport is only defined by a destination address
within an association, not by a source and a destination. The source address
selection is made by the local peer, and should not be relevant to the peer
receiving the data.
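
A rough way to see the split between "where the HB_ACK goes" and "which
local address it leaves from" is to model the CLIENT's policy routing from
the original post. The sketch below is only an illustration, not the kernel's
code: the lookup() helper, its simplified rule matching (a "from" rule is
skipped when no source address has been chosen yet), and its first-match
route selection are all assumptions made for this example.

```python
import ipaddress

# CLIENT rules from the original post: (priority, from, to, table).
RULES = [
    (197, None, "172.168.39.91/32", "rt2"),
    (198, "172.168.39.91/32", None, "rt2"),
    (199, None, "172.168.39.92/32", "rt3"),
    (200, "172.168.39.92/32", None, "rt3"),
    (32766, None, None, "main"),
]

# CLIENT tables from the original post: (prefix, device, preferred source).
TABLES = {
    "rt2": [("172.168.40.0/24", "eth0", "172.168.39.91"),
            ("172.168.39.0/24", "eth0", "172.168.39.91")],
    "rt3": [("172.168.40.0/24", "eth1", "172.168.39.92"),
            ("172.168.39.0/24", "eth1", "172.168.39.92")],
    "main": [("192.168.40.0/24", "eth1", "192.168.40.212")],
}


def lookup(dst, src=None):
    """Hypothetical helper: return (device, source) or None if unroutable."""
    dst_ip = ipaddress.ip_address(dst)
    for _prio, frm, to, table in sorted(RULES, key=lambda r: r[0]):
        # A "from" selector cannot match while the source is still unset.
        if frm is not None:
            if src is None or ipaddress.ip_address(src) not in ipaddress.ip_network(frm):
                continue
        if to is not None and dst_ip not in ipaddress.ip_network(to):
            continue
        # Simplification: first matching prefix wins (all prefixes here are /24).
        for prefix, dev, pref_src in TABLES[table]:
            if dst_ip in ipaddress.ip_network(prefix):
                return dev, src or pref_src
    return None


# The reply destination is the same in every call; only the source differs.
print(lookup("172.168.40.93"))
print(lookup("172.168.40.93", src="172.168.39.91"))
print(lookup("172.168.40.93", src="172.168.39.92"))
```

In this model the outgoing interface follows the source address that was
picked locally, not the address the HEARTBEAT was sent to, which is the
distinction made above.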

Neil



* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
                   ` (3 preceding siblings ...)
  2014-07-01 11:20 ` Neil Horman
@ 2014-07-01 11:38 ` Daniel Borkmann
  2014-07-01 11:57 ` Michael Tuexen
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Daniel Borkmann @ 2014-07-01 11:38 UTC (permalink / raw)
  To: linux-sctp

On 07/01/2014 01:20 PM, Neil Horman wrote:
> On Tue, Jul 01, 2014 at 03:48:32PM +0800, Winston V. Tizon wrote:
...
>> (***) -> Based on SCTP RFC4960, expected behavior is secondary IP address
>>           should be used as path in sending the HB_ACK.
>>
> Actually, its quite the opposite, this confirms that the sctp protocol is
> functioning normally.  RFC 4960 says this about HB_ACK's:
>
> 3.3.6.  Heartbeat Acknowledgement (HEARTBEAT ACK) (5)
>
>     An endpoint should send this chunk to its peer endpoint as a response
>     to a HEARTBEAT chunk (see Section 8.3).  A HEARTBEAT ACK is always
>     sent to the source IP address of the IP datagram containing the
>     HEARTBEAT chunk to which this ack is responding.
>
> The only thing that a peer has to do regarding a HB frame is sent an HB_ACK to
> the source ip address of the corresponding HB frame (in this case172.168.39.4),
> which we do my recording the inbound transport that the HB frame arrived on.

I agree with you, Neil. The RFC only mentions that the HB_ACK is "sent to the
source IP address", which is what I quoted earlier on as well, so the above
statement that the "secondary IP address should be used as path in sending
the HB_ACK" is not a MUST.


* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
                   ` (4 preceding siblings ...)
  2014-07-01 11:38 ` Daniel Borkmann
@ 2014-07-01 11:57 ` Michael Tuexen
  2014-07-01 13:59 ` Jeff Carter
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Michael Tuexen @ 2014-07-01 11:57 UTC (permalink / raw)
  To: linux-sctp

On 01 Jul 2014, at 13:38, Daniel Borkmann <dborkman@redhat.com> wrote:

> On 07/01/2014 01:20 PM, Neil Horman wrote:
>> On Tue, Jul 01, 2014 at 03:48:32PM +0800, Winston V. Tizon wrote:
> ...
>>> (***) -> Based on SCTP RFC4960, expected behavior is secondary IP address
>>>          should be used as path in sending the HB_ACK.
>>> 
>> Actually, its quite the opposite, this confirms that the sctp protocol is
>> functioning normally.  RFC 4960 says this about HB_ACK's:
>> 
>> 3.3.6.  Heartbeat Acknowledgement (HEARTBEAT ACK) (5)
>> 
>>    An endpoint should send this chunk to its peer endpoint as a response
>>    to a HEARTBEAT chunk (see Section 8.3).  A HEARTBEAT ACK is always
>>    sent to the source IP address of the IP datagram containing the
>>    HEARTBEAT chunk to which this ack is responding.
>> 
>> The only thing that a peer has to do regarding a HB frame is sent an HB_ACK to
>> the source ip address of the corresponding HB frame (in this case172.168.39.4),
>> which we do my recording the inbound transport that the HB frame arrived on.
> 
> I agree with you, Neil, the RFC only mentions that we need to "sent to the
> source IP address", which was what I've quoted earlier on as well, so above
> statement to use "secondary IP address should be used as path in sending
> the HB_ACK" is not a MUST.
RFC 4960 does not make explicit statements about source address selection
in general...

Best regards
Michael



* RE: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
                   ` (5 preceding siblings ...)
  2014-07-01 11:57 ` Michael Tuexen
@ 2014-07-01 13:59 ` Jeff Carter
  2014-07-06 20:19 ` Michael Tuexen
  2014-07-07 11:41 ` Neil Horman
  8 siblings, 0 replies; 10+ messages in thread
From: Jeff Carter @ 2014-07-01 13:59 UTC (permalink / raw)
  To: linux-sctp

Also notice that the RFC states this:

" The Sender-Specific Heartbeat Info field should normally include
  information about the sender's current time when this HEARTBEAT
  chunk is sent and the destination transport address to which this
  HEARTBEAT is sent (see Section 8.3)."

So it is perfectly expected that the HEARTBEAT ACK source
address won't necessarily be the destination address of the HEARTBEAT.
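
To illustrate the point, here is a small sketch of how the sender of a
HEARTBEAT can tell which path was confirmed from the echoed Heartbeat Info
rather than from the source address of the HEARTBEAT ACK. The class and
function names and the layout of the info field are made up for this
example; RFC 4960 treats the field as opaque to the receiver.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    src: str    # address the HEARTBEAT was sent from
    dst: str    # peer address being probed
    info: dict  # sender-specific info, opaque to the receiver


@dataclass
class HeartbeatAck:
    src: str    # chosen by the replying host's own routing
    dst: str    # always the source address of the HEARTBEAT
    info: dict  # echoed back unchanged


def make_heartbeat(local_src, peer_addr, now):
    # Record the send time and the probed destination (illustrative layout).
    return Heartbeat(local_src, peer_addr, {"sent_at": now, "probed": peer_addr})


def make_ack(hb, chosen_src):
    # The reply goes to hb.src; its source is whatever was selected locally.
    return HeartbeatAck(src=chosen_src, dst=hb.src, info=hb.info)


def path_confirmed_by(ack):
    # The HEARTBEAT sender reads the probed address back from the echoed info.
    return ack.info["probed"]


# Mirrors "Test case1": the server probes the client's secondary address,
# and the client answers from its primary address.
hb = make_heartbeat("172.168.39.4", "172.168.39.3", now=0.0)
ack = make_ack(hb, chosen_src="172.168.39.2")
print(ack.dst, ack.src, path_confirmed_by(ack))
```

Even though the reply leaves from 172.168.39.2, the echoed info still
identifies 172.168.39.3 as the destination that was probed.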

Jeff Carter


-----Original Message-----
From: linux-sctp-owner@vger.kernel.org [mailto:linux-sctp-owner@vger.kernel.org] On Behalf Of Michael Tuexen
Sent: Tuesday, July 01, 2014 7:58 AM
To: Daniel Borkmann
Cc: Neil Horman; Winston V. Tizon; linux-sctp@vger.kernel.org
Subject: Re: SCTP Multihoming Heartbeat ACK Behavior

On 01 Jul 2014, at 13:38, Daniel Borkmann <dborkman@redhat.com> wrote:

> On 07/01/2014 01:20 PM, Neil Horman wrote:
>> On Tue, Jul 01, 2014 at 03:48:32PM +0800, Winston V. Tizon wrote:
> ...
>>> (***) -> Based on SCTP RFC4960, expected behavior is secondary IP address
>>>          should be used as path in sending the HB_ACK.
>>> 
>> Actually, its quite the opposite, this confirms that the sctp protocol is
>> functioning normally.  RFC 4960 says this about HB_ACK's:
>> 
>> 3.3.6.  Heartbeat Acknowledgement (HEARTBEAT ACK) (5)
>> 
>>    An endpoint should send this chunk to its peer endpoint as a response
>>    to a HEARTBEAT chunk (see Section 8.3).  A HEARTBEAT ACK is always
>>    sent to the source IP address of the IP datagram containing the
>>    HEARTBEAT chunk to which this ack is responding.
>> 
>> The only thing that a peer has to do regarding a HB frame is sent an HB_ACK to
>> the source ip address of the corresponding HB frame (in this case172.168.39.4),
>> which we do my recording the inbound transport that the HB frame arrived on.
> 
> I agree with you, Neil, the RFC only mentions that we need to "sent to the
> source IP address", which was what I've quoted earlier on as well, so above
> statement to use "secondary IP address should be used as path in sending
> the HB_ACK" is not a MUST.
RFC 4960 does not make explicit statements about source address selection
in general...

Best regards
Michael


* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
                   ` (6 preceding siblings ...)
  2014-07-01 13:59 ` Jeff Carter
@ 2014-07-06 20:19 ` Michael Tuexen
  2014-07-07 11:41 ` Neil Horman
  8 siblings, 0 replies; 10+ messages in thread
From: Michael Tuexen @ 2014-07-06 20:19 UTC (permalink / raw)
  To: linux-sctp


On 04 Jul 2014, at 12:33, Winston V. Tizon <wtizon@tspi.com.ph> wrote:

> On Tue, 1 Jul 2014 07:20:01 -0400
> Neil Horman <nhorman@tuxdriver.com> wrote:
>> Actually, its quite the opposite, this confirms that the sctp protocol is
>> functioning normally.  RFC 4960 says this about HB_ACK's:
>> 
>> 3.3.6.  Heartbeat Acknowledgement (HEARTBEAT ACK) (5)
>> 
>>   An endpoint should send this chunk to its peer endpoint as a response
>>   to a HEARTBEAT chunk (see Section 8.3).  A HEARTBEAT ACK is always
>>   sent to the source IP address of the IP datagram containing the
>>   HEARTBEAT chunk to which this ack is responding.
>> 
>> The only thing that a peer has to do regarding a HB frame is sent an HB_ACK to
>> the source ip address of the corresponding HB frame (in this case172.168.39.4),
>> which we do my recording the inbound transport that the HB frame arrived on.
>> The source address selection is made during L3 packet routing, which, according
>> to sctp_transport_route is made by querying the route tables.
>> 
>> The only thing thats a bit odd is the fact your not taking what is likely the
>> selected source address from the default route.  That usually indicates that
>> you're using path mtu features, and you're routing based on a locally selected
>> source and destination address.
>> 
>> The point however is, that a transport is only defined by a destination address
>> within an association, not by a source and a destination, the source address
>> selection is made by the local peer, and should not be relevant to the peer
>> receiving the data.
>> 
>> Neil
>> 
> TO: Neil, Daniel, Michael
> 
> Thank you very much for your answers. I misunderstood the SCTP RFC statement
> regarding reply chunks in a multi-homed association. I thought reply
> chunks should always be sent on the same path from which the DATA or control
> chunks were received.
> 
> I'm sorry that I haven't yet explained the other details of our problem,
> which uses the "Environment Setup" below.
> 
>    CLIENT                        SERVER     
> 172.168.39.91                172.168.40.93  
> +-----------+    +------+    +-----------+  
> |       eth0|----|  L2  |----|eth0       |  
> |           |    +------+    |           |  
> |           |       |        |           |  
> |           |  +----------+  |           |  
> |           |  |    L3    |  |           |  
> |           |  +----------+  |           |  
> |           |       |        |           |  
> |           |    +------+    |           |  
> |       eth1|----|  L2  |----|eth1       |  
> +-----------+    +------+    +-----------+  
> 172.168.39.92                172.168.40.94  
> 
> Our problem goes like this. The normal sequence is shown under "[NORMAL]" below.
> Our problem started when we performed the "[ABNORMAL]" sequence below.
> 
> [NORMAL]
> 
>      SERVER           L2 and L3         CLIENT      
> Secondary Primary           |      Primary Secondary 
>    |        |              |          |       |     
>    |        |              |          |       |     
>    |        |INIT-INIT_ACK |          |       |     
>    |        |<-------------|--------->|       |     
>    |        |COOKIE_ECHO-COOKIE_ACK   |       |     
>    |        |              |          |       |     
>    |        |<-------------|--------->|       |     
>    |        |HB/HB_ACK     |          |       |     
>    |        |              |          |       |     
>    |<-------|--------------|----------|       |     
>    |        |              |       HB |       |     
>    |        |--------------|--------->|       |     
>    |        |HB_ACK        |          |       |     
>    |        |        :     |          |       |     
>    |        |        :     |          |       |     
> 
> [ABNORMAL]
> 
>      SERVER           L2 and L3         CLIENT
> Secondary Primary           |      Primary Secondary 
>    |        |              |          |       |     
>    |        |              |          |       |     
>    |        |INIT-INIT_ACK |          |       |     
>    |        |<-------------|--------->|       |     
>    |        |COOKIE_ECHO-COOKIE_ACK   |       |     
>    |        |              |          |       |     
>    |        |<-------------|--------->|       |     
>    |        |HB/HB_ACK     |          |       |     
>    |        |              |          |       |     
>    |<-------|--------------|----------|       |     
>    |        |              |       HB |       |     
>    |        |---->X        |          |       |     
>    |        |HB_ACK        |          |       |     
>    |        |        :     |          |       |     
>    |<-------|--------------|----------|       |     
>                  ABORT
> 
> X  -> LAN cable unplugged, cannot send HB_ACK to Client
> 
> Based on what happened in our environment, the Client sent an ABORT chunk to
> the Secondary IP address of the Server, causing this path to be closed.
> 
> We don't want this to happen in our environment.
> 
> It would be of great help to us if you have any idea on how we could
> solve question #3 below.
> 
> [Question]
> 3. Is there a way to force the Server to use its Secondary path when
>   sending the HEARTBEAT_ACK chunk to the Client's primary?
It is always difficult to have multiple addresses in the same network
configured on a host. I guess your problems go away if you use

   CLIENT                        SERVER     
172.168.39.91                172.168.39.93  
+-----------+    +------+    +-----------+  
|       eth0|----|  L2  |----|eth0       |  
|           |    +------+    |           |  
|           |       |        |           |  
|           |  +----------+  |           |  
|           |  |    L3    |  |           |  
|           |  +----------+  |           |  
|           |       |        |           |  
|           |    +------+    |           |  
|       eth1|----|  L2  |----|eth1       |  
+-----------+    +------+    +-----------+  
172.168.40.92                172.168.40.94  

assuming all masks are 255.255.255.0
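
As a quick sanity check of an addressing plan like the one above, the
following sketch lists pairs of local addresses that end up in the same
network. The shared_subnets() helper is a hypothetical name used only here,
and the /24 prefix length is taken from the mask assumption stated above.

```python
import ipaddress
from itertools import combinations


def shared_subnets(addresses, prefix_len=24):
    """Return pairs of local addresses that fall into the same network."""
    ifaces = [ipaddress.ip_interface(f"{a}/{prefix_len}") for a in addresses]
    return [
        (str(x.ip), str(y.ip))
        for x, y in combinations(ifaces, 2)
        if x.network == y.network
    ]


# CLIENT addresses from the original setup and from the layout suggested above.
original_client = ["172.168.39.91", "172.168.39.92"]
proposed_client = ["172.168.39.91", "172.168.40.92"]

print(shared_subnets(original_client))
print(shared_subnets(proposed_client))
```

An empty result means every local address sits in its own network, so
ordinary per-interface routes are enough to tie each source address to one
interface.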

Best regards
Michael
> 
> Wins
> 



* Re: SCTP Multihoming Heartbeat ACK Behavior
  2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
                   ` (7 preceding siblings ...)
  2014-07-06 20:19 ` Michael Tuexen
@ 2014-07-07 11:41 ` Neil Horman
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2014-07-07 11:41 UTC (permalink / raw)
  To: linux-sctp

On Fri, Jul 04, 2014 at 06:33:32PM +0800, Winston V. Tizon wrote:
> On Tue, 1 Jul 2014 07:20:01 -0400
> Neil Horman <nhorman@tuxdriver.com> wrote:
> > Actually, its quite the opposite, this confirms that the sctp protocol is
> > functioning normally.  RFC 4960 says this about HB_ACK's:
> > 
> > 3.3.6.  Heartbeat Acknowledgement (HEARTBEAT ACK) (5)
> > 
> >    An endpoint should send this chunk to its peer endpoint as a response
> >    to a HEARTBEAT chunk (see Section 8.3).  A HEARTBEAT ACK is always
> >    sent to the source IP address of the IP datagram containing the
> >    HEARTBEAT chunk to which this ack is responding.
> > 
> > The only thing that a peer has to do regarding a HB frame is send an HB_ACK to
> > the source IP address of the corresponding HB frame (in this case 172.168.39.4),
> > which we do by recording the inbound transport that the HB frame arrived on.
> > The source address selection is made during L3 packet routing, which, according
> > to sctp_transport_route, is made by querying the route tables.
> > 
> > The only thing that's a bit odd is the fact you're not taking what is likely the
> > selected source address from the default route.  That usually indicates that
> > you're using path mtu features, and you're routing based on a locally selected
> > source and destination address.
> > 
> > The point however is, that a transport is only defined by a destination address
> > within an association, not by a source and a destination, the source address
> > selection is made by the local peer, and should not be relevant to the peer
> > receiving the data.
> > 
> > Neil
> > 
> TO: Neil, Daniel, Michael
> 
> Thank you very much for your answers. I misunderstood the SCTP RFC statement 
> regarding reply Chunks in a multi-homed association. I thought reply 
> chunks should be sent always to the same path from where DATA or Control 
> chunks were received.
> 
> I'm sorry if I haven't explained yet the other details of our problem using 
> the following  "Environment Setup" below.
> 
This is a bit of a strange setup.  Multiple subnets on the same L2 switch
without VLANs can lead to strange behaviors.  It's also a bit strange that you
have a single interface with what must be multiple IP addresses on each of two
router interfaces.

>     CLIENT                        SERVER     
>  172.168.39.91                172.168.40.93  
>  +-----------+    +------+    +-----------+  
>  |       eth0|----|  L2  |----|eth0       |  
>  |           |    +------+    |           |  
>  |           |       |        |           |  
>  |           |  +----------+  |           |  
>  |           |  |    L3    |  |           |  
>  |           |  +----------+  |           |  
>  |           |       |        |           |  
>  |           |    +------+    |           |  
>  |       eth1|----|  L2  |----|eth1       |  
>  +-----------+    +------+    +-----------+  
>  172.168.39.92                172.168.40.94  
> 
> Our problem goes like this. The normal sequence is shown under "[NORMAL]".
> The problem appeared when we performed the "[ABNORMAL]" sequence below.
> 
> [NORMAL]
> 
>       SERVER           L2 and L3         CLIENT      
> Secondary Primary           |      Primary Secondary 
>     |        |              |          |       |     
>     |        |              |          |       |     
>     |        |INIT-INIT_ACK |          |       |     
>     |        |<-------------|--------->|       |     
>     |        |COOKIE_ECHO-COOKIE_ACK   |       |     
>     |        |              |          |       |     
>     |        |<-------------|--------->|       |     
>     |        |HB/HB_ACK     |          |       |     
>     |        |              |          |       |     
>     |<-------|--------------|----------|       |     
>     |        |              |       HB |       |     
>     |        |--------------|--------->|       |     
>     |        |HB_ACK        |          |       |     
>     |        |        :     |          |       |     
>     |        |        :     |          |       |     
> 
> [ABNORMAL]
> 
>       SERVER           L2 and L3         CLIENT
> Secondary Primary           |      Primary Secondary 
>     |        |              |          |       |     
>     |        |              |          |       |     
>     |        |INIT-INIT_ACK |          |       |     
>     |        |<-------------|--------->|       |     
>     |        |COOKIE_ECHO-COOKIE_ACK   |       |     
>     |        |              |          |       |     
>     |        |<-------------|--------->|       |     
>     |        |HB/HB_ACK     |          |       |     
>     |        |              |          |       |     
>     |<-------|--------------|----------|       |     
>     |        |              |       HB |       |     
>     |        |---->X        |          |       |     
>     |        |HB_ACK        |          |       |     
>     |        |        :     |          |       |     
>     |<-------|--------------|----------|       |     
>                   ABORT
>     
> X  -> LAN cable unplug, cannot send HB_ACK to Client
> 

Exactly which cable was it that got unplugged here?  In your configuration above,
if the primary IP cable got pulled, then the server system would have updated
its route table and the secondary IP address would get used.  If the link to the
router of the client got pulled though, there's no real hope here.  Also, what
inspired the ABORT here?  I presume the timer ran out on the client and it sent
an ABORT because it didn't receive the HB_ACK in a timely fashion?

> Based on what happened in our environment, the Client sent ABORT chunk to 
> Secondary IP address of Server causing this path to be closed. 
> 
> We don't want this to happen in our environment.
> 
> It would be of great help to us if you have any idea on how we could 
> solve question #3 below.
> 
> [Question]
> 3. Is there a solution to force Server to use its Secondary path in 
>    sending the HEARTBEAT_ACK chunk to Client's primary?   
> 
Yes, but it's the same solution that all SCTP traffic uses, which is to say that
it relies on the routing tables.  If the primary interface on the sending host
goes down, then the route table will expire all routes associated with that
path, and sctp will pick a new one (the mechanics are that
sctp_transport_dst_check will fail, another routing table lookup will occur, and
a route will be cloned that contains the secondary address as the saddr
variable).  That said, if you're not pulling the interface cable (i.e. if you're
pulling the cable to the router, or the client system), then the server has no
way to know that the route using the primary source is unreachable, and will
continue to use it.
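
The route lookup that decides the source address can be observed from user
space. The following is a minimal Python sketch and is not SCTP-specific: it
assumes that connecting a UDP socket triggers the same kind of route table
lookup, and the helper name selected_source_address is made up for
illustration. The SCTP stack additionally limits the choice to addresses
bound to the association, which this sketch does not model.

```python
import socket


def selected_source_address(dest_ip, port=9):
    """Return the local IPv4 address the route table picks for dest_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # connect() on a UDP socket sends no packet; it only performs the
        # route lookup and fixes the local (source) address of the socket.
        sock.connect((dest_ip, port))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # Loopback is used here so the example runs on any host; on the server
    # one would pass the client's primary address instead.
    print(selected_source_address("127.0.0.1"))
```

Running it on the server against the client's primary address, before and
after pulling a cable, should show whether the route table (and therefore the
chosen source address) changed at all.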

SCTP is fault tolerant over multiple network paths, but that comes with the
limitation that SCTP needs to be aware of path failures in a timely fashion.
Your setup doesn't really allow for that.
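
To get a feel for what "timely" means when only heartbeats reveal a failure,
here is a rough Python estimate. It is a sketch under stated assumptions: the
RFC 4960 suggested defaults (RTO.Initial 3 s, RTO.Max 60 s, HB.interval 30 s,
Path.Max.Retrans 5), one heartbeat per HB.interval plus RTO, RTO doubling
after each unanswered heartbeat, and no jitter. The function name is
hypothetical, and the values actually configured on a given host may differ.

```python
def estimate_path_failure_seconds(rto, rto_max, hb_interval, path_max_retrans):
    """Rough time until an idle path is marked inactive via heartbeats only."""
    total = 0.0
    current_rto = rto
    # The path is marked inactive once the error counter exceeds
    # Path.Max.Retrans, i.e. after path_max_retrans + 1 missed heartbeats.
    for _ in range(path_max_retrans + 1):
        total += hb_interval + current_rto
        # Exponential backoff of the RTO, capped at RTO.Max.
        current_rto = min(current_rto * 2, rto_max)
    return total


if __name__ == "__main__":
    # RFC 4960 suggested defaults (assumed here, not read from the hosts).
    print(estimate_path_failure_seconds(3, 60, 30, 5))  # prints 333.0
```

Under these assumptions detection takes several minutes, which is why a local
link-down event that updates the route table immediately makes such a
difference.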

I would suggest one of two things:

1) Convert your setup to have each client and server interface be on a separate
subnet.  That way the client and server primary addresses are reachable via L2
only, as are the secondary addresses.  This will make path failure easier to
detect.

2) Use the bonding driver to do failover between your interfaces.  That gives you
a single path, but the failover detection is handled at L2 rather than L3, which
is faster and more reliable for local failures like link drops (and can be made
more robust with IP monitoring).  If you want L3 failover capabilities, then I
suggest looking into the possibility of VRRP on your router.

Neil

> 
> 


end of thread, other threads:[~2014-07-07 11:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-28 11:04 SCTP Multihoming Heartbeat ACK Behavior Winston V. Tizon
2014-06-30  8:24 ` Daniel Borkmann
2014-06-30  9:27 ` Daniel Borkmann
2014-06-30 12:30 ` Neil Horman
2014-07-01 11:20 ` Neil Horman
2014-07-01 11:38 ` Daniel Borkmann
2014-07-01 11:57 ` Michael Tuexen
2014-07-01 13:59 ` Jeff Carter
2014-07-06 20:19 ` Michael Tuexen
2014-07-07 11:41 ` Neil Horman
