* PMTU discovery behaviour
@ 2017-09-11 12:44 Peter Salin
  2017-09-19 12:14 ` Peter Salin
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Peter Salin @ 2017-09-11 12:44 UTC
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 3241 bytes --]

Hi,

I encountered some strange PMTUD related behaviour that I need help in
understanding.

Setup:

+-----------+        +---+        +--------+
| 10.0.0.10 |--------| X |--------|10.0.0.3|
+-----------+        +---+        +--------+

A one-to-many socket is set up at 10.0.0.10. Two instances of the
lksctp sctp_darn application are run at 10.0.0.3, listening on ports
8001 and 8002. 10.0.0.3 was also set up to generate ICMP frag needed
messages for incoming messages over 600 bytes. The same issue also
occurs when a router on the path is set up to generate the ICMP
message instead.

Test 1:
Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
port 8001 and another one to 8002. Then a too large message was sent
on the association to 8001, triggering ICMP generation. When checking
the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
association now reports 600. The association to 8002 reports 1500
until traffic is sent on it, at which point it also adjusts to 600
which I think makes sense since the destination IP is the same. When
reopening the associations, the value of 600 would be remembered for
about 10 min, which I also think makes sense since
net.ipv4.route.mtu_expires is 600.
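
The 10-minute figure can be cross-checked against the sysctl directly. A
minimal sketch, assuming a Linux host with procfs mounted; the helper names
readSysctlValue and mtuExpiresSeconds are made up for illustration and are
not part of the reproducer:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads one integer from a procfs-style file and
// returns `fallback` if the file is missing or does not start with a number.
long readSysctlValue(const std::string& path, long fallback) {
  std::ifstream in(path);
  long value = 0;
  if (in >> value) {
    return value;
  }
  return fallback;
}

// Seconds a learned PMTU stays cached on the route (net.ipv4.route.mtu_expires),
// or -1 if the value cannot be read.
long mtuExpiresSeconds() {
  return readSysctlValue("/proc/sys/net/ipv4/route/mtu_expires", -1);
}
```

Logging mtuExpiresSeconds() next to each spinfo_mtu reading would show
whether a revert to 1500 lines up with the cache lifetime or happens earlier.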

Test 2:
Again the same two associations were connected to 10.0.0.3, but in
addition an attempt was made to connect a third association to a
non-existent IP; this attempt fails with a timeout after a while. After
that, an ICMP-triggering large message was again sent to 8001. Now the
behaviour is different from before. The association to 8001 reports a
spinfo_mtu of 600, but only for a brief moment; it does not stay at
600 for 10 minutes. In addition, the spinfo_mtu of the association to
8002 never changes; it stays at the original 1500.

The only difference between the two tests is the attempt to connect to
a non-responding IP at the beginning of test 2. Any ideas why the
behaviour changes? Is this a bug, or is there some other reason for
it?

I have attached the sample application used for reproducing this.

BR,
-Peter

------ ver_linux output ------
Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

GNU C                   5.4.0
GNU Make                4.1
Binutils                2.26.1
Util-linux              2.27.1
Mount                   2.27.1
Module-init-tools       22
E2fsprogs               1.42.13
Xfsprogs                4.3.0
Linux C Library         2.23
Dynamic linker (ldd)    2.23
Linux C++ Library       6.0.21
Procps                  3.3.10
Net-tools               1.60
Kbd                     1.15.5
Console-tools           1.15.5
Sh-utils                8.25
Udev                    229
Modules Loaded          ablk_helper aes_x86_64 aesni_intel
async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
gf128mul ghash_clmulni_intel glue_helper hid hid_generic ib_addr
ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor

[-- Attachment #2: sctp_pmtud_test.cc --]
[-- Type: text/x-c++src, Size: 10166 bytes --]


#include <cstdint>
#include <cstring>
#include <ctime>
#include <iomanip>
#include <iostream>
#include <string>

#include <errno.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <netinet/sctp.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

using namespace std;

static const int ERROR_BUFLEN = 64;
static const char* SCTP_INTERFACE_NAME = "ens4";

static string data100 = "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789";
static string data1000 = "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789"
  "01234567890123456789012345678901234567890123456789";

void printError(const string& msg, const string& funcName) {
  char errorMessage[ERROR_BUFLEN] {};
  char* errMsg = ::strerror_r(errno, errorMessage,
			      sizeof(errorMessage));

  cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
}

int createSocket() {
  int sockFd = socket (AF_INET,
		       SOCK_SEQPACKET,
		       IPPROTO_SCTP);
  if (sockFd == -1) {
    printError("Creation of socket failed", __FUNCTION__);
    return -1;
  }
  
  // Enable address reuse
  int enable = 1;
  int err = setsockopt(sockFd,
		       SOL_SOCKET,
		       SO_REUSEADDR,
		       &enable,
		       sizeof(enable));
  
  if (err) {
    printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
    close(sockFd);
    return -1;
  }

  // Configure SCTP
  sctp_initmsg initmsg{};
  initmsg.sinit_num_ostreams = 3;
  initmsg.sinit_max_instreams = 3;
  initmsg.sinit_max_attempts = 2;
  initmsg.sinit_max_init_timeo = 0;

  err = setsockopt(sockFd,
		   IPPROTO_SCTP,
		   SCTP_INITMSG,
		   &initmsg,
		   sizeof(initmsg));

  if (err) {
    printError("Configuring SCTP socket failed", __FUNCTION__);
    close(sockFd);
    return -1;
  }

  struct sctp_paddrparams paddr_params{};
  memset(&paddr_params, 0, sizeof(paddr_params));
  socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
  paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;

  err = setsockopt(sockFd,
		   IPPROTO_SCTP,
		   SCTP_PEER_ADDR_PARAMS,
		   &paddr_params,
		   size_of_sctp_paddr_params);

  if (err) {
    printError("Configuring SCTP params failed", __FUNCTION__);
    close(sockFd);
    return -1;
  }

  return sockFd;
}

bool bindSocket(const int sockFd, const int localPort) {
  // Get IP of ethernet interface
  string localAddress = "";
  ifreq ifr{};
  ifr.ifr_addr.sa_family = AF_INET;
  strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
  const int ioctlStatus = ioctl(sockFd,
				SIOCGIFADDR,
				&ifr);

  if (ioctlStatus == -1) {
    printError("Failed to get local address", __FUNCTION__);
    return false;
  }

  char ipAddrBuffer[INET_ADDRSTRLEN] {};
  inet_ntop(AF_INET,
	    &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
	    ipAddrBuffer,
	    sizeof(ipAddrBuffer));

  localAddress.assign(ipAddrBuffer);

  // Bind to found ip address
  sockaddr_in serv_addr{};
  serv_addr.sin_family = AF_INET;
  inet_pton(AF_INET,
	    localAddress.c_str(),
	    &serv_addr.sin_addr);
  serv_addr.sin_port = htons(localPort);

  if (bind(sockFd,
	   reinterpret_cast<sockaddr*>(&serv_addr),
	   sizeof(serv_addr))) {    
    printError("Failed to bind socket to local address", __FUNCTION__);
    localAddress.clear();
    close(sockFd);
    return false;
  }

  cout << "Local endpoint successfully bound to local address: " << localAddress << endl;

  return true;
}

bool openAssociation(const int sockFd,
		     const string &remoteAddress,
		     std::uint16_t remotePort) {

  sockaddr_in address{};
  address.sin_family = AF_INET;
  inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
  address.sin_port = htons(remotePort);

  int connectError = connect(sockFd,
			     reinterpret_cast<sockaddr *>(&address),
			     sizeof(address));
  if (connectError) {
    printError("Error connecting association", __FUNCTION__);
    return false;
  }

  cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
  return true;
}

void sendReq(const int sockFd,
	     const string& remoteAddress,
	     const uint16_t remotePort,
	     const std::string& data)
{

  struct sockaddr_in remoteAddr {};
  remoteAddr.sin_family = AF_INET;
  remoteAddr.sin_port = htons(remotePort);

  uint32_t payloadProtId = 7;
  uint16_t streamId = 0;
  uint32_t dataLength = data.size();
  sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
  inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);   
  
  const std::string ipaddr =
    inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);

  cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
  cout << ", len=" << dataLength << endl;

  const int bytesSent = sctp_sendmsg(sockFd,
				     data.c_str(),
				     (size_t)dataLength,
				     servaddr,
				     sizeof(sockaddr_in),
				     htonl(payloadProtId),
				     SCTP_ADDR_OVER,
				     streamId,
				     200,
				     0);

  if (bytesSent == -1) {
    printError("SCTP send failed", __FUNCTION__);
  }

  return;
}

sctp_assoc_t getSocketAssociationId(const int sockFd,
				    const string &remoteIpAddress,
				    std::uint16_t remotePort)

{
  sockaddr_in socket_address_in{};

  socket_address_in.sin_family = AF_INET;
  socket_address_in.sin_port = htons(remotePort);
  inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);

  struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
  // Copy the whole sockaddr_in; sizeof(&socket_address) is only the size of a pointer.
  socklen_t salen = sizeof(socket_address_in);

  struct sctp_paddrinfo peer_address_info{};
  socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
  std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);

  const int sctpOptInfoError = sctp_opt_info(sockFd,
					     0,
					     SCTP_GET_PEER_ADDR_INFO,
					     &peer_address_info,
					     &size_of_sctp_paddrinfo);
  
  if (sctpOptInfoError) {
    printError("Failed to get association id", __FUNCTION__);
  }
  
  return peer_address_info.spinfo_assoc_id;
}

std::uint32_t getAssociationPathMtu(const int sockFd,
				    const string &remoteIpAddress,
				    const std::uint16_t remotePort) {
  sockaddr_in socket_address_in{};

  socket_address_in.sin_family = AF_INET;
  socket_address_in.sin_port = htons(remotePort);
  inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);

  struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
  // Copy the whole sockaddr_in; sizeof(&socket_address) is only the size of a pointer.
  socklen_t salen = sizeof(socket_address_in);

  struct sctp_paddrinfo peer_address_info{};
  socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
  std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);

  sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);

  const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
					     SCTP_GET_PEER_ADDR_INFO,
					     &peer_address_info, &size_of_sctp_paddrinfo);
  if (sctpOptInfoError) {
    printError("Failed to get pmtu", __FUNCTION__);
  }

  auto t = std::time(nullptr);
  auto tm = *std::localtime(&t);
  std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
  cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;

  return peer_address_info.spinfo_mtu;
}

void test1(const string& data) {
  int localPort = 2944;
  string remoteIp1 = "10.0.0.3";
  uint16_t remotePort1 = 8001;
  uint16_t remotePort2 = 8002;
  
  int sockFd = createSocket();
  bindSocket(sockFd, localPort);

  cout << "### Test 1: 2 assocs" << endl;
  
  openAssociation(sockFd, remoteIp1, remotePort1);
  openAssociation(sockFd, remoteIp1, remotePort2);

  getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
  getAssociationPathMtu(sockFd, remoteIp1, remotePort2);

  sendReq(sockFd, remoteIp1, remotePort1, data);
  for (int i = 0; i < 10; i++) {
    sleep(10);
    getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
    getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
  }
}

void test2(const string& data) {
  int localPort = 2944;
  string remoteIp1 = "10.0.0.3";
  uint16_t remotePort1 = 8001;
  uint16_t remotePort2 = 8002;
  string remoteIpFake = "10.52.96.204";
  uint16_t remotePortFake = 3239;
  
  int sockFd = createSocket();
  bindSocket(sockFd, localPort);

  cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
  
  openAssociation(sockFd, remoteIp1, remotePort1);
  openAssociation(sockFd, remoteIp1, remotePort2);
  openAssociation(sockFd, remoteIpFake, remotePortFake);

  getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
  getAssociationPathMtu(sockFd, remoteIp1, remotePort2);

  sendReq(sockFd, remoteIp1, remotePort1, data);
  for (int i = 0; i < 10; i++) {
    sleep(10);
    getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
    getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
  }
}


int main(int argc, char** argv) {
  string testNr = "1";
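  // A second argument selects the 100-byte payload; binding the reference
  // directly avoids overwriting data1000 through it.
  const string& testData = (argc >= 3) ? data100 : data1000;
  if (argc >= 2) {
    testNr = argv[1];
  }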

  if (testNr == "1") {
    test1(testData);
  } else {
    test2(testData);
  }
  
  return 0;
}
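
Both lookup functions above copy a socket address into spinfo_address before
calling sctp_opt_info, so the number of bytes copied matters. A small helper,
sketched here under assumptions (the name fillPeerAddress is hypothetical and
only IPv4 is handled), makes that size explicit by always copying a full
sockaddr_in:

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: writes an IPv4 address and port into a
// sockaddr_storage, which is the type of sctp_paddrinfo::spinfo_address.
// Returns false if `ip` is not a valid dotted-quad IPv4 address.
bool fillPeerAddress(sockaddr_storage& dst,
                     const std::string& ip,
                     std::uint16_t port) {
  sockaddr_in in{};
  in.sin_family = AF_INET;
  in.sin_port = htons(port);
  if (inet_pton(AF_INET, ip.c_str(), &in.sin_addr) != 1) {
    return false;
  }
  std::memset(&dst, 0, sizeof(dst));
  // Size of the structure itself, not of a pointer to it.
  std::memcpy(&dst, &in, sizeof(in));
  return true;
}
```

A caller would pass peer_address_info.spinfo_address as the first argument
and check the return value before issuing SCTP_GET_PEER_ADDR_INFO.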


* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
@ 2017-09-19 12:14 ` Peter Salin
  2017-09-19 17:09 ` Neil Horman
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Peter Salin @ 2017-09-19 12:14 UTC
  To: linux-sctp

2017-09-11 15:41 GMT+03:00 Peter Salin <peter.salin@gmail.com>:
> Hi,
>
> I encountered some strange PMTUD related behaviour that I need help in
> understanding.
>
> Setup:
>
> +-----------+        +---+        +--------+
> | 10.0.0.10 |--------| X |--------|10.0.0.3|
> +-----------+        +---+        +--------+
>
> A one-to-many socket is set up at 10.0.0.10. Two instances of the lksctp
> sctp_darn application are run at 10.0.0.3, listening on ports 8001 and 8002.
> 10.0.0.3 was also set up to generate ICMP frag needed messages for incoming
> messages over 600 bytes. The same issue also occurs when a router on
> the path is set up to generate the ICMP message instead.
>
> Test 1:
> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to port 8001
> and another one to 8002. Then a too large message was sent on the
> association to 8001, triggering ICMP generation. When checking the MTU
> reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the association now
> reports 600. The association to 8002 reports 1500 until traffic is sent on
> it, at which point it also adjusts to 600 which I think makes sense since
> the destination IP is the same. When reopening the associations, the value
> of 600 would be remembered for about 10 min, which I also think makes sense
> since net.ipv4.route.mtu_expires is 600.
>
> Test 2:
> Again the same two associations were connected to 10.0.0.3, but in addition
> an attempt was made to connect a third association to a non-existent IP;
> this attempt fails with a timeout after a while. After that, an
> ICMP-triggering large message was again sent to 8001. Now the behaviour is
> different from before. The association to 8001 reports a spinfo_mtu of 600,
> but only for a brief moment; it does not stay at 600 for 10 minutes. In
> addition, the spinfo_mtu of the association to 8002 never changes; it stays
> at the original 1500.
>
> The only difference between the two tests is the attempt to connect to a
> non-responding IP at the beginning of test 2. Any ideas why the behaviour
> changes? Is this a bug, or is there some other reason for it?
>
> I have attached the sample application used for reproducing this.

Hi,

Any input on this? Is this the right forum for this?

BR,
-Peter


* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
  2017-09-19 12:14 ` Peter Salin
@ 2017-09-19 17:09 ` Neil Horman
  2017-09-20 11:02 ` Peter Salin
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2017-09-19 17:09 UTC
  To: linux-sctp

On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
> Hi,
> 
> I encountered some strange PMTUD related behaviour that I need help in
> understanding.
> 
> Setup:
> 
> +-----------+        +---+        +--------+
> | 10.0.0.10 |--------| X |--------|10.0.0.3|
> +-----------+        +---+        +--------+
> 
> A one-to-many socket is set up at 10.0.0.10. Two instances of the
> lksctp sctp_darn application are run at 10.0.0.3, listening on ports
> 8001 and 8002. 10.0.0.3 was also set up to generate ICMP frag needed
> messages for incoming messages over 600 bytes. The same issue also
> occurs when a router on the path is set up to generate the ICMP
> message instead.
> 
> Test 1:
> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
> port 8001 and another one to 8002. Then a too large message was sent
> on the association to 8001, triggering ICMP generation. When checking
> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
> association now reports 600. The association to 8002 reports 1500
> until traffic is sent on it, at which point it also adjusts to 600
> which I think makes sense since the destination IP is the same. When
> reopening the associations, the value of 600 would be remembered for
> about 10 min, which I also think makes sense since
> net.ipv4.route.mtu_expires is 600.
> 
> Test 2:
> Again the same two associations were connected to 10.0.0.3, but in
> addition an attempt was made to connect a third association to a
> non-existent IP; this attempt fails with a timeout after a while. After
> that, an ICMP-triggering large message was again sent to 8001. Now the
> behaviour is different from before. The association to 8001 reports a
> spinfo_mtu of 600, but only for a brief moment; it does not stay at
> 600 for 10 minutes. In addition, the spinfo_mtu of the association to
> 8002 never changes; it stays at the original 1500.
> 
> The only difference between the two tests is the attempt to connect to
> a non-responding IP at the beginning of test 2. Any ideas why the
> behaviour changes? Is this a bug, or is there some other reason for
> it?
> 
> I have attached the sample application used for reproducing this.
> 
> BR,
> -Peter
> 
Hey, apologies for the delay on this. I've had it in my reader for days and kept
meaning to respond, but kept getting sidetracked.

At first glance, this sounds incorrect.  Each association (or rather each
transport) maintains its own MTU, and the association reflects the MTU of the
active transport. Given that each transport holds its own dst cache entry, I
have a hard time seeing how one transport's MTU changes might leak to another.

But that's not really what's happening here.  By your description, the active
transport on the established association isn't updating its pathmtu, which
should happen in response to receiving the ICMP_FRAG_NEEDED message.
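
The relationship described above can be written down as a toy model. This is
only an illustration under assumptions: the type and member names (Transport,
Association, onFragNeeded) are invented here and are not the kernel's
structures, and the shared route/dst cache is deliberately left out:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for one path (peer address) of an association.
struct Transport {
  std::uint32_t pathmtu;
};

// Toy stand-in for an association with one or more transports.
struct Association {
  std::vector<Transport> transports;
  std::size_t active = 0;

  // The association-level value mirrors the active transport's PMTU.
  std::uint32_t pmtu() const { return transports.at(active).pathmtu; }

  // Models the reaction to an ICMP "frag needed" for one transport.
  // Assumption: the value is only ever lowered here, never raised.
  void onFragNeeded(std::size_t idx, std::uint32_t reportedMtu) {
    Transport& t = transports.at(idx);
    if (reportedMtu < t.pathmtu) {
      t.pathmtu = reportedMtu;
    }
  }
};
```

In this model, lowering the transport of one association leaves every other
association at its previous value, which is the expectation stated above for
the established association to port 8001.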

I know you've provided the reproducer below, and I appreciate that, but I don't
have the cycles to set this up at the moment.  Could you tell me if, during the
second test, after you attempt to connect to the fake ip address and then send
the large message that should trigger the frag needed message, does said large
message get retransmitted and eventually arrive at the peer host?  If so, that
suggests that the sctp stack:

a) receives the frag needed message
and
b) resends the packet at the lower frag point

That in turn suggests we just have some internal reporting error in which we
don't update the association's pmtu with the active transport's.

Let me know the answer to that question and it will give me some places to start
looking.
Neil
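
For anyone checking a packet capture against point (b), the expected chunk
sizes can be estimated with a little arithmetic. This is a rough sketch under
stated assumptions: IPv4 without options (20 bytes), the 12-byte SCTP common
header, one 16-byte DATA chunk header per packet, no other bundled chunks, and
payload rounded down to a multiple of 4. The function names are hypothetical
and the kernel's own calculation may differ in detail:

```cpp
#include <cstdint>

constexpr std::uint32_t kIpv4Header = 20;        // IPv4 header, no options
constexpr std::uint32_t kSctpCommonHeader = 12;  // SCTP common header
constexpr std::uint32_t kDataChunkHeader = 16;   // DATA chunk header
constexpr std::uint32_t kOverhead =
    kIpv4Header + kSctpCommonHeader + kDataChunkHeader;

// Largest user payload that fits in a single DATA chunk for a given path
// MTU, rounded down to a multiple of 4; 0 if the headers alone do not fit.
constexpr std::uint32_t maxPayload(std::uint32_t pmtu) {
  return pmtu > kOverhead ? ((pmtu - kOverhead) & ~3u) : 0;
}

// Number of DATA chunks needed for a message of msgLen bytes,
// or 0 if nothing can be sent at this path MTU.
constexpr std::uint32_t fragmentCount(std::uint32_t msgLen,
                                      std::uint32_t pmtu) {
  return maxPayload(pmtu) == 0
             ? 0
             : (msgLen + maxPayload(pmtu) - 1) / maxPayload(pmtu);
}
```

Under these assumptions a PMTU of 600 leaves 552 bytes of payload per chunk,
so the 1000-byte test message would be retransmitted as two DATA chunks,
whereas at 1500 it fits in one.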

> ------ ver_linux output ------
> Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
> 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> 
> GNU C                   5.4.0
> GNU Make                4.1
> Binutils                2.26.1
> Util-linux              2.27.1
> Mount                   2.27.1
> Module-init-tools       22
> E2fsprogs               1.42.13
> Xfsprogs                4.3.0
> Linux C Library         2.23
> Dynamic linker (ldd)    2.23
> Linux C++ Library       6.0.21
> Procps                  3.3.10
> Net-tools               1.60
> Kbd                     1.15.5
> Console-tools           1.15.5
> Sh-utils                8.25
> Udev                    229
> Modules Loaded          ablk_helper aes_x86_64 aesni_intel
> async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
> binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
> gf128mul ghash_clmulni_intel glue_helper hid hid_generic ib_addr
> ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
> iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
> multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
> raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor

> 
> #include <cstdint>
> #include <cstring>
> #include <ctime>
> #include <iomanip>
> #include <iostream>
> #include <string>
> 
> #include <errno.h>
> #include <unistd.h>
> #include <arpa/inet.h>
> #include <net/if.h>
> #include <netinet/in.h>
> #include <netinet/sctp.h>
> #include <sys/ioctl.h>
> #include <sys/socket.h>
> 
> using namespace std;
> 
> static const int ERROR_BUFLEN = 64;
> static const char* SCTP_INTERFACE_NAME = "ens4";
> 
> static string data100 = "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789";
> static string data1000 = "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789"
>   "01234567890123456789012345678901234567890123456789";
> 
> void printError(const string& msg, const string& funcName) {
>   char errorMessage[ERROR_BUFLEN] {};
>   char* errMsg = ::strerror_r(errno, errorMessage,
> 			      sizeof(errorMessage));
> 
>   cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
> }
> 
> int createSocket() {
>   int sockFd = socket (AF_INET,
> 		       SOCK_SEQPACKET,
> 		       IPPROTO_SCTP);
>   if (sockFd == -1) {
>     printError("Creation of socket failed", __FUNCTION__);
>     return -1;
>   }
>   
>   // Enable address reuse
>   int enable = 1;
>   int err = setsockopt(sockFd,
> 		       SOL_SOCKET,
> 		       SO_REUSEADDR,
> 		       &enable,
> 		       sizeof(enable));
>   
>   if (err) {
>     printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
>     close(sockFd);
>     return -1;
>   }
> 
>   // Configure SCTP
>   sctp_initmsg initmsg{};
>   initmsg.sinit_num_ostreams = 3;
>   initmsg.sinit_max_instreams = 3;
>   initmsg.sinit_max_attempts = 2;
>   initmsg.sinit_max_init_timeo = 0;
> 
>   err = setsockopt(sockFd,
> 		   IPPROTO_SCTP,
> 		   SCTP_INITMSG,
> 		   &initmsg,
> 		   sizeof(initmsg));
> 
>   if (err) {
>     printError("Configuring SCTP socket failed", __FUNCTION__);
>     close(sockFd);
>     return -1;
>   }
> 
>   struct sctp_paddrparams paddr_params{};
>   memset(&paddr_params, 0, sizeof(paddr_params));
>   socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
>   paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;
> 
>   err = setsockopt(sockFd,
> 		   IPPROTO_SCTP,
> 		   SCTP_PEER_ADDR_PARAMS,
> 		   &paddr_params,
> 		   size_of_sctp_paddr_params);
> 
>   if (err) {
>     printError("Configuring SCTP params failed", __FUNCTION__);
>     close(sockFd);
>     return -1;
>   }
> 
>   return sockFd;
> }
> 
> bool bindSocket(const int sockFd, const int localPort) {
>   // Get IP of ethernet interface
>   string localAddress = "";
>   ifreq ifr{};
>   ifr.ifr_addr.sa_family = AF_INET;
>   strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
>   const int ioctlStatus = ioctl(sockFd,
> 				SIOCGIFADDR,
> 				&ifr);
> 
>   if (ioctlStatus == -1) {
>     printError("Failed to get local address", __FUNCTION__);
>     return false;
>   }
> 
>   char ipAddrBuffer[INET_ADDRSTRLEN] {};
>   inet_ntop(AF_INET,
> 	    &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
> 	    ipAddrBuffer,
> 	    sizeof(ipAddrBuffer));
> 
>   localAddress.assign(ipAddrBuffer);
> 
>   // Bind to found ip address
>   sockaddr_in serv_addr{};
>   serv_addr.sin_family = AF_INET;
>   inet_pton(AF_INET,
> 	    localAddress.c_str(),
> 	    &serv_addr.sin_addr);
>   serv_addr.sin_port = htons(localPort);
> 
>   if (bind(sockFd,
> 	   reinterpret_cast<sockaddr*>(&serv_addr),
> 	   sizeof(serv_addr))) {    
>     printError("Failed to bind socket to local address", __FUNCTION__);
>     localAddress.clear();
>     close(sockFd);
>     return false;
>   }
> 
>   cout << "Local endpoint successfully bound to local address: " << localAddress << endl;
> 
>   return true;
> }
> 
> bool openAssociation(const int sockFd,
> 		     const string &remoteAddress,
> 		     std::uint16_t remotePort) {
> 
>   sockaddr_in address{};
>   address.sin_family = AF_INET;
>   inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
>   address.sin_port = htons(remotePort);
> 
>   int connectError = connect(sockFd,
> 			     reinterpret_cast<sockaddr *>(&address),
> 			     sizeof(address));
>   if (connectError) {
>     printError("Error connecting association", __FUNCTION__);
>     return false;
>   }
> 
>   cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
>   return true;
> }
> 
> void sendReq(const int sockFd,
> 	     const string& remoteAddress,
> 	     const uint16_t remotePort,
> 	     const std::string& data)
> {
> 
>   struct sockaddr_in remoteAddr {};
>   remoteAddr.sin_family = AF_INET;
>   remoteAddr.sin_port = htons(remotePort);
> 
>   uint32_t payloadProtId = 7;
>   uint16_t streamId = 0;
>   uint32_t dataLength = data.size();
>   sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
>   inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);   
>   
>   const std::string ipaddr =
>     inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);
> 
>   cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
>   cout << ", len=" << dataLength << endl;
> 
>   const int bytesSent = sctp_sendmsg(sockFd,
> 				     data.c_str(),
> 				     (size_t)dataLength,
> 				     servaddr,
> 				     sizeof(sockaddr_in),
> 				     htonl(payloadProtId),
> 				     SCTP_ADDR_OVER,
> 				     streamId,
> 				     200,
> 				     0);
> 
>   if (bytesSent == -1) {
>     printError("SCTP send failed", __FUNCTION__);
>   }
> 
>   return;
> }
> 
> sctp_assoc_t getSocketAssociationId(const int sockFd,
> 				    const string &remoteIpAddress,
> 				    std::uint16_t remotePort)
> 
> {
>   sockaddr_in socket_address_in{};
> 
>   socket_address_in.sin_family = AF_INET;
>   socket_address_in.sin_port = htons(remotePort);
>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> 
>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>   // Copy the whole sockaddr_in; sizeof(&socket_address) is only the size of a pointer.
>   socklen_t salen = sizeof(socket_address_in);
> 
>   struct sctp_paddrinfo peer_address_info{};
>   socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> 
>   const int sctpOptInfoError = sctp_opt_info(sockFd,
> 					     0,
> 					     SCTP_GET_PEER_ADDR_INFO,
> 					     &peer_address_info,
> 					     &size_of_sctp_paddrinfo);
>   
>   if (sctpOptInfoError) {
>     printError("Failed to get association id", __FUNCTION__);
>   }
>   
>   return peer_address_info.spinfo_assoc_id;
> }
> 
> std::uint32_t getAssociationPathMtu(const int sockFd,
> 				    const string &remoteIpAddress,
> 				    const std::uint16_t remotePort) {
>   sockaddr_in socket_address_in{};
> 
>   socket_address_in.sin_family = AF_INET;
>   socket_address_in.sin_port = htons(remotePort);
>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> 
>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>   // Copy the whole sockaddr_in; sizeof(&socket_address) is only the size of a pointer.
>   socklen_t salen = sizeof(socket_address_in);
> 
>   struct sctp_paddrinfo peer_address_info{};
>   socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> 
>   sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);
> 
>   const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
> 					     SCTP_GET_PEER_ADDR_INFO,
> 					     &peer_address_info, &size_of_sctp_paddrinfo);
>   if (sctpOptInfoError) {
>     printError("Failed to get pmtu", __FUNCTION__);
>   }
> 
>   auto t = std::time(nullptr);
>   auto tm = *std::localtime(&t);
>   std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
>   cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;
> 
>   return peer_address_info.spinfo_mtu;
> }
> 
> void test1(const string& data) {
>   int localPort = 2944;
>   string remoteIp1 = "10.0.0.3";
>   uint16_t remotePort1 = 8001;
>   uint16_t remotePort2 = 8002;
>   
>   int sockFd = createSocket();
>   bindSocket(sockFd, localPort);
> 
>   cout << "### Test 1: 2 assocs" << endl;
>   
>   openAssociation(sockFd, remoteIp1, remotePort1);
>   openAssociation(sockFd, remoteIp1, remotePort2);
> 
>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> 
>   sendReq(sockFd, remoteIp1, remotePort1, data);
>   for (int i = 0; i < 10; i++) {
>     sleep(10);
>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>   }
> }
> 
> void test2(const string& data) {
>   int localPort = 2944;
>   string remoteIp1 = "10.0.0.3";
>   uint16_t remotePort1 = 8001;
>   uint16_t remotePort2 = 8002;
>   string remoteIpFake = "10.52.96.204";
>   uint16_t remotePortFake = 3239;
>   
>   int sockFd = createSocket();
>   bindSocket(sockFd, localPort);
> 
>   cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
>   
>   openAssociation(sockFd, remoteIp1, remotePort1);
>   openAssociation(sockFd, remoteIp1, remotePort2);
>   openAssociation(sockFd, remoteIpFake, remotePortFake);
> 
>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> 
>   sendReq(sockFd, remoteIp1, remotePort1, data);
>   for (int i = 0; i < 10; i++) {
>     sleep(10);
>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>   }
> }
> 
> 
> int main(int argc, char** argv) {
>   string testNr = "1";
>   string& testData = data1000;
>   if (argc >= 2) {
>     testNr = argv[1];
>   }
>   if (argc >= 3) {
>     testData = data100;
>   }
> 
>   if (testNr == "1") {
>     test1(testData);
>   } else {
>     test2(testData);
>   }
>   
>   return 0;
> }



* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
  2017-09-19 12:14 ` Peter Salin
  2017-09-19 17:09 ` Neil Horman
@ 2017-09-20 11:02 ` Peter Salin
  2017-09-21 11:01 ` Neil Horman
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Peter Salin @ 2017-09-20 11:02 UTC (permalink / raw)
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 18214 bytes --]

2017-09-19 20:09 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
>> Hi,
>>
>> I encountered some strange PMTUD related behaviour that I need help in
>> understanding.
>>
>> Setup:
>>
>> +-----------+        +---+        +--------+
>> | 10.0.0.10 |--------| X |--------|10.0.0.3|
>> +-----------+        +---+        +--------+
>>
>> A one-to-many socket is set up at 10.0.0.10. Two instances of the
>> lksctp sctp_darn application are run at 10.0.0.3, listening on ports
>> 8001 and 8002. 10.0.0.3 was also set up to generate ICMP frag needed
>> messages for incoming messages over 600 bytes. The same issue also
>> occurs when a router on the path was set up to generate the ICMP
>> message instead.
>>
>> Test 1:
>> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
>> port 8001 and another one to 8002. Then a too large message was sent
>> on the association to 8001, triggering ICMP generation. When checking
>> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
>> association now reports 600. The association to 8002 reports 1500
>> until traffic is sent on it, at which point it also adjusts to 600
>> which I think makes sense since the destination IP is the same. When
>> reopening the associations, the value of 600 would be remembered for
>> about 10 min, which I also think makes sense since
>> net.ipv4.route.mtu_expires is 600.
>>
>> Test 2:
>> Again the same two associations were connected to 10.0.0.3, but in
>> addition an attempt to connect a third association to a non-existing
>> IP was done, this attempt fails with timeout after a while. After
>> that, again an ICMP triggering large message was sent to 8001. Now the
>> behaviour is different from before. The association to 8001 reports a
>> spinfo_mtu of 600, but only for a brief moment, it does not stay at
>> 600 for 10 minutes. In addition the spinfo_mtu of the association to
>> 8002 never changes, it stays at the original 1500.
>>
>> The only difference between the two tests is the attempt to connect to
>> a non-responding IP at the beginning of test 2. Any ideas why the
>> behaviour changes, is this a bug or is there some other reason for
>> this?
>>
>> I have attached the sample application used for reproducing this.
>>
>> BR,
>> -Peter
>>
> Hey, apologies for the delay on this, I've had it in my reader for days and kept
> meaning to respond, but kept getting sidetracked.
>
> First glance, this sounds incorrect.  Each association (or rather each
> transport) maintains its own mtu, and the association reflects the mtu of the
> active transport. Given that each transport holds its own dst cache entry, I
> have a hard time seeing how one transport's mtu changes might leak to another.
>
> But that's not really what's happening here.  By your description, the active
> transport on the established association isn't updating its pathmtu, which
> should happen in response to receiving the ICMP_FRAG_NEEDED message.
>
> I know you've provided the reproducer below, and I appreciate that, but I don't
> have the cycles to set this up at the moment.  Could you tell me if, during the
> second test, after you attempt to connect to the fake ip address and then send
> the large message that should trigger the frag needed message, does said large
> message get retransmitted and eventually arrive at the peer host?  If so, that
> suggests that the sctp stack:
>
> a) receives the frag needed message
> and
> b) resends the packet at the lower frag point
>
> That in turn suggests we just have some internal reporting error in which we
> don't update the association's pmtu with the active transport's.
>
> Let me know the answer to that question and it will give me some places to start
> looking
> Neil
>
Thanks for responding. In response to your question, the first large
message does get retransmitted without the Don't Fragment bit set. I
modified the test a bit to also send further messages after the first
one. Those messages are indeed fragmented according to the limit of
the ICMP message. I have attached a PCAP trace and SCTP debug logs in
case that helps here.
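
As a rough cross-check of the fragmentation seen in the trace, the sketch
below estimates how many DATA chunks the 1000-byte test message needs at a
given path MTU. It assumes IPv4 without options (20 bytes), the 12-byte SCTP
common header and the 16-byte DATA chunk header, one DATA chunk per packet,
and it ignores any rounding the kernel may apply. The function names are
illustrative and are not taken from the reproducer.

```cpp
#include <cstddef>

// Assumed header sizes for SCTP over IPv4 without IP options.
constexpr std::size_t kIpv4Header = 20;
constexpr std::size_t kSctpCommonHeader = 12;
constexpr std::size_t kDataChunkHeader = 16;

// Largest user payload that fits in one DATA chunk at the given path MTU.
constexpr std::size_t maxPayloadPerPacket(std::size_t pmtu) {
  return pmtu - kIpv4Header - kSctpCommonHeader - kDataChunkHeader;
}

// Number of DATA chunks needed for a message of `len` bytes (ceiling division).
constexpr std::size_t fragmentCount(std::size_t len, std::size_t pmtu) {
  return (len + maxPayloadPerPacket(pmtu) - 1) / maxPayloadPerPacket(pmtu);
}

// Compile-time checks for the two MTU values discussed in this thread.
static_assert(maxPayloadPerPacket(600) == 552, "payload per packet at PMTU 600");
static_assert(fragmentCount(1000, 600) == 2, "1000 bytes split in two at PMTU 600");
static_assert(fragmentCount(1000, 1500) == 1, "1000 bytes fit in one chunk at PMTU 1500");
```

Under these assumptions a PMTU of 600 leaves 552 bytes of payload per packet,
so the 1000-byte message is split into two chunks, while at 1500 it fits in
one.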

I also tried sending a large message on the other association after
the large message on the first association had been sent. For test 2
that message was not fragmented even though the ICMP was already
received for the first assoc. After the second assoc also received an
ICMP it adjusted to use the lower MTU for subsequent messages. In the
case of test 1, sending a large message on the second assoc would auto
fragment already on the first message.

Also, after stopping and rerunning test 2 the MTU would always be
reset at 1500, whereas in test 1 the lower limit would still be in
effect for a new run. So it seems like in test 2 the lower MTU is only
known within each association, whereas in test 1 the lower MTU also
gets stored deeper down?

BR,
-Peter

>> ------ ver_linux output ------
>> Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
>> 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>> GNU C                   5.4.0
>> GNU Make                4.1
>> Binutils                2.26.1
>> Util-linux              2.27.1
>> Mount                   2.27.1
>> Module-init-tools       22
>> E2fsprogs               1.42.13
>> Xfsprogs                4.3.0
>> Linux C Library         2.23
>> Dynamic linker (ldd)    2.23
>> Linux C++ Library       6.0.21
>> Procps                  3.3.10
>> Net-tools               1.60
>> Kbd                     1.15.5
>> Console-tools           1.15.5
>> Sh-utils                8.25
>> Udev                    229
>> Modules Loaded          ablk_helper aes_x86_64 aesni_intel
>> async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
>> binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
>> gf128mul ghash_clmul ni_intel glue_helper hid hid_generic ib_addr
>> ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
>> iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
>> multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
>> raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor
>
>>
>> #include <cstring>
>> #include <ctime>
>> #include <iomanip>
>> #include <iostream>
>>
>> #include <errno.h>
>> #include <unistd.h>
>> #include <arpa/inet.h>
>> #include <net/if.h>
>> #include <netinet/in.h>
>> #include <netinet/sctp.h>
>> #include <sys/ioctl.h>
>> #include <sys/socket.h>
>>
>> using namespace std;
>>
>> static const int ERROR_BUFLEN = 64;
>> static const char* SCTP_INTERFACE_NAME = "ens4";
>>
>> static string data100 = "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789";
>> static string data1000 = "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789"
>>   "01234567890123456789012345678901234567890123456789";
>>
>> void printError(const string& msg, const string& funcName) {
>>   char errorMessage[ERROR_BUFLEN] {};
>>   char* errMsg = ::strerror_r(errno, errorMessage,
>>                             sizeof(errorMessage));
>>
>>   cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
>> }
>>
>> int createSocket() {
>>   int sockFd = socket (AF_INET,
>>                      SOCK_SEQPACKET,
>>                      IPPROTO_SCTP);
>>   if (sockFd == -1) {
>>     printError("Creation of socket failed", __FUNCTION__);
>>     return -1;
>>   }
>>
>>   // Enable address reuse
>>   int enable = 1;
>>   int err = setsockopt(sockFd,
>>                      SOL_SOCKET,
>>                      SO_REUSEADDR,
>>                      &enable,
>>                      sizeof(enable));
>>
>>   if (err) {
>>     printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
>>     close(sockFd);
>>     return -1;
>>   }
>>
>>   // Configure SCTP
>>   sctp_initmsg initmsg{};
>>   initmsg.sinit_num_ostreams = 3;
>>   initmsg.sinit_max_instreams = 3;
>>   initmsg.sinit_max_attempts = 2;
>>   initmsg.sinit_max_init_timeo = 0;
>>
>>   err = setsockopt(sockFd,
>>                  IPPROTO_SCTP,
>>                  SCTP_INITMSG,
>>                  &initmsg,
>>                  sizeof(initmsg));
>>
>>   if (err) {
>>     printError("Configuring SCTP socket failed", __FUNCTION__);
>>     close(sockFd);
>>     return -1;
>>   }
>>
>>   struct sctp_paddrparams paddr_params{};
>>   memset(&paddr_params, 0, sizeof(paddr_params));
>>   socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
>>   paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;
>>
>>   err = setsockopt(sockFd,
>>                  IPPROTO_SCTP,
>>                  SCTP_PEER_ADDR_PARAMS,
>>                  &paddr_params,
>>                  size_of_sctp_paddr_params);
>>
>>   if (err) {
>>     printError("Configuring SCTP params failed", __FUNCTION__);
>>     close(sockFd);
>>     return -1;
>>   }
>>
>>   return sockFd;
>> }
>>
>> bool bindSocket(const int sockFd, const int localPort) {
>>   // Get IP of ethernet interface
>>   string localAddress = "";
>>   ifreq ifr{};
>>   ifr.ifr_addr.sa_family = AF_INET;
>>   strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
>>   const int ioctlStatus = ioctl(sockFd,
>>                               SIOCGIFADDR,
>>                               &ifr);
>>
>>   if (ioctlStatus == -1) {
>>     printError("Failed to get local address", __FUNCTION__);
>>     return false;
>>   }
>>
>>   char ipAddrBuffer[INET_ADDRSTRLEN] {};
>>   inet_ntop(AF_INET,
>>           &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
>>           ipAddrBuffer,
>>           sizeof(ipAddrBuffer));
>>
>>   localAddress.assign(ipAddrBuffer);
>>
>>   // Bind to found ip address
>>   sockaddr_in serv_addr{};
>>   serv_addr.sin_family = AF_INET;
>>   inet_pton(AF_INET,
>>           localAddress.c_str(),
>>           &serv_addr.sin_addr);
>>   serv_addr.sin_port = htons(localPort);
>>
>>   if (bind(sockFd,
>>          reinterpret_cast<sockaddr*>(&serv_addr),
>>          sizeof(serv_addr))) {
>>     printError("Failed to bind socket to local address", __FUNCTION__);
>>     localAddress.clear();
>>     close(sockFd);
>>     return false;
>>   }
>>
>>   cout << "Local endpoint successfully bound to local address: " << localAddress << endl;
>>
>>   return true;
>> }
>>
>> bool openAssociation(const int sockFd,
>>                    const string &remoteAddress,
>>                    std::uint16_t remotePort) {
>>
>>   sockaddr_in address{};
>>   address.sin_family = AF_INET;
>>   inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
>>   address.sin_port = htons(remotePort);
>>
>>   int connectError = connect(sockFd,
>>                            reinterpret_cast<sockaddr *>(&address),
>>                            sizeof(address));
>>   if (connectError) {
>>     printError("Error connecting association", __FUNCTION__);
>>     return false;
>>   }
>>
>>   cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
>>   return true;
>> }
>>
>> void sendReq(const int sockFd,
>>            const string& remoteAddress,
>>            const uint16_t remotePort,
>>            const std::string& data)
>> {
>>
>>   struct sockaddr_in remoteAddr {};
>>   remoteAddr.sin_family = AF_INET;
>>   remoteAddr.sin_port = htons(remotePort);
>>
>>   uint32_t payloadProtId = 7;
>>   uint16_t streamId = 0;
>>   uint32_t dataLength = data.size();
>>   sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
>>   inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);
>>
>>   const std::string ipaddr =
>>     inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);
>>
>>   cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
>>   cout << ", len=" << dataLength << endl;
>>
>>   const int bytesSent = sctp_sendmsg(sockFd,
>>                                    data.c_str(),
>>                                    (size_t)dataLength,
>>                                    servaddr,
>>                                    sizeof(sockaddr_in),
>>                                    htonl(payloadProtId),
>>                                    SCTP_ADDR_OVER,
>>                                    streamId,
>>                                    200,
>>                                    0);
>>
>>   if (bytesSent == -1) {
>>     printError("SCTP send failed", __FUNCTION__);
>>   }
>>
>>   return;
>> }
>>
>> sctp_assoc_t getSocketAssociationId(const int sockFd,
>>                                   const string &remoteIpAddress,
>>                                   std::uint16_t remotePort)
>>
>> {
>>   sockaddr_in socket_address_in{};
>>
>>   socket_address_in.sin_family = AF_INET;
>>   socket_address_in.sin_port = htons(remotePort);
>>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
>>
>>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>>   socklen_t salen = sizeof(&socket_address);
>>
>>   struct sctp_paddrinfo peer_address_info{};
>>   socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
>>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
>>
>>   const int sctpOptInfoError = sctp_opt_info(sockFd,
>>                                            0,
>>                                            SCTP_GET_PEER_ADDR_INFO,
>>                                            &peer_address_info,
>>                                            &size_of_sctp_paddrinfo);
>>
>>   if (sctpOptInfoError) {
>>     printError("Failed to get association id", __FUNCTION__);
>>   }
>>
>>   return peer_address_info.spinfo_assoc_id;
>> }
>>
>> std::uint32_t getAssociationPathMtu(const int sockFd,
>>                                   const string &remoteIpAddress,
>>                                   const std::uint16_t remotePort) {
>>   sockaddr_in socket_address_in{};
>>
>>   socket_address_in.sin_family = AF_INET;
>>   socket_address_in.sin_port = htons(remotePort);
>>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
>>
>>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>>   socklen_t salen = sizeof(&socket_address);
>>
>>   struct sctp_paddrinfo peer_address_info{};
>>   socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
>>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
>>
>>   sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);
>>
>>   const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
>>                                            SCTP_GET_PEER_ADDR_INFO,
>>                                            &peer_address_info, &size_of_sctp_paddrinfo);
>>   if (sctpOptInfoError) {
>>     printError("Failed to get pmtu", __FUNCTION__);
>>   }
>>
>>   auto t = std::time(nullptr);
>>   auto tm = *std::localtime(&t);
>>   std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
>>   cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;
>>
>>   return peer_address_info.spinfo_mtu;
>> }
>>
>> void test1(const string& data) {
>>   int localPort = 2944;
>>   string remoteIp1 = "10.0.0.3";
>>   uint16_t remotePort1 = 8001;
>>   uint16_t remotePort2 = 8002;
>>
>>   int sockFd = createSocket();
>>   bindSocket(sockFd, localPort);
>>
>>   cout << "### Test 1: 2 assocs" << endl;
>>
>>   openAssociation(sockFd, remoteIp1, remotePort1);
>>   openAssociation(sockFd, remoteIp1, remotePort2);
>>
>>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>>
>>   sendReq(sockFd, remoteIp1, remotePort1, data);
>>   for (int i = 0; i < 10; i++) {
>>     sleep(10);
>>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>>   }
>> }
>>
>> void test2(const string& data) {
>>   int localPort = 2944;
>>   string remoteIp1 = "10.0.0.3";
>>   uint16_t remotePort1 = 8001;
>>   uint16_t remotePort2 = 8002;
>>   string remoteIpFake = "10.52.96.204";
>>   uint16_t remotePortFake = 3239;
>>
>>   int sockFd = createSocket();
>>   bindSocket(sockFd, localPort);
>>
>>   cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
>>
>>   openAssociation(sockFd, remoteIp1, remotePort1);
>>   openAssociation(sockFd, remoteIp1, remotePort2);
>>   openAssociation(sockFd, remoteIpFake, remotePortFake);
>>
>>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>>
>>   sendReq(sockFd, remoteIp1, remotePort1, data);
>>   for (int i = 0; i < 10; i++) {
>>     sleep(10);
>>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>>   }
>> }
>>
>>
>> int main(int argc, char** argv) {
>>   string testNr = "1";
>>   string& testData = data1000;
>>   if (argc >= 2) {
>>     testNr = argv[1];
>>   }
>>   if (argc >= 3) {
>>     testData = data100;
>>   }
>>
>>   if (testNr == "1") {
>>     test1(testData);
>>   } else {
>>     test2(testData);
>>   }
>>
>>   return 0;
>> }
>
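
A side note on the quoted reproducer, independent of the PMTU question: it
computes salen as sizeof(&socket_address), which is the size of a pointer
(8 bytes on x86_64). That happens to cover sin_family, sin_port and sin_addr
of a sockaddr_in, so the lookup works, but only by coincidence. A minimal
sketch of a more explicit copy follows. fillPeerAddress is a hypothetical
helper, not part of the reproducer, and the sockaddr_storage target stands in
for the spinfo_address field.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: fills `out` with an IPv4 peer address.
// Returns false if `ip` is not a valid dotted-quad string.
bool fillPeerAddress(sockaddr_storage& out, const std::string& ip,
                     std::uint16_t port) {
  sockaddr_in in{};
  in.sin_family = AF_INET;
  in.sin_port = htons(port);
  if (inet_pton(AF_INET, ip.c_str(), &in.sin_addr) != 1) {
    return false;
  }
  std::memset(&out, 0, sizeof(out));
  // Copy the size of the structure itself, not the size of a pointer to it.
  std::memcpy(&out, &in, sizeof(in));
  return true;
}
```

With such a helper, the address field of a sctp_paddrinfo could be filled
before calling sctp_opt_info() without relying on the pointer size.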

[-- Attachment #2: test2_traces.tar.gz --]
[-- Type: application/x-gzip, Size: 6645 bytes --]


* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
                   ` (2 preceding siblings ...)
  2017-09-20 11:02 ` Peter Salin
@ 2017-09-21 11:01 ` Neil Horman
  2017-09-21 12:41 ` Peter Salin
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2017-09-21 11:01 UTC (permalink / raw)
  To: linux-sctp

On Wed, Sep 20, 2017 at 02:02:45PM +0300, Peter Salin wrote:
> 2017-09-19 20:09 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> > On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
> >> Hi,
> >>
> >> I encountered some strange PMTUD related behaviour that I need help in
> >> understanding.
> >>
> >> Setup:
> >>
> >> +-----------+        +---+        +--------+
> >> | 10.0.0.10 |--------| X |--------|10.0.0.3|
> >> +-----------+        +---+        +--------+
> >>
> >> A one-to-many socket is set up at 10.0.0.10. Two instances of the
> >> lksctp sctp_darn application are run at 10.0.0.3, listening on ports
> >> 8001 and 8002. 10.0.0.3 was also set up to generate ICMP frag needed
> >> messages for incoming messages over 600 bytes. The same issue also
> >> occurs when a router on the path was set up to generate the ICMP
> >> message instead.
> >>
> >> Test 1:
> >> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
> >> port 8001 and another one to 8002. Then a too large message was sent
> >> on the association to 8001, triggering ICMP generation. When checking
> >> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
> >> association now reports 600. The association to 8002 reports 1500
> >> until traffic is sent on it, at which point it also adjusts to 600
> >> which I think makes sense since the destination IP is the same. When
> >> reopening the associations, the value of 600 would be remembered for
> >> about 10 min, which I also think makes sense since
> >> net.ipv4.route.mtu_expires is 600.
> >>
> >> Test 2:
> >> Again the same two associations were connected to 10.0.0.3, but in
> >> addition an attempt to connect a third association to a non-existing
> >> IP was done, this attempt fails with timeout after a while. After
> >> that, again an ICMP triggering large message was sent to 8001. Now the
> >> behaviour is different from before. The association to 8001 reports a
> >> spinfo_mtu of 600, but only for a brief moment, it does not stay at
> >> 600 for 10 minutes. In addition the spinfo_mtu of the association to
> >> 8002 never changes, it stays at the original 1500.
> >>
> >> The only difference between the two tests is the attempt to connect to
> >> a non-responding IP at the beginning of test 2. Any ideas why the
> >> behaviour changes, is this a bug or is there some other reason for
> >> this?
> >>
> >> I have attached the sample application used for reproducing this.
> >>
> >> BR,
> >> -Peter
> >>
> > Hey, apologies for the delay on this, I've had it in my reader for days and kept
> > meaning to respond, but kept getting sidetracked.
> >
> > First glance, this sounds incorrect.  Each association (or rather each
> > transport) maintains its own mtu, and the association reflects the mtu of the
> > active transport. Given that each transport holds its own dst cache entry, I
> > have a hard time seeing how one transport's mtu changes might leak to another.
> >
> > But that's not really what's happening here.  By your description, the active
> > transport on the established association isn't updating its pathmtu, which
> > should happen in response to receiving the ICMP_FRAG_NEEDED message.
> >
> > I know you've provided the reproducer below, and I appreciate that, but I don't
> > have the cycles to set this up at the moment.  Could you tell me if, during the
> > second test, after you attempt to connect to the fake ip address and then send
> > the large message that should trigger the frag needed message, does said large
> > message get retransmitted and eventually arrive at the peer host?  If so, that
> > suggests that the sctp stack:
> >
> > a) receives the frag needed message
> > and
> > b) resends the packet at the lower frag point
> >
> > That in turn suggests we just have some internal reporting error in which we
> > don't update the association's pmtu with the active transport's.
> >
> > Let me know the answer to that question and it will give me some places to start
> > looking
> > Neil
> >
> Thanks for responding. In response to your question, the first large
> message does get retransmitted without the Don't Fragment bit set. I
> modified the test a bit to also send further messages after the first
> one. Those messages are indeed fragmented according to the limit of
> the ICMP message. I have attached a PCAP trace and SCTP debug logs in
> case that helps here.
> 
> I also tried sending a large message on the other association after
> the large message on the first association had been sent. For test 2
> that message was not fragmented even though the ICMP was already
> received for the first assoc. After the second assoc also received an
> ICMP it adjusted to use the lower MTU for subsequent messages. In the
> case of test 1, sending a large message on the second assoc would auto
> fragment already on the first message.
> 
> Also, after stopping and rerunning test 2 the MTU would always be
> reset at 1500, whereas in test 1 the lower limit would still be in
> effect for a new run. So it seems like in test 2 the lower MTU is only
> known within each association, whereas in test 1 the lower MTU also
> gets stored deeper down?
> 
> BR,
> -Peter
> 
So, from what I can see, your included tcpdump only shows the first part of what
you are describing.  That is to say that it sends a large data chunk on an
association that gets an ICMP frag needed response, after which the pmtu is
lowered and smaller message fragments are sent, which is good (i.e. working as
designed).

I don't see anything in the tcpdump relating to the remainder of your test,
showing failed fragmentation.  Can you include that please?

Neil

> >> ------ ver_linux output ------
> >> Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
> >> 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> GNU C                   5.4.0
> >> GNU Make                4.1
> >> Binutils                2.26.1
> >> Util-linux              2.27.1
> >> Mount                   2.27.1
> >> Module-init-tools       22
> >> E2fsprogs               1.42.13
> >> Xfsprogs                4.3.0
> >> Linux C Library         2.23
> >> Dynamic linker (ldd)    2.23
> >> Linux C++ Library       6.0.21
> >> Procps                  3.3.10
> >> Net-tools               1.60
> >> Kbd                     1.15.5
> >> Console-tools           1.15.5
> >> Sh-utils                8.25
> >> Udev                    229
> >> Modules Loaded          ablk_helper aes_x86_64 aesni_intel
> >> async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
> >> binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
> >> gf128mul ghash_clmul ni_intel glue_helper hid hid_generic ib_addr
> >> ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
> >> iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
> >> multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
> >> raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor
> >
> >>
> >> #include <cstring>
> >> #include <ctime>
> >> #include <iomanip>
> >> #include <iostream>
> >>
> >> #include <errno.h>
> >> #include <unistd.h>
> >> #include <arpa/inet.h>
> >> #include <net/if.h>
> >> #include <netinet/in.h>
> >> #include <netinet/sctp.h>
> >> #include <sys/ioctl.h>
> >> #include <sys/socket.h>
> >>
> >> using namespace std;
> >>
> >> static const int ERROR_BUFLEN = 64;
> >> static const char* SCTP_INTERFACE_NAME = "ens4";
> >>
> >> static string data100 = "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789";
> >> static string data1000 = "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789"
> >>   "01234567890123456789012345678901234567890123456789";
> >>
> >> void printError(const string& msg, const string& funcName) {
> >>   char errorMessage[ERROR_BUFLEN] {};
> >>   char* errMsg = ::strerror_r(errno, errorMessage,
> >>                             sizeof(errorMessage));
> >>
> >>   cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
> >> }
> >>
> >> int createSocket() {
> >>   int sockFd = socket (AF_INET,
> >>                      SOCK_SEQPACKET,
> >>                      IPPROTO_SCTP);
> > >>   if (sockFd == -1) {
> >>     printError("Creation of socket failed", __FUNCTION__);
> >>     return -1;
> >>   }
> >>
> >>   // Enable address reuse
> >>   int enable = 1;
> >>   int err = setsockopt(sockFd,
> >>                      SOL_SOCKET,
> >>                      SO_REUSEADDR,
> >>                      &enable,
> >>                      sizeof(enable));
> >>
> >>   if (err) {
> >>     printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
> >>     close(sockFd);
> >>     return -1;
> >>   }
> >>
> >>   // Configure SCTP
> >>   sctp_initmsg initmsg{};
> >>   initmsg.sinit_num_ostreams = 3;
> >>   initmsg.sinit_max_instreams = 3;
> >>   initmsg.sinit_max_attempts = 2;
> >>   initmsg.sinit_max_init_timeo = 0;
> >>
> >>   err = setsockopt(sockFd,
> >>                  IPPROTO_SCTP,
> >>                  SCTP_INITMSG,
> >>                  &initmsg,
> >>                  sizeof(initmsg));
> >>
> >>   if (err) {
> >>     printError("Configuring SCTP socket failed", __FUNCTION__);
> >>     close(sockFd);
> >>     return -1;
> >>   }
> >>
> >>   struct sctp_paddrparams paddr_params{};
> >>   memset(&paddr_params, 0, sizeof(paddr_params));
> >>   socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
> >>   paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;
> >>
> >>   err = setsockopt(sockFd,
> >>                  IPPROTO_SCTP,
> >>                  SCTP_PEER_ADDR_PARAMS,
> >>                  &paddr_params,
> >>                  size_of_sctp_paddr_params);
> >>
> >>   if (err) {
> >>     printError("Configuring SCTP params failed", __FUNCTION__);
> >>     close(sockFd);
> >>     return -1;
> >>   }
> >>
> >>   return sockFd;
> >> }
> >>
> >> bool bindSocket(const int sockFd, const int localPort) {
> >>   // Get IP of ethernet interface
> >>   string localAddress = "";
> >>   ifreq ifr{};
> >>   ifr.ifr_addr.sa_family = AF_INET;
> >>   strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
> >>   const int ioctlStatus = ioctl(sockFd,
> >>                               SIOCGIFADDR,
> >>                               &ifr);
> >>
> > >>   if (ioctlStatus == -1) {
> >>     printError("Failed to get local address", __FUNCTION__);
> >>     return false;
> >>   }
> >>
> >>   char ipAddrBuffer[INET_ADDRSTRLEN] {};
> >>   inet_ntop(AF_INET,
> >>           &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
> >>           ipAddrBuffer,
> >>           sizeof(ipAddrBuffer));
> >>
> >>   localAddress.assign(ipAddrBuffer);
> >>
> >>   // Bind to found ip address
> >>   sockaddr_in serv_addr{};
> >>   serv_addr.sin_family = AF_INET;
> >>   inet_pton(AF_INET,
> >>           localAddress.c_str(),
> >>           &serv_addr.sin_addr);
> >>   serv_addr.sin_port = htons(localPort);
> >>
> >>   if (bind(sockFd,
> >>          reinterpret_cast<sockaddr*>(&serv_addr),
> >>          sizeof(serv_addr))) {
> >>     printError("Failed to bind socket to local address", __FUNCTION__);
> >>     localAddress.clear();
> >>     close(sockFd);
> >>     return false;
> >>   }
> >>
> >>   cout << "Local endpoint successfully bound to local address: " << localAddress << endl;
> >>
> >>   return true;
> >> }
> >>
> >> bool openAssociation(const int sockFd,
> >>                    const string &remoteAddress,
> >>                    std::uint16_t remotePort) {
> >>
> >>   sockaddr_in address{};
> >>   address.sin_family = AF_INET;
> >>   inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
> >>   address.sin_port = htons(remotePort);
> >>
> >>   int connectError = connect(sockFd,
> >>                            reinterpret_cast<sockaddr *>(&address),
> >>                            sizeof(address));
> >>   if (connectError) {
> >>     printError("Error connecting association", __FUNCTION__);
> >>     return false;
> >>   }
> >>
> >>   cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
> >>   return true;
> >> }
> >>
> >> void sendReq(const int sockFd,
> >>            const string& remoteAddress,
> >>            const uint16_t remotePort,
> >>            const std::string& data)
> >> {
> >>
> >>   struct sockaddr_in remoteAddr {};
> >>   remoteAddr.sin_family = AF_INET;
> >>   remoteAddr.sin_port = htons(remotePort);
> >>
> >>   uint32_t payloadProtId = 7;
> >>   uint16_t streamId = 0;
> >>   uint32_t dataLength = data.size();
> >>   sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
> >>   inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);
> >>
> >>   const std::string ipaddr =
> >>     inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);
> >>
> >>   cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
> >>   cout << ", len=" << dataLength << endl;
> >>
> >>   const int bytesSent = sctp_sendmsg(sockFd,
> >>                                    data.c_str(),
> >>                                    (size_t)dataLength,
> >>                                    servaddr,
> >>                                    sizeof(sockaddr_in),
> >>                                    htonl(payloadProtId),
> >>                                    SCTP_ADDR_OVER,
> >>                                    streamId,
> >>                                    200,
> >>                                    0);
> >>
> >>   if (bytesSent == -1) {
> >>     printError("SCTP send failed", __FUNCTION__);
> >>   }
> >>
> >>   return;
> >> }
> >>
> >> sctp_assoc_t getSocketAssociationId(const int sockFd,
> >>                                   const string &remoteIpAddress,
> >>                                   std::uint16_t remotePort)
> >>
> >> {
> >>   sockaddr_in socket_address_in{};
> >>
> >>   socket_address_in.sin_family = AF_INET;
> >>   socket_address_in.sin_port = htons(remotePort);
> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> >>
> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
> >>   socklen_t salen = sizeof(&socket_address);
> >>
> >>   struct sctp_paddrinfo peer_address_info{};
> >>   socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> >>
> >>   const int sctpOptInfoError = sctp_opt_info(sockFd,
> >>                                            0,
> >>                                            SCTP_GET_PEER_ADDR_INFO,
> >>                                            &peer_address_info,
> >>                                            &size_of_sctp_paddrinfo);
> >>
> >>   if (sctpOptInfoError) {
> >>     printError("Failed to get association id", __FUNCTION__);
> >>   }
> >>
> >>   return peer_address_info.spinfo_assoc_id;
> >> }
> >>
> >> std::uint32_t getAssociationPathMtu(const int sockFd,
> >>                                   const string &remoteIpAddress,
> >>                                   const std::uint16_t remotePort) {
> >>   sockaddr_in socket_address_in{};
> >>
> >>   socket_address_in.sin_family = AF_INET;
> >>   socket_address_in.sin_port = htons(remotePort);
> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> >>
> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
> >>   socklen_t salen = sizeof(&socket_address);
> >>
> >>   struct sctp_paddrinfo peer_address_info{};
> >>   socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> >>
> >>   sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);
> >>
> >>   const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
> >>                                            SCTP_GET_PEER_ADDR_INFO,
> >>                                            &peer_address_info, &size_of_sctp_paddrinfo);
> >>   if (sctpOptInfoError) {
> >>     printError("Failed to get pmtu", __FUNCTION__);
> >>   }
> >>
> >>   auto t = std::time(nullptr);
> >>   auto tm = *std::localtime(&t);
> >>   std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
> >>   cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;
> >>
> >>   return peer_address_info.spinfo_mtu;
> >> }
> >>
> >> void test1(const string& data) {
> >>   int localPort = 2944;
> >>   string remoteIp1 = "10.0.0.3";
> >>   uint16_t remotePort1 = 8001;
> >>   uint16_t remotePort2 = 8002;
> >>
> >>   int sockFd = createSocket();
> >>   bindSocket(sockFd, localPort);
> >>
> >>   cout << "### Test 1: 2 assocs" << endl;
> >>
> >>   openAssociation(sockFd, remoteIp1, remotePort1);
> >>   openAssociation(sockFd, remoteIp1, remotePort2);
> >>
> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >>
> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
> >>   for (int i = 0; i < 10; i++) {
> >>     sleep(10);
> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >>   }
> >> }
> >>
> >> void test2(const string& data) {
> >>   int localPort = 2944;
> >>   string remoteIp1 = "10.0.0.3";
> >>   uint16_t remotePort1 = 8001;
> >>   uint16_t remotePort2 = 8002;
> >>   string remoteIpFake = "10.52.96.204";
> >>   uint16_t remotePortFake = 3239;
> >>
> >>   int sockFd = createSocket();
> >>   bindSocket(sockFd, localPort);
> >>
> >>   cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
> >>
> >>   openAssociation(sockFd, remoteIp1, remotePort1);
> >>   openAssociation(sockFd, remoteIp1, remotePort2);
> >>   openAssociation(sockFd, remoteIpFake, remotePortFake);
> >>
> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >>
> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
> >>   for (int i = 0; i < 10; i++) {
> >>     sleep(10);
> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >>   }
> >> }
> >>
> >>
> >> int main(int argc, char** argv) {
> >>   string testNr = "1";
> >>   string& testData = data1000;
> >>   if (argc >= 2) {
> >>     testNr = argv[1];
> >>   }
> >>   if (argc >= 3) {
> >>     testData = data100;
> >>   }
> >>
> >>   if (testNr == "1") {
> >>     test1(testData);
> >>   } else {
> >>     test2(testData);
> >>   }
> >>
> >>   return 0;
> >> }
> >




* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
                   ` (3 preceding siblings ...)
  2017-09-21 11:01 ` Neil Horman
@ 2017-09-21 12:41 ` Peter Salin
  2017-09-21 15:24 ` Neil Horman
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Peter Salin @ 2017-09-21 12:41 UTC (permalink / raw)
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 20425 bytes --]

2017-09-21 14:01 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> On Wed, Sep 20, 2017 at 02:02:45PM +0300, Peter Salin wrote:
>> 2017-09-19 20:09 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
>> > On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
>> >> Hi,
>> >>
>> >> I encountered some strange PMTUD related behaviour that I need help in
>> >> understanding.
>> >>
>> >> Setup:
>> >>
>> >> +-----------+        +---+        +--------+
>> >> | 10.0.0.10 |--------| X |--------|10.0.0.3|
>> >> +-----------+        +---+        +--------+
>> >>
>> >> A one to many socket is setup at 10.0.0.10. Two instances of the
>> >> lksctp sctp_darn applications are ran at 10.0.0.3 listening to ports
>> >> 8001 and 8002. 10.0.0.3 was also setup to generate ICMP frag needed
>> >> messages for incoming messages over 600 bytes. This same issue also
>> >> occurs also when a router on the path was setup to generate the ICMP
>> >> message instead.
>> >>
>> >> Test 1:
>> >> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
>> >> port 8001 and another one to 8002. Then a too large message was sent
>> >> on the association to 8001, triggering ICMP generation. When checking
>> >> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
>> >> association now reports 600. The association to 8002 reports 1500
>> >> until traffic is sent on it, at which point it also adjusts to 600
>> >> which I think makes sense since the destination IP is the same. When
>> >> reopening the associations, the value of 600 would be remembered for
>> >> about 10 min, which I also think makes sense since
>> >> net.ipv4.route.mtu_expires is 600.
>> >>
>> >> Test 2:
>> >> Again the same two associations were connected to 10.0.0.3, but in
>> >> addition an attempt to connect a third association to a non-existing
>> >> IP was done, this attempt fails with timeout after a while. After
>> >> that, again an ICMP triggering large message was sent to 8001. Now the
>> >> behaviour is different from before. The association to 8001 reports a
>> >> spinfo_mtu of 600, but only for a brief moment, it does not stay at
>> >> 600 for 10 minutes. In addition the spinfo_mtu of the association to
>> >> 8002 never changes, it stays at the original 1500.
>> >>
>> >> The only difference between the two tests is the attempt to connect to
>> >> a non-responding IP at the beginning of test 2. Any ideas why the
>> >> behaviour changes, is this a bug or is there some other reason for
>> >> this?
>> >>
>> >> I have attached the sample application used for reproducing this.
>> >>
>> >> BR,
>> >> -Peter
>> >>
>> > Hey, apologies for the delay on this, I've had it in my reader for days and kept
>> > meaning to respond, but kept getting sidetracked.
>> >
>> > First glance, this sounds incorrect.  Each association (or rather each
>> > transport) maintains its own mtu, and the association reflects the mtu of the
>> > active transport. Given that each transport holds its own dst cache entry, I
>> > have a hard time seeing how one transports mtu changes might leak to another
>> >
>> > But that's not really what's happening here.  By your description, the active
>> > transport on the established association isn't updating its pathmtu, which
>> > should happen in response to receiving the ICMP_FRAG_NEEDED message.
>> >
>> > I know you've provided the reproducer below, and I appreciate that, but I don't
>> > have the cycles to set this up at the moment.  Could you tell me if, during the
>> > second test, after you attempt to connect to the fake ip address and then send
>> > the large message that should trigger the frag needed message, does said large
>> > message get retransmitted and eventually arrive at the peer host?  If so, that
>> > suggests that the sctp stack:
>> >
>> > a) receives the frag needed message
>> > and
>> > b) resends the packet at the lower frag point
>> >
>> > That in turn suggests we just have some internal reporting error in which we
>> > don't update the associations pmtu with the active transports
>> >
>> > Let me know the answer to that question and it will give me some places to start
>> > looking
>> > Neil
>> >
>> Thanks for responding. In response to your question, the first large
>> message does get retransmitted without the Don't Fragment bit set. I
>> modified the test a bit to also send further messages after the first
>> one. Those messages are indeed fragmented according to the limit of
>> the ICMP message. I have attached a PCAP trace and SCTP debug logs in
>> case that helps here.
>>
>> I also tried sending a large message on the other association after
>> the large message on the first association had been sent. For test 2
>> that message was not fragmented even though the ICMP was already
>> received for the first assoc. After the second assoc also received an
>> ICMP it adjusted to use the lower MTU for subsequent messages. In the
>> case of test 1, sending a large message on the second assoc would auto
>> fragment already on the first message.
>>
>> Also, after stopping and rerunning test 2 the MTU would always be
>> reset at 1500, whereas in test 1 the lower limit would still be in
>> effect for a new run. So it seems like in test 2 the lower MTU is only
>> known within each association, whereas in test 1 the lower MTU also
>> gets stored deeper down?
>>
>> BR,
>> -Peter
>>
> So, from what I can see, your included tcpdump only shows the first part of what
> you are describing.  That is to say that it sends a large data chunk on an
> association that gets an ICMP frag needed response, after which the pmtu is
> lowered and smaller message fragments are sent, which is good (i.e. working as
> designed).
>
> I don't see anything in the tcpdump relating to the remainder of your test,
> showing failed fragmentation.  Can you include that please?
>
> Neil
Yes, please find attached traces that include sending on the other
association after receiving the first ICMP.

BR,
-Peter
>
>> >> ------ ver_linux output ------
>> >> Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
>> >> 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>> >>
>> >> GNU C                   5.4.0
>> >> GNU Make                4.1
>> >> Binutils                2.26.1
>> >> Util-linux              2.27.1
>> >> Mount                   2.27.1
>> >> Module-init-tools       22
>> >> E2fsprogs               1.42.13
>> >> Xfsprogs                4.3.0
>> >> Linux C Library         2.23
>> >> Dynamic linker (ldd)    2.23
>> >> Linux C++ Library       6.0.21
>> >> Procps                  3.3.10
>> >> Net-tools               1.60
>> >> Kbd                     1.15.5
>> >> Console-tools           1.15.5
>> >> Sh-utils                8.25
>> >> Udev                    229
>> >> Modules Loaded          ablk_helper aes_x86_64 aesni_intel
>> >> async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
>> >> binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
>> >> gf128mul ghash_clmul ni_intel glue_helper hid hid_generic ib_addr
>> >> ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
>> >> iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
>> >> multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
>> >> raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor
>> >
>> >>
>> >> #include <cstring>
>> >> #include <ctime>
>> >> #include <iomanip>
>> >> #include <iostream>
>> >>
>> >> #include <errno.h>
>> >> #include <unistd.h>
>> >> #include <arpa/inet.h>
>> >> #include <net/if.h>
>> >> #include <netinet/in.h>
>> >> #include <netinet/sctp.h>
>> >> #include <sys/ioctl.h>
>> >> #include <sys/socket.h>
>> >>
>> >> using namespace std;
>> >>
>> >> static const int ERROR_BUFLEN = 64;
>> >> static const char* SCTP_INTERFACE_NAME = "ens4";
>> >>
>> >> static string data100 = "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789";
>> >> static string data1000 = "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789"
>> >>   "01234567890123456789012345678901234567890123456789";
>> >>
>> >> void printError(const string& msg, const string& funcName) {
>> >>   char errorMessage[ERROR_BUFLEN] {};
>> >>   char* errMsg = ::strerror_r(errno, errorMessage,
>> >>                             sizeof(errorMessage));
>> >>
>> >>   cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
>> >> }
>> >>
>> >> int createSocket() {
>> >>   int sockFd = socket (AF_INET,
>> >>                      SOCK_SEQPACKET,
>> >>                      IPPROTO_SCTP);
>> >>   if (sockFd == -1) {
>> >>     printError("Creation of socket failed", __FUNCTION__);
>> >>     return -1;
>> >>   }
>> >>
>> >>   // Enable address reuse
>> >>   int enable = 1;
>> >>   int err = setsockopt(sockFd,
>> >>                      SOL_SOCKET,
>> >>                      SO_REUSEADDR,
>> >>                      &enable,
>> >>                      sizeof(enable));
>> >>
>> >>   if (err) {
>> >>     printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
>> >>     close(sockFd);
>> >>     return -1;
>> >>   }
>> >>
>> >>   // Configure SCTP
>> >>   sctp_initmsg initmsg{};
>> >>   initmsg.sinit_num_ostreams = 3;
>> >>   initmsg.sinit_max_instreams = 3;
>> >>   initmsg.sinit_max_attempts = 2;
>> >>   initmsg.sinit_max_init_timeo = 0;
>> >>
>> >>   err = setsockopt(sockFd,
>> >>                  IPPROTO_SCTP,
>> >>                  SCTP_INITMSG,
>> >>                  &initmsg,
>> >>                  sizeof(initmsg));
>> >>
>> >>   if (err) {
>> >>     printError("Configuring SCTP socket failed", __FUNCTION__);
>> >>     close(sockFd);
>> >>     return -1;
>> >>   }
>> >>
>> >>   struct sctp_paddrparams paddr_params{};
>> >>   memset(&paddr_params, 0, sizeof(paddr_params));
>> >>   socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
>> >>   paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;
>> >>
>> >>   err = setsockopt(sockFd,
>> >>                  IPPROTO_SCTP,
>> >>                  SCTP_PEER_ADDR_PARAMS,
>> >>                  &paddr_params,
>> >>                  size_of_sctp_paddr_params);
>> >>
>> >>   if (err) {
>> >>     printError("Configuring SCTP params failed", __FUNCTION__);
>> >>     close(sockFd);
>> >>     return -1;
>> >>   }
>> >>
>> >>   return sockFd;
>> >> }
>> >>
>> >> bool bindSocket(const int sockFd, const int localPort) {
>> >>   // Get IP of ethernet interface
>> >>   string localAddress = "";
>> >>   ifreq ifr{};
>> >>   ifr.ifr_addr.sa_family = AF_INET;
>> >>   strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
>> >>   const int ioctlStatus = ioctl(sockFd,
>> >>                               SIOCGIFADDR,
>> >>                               &ifr);
>> >>
>> >>   if (ioctlStatus == -1) {
>> >>     printError("Failed to get local address", __FUNCTION__);
>> >>     return false;
>> >>   }
>> >>
>> >>   char ipAddrBuffer[INET_ADDRSTRLEN] {};
>> >>   inet_ntop(AF_INET,
>> >>           &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
>> >>           ipAddrBuffer,
>> >>           sizeof(ipAddrBuffer));
>> >>
>> >>   localAddress.assign(ipAddrBuffer);
>> >>
>> >>   // Bind to found ip address
>> >>   sockaddr_in serv_addr{};
>> >>   serv_addr.sin_family = AF_INET;
>> >>   inet_pton(AF_INET,
>> >>           localAddress.c_str(),
>> >>           &serv_addr.sin_addr);
>> >>   serv_addr.sin_port = htons(localPort);
>> >>
>> >>   if (bind(sockFd,
>> >>          reinterpret_cast<sockaddr*>(&serv_addr),
>> >>          sizeof(serv_addr))) {
>> >>     printError("Failed to bind socket to local address", __FUNCTION__);
>> >>     localAddress.clear();
>> >>     close(sockFd);
>> >>     return false;
>> >>   }
>> >>
>> >>   cout << "Local endpoint successfully bound to local address: " << localAddress << endl;
>> >>
>> >>   return true;
>> >> }
>> >>
>> >> bool openAssociation(const int sockFd,
>> >>                    const string &remoteAddress,
>> >>                    std::uint16_t remotePort) {
>> >>
>> >>   sockaddr_in address{};
>> >>   address.sin_family = AF_INET;
>> >>   inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
>> >>   address.sin_port = htons(remotePort);
>> >>
>> >>   int connectError = connect(sockFd,
>> >>                            reinterpret_cast<sockaddr *>(&address),
>> >>                            sizeof(address));
>> >>   if (connectError) {
>> >>     printError("Error connecting association", __FUNCTION__);
>> >>     return false;
>> >>   }
>> >>
>> >>   cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
>> >>   return true;
>> >> }
>> >>
>> >> void sendReq(const int sockFd,
>> >>            const string& remoteAddress,
>> >>            const uint16_t remotePort,
>> >>            const std::string& data)
>> >> {
>> >>
>> >>   struct sockaddr_in remoteAddr {};
>> >>   remoteAddr.sin_family = AF_INET;
>> >>   remoteAddr.sin_port = htons(remotePort);
>> >>
>> >>   uint32_t payloadProtId = 7;
>> >>   uint16_t streamId = 0;
>> >>   uint32_t dataLength = data.size();
>> >>   sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
>> >>   inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);
>> >>
>> >>   const std::string ipaddr =
>> >>     inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);
>> >>
>> >>   cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
>> >>   cout << ", len=" << dataLength << endl;
>> >>
>> >>   const int bytesSent = sctp_sendmsg(sockFd,
>> >>                                    data.c_str(),
>> >>                                    (size_t)dataLength,
>> >>                                    servaddr,
>> >>                                    sizeof(sockaddr_in),
>> >>                                    htonl(payloadProtId),
>> >>                                    SCTP_ADDR_OVER,
>> >>                                    streamId,
>> >>                                    200,
>> >>                                    0);
>> >>
>> >>   if (bytesSent == -1) {
>> >>     printError("SCTP send failed", __FUNCTION__);
>> >>   }
>> >>
>> >>   return;
>> >> }
>> >>
>> >> sctp_assoc_t getSocketAssociationId(const int sockFd,
>> >>                                   const string &remoteIpAddress,
>> >>                                   std::uint16_t remotePort)
>> >>
>> >> {
>> >>   sockaddr_in socket_address_in{};
>> >>
>> >>   socket_address_in.sin_family = AF_INET;
>> >>   socket_address_in.sin_port = htons(remotePort);
>> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
>> >>
>> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>> >>   socklen_t salen = sizeof(&socket_address);
>> >>
>> >>   struct sctp_paddrinfo peer_address_info{};
>> >>   socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
>> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
>> >>
>> >>   const int sctpOptInfoError = sctp_opt_info(sockFd,
>> >>                                            0,
>> >>                                            SCTP_GET_PEER_ADDR_INFO,
>> >>                                            &peer_address_info,
>> >>                                            &size_of_sctp_paddrinfo);
>> >>
>> >>   if (sctpOptInfoError) {
>> >>     printError("Failed to get association id", __FUNCTION__);
>> >>   }
>> >>
>> >>   return peer_address_info.spinfo_assoc_id;
>> >> }
>> >>
>> >> std::uint32_t getAssociationPathMtu(const int sockFd,
>> >>                                   const string &remoteIpAddress,
>> >>                                   const std::uint16_t remotePort) {
>> >>   sockaddr_in socket_address_in{};
>> >>
>> >>   socket_address_in.sin_family = AF_INET;
>> >>   socket_address_in.sin_port = htons(remotePort);
>> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
>> >>
>> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>> >>   socklen_t salen = sizeof(&socket_address);
>> >>
>> >>   struct sctp_paddrinfo peer_address_info{};
>> >>   socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
>> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
>> >>
>> >>   sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);
>> >>
>> >>   const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
>> >>                                            SCTP_GET_PEER_ADDR_INFO,
>> >>                                            &peer_address_info, &size_of_sctp_paddrinfo);
>> >>   if (sctpOptInfoError) {
>> >>     printError("Failed to get pmtu", __FUNCTION__);
>> >>   }
>> >>
>> >>   auto t = std::time(nullptr);
>> >>   auto tm = *std::localtime(&t);
>> >>   std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
>> >>   cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;
>> >>
>> >>   return peer_address_info.spinfo_mtu;
>> >> }
>> >>
>> >> void test1(const string& data) {
>> >>   int localPort = 2944;
>> >>   string remoteIp1 = "10.0.0.3";
>> >>   uint16_t remotePort1 = 8001;
>> >>   uint16_t remotePort2 = 8002;
>> >>
>> >>   int sockFd = createSocket();
>> >>   bindSocket(sockFd, localPort);
>> >>
>> >>   cout << "### Test 1: 2 assocs" << endl;
>> >>
>> >>   openAssociation(sockFd, remoteIp1, remotePort1);
>> >>   openAssociation(sockFd, remoteIp1, remotePort2);
>> >>
>> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >>
>> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
>> >>   for (int i = 0; i < 10; i++) {
>> >>     sleep(10);
>> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >>   }
>> >> }
>> >>
>> >> void test2(const string& data) {
>> >>   int localPort = 2944;
>> >>   string remoteIp1 = "10.0.0.3";
>> >>   uint16_t remotePort1 = 8001;
>> >>   uint16_t remotePort2 = 8002;
>> >>   string remoteIpFake = "10.52.96.204";
>> >>   uint16_t remotePortFake = 3239;
>> >>
>> >>   int sockFd = createSocket();
>> >>   bindSocket(sockFd, localPort);
>> >>
>> >>   cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
>> >>
>> >>   openAssociation(sockFd, remoteIp1, remotePort1);
>> >>   openAssociation(sockFd, remoteIp1, remotePort2);
>> >>   openAssociation(sockFd, remoteIpFake, remotePortFake);
>> >>
>> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >>
>> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
>> >>   for (int i = 0; i < 10; i++) {
>> >>     sleep(10);
>> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >>   }
>> >> }
>> >>
>> >>
>> >> int main(int argc, char** argv) {
>> >>   string testNr = "1";
>> >>   string& testData = data1000;
>> >>   if (argc >= 2) {
>> >>     testNr = argv[1];
>> >>   }
>> >>   if (argc >= 3) {
>> >>     testData = data100;
>> >>   }
>> >>
>> >>   if (testNr == "1") {
>> >>     test1(testData);
>> >>   } else {
>> >>     test2(testData);
>> >>   }
>> >>
>> >>   return 0;
>> >> }
>> >
>
>

[-- Attachment #2: test2_traces_send_on_both_assocs.tar.gz --]
[-- Type: application/x-gzip, Size: 13309 bytes --]


* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
                   ` (4 preceding siblings ...)
  2017-09-21 12:41 ` Peter Salin
@ 2017-09-21 15:24 ` Neil Horman
  2017-09-22  9:05 ` Peter Salin
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2017-09-21 15:24 UTC (permalink / raw)
  To: linux-sctp

On Thu, Sep 21, 2017 at 03:41:51PM +0300, Peter Salin wrote:
> 2017-09-21 14:01 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> > On Wed, Sep 20, 2017 at 02:02:45PM +0300, Peter Salin wrote:
> >> 2017-09-19 20:09 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> >> > On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
> >> >> Hi,
> >> >>
> >> >> I encountered some strange PMTUD related behaviour that I need help in
> >> >> understanding.
> >> >>
> >> >> Setup:
> >> >>
> >> >> +-----------+        +---+        +--------+
> >> >> | 10.0.0.10 |--------| X |--------|10.0.0.3|
> >> >> +-----------+        +---+        +--------+
> >> >>
> >> >> A one to many socket is setup at 10.0.0.10. Two instances of the
> >> >> lksctp sctp_darn applications are ran at 10.0.0.3 listening to ports
> >> >> 8001 and 8002. 10.0.0.3 was also setup to generate ICMP frag needed
> >> >> messages for incoming messages over 600 bytes. This same issue also
> >> >> occurs also when a router on the path was setup to generate the ICMP
> >> >> message instead.
> >> >>
> >> >> Test 1:
> >> >> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
> >> >> port 8001 and another one to 8002. Then a too large message was sent
> >> >> on the association to 8001, triggering ICMP generation. When checking
> >> >> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
> >> >> association now reports 600. The association to 8002 reports 1500
> >> >> until traffic is sent on it, at which point it also adjusts to 600
> >> >> which I think makes sense since the destination IP is the same. When
> >> >> reopening the associations, the value of 600 would be remembered for
> >> >> about 10 min, which I also think makes sense since
> >> >> net.ipv4.route.mtu_expires is 600.
> >> >>
> >> >> Test 2:
> >> >> Again the same two associations were connected to 10.0.0.3, but in
> >> >> addition an attempt to connect a third association to a non-existing
> >> >> IP was done, this attempt fails with timeout after a while. After
> >> >> that, again an ICMP triggering large message was sent to 8001. Now the
> >> >> behaviour is different from before. The association to 8001 reports a
> >> >> spinfo_mtu of 600, but only for a brief moment, it does not stay at
> >> >> 600 for 10 minutes. In addition the spinfo_mtu of the association to
> >> >> 8002 never changes, it stays at the original 1500.
> >> >>
> >> >> The only difference between the two tests is the attempt to connect to
> >> >> a non-responding IP at the beginning of test 2. Any ideas why the
> >> >> behaviour changes, is this a bug or is there some other reason for
> >> >> this?
> >> >>
> >> >> I have attached the sample application used for reproducing this.
> >> >>
> >> >> BR,
> >> >> -Peter
> >> >>
> >> > Hey, apologies for the delay on this, I've had it in my reader for days and kept
> >> > meaning to respond, but kept getting sidetracked.
> >> >
> >> > First glance, this sounds incorrect.  Each association (or rather each
> >> > transport) maintains its own mtu, and the association reflects the mtu of the
> >> > active transport. Given that each transport holds its own dst cache entry, I
> >> > have a hard time seeing how one transport's mtu changes might leak to another.
> >> >
> >> > But that's not really what's happening here.  By your description, the active
> >> > transport on the established association isn't updating its pathmtu, which
> >> > should happen in response to receiving the ICMP_FRAG_NEEDED message.
> >> >
> >> > I know you've provided the reproducer below, and I appreciate that, but I don't
> >> > have the cycles to set this up at the moment.  Could you tell me if, during the
> >> > second test, after you attempt to connect to the fake ip address and then send
> >> > the large message that should trigger the frag needed message, does said large
> >> > message get retransmitted and eventually arrive at the peer host?  If so, that
> >> > suggests that the sctp stack:
> >> >
> >> > a) receives the frag needed message
> >> > and
> >> > b) resends the packet at the lower frag point
> >> >
> >> > That in turn suggests we just have some internal reporting error in which we
> >> > don't update the association's pmtu with the active transport's.
> >> >
> >> > Let me know the answer to that question and it will give me some places to start
> >> > looking
> >> > Neil
> >> >
> >> Thanks for responding. In response to your question, the first large
> >> message does get retransmitted without the Don't Fragment bit set. I
> >> modified the test a bit to also send further messages after the first
> >> one. Those messages are indeed fragmented according to the limit of
> >> the ICMP message. I have attached a PCAP trace and SCTP debug logs in
> >> case that helps here.
> >>
> >> I also tried sending a large message on the other association after
> >> the large message on the first association had been sent. For test 2
> >> that message was not fragmented even though the ICMP was already
> >> received for the first assoc. After the second assoc also received an
> >> ICMP it adjusted to use the lower MTU for subsequent messages. In the
> >> case of test 1, sending a large message on the second assoc would auto
> >> fragment already on the first message.
> >>
> >> Also, after stopping and rerunning test 2 the MTU would always be
> >> reset at 1500, whereas in test 1 the lower limit would still be in
> >> effect for a new run. So it seems like in test 2 the lower MTU is only
> >> known within each association, where as in test 1 the lower MTU also
> >> gets stored deeper down?
> >>
> >> BR,
> >> -Peter
> >>
> > So, from what I can see, your included tcpdump only shows the first part of what
> > you are describing.  That is to say that it sends a large data chunk on an
> > association that gets an ICMP frag needed response, after which the pmtu is
> > lowered and smaller message fragments are sent, which is good (i.e. working as
> > designed).
> >
> > I don't see anything in the tcpdump relating to the remainder of your test,
> > showing failed fragmentation.  Can you include that please?
> >
> > Neil
> Yes, please find attached traces that include sending on the other
> association after receiving the first ICMP.
> 


Thank you.  So tell me if I'm missing something here, but I think this trace
contradicts what you describe above.  Some specifics:

1) I observe two associations in this trace:
	a) An association with index 0, whose init chunk is in frame 1
	b) An association with index 1, whose init chunk is in frame 5
	Note that I can toggle between these association flows with the display
filter of:
	sctp.assoc_index == 1
	or
	sctp.assoc_index == 0
	in wireshark


2) In both flows, I can observe that a large chunk is sent:
	a) in assoc index 0, the over-mtu chunk is in frame 9
	b) in assoc index 1, the over-mtu chunk is in frame 16

3) Subsequent to each data chunk in (2), we get an icmp unreach (frag needed
message)
	a) in assoc index 0, the icmp is in frame 10
	b) in assoc index 1, the icmp is in frame 17

4) Subsequent to (3), all DATA chunks appear to get limited to an appropriate
size for the path mtu as specified in the respective icmp from (3), and
oversized datagrams are appropriately fragmented.


Please let me know if I'm missing something, but this trace shows everything to
be working as normal.

Neil

> BR,
> -Peter
> >
> >> >> ------ ver_linux output ------
> >> >> Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
> >> >> 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> >> >>
> >> >> GNU C                   5.4.0
> >> >> GNU Make                4.1
> >> >> Binutils                2.26.1
> >> >> Util-linux              2.27.1
> >> >> Mount                   2.27.1
> >> >> Module-init-tools       22
> >> >> E2fsprogs               1.42.13
> >> >> Xfsprogs                4.3.0
> >> >> Linux C Library         2.23
> >> >> Dynamic linker (ldd)    2.23
> >> >> Linux C++ Library       6.0.21
> >> >> Procps                  3.3.10
> >> >> Net-tools               1.60
> >> >> Kbd                     1.15.5
> >> >> Console-tools           1.15.5
> >> >> Sh-utils                8.25
> >> >> Udev                    229
> >> >> Modules Loaded          ablk_helper aes_x86_64 aesni_intel
> >> >> async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
> >> >> binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
> >> >> gf128mul ghash_clmul ni_intel glue_helper hid hid_generic ib_addr
> >> >> ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
> >> >> iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
> >> >> multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
> >> >> raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor
> >> >
> >> >>
> >> >> #include <cstring>
> >> >> #include <ctime>
> >> >> #include <iomanip>
> >> >> #include <iostream>
> >> >>
> >> >> #include <errno.h>
> >> >> #include <unistd.h>
> >> >> #include <arpa/inet.h>
> >> >> #include <net/if.h>
> >> >> #include <netinet/in.h>
> >> >> #include <netinet/sctp.h>
> >> >> #include <sys/ioctl.h>
> >> >> #include <sys/socket.h>
> >> >>
> >> >> using namespace std;
> >> >>
> >> >> static const int ERROR_BUFLEN = 64;
> >> >> static const char* SCTP_INTERFACE_NAME = "ens4";
> >> >>
> >> >> static string data100 = "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789";
> >> >> static string data1000 = "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >>   "01234567890123456789012345678901234567890123456789";
> >> >>
> >> >> void printError(const string& msg, const string& funcName) {
> >> >>   char errorMessage[ERROR_BUFLEN] {};
> >> >>   char* errMsg = ::strerror_r(errno, errorMessage,
> >> >>                             sizeof(errorMessage));
> >> >>
> >> >>   cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
> >> >> }
> >> >>
> >> >> int createSocket() {
> >> >>   int sockFd = socket (AF_INET,
> >> >>                      SOCK_SEQPACKET,
> >> >>                      IPPROTO_SCTP);
> >> >>   if (sockFd == -1) {
> >> >>     printError("Creation of socket failed", __FUNCTION__);
> >> >>     return -1;
> >> >>   }
> >> >>
> >> >>   // Enable address reuse
> >> >>   int enable = 1;
> >> >>   int err = setsockopt(sockFd,
> >> >>                      SOL_SOCKET,
> >> >>                      SO_REUSEADDR,
> >> >>                      &enable,
> >> >>                      sizeof(enable));
> >> >>
> >> >>   if (err) {
> >> >>     printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
> >> >>     close(sockFd);
> >> >>     return -1;
> >> >>   }
> >> >>
> >> >>   // Configure SCTP
> >> >>   sctp_initmsg initmsg{};
> >> >>   initmsg.sinit_num_ostreams = 3;
> >> >>   initmsg.sinit_max_instreams = 3;
> >> >>   initmsg.sinit_max_attempts = 2;
> >> >>   initmsg.sinit_max_init_timeo = 0;
> >> >>
> >> >>   err = setsockopt(sockFd,
> >> >>                  IPPROTO_SCTP,
> >> >>                  SCTP_INITMSG,
> >> >>                  &initmsg,
> >> >>                  sizeof(initmsg));
> >> >>
> >> >>   if (err) {
> >> >>     printError("Configuring SCTP socket failed", __FUNCTION__);
> >> >>     close(sockFd);
> >> >>     return -1;
> >> >>   }
> >> >>
> >> >>   struct sctp_paddrparams paddr_params{};
> >> >>   memset(&paddr_params, 0, sizeof(paddr_params));
> >> >>   socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
> >> >>   paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;
> >> >>
> >> >>   err = setsockopt(sockFd,
> >> >>                  IPPROTO_SCTP,
> >> >>                  SCTP_PEER_ADDR_PARAMS,
> >> >>                  &paddr_params,
> >> >>                  size_of_sctp_paddr_params);
> >> >>
> >> >>   if (err) {
> >> >>     printError("Configuring SCTP params failed", __FUNCTION__);
> >> >>     close(sockFd);
> >> >>     return -1;
> >> >>   }
> >> >>
> >> >>   return sockFd;
> >> >> }
> >> >>
> >> >> bool bindSocket(const int sockFd, const int localPort) {
> >> >>   // Get IP of ethernet interface
> >> >>   string localAddress = "";
> >> >>   ifreq ifr{};
> >> >>   ifr.ifr_addr.sa_family = AF_INET;
> >> >>   strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
> >> >>   const int ioctlStatus = ioctl(sockFd,
> >> >>                               SIOCGIFADDR,
> >> >>                               &ifr);
> >> >>
> >> >>   if (ioctlStatus == -1) {
> >> >>     printError("Failed to get local address", __FUNCTION__);
> >> >>     return false;
> >> >>   }
> >> >>
> >> >>   char ipAddrBuffer[INET_ADDRSTRLEN] {};
> >> >>   inet_ntop(AF_INET,
> >> >>           &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
> >> >>           ipAddrBuffer,
> >> >>           sizeof(ipAddrBuffer));
> >> >>
> >> >>   localAddress.assign(ipAddrBuffer);
> >> >>
> >> >>   // Bind to found ip address
> >> >>   sockaddr_in serv_addr{};
> >> >>   serv_addr.sin_family = AF_INET;
> >> >>   inet_pton(AF_INET,
> >> >>           localAddress.c_str(),
> >> >>           &serv_addr.sin_addr);
> >> >>   serv_addr.sin_port = htons(localPort);
> >> >>
> >> >>   if (bind(sockFd,
> >> >>          reinterpret_cast<sockaddr*>(&serv_addr),
> >> >>          sizeof(serv_addr))) {
> >> >>     printError("Failed to bind socket to local address", __FUNCTION__);
> >> >>     localAddress.clear();
> >> >>     close(sockFd);
> >> >>     return false;
> >> >>   }
> >> >>
> >> >>   cout << "Local endpoint successfully bound to local address: " << localAddress << endl;
> >> >>
> >> >>   return true;
> >> >> }
> >> >>
> >> >> bool openAssociation(const int sockFd,
> >> >>                    const string &remoteAddress,
> >> >>                    std::uint16_t remotePort) {
> >> >>
> >> >>   sockaddr_in address{};
> >> >>   address.sin_family = AF_INET;
> >> >>   inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
> >> >>   address.sin_port = htons(remotePort);
> >> >>
> >> >>   int connectError = connect(sockFd,
> >> >>                            reinterpret_cast<sockaddr *>(&address),
> >> >>                            sizeof(address));
> >> >>   if (connectError) {
> >> >>     printError("Error connecting association", __FUNCTION__);
> >> >>     return false;
> >> >>   }
> >> >>
> >> >>   cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
> >> >>   return true;
> >> >> }
> >> >>
> >> >> void sendReq(const int sockFd,
> >> >>            const string& remoteAddress,
> >> >>            const uint16_t remotePort,
> >> >>            const std::string& data)
> >> >> {
> >> >>
> >> >>   struct sockaddr_in remoteAddr {};
> >> >>   remoteAddr.sin_family = AF_INET;
> >> >>   remoteAddr.sin_port = htons(remotePort);
> >> >>
> >> >>   uint32_t payloadProtId = 7;
> >> >>   uint16_t streamId = 0;
> >> >>   uint32_t dataLength = data.size();
> >> >>   sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
> >> >>   inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);
> >> >>
> >> >>   const std::string ipaddr =
> >> >>     inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);
> >> >>
> >> >>   cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
> >> >>   cout << ", len=" << dataLength << endl;
> >> >>
> >> >>   const int bytesSent = sctp_sendmsg(sockFd,
> >> >>                                    data.c_str(),
> >> >>                                    (size_t)dataLength,
> >> >>                                    servaddr,
> >> >>                                    sizeof(sockaddr_in),
> >> >>                                    htonl(payloadProtId),
> >> >>                                    SCTP_ADDR_OVER,
> >> >>                                    streamId,
> >> >>                                    200,
> >> >>                                    0);
> >> >>
> >> >>   if (bytesSent == -1) {
> >> >>     printError("SCTP send failed", __FUNCTION__);
> >> >>   }
> >> >>
> >> >>   return;
> >> >> }
> >> >>
> >> >> sctp_assoc_t getSocketAssociationId(const int sockFd,
> >> >>                                   const string &remoteIpAddress,
> >> >>                                   std::uint16_t remotePort)
> >> >>
> >> >> {
> >> >>   sockaddr_in socket_address_in{};
> >> >>
> >> >>   socket_address_in.sin_family = AF_INET;
> >> >>   socket_address_in.sin_port = htons(remotePort);
> >> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> >> >>
> >> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
> >> >>   socklen_t salen = sizeof(socket_address_in);
> >> >>
> >> >>   struct sctp_paddrinfo peer_address_info{};
> >> >>   socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
> >> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> >> >>
> >> >>   const int sctpOptInfoError = sctp_opt_info(sockFd,
> >> >>                                            0,
> >> >>                                            SCTP_GET_PEER_ADDR_INFO,
> >> >>                                            &peer_address_info,
> >> >>                                            &size_of_sctp_paddrinfo);
> >> >>
> >> >>   if (sctpOptInfoError) {
> >> >>     printError("Failed to get association id", __FUNCTION__);
> >> >>   }
> >> >>
> >> >>   return peer_address_info.spinfo_assoc_id;
> >> >> }
> >> >>
> >> >> std::uint32_t getAssociationPathMtu(const int sockFd,
> >> >>                                   const string &remoteIpAddress,
> >> >>                                   const std::uint16_t remotePort) {
> >> >>   sockaddr_in socket_address_in{};
> >> >>
> >> >>   socket_address_in.sin_family = AF_INET;
> >> >>   socket_address_in.sin_port = htons(remotePort);
> >> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> >> >>
> >> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
> >> >>   socklen_t salen = sizeof(socket_address_in);
> >> >>
> >> >>   struct sctp_paddrinfo peer_address_info{};
> >> >>   socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
> >> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> >> >>
> >> >>   sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);
> >> >>
> >> >>   const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
> >> >>                                            SCTP_GET_PEER_ADDR_INFO,
> >> >>                                            &peer_address_info, &size_of_sctp_paddrinfo);
> >> >>   if (sctpOptInfoError) {
> >> >>     printError("Failed to get pmtu", __FUNCTION__);
> >> >>   }
> >> >>
> >> >>   auto t = std::time(nullptr);
> >> >>   auto tm = *std::localtime(&t);
> >> >>   std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
> >> >>   cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;
> >> >>
> >> >>   return peer_address_info.spinfo_mtu;
> >> >> }
> >> >>
> >> >> void test1(const string& data) {
> >> >>   int localPort = 2944;
> >> >>   string remoteIp1 = "10.0.0.3";
> >> >>   uint16_t remotePort1 = 8001;
> >> >>   uint16_t remotePort2 = 8002;
> >> >>
> >> >>   int sockFd = createSocket();
> >> >>   bindSocket(sockFd, localPort);
> >> >>
> >> >>   cout << "### Test 1: 2 assocs" << endl;
> >> >>
> >> >>   openAssociation(sockFd, remoteIp1, remotePort1);
> >> >>   openAssociation(sockFd, remoteIp1, remotePort2);
> >> >>
> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >>
> >> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
> >> >>   for (int i = 0; i < 10; i++) {
> >> >>     sleep(10);
> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >>   }
> >> >> }
> >> >>
> >> >> void test2(const string& data) {
> >> >>   int localPort = 2944;
> >> >>   string remoteIp1 = "10.0.0.3";
> >> >>   uint16_t remotePort1 = 8001;
> >> >>   uint16_t remotePort2 = 8002;
> >> >>   string remoteIpFake = "10.52.96.204";
> >> >>   uint16_t remotePortFake = 3239;
> >> >>
> >> >>   int sockFd = createSocket();
> >> >>   bindSocket(sockFd, localPort);
> >> >>
> >> >>   cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
> >> >>
> >> >>   openAssociation(sockFd, remoteIp1, remotePort1);
> >> >>   openAssociation(sockFd, remoteIp1, remotePort2);
> >> >>   openAssociation(sockFd, remoteIpFake, remotePortFake);
> >> >>
> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >>
> >> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
> >> >>   for (int i = 0; i < 10; i++) {
> >> >>     sleep(10);
> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >>   }
> >> >> }
> >> >>
> >> >>
> >> >> int main(int argc, char** argv) {
> >> >>   string testNr = "1";
> >> >>   string& testData = data1000;
> >> >>   if (argc >= 2) {
> >> >>     testNr = argv[1];
> >> >>   }
> >> >>   if (argc >= 3) {
> >> >>     testData = data100;
> >> >>   }
> >> >>
> >> >>   if (testNr == "1") {
> >> >>     test1(testData);
> >> >>   } else {
> >> >>     test2(testData);
> >> >>   }
> >> >>
> >> >>   return 0;
> >> >> }
> >> >
> >
> >




* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
                   ` (5 preceding siblings ...)
  2017-09-21 15:24 ` Neil Horman
@ 2017-09-22  9:05 ` Peter Salin
  2017-09-22 11:33 ` Neil Horman
  2017-09-22 20:06 ` Neil Horman
  8 siblings, 0 replies; 10+ messages in thread
From: Peter Salin @ 2017-09-22  9:05 UTC (permalink / raw)
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 24847 bytes --]

2017-09-21 18:24 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> On Thu, Sep 21, 2017 at 03:41:51PM +0300, Peter Salin wrote:
>> 2017-09-21 14:01 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
>> > On Wed, Sep 20, 2017 at 02:02:45PM +0300, Peter Salin wrote:
>> >> 2017-09-19 20:09 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
>> >> > On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
>> >> >> Hi,
>> >> >>
>> >> >> I encountered some strange PMTUD related behaviour that I need help in
>> >> >> understanding.
>> >> >>
>> >> >> Setup:
>> >> >>
>> >> >> +-----------+        +---+        +--------+
>> >> >> | 10.0.0.10 |--------| X |--------|10.0.0.3|
>> >> >> +-----------+        +---+        +--------+
>> >> >>
>> >> >> A one-to-many socket is set up at 10.0.0.10. Two instances of the
>> >> >> lksctp sctp_darn application are run at 10.0.0.3 listening to ports
>> >> >> 8001 and 8002. 10.0.0.3 was also set up to generate ICMP frag needed
>> >> >> messages for incoming messages over 600 bytes. The same issue also
>> >> >> occurs when a router on the path was set up to generate the ICMP
>> >> >> message instead.
>> >> >>
>> >> >> Test 1:
>> >> >> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
>> >> >> port 8001 and another one to 8002. Then a too large message was sent
>> >> >> on the association to 8001, triggering ICMP generation. When checking
>> >> >> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
>> >> >> association now reports 600. The association to 8002 reports 1500
>> >> >> until traffic is sent on it, at which point it also adjusts to 600
>> >> >> which I think makes sense since the destination IP is the same. When
>> >> >> reopening the associations, the value of 600 would be remembered for
>> >> >> about 10 min, which I also think makes sense since
>> >> >> net.ipv4.route.mtu_expires is 600.
>> >> >>
>> >> >> Test 2:
>> >> >> Again the same two associations were connected to 10.0.0.3, but in
>> >> >> addition an attempt to connect a third association to a non-existing
>> >> >> IP was done, this attempt fails with timeout after a while. After
>> >> >> that, again an ICMP triggering large message was sent to 8001. Now the
>> >> >> behaviour is different from before. The association to 8001 reports a
>> >> >> spinfo_mtu of 600, but only for a brief moment, it does not stay at
>> >> >> 600 for 10 minutes. In addition the spinfo_mtu of the association to
>> >> >> 8002 never changes, it stays at the original 1500.
>> >> >>
>> >> >> The only difference between the two tests is the attempt to connect to
>> >> >> a non-responding IP at the beginning of test 2. Any ideas why the
>> >> >> behaviour changes, is this a bug or is there some other reason for
>> >> >> this?
>> >> >>
>> >> >> I have attached the sample application used for reproducing this.
>> >> >>
>> >> >> BR,
>> >> >> -Peter
>> >> >>
>> >> > Hey, apologies for the delay on this, I've had it in my reader for days and kept
>> >> > meaning to respond, but kept getting sidetracked.
>> >> >
>> >> > First glance, this sounds incorrect.  Each association (or rather each
>> >> > transport) maintains its own mtu, and the association reflects the mtu of the
>> >> > active transport. Given that each transport holds its own dst cache entry, I
>> >> > have a hard time seeing how one transport's mtu changes might leak to another.
>> >> >
>> >> > But that's not really what's happening here.  By your description, the active
>> >> > transport on the established association isn't updating its pathmtu, which
>> >> > should happen in response to receiving the ICMP_FRAG_NEEDED message.
>> >> >
>> >> > I know you've provided the reproducer below, and I appreciate that, but I don't
>> >> > have the cycles to set this up at the moment.  Could you tell me if, during the
>> >> > second test, after you attempt to connect to the fake ip address and then send
>> >> > the large message that should trigger the frag needed message, does said large
>> >> > message get retransmitted and eventually arrive at the peer host?  If so, that
>> >> > suggests that the sctp stack:
>> >> >
>> >> > a) receives the frag needed message
>> >> > and
>> >> > b) resends the packet at the lower frag point
>> >> >
>> >> > That in turn suggests we just have some internal reporting error in which we
>> >> > don't update the association's pmtu with the active transport's.
>> >> >
>> >> > Let me know the answer to that question and it will give me some places to start
>> >> > looking
>> >> > Neil
>> >> >
>> >> Thanks for responding. In response to your question, the first large
>> >> message does get retransmitted without the Don't Fragment bit set. I
>> >> modified the test a bit to also send further messages after the first
>> >> one. Those messages are indeed fragmented according to the limit of
>> >> the ICMP message. I have attached a PCAP trace and SCTP debug logs in
>> >> case that helps here.
>> >>
>> >> I also tried sending a large message on the other association after
>> >> the large message on the first association had been sent. For test 2
>> >> that message was not fragmented even though the ICMP was already
>> >> received for the first assoc. After the second assoc also received an
>> >> ICMP it adjusted to use the lower MTU for subsequent messages. In the
>> >> case of test 1, sending a large message on the second assoc would auto
>> >> fragment already on the first message.
>> >>
>> >> Also, after stopping and rerunning test 2 the MTU would always be
>> >> reset at 1500, whereas in test 1 the lower limit would still be in
>> >> effect for a new run. So it seems like in test 2 the lower MTU is only
>> >> known within each association, where as in test 1 the lower MTU also
>> >> gets stored deeper down?
>> >>
>> >> BR,
>> >> -Peter
>> >>
>> > So, from what I can see, your included tcpdump only shows the first part of what
>> > you are describing.  That is to say that it sends a large data chunk on an
>> > association that gets an ICMP frag needed response, after which the pmtu is
>> > lowered and smaller message fragments are sent, which is good (i.e. working as
>> > designed).
>> >
>> > I don't see anything in the tcpdump relating to the remainder of your test,
>> > showing failed fragmentation.  Can you include that please?
>> >
>> > Neil
>> Yes, please find attached traces that include sending on the other
>> association after receiving the first ICMP.
>>
>
>
> Thank you.  So tell me if I'm missing something here, but I think this trace
> contradicts what you describe above.  Some specifics:
>
> 1) I observe two associations in this trace:
>         a) An association with index 0, whose init chunk is in frame 1
>         b) An association with index 1, whose init chunk is in frame 5
>         Note that I can toggle between these association flows with the display
> filter of:
>         sctp.assoc_index == 1
>         or
>         sctp.assoc_index == 0
>         in wireshark
>
>
> 2) In both flows, I can observe that a large chunk is sent:
>         a) in assoc index 0, the over-mtu chunk is in frame 9
>         b) in assoc index 1, the over-mtu chunk is in frame 16
>
> 3) Subsequent to each data chunk in (2), we get an icmp unreach (frag needed
> message)
>         a) in assoc index 0, the icmp is in frame 10
>         b) in assoc index 1, the icmp is in frame 17
>
> 4) Subsequent to (3), all DATA chunks appear to get limited to an appropriate
> size for the path mtu as specified in the respective icmp from (3), and
> oversized datagrams are appropriately fragmented.
>
>
> Please let me know if I'm missing something, but this trace shows everything to
> be working as normal.
>
> Neil

I would have expected the second ICMP to not be needed as I thought
both assocs are on the same transport.

I have now attached traces for both tests so that you can compare them
side-by-side and see what I am after here. I have run each test twice
in the traces to be able to show the two key differences here:

1) In test 1, the first large message sent on the second assoc (frame
22) is already limited correctly in size, so no second ICMP is needed,
unlike in test 2.

(Here test 1 behaviour looked ok to me since I thought both assocs are
on the same transport and therefore the MTU would be synced to both
assocs. There seems to be some locally sent ICMP message in frame 16,
perhaps this has to do with the syncing?)

2) When rerunning test 1, the previously found out MTU value is
remembered and no new ICMPs are needed (frame 52). This is not the
case for test 2.

(Again here test 1 made sense to me since I thought the MTU would be
cached and only forgotten after 10 minutes
(net.ipv4.route.mtu_expires).)

Please note that in the test application the only difference between
test 1 and test 2 is the attempt to connect a third assoc to a
non-responding IP in test 2. Yet the behaviour of the stack is very
different between the two tests.

I am new to SCTP, but to me the behaviour shown in test 1 looked more
like what I would have expected. In any case I don't understand why
the behaviour is so different between these two cases, so I hope we
can find some explanation for that.

BR,
-Peter
>
>> BR,
>> -Peter
>> >
>> >> >> ------ ver_linux output ------
>> >> >> Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
>> >> >> 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>> >> >>
>> >> >> GNU C                   5.4.0
>> >> >> GNU Make                4.1
>> >> >> Binutils                2.26.1
>> >> >> Util-linux              2.27.1
>> >> >> Mount                   2.27.1
>> >> >> Module-init-tools       22
>> >> >> E2fsprogs               1.42.13
>> >> >> Xfsprogs                4.3.0
>> >> >> Linux C Library         2.23
>> >> >> Dynamic linker (ldd)    2.23
>> >> >> Linux C++ Library       6.0.21
>> >> >> Procps                  3.3.10
>> >> >> Net-tools               1.60
>> >> >> Kbd                     1.15.5
>> >> >> Console-tools           1.15.5
>> >> >> Sh-utils                8.25
>> >> >> Udev                    229
>> >> >> Modules Loaded          ablk_helper aes_x86_64 aesni_intel
>> >> >> async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
>> >> >> binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
>> >> >> gf128mul ghash_clmul ni_intel glue_helper hid hid_generic ib_addr
>> >> >> ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
>> >> >> iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
>> >> >> multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
>> >> >> raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor
>> >> >
>> >> >>
>> >> >> #include <cstring>
>> >> >> #include <ctime>
>> >> >> #include <iomanip>
>> >> >> #include <iostream>
>> >> >>
>> >> >> #include <errno.h>
>> >> >> #include <unistd.h>
>> >> >> #include <arpa/inet.h>
>> >> >> #include <net/if.h>
>> >> >> #include <netinet/in.h>
>> >> >> #include <netinet/sctp.h>
>> >> >> #include <sys/ioctl.h>
>> >> >> #include <sys/socket.h>
>> >> >>
>> >> >> using namespace std;
>> >> >>
>> >> >> static const int ERROR_BUFLEN = 64;
>> >> >> static const char* SCTP_INTERFACE_NAME = "ens4";
>> >> >>
>> >> >> static string data100 = "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789";
>> >> >> static string data1000 = "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789"
>> >> >>   "01234567890123456789012345678901234567890123456789";
>> >> >>
>> >> >> void printError(const string& msg, const string& funcName) {
>> >> >>   char errorMessage[ERROR_BUFLEN] {};
>> >> >>   char* errMsg = ::strerror_r(errno, errorMessage,
>> >> >>                             sizeof(errorMessage));
>> >> >>
>> >> >>   cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
>> >> >> }
>> >> >>
>> >> >> int createSocket() {
>> >> >>   int sockFd = socket (AF_INET,
>> >> >>                      SOCK_SEQPACKET,
>> >> >>                      IPPROTO_SCTP);
>> >> >>   if (sockFd == -1) {
>> >> >>     printError("Creation of socket failed", __FUNCTION__);
>> >> >>     return -1;
>> >> >>   }
>> >> >>
>> >> >>   // Enable address reuse
>> >> >>   int enable = 1;
>> >> >>   int err = setsockopt(sockFd,
>> >> >>                      SOL_SOCKET,
>> >> >>                      SO_REUSEADDR,
>> >> >>                      &enable,
>> >> >>                      sizeof(enable));
>> >> >>
>> >> >>   if (err) {
>> >> >>     printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
>> >> >>     close(sockFd);
>> >> >>     return -1;
>> >> >>   }
>> >> >>
>> >> >>   // Configure SCTP
>> >> >>   sctp_initmsg initmsg{};
>> >> >>   initmsg.sinit_num_ostreams = 3;
>> >> >>   initmsg.sinit_max_instreams = 3;
>> >> >>   initmsg.sinit_max_attempts = 2;
>> >> >>   initmsg.sinit_max_init_timeo = 0;
>> >> >>
>> >> >>   err = setsockopt(sockFd,
>> >> >>                  IPPROTO_SCTP,
>> >> >>                  SCTP_INITMSG,
>> >> >>                  &initmsg,
>> >> >>                  sizeof(initmsg));
>> >> >>
>> >> >>   if (err) {
>> >> >>     printError("Configuring SCTP socket failed", __FUNCTION__);
>> >> >>     close(sockFd);
>> >> >>     return -1;
>> >> >>   }
>> >> >>
>> >> >>   struct sctp_paddrparams paddr_params{};
>> >> >>   memset(&paddr_params, 0, sizeof(paddr_params));
>> >> >>   socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
>> >> >>   paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;
>> >> >>
>> >> >>   err = setsockopt(sockFd,
>> >> >>                  IPPROTO_SCTP,
>> >> >>                  SCTP_PEER_ADDR_PARAMS,
>> >> >>                  &paddr_params,
>> >> >>                  size_of_sctp_paddr_params);
>> >> >>
>> >> >>   if (err) {
>> >> >>     printError("Configuring SCTP params failed", __FUNCTION__);
>> >> >>     close(sockFd);
>> >> >>     return -1;
>> >> >>   }
>> >> >>
>> >> >>   return sockFd;
>> >> >> }
>> >> >>
>> >> >> bool bindSocket(const int sockFd, const int localPort) {
>> >> >>   // Get IP of ethernet interface
>> >> >>   string localAddress = "";
>> >> >>   ifreq ifr{};
>> >> >>   ifr.ifr_addr.sa_family = AF_INET;
>> >> >>   strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
>> >> >>   const int ioctlStatus = ioctl(sockFd,
>> >> >>                               SIOCGIFADDR,
>> >> >>                               &ifr);
>> >> >>
>> >> >>   if (ioctlStatus == -1) {
>> >> >>     printError("Failed to get local address", __FUNCTION__);
>> >> >>     return false;
>> >> >>   }
>> >> >>
>> >> >>   char ipAddrBuffer[INET_ADDRSTRLEN] {};
>> >> >>   inet_ntop(AF_INET,
>> >> >>           &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
>> >> >>           ipAddrBuffer,
>> >> >>           sizeof(ipAddrBuffer));
>> >> >>
>> >> >>   localAddress.assign(ipAddrBuffer);
>> >> >>
>> >> >>   // Bind to found ip address
>> >> >>   sockaddr_in serv_addr{};
>> >> >>   serv_addr.sin_family = AF_INET;
>> >> >>   inet_pton(AF_INET,
>> >> >>           localAddress.c_str(),
>> >> >>           &serv_addr.sin_addr);
>> >> >>   serv_addr.sin_port = htons(localPort);
>> >> >>
>> >> >>   if (bind(sockFd,
>> >> >>          reinterpret_cast<sockaddr*>(&serv_addr),
>> >> >>          sizeof(serv_addr))) {
>> >> >>     printError("Failed to bind socket to local address", __FUNCTION__);
>> >> >>     localAddress.clear();
>> >> >>     close(sockFd);
>> >> >>     return false;
>> >> >>   }
>> >> >>
>> >> >>   cout << "Local endpoint successfully bound to local address: " << localAddress << endl;
>> >> >>
>> >> >>   return true;
>> >> >> }
>> >> >>
>> >> >> bool openAssociation(const int sockFd,
>> >> >>                    const string &remoteAddress,
>> >> >>                    std::uint16_t remotePort) {
>> >> >>
>> >> >>   sockaddr_in address{};
>> >> >>   address.sin_family = AF_INET;
>> >> >>   inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
>> >> >>   address.sin_port = htons(remotePort);
>> >> >>
>> >> >>   int connectError = connect(sockFd,
>> >> >>                            reinterpret_cast<sockaddr *>(&address),
>> >> >>                            sizeof(address));
>> >> >>   if (connectError) {
>> >> >>     printError("Error connecting association", __FUNCTION__);
>> >> >>     return false;
>> >> >>   }
>> >> >>
>> >> >>   cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
>> >> >>   return true;
>> >> >> }
>> >> >>
>> >> >> void sendReq(const int sockFd,
>> >> >>            const string& remoteAddress,
>> >> >>            const uint16_t remotePort,
>> >> >>            const std::string& data)
>> >> >> {
>> >> >>
>> >> >>   struct sockaddr_in remoteAddr {};
>> >> >>   remoteAddr.sin_family = AF_INET;
>> >> >>   remoteAddr.sin_port = htons(remotePort);
>> >> >>
>> >> >>   uint32_t payloadProtId = 7;
>> >> >>   uint16_t streamId = 0;
>> >> >>   uint32_t dataLength = data.size();
>> >> >>   sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
>> >> >>   inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);
>> >> >>
>> >> >>   const std::string ipaddr =
>> >> >>     inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);
>> >> >>
>> >> >>   cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
>> >> >>   cout << ", len=" << dataLength << endl;
>> >> >>
>> >> >>   const int bytesSent = sctp_sendmsg(sockFd,
>> >> >>                                    data.c_str(),
>> >> >>                                    (size_t)dataLength,
>> >> >>                                    servaddr,
>> >> >>                                    sizeof(sockaddr_in),
>> >> >>                                    htonl(payloadProtId),
>> >> >>                                    SCTP_ADDR_OVER,
>> >> >>                                    streamId,
>> >> >>                                    200,
>> >> >>                                    0);
>> >> >>
>> >> >>   if (bytesSent == -1) {
>> >> >>     printError("SCTP send failed", __FUNCTION__);
>> >> >>   }
>> >> >>
>> >> >>   return;
>> >> >> }
>> >> >>
>> >> >> sctp_assoc_t getSocketAssociationId(const int sockFd,
>> >> >>                                   const string &remoteIpAddress,
>> >> >>                                   std::uint16_t remotePort)
>> >> >>
>> >> >> {
>> >> >>   sockaddr_in socket_address_in{};
>> >> >>
>> >> >>   socket_address_in.sin_family = AF_INET;
>> >> >>   socket_address_in.sin_port = htons(remotePort);
>> >> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
>> >> >>
>> >> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>> >> >>   socklen_t salen = sizeof(socket_address_in);  // size of the address, not of a pointer
>> >> >>
>> >> >>   struct sctp_paddrinfo peer_address_info{};
>> >> >>   socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
>> >> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
>> >> >>
>> >> >>   const int sctpOptInfoError = sctp_opt_info(sockFd,
>> >> >>                                            0,
>> >> >>                                            SCTP_GET_PEER_ADDR_INFO,
>> >> >>                                            &peer_address_info,
>> >> >>                                            &size_of_sctp_paddrinfo);
>> >> >>
>> >> >>   if (sctpOptInfoError) {
>> >> >>     printError("Failed to get association id", __FUNCTION__);
>> >> >>   }
>> >> >>
>> >> >>   return peer_address_info.spinfo_assoc_id;
>> >> >> }
>> >> >>
>> >> >> std::uint32_t getAssociationPathMtu(const int sockFd,
>> >> >>                                   const string &remoteIpAddress,
>> >> >>                                   const std::uint16_t remotePort) {
>> >> >>   sockaddr_in socket_address_in{};
>> >> >>
>> >> >>   socket_address_in.sin_family = AF_INET;
>> >> >>   socket_address_in.sin_port = htons(remotePort);
>> >> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
>> >> >>
>> >> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
>> >> >>   socklen_t salen = sizeof(socket_address_in);  // size of the address, not of a pointer
>> >> >>
>> >> >>   struct sctp_paddrinfo peer_address_info{};
>> >> >>   socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
>> >> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
>> >> >>
>> >> >>   sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);
>> >> >>
>> >> >>   const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
>> >> >>                                            SCTP_GET_PEER_ADDR_INFO,
>> >> >>                                            &peer_address_info, &size_of_sctp_paddrinfo);
>> >> >>   if (sctpOptInfoError) {
>> >> >>     printError("Failed to get pmtu", __FUNCTION__);
>> >> >>   }
>> >> >>
>> >> >>   auto t = std::time(nullptr);
>> >> >>   auto tm = *std::localtime(&t);
>> >> >>   std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
>> >> >>   cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;
>> >> >>
>> >> >>   return peer_address_info.spinfo_mtu;
>> >> >> }
>> >> >>
>> >> >> void test1(const string& data) {
>> >> >>   int localPort = 2944;
>> >> >>   string remoteIp1 = "10.0.0.3";
>> >> >>   uint16_t remotePort1 = 8001;
>> >> >>   uint16_t remotePort2 = 8002;
>> >> >>
>> >> >>   int sockFd = createSocket();
>> >> >>   bindSocket(sockFd, localPort);
>> >> >>
>> >> >>   cout << "### Test 1: 2 assocs" << endl;
>> >> >>
>> >> >>   openAssociation(sockFd, remoteIp1, remotePort1);
>> >> >>   openAssociation(sockFd, remoteIp1, remotePort2);
>> >> >>
>> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >> >>
>> >> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
>> >> >>   for (int i = 0; i < 10; i++) {
>> >> >>     sleep(10);
>> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >> >>   }
>> >> >> }
>> >> >>
>> >> >> void test2(const string& data) {
>> >> >>   int localPort = 2944;
>> >> >>   string remoteIp1 = "10.0.0.3";
>> >> >>   uint16_t remotePort1 = 8001;
>> >> >>   uint16_t remotePort2 = 8002;
>> >> >>   string remoteIpFake = "10.52.96.204";
>> >> >>   uint16_t remotePortFake = 3239;
>> >> >>
>> >> >>   int sockFd = createSocket();
>> >> >>   bindSocket(sockFd, localPort);
>> >> >>
>> >> >>   cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
>> >> >>
>> >> >>   openAssociation(sockFd, remoteIp1, remotePort1);
>> >> >>   openAssociation(sockFd, remoteIp1, remotePort2);
>> >> >>   openAssociation(sockFd, remoteIpFake, remotePortFake);
>> >> >>
>> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >> >>
>> >> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
>> >> >>   for (int i = 0; i < 10; i++) {
>> >> >>     sleep(10);
>> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
>> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
>> >> >>   }
>> >> >> }
>> >> >>
>> >> >>
>> >> >> int main(int argc, char** argv) {
>> >> >>   string testNr = "1";
>> >> >>   string testData = data1000;  // copy, so the assignment below does not overwrite data1000
>> >> >>   if (argc >= 2) {
>> >> >>     testNr = argv[1];
>> >> >>   }
>> >> >>   if (argc >= 3) {
>> >> >>     testData = data100;
>> >> >>   }
>> >> >>
>> >> >>   if (testNr == "1") {
>> >> >>     test1(testData);
>> >> >>   } else {
>> >> >>     test2(testData);
>> >> >>   }
>> >> >>
>> >> >>   return 0;
>> >> >> }
>> >> >
>> >
>> >
>
>

[-- Attachment #2: test_traces.tar.gz --]
[-- Type: application/x-gzip, Size: 28616 bytes --]


* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
                   ` (6 preceding siblings ...)
  2017-09-22  9:05 ` Peter Salin
@ 2017-09-22 11:33 ` Neil Horman
  2017-09-22 20:06 ` Neil Horman
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2017-09-22 11:33 UTC (permalink / raw)
  To: linux-sctp

On Fri, Sep 22, 2017 at 12:05:30PM +0300, Peter Salin wrote:
> 2017-09-21 18:24 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> > On Thu, Sep 21, 2017 at 03:41:51PM +0300, Peter Salin wrote:
> >> 2017-09-21 14:01 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> >> > On Wed, Sep 20, 2017 at 02:02:45PM +0300, Peter Salin wrote:
> >> >> 2017-09-19 20:09 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> >> >> > On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
> >> >> >> Hi,
> >> >> >>
> >> >> >> I encountered some strange PMTUD related behaviour that I need help in
> >> >> >> understanding.
> >> >> >>
> >> >> >> Setup:
> >> >> >>
> >> >> >> +-----------+        +---+        +--------+
> >> >> >> | 10.0.0.10 |--------| X |--------|10.0.0.3|
> >> >> >> +-----------+        +---+        +--------+
> >> >> >>
> >> >> >> A one to many socket is setup at 10.0.0.10. Two instances of the
> >> >> >> lksctp sctp_darn applications are ran at 10.0.0.3 listening to ports
> >> >> >> 8001 and 8002. 10.0.0.3 was also setup to generate ICMP frag needed
> >> >> >> messages for incoming messages over 600 bytes. This same issue also
> >> >> >> occurs also when a router on the path was setup to generate the ICMP
> >> >> >> message instead.
> >> >> >>
> >> >> >> Test 1:
> >> >> >> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
> >> >> >> port 8001 and another one to 8002. Then a too large message was sent
> >> >> >> on the association to 8001, triggering ICMP generation. When checking
> >> >> >> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
> >> >> >> association now reports 600. The association to 8002 reports 1500
> >> >> >> until traffic is sent on it, at which point it also adjusts to 600
> >> >> >> which I think makes sense since the destination IP is the same. When
> >> >> >> reopening the associations, the value of 600 would be remembered for
> >> >> >> about 10 min, which I also think makes sense since
> >> >> >> net.ipv4.route.mtu_expires is 600.
> >> >> >>
> >> >> >> Test 2:
> >> >> >> Again the same two associations were connected to 10.0.0.3, but in
> >> >> >> addition an attempt to connect a third association to a non-existing
> >> >> >> IP was done, this attempt fails with timeout after a while. After
> >> >> >> that, again an ICMP triggering large message was sent to 8001. Now the
> >> >> >> behaviour is different from before. The association to 8001 reports a
> >> >> >> spinfo_mtu of 600, but only for a brief moment, it does not stay at
> >> >> >> 600 for 10 minutes. In addition the spinfo_mtu of the association to
> >> >> >> 8002 never changes, it stays at the original 1500.
> >> >> >>
> >> >> >> The only difference between the two tests is the attempt to connect to
> >> >> >> a non-responding IP at the beginning of test 2. Any ideas why the
> >> >> >> behaviour changes, is this a bug or is there some other reason for
> >> >> >> this?
> >> >> >>
> >> >> >> I have attached the sample application used for reproducing this.
> >> >> >>
> >> >> >> BR,
> >> >> >> -Peter
> >> >> >>
> >> >> > Hey, apologies for the delay on this, I've had it in my reader for days and kept
> >> >> > meaning to respond, but kept getting sidetracked.
> >> >> >
> >> >> > First glance, this sounds incorrect.  Each association (or rather each
> >> >> > transport) maintains its own mtu, and the association reflects the mtu of the
> >> >> > active transport. Given that each transport holds its own dst cache entry, I
> >> >> > have a hard time seeing how one transport's mtu changes might leak to another.
> >> >> >
> >> >> > But that's not really what's happening here.  By your description, the active
> >> >> > transport on the established association isn't updating its pathmtu, which
> >> >> > should happen in response to receiving the ICMP_FRAG_NEEDED message.
> >> >> >
> >> >> > I know you've provided the reproducer below, and I appreciate that, but I don't
> >> >> > have the cycles to set this up at the moment.  Could you tell me if, during the
> >> >> > second test, after you attempt to connect to the fake ip address and then send
> >> >> > the large message that should trigger the frag needed message, does said large
> >> >> > message get retransmitted and eventually arrive at the peer host?  If so, that
> >> >> > suggests that the sctp stack:
> >> >> >
> >> >> > a) receives the frag needed message
> >> >> > and
> >> >> > b) resends the packet at the lower frag point
> >> >> >
> >> >> > That in turn suggests we just have some internal reporting error in which we
> >> >> > don't update the associations pmtu with the active transports
> >> >> >
> >> >> > Let me know the answer to that question and it will give me some places to start
> >> >> > looking
> >> >> > Neil
> >> >> >
> >> >> Thanks for responding. In response to your question, the first large
> >> >> message does get retransmitted without the Don't Fragment bit set. I
> >> >> modified the test a bit to also send further messages after the first
> >> >> one. Those messages are indeed fragmented according to the limit of
> >> >> the ICMP message. I have attached a PCAP trace and SCTP debug logs in
> >> >> case that helps here.
> >> >>
> >> >> I also tried sending a large message on the other association after
> >> >> the large message on the first association had been sent. For test 2
> >> >> that message was not fragmented even though the ICMP was already
> >> >> received for the first assoc. After the second assoc also received an
> >> >> ICMP it adjusted to use the lower MTU for subsequent messages. In the
> >> >> case of test 1, sending a large message on the second assoc would auto
> >> >> fragment already on the first message.
> >> >>
> >> >> Also, after stopping and rerunning test 2 the MTU would always be
> >> >> reset at 1500, whereas in test 1 the lower limit would still be in
> >> >> effect for a new run. So it seems like in test 2 the lower MTU is only
> >> >> known within each association, where as in test 1 the lower MTU also
> >> >> gets stored deeper down?
> >> >>
> >> >> BR,
> >> >> -Peter
> >> >>
> >> > So, from what I can see, your included tcpdump only shows the first part of what
> >> > you are describing.  That is to say that it sends a large data chunk on an
> >> > association that gets an ICMP frag needed response, after which the pmtu is
> >> > lowered and smaller message fragments are sent, which is good (i.e. working as
> >> > designed).
> >> >
> >> > I don't see anything in the tcpdump relating to the remainder of your test,
> >> > showing failed fragmentation.  Can you include that please?
> >> >
> >> > Neil
> >> Yes, please find attached traces that include sending on the other
> >> association after receiving the first ICMP.
> >>
> >
> >
> > Thank you.  So tell me if I'm missing something here, but I think this trace
> > contradicts what you describe above.  Some specifics:
> >
> > 1) I observe two associations in this trace:
> >         a) An association with index 0, whose init chunk is in frame 1
> >         b) An association with index 1, whose init chunk is in frame 5
> >         Note that I can toggle between these association flows with the display
> > filter of:
> >         sctp.assoc_index == 1
> >         or
> >         sctp.assoc_index == 0
> >         in wireshark
> >
> >
> > 2) In both flows, I can observe that a large chunk is sent:
> >         a) in assoc index 0, the over-mtu chunk is in frame 9
> >         b) in assoc index 1, the over-mtu chunk is in frame 16
> >
> > 3) Subsequent to each data chunk in (2), we get an icmp unreach (frag needed
> > message)
> >         a) in assoc index 0, the icmp is in frame 10
> >         b) in assoc index 1, the icmp is in frame 17
> >
> > 4) Subsequent to (3), all DATA chunks appear to get limited to an appropriate
> > size for the path mtu as specified in the respective icmp from (3), and
> > oversized datagrams are appropriately fragmented.
> >
> >
> > Please let me know if I'm missing something, but this trace shows everything to
> > be working as normal.
> >
> > Neil
> 
> I would have expected the second ICMP to not be needed as I thought
> both assocs are on the same transport.
Oh, I'm sorry, that may be where the conflict is here.  Each association creates
its own transport objects, even if they share endpoint addresses.  Given that
the transport object holds its own unique dst entry, which is where the pmtu
value is derived from, each association needs to go through the pmtu scaling
process.  Perhaps this is where the confusion lies?
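
To make the point above concrete, here is a toy model of it (a sketch only,
not kernel code; ToyTransport, ToyAssociation and onFragNeeded are names made
up for this illustration). It assumes nothing beyond what is described above:
each association owns its own transport, and a frag needed report only lowers
the PMTU held by the transport that triggered it.

```cpp
#include <cstdint>
#include <string>

// Toy model only: these types are hypothetical and are not kernel structures.
struct ToyTransport {
  std::string peerAddress;  // destination IP this transport sends to
  std::uint32_t pathMtu;    // PMTU value held by this transport alone
};

struct ToyAssociation {
  std::uint16_t peerPort;
  ToyTransport transport;   // every association owns a separate transport
};

// A frag-needed report lowers the PMTU of the one transport whose packet
// triggered it; transports of other associations are left untouched.
void onFragNeeded(ToyAssociation& assoc, std::uint32_t reportedMtu) {
  if (reportedMtu < assoc.transport.pathMtu) {
    assoc.transport.pathMtu = reportedMtu;
  }
}
```

With two such associations to 10.0.0.3 (ports 8001 and 8002), both starting at
1500, calling onFragNeeded on the first with 600 leaves the second at 1500
until it gets a report of its own. The model deliberately leaves out any
route-level caching, so it does not by itself account for what test 1 shows.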

> 
> I have now attached traces for both tests so that you can compare them
> side-by-side and see what I am after here. I have run each test twice
> in the traces to be able to show the two key differences here:
> 
> 1) In test 1, the first large message sent on the second assoc (frame
> 22) is already limited correctly in size and no second ICMP is needed
> like in test 2.
> 
> (Here test 1 behaviour looked ok to me since I thought both assocs are
> on the same transport and therefore the MTU would be synced to both
> assocs. There seems to be some locally sent ICMP message in frame 16,
> perhaps this has to do with the syncing?)
> 
> 2) When rerunning test 1, the previously found out MTU value is
> remembered and no new ICMPs are needed (frame 52). This is not the
> case for test 2.
> 
> (Again here test 1 made sense to me since I thought the MTU would be
> cached and only forgotten after 10 minutes
> (net.ipv4.route.mtu_expires).)
> 
> Please note that in the test application the only difference between
> test 1 and test 2 is the attempt to connect a third assoc to a
> non-responding IP in test 2. Yet the behaviour of the stack is very
> different between the two tests.
> 
> I am new to SCTP, but to me the behaviour shown in test 1 looked more
> like what I would have expected. In any case I don't understand why
> the behaviour is so different between these two cases, so I hope we
> can find some explanation for that.
> 
Ok, I'll take a look at these later today and compare.

Neil




* Re: PMTU discovery behaviour
  2017-09-11 12:44 PMTU discovery behaviour Peter Salin
                   ` (7 preceding siblings ...)
  2017-09-22 11:33 ` Neil Horman
@ 2017-09-22 20:06 ` Neil Horman
  8 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2017-09-22 20:06 UTC (permalink / raw)
  To: linux-sctp

On Fri, Sep 22, 2017 at 12:05:30PM +0300, Peter Salin wrote:
> 2017-09-21 18:24 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> > On Thu, Sep 21, 2017 at 03:41:51PM +0300, Peter Salin wrote:
> >> 2017-09-21 14:01 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> >> > On Wed, Sep 20, 2017 at 02:02:45PM +0300, Peter Salin wrote:
> >> >> 2017-09-19 20:09 GMT+03:00 Neil Horman <nhorman@tuxdriver.com>:
> >> >> > On Mon, Sep 11, 2017 at 03:44:57PM +0300, Peter Salin wrote:
> >> >> >> Hi,
> >> >> >>
> >> >> >> I encountered some strange PMTUD related behaviour that I need help in
> >> >> >> understanding.
> >> >> >>
> >> >> >> Setup:
> >> >> >>
> >> >> >> +-----------+        +---+        +--------+
> >> >> >> | 10.0.0.10 |--------| X |--------|10.0.0.3|
> >> >> >> +-----------+        +---+        +--------+
> >> >> >>
> >> >> >> A one to many socket is setup at 10.0.0.10. Two instances of the
> >> >> >> lksctp sctp_darn applications are ran at 10.0.0.3 listening to ports
> >> >> >> 8001 and 8002. 10.0.0.3 was also setup to generate ICMP frag needed
> >> >> >> messages for incoming messages over 600 bytes. This same issue also
> >> >> >> occurs also when a router on the path was setup to generate the ICMP
> >> >> >> message instead.
> >> >> >>
> >> >> >> Test 1:
> >> >> >> Two associations were connected from 10.0.0.10 to 10.0.0.3, one to
> >> >> >> port 8001 and another one to 8002. Then a too large message was sent
> >> >> >> on the association to 8001, triggering ICMP generation. When checking
> >> >> >> the MTU reported in spinfo_mtu field of SCTP_GET_PEER_ADDR_INFO, the
> >> >> >> association now reports 600. The association to 8002 reports 1500
> >> >> >> until traffic is sent on it, at which point it also adjusts to 600
> >> >> >> which I think makes sense since the destination IP is the same. When
> >> >> >> reopening the associations, the value of 600 would be remembered for
> >> >> >> about 10 min, which I also think makes sense since
> >> >> >> net.ipv4.route.mtu_expires is 600.
> >> >> >>
> >> >> >> Test 2:
> >> >> >> Again the same two associations were connected to 10.0.0.3, but in
> >> >> >> addition an attempt to connect a third association to a non-existing
> >> >> >> IP was done, this attempt fails with timeout after a while. After
> >> >> >> that, again an ICMP triggering large message was sent to 8001. Now the
> >> >> >> behaviour is different from before. The association to 8001 reports a
> >> >> >> spinfo_mtu of 600, but only for a brief moment, it does not stay at
> >> >> >> 600 for 10 minutes. In addition the spinfo_mtu of the association to
> >> >> >> 8002 never changes, it stays at the original 1500.
> >> >> >>
> >> >> >> The only difference between the two tests is the attempt to connect to
> >> >> >> a non-responding IP at the beginning of test 2. Any ideas why the
> >> >> >> behaviour changes, is this a bug or is there some other reason for
> >> >> >> this?
> >> >> >>
> >> >> >> I have attached the sample application used for reproducing this.
> >> >> >>
> >> >> >> BR,
> >> >> >> -Peter
> >> >> >>
> >> >> > Hey, apologies for the delay on this, I've had it in my reader for days and kept
> >> >> > meaning to respond, but kept getting sidetracked.
> >> >> >
> >> >> > First glance, this sounds incorrect.  Each association (or rather each
> >> >> > transport) maintains its own mtu, and the association reflects the mtu of the
> >> >> > active transport. Given that each transport holds its own dst cache entry, I
> >> >> > have a hard time seeing how one transport's mtu changes might leak to another.
> >> >> >
> >> >> > But that's not really what's happening here.  By your description, the active
> >> >> > transport on the established association isn't updating its pathmtu, which
> >> >> > should happen in response to receiving the ICMP_FRAG_NEEDED message.
> >> >> >
> >> >> > I know you've provided the reproducer below, and I appreciate that, but I don't
> >> >> > have the cycles to set this up at the moment.  Could you tell me if, during the
> >> >> > second test, after you attempt to connect to the fake ip address and then send
> >> >> > the large message that should trigger the frag needed message, does said large
> >> >> > message get retransmitted and eventually arrive at the peer host?  If so, that
> >> >> > suggests that the sctp stack:
> >> >> >
> >> >> > a) receives the frag needed message
> >> >> > and
> >> >> > b) resends the packet at the lower frag point
> >> >> >
> >> >> > That in turn suggests we just have some internal reporting error in which we
> >> >> > don't update the associations pmtu with the active transports
> >> >> >
> >> >> > Let me know the answer to that question and it will give me some places to start
> >> >> > looking
> >> >> > Neil
> >> >> >
> >> >> Thanks for responding. In response to your question, the first large
> >> >> message does get retransmitted without the Don't Fragment bit set. I
> >> >> modified the test a bit to also send further messages after the first
> >> >> one. Those messages are indeed fragmented according to the limit of
> >> >> the ICMP message. I have attached a PCAP trace and SCTP debug logs in
> >> >> case that helps here.
> >> >>
> >> >> I also tried sending a large message on the other association after
> >> >> the large message on the first association had been sent. For test 2
> >> >> that message was not fragmented even though the ICMP was already
> >> >> received for the first assoc. After the second assoc also received an
> >> >> ICMP it adjusted to use the lower MTU for subsequent messages. In the
> >> >> case of test 1, sending a large message on the second assoc would auto
> >> >> fragment already on the first message.
> >> >>
> >> >> Also, after stopping and rerunning test 2 the MTU would always be
> >> >> reset at 1500, whereas in test 1 the lower limit would still be in
> >> >> effect for a new run. So it seems like in test 2 the lower MTU is only
> >> >> known within each association, where as in test 1 the lower MTU also
> >> >> gets stored deeper down?
> >> >>
> >> >> BR,
> >> >> -Peter
> >> >>
> >> > So, from what I can see, your included tcpdump only shows the first part of what
> >> > you are describing.  That is to say that it sends a large data chunk on an
> >> > association that gets an ICMP frag needed response, after which the pmtu is
> >> > lowered and smaller message fragments are sent, which is good (i.e. working as
> >> > designed).
> >> >
> >> > I don't see anything in the tcpdump relating to the remainder of your test,
> >> > showing failed fragmentation.  Can you include that please?
> >> >
> >> > Neil
> >> Yes, please find attached traces that include sending on the other
> >> association after receiving the first ICMP.
> >>
> >
> >
> > Thank you.  So tell me if I'm missing something here, but I think this trace
> > contradicts what you describe above.  Some specifics:
> >
> > 1) I observe two associations in this trace:
> >         a) An association with index 0, whose init chunk is in frame 1
> >         b) An association with index 1, whose init chunk is in frame 5
> >         Note that I can toggle between these association flows with the display
> > filter of:
> >         sctp.assoc_index == 1
> >         or
> >         sctp.assoc_index == 0
> >         in wireshark
> >
> >
> > 2) In both flows, I can observe that a large chunk is sent:
> >         a) in assoc index 0, the over-mtu chunk is in frame 9
> >         b) in assoc index 1, the over-mtu chunk is in frame 16
> >
> > 3) Subsequent to each data chunk in (2), we get an icmp unreach (frag needed
> > message)
> >         a) in assoc index 0, the icmp is in frame 10
> >         b) in assoc index 1, the icmp is in frame 17
> >
> > 4) Subsequent to (3), all DATA chunks appear to get limited to an appropriate
> > size for the path mtu as specified in the respective icmp from (3), and
> > oversized datagrams are appropriately fragmented.
> >
> >
> > Please let me know if I'm missing something, but this trace shows everything to
> > be working as normal.
> >
> > Neil
> 
> I would have expected the second ICMP to not be needed as I thought
> both assocs are on the same transport.
> 
> I have now attached traces for both tests so that you can compare them
> side-by-side and see what I am after here. I have run each test twice
> in the traces to be able to show the two key differences here:
> 
> 1) In test 1, the first large message sent on the second assoc (frame
> 22) is already limited correctly in size and no second ICMP is needed
> like in test 2.
> 
> (Here test 1 behaviour looked ok to me since I thought both assocs are
> on the same transport and therefore the MTU would be synced to both
> assocs. There seems to be some locally sent ICMP message in frame 16,
> perhaps this has to do with the syncing?)
> 
> 2) When rerunning test 1, the previously found out MTU value is
> remembered and no new ICMPs are needed (frame 52). This is not the
> case for test 2.
> 
> (Again here test 1 made sense to me since I thought the MTU would be
> cached and only forgotten after 10 minutes
> (net.ipv4.route.mtu_expires).)
> 
> Please note that in the test application the only difference between
> test 1 and test 2 is the attempt to connect a third assoc to a
> non-responding IP in test 2. Yet the behaviour of the stack is very
> different between the two tests.
> 
> I am new to SCTP, but to me the behaviour shown in test 1 looked more
> like what I would have expected. In any case I don't understand why
> the behaviour is so different between these two cases, so I hope we
> can find some explanation for that.
> 

So, there are two things I think we need to explore:

1) The possibility that both behaviors are correct.  Every time a packet is
generated on the outq (which is asynchronous from the sending of the packet), we
update the pmtu from the route table.  It's possible that, depending on the exact
timing, a packet might be allocated to bundle chunks into based on the route
prior to the mtu route metric getting updated, leading to two icmp packets
getting generated in response.  Put another way, whether an oversized frame is
output may depend on the exact timing of the arrival of the first icmp
message and the decision to transmit the large data chunk on an association.  I
need to look more to confirm this though.  The establishment of a new
association to an unknown address may be affecting such timing.

2) In frame 18 of test_1.pcap, there is a FORWARD_TSN chunk, which is very odd,
because it is on association 1.  That's odd because as far as I know, FORWARD_TSN
chunks are only generated after data has been sent (which is expected, since it's
meant to help pr-sctp skip tsns that are considered lost).  The one exception to
that is that a forward_tsn may be generated in response to a SACK chunk (which
still assumes data has been sent).  This is all interesting because frame 17 in
that trace is in fact a SACK chunk, but it was sent on association 0, which
seems very wrong.  I need to look into this more too.
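
To make the timing argument in (1) concrete, here is a toy model, not kernel
code.  The types RouteCache, Assoc and Event and the function countOversized
are hypothetical names invented for this sketch; it only assumes, as described
above, that each association refreshes its pmtu from the shared route when a
packet is built, and that the icmp lowers the route metric at some independent
point in time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model: hypothetical types, not the kernel's data structures.
struct RouteCache { std::uint32_t mtu = 1500; };  // MTU metric on the shared route
struct Assoc      { std::uint32_t pmtu = 1500; }; // per-association cached copy

enum class Event { BuildOnAssoc0, BuildOnAssoc1, IcmpFragNeeded };

// Replays events in order; returns how many packets were built larger than
// the real path limit (each of those would draw an ICMP frag-needed reply).
std::size_t countOversized(const std::vector<Event>& events,
                           std::uint32_t pathLimit = 600) {
  RouteCache route;
  Assoc assoc[2];
  std::size_t oversized = 0;
  for (Event e : events) {
    if (e == Event::IcmpFragNeeded) {
      route.mtu = pathLimit;            // ICMP lowers the route metric
      continue;
    }
    Assoc& a = assoc[e == Event::BuildOnAssoc0 ? 0 : 1];
    a.pmtu = route.mtu;                 // refresh from the route at build time
    if (a.pmtu > pathLimit) ++oversized;
  }
  return oversized;
}
```

With the order {BuildOnAssoc0, IcmpFragNeeded, BuildOnAssoc1} the model counts
one oversized packet (the test 1 pattern); with {BuildOnAssoc0, BuildOnAssoc1,
IcmpFragNeeded} it counts two (the test 2 pattern).  Whether the real stack
behaves this way still needs to be confirmed.
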
Neil

> BR,
> -Peter
> >
> >> BR,
> >> -Peter
> >> >
> >> >> >> ------ ver_linux output ------
> >> >> >> Linux esalipe-test 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11
> >> >> >> 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> >> >> >>
> >> >> >> GNU C                   5.4.0
> >> >> >> GNU Make                4.1
> >> >> >> Binutils                2.26.1
> >> >> >> Util-linux              2.27.1
> >> >> >> Mount                   2.27.1
> >> >> >> Module-init-tools       22
> >> >> >> E2fsprogs               1.42.13
> >> >> >> Xfsprogs                4.3.0
> >> >> >> Linux C Library         2.23
> >> >> >> Dynamic linker (ldd)    2.23
> >> >> >> Linux C++ Library       6.0.21
> >> >> >> Procps                  3.3.10
> >> >> >> Net-tools               1.60
> >> >> >> Kbd                     1.15.5
> >> >> >> Console-tools           1.15.5
> >> >> >> Sh-utils                8.25
> >> >> >> Udev                    229
> >> >> >> Modules Loaded          ablk_helper aes_x86_64 aesni_intel
> >> >> >> async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4
> >> >> >> binfmt_misc btrfs  crc32_pclmul crct10dif_pclmul cryptd floppy
> >> >> >> gf128mul ghash_clmulni_intel glue_helper hid hid_generic ib_addr
> >> >> >> ib_cm ib_core ib_iser ib_mad ib_sa input_leds irqbypass iscsi_tcp
> >> >> >> iw_cm joydev kvm kvm_intel libcrc32c libiscsi libiscsi_tcp linear lrw
> >> >> >> multipath parport parport_pc ppdev psmouse raid0 raid1 raid10 raid456
> >> >> >> raid6_pq rdma_cm scsi_transport_iscsi sctp serio_raw usbhid xor
> >> >> >
> >> >> >>
> >> >> >> #include <cstdint>
> >> >> >> #include <cstring>
> >> >> >> #include <ctime>
> >> >> >> #include <iomanip>
> >> >> >> #include <iostream>
> >> >> >> #include <string>
> >> >> >>
> >> >> >> #include <errno.h>
> >> >> >> #include <unistd.h>
> >> >> >> #include <arpa/inet.h>
> >> >> >> #include <net/if.h>
> >> >> >> #include <netinet/in.h>
> >> >> >> #include <netinet/sctp.h>
> >> >> >> #include <sys/ioctl.h>
> >> >> >> #include <sys/socket.h>
> >> >> >>
> >> >> >> using namespace std;
> >> >> >>
> >> >> >> static const int ERROR_BUFLEN = 64;
> >> >> >> static const char* SCTP_INTERFACE_NAME = "ens4";
> >> >> >>
> >> >> >> static string data100 = "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789";
> >> >> >> static string data1000 = "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789"
> >> >> >>   "01234567890123456789012345678901234567890123456789";
> >> >> >>
> >> >> >> void printError(const string& msg, const string& funcName) {
> >> >> >>   char errorMessage[ERROR_BUFLEN] {};
> >> >> >>   char* errMsg = ::strerror_r(errno, errorMessage,
> >> >> >>                             sizeof(errorMessage));
> >> >> >>
> >> >> >>   cerr << "::" << funcName << ": " << msg << ": " << errMsg << endl;
> >> >> >> }
> >> >> >>
> >> >> >> int createSocket() {
> >> >> >>   int sockFd = socket (AF_INET,
> >> >> >>                      SOCK_SEQPACKET,
> >> >> >>                      IPPROTO_SCTP);
> >> >> >>   if (sockFd == -1) {
> >> >> >>     printError("Creation of socket failed", __FUNCTION__);
> >> >> >>     return -1;
> >> >> >>   }
> >> >> >>
> >> >> >>   // Enable address reuse
> >> >> >>   int enable = 1;
> >> >> >>   int err = setsockopt(sockFd,
> >> >> >>                      SOL_SOCKET,
> >> >> >>                      SO_REUSEADDR,
> >> >> >>                      &enable,
> >> >> >>                      sizeof(enable));
> >> >> >>
> >> >> >>   if (err) {
> >> >> >>     printError("Error setting socket option SO_REUSEADDR", __FUNCTION__);
> >> >> >>     close(sockFd);
> >> >> >>     return -1;
> >> >> >>   }
> >> >> >>
> >> >> >>   // Configure SCTP
> >> >> >>   sctp_initmsg initmsg{};
> >> >> >>   initmsg.sinit_num_ostreams = 3;
> >> >> >>   initmsg.sinit_max_instreams = 3;
> >> >> >>   initmsg.sinit_max_attempts = 2;
> >> >> >>   initmsg.sinit_max_init_timeo = 0;
> >> >> >>
> >> >> >>   err = setsockopt(sockFd,
> >> >> >>                  IPPROTO_SCTP,
> >> >> >>                  SCTP_INITMSG,
> >> >> >>                  &initmsg,
> >> >> >>                  sizeof(initmsg));
> >> >> >>
> >> >> >>   if (err) {
> >> >> >>     printError("Configuring SCTP socket failed", __FUNCTION__);
> >> >> >>     close(sockFd);
> >> >> >>     return -1;
> >> >> >>   }
> >> >> >>
> >> >> >>   struct sctp_paddrparams paddr_params{};
> >> >> >>   memset(&paddr_params, 0, sizeof(paddr_params));
> >> >> >>   socklen_t size_of_sctp_paddr_params = sizeof(paddr_params);
> >> >> >>   paddr_params.spp_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE;
> >> >> >>
> >> >> >>   err = setsockopt(sockFd,
> >> >> >>                  IPPROTO_SCTP,
> >> >> >>                  SCTP_PEER_ADDR_PARAMS,
> >> >> >>                  &paddr_params,
> >> >> >>                  size_of_sctp_paddr_params);
> >> >> >>
> >> >> >>   if (err) {
> >> >> >>     printError("Configuring SCTP params failed", __FUNCTION__);
> >> >> >>     close(sockFd);
> >> >> >>     return -1;
> >> >> >>   }
> >> >> >>
> >> >> >>   return sockFd;
> >> >> >> }
> >> >> >>
> >> >> >> bool bindSocket(const int sockFd, const int localPort) {
> >> >> >>   // Get IP of ethernet interface
> >> >> >>   string localAddress = "";
> >> >> >>   ifreq ifr{};
> >> >> >>   ifr.ifr_addr.sa_family = AF_INET;
> >> >> >>   strncpy(ifr.ifr_name, SCTP_INTERFACE_NAME, IFNAMSIZ - 1);
> >> >> >>   const int ioctlStatus = ioctl(sockFd,
> >> >> >>                               SIOCGIFADDR,
> >> >> >>                               &ifr);
> >> >> >>
> >> >> >>   if (ioctlStatus == -1) {
> >> >> >>     printError("Failed to get local address", __FUNCTION__);
> >> >> >>     return false;
> >> >> >>   }
> >> >> >>
> >> >> >>   char ipAddrBuffer[INET_ADDRSTRLEN] {};
> >> >> >>   inet_ntop(AF_INET,
> >> >> >>           &reinterpret_cast<sockaddr_in*>(&(ifr.ifr_addr))->sin_addr,
> >> >> >>           ipAddrBuffer,
> >> >> >>           sizeof(ipAddrBuffer));
> >> >> >>
> >> >> >>   localAddress.assign(ipAddrBuffer);
> >> >> >>
> >> >> >>   // Bind to found ip address
> >> >> >>   sockaddr_in serv_addr{};
> >> >> >>   serv_addr.sin_family = AF_INET;
> >> >> >>   inet_pton(AF_INET,
> >> >> >>           localAddress.c_str(),
> >> >> >>           &serv_addr.sin_addr);
> >> >> >>   serv_addr.sin_port = htons(localPort);
> >> >> >>
> >> >> >>   if (bind(sockFd,
> >> >> >>          reinterpret_cast<sockaddr*>(&serv_addr),
> >> >> >>          sizeof(serv_addr))) {
> >> >> >>     printError("Failed to bind socket to local address", __FUNCTION__);
> >> >> >>     localAddress.clear();
> >> >> >>     close(sockFd);
> >> >> >>     return false;
> >> >> >>   }
> >> >> >>
> >> >> >>   cout << "Local endpoint successfully bound to local address: " << localAddress << endl;
> >> >> >>
> >> >> >>   return true;
> >> >> >> }
> >> >> >>
> >> >> >> bool openAssociation(const int sockFd,
> >> >> >>                    const string &remoteAddress,
> >> >> >>                    std::uint16_t remotePort) {
> >> >> >>
> >> >> >>   sockaddr_in address{};
> >> >> >>   address.sin_family = AF_INET;
> >> >> >>   inet_pton(AF_INET, remoteAddress.c_str(), &address.sin_addr);
> >> >> >>   address.sin_port = htons(remotePort);
> >> >> >>
> >> >> >>   int connectError = connect(sockFd,
> >> >> >>                            reinterpret_cast<sockaddr *>(&address),
> >> >> >>                            sizeof(address));
> >> >> >>   if (connectError) {
> >> >> >>     printError("Error connecting association", __FUNCTION__);
> >> >> >>     return false;
> >> >> >>   }
> >> >> >>
> >> >> >>   cout << "Association connected to address: " << remoteAddress << ":" << remotePort << endl;
> >> >> >>   return true;
> >> >> >> }
> >> >> >>
> >> >> >> void sendReq(const int sockFd,
> >> >> >>            const string& remoteAddress,
> >> >> >>            const uint16_t remotePort,
> >> >> >>            const std::string& data)
> >> >> >> {
> >> >> >>
> >> >> >>   struct sockaddr_in remoteAddr {};
> >> >> >>   remoteAddr.sin_family = AF_INET;
> >> >> >>   remoteAddr.sin_port = htons(remotePort);
> >> >> >>
> >> >> >>   uint32_t payloadProtId = 7;
> >> >> >>   uint16_t streamId = 0;
> >> >> >>   uint32_t dataLength = data.size();
> >> >> >>   sockaddr* servaddr = reinterpret_cast<sockaddr*>(&remoteAddr);
> >> >> >>   inet_pton(AF_INET, remoteAddress.c_str(), &remoteAddr.sin_addr);
> >> >> >>
> >> >> >>   const std::string ipaddr =
> >> >> >>     inet_ntoa(reinterpret_cast<sockaddr_in*>(servaddr)->sin_addr);
> >> >> >>
> >> >> >>   cout << "Sending SCTP req to " << remoteAddress << ":" << remotePort;
> >> >> >>   cout << ", len=" << dataLength << endl;
> >> >> >>
> >> >> >>   const int bytesSent = sctp_sendmsg(sockFd,
> >> >> >>                                    data.c_str(),
> >> >> >>                                    (size_t)dataLength,
> >> >> >>                                    servaddr,
> >> >> >>                                    sizeof(sockaddr_in),
> >> >> >>                                    htonl(payloadProtId),
> >> >> >>                                    SCTP_ADDR_OVER,
> >> >> >>                                    streamId,
> >> >> >>                                    200,
> >> >> >>                                    0);
> >> >> >>
> >> >> >>   if (bytesSent == -1) {
> >> >> >>     printError("SCTP send failed", __FUNCTION__);
> >> >> >>   }
> >> >> >>
> >> >> >>   return;
> >> >> >> }
> >> >> >>
> >> >> >> sctp_assoc_t getSocketAssociationId(const int sockFd,
> >> >> >>                                   const string &remoteIpAddress,
> >> >> >>                                   std::uint16_t remotePort)
> >> >> >>
> >> >> >> {
> >> >> >>   sockaddr_in socket_address_in{};
> >> >> >>
> >> >> >>   socket_address_in.sin_family = AF_INET;
> >> >> >>   socket_address_in.sin_port = htons(remotePort);
> >> >> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> >> >> >>
> >> >> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
> >> >> >>   socklen_t salen = sizeof(socket_address_in);
> >> >> >>
> >> >> >>   struct sctp_paddrinfo peer_address_info{};
> >> >> >>   socklen_t size_of_sctp_paddrinfo = sizeof peer_address_info;
> >> >> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> >> >> >>
> >> >> >>   const int sctpOptInfoError = sctp_opt_info(sockFd,
> >> >> >>                                            0,
> >> >> >>                                            SCTP_GET_PEER_ADDR_INFO,
> >> >> >>                                            &peer_address_info,
> >> >> >>                                            &size_of_sctp_paddrinfo);
> >> >> >>
> >> >> >>   if (sctpOptInfoError) {
> >> >> >>     printError("Failed to get association id", __FUNCTION__);
> >> >> >>   }
> >> >> >>
> >> >> >>   return peer_address_info.spinfo_assoc_id;
> >> >> >> }
> >> >> >>
> >> >> >> std::uint32_t getAssociationPathMtu(const int sockFd,
> >> >> >>                                   const string &remoteIpAddress,
> >> >> >>                                   const std::uint16_t remotePort) {
> >> >> >>   sockaddr_in socket_address_in{};
> >> >> >>
> >> >> >>   socket_address_in.sin_family = AF_INET;
> >> >> >>   socket_address_in.sin_port = htons(remotePort);
> >> >> >>   inet_pton(AF_INET, remoteIpAddress.c_str(), &socket_address_in.sin_addr);
> >> >> >>
> >> >> >>   struct sockaddr *socket_address = reinterpret_cast<sockaddr*>(&socket_address_in);
> >> >> >>   socklen_t salen = sizeof(socket_address_in);
> >> >> >>
> >> >> >>   struct sctp_paddrinfo peer_address_info{};
> >> >> >>   socklen_t size_of_sctp_paddrinfo = sizeof(peer_address_info);
> >> >> >>   std::memcpy(&peer_address_info.spinfo_address, socket_address, salen);
> >> >> >>
> >> >> >>   sctp_assoc_t sctpAssociationId = getSocketAssociationId(sockFd, remoteIpAddress, remotePort);
> >> >> >>
> >> >> >>   const int sctpOptInfoError = sctp_opt_info(sockFd, sctpAssociationId,
> >> >> >>                                            SCTP_GET_PEER_ADDR_INFO,
> >> >> >>                                            &peer_address_info, &size_of_sctp_paddrinfo);
> >> >> >>   if (sctpOptInfoError) {
> >> >> >>     printError("Failed to get pmtu", __FUNCTION__);
> >> >> >>   }
> >> >> >>
> >> >> >>   auto t = std::time(nullptr);
> >> >> >>   auto tm = *std::localtime(&t);
> >> >> >>   std::cout << std::put_time(&tm, "%H:%M:%S ") << remoteIpAddress << ":" << remotePort;
> >> >> >>   cout << " currently has a PMTU of " << peer_address_info.spinfo_mtu << endl;
> >> >> >>
> >> >> >>   return peer_address_info.spinfo_mtu;
> >> >> >> }
> >> >> >>
> >> >> >> void test1(const string& data) {
> >> >> >>   int localPort = 2944;
> >> >> >>   string remoteIp1 = "10.0.0.3";
> >> >> >>   uint16_t remotePort1 = 8001;
> >> >> >>   uint16_t remotePort2 = 8002;
> >> >> >>
> >> >> >>   int sockFd = createSocket();
> >> >> >>   bindSocket(sockFd, localPort);
> >> >> >>
> >> >> >>   cout << "### Test 1: 2 assocs" << endl;
> >> >> >>
> >> >> >>   openAssociation(sockFd, remoteIp1, remotePort1);
> >> >> >>   openAssociation(sockFd, remoteIp1, remotePort2);
> >> >> >>
> >> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >> >>
> >> >> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
> >> >> >>   for (int i = 0; i < 10; i++) {
> >> >> >>     sleep(10);
> >> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >> >>   }
> >> >> >> }
> >> >> >>
> >> >> >> void test2(const string& data) {
> >> >> >>   int localPort = 2944;
> >> >> >>   string remoteIp1 = "10.0.0.3";
> >> >> >>   uint16_t remotePort1 = 8001;
> >> >> >>   uint16_t remotePort2 = 8002;
> >> >> >>   string remoteIpFake = "10.52.96.204";
> >> >> >>   uint16_t remotePortFake = 3239;
> >> >> >>
> >> >> >>   int sockFd = createSocket();
> >> >> >>   bindSocket(sockFd, localPort);
> >> >> >>
> >> >> >>   cout << "### Test 2: 2 assocs + 1 unreachable assoc" << endl;
> >> >> >>
> >> >> >>   openAssociation(sockFd, remoteIp1, remotePort1);
> >> >> >>   openAssociation(sockFd, remoteIp1, remotePort2);
> >> >> >>   openAssociation(sockFd, remoteIpFake, remotePortFake);
> >> >> >>
> >> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >> >>   getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >> >>
> >> >> >>   sendReq(sockFd, remoteIp1, remotePort1, data);
> >> >> >>   for (int i = 0; i < 10; i++) {
> >> >> >>     sleep(10);
> >> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort1);
> >> >> >>     getAssociationPathMtu(sockFd, remoteIp1, remotePort2);
> >> >> >>   }
> >> >> >> }
> >> >> >>
> >> >> >>
> >> >> >> int main(int argc, char** argv) {
> >> >> >>   string testNr = "1";
> >> >> >>   string& testData = data1000;
> >> >> >>   if (argc >= 2) {
> >> >> >>     testNr = argv[1];
> >> >> >>   }
> >> >> >>   if (argc >= 3) {
> >> >> >>     testData = data100;
> >> >> >>   }
> >> >> >>
> >> >> >>   if (testNr == "1") {
> >> >> >>     test1(testData);
> >> >> >>   } else {
> >> >> >>     test2(testData);
> >> >> >>   }
> >> >> >>
> >> >> >>   return 0;
> >> >> >> }
> >> >> >
> >> >
> >> >
> >
> >




end of thread, other threads:[~2017-09-22 20:06 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
2017-09-11 12:44 PMTU discovery behaviour Peter Salin
2017-09-19 12:14 ` Peter Salin
2017-09-19 17:09 ` Neil Horman
2017-09-20 11:02 ` Peter Salin
2017-09-21 11:01 ` Neil Horman
2017-09-21 12:41 ` Peter Salin
2017-09-21 15:24 ` Neil Horman
2017-09-22  9:05 ` Peter Salin
2017-09-22 11:33 ` Neil Horman
2017-09-22 20:06 ` Neil Horman
