* The new sysctl and socket option added for PLPMTUD (RFC8899)
@ 2021-06-11 20:20 Xin Long
  2021-06-11 20:42 ` tuexen
  0 siblings, 1 reply; 12+ messages in thread
From: Xin Long @ 2021-06-11 20:20 UTC
  To: linux-sctp @ vger . kernel . org, Michael Tuexen,
	Marcelo Ricardo Leitner

Hi, Michael,

In the Linux implementation of RFC 8899, we decided to introduce one
sysctl and one socket option for users to set up the PLPMTUD probe:

1. sysctl -w net.sctp.plpmtud_probe_interval=1

plpmtud_probe_interval - INTEGER
        The interval (in milliseconds) between PLPMTUD probe chunks. These
        chunks are sent at the specified interval with a variable size to
        probe the mtu of a given path between 2 associations. PLPMTUD will
        be disabled when 0 is set.

        Default: 0

2. a socket option that can be used per socket, assoc or transport

/* PLPMTUD Probe Interval socket option */
struct sctp_probeinterval {
        sctp_assoc_t spi_assoc_id;
        struct sockaddr_storage spi_address;
        __u32 spi_interval;
};

#define SCTP_PLPMTUD_PROBE_INTERVAL    133


The value above enables or disables the PLPMTUD probe by setting the probe
interval for the timer. When it is 0, the timer is stopped and PLPMTUD is
disabled. This way, we don't need to introduce more options.
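
As a rough illustration of how an application might use the option, here is a
minimal sketch. It carries local copies of the struct and option number quoted
above so that it builds without a recent <linux/sctp.h>. The sctp_assoc_t
typedef, the use of uint32_t in place of __u32, the helper names, and the
reading that a zeroed spi_address means "no specific transport" are all
assumptions of this sketch, not something stated in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Local copies of the definitions quoted above (assumed layout). */
typedef int32_t sctp_assoc_t;

struct sctp_probeinterval {
        sctp_assoc_t spi_assoc_id;
        struct sockaddr_storage spi_address;
        uint32_t spi_interval;          /* __u32 in the kernel header */
};

#define SCTP_PLPMTUD_PROBE_INTERVAL    133

/* Hypothetical helper: fill the option for a socket or association.
 * spi_address stays zeroed (AF_UNSPEC), assumed to select no single
 * transport. */
void probeinterval_init(struct sctp_probeinterval *opt,
                        sctp_assoc_t assoc_id, uint32_t interval_ms)
{
        assert(opt != NULL);
        memset(opt, 0, sizeof(*opt));
        opt->spi_assoc_id = assoc_id;
        opt->spi_interval = interval_ms;
}

/* Hypothetical helper: returns 0 on success, -1 with errno set otherwise. */
int set_probe_interval(int fd, sctp_assoc_t assoc_id, uint32_t interval_ms)
{
        struct sctp_probeinterval opt;

        probeinterval_init(&opt, assoc_id, interval_ms);
        return setsockopt(fd, IPPROTO_SCTP, SCTP_PLPMTUD_PROBE_INTERVAL,
                          &opt, sizeof(opt));
}
```

A caller would pass an SCTP socket descriptor, for example
set_probe_interval(fd, 0, 0) to turn probing off again. On kernels without
this option the call is expected to fail with an error from setsockopt().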

We would like to stay consistent with BSD on this, so please check and
share your thoughts.

Thanks.


* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-06-11 20:20 The new sysctl and socket option added for PLPMTUD (RFC8899) Xin Long
@ 2021-06-11 20:42 ` tuexen
  2021-06-12 17:32   ` Xin Long
  0 siblings, 1 reply; 12+ messages in thread
From: tuexen @ 2021-06-11 20:42 UTC
  To: Xin Long; +Cc: linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
> 
> Hi, Michael,
> 
> In the linux implementation of RFC8899, we decided to introduce one
> sysctl and one socket option for users to set up the PLPMTUD probe:
> 
> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
> 
> plpmtud_probe_interval - INTEGER
>        The interval (in milliseconds) between PLPMTUD probe chunks. These
>        chunks are sent at the specified interval with a variable size to
>        probe the mtu of a given path between 2 associations. PLPMTUD will
I guess you mean "between 2 end points" instead of "between 2 associations".

I'm not sure what the interval means here.

Say you have the candidates 1400, 1420, 1460, 1480, and 1500.

Assume you send a probe packet for 1400. Aren't you sending the
probe packet for 1420 as soon as you get an ACK for the probe packet
of size 1400? Or are you waiting for plpmtud_probe_interval ms?
>        be disabled when 0 is set.
> 
>        Default: 0
> 
> 2. a socket option that can be used per socket, assoc or transport
> 
> /* PLPMTUD Probe Interval socket option */
> struct sctp_probeinterval {
>        sctp_assoc_t spi_assoc_id;
>        struct sockaddr_storage spi_address;
>        __u32 spi_interval;
> };
> 
> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
> 
> 
> The value above will enable/disable the PLPMTUD probe by setting up the probe
> interval for the timer. When it's 0, the timer will also stop and
> PLPMTUD is disabled.
> By this way, we don't need to introduce more options.
OK.
> 
> We're expecting to keep consistent with BSD on this, pls check and
> share your thoughts.
Looks good to me.

Best regards
Michael
> 
> Thanks.



* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-06-11 20:42 ` tuexen
@ 2021-06-12 17:32   ` Xin Long
  2021-06-12 21:28     ` tuexen
       [not found]     ` <FEF068AA-C660-4A25-ABFE-D559B1136B58@fh-muenster.de>
  0 siblings, 2 replies; 12+ messages in thread
From: Xin Long @ 2021-06-12 17:32 UTC
  To: Michael Tuexen; +Cc: linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner

On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
>
> > On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
> >
> > Hi, Michael,
> >
> > In the linux implementation of RFC8899, we decided to introduce one
> > sysctl and one socket option for users to set up the PLPMTUD probe:
> >
> > 1. sysctl -w net.sctp.plpmtud_probe_interval=1
> >
> > plpmtud_probe_interval - INTEGER
> >        The interval (in milliseconds) between PLPMTUD probe chunks. These
> >        chunks are sent at the specified interval with a variable size to
> >        probe the mtu of a given path between 2 associations. PLPMTUD will
> I guess you mean "between 2 end points" instead of "between 2 associations".
>
> I'm not sure what it means:
>
> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
>
> Assume you sent a probe packet for 1400. Aren't you sending the
> probe packet for 1420 as soon as you get an ACK for the probe packet
> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
It waits for plpmtud_probe_interval ms in the Searching state, but in the
Search Complete state the wait is plpmtud_probe_interval * 30 ms.

The step we use is 32; when a probe fails, we reduce the step to 4. For example:
1400, 1432, 1464, 1496, 1528 (failed), 1500 (1496 + 4), 1504 (failed, so
1500 is the PMTU).
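
To make the two-step search concrete, here is a minimal sketch, not the
kernel code. It assumes a probe "fails" exactly when it is larger than the
path MTU and ignores retries and timers; probe_fits() and search_pmtu() are
hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIG_STEP   32
#define SMALL_STEP 4

/* Stand-in for "send a probe and wait for the ACK": the probe is
 * acknowledged only if it fits the (simulated) path MTU. */
bool probe_fits(uint32_t size, uint32_t path_mtu)
{
        return size <= path_mtu;
}

/* Grow from a working base size in steps of 32; after the first failure
 * continue in steps of 4; the second failure ends the search. */
uint32_t search_pmtu(uint32_t base, uint32_t path_mtu)
{
        uint32_t pmtu = base;           /* largest size confirmed so far */
        uint32_t step = BIG_STEP;

        assert(probe_fits(base, path_mtu));     /* base is assumed to work */

        for (;;) {
                uint32_t candidate = pmtu + step;

                if (probe_fits(candidate, path_mtu))
                        pmtu = candidate;       /* confirmed, keep growing */
                else if (step == BIG_STEP)
                        step = SMALL_STEP;      /* refine from the last good size */
                else
                        return pmtu;            /* small step failed too: done */
        }
}
```

With base 1400 and a path MTU of 1500 this walks through 1432, 1464, 1496,
1528 (fails), 1500, 1504 (fails) and settles on 1500, matching the example
above.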

Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work, because
plpmtud_probe_interval is the probe interval time for the timer.
Apart from 0, the minimum value is 5000 ms.
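
The range rule can be sketched as a small check. This is only an
illustration under the assumption that invalid values are rejected with
-EINVAL; PROBE_INTERVAL_MIN_MS and both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the 5000 ms minimum mentioned above. */
#define PROBE_INTERVAL_MIN_MS 5000u

/* 0 disables PLPMTUD; any other value must reach the minimum. */
bool probe_interval_valid(uint32_t interval_ms)
{
        return interval_ms == 0 || interval_ms >= PROBE_INTERVAL_MIN_MS;
}

/* Store a new interval, rejecting out-of-range values with -EINVAL
 * and leaving the old value untouched. */
int probe_interval_store(uint32_t *slot, uint32_t interval_ms)
{
        assert(slot != NULL);
        if (!probe_interval_valid(interval_ms))
                return -EINVAL;
        *slot = interval_ms;
        return 0;
}
```

Under this rule a value of 1 is refused, while 0 and 5000 are both accepted.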

So it should be:

plpmtud_probe_interval - INTEGER
        The time interval (in milliseconds) for sending PLPMTUD probe chunks.
        These chunks are sent at the specified interval with a variable size
        to probe the mtu of a given path between 2 endpoints. PLPMTUD will
        be disabled when 0 is set.

        Default: 0

Thanks.
> >        be disabled when 0 is set.
> >
> >        Default: 0
> >
> > 2. a socket option that can be used per socket, assoc or transport
> >
> > /* PLPMTUD Probe Interval socket option */
> > struct sctp_probeinterval {
> >        sctp_assoc_t spi_assoc_id;
> >        struct sockaddr_storage spi_address;
> >        __u32 spi_interval;
> > };
> >
> > #define SCTP_PLPMTUD_PROBE_INTERVAL    133
> >
> >
> > > The value above will enable/disable the PLPMTUD probe by setting up the probe
> > > interval for the timer. When it's 0, the timer will also stop and
> > > PLPMTUD is disabled.
> > By this way, we don't need to introduce more options.
> OK.
> >
> > We're expecting to keep consistent with BSD on this, pls check and
> > share your thoughts.
> Looks good to me.
>
> Best regards
> Michael
> >
> > Thanks.
>


* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-06-12 17:32   ` Xin Long
@ 2021-06-12 21:28     ` tuexen
       [not found]     ` <FEF068AA-C660-4A25-ABFE-D559B1136B58@fh-muenster.de>
  1 sibling, 0 replies; 12+ messages in thread
From: tuexen @ 2021-06-12 21:28 UTC
  To: Xin Long; +Cc: linux-sctp @ vger . kernel . org, Marcelo Ricardo Leitner



> On 12. Jun 2021, at 19:32, Xin Long <lucien.xin@gmail.com> wrote:
> 
> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
>> 
>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
>>> 
>>> Hi, Michael,
>>> 
>>> In the linux implementation of RFC8899, we decided to introduce one
>>> sysctl and one socket option for users to set up the PLPMTUD probe:
>>> 
>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
>>> 
>>> plpmtud_probe_interval - INTEGER
>>>       The interval (in milliseconds) between PLPMTUD probe chunks. These
>>>       chunks are sent at the specified interval with a variable size to
>>>       probe the mtu of a given path between 2 associations. PLPMTUD will
>> I guess you mean "between 2 end points" instead of "between 2 associations".
>> 
>> I'm not sure what it means:
>> 
>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
>> 
>> Assume you sent a probe packet for 1400. Aren't you sending the
>> probe packet for 1420 as soon as you get an ACK for the probe packet
>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
> It will wait for "plpmtud_probe_interval" ms in searching state, but in
> searching complete it will be "plpmtud_probe_interval * 30" ms.
> 
> The step we are using is 32, when it fails, we turn the step to 4. For example:
> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
> 1500 is the PMTU).
> 
> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
> As plpmtud_probe_interval is the probe interval TIME for the timer.
> Apart from 0, the minimal value is 5000ms.
> 
> So it should be:
> 
> plpmtud_probe_interval - INTEGER
>        The time interval (in milliseconds) for sending PLPMTUD probe chunks.
>        These chunks are sent at the specified interval with a variable size
>        to probe the mtu of a given path between 2 endpoints. PLPMTUD will
>        be disabled when 0 is set.
> 
>        Default: 0
> 
> Thanks.
OK. Thanks for the clarification.

Best regards
Michael
>>>       be disabled when 0 is set.
>>> 
>>>       Default: 0
>>> 
>>> 2. a socket option that can be used per socket, assoc or transport
>>> 
>>> /* PLPMTUD Probe Interval socket option */
>>> struct sctp_probeinterval {
>>>       sctp_assoc_t spi_assoc_id;
>>>       struct sockaddr_storage spi_address;
>>>       __u32 spi_interval;
>>> };
>>> 
>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
>>> 
>>> 
>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
>>> interval for the timer. When it's 0, the timer will also stop and
>>> PLPMTUD is disabled.
>>> By this way, we don't need to introduce more options.
>> OK.
>>> 
>>> We're expecting to keep consistent with BSD on this, pls check and
>>> share your thoughts.
>> Looks good to me.
>> 
>> Best regards
>> Michael
>>> 
>>> Thanks.
>> 



* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
       [not found]     ` <FEF068AA-C660-4A25-ABFE-D559B1136B58@fh-muenster.de>
@ 2021-07-06  9:12       ` Timo Völker
  2021-07-06 16:01         ` Xin Long
  0 siblings, 1 reply; 12+ messages in thread
From: Timo Völker @ 2021-07-06  9:12 UTC
  To: Xin Long
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen



Hi Xin,

I implemented RFC8899 for an SCTP simulation model.

Comments follow inline.

> Begin forwarded message:
> 
> From: Xin Long <lucien.xin@gmail.com>
> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
> Date: 12. June 2021 at 19:32:02 CEST
> To: Michael Tuexen <tuexen@freebsd.org>
> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> 
> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
>> 
>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
>>> 
>>> Hi, Michael,
>>> 
>>> In the linux implementation of RFC8899, we decided to introduce one
>>> sysctl and one socket option for users to set up the PLPMTUD probe:
>>> 
>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
>>> 
>>> plpmtud_probe_interval - INTEGER
>>>       The interval (in milliseconds) between PLPMTUD probe chunks. These
>>>       chunks are sent at the specified interval with a variable size to
>>>       probe the mtu of a given path between 2 associations. PLPMTUD will
>> I guess you mean "between 2 end points" instead of "between 2 associations".
>> 
>> I'm not sure what it means:
>> 
>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
>> 
>> Assume you sent a probe packet for 1400. Aren't you sending the
>> probe packet for 1420 as soon as you get an ACK for the probe packet
>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
> It will wait for "plpmtud_probe_interval" ms in searching state, but in
> searching complete it will be "plpmtud_probe_interval * 30" ms.

Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?

In my implementation, I start with the next probe immediately when receiving an ack or PTB.

> 
> The step we are using is 32, when it fails, we turn the step to 4. For example:
> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
> 1500 is the PMTU).

What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?

If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.

> 
> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
> As plpmtud_probe_interval is the probe interval TIME for the timer.
> Apart from 0, the minimal value is 5000ms.
> 
> So it should be:
> 
> plpmtud_probe_interval - INTEGER
>        The time interval (in milliseconds) for sending PLPMTUD probe chunks.
>        These chunks are sent at the specified interval with a variable size
>        to probe the mtu of a given path between 2 endpoints. PLPMTUD will
>        be disabled when 0 is set.
> 
>        Default: 0

What do you mean by probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?

RFC 8899 says:
The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.

So, how about plpmtud_probe_max_ack_time?

Also, I think more parameters would be helpful. For example,

plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
plpmtud_max_probes - controls the number of probe packets sent for one candidate.
plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.

Timo

> 
> Thanks.
>>>       be disabled when 0 is set.
>>> 
>>>       Default: 0
>>> 
>>> 2. a socket option that can be used per socket, assoc or transport
>>> 
>>> /* PLPMTUD Probe Interval socket option */
>>> struct sctp_probeinterval {
>>>       sctp_assoc_t spi_assoc_id;
>>>       struct sockaddr_storage spi_address;
>>>       __u32 spi_interval;
>>> };
>>> 
>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
>>> 
>>> 
>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
>>> interval for the timer. When it's 0, the timer will also stop and
>>> PLPMTUD is disabled.
>>> By this way, we don't need to introduce more options.
>> OK.
>>> 
>>> We're expecting to keep consistent with BSD on this, pls check and
>>> share your thoughts.
>> Looks good to me.
>> 
>> Best regards
>> Michael
>>> 
>>> Thanks.
>> 





* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-07-06  9:12       ` Timo Völker
@ 2021-07-06 16:01         ` Xin Long
  2021-07-07 12:36           ` Timo Völker
  0 siblings, 1 reply; 12+ messages in thread
From: Xin Long @ 2021-07-06 16:01 UTC
  To: Timo Völker
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen

On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>
>
> Hi Xin,
>
> I implemented RFC8899 for an SCTP simulation model.
Great, may I ask which one that is?

>
> Comments follow inline.
>
> > Begin forwarded message:
> >
> > From: Xin Long <lucien.xin@gmail.com>
> > Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
> > Date: 12. June 2021 at 19:32:02 CEST
> > To: Michael Tuexen <tuexen@freebsd.org>
> > Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >
> > On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
> >>
> >>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
> >>>
> >>> Hi, Michael,
> >>>
> >>> In the linux implementation of RFC8899, we decided to introduce one
> >>> sysctl and one socket option for users to set up the PLPMTUD probe:
> >>>
> >>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
> >>>
> >>> plpmtud_probe_interval - INTEGER
> >>>       The interval (in milliseconds) between PLPMTUD probe chunks. These
> >>>       chunks are sent at the specified interval with a variable size to
> >>>       probe the mtu of a given path between 2 associations. PLPMTUD will
> >> I guess you mean "between 2 end points" instead of "between 2 associations".
> >>
> >> I'm not sure what it means:
> >>
> >> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
> >>
> >> Assume you sent a probe packet for 1400. Aren't you sending the
> >> probe packet for 1420 as soon as you get an ACK for the probe packet
> >> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
> > It will wait for "plpmtud_probe_interval" ms in searching state, but in
> > searching complete it will be "plpmtud_probe_interval * 30" ms.
>
> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
>
> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
Yes, we should do it immediately to make this more efficient, and I have
already fixed this in Linux for the ACK case.

For PTB, I currently only set probe_size to the PMTU from the ICMP packet
when pmtu > 'current pmtu' && pmtu < probe_size, and then wait for the next
probe_timer expiry. It is probably better to send the probe immediately
too, but I need to confirm.
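
The PTB condition described here can be sketched as follows. This is only
an illustration of the single case mentioned above; struct probe_state and
handle_ptb() are hypothetical names, and the other PTB cases covered by
RFC 8899 are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-path probing state. */
struct probe_state {
        uint32_t pmtu;          /* current, confirmed PMTU */
        uint32_t probe_size;    /* size of the probe in flight / next probe */
};

/* Accept the MTU reported by an ICMP PTB only when it lies strictly
 * between the current PMTU and the probed size. Returns true if
 * probe_size was updated; the next probe is then left to the timer. */
bool handle_ptb(struct probe_state *st, uint32_t ptb_mtu)
{
        assert(st != NULL);
        if (ptb_mtu > st->pmtu && ptb_mtu < st->probe_size) {
                st->probe_size = ptb_mtu;
                return true;
        }
        return false;
}
```

For example, with pmtu 1400 and probe_size 1528, a PTB reporting 1500 lowers
probe_size to 1500, while reports of 1300 or 1600 leave it unchanged.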

>
> >
> > The step we are using is 32, when it fails, we turn the step to 4. For example:
> > 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
> > 1500 is the PMTU).
>
> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
yes

>
> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
Sounds like a good way to go, and it would save 2 intervals in reaching the
optimal value in the normal case.
But if the failure is a false one (say the link is unstable), it may also
take some time to catch up to the bigger candidate.

>
> >
> > Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
> > As plpmtud_probe_interval is the probe interval TIME for the timer.
> > Apart from 0, the minimal value is 5000ms.
> >
> > So it should be:
> >
> > plpmtud_probe_interval - INTEGER
> >        The time interval (in milliseconds) for sending PLPMTUD probe chunks.
> >        These chunks are sent at the specified interval with a variable size
> >        to probe the mtu of a given path between 2 endpoints. PLPMTUD will
> >        be disabled when 0 is set.
> >
> >        Default: 0
>
> What do you mean with probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
yes.

>
> RFC8899 contains:
> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
>
> So, how about plpmtud_probe_max_ack_time?
I took the name "plpmtud_probe_interval" from TCP's PLPMTUD sysctl in
Linux. I was hoping to keep this consistent in sysctl and sockopt
between Linux and BSD. Note that this parameter is also the interval for
sending a probe for the current PMTU in the Search Complete state.

>
> Also, I think more parameters would be helpful. For example,
>
> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
With these, the control would certainly be more fine-grained.
But I didn't want to introduce too many parameters for this feature;
as you know, these parameters could also be per socket/assoc/transport,
each needing set/get via sockopt.

Instead, we keep most of them fixed:

plpmtud_use_ptb = 1
plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
plpmtud_max_probes = 3
plpmtud_enable = !! plpmtud_probe_interval

Only one is variable:
plpmtud_probe_interval >= 5000ms
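
Put as code, the fixed values above all follow from the single tunable. A
minimal sketch, where struct plpmtud_params and plpmtud_derive() are
hypothetical names used only to show the mapping:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bundle of the parameters discussed above. */
struct plpmtud_params {
        uint32_t probe_interval_ms;     /* the only user-visible knob */
        bool     enable;
        bool     use_ptb;
        uint32_t max_probes;
        uint64_t raise_time_ms;
};

/* Derive every other parameter from the probe interval. */
struct plpmtud_params plpmtud_derive(uint32_t probe_interval_ms)
{
        struct plpmtud_params p;

        /* 0 (disabled) or at least 5000 ms. */
        assert(probe_interval_ms == 0 || probe_interval_ms >= 5000);

        p.probe_interval_ms = probe_interval_ms;
        p.enable = probe_interval_ms != 0;      /* !!plpmtud_probe_interval */
        p.use_ptb = true;                       /* always process ICMP PTB */
        p.max_probes = 3;
        p.raise_time_ms = 30ULL * probe_interval_ms;
        return p;
}
```

With the minimum interval of 5000 ms this gives a raise time of 150000 ms,
and an interval of 0 leaves PLPMTUD disabled.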

So I think this is up to the implementation; if you want more things
to tune, you can go ahead and expose all of these parameters to users.

>
> Timo
>
> >
> > Thanks.
> >>>       be disabled when 0 is set.
> >>>
> >>>       Default: 0
> >>>
> >>> 2. a socket option that can be used per socket, assoc or transport
> >>>
> >>> /* PLPMTUD Probe Interval socket option */
> >>> struct sctp_probeinterval {
> >>>       sctp_assoc_t spi_assoc_id;
> >>>       struct sockaddr_storage spi_address;
> >>>       __u32 spi_interval;
> >>> };
> >>>
> >>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
> >>>
> >>>
> >>> The value above will enable/disable the PLPMTUD probe by setting up the probe
> >>> interval for the timer. When it's 0, the timer will also stop and
> >>> PLPMTUD is disabled.
> >>> By this way, we don't need to introduce more options.
> >> OK.
> >>>
> >>> We're expecting to keep consistent with BSD on this, pls check and
> >>> share your thoughts.
> >> Looks good to me.
> >>
> >> Best regards
> >> Michael
> >>>
> >>> Thanks.
> >>
>
>


* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-07-06 16:01         ` Xin Long
@ 2021-07-07 12:36           ` Timo Völker
  2021-07-07 16:30             ` Xin Long
  0 siblings, 1 reply; 12+ messages in thread
From: Timo Völker @ 2021-07-07 12:36 UTC
  To: Xin Long
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen


> On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@gmail.com> wrote:
> 
> On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>> 
>> 
>> Hi Xin,
>> 
>> I implemented RFC8899 for an SCTP simulation model.
> great, can I know what that one is?

I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.

> 
>> 
>> Comments follow inline.
>> 
>>> Begin forwarded message:
>>> 
>>> From: Xin Long <lucien.xin@gmail.com>
>>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
>>> Date: 12. June 2021 at 19:32:02 CEST
>>> To: Michael Tuexen <tuexen@freebsd.org>
>>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>>> 
>>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
>>>> 
>>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
>>>>> 
>>>>> Hi, Michael,
>>>>> 
>>>>> In the linux implementation of RFC8899, we decided to introduce one
>>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
>>>>> 
>>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
>>>>> 
>>>>> plpmtud_probe_interval - INTEGER
>>>>>      The interval (in milliseconds) between PLPMTUD probe chunks. These
>>>>>      chunks are sent at the specified interval with a variable size to
>>>>>      probe the mtu of a given path between 2 associations. PLPMTUD will
>>>> I guess you mean "between 2 end points" instead of "between 2 associations".
>>>> 
>>>> I'm not sure what it means:
>>>> 
>>>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
>>>> 
>>>> Assume you sent a probe packet for 1400. Aren't you sending the
>>>> probe packet for 1420 as soon as you get an ACK for the probe packet
>>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
>>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
>>> searching complete it will be "plpmtud_probe_interval * 30" ms.
>> 
>> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
>> 
>> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
> yeah, we should do it immediately to make this more efficient, and I
> already fixed it in linux for ACK.
> 
> For PTB, I currently only set probe_size as the pmtu from ICMP packet
> when pmtu > 'current pmtu' && pmtu < probe_size, and wait until next
> probe_timer. But probably better to send it immediately too, I need to
> confirm.

I think so. At least I don't know what to wait for.

> 
>> 
>>> 
>>> The step we are using is 32, when it fails, we turn the step to 4. For example:
>>> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
>>> 1500 is the PMTU).
>> 
>> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
> yes
> 
>> 
>> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
> Sounds a good way to go, and it would save 2 intervals to get the
> optimal value in the normal case.
> But if the failure is false (like the link is unstable), it may also
> take some time to catch up to the bigger candidate.

Right, it's a trade-off. Which is better depends on the probability of losing a probe packet for a reason other than its size.

I chose to do something like this, when searching for a PMTU of 1472:

1400 ack
1432 ack
1464 timeout (false negative)
1436 ack
1440 ack
1444 ack
1448 ack
1452 ack
1456 ack
1460 ack
1464 ack
1496 timeout
1468 ack
1472 ack
1476 timeout
1476 timeout
1476 timeout
done with PMTU=1472
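
One possible reading of the trace above, as a sketch rather than the actual
simulation code: a single timeout on a big step (32) switches to small
steps (4) from the last acknowledged size, an ACK at the size that timed
out earlier resumes big steps, and only MAX_PROBES consecutive timeouts on
a small step end the search. All names are hypothetical, and max_size is
just a safety bound so that the loop always terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIG_STEP   32
#define SMALL_STEP 4
#define MAX_PROBES 3

/* Stand-in for "send one probe, wait for ACK or timeout". */
typedef bool (*probe_fn)(uint32_t size, void *ctx);

struct search_result {
        uint32_t pmtu;          /* largest acknowledged size, 0 if none */
        unsigned probes_sent;
};

struct search_result search_early_fallback(uint32_t base, uint32_t max_size,
                                           probe_fn probe, void *ctx)
{
        struct search_result r = { 0, 0 };
        uint32_t size = base, step = BIG_STEP, ceiling = 0;
        unsigned timeouts = 0;

        assert(probe != NULL);
        for (;;) {
                r.probes_sent++;
                if (probe(size, ctx)) {
                        r.pmtu = size;
                        timeouts = 0;
                        if (step == SMALL_STEP && size == ceiling)
                                step = BIG_STEP; /* earlier timeout was a false negative */
                        size += step;
                        if (size > max_size)
                                return r;
                } else if (step == BIG_STEP) {
                        if (r.pmtu == 0)
                                return r;       /* even the base size failed */
                        ceiling = size;         /* remember where the big step failed */
                        step = SMALL_STEP;
                        size = r.pmtu + SMALL_STEP;
                } else if (++timeouts == MAX_PROBES) {
                        return r;               /* small step failed MAX_PROBES times */
                }
        }
}

/* Simulated path: drops the first probe of one given size (the false
 * negative), otherwise acknowledges every probe that fits the MTU. */
struct lossy_path {
        uint32_t mtu;
        uint32_t drop_once_size;
        bool dropped;
};

bool lossy_probe(uint32_t size, void *ctx)
{
        struct lossy_path *p = ctx;

        if (size == p->drop_once_size && !p->dropped) {
                p->dropped = true;
                return false;
        }
        return size <= p->mtu;
}
```

Running search_early_fallback(1400, 1600, lossy_probe, &path) with a path
MTU of 1472 and one dropped probe at 1464 reproduces the sequence listed
above: 17 probes in total, ending with a PMTU of 1472.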

> 
>> 
>>> 
>>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
>>> As plpmtud_probe_interval is the probe interval TIME for the timer.
>>> Apart from 0, the minimal value is 5000ms.
>>> 
>>> So it should be:
>>> 
>>> plpmtud_probe_interval - INTEGER
>>>       The time interval (in milliseconds) for sending PLPMTUD probe chunks.
>>>       These chunks are sent at the specified interval with a variable size
>>>       to probe the mtu of a given path between 2 endpoints. PLPMTUD will
>>>       be disabled when 0 is set.
>>> 
>>>       Default: 0
>> 
>> What do you mean with probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
> yes.
> 
>> 
>> RFC8899 contains:
>> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
>> 
>> So, how about plpmtud_probe_max_ack_time?
> "plpmtud_probe_interval" I got the name from tcp's sysctl plpmtud in
> linux. I was hoping to keep this consistent in sysctl and sockopt
> between Linux and BSD.  Note this parameter is also the interval to
> send a probe for the current pmtu in Search Complete status.

Do you send probe packets in Search Complete to confirm the current PMTU estimate?

RFC 8899 suggests doing this only for unreliable PLs. For a reliable PL like SCTP, it suggests using the loss of (data) packets as the indication instead.

> 
>> 
>> Also, I think more parameters would be helpful. For example,
>> 
>> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
>> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
>> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
>> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
> With these, the control will be more detailed for sure.
> But I didn't want to introduce too many parameters for this feature,
> as you know, these parameters could also be per socket/asoc/transport,
> and doing set/get with sockopt.
> 
> instead, we keep most fixed:
> 
> plpmtud_use_ptb = 1
> plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
> plpmtud_max_probes = 3
> plpmtud_enable = !! plpmtud_probe_interval
> 
> Only one variable:
> plpmtud_probe_interval >= 5000ms

OK

> 
> So I think this is up to the implementation, if you want more things
> to tune, you can go ahead with these all parameters exposed to users.

Agreed. It is probably a good idea not to add too many parameters.

> 
>> 
>> Timo
>> 
>>> 
>>> Thanks.
>>>>>      be disabled when 0 is set.
>>>>> 
>>>>>      Default: 0
>>>>> 
>>>>> 2. a socket option that can be used per socket, assoc or transport
>>>>> 
>>>>> /* PLPMTUD Probe Interval socket option */
>>>>> struct sctp_probeinterval {
>>>>>      sctp_assoc_t spi_assoc_id;
>>>>>      struct sockaddr_storage spi_address;
>>>>>      __u32 spi_interval;
>>>>> };
>>>>> 
>>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
>>>>> 
>>>>> 
>>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
>>>>> interval for the timer. When it's 0, the timer will also stop and
>>>>> PLPMTUD is disabled.
>>>>> By this way, we don't need to introduce more options.
>>>> OK.
>>>>> 
>>>>> We're expecting to keep consistent with BSD on this, pls check and
>>>>> share your thoughts.
>>>> Looks good to me.
>>>> 
>>>> Best regards
>>>> Michael
>>>>> 
>>>>> Thanks.
>>>> 
>> 
>> 




* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-07-07 12:36           ` Timo Völker
@ 2021-07-07 16:30             ` Xin Long
  2021-07-08 14:18               ` Timo Völker
  0 siblings, 1 reply; 12+ messages in thread
From: Xin Long @ 2021-07-07 16:30 UTC
  To: Timo Völker
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen

On Wed, Jul 7, 2021 at 8:36 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>
> > On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@gmail.com> wrote:
> >
> > On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
> >>
> >>
> >> Hi Xin,
> >>
> >> I implemented RFC8899 for an SCTP simulation model.
> > great, can I know what that one is?
>
> I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.
Thanks.

>
> >
> >>
> >> Comments follow inline.
> >>
> >>> Begin forwarded message:
> >>>
> >>> From: Xin Long <lucien.xin@gmail.com>
> >>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
> >>> Date: 12. June 2021 at 19:32:02 CEST
> >>> To: Michael Tuexen <tuexen@freebsd.org>
> >>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >>>
> >>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
> >>>>
> >>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
> >>>>>
> >>>>> Hi, Michael,
> >>>>>
> >>>>> In the linux implementation of RFC8899, we decided to introduce one
> >>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
> >>>>>
> >>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
> >>>>>
> >>>>> plpmtud_probe_interval - INTEGER
> >>>>>      The interval (in milliseconds) between PLPMTUD probe chunks. These
> >>>>>      chunks are sent at the specified interval with a variable size to
> >>>>>      probe the mtu of a given path between 2 associations. PLPMTUD will
> >>>> I guess you mean "between 2 end points" instead of "between 2 associations".
> >>>>
> >>>> I'm not sure what it means:
> >>>>
> >>>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
> >>>>
> >>>> Assume you sent a probe packet for 1400. Aren't you sending the
> >>>> probe packet for 1420 as soon as you get an ACK for the probe packet
> >>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
> >>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
> >>> searching complete it will be "plpmtud_probe_interval * 30" ms.
> >>
> >> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
> >>
> >> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
> > yeah, we should do it immediately to make this more efficient, and I
> > already fixed it in linux for ACK.
> >
> > For PTB, I currently only set probe_size as the pmtu from ICMP packet
> > when pmtu > 'current pmtu' && pmtu < probe_size, and wait until next
> > probe_timer. But probably better to send it immediately too, I need to
> > confirm.
>
> I think so. At least I don't know what to wait for.
I'm not sure about this, as it says:

   PLPMTU < PL_PTB_SIZE < PROBED_SIZE
   ...
      *  The PL can use the reported PL_PTB_SIZE from the PTB message as
         the next search point when it resumes the search algorithm.

it doesn't seem to mean that.
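To make the two options concrete, here is a minimal Python sketch of the PTB handling described above: the reported size becomes the next probe size only when it lies strictly between the current pmtu and the size of the probe in flight. The function name on_ptb and the send_now switch are made up for illustration and are not taken from the Linux code. Whether to probe immediately or wait for the probe timer is exactly the open question here, so it is left as a parameter.

```python
def on_ptb(plpmtu, probe_size, ptb_size, send_now=False):
    """Return (next_probe_size, probe_immediately) after an ICMP PTB.

    plpmtu     -- currently confirmed path MTU
    probe_size -- size of the probe packet that triggered the PTB
    ptb_size   -- MTU reported in the PTB message (PL_PTB_SIZE)
    send_now   -- hypothetical policy switch: probe right away instead of
                  waiting for the next probe timer expiry
    """
    # Only the case PLPMTU < PL_PTB_SIZE < PROBED_SIZE is handled here;
    # every other PTB leaves the search untouched in this sketch.
    if plpmtu < ptb_size < probe_size:
        return ptb_size, send_now
    return probe_size, False


# A PTB reporting 1450 while probing 1464 with a confirmed pmtu of 1400:
print(on_ptb(1400, 1464, 1450))                  # wait for the timer
print(on_ptb(1400, 1464, 1450, send_now=True))   # probe immediately
```

The other PTB cases that RFC8899 lists are deliberately left out of this sketch.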


>
> >
> >>
> >>>
> >>> The step we are using is 32, when it fails, we turn the step to 4. For example:
> >>> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
> >>> 1500 is the PMTU).
> >>
> >> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
> > yes
> >
> >>
> >> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
> > Sounds a good way to go, and it would save 2 intervals to get the
> > optimal value in the normal case.
> > But if the failure is false (like the link is unstable), it may also
> > take some time to catch up to the bigger candidate.
>
> Right, it's a trade off. What is better depends on the probability of a probe packet loss due to another reason than its size.
>
> I chose to do something like this, when searching for a PMTU of 1472:
>
> 1400 ack
> 1432 ack
> 1464 timeout (false negative)
> 1436 ack
> 1440 ack
> 1444 ack
> 1448 ack
> 1452 ack
> 1456 ack
> 1460 ack
> 1464 ack
> 1496 timeout
> 1468 ack
> 1472 ack
> 1476 timeout
> 1476 timeout
> 1476 timeout
> done with PMTU=1472
Looks good to me. :-)
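For reference, the sequence above can be reproduced with a small Python simulation. This is only a sketch of one rule set that is consistent with the listing, not the INET or Linux code: a big step of 32, a fallback to a step of 4 after a single timeout at the big step, a return to the big step once the size that timed out is acknowledged after all, and MAX_PROBES timeouts at the small step to finish. The names search, make_path, ceiling and max_size are hypothetical.

```python
BIG_STEP = 32     # step used while searching upwards
SMALL_STEP = 4    # step used after a big-step probe timed out
MAX_PROBES = 3    # timeouts at the small step before the search ends


def search(base, probe_acked, max_size=1500):
    """Return (pmtu, trace); probe_acked(size) says whether a probe is acked."""
    plpmtu = None      # largest acknowledged probe size so far
    probe = base
    step = BIG_STEP
    ceiling = None     # size of the big-step probe that timed out
    failures = 0
    trace = []
    while True:
        if probe_acked(probe):
            trace.append((probe, "ack"))
            plpmtu = probe
            failures = 0
            if plpmtu >= max_size:
                return plpmtu, trace
            if step == SMALL_STEP and probe == ceiling:
                # The earlier timeout was a false negative: speed up again.
                step, ceiling = BIG_STEP, None
            probe = min(plpmtu + step, max_size)
        else:
            trace.append((probe, "timeout"))
            if plpmtu is None:
                return None, trace      # even the base size failed
            if step == BIG_STEP:
                # One unanswered probe is enough to try smaller candidates.
                ceiling, step = probe, SMALL_STEP
                probe = plpmtu + step
            else:
                failures += 1
                if failures >= MAX_PROBES:
                    return plpmtu, trace


def make_path(pmtu, lose_once=()):
    """Simulated path: drops oversized probes and, once, each listed size."""
    lost = set(lose_once)

    def probe_acked(size):
        if size in lost:
            lost.discard(size)
            return False
        return size <= pmtu

    return probe_acked


pmtu, trace = search(1400, make_path(1472, lose_once=[1464]))
for size, result in trace:
    print(size, result)
print("done with PMTU=%d" % pmtu)
```

With a path MTU of 1472 and a single lost probe at 1464, the printed trace follows the listing above line by line. The max_size cap is an addition so that the loop also terminates on a path that acknowledges everything.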

>
> >
> >>
> >>>
> >>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
> >>> As plpmtud_probe_interval is the probe interval TIME for the timer.
> >>> Apart from 0, the minimal value is 5000ms.
> >>>
> >>> So it should be:
> >>>
> >>> plpmtud_probe_interval - INTEGER
> >>>       The time interval (in milliseconds) for sending PLPMTUD probe chunks.
> >>>       These chunks are sent at the specified interval with a variable size
> >>>       to probe the mtu of a given path between 2 endpoints. PLPMTUD will
> >>>       be disabled when 0 is set.
> >>>
> >>>       Default: 0
> >>
> >> What do you mean with probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
> > yes.
> >
> >>
> >> RFC8899 contains:
> >> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
> >>
> >> So, how about plpmtud_probe_max_ack_time?
> > "plpmtud_probe_interval" I got the name from tcp's sysctl plpmtud in
> > linux. I was hoping to keep this consistent in sysctl and sockopt
> > between Linux and BSD.  Note this parameter is also the interval to
> > send a probe for the current pmtu in Search Complete status.
>
> Do you send probe packets in Search Complete to confirm the current PMTU estimation?
>
> RFC8899 suggests to do this only for non-reliable PLs. For a reliable PL like SCTP, it suggests to use the loss of (data) packets as indication instead.
Can you point out the place in RFC8899 saying so?

What I saw is:

   Search Complete:  The Search Complete Phase is entered when the
      PLPMTU is supported across the network path.  A PL can use a
      CONFIRMATION_TIMER to periodically repeat a probe packet for the
      current PLPMTU size.  If the sender is unable to confirm
      reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL
      signals a lack of reachability, a black hole has been detected and
      DPLPMTUD enters the Base Phase.

it doesn't matter if it's a reliable or non-reliable PL, no?

>
> >
> >>
> >> Also, I think more parameters would be helpful. For example,
> >>
> >> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
> >> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
> >> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
> >> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
> > With these, the control will be more detailed for sure.
> > But I didn't want to introduce too many parameters for this feature,
> > as you know, these parameters could also be per socket/asoc/transport,
> > and doing set/get with sockopt.
> >
> > instead, we keep most fixed:
> >
> > plpmtud_use_ptb = 1
> > plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
> > plpmtud_max_probes = 3
> > plpmtud_enable = !! plpmtud_probe_interval
> >
> > Only one variable:
> > plpmtud_probe_interval >= 5000ms
>
> OK
>
> >
> > So I think this is up to the implementation, if you want more things
> > to tune, you can go ahead with these all parameters exposed to users.
>
> Agree. It is probably a good idea to add not too much parameters.
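As a summary of the single-variable design, here is a small Python sketch that derives the fixed settings listed above from plpmtud_probe_interval. The function name plpmtud_settings and the dictionary keys are illustrative only. Rejecting non-zero values below 5000 with an error is an assumption of this sketch; the thread only says that such values won't work.

```python
MIN_PROBE_INTERVAL_MS = 5000   # smallest non-zero interval mentioned above
RAISE_FACTOR = 30              # raise time = 30 * probe interval
MAX_PROBES = 3                 # probes sent per candidate size


def plpmtud_settings(probe_interval_ms):
    """Derive the fixed PLPMTUD settings from the one tunable interval."""
    if probe_interval_ms != 0 and probe_interval_ms < MIN_PROBE_INTERVAL_MS:
        # Assumption: too-small values are rejected rather than clamped.
        raise ValueError("interval must be 0 or >= %d ms" % MIN_PROBE_INTERVAL_MS)
    return {
        "enable": probe_interval_ms != 0,      # !! plpmtud_probe_interval
        "use_ptb": True,
        "max_probes": MAX_PROBES,
        "probe_interval_ms": probe_interval_ms,
        "raise_time_ms": RAISE_FACTOR * probe_interval_ms,
    }


print(plpmtud_settings(0))       # PLPMTUD disabled
print(plpmtud_settings(5000))    # smallest enabled configuration
```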
>
> >
> >>
> >> Timo
> >>
> >>>
> >>> Thanks.
> >>>>>      be disabled when 0 is set.
> >>>>>
> >>>>>      Default: 0
> >>>>>
> >>>>> 2. a socket option that can be used per socket, assoc or transport
> >>>>>
> >>>>> /* PLPMTUD Probe Interval socket option */
> >>>>> struct sctp_probeinterval {
> >>>>>      sctp_assoc_t spi_assoc_id;
> >>>>>      struct sockaddr_storage spi_address;
> >>>>>      __u32 spi_interval;
> >>>>> };
> >>>>>
> >>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
> >>>>>
> >>>>>
> >>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
> >>>>> interval for the timer. When it's 0, the timer will also stop and
> >>>>> PLPMTUD is disabled.
> >>>>> By this way, we don't need to introduce more options.
> >>>> OK.
> >>>>>
> >>>>> We're expecting to keep consistent with BSD on this, pls check and
> >>>>> share your thoughts.
> >>>> Looks good to me.
> >>>>
> >>>> Best regards
> >>>> Michael
> >>>>>
> >>>>> Thanks.
> >>>>
> >>
> >>
>

* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-07-07 16:30             ` Xin Long
@ 2021-07-08 14:18               ` Timo Völker
  2021-07-08 15:54                 ` Xin Long
  0 siblings, 1 reply; 12+ messages in thread
From: Timo Völker @ 2021-07-08 14:18 UTC (permalink / raw)
  To: Xin Long
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen

> On 7. Jul 2021, at 18:30, Xin Long <lucien.xin@gmail.com> wrote:
> 
> On Wed, Jul 7, 2021 at 8:36 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>> 
>>> On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@gmail.com> wrote:
>>> 
>>> On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>>>> 
>>>> 
>>>> Hi Xin,
>>>> 
>>>> I implemented RFC8899 for an SCTP simulation model.
>>> great, can I know what that one is?
>> 
>> I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.
> Thanks.
> 
>> 
>>> 
>>>> 
>>>> Comments follow inline.
>>>> 
>>>>> Begin forwarded message:
>>>>> 
>>>>> From: Xin Long <lucien.xin@gmail.com>
>>>>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
>>>>> Date: 12. June 2021 at 19:32:02 CEST
>>>>> To: Michael Tuexen <tuexen@freebsd.org>
>>>>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>>>>> 
>>>>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
>>>>>> 
>>>>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi, Michael,
>>>>>>> 
>>>>>>> In the linux implementation of RFC8899, we decided to introduce one
>>>>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
>>>>>>> 
>>>>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
>>>>>>> 
>>>>>>> plpmtud_probe_interval - INTEGER
>>>>>>>     The interval (in milliseconds) between PLPMTUD probe chunks. These
>>>>>>>     chunks are sent at the specified interval with a variable size to
>>>>>>>     probe the mtu of a given path between 2 associations. PLPMTUD will
>>>>>> I guess you mean "between 2 end points" instead of "between 2 associations".
>>>>>> 
>>>>>> I'm not sure what it means:
>>>>>> 
>>>>>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
>>>>>> 
>>>>>> Assume you sent a probe packet for 1400. Aren't you sending the
>>>>>> probe packet for 1420 as soon as you get an ACK for the probe packet
>>>>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
>>>>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
>>>>> searching complete it will be "plpmtud_probe_interval * 30" ms.
>>>> 
>>>> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
>>>> 
>>>> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
>>> yeah, we should do it immediately to make this more efficient, and I
>>> already fixed it in linux for ACK.
>>> 
>>> For PTB, I currently only set probe_size as the pmtu from ICMP packet
>>> when pmtu > 'current pmtu' && pmtu < probe_size, and wait until next
>>> probe_timer. But probably better to send it immediately too, I need to
>>> confirm.
>> 
>> I think so. At least I don't know what to wait for.
> I'm not sure about this, as it says:
> 
>   PLPMTU < PL_PTB_SIZE < PROBED_SIZE
>   ...
>      *  The PL can use the reported PL_PTB_SIZE from the PTB message as
>         the next search point when it resumes the search algorithm.
> 
> it doesn't seem to mean that.

The "when it resumes the search algorithm" is a little abstract, but I don't read it as saying that the PL has to wait for a timeout before resuming the search algorithm.

> 
> 
>> 
>>> 
>>>> 
>>>>> 
>>>>> The step we are using is 32, when it fails, we turn the step to 4. For example:
>>>>> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
>>>>> 1500 is the PMTU).
>>>> 
>>>> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
>>> yes
>>> 
>>>> 
>>>> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
>>> Sounds a good way to go, and it would save 2 intervals to get the
>>> optimal value in the normal case.
>>> But if the failure is false (like the link is unstable), it may also
>>> take some time to catch up to the bigger candidate.
>> 
>> Right, it's a trade off. What is better depends on the probability of a probe packet loss due to another reason than its size.
>> 
>> I chose to do something like this, when searching for a PMTU of 1472:
>> 
>> 1400 ack
>> 1432 ack
>> 1464 timeout (false negative)
>> 1436 ack
>> 1440 ack
>> 1444 ack
>> 1448 ack
>> 1452 ack
>> 1456 ack
>> 1460 ack
>> 1464 ack
>> 1496 timeout
>> 1468 ack
>> 1472 ack
>> 1476 timeout
>> 1476 timeout
>> 1476 timeout
>> done with PMTU=1472
> Looks good to me. :-)
> 
>> 
>>> 
>>>> 
>>>>> 
>>>>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
>>>>> As plpmtud_probe_interval is the probe interval TIME for the timer.
>>>>> Apart from 0, the minimal value is 5000ms.
>>>>> 
>>>>> So it should be:
>>>>> 
>>>>> plpmtud_probe_interval - INTEGER
>>>>>      The time interval (in milliseconds) for sending PLPMTUD probe chunks.
>>>>>      These chunks are sent at the specified interval with a variable size
>>>>>      to probe the mtu of a given path between 2 endpoints. PLPMTUD will
>>>>>      be disabled when 0 is set.
>>>>> 
>>>>>      Default: 0
>>>> 
>>>> What do you mean with probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
>>> yes.
>>> 
>>>> 
>>>> RFC8899 contains:
>>>> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
>>>> 
>>>> So, how about plpmtud_probe_max_ack_time?
>>> "plpmtud_probe_interval" I got the name from tcp's sysctl plpmtud in
>>> linux. I was hoping to keep this consistent in sysctl and sockopt
>>> between Linux and BSD.  Note this parameter is also the interval to
>>> send a probe for the current pmtu in Search Complete status.
>> 
>> Do you send probe packets in Search Complete to confirm the current PMTU estimation?
>> 
>> RFC8899 suggests to do this only for non-reliable PLs. For a reliable PL like SCTP, it suggests to use the loss of (data) packets as indication instead.
> Can you point out the place in RFC8899 saying so?
> 
> What I saw is:
> 
>   Search Complete:  The Search Complete Phase is entered when the
>      PLPMTU is supported across the network path.  A PL can use a
>      CONFIRMATION_TIMER to periodically repeat a probe packet for the
>      current PLPMTU size.  If the sender is unable to confirm
>      reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL
>      signals a lack of reachability, a black hole has been detected and
>      DPLPMTUD enters the Base Phase.
> 
> it doesn't matter if it's a reliable or non-reliable PL, no?

The descriptions of the phases only give a high-level overview of the mechanism. The state diagram is more detailed. There you find this sentence: "When used with an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to generate PLPMTU probes in this state". However, that sentence refers only to probes that confirm the current PMTU estimate. SCTP should still send probe packets in Search Complete to probe for a larger PMTU.

> 
>> 
>>> 
>>>> 
>>>> Also, I think more parameters would be helpful. For example,
>>>> 
>>>> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
>>>> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
>>>> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
>>>> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
>>> With these, the control will be more detailed for sure.
>>> But I didn't want to introduce too many parameters for this feature,
>>> as you know, these parameters could also be per socket/asoc/transport,
>>> and doing set/get with sockopt.
>>> 
>>> instead, we keep most fixed:
>>> 
>>> plpmtud_use_ptb = 1
>>> plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
>>> plpmtud_max_probes = 3
>>> plpmtud_enable = !! plpmtud_probe_interval
>>> 
>>> Only one variable:
>>> plpmtud_probe_interval >= 5000ms
>> 
>> OK
>> 
>>> 
>>> So I think this is up to the implementation, if you want more things
>>> to tune, you can go ahead with these all parameters exposed to users.
>> 
>> Agree. It is probably a good idea to add not too much parameters.
>> 
>>> 
>>>> 
>>>> Timo
>>>> 
>>>>> 
>>>>> Thanks.
>>>>>>>     be disabled when 0 is set.
>>>>>>> 
>>>>>>>     Default: 0
>>>>>>> 
>>>>>>> 2. a socket option that can be used per socket, assoc or transport
>>>>>>> 
>>>>>>> /* PLPMTUD Probe Interval socket option */
>>>>>>> struct sctp_probeinterval {
>>>>>>>     sctp_assoc_t spi_assoc_id;
>>>>>>>     struct sockaddr_storage spi_address;
>>>>>>>     __u32 spi_interval;
>>>>>>> };
>>>>>>> 
>>>>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
>>>>>>> 
>>>>>>> 
>>>>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
>>>>>>> interval for the timer. When it's 0, the timer will also stop and
>>>>>>> PLPMTUD is disabled.
>>>>>>> By this way, we don't need to introduce more options.
>>>>>> OK.
>>>>>>> 
>>>>>>> We're expecting to keep consistent with BSD on this, pls check and
>>>>>>> share your thoughts.
>>>>>> Looks good to me.
>>>>>> 
>>>>>> Best regards
>>>>>> Michael
>>>>>>> 
>>>>>>> Thanks.
>>>>>> 
>>>> 
>>>> 
>> 


* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-07-08 14:18               ` Timo Völker
@ 2021-07-08 15:54                 ` Xin Long
  2021-07-12  8:09                   ` Timo Völker
  0 siblings, 1 reply; 12+ messages in thread
From: Xin Long @ 2021-07-08 15:54 UTC (permalink / raw)
  To: Timo Völker
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen

On Thu, Jul 8, 2021 at 10:18 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>
> > On 7. Jul 2021, at 18:30, Xin Long <lucien.xin@gmail.com> wrote:
> >
> > On Wed, Jul 7, 2021 at 8:36 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
> >>
> >>> On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@gmail.com> wrote:
> >>>
> >>> On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
> >>>>
> >>>>
> >>>> Hi Xin,
> >>>>
> >>>> I implemented RFC8899 for an SCTP simulation model.
> >>> great, can I know what that one is?
> >>
> >> I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.
> > Thanks.
> >
> >>
> >>>
> >>>>
> >>>> Comments follow inline.
> >>>>
> >>>>> Begin forwarded message:
> >>>>>
> >>>>> From: Xin Long <lucien.xin@gmail.com>
> >>>>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
> >>>>> Date: 12. June 2021 at 19:32:02 CEST
> >>>>> To: Michael Tuexen <tuexen@freebsd.org>
> >>>>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >>>>>
> >>>>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
> >>>>>>
> >>>>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hi, Michael,
> >>>>>>>
> >>>>>>> In the linux implementation of RFC8899, we decided to introduce one
> >>>>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
> >>>>>>>
> >>>>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
> >>>>>>>
> >>>>>>> plpmtud_probe_interval - INTEGER
> >>>>>>>     The interval (in milliseconds) between PLPMTUD probe chunks. These
> >>>>>>>     chunks are sent at the specified interval with a variable size to
> >>>>>>>     probe the mtu of a given path between 2 associations. PLPMTUD will
> >>>>>> I guess you mean "between 2 end points" instead of "between 2 associations".
> >>>>>>
> >>>>>> I'm not sure what it means:
> >>>>>>
> >>>>>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
> >>>>>>
> >>>>>> Assume you sent a probe packet for 1400. Aren't you sending the
> >>>>>> probe packet for 1420 as soon as you get an ACK for the probe packet
> >>>>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
> >>>>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
> >>>>> searching complete it will be "plpmtud_probe_interval * 30" ms.
> >>>>
> >>>> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
> >>>>
> >>>> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
> >>> yeah, we should do it immediately to make this more efficient, and I
> >>> already fixed it in linux for ACK.
> >>>
> >>> For PTB, I currently only set probe_size as the pmtu from ICMP packet
> >>> when pmtu > 'current pmtu' && pmtu < probe_size, and wait until next
> >>> probe_timer. But probably better to send it immediately too, I need to
> >>> confirm.
> >>
> >> I think so. At least I don't know what to wait for.
> > I'm not sure about this, as it says:
> >
> >   PLPMTU < PL_PTB_SIZE < PROBED_SIZE
> >   ...
> >      *  The PL can use the reported PL_PTB_SIZE from the PTB message as
> >         the next search point when it resumes the search algorithm.
> >
> > it doesn't seem to mean that.
>
> The "when it resumes the search algorithm" is a little abstract, but I don't read it as saying that the PL has to wait for a timeout before resuming the search algorithm.
>
> >
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>> The step we are using is 32, when it fails, we turn the step to 4. For example:
> >>>>> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
> >>>>> 1500 is the PMTU).
> >>>>
> >>>> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
> >>> yes
> >>>
> >>>>
> >>>> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
> >>> Sounds a good way to go, and it would save 2 intervals to get the
> >>> optimal value in the normal case.
> >>> But if the failure is false (like the link is unstable), it may also
> >>> take some time to catch up to the bigger candidate.
> >>
> >> Right, it's a trade off. What is better depends on the probability of a probe packet loss due to another reason than its size.
> >>
> >> I chose to do something like this, when searching for a PMTU of 1472:
> >>
> >> 1400 ack
> >> 1432 ack
> >> 1464 timeout (false negative)
> >> 1436 ack
> >> 1440 ack
> >> 1444 ack
> >> 1448 ack
> >> 1452 ack
> >> 1456 ack
> >> 1460 ack
> >> 1464 ack
> >> 1496 timeout
> >> 1468 ack
> >> 1472 ack
> >> 1476 timeout
> >> 1476 timeout
> >> 1476 timeout
> >> done with PMTU=1472
> > Looks good to me. :-)
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
> >>>>> As plpmtud_probe_interval is the probe interval TIME for the timer.
> >>>>> Apart from 0, the minimal value is 5000ms.
> >>>>>
> >>>>> So it should be:
> >>>>>
> >>>>> plpmtud_probe_interval - INTEGER
> >>>>>      The time interval (in milliseconds) for sending PLPMTUD probe chunks.
> >>>>>      These chunks are sent at the specified interval with a variable size
> >>>>>      to probe the mtu of a given path between 2 endpoints. PLPMTUD will
> >>>>>      be disabled when 0 is set.
> >>>>>
> >>>>>      Default: 0
> >>>>
> >>>> What do you mean with probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
> >>> yes.
> >>>
> >>>>
> >>>> RFC8899 contains:
> >>>> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
> >>>>
> >>>> So, how about plpmtud_probe_max_ack_time?
> >>> "plpmtud_probe_interval" I got the name from tcp's sysctl plpmtud in
> >>> linux. I was hoping to keep this consistent in sysctl and sockopt
> >>> between Linux and BSD.  Note this parameter is also the interval to
> >>> send a probe for the current pmtu in Search Complete status.
> >>
> >> Do you send probe packets in Search Complete to confirm the current PMTU estimation?
> >>
> >> RFC8899 suggests to do this only for non-reliable PLs. For a reliable PL like SCTP, it suggests to use the loss of (data) packets as indication instead.
> > Can you point out the place in RFC8899 saying so?
> >
> > What I saw is:
> >
> >   Search Complete:  The Search Complete Phase is entered when the
> >      PLPMTU is supported across the network path.  A PL can use a
> >      CONFIRMATION_TIMER to periodically repeat a probe packet for the
> >      current PLPMTU size.  If the sender is unable to confirm
> >      reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL
> >      signals a lack of reachability, a black hole has been detected and
> >      DPLPMTUD enters the Base Phase.
> >
> > it doesn't matter if it's a reliable or non-reliable PL, no?
>
> The descriptions of the phases only give a high-level overview of the mechanism. The state diagram is more detailed. There you find this sentence: "When used with an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to generate PLPMTU probes in this state". However, that sentence refers only to probes that confirm the current PMTU estimate. SCTP should still send probe packets in Search Complete to probe for a larger PMTU.
If so, how do we make sure the current PMTU still works during Search Complete?
Where did you get "it suggests to use the loss of (data) packets as
indication instead"?

Thanks.

>
> >
> >>
> >>>
> >>>>
> >>>> Also, I think more parameters would be helpful. For example,
> >>>>
> >>>> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
> >>>> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
> >>>> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
> >>>> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
> >>> With these, the control will be more detailed for sure.
> >>> But I didn't want to introduce too many parameters for this feature,
> >>> as you know, these parameters could also be per socket/asoc/transport,
> >>> and doing set/get with sockopt.
> >>>
> >>> instead, we keep most fixed:
> >>>
> >>> plpmtud_use_ptb = 1
> >>> plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
> >>> plpmtud_max_probes = 3
> >>> plpmtud_enable = !! plpmtud_probe_interval
> >>>
> >>> Only one variable:
> >>> plpmtud_probe_interval >= 5000ms
> >>
> >> OK
> >>
> >>>
> >>> So I think this is up to the implementation, if you want more things
> >>> to tune, you can go ahead with these all parameters exposed to users.
> >>
> >> Agree. It is probably a good idea to add not too much parameters.
> >>
> >>>
> >>>>
> >>>> Timo
> >>>>
> >>>>>
> >>>>> Thanks.
> >>>>>>>     be disabled when 0 is set.
> >>>>>>>
> >>>>>>>     Default: 0
> >>>>>>>
> >>>>>>> 2. a socket option that can be used per socket, assoc or transport
> >>>>>>>
> >>>>>>> /* PLPMTUD Probe Interval socket option */
> >>>>>>> struct sctp_probeinterval {
> >>>>>>>     sctp_assoc_t spi_assoc_id;
> >>>>>>>     struct sockaddr_storage spi_address;
> >>>>>>>     __u32 spi_interval;
> >>>>>>> };
> >>>>>>>
> >>>>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
> >>>>>>>
> >>>>>>>
> >>>>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
> >>>>>>> interval for the timer. When it's 0, the timer will also stop and
> >>>>>>> PLPMTUD is disabled.
> >>>>>>> By this way, we don't need to introduce more options.
> >>>>>> OK.
> >>>>>>>
> >>>>>>> We're expecting to keep consistent with BSD on this, pls check and
> >>>>>>> share your thoughts.
> >>>>>> Looks good to me.
> >>>>>>
> >>>>>> Best regards
> >>>>>> Michael
> >>>>>>>
> >>>>>>> Thanks.
> >>>>>>
> >>>>
> >>>>
> >>
>

* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-07-08 15:54                 ` Xin Long
@ 2021-07-12  8:09                   ` Timo Völker
  2021-07-19 16:55                     ` Xin Long
  0 siblings, 1 reply; 12+ messages in thread
From: Timo Völker @ 2021-07-12  8:09 UTC (permalink / raw)
  To: Xin Long
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen

> On 8. Jul 2021, at 17:54, Xin Long <lucien.xin@gmail.com> wrote:
> 
> On Thu, Jul 8, 2021 at 10:18 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>> 
>>> On 7. Jul 2021, at 18:30, Xin Long <lucien.xin@gmail.com> wrote:
>>> 
>>> On Wed, Jul 7, 2021 at 8:36 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>>>> 
>>>>> On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@gmail.com> wrote:
>>>>> 
>>>>> On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>>>>>> 
>>>>>> 
>>>>>> Hi Xin,
>>>>>> 
>>>>>> I implemented RFC8899 for an SCTP simulation model.
>>>>> great, can I know what that one is?
>>>> 
>>>> I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.
>>> Thanks.
>>> 
>>>> 
>>>>> 
>>>>>> 
>>>>>> Comments follow inline.
>>>>>> 
>>>>>>> Begin forwarded message:
>>>>>>> 
>>>>>>> From: Xin Long <lucien.xin@gmail.com>
>>>>>>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
>>>>>>> Date: 12. June 2021 at 19:32:02 CEST
>>>>>>> To: Michael Tuexen <tuexen@freebsd.org>
>>>>>>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>>>>>>> 
>>>>>>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
>>>>>>>> 
>>>>>>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hi, Michael,
>>>>>>>>> 
>>>>>>>>> In the linux implementation of RFC8899, we decided to introduce one
>>>>>>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
>>>>>>>>> 
>>>>>>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
>>>>>>>>> 
>>>>>>>>> plpmtud_probe_interval - INTEGER
>>>>>>>>>    The interval (in milliseconds) between PLPMTUD probe chunks. These
>>>>>>>>>    chunks are sent at the specified interval with a variable size to
>>>>>>>>>    probe the mtu of a given path between 2 associations. PLPMTUD will
>>>>>>>> I guess you mean "between 2 end points" instead of "between 2 associations".
>>>>>>>> 
>>>>>>>> I'm not sure what it means:
>>>>>>>> 
>>>>>>>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
>>>>>>>> 
>>>>>>>> Assume you sent a probe packet for 1400. Aren't you sending the
>>>>>>>> probe packet for 1420 as soon as you get an ACK for the probe packet
>>>>>>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
>>>>>>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
>>>>>>> searching complete it will be "plpmtud_probe_interval * 30" ms.
>>>>>> 
>>>>>> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
>>>>>> 
>>>>>> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
>>>>> yeah, we should do it immediately to make this more efficient, and I
>>>>> already fixed it in linux for ACK.
>>>>> 
>>>>> For PTB, I currently only set probe_size as the pmtu from ICMP packet
>>>>> when pmtu > 'current pmtu' && pmtu < probe_size, and wait until next
>>>>> probe_timer. But probably better to send it immediately too, I need to
>>>>> confirm.
>>>> 
>>>> I think so. At least I don't know what to wait for.
>>> I'm not sure about this, as it says:
>>> 
>>>  PLPMTU < PL_PTB_SIZE < PROBED_SIZE
>>>  ...
>>>     *  The PL can use the reported PL_PTB_SIZE from the PTB message as
>>>        the next search point when it resumes the search algorithm.
>>> 
>>> it doesn't seem to mean that.
>> 
>> The "when it resumes the search algorithm" is a little abstract, but I don't read it as saying that the PL has to wait for a timeout before resuming the search algorithm.
>> 
>>> 
>>> 
>>>> 
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> The step we are using is 32, when it fails, we turn the step to 4. For example:
>>>>>>> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
>>>>>>> 1500 is the PMTU).
>>>>>> 
>>>>>> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
>>>>> yes
>>>>> 
>>>>>> 
>>>>>> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
>>>>> Sounds a good way to go, and it would save 2 intervals to get the
>>>>> optimal value in the normal case.
>>>>> But if the failure is false (like the link is unstable), it may also
>>>>> take some time to catch up to the bigger candidate.
>>>> 
>>>> Right, it's a trade-off. Which is better depends on the probability of losing a probe packet for a reason other than its size.
>>>> 
>>>> I chose to do something like this, when searching for a PMTU of 1472:
>>>> 
>>>> 1400 ack
>>>> 1432 ack
>>>> 1464 timeout (false negative)
>>>> 1436 ack
>>>> 1440 ack
>>>> 1444 ack
>>>> 1448 ack
>>>> 1452 ack
>>>> 1456 ack
>>>> 1460 ack
>>>> 1464 ack
>>>> 1496 timeout
>>>> 1468 ack
>>>> 1472 ack
>>>> 1476 timeout
>>>> 1476 timeout
>>>> 1476 timeout
>>>> done with PMTU=1472
>>> Looks good to me. :-)
>>> 
>>>> 
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
>>>>>>> As plpmtud_probe_interval is the probe interval TIME for the timer.
>>>>>>> Apart from 0, the minimal value is 5000ms.
>>>>>>> 
>>>>>>> So it should be:
>>>>>>> 
>>>>>>> plpmtud_probe_interval - INTEGER
>>>>>>>     The time interval (in milliseconds) for sending PLPMTUD probe chunks.
>>>>>>>     These chunks are sent at the specified interval with a variable size
>>>>>>>     to probe the mtu of a given path between 2 endpoints. PLPMTUD will
>>>>>>>     be disabled when 0 is set.
>>>>>>> 
>>>>>>>     Default: 0
>>>>>> 
>>>>>> What do you mean with probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
>>>>> yes.
>>>>> 
>>>>>> 
>>>>>> RFC8899 contains:
>>>>>> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
>>>>>> 
>>>>>> So, how about plpmtud_probe_max_ack_time?
>>>>> "plpmtud_probe_interval" I got the name from tcp's sysctl plpmtud in
>>>>> linux. I was hoping to keep this consistent in sysctl and sockopt
>>>>> between Linux and BSD.  Note this parameter is also the interval to
>>>>> send a probe for the current pmtu in Search Complete status.
>>>> 
>>>> Do you send probe packets in Search Complete to confirm the current PMTU estimation?
>>>> 
>>>> RFC8899 suggests to do this only for non-reliable PLs. For a reliable PL like SCTP, it suggests to use the loss of (data) packets as indication instead.
>>> Can you point out the place in RFC8899 saying so?
>>> 
>>> What I saw is:
>>> 
>>>  Search Complete:  The Search Complete Phase is entered when the
>>>     PLPMTU is supported across the network path.  A PL can use a
>>>     CONFIRMATION_TIMER to periodically repeat a probe packet for the
>>>     current PLPMTU size.  If the sender is unable to confirm
>>>     reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL
>>>     signals a lack of reachability, a black hole has been detected and
>>>     DPLPMTUD enters the Base Phase.
>>> 
>>> it doesn't matter if it's a reliable or non-reliable PL, no?
>> 
>> The description of the phases gives a high-level overview of the mechanism. The state diagram is more detailed. There you find this sentence: "When used with an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to generate PLPMTU probes in this state". However, it refers only to probes for confirmation of the current PMTU estimation. SCTP should send probe packets to probe for a larger PMTU in Search Complete.
> If so, how to make sure the current pmtu is working during the Search Complete?
> Where did you get "it suggests to use the loss of (data) packets as
> indication instead"?

Sorry, RFC8899 only suggests to not send probe packets to confirm the current PMTU estimation in Search Complete (when used within an acknowledged PL, like SCTP).

Since I don't see another way to detect a decreased PMTU, I interpreted it as a suggestion to use packet loss for the detection.

Timo

> 
> Thanks.
> 
>> 
>>> 
>>>> 
>>>>> 
>>>>>> 
>>>>>> Also, I think more parameters would be helpful. For example,
>>>>>> 
>>>>>> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
>>>>>> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
>>>>>> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
>>>>>> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
>>>>> With these, the control will be more detailed for sure.
>>>>> But I didn't want to introduce too many parameters for this feature,
>>>>> as you know, these parameters could also be per socket/assoc/transport,
>>>>> and doing set/get with sockopt.
>>>>> 
>>>>> instead, we keep most fixed:
>>>>> 
>>>>> plpmtud_use_ptb = 1
>>>>> plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
>>>>> plpmtud_max_probes = 3
>>>>> plpmtud_enable = !! plpmtud_probe_interval
>>>>> 
>>>>> Only one variable:
>>>>> plpmtud_probe_interval >= 5000ms
>>>> 
>>>> OK
>>>> 
>>>>> 
>>>>> So I think this is up to the implementation, if you want more things
>>>>> to tune, you can go ahead with all these parameters exposed to users.
>>>> 
>>>> Agree. It is probably a good idea not to add too many parameters.
>>>> 
>>>>> 
>>>>>> 
>>>>>> Timo
>>>>>> 
>>>>>>> 
>>>>>>> Thanks.
>>>>>>>>>    be disabled when 0 is set.
>>>>>>>>> 
>>>>>>>>>    Default: 0
>>>>>>>>> 
>>>>>>>>> 2. a socket option that can be used per socket, assoc or transport
>>>>>>>>> 
>>>>>>>>> /* PLPMTUD Probe Interval socket option */
>>>>>>>>> struct sctp_probeinterval {
>>>>>>>>>    sctp_assoc_t spi_assoc_id;
>>>>>>>>>    struct sockaddr_storage spi_address;
>>>>>>>>>    __u32 spi_interval;
>>>>>>>>> };
>>>>>>>>> 
>>>>>>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
>>>>>>>>> interval for the timer. When it's 0, the timer will also stop and
>>>>>>>>> PLPMTUD is disabled.
>>>>>>>>> This way, we don't need to introduce more options.
>>>>>>>> OK.
>>>>>>>>> 
>>>>>>>>> We're expecting to keep consistent with BSD on this, pls check and
>>>>>>>>> share your thoughts.
>>>>>>>> Looks good to me.
>>>>>>>> 
>>>>>>>> Best regards
>>>>>>>> Michael
>>>>>>>>> 
>>>>>>>>> Thanks.
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>> 
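
For reference, the search trace quoted above (coarse step 32, fall back to step 4 after a single timeout, give up after 3 timeouts on a fine-step candidate) can be sketched in C roughly as below. This is only one reading of that trace, not the INET or Linux code. The names plpmtud_search, probe_fn, sim_path and sim_probe are made up for illustration, the base size is assumed to be confirmed already, and the upper bound `limit` is an added assumption so the loop always terminates.

```c
#include <assert.h>
#include <stddef.h>

#define BIG_STEP   32  /* coarse step while no coarse candidate has failed */
#define SMALL_STEP 4   /* fine step after a coarse candidate timed out */
#define MAX_PROBES 3   /* probes per fine-step candidate before giving up */

/* Hypothetical probe callback: returns 1 if the probe was acked, 0 on timeout. */
typedef int (*probe_fn)(int size, void *ctx);

static int min_int(int a, int b) { return a < b ? a : b; }

/*
 * Search upwards from `base` (assumed already confirmed) up to `limit`.
 * Returns the largest size that was acknowledged.
 */
int plpmtud_search(int base, int limit, probe_fn probe, void *ctx)
{
    int acked = base;
    int big_fail = 0;   /* coarse candidate that timed out, 0 if none */

    assert(probe != NULL && base > 0 && limit >= base);

    while (acked < limit) {
        if (big_fail == 0) {
            /* Coarse mode: one probe per candidate, no retries. */
            int cand = min_int(acked + BIG_STEP, limit);
            if (probe(cand, ctx))
                acked = cand;
            else
                big_fail = cand;
        } else {
            /* Fine mode: retry each candidate up to MAX_PROBES times. */
            int cand = min_int(acked + SMALL_STEP, limit);
            int ok = 0;
            for (int i = 0; i < MAX_PROBES && !ok; i++)
                ok = probe(cand, ctx);
            if (!ok)
                return acked;       /* search complete */
            acked = cand;
            if (acked >= big_fail)  /* earlier timeout was a false negative */
                big_fail = 0;
        }
    }
    return acked;
}

/* Simulated path for experiments: drops the first probe of size drop_once. */
struct sim_path { int pmtu; int drop_once; int probes_sent; };

int sim_probe(int size, void *ctx)
{
    struct sim_path *p = ctx;
    p->probes_sent++;
    if (size == p->drop_once) {
        p->drop_once = 0;
        return 0;
    }
    return size <= p->pmtu;
}
```

Called as plpmtud_search(1400, 9000, sim_probe, &path) with path = { 1472, 1464, 0 }, it sends the same candidates as the trace from the 1432 probe onwards and stops at 1472. Retrying only in fine mode keeps the cost of a false negative in coarse mode to a handful of extra small probes.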




* Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
  2021-07-12  8:09                   ` Timo Völker
@ 2021-07-19 16:55                     ` Xin Long
  0 siblings, 0 replies; 12+ messages in thread
From: Xin Long @ 2021-07-19 16:55 UTC (permalink / raw)
  To: Timo Völker
  Cc: Marcelo Ricardo Leitner, linux-sctp @ vger . kernel . org, tuexen

On Mon, Jul 12, 2021 at 4:09 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
>
> > On 8. Jul 2021, at 17:54, Xin Long <lucien.xin@gmail.com> wrote:
> >
> > On Thu, Jul 8, 2021 at 10:18 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
> >>
> >>> On 7. Jul 2021, at 18:30, Xin Long <lucien.xin@gmail.com> wrote:
> >>>
> >>> On Wed, Jul 7, 2021 at 8:36 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
> >>>>
> >>>>> On 6. Jul 2021, at 18:01, Xin Long <lucien.xin@gmail.com> wrote:
> >>>>>
> >>>>> On Tue, Jul 6, 2021 at 5:13 AM Timo Völker <timo.voelker@fh-muenster.de> wrote:
> >>>>>>
> >>>>>>
> >>>>>> Hi Xin,
> >>>>>>
> >>>>>> I implemented RFC8899 for an SCTP simulation model.
> >>>>> great, can I know what that one is?
> >>>>
> >>>> I used the SCTP implementation in INET. INET is a simulation model suite for OMNeT++.
> >>> Thanks.
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Comments follow inline.
> >>>>>>
> >>>>>>> Begin forwarded message:
> >>>>>>>
> >>>>>>> From: Xin Long <lucien.xin@gmail.com>
> >>>>>>> Subject: Re: The new sysctl and socket option added for PLPMTUD (RFC8899)
> >>>>>>> Date: 12. June 2021 at 19:32:02 CEST
> >>>>>>> To: Michael Tuexen <tuexen@freebsd.org>
> >>>>>>> Cc: "linux-sctp @ vger . kernel . org" <linux-sctp@vger.kernel.org>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >>>>>>>
> >>>>>>> On Fri, Jun 11, 2021 at 4:42 PM <tuexen@freebsd.org> wrote:
> >>>>>>>>
> >>>>>>>>> On 11. Jun 2021, at 22:20, Xin Long <lucien.xin@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi, Michael,
> >>>>>>>>>
> >>>>>>>>> In the linux implementation of RFC8899, we decided to introduce one
> >>>>>>>>> sysctl and one socket option for users to set up the PLPMTUD probe:
> >>>>>>>>>
> >>>>>>>>> 1. sysctl -w net.sctp.plpmtud_probe_interval=1
> >>>>>>>>>
> >>>>>>>>> plpmtud_probe_interval - INTEGER
> >>>>>>>>>    The interval (in milliseconds) between PLPMTUD probe chunks. These
> >>>>>>>>>    chunks are sent at the specified interval with a variable size to
> >>>>>>>>>    probe the mtu of a given path between 2 associations. PLPMTUD will
> >>>>>>>> I guess you mean "between 2 end points" instead of "between 2 associations".
> >>>>>>>>
> >>>>>>>> I'm not sure what it means:
> >>>>>>>>
> >>>>>>>> I assume, you have candidate 1400, 1420, 1460, 1480, and 1500.
> >>>>>>>>
> >>>>>>>> Assume you sent a probe packet for 1400. Aren't you sending the
> >>>>>>>> probe packet for 1420 as soon as you get an ACK for the probe packet
> >>>>>>>> of size 1400? Or are you waiting for plpmtud_probe_interval ms?
> >>>>>>> It will wait for "plpmtud_probe_interval" ms in searching state, but in
> >>>>>>> searching complete it will be "plpmtud_probe_interval * 30" ms.
> >>>>>>
> >>>>>> Does this mean you always wait for plpmtud_probe_interval ms? Even if you receive an ack for a probe packet or a PTB?
> >>>>>>
> >>>>>> In my implementation, I start with the next probe immediately when receiving an ack or PTB.
> >>>>> yeah, we should do it immediately to make this more efficient, and I
> >>>>> already fixed it in linux for ACK.
> >>>>>
> >>>>> For PTB, I currently only set probe_size as the pmtu from ICMP packet
> >>>>> when pmtu > 'current pmtu' && pmtu < probe_size, and wait until next
> >>>>> probe_timer. But probably better to send it immediately too, I need to
> >>>>> confirm.
> >>>>
> >>>> I think so. At least I don't know what to wait for.
> >>> I'm not sure about this, as it says:
> >>>
> >>>  PLPMTU < PL_PTB_SIZE < PROBED_SIZE
> >>>  ...
> >>>     *  The PL can use the reported PL_PTB_SIZE from the PTB message as
> >>>        the next search point when it resumes the search algorithm.
> >>>
> >>> it doesn't seem to mean that.
> >>
> >> The "when it resumes the search algorithm" is a little abstract, but I don't read it as requiring the PL to wait for a timeout before resuming the search algorithm.
> >>
> >>>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> The step we are using is 32, when it fails, we turn the step to 4. For example:
> >>>>>>> 1400, 1432, 1464, 1496, 1528 (failed), 1500(1496 + 4), 1504(failed,
> >>>>>>> 1500 is the PMTU).
> >>>>>>
> >>>>>> What does failed mean? Does it mean that you have sent MAX_PROBES (=3?) probe packets and waited for each plpmtud_probe_interval ms without receiving a response?
> >>>>> yes
> >>>>>
> >>>>>>
> >>>>>> If so, it might make sense to continue with smaller candidates earlier. For example, after one unanswered probe packet.
> >>>>> Sounds like a good way to go, and it would save 2 intervals to get the
> >>>>> optimal value in the normal case.
> >>>>> But if the failure is false (like the link is unstable), it may also
> >>>>> take some time to catch up to the bigger candidate.
> >>>>
> >>>> Right, it's a trade-off. Which is better depends on the probability of losing a probe packet for a reason other than its size.
> >>>>
> >>>> I chose to do something like this, when searching for a PMTU of 1472:
> >>>>
> >>>> 1400 ack
> >>>> 1432 ack
> >>>> 1464 timeout (false negative)
> >>>> 1436 ack
> >>>> 1440 ack
> >>>> 1444 ack
> >>>> 1448 ack
> >>>> 1452 ack
> >>>> 1456 ack
> >>>> 1460 ack
> >>>> 1464 ack
> >>>> 1496 timeout
> >>>> 1468 ack
> >>>> 1472 ack
> >>>> 1476 timeout
> >>>> 1476 timeout
> >>>> 1476 timeout
> >>>> done with PMTU=1472
> >>> Looks good to me. :-)
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Sorry, "sysctl -w net.sctp.plpmtud_probe_interval=1" won't work.
> >>>>>>> As plpmtud_probe_interval is the probe interval TIME for the timer.
> >>>>>>> Apart from 0, the minimal value is 5000ms.
> >>>>>>>
> >>>>>>> So it should be:
> >>>>>>>
> >>>>>>> plpmtud_probe_interval - INTEGER
> >>>>>>>     The time interval (in milliseconds) for sending PLPMTUD probe chunks.
> >>>>>>>     These chunks are sent at the specified interval with a variable size
> >>>>>>>     to probe the mtu of a given path between 2 endpoints. PLPMTUD will
> >>>>>>>     be disabled when 0 is set.
> >>>>>>>
> >>>>>>>     Default: 0
> >>>>>>
> >>>>>> What do you mean with probe chunks? You are sending probe *packets* containing a HEARTBEAT and a PAD chunk, right?
> >>>>> yes.
> >>>>>
> >>>>>>
> >>>>>> RFC8899 contains:
> >>>>>> The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet.
> >>>>>>
> >>>>>> So, how about plpmtud_probe_max_ack_time?
> >>>>> "plpmtud_probe_interval" I got the name from tcp's sysctl plpmtud in
> >>>>> linux. I was hoping to keep this consistent in sysctl and sockopt
> >>>>> between Linux and BSD.  Note this parameter is also the interval to
> >>>>> send a probe for the current pmtu in Search Complete status.
> >>>>
> >>>> Do you send probe packets in Search Complete to confirm the current PMTU estimation?
> >>>>
> >>>> RFC8899 suggests to do this only for non-reliable PLs. For a reliable PL like SCTP, it suggests to use the loss of (data) packets as indication instead.
> >>> Can you point out the place in RFC8899 saying so?
> >>>
> >>> What I saw is:
> >>>
> >>>  Search Complete:  The Search Complete Phase is entered when the
> >>>     PLPMTU is supported across the network path.  A PL can use a
> >>>     CONFIRMATION_TIMER to periodically repeat a probe packet for the
> >>>     current PLPMTU size.  If the sender is unable to confirm
> >>>     reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL
> >>>     signals a lack of reachability, a black hole has been detected and
> >>>     DPLPMTUD enters the Base Phase.
> >>>
> >>> it doesn't matter if it's a reliable or non-reliable PL, no?
> >>
> >> The description of the phases gives a high-level overview of the mechanism. The state diagram is more detailed. There you find this sentence: "When used with an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to generate PLPMTU probes in this state". However, it refers only to probes for confirmation of the current PMTU estimation. SCTP should send probe packets to probe for a larger PMTU in Search Complete.
> > If so, how to make sure the current pmtu is working during the Search Complete?
> > Where did you get "it suggests to use the loss of (data) packets as
> > indication instead"?
>
> Sorry, RFC8899 only suggests to not send probe packets to confirm the current PMTU estimation in Search Complete (when used within an acknowledged PL, like SCTP).
>
> Since I don't see another way to detect a decreased PMTU, I interpreted it as a suggestion to use packet loss for the detection.
This makes sense, I've posted a patchset to improve it.

Thanks.

>
> Timo
>
> >
> > Thanks.
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Also, I think more parameters would be helpful. For example,
> >>>>>>
> >>>>>> plpmtud_enable - boolean to control whether to use PLPMTUD (it is more explicit than plpmtud_probe_interval=0 or plpmtud_probe_max_ack_time=0)
> >>>>>> plpmtud_max_probes - controls the number of probe packets sent for one candidate.
> >>>>>> plpmtud_raise_time - time to wait before probing for a larger PMTU in search complete (0 to disable it).
> >>>>>> plpmtud_use_ptb - boolean to control whether to process an ICMP PTB.
> >>>>> With these, the control will be more detailed for sure.
> >>>>> But I didn't want to introduce too many parameters for this feature,
> >>>>> as you know, these parameters could also be per socket/assoc/transport,
> >>>>> and doing set/get with sockopt.
> >>>>>
> >>>>> instead, we keep most fixed:
> >>>>>
> >>>>> plpmtud_use_ptb = 1
> >>>>> plpmtud_raise_time = 30 * plpmtud_probe_max_ack_time(plpmtud_probe_interval)
> >>>>> plpmtud_max_probes = 3
> >>>>> plpmtud_enable = !! plpmtud_probe_interval
> >>>>>
> >>>>> Only one variable:
> >>>>> plpmtud_probe_interval >= 5000ms
> >>>>
> >>>> OK
> >>>>
> >>>>>
> >>>>> So I think this is up to the implementation, if you want more things
> >>>>> to tune, you can go ahead with all these parameters exposed to users.
> >>>>
> >>>> Agree. It is probably a good idea not to add too many parameters.
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Timo
> >>>>>>
> >>>>>>>
> >>>>>>> Thanks.
> >>>>>>>>>    be disabled when 0 is set.
> >>>>>>>>>
> >>>>>>>>>    Default: 0
> >>>>>>>>>
> >>>>>>>>> 2. a socket option that can be used per socket, assoc or transport
> >>>>>>>>>
> >>>>>>>>> /* PLPMTUD Probe Interval socket option */
> >>>>>>>>> struct sctp_probeinterval {
> >>>>>>>>>    sctp_assoc_t spi_assoc_id;
> >>>>>>>>>    struct sockaddr_storage spi_address;
> >>>>>>>>>    __u32 spi_interval;
> >>>>>>>>> };
> >>>>>>>>>
> >>>>>>>>> #define SCTP_PLPMTUD_PROBE_INTERVAL    133
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> The value above will enable/disable the PLPMTUD probe by setting up the probe
> >>>>>>>>> interval for the timer. When it's 0, the timer will also stop and
> >>>>>>>>> PLPMTUD is disabled.
> >>>>>>>>> This way, we don't need to introduce more options.
> >>>>>>>> OK.
> >>>>>>>>>
> >>>>>>>>> We're expecting to keep consistent with BSD on this, pls check and
> >>>>>>>>> share your thoughts.
> >>>>>>>> Looks good to me.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>> Michael
> >>>>>>>>>
> >>>>>>>>> Thanks.
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>
>
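
To summarize the two small rules quoted above in code form: the fixed parameters derived from the single plpmtud_probe_interval tunable, and the RFC8899 condition PLPMTU < PL_PTB_SIZE < PROBED_SIZE for using a reported PTB size as the next search point. The following is a minimal sketch under assumptions, not the kernel code. The struct plpmtud_params, both function names and the -1 return value for a rejected interval are hypothetical; the constants (minimum 5000 ms, factor 30, 3 probes) are the ones mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PLPMTUD_MIN_INTERVAL_MS 5000u  /* smallest non-zero interval */
#define PLPMTUD_RAISE_FACTOR    30u    /* raise_time = 30 * probe interval */
#define PLPMTUD_MAX_PROBES      3u     /* probes sent per candidate */

/* Hypothetical container for the values the thread keeps fixed. */
struct plpmtud_params {
    bool     enable;            /* !!plpmtud_probe_interval */
    bool     use_ptb;           /* always process ICMP PTB */
    uint32_t max_probes;
    uint32_t probe_interval_ms;
    uint64_t raise_time_ms;     /* 64-bit so the multiplication cannot overflow */
};

/*
 * Derive all parameters from the one tunable.
 * Returns 0 on success, -1 if the interval is non-zero but below the minimum.
 */
int plpmtud_params_from_interval(uint32_t interval_ms,
                                 struct plpmtud_params *out)
{
    assert(out != NULL);

    if (interval_ms != 0 && interval_ms < PLPMTUD_MIN_INTERVAL_MS)
        return -1;

    out->enable = interval_ms != 0;
    out->use_ptb = true;
    out->max_probes = PLPMTUD_MAX_PROBES;
    out->probe_interval_ms = interval_ms;
    out->raise_time_ms = (uint64_t)interval_ms * PLPMTUD_RAISE_FACTOR;
    return 0;
}

/* RFC8899 case PLPMTU < PL_PTB_SIZE < PROBED_SIZE: PTB size is a usable
 * next search point only when it lies strictly between the two. */
bool ptb_size_usable(uint32_t plpmtu, uint32_t ptb_size, uint32_t probed_size)
{
    return plpmtu < ptb_size && ptb_size < probed_size;
}
```

With this, an interval of 1 is rejected, 0 disables PLPMTUD, and 5000 gives a raise time of 150000 ms. Whether the probe for a usable PTB size is sent immediately or on the next timer expiry is the open question discussed above and is deliberately left out of the sketch.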


end of thread, other threads:[~2021-07-19 17:09 UTC | newest]

Thread overview: 12+ messages
2021-06-11 20:20 The new sysctl and socket option added for PLPMTUD (RFC8899) Xin Long
2021-06-11 20:42 ` tuexen
2021-06-12 17:32   ` Xin Long
2021-06-12 21:28     ` tuexen
     [not found]     ` <FEF068AA-C660-4A25-ABFE-D559B1136B58@fh-muenster.de>
2021-07-06  9:12       ` Timo Völker
2021-07-06 16:01         ` Xin Long
2021-07-07 12:36           ` Timo Völker
2021-07-07 16:30             ` Xin Long
2021-07-08 14:18               ` Timo Völker
2021-07-08 15:54                 ` Xin Long
2021-07-12  8:09                   ` Timo Völker
2021-07-19 16:55                     ` Xin Long
