All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-13 13:53 Michael Johnson
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Johnson @ 2022-05-13 13:53 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 3183 bytes --]

Hi Denis,

I've been running this patch for the day and I don't think it's
actually fixed the retry. I have set a low lease time and then block
the DHCP server for the ACK and I can see that the retry still doesn't
happen.

https://pastebin.com/Gy3Gcs4W

Regards,
Michael

On Wed, 11 May 2022 at 18:29, Denis Kenzior <denkenz(a)gmail.com> wrote:
>
> start_t is used to try and calculate the retransmission timeout value
> when the client enters RENEWING or REBINDING state.  This works fine on
> the first renewal since the client start timestamp and the lease bound
> timestamp are very close.  Also, the RENEW request is sent immediately
> whenever the T1 timer expires and most of the time it succeeds.
>
> However, if this isn't a first renewal attempt and the RENEW request
> sent when the T1 timer expires is not successful, then the renewal
> timeout value could become too large.  Fix that by using the lease
> bound_time for the retransmission timer calculation (as intended) intead
> of the client start time (start_t).
> ---
>  ell/dhcp.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/ell/dhcp.c b/ell/dhcp.c
> index 2d049005a135..01900f3b1d38 100644
> --- a/ell/dhcp.c
> +++ b/ell/dhcp.c
> @@ -520,6 +520,7 @@ static void dhcp_client_timeout_resend(struct l_timeout *timeout,
>                                                                 void *user_data)
>  {
>         struct l_dhcp_client *client = user_data;
> +       struct l_dhcp_lease *lease = client->lease;
>         unsigned int next_timeout = 0;
>         int r;
>
> @@ -555,12 +556,12 @@ static void dhcp_client_timeout_resend(struct l_timeout *timeout,
>
>         switch (client->state) {
>         case DHCP_STATE_RENEWING:
> -               next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
> -                                                       client->lease->t2);
> +               next_timeout = dhcp_rebind_renew_retry_time(lease->bound_time,
> +                                                               lease->t2);
>                 break;
>         case DHCP_STATE_REBINDING:
> -               next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
> -                                               client->lease->lifetime);
> +               next_timeout = dhcp_rebind_renew_retry_time(lease->bound_time,
> +                                                       lease->lifetime);
>                 break;
>         case DHCP_STATE_REQUESTING:
>         case DHCP_STATE_SELECTING:
> @@ -642,7 +643,7 @@ static void dhcp_client_t1_expired(struct l_timeout *timeout, void *user_data)
>         l_timeout_set_callback(client->timeout_lease, dhcp_client_t2_expired,
>                                 client, NULL);
>
> -       next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
> +       next_timeout = dhcp_rebind_renew_retry_time(client->lease->bound_time,
>                                                         client->lease->t2);
>         client->timeout_resend =
>                 l_timeout_create_ms(dhcp_fuzz_secs(next_timeout),
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-18 16:21 Denis Kenzior
  0 siblings, 0 replies; 8+ messages in thread
From: Denis Kenzior @ 2022-05-18 16:21 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 843 bytes --]

On 5/11/22 12:28, Denis Kenzior wrote:
> start_t is used to try and calculate the retransmission timeout value
> when the client enters RENEWING or REBINDING state.  This works fine on
> the first renewal since the client start timestamp and the lease bound
> timestamp are very close.  Also, the RENEW request is sent immediately
> whenever the T1 timer expires and most of the time it succeeds.
> 
> However, if this isn't a first renewal attempt and the RENEW request
> sent when the T1 timer expires is not successful, then the renewal
> timeout value could become too large.  Fix that by using the lease
> bound_time for the retransmission timer calculation (as intended) intead
> of the client start time (start_t).
> ---
>   ell/dhcp.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
> 

Applied.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-13 17:10 Michael Johnson
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Johnson @ 2022-05-13 17:10 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 1254 bytes --]

Great, thank you!

I'll run this patch on master over the weekend and see if I can get
more data for the other issue.

Regards,
Michael

On Fri, 13 May 2022 at 17:23, Denis Kenzior <denkenz(a)gmail.com> wrote:
>
> Hi Michael,
>
> On 5/13/22 10:40, Michael Johnson wrote:
> > Sorry, I put the log for the t2 retry. Putting it in the right place
> > gives us our problem though
> >
> > ```
> > May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4] dhcp_client_t1_expired:630
> > May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
> > dhcp_client_t1_expired:632 Entering state: DHCP_STATE_RENEWING
> > May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4] dhcp_client_send_request:397
> > May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
> > dhcp_client_t1_expired:642 next_timeout t2: 225
> > May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
> > dhcp_client_t1_expired:651 next_timeout t1 retry: 896458022
>
> Ugh, looks like commit:
> c78ad1bb6d7e ("dhcp: Set lease expiry based on frame reception times")
>
> broke this.  It is mixing up SO_TIMESTAMP which is from CLOCK_REALTIME and our
> bound_time which is supposed to be in CLOCK_BOOTTIME.
>
> No wonder the timeout way too long :)
>
> Let me send another fixup patch.
>
> Regards,
> -Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-13 16:23 Denis Kenzior
  0 siblings, 0 replies; 8+ messages in thread
From: Denis Kenzior @ 2022-05-13 16:23 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 984 bytes --]

Hi Michael,

On 5/13/22 10:40, Michael Johnson wrote:
> Sorry, I put the log for the t2 retry. Putting it in the right place
> gives us our problem though
> 
> ```
> May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4] dhcp_client_t1_expired:630
> May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
> dhcp_client_t1_expired:632 Entering state: DHCP_STATE_RENEWING
> May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4] dhcp_client_send_request:397
> May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
> dhcp_client_t1_expired:642 next_timeout t2: 225
> May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
> dhcp_client_t1_expired:651 next_timeout t1 retry: 896458022

Ugh, looks like commit:
c78ad1bb6d7e ("dhcp: Set lease expiry based on frame reception times")

broke this.  It is mixing up SO_TIMESTAMP which is from CLOCK_REALTIME and our 
bound_time which is supposed to be in CLOCK_BOOTTIME.

No wonder the timeout way too long :)

Let me send another fixup patch.

Regards,
-Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-13 15:40 Michael Johnson
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Johnson @ 2022-05-13 15:40 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 3107 bytes --]

Sorry, I put the log for the t2 retry. Putting it in the right place
gives us our problem though

```
May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4] dhcp_client_t1_expired:630
May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
dhcp_client_t1_expired:632 Entering state: DHCP_STATE_RENEWING
May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4] dhcp_client_send_request:397
May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
dhcp_client_t1_expired:642 next_timeout t2: 225
May 13 15:38:34 r2-19006 iwd[39610]: [DHCPv4]
dhcp_client_t1_expired:651 next_timeout t1 retry: 896458022
```

Regards,
Michael

On Fri, 13 May 2022 at 16:31, Michael Johnson <mjohnson459(a)gmail.com> wrote:
>
> Hi Denis,
>
> I added those log lines and got this.
>
> ```
> May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4]
> dhcp_client_rx_message:949 T1 expiring in 299004 ms
> May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4] l_acd_start:455 Skipping
> probes and sending announcements
> May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4]
> announce_wait_timeout:166 No conflicts found for 10.101.8.24,
> announcing address
> May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4] acd_send_packet:146
> sending packet with target IP 10.101.8.24
> May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_t1_expired:630
> May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4]
> dhcp_client_t1_expired:632 Entering state: DHCP_STATE_RENEWING
> May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_send_request:397
> May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4]
> dhcp_client_t1_expired:643 next_timeout: 225
> May 13 15:28:58 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_t2_expired:609
> May 13 15:28:58 r2-19006 iwd[28716]: [DHCPv4]
> dhcp_client_t2_expired:616 Entering state: DHCP_STATE_REBINDING
> May 13 15:30:12 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_lease_expired:598
> ```
>
> Regards,
> Michael
>
> On Fri, 13 May 2022 at 15:30, Denis Kenzior <denkenz(a)gmail.com> wrote:
> >
> > Hi Michael,
> >
> > On 5/13/22 08:53, Michael Johnson wrote:
> > > Hi Denis,
> > >
> > > I've been running this patch for the day and I don't think it's
> > > actually fixed the retry. I have set a low lease time and then block
> > > the DHCP server for the ACK and I can see that the retry still doesn't
> > > happen.
> >
> > I suspect it does fix one bug, but not another...  I'm not near my test box
> > right now, so debugging this is a bit painful.
> >
> > Could you debug-print the 'next_timeout' we calculate in
> > dhcp_client_t1_expired() here:
> >
> >          next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
> >                                                          client->lease->t2);
> >
> > And also put in
> >          L_WARN_ON(!client->timeout_resend);
> >
> > after the timeout_resend timer is created?
> >
> >  From your 'other' bug report yesterday, we calculate the t1 timer correctly,
> > but the timer still doesn't fire.
> > It almost feels like the timers are just not being created.  Is there anything
> > suspicious in the kernel logs?
> >
> > Regards,
> > -Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-13 15:31 Michael Johnson
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Johnson @ 2022-05-13 15:31 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 2305 bytes --]

Hi Denis,

I added those log lines and got this.

```
May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4]
dhcp_client_rx_message:949 T1 expiring in 299004 ms
May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4] l_acd_start:455 Skipping
probes and sending announcements
May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4]
announce_wait_timeout:166 No conflicts found for 10.101.8.24,
announcing address
May 13 15:20:14 r2-19006 iwd[28716]: [DHCPv4] acd_send_packet:146
sending packet with target IP 10.101.8.24
May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_t1_expired:630
May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4]
dhcp_client_t1_expired:632 Entering state: DHCP_STATE_RENEWING
May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_send_request:397
May 13 15:25:13 r2-19006 iwd[28716]: [DHCPv4]
dhcp_client_t1_expired:643 next_timeout: 225
May 13 15:28:58 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_t2_expired:609
May 13 15:28:58 r2-19006 iwd[28716]: [DHCPv4]
dhcp_client_t2_expired:616 Entering state: DHCP_STATE_REBINDING
May 13 15:30:12 r2-19006 iwd[28716]: [DHCPv4] dhcp_client_lease_expired:598
```

Regards,
Michael

On Fri, 13 May 2022 at 15:30, Denis Kenzior <denkenz(a)gmail.com> wrote:
>
> Hi Michael,
>
> On 5/13/22 08:53, Michael Johnson wrote:
> > Hi Denis,
> >
> > I've been running this patch for the day and I don't think it's
> > actually fixed the retry. I have set a low lease time and then block
> > the DHCP server for the ACK and I can see that the retry still doesn't
> > happen.
>
> I suspect it does fix one bug, but not another...  I'm not near my test box
> right now, so debugging this is a bit painful.
>
> Could you debug-print the 'next_timeout' we calculate in
> dhcp_client_t1_expired() here:
>
>          next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
>                                                          client->lease->t2);
>
> And also put in
>          L_WARN_ON(!client->timeout_resend);
>
> after the timeout_resend timer is created?
>
>  From your 'other' bug report yesterday, we calculate the t1 timer correctly,
> but the timer still doesn't fire.
> It almost feels like the timers are just not being created.  Is there anything
> suspicious in the kernel logs?
>
> Regards,
> -Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-13 14:30 Denis Kenzior
  0 siblings, 0 replies; 8+ messages in thread
From: Denis Kenzior @ 2022-05-13 14:30 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 1029 bytes --]

Hi Michael,

On 5/13/22 08:53, Michael Johnson wrote:
> Hi Denis,
> 
> I've been running this patch for the day and I don't think it's
> actually fixed the retry. I have set a low lease time and then block
> the DHCP server for the ACK and I can see that the retry still doesn't
> happen.

I suspect it does fix one bug, but not another...  I'm not near my test box 
right now, so debugging this is a bit painful.

Could you debug-print the 'next_timeout' we calculate in 
dhcp_client_t1_expired() here:

         next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
                                                         client->lease->t2);

And also put in
         L_WARN_ON(!client->timeout_resend);

after the timeout_resend timer is created?

 From your 'other' bug report yesterday, we calculate the t1 timer correctly, 
but the timer still doesn't fire.
It almost feels like the timers are just not being created.  Is there anything 
suspicious in the kernel logs?

Regards,
-Denis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] dhcp: Use bound_time for retransmission timers
@ 2022-05-11 17:28 Denis Kenzior
  0 siblings, 0 replies; 8+ messages in thread
From: Denis Kenzior @ 2022-05-11 17:28 UTC (permalink / raw)
  To: ell

[-- Attachment #1: Type: text/plain, Size: 2228 bytes --]

start_t is used to try and calculate the retransmission timeout value
when the client enters RENEWING or REBINDING state.  This works fine on
the first renewal since the client start timestamp and the lease bound
timestamp are very close.  Also, the RENEW request is sent immediately
whenever the T1 timer expires and most of the time it succeeds.

However, if this isn't a first renewal attempt and the RENEW request
sent when the T1 timer expires is not successful, then the renewal
timeout value could become too large.  Fix that by using the lease
bound_time for the retransmission timer calculation (as intended) intead
of the client start time (start_t).
---
 ell/dhcp.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/ell/dhcp.c b/ell/dhcp.c
index 2d049005a135..01900f3b1d38 100644
--- a/ell/dhcp.c
+++ b/ell/dhcp.c
@@ -520,6 +520,7 @@ static void dhcp_client_timeout_resend(struct l_timeout *timeout,
 								void *user_data)
 {
 	struct l_dhcp_client *client = user_data;
+	struct l_dhcp_lease *lease = client->lease;
 	unsigned int next_timeout = 0;
 	int r;
 
@@ -555,12 +556,12 @@ static void dhcp_client_timeout_resend(struct l_timeout *timeout,
 
 	switch (client->state) {
 	case DHCP_STATE_RENEWING:
-		next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
-							client->lease->t2);
+		next_timeout = dhcp_rebind_renew_retry_time(lease->bound_time,
+								lease->t2);
 		break;
 	case DHCP_STATE_REBINDING:
-		next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
-						client->lease->lifetime);
+		next_timeout = dhcp_rebind_renew_retry_time(lease->bound_time,
+							lease->lifetime);
 		break;
 	case DHCP_STATE_REQUESTING:
 	case DHCP_STATE_SELECTING:
@@ -642,7 +643,7 @@ static void dhcp_client_t1_expired(struct l_timeout *timeout, void *user_data)
 	l_timeout_set_callback(client->timeout_lease, dhcp_client_t2_expired,
 				client, NULL);
 
-	next_timeout = dhcp_rebind_renew_retry_time(client->start_t,
+	next_timeout = dhcp_rebind_renew_retry_time(client->lease->bound_time,
 							client->lease->t2);
 	client->timeout_resend =
 		l_timeout_create_ms(dhcp_fuzz_secs(next_timeout),
-- 
2.32.0

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-05-18 16:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-13 13:53 [PATCH] dhcp: Use bound_time for retransmission timers Michael Johnson
  -- strict thread matches above, loose matches on Subject: below --
2022-05-18 16:21 Denis Kenzior
2022-05-13 17:10 Michael Johnson
2022-05-13 16:23 Denis Kenzior
2022-05-13 15:40 Michael Johnson
2022-05-13 15:31 Michael Johnson
2022-05-13 14:30 Denis Kenzior
2022-05-11 17:28 Denis Kenzior

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.