netdev.vger.kernel.org archive mirror
* Re: [-next] Regression: ssh log in slowdown
       [not found] <CAMuHMdU6rrpx6xFw71HKexDJUZtfFtL+um6dxZ=EaycgVO312A@mail.gmail.com>
@ 2014-06-12 11:28 ` Geert Uytterhoeven
  0 siblings, 0 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-12 11:28 UTC (permalink / raw)
  To: Tom Herbert, David S. Miller; +Cc: Linux-Next, Linux-sh list, netdev

cc netdev@vger.kernel.org

On Thu, Jun 12, 2014 at 1:26 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> When logging in remotely using ssh, there's now a 10 second delay,
> cfr. the "ssh -v" output below:
>
> debug1: ssh_ecdsa_verify: signature correct
> debug1: SSH2_MSG_NEWKEYS sent
> debug1: expecting SSH2_MSG_NEWKEYS
> debug1: SSH2_MSG_NEWKEYS received
> debug1: Roaming not allowed by server
> debug1: SSH2_MSG_SERVICE_REQUEST sent
> debug1: SSH2_MSG_SERVICE_ACCEPT received
>
> ----- 10 s delay ---
>
> debug1: Authentications that can continue: publickey,password
> debug1: Next authentication method: publickey
>
> Wireshark doesn't show anything suspicious, just that there's a 10 s gap
> before I receive an "encrypted response packet" on the client.
>
> Hardware is r8a7791/koelsch, using sh_eth.
>
> I bisected this to
> commit 7e3cead5172927732f51fde77fef6f521e22f209
> Author: Tom Herbert <therbert@google.com>
> Date:   Tue Jun 10 18:54:19 2014 -0700
>
>     net: Save software checksum complete
>
>     In skb_checksum complete, if we need to compute the checksum for the
>     packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
>     Subsequent checksum verification can use this.
>
>     Also, added csum_complete_sw flag to distinguish between software and
>     hardware generated checksum complete, we should always be able to trust
>     the software computation.
>
>     Signed-off-by: Tom Herbert <therbert@google.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> Reverting this commit fixes the issue.
>
> Anyone with a clue?
>
> Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [-next] Regression: ssh log in slowdown
  2014-06-13 15:37 Tom Herbert
@ 2014-06-15  8:42 ` Geert Uytterhoeven
  0 siblings, 0 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-15  8:42 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David S. Miller, Linux-Next, Linux-sh list, linux-kernel, netdev

Hi Tom,

On Fri, Jun 13, 2014 at 5:37 PM, Tom Herbert <therbert@google.com> wrote:
>> Thanks, I applied the series "[PATCH 0/4] Checksum fixes", and the fix
>> above, but it doesn't help.
>>
>> Note that I'm also using NFS root, which doesn't seem to be affected.
>> I can happily run "ls -R /" on the serial console during the 10 s delay
>> in ssh.
>
> Can you try one more patch below with the series applied? Also, can you

This patch fixes the issue. Thanks!

> look at 'netstat -s' to see if UDP checksum errors are being reported.

Indeed, errors are reported under Udp/InCsumErrors.
With this new patch, no errors are reported.
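For anyone wanting to watch this counter without netstat: on Linux, "netstat -s" takes its Udp statistics from /proc/net/snmp, where each protocol has one line of field names followed by one line of values. The following is only a minimal user-space sketch of reading one field from text in that layout. snmp_counter() is a hypothetical helper name, not an existing API, and the caller is assumed to have already read the file into a string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one counter in /proc/net/snmp-style text, where each protocol has
 * a line of field names followed by a line of values, both starting with
 * "<proto>:".  Returns 0 and stores the value in *out, or -1 if not found.
 * snmp_counter() is a hypothetical helper, not an existing API.
 */
static int snmp_counter(const char *text, const char *proto,
			const char *field, long *out)
{
	size_t plen, flen;
	const char *names, *values;

	assert(text && proto && field && out);
	plen = strlen(proto);
	flen = strlen(field);

	/* Find a line that starts with "<proto>:". */
	names = text;
	while (names &&
	       !(strncmp(names, proto, plen) == 0 && names[plen] == ':')) {
		names = strchr(names, '\n');
		if (names)
			names++;
	}
	if (!names)
		return -1;

	/* The values line must follow directly and carry the same prefix. */
	values = strchr(names, '\n');
	if (!values)
		return -1;
	values++;
	if (strncmp(values, proto, plen) != 0 || values[plen] != ':')
		return -1;
	names += plen + 1;
	values += plen + 1;

	/* Walk both lines in step, one space-separated token at a time. */
	for (;;) {
		size_t nlen;
		char *end;
		long val;

		names += strspn(names, " ");
		values += strspn(values, " ");
		nlen = strcspn(names, " \n");
		if (nlen == 0 || *values == '\n' || *values == '\0')
			return -1;	/* ran out of names or values */
		val = strtol(values, &end, 10);
		if (end == values)
			return -1;	/* not a number */
		if (nlen == flen && strncmp(names, field, nlen) == 0) {
			*out = val;
			return 0;
		}
		names += nlen;
		values = end;
	}
}
```

Calling snmp_counter(text, "Udp", "InCsumErrors", &value) before and after an ssh login attempt would show whether the counter moved. Matching on "Udp:" including the colon keeps the lookup from picking up the "UdpLite:" lines.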

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* [-next] Regression: ssh log in slowdown
@ 2014-06-13 15:37 Tom Herbert
  2014-06-15  8:42 ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Herbert @ 2014-06-13 15:37 UTC (permalink / raw)
  To: geert; +Cc: davem, linux-next, linux-sh, linux-kernel, netdev

> Thanks, I applied the series "[PATCH 0/4] Checksum fixes", and the fix
> above, but it doesn't help.
>
> Note that I'm also using NFS root, which doesn't seem to be affected.
> I can happily run "ls -R /" on the serial console during the 10 s delay
> in ssh.
>
Geert,

Thanks for your patience!

Can you try one more patch below with the series applied? Also, can you 
look at 'netstat -s' to see if UDP checksum errors are being reported.

Thanks,
Tom 

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fdb510c..4b722bc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2820,8 +2820,8 @@ static inline __sum16 __skb_checksum_validate_complete(struct sk_buff *skb,
 	if (complete || skb->len <= CHECKSUM_BREAK) {
 		__sum16 csum;
 
+		/* skb->csum valid set in __skb_checksum_complete */
 		csum = __skb_checksum_complete(skb);
-		skb->csum_valid = !csum;
 		return csum;
 	}
 
diff --git a/net/core/datagram.c b/net/core/datagram.c
index cf6cc4e..488dd1a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -744,6 +744,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
 		    !skb->csum_complete_sw)
 			netdev_rx_csum_fault(skb->dev);
 	}
+	skb->csum_valid = !sum;
 	return sum;
 }
 EXPORT_SYMBOL(__skb_checksum_complete_head);
@@ -767,6 +768,7 @@ __sum16 __skb_checksum_complete(struct sk_buff *skb)
 	skb->csum = csum;
 	skb->ip_summed = CHECKSUM_COMPLETE;
 	skb->csum_complete_sw = 1;
+	skb->csum_valid = !sum;
 
 	return sum;
 }
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bf92824..9cd5344 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -689,6 +689,9 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->ooo_okay		= old->ooo_okay;
 	new->no_fcs		= old->no_fcs;
 	new->encapsulation	= old->encapsulation;
+	new->encap_hdr_csum	= old->encap_hdr_csum;
+	new->csum_valid		= old->csum_valid;
+	new->csum_complete_sw	= old->csum_complete_sw;
 #ifdef CONFIG_XFRM
 	new->sp			= secpath_get(old->sp);
 #endif
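
As background for the "csum_valid = !sum" lines in the patch, here is a rough user-space sketch of the idea, not the kernel code. It assumes a plain RFC 1071 style 16-bit one's-complement sum over a buffer that already contains its checksum field, and ignores the pseudo-header and skb handling entirely. struct rx_state, csum_fold_sum() and rx_checksum_complete() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one sk_buff bit discussed in the patch. */
struct rx_state {
	unsigned int csum_valid:1;
};

/* Sum 16-bit big-endian words and fold the carries back in (RFC 1071). */
static uint16_t csum_fold_sum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(data || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad odd byte with 0 */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Verify a buffer that includes its stored checksum.  The complemented sum
 * is 0 for an intact buffer, so "valid" is simply !sum, and it is recorded
 * at the point where the sum is computed.
 */
static uint16_t rx_checksum_complete(struct rx_state *st,
				     const uint8_t *pkt, size_t len)
{
	uint16_t sum = (uint16_t)~csum_fold_sum(pkt, len);

	st->csum_valid = !sum;
	return sum;
}
```

Recording the flag inside the function that computes the sum means every caller sees the same verdict, which appears to be the point of moving the assignment out of __skb_checksum_validate_complete().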


* RE: [-next] Regression: ssh log in slowdown
  2014-06-13  9:54     ` Geert Uytterhoeven
@ 2014-06-13 10:05       ` David Laight
  0 siblings, 0 replies; 8+ messages in thread
From: David Laight @ 2014-06-13 10:05 UTC (permalink / raw)
  To: 'Geert Uytterhoeven'
  Cc: Tom Herbert, David S. Miller, Linux-Next, Linux-sh list,
	linux-kernel, netdev

From: Geert Uytterhoeven
> Hi David,
> 
> On Fri, Jun 13, 2014 at 10:49 AM, David Laight <David.Laight@aculab.com> wrote:
> > From: Of Geert Uytterhoeven
> > ...
> >> Note that I'm also using NFS root, which doesn't seem to be affected.
> >> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.
> >
> > Are you sure that the delay during ssh login isn't just
> > a reverse DNS timeout?
> 
> Indeed, the ssh server sends a reverse DNS request twice, with 5s in between:
... 
> Interestingly, I don't see the forward DNS request after that, which
> does happen in the good case.

The forwards request is probably just copying some strange code from 'rshd'
that tried to verify the reverse lookup by doing a forwards lookup on the
result.
That in itself used to cause us grief.
The RDNS would (correctly) generate host.bar.baz.co.uk; since the 'domain'
in /etc/resolv.conf was bar.baz.co.uk, the forwards lookup first tried
host.bar.baz.co.uk.bar.baz.co.uk, then host.bar.baz.co.uk.baz.co.uk,
one of which always timed out :-(
(When the 'search' command was added we could avoid the request that
timed out.)

	David



* Re: [-next] Regression: ssh log in slowdown
  2014-06-13  8:49   ` David Laight
@ 2014-06-13  9:54     ` Geert Uytterhoeven
  2014-06-13 10:05       ` David Laight
  0 siblings, 1 reply; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-13  9:54 UTC (permalink / raw)
  To: David Laight
  Cc: Tom Herbert, David S. Miller, Linux-Next, Linux-sh list,
	linux-kernel, netdev

Hi David,

On Fri, Jun 13, 2014 at 10:49 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Of Geert Uytterhoeven
> ...
>> Note that I'm also using NFS root, which doesn't seem to be affected.
>> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.
>
> Are you sure that the delay during ssh login isn't just
> a reverse DNS timeout?

Indeed, the ssh server sends a reverse DNS request twice, with 5s in between:

01:01:16.334257 2e:09:0a:00:6d:85 > 00:0c:42:12:75:1d, ethertype IPv4
(0x0800), length 86: <ssh-server-ip>.55318 > <dns-server-ip>.53: 2907+
PTR? <ssh-client-rev-ip>.in-addr.arpa. (44)
01:01:16.337282 00:0c:42:12:75:1d > 2e:09:0a:00:6d:85, ethertype IPv4
(0x0800), length 114: <dns-server-ip>.53 > <ssh-server-ip>.55318:
2907* 1/0/0 PTR <ssh-client-name>. (72)
01:01:16.356440 2e:09:0a:00:6d:85 > 74:d0:2b:c8:05:49, ethertype IPv4
(0x0800), length 66: <ssh-server-ip>.22 > <ssh-client-ip>.53536: Flags
[.], ack 2194, win 272, options [nop,nop,TS val 4294938665 ecr
480138541], length 0

01:01:21.342072 2e:09:0a:00:6d:85 > 00:0c:42:12:75:1d, ethertype IPv4
(0x0800), length 86: <ssh-server-ip>.55318 > <dns-server-ip>.53: 2907+
PTR? <ssh-client-rev-ip>.in-addr.arpa. (44)
01:01:21.344844 00:0c:42:12:75:1d > 2e:09:0a:00:6d:85, ethertype IPv4
(0x0800), length 114: <dns-server-ip>.53 > <ssh-server-ip>.55318:
2907* 1/0/0 PTR <ssh-client-name>. (72)

01:01:26.353667 2e:09:0a:00:6d:85 > 74:d0:2b:c8:05:49, ethertype IPv4
(0x0800), length 118: <ssh-server-ip>.22 > <ssh-client-ip>.53536:
Flags [P.], seq 1999:2051, ack 2194, win 272, options [nop,nop,TS val
4294939944 ecr 480138541], length 52

Interestingly, I don't see the forward DNS request after that, which
does happen in the good case.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* RE: [-next] Regression: ssh log in slowdown
  2014-06-13  8:43 ` Geert Uytterhoeven
@ 2014-06-13  8:49   ` David Laight
  2014-06-13  9:54     ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: David Laight @ 2014-06-13  8:49 UTC (permalink / raw)
  To: 'Geert Uytterhoeven', Tom Herbert
  Cc: David S. Miller, Linux-Next, Linux-sh list, linux-kernel, netdev

From: Of Geert Uytterhoeven
...
> Note that I'm also using NFS root, which doesn't seem to be affected.
> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.

Are you sure that the delay during ssh login isn't just
a reverse DNS timeout?

	David



* Re: [-next] Regression: ssh log in slowdown
  2014-06-13  8:21 Tom Herbert
@ 2014-06-13  8:43 ` Geert Uytterhoeven
  2014-06-13  8:49   ` David Laight
  0 siblings, 1 reply; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-13  8:43 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David S. Miller, Linux-Next, Linux-sh list, linux-kernel, netdev

Hi Tom,

On Fri, Jun 13, 2014 at 10:21 AM, Tom Herbert <therbert@google.com> wrote:
>> I assume this is the series "[PATCH 0/4] Checksum fixes"
>> (marc.info/?l=linux-netdev&m=140261417832399&w=2)?
>>
> Yes.
>
>> As I'm not subscribed to netdev, I cannot reply to that thread.
>>
>> "[PATCH 1/4] net: Fix save software checksum complete" fixes the issue
>> for me.
>> However, "[PATCH 2/4] udp: call __skb_checksum_complete when doing full
>> checksum" reintroduces the exact same bad behavior :-(
>>
> This implies the problem is happening in UDP path. AFAICT skb->csum is
> correct, and I don't seem to be able to reproduce the issue on my setup.
> It is possible that an skb copy is happening in which case we don't copy
> csum_valid so that checksum_unnecessary would fail in this case.
>
> Can you try with the patch below. Thanks!
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index bf92824..9cd5344 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -689,6 +689,9 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
>         new->ooo_okay           = old->ooo_okay;
>         new->no_fcs             = old->no_fcs;
>         new->encapsulation      = old->encapsulation;
> +       new->encap_hdr_csum     = old->encap_hdr_csum;
> +       new->csum_valid         = old->csum_valid;
> +       new->csum_complete_sw   = old->csum_complete_sw;
>  #ifdef CONFIG_XFRM
>         new->sp                 = secpath_get(old->sp);
>  #endif

Thanks, I applied the series "[PATCH 0/4] Checksum fixes", and the fix
above, but it doesn't help.

Note that I'm also using NFS root, which doesn't seem to be affected.
I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* [-next] Regression: ssh log in slowdown
@ 2014-06-13  8:21 Tom Herbert
  2014-06-13  8:43 ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Herbert @ 2014-06-13  8:21 UTC (permalink / raw)
  To: geert; +Cc: davem, linux-next, linux-sh, linux-kernel, netdev

>
> I assume this is the series "[PATCH 0/4] Checksum fixes"
> (marc.info/?l=linux-netdev&m=140261417832399&w=2)?
>
Yes.

> As I'm not subscribed to netdev, I cannot reply to that thread.
>
> "[PATCH 1/4] net: Fix save software checksum complete" fixes the issue
> for me.
> However, "[PATCH 2/4] udp: call __skb_checksum_complete when doing full
> checksum" reintroduces the exact same bad behavior :-(
>
This implies the problem is happening in UDP path. AFAICT skb->csum is 
correct, and I don't seem to be able to reproduce the issue on my setup. 
It is possible that an skb copy is happening in which case we don't copy
csum_valid so that checksum_unnecessary would fail in this case.

Can you try with the patch below. Thanks!

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bf92824..9cd5344 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -689,6 +689,9 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->ooo_okay		= old->ooo_okay;
 	new->no_fcs		= old->no_fcs;
 	new->encapsulation	= old->encapsulation;
+	new->encap_hdr_csum	= old->encap_hdr_csum;
+	new->csum_valid		= old->csum_valid;
+	new->csum_complete_sw	= old->csum_complete_sw;
 #ifdef CONFIG_XFRM
 	new->sp			= secpath_get(old->sp);
 #endif
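
To illustrate the hypothesis behind this patch (a field-by-field header copy that was never taught about newly added flags), here is a small user-space sketch. It is not kernel code: struct pkt_hdr, copy_hdr_old(), copy_hdr_fixed() and needs_recheck() are hypothetical names, and the destination is assumed to start out zeroed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the sk_buff flag bits named in the patch. */
struct pkt_hdr {
	unsigned int encapsulation:1;
	unsigned int csum_valid:1;
	unsigned int csum_complete_sw:1;
};

/* Copy written before the new flags existed: they silently stay 0. */
static void copy_hdr_old(struct pkt_hdr *dst, const struct pkt_hdr *src)
{
	assert(dst && src);
	dst->encapsulation = src->encapsulation;
}

/* Copy that also carries the checksum state across. */
static void copy_hdr_fixed(struct pkt_hdr *dst, const struct pkt_hdr *src)
{
	assert(dst && src);
	dst->encapsulation = src->encapsulation;
	dst->csum_valid = src->csum_valid;
	dst->csum_complete_sw = src->csum_complete_sw;
}

/* A later consumer can skip verification only if the flag survived. */
static bool needs_recheck(const struct pkt_hdr *h)
{
	assert(h);
	return !h->csum_valid;
}
```

With copy_hdr_old() a packet that was already verified looks unverified after the copy, while copy_hdr_fixed() preserves the verdict. Whether such a copy actually happens on the failing path is exactly what the patch is meant to test.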

