linux-kernel.vger.kernel.org archive mirror
* [-next] Regression: ssh log in slowdown
@ 2014-06-13 15:37 Tom Herbert
  2014-06-15  8:42 ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Herbert @ 2014-06-13 15:37 UTC (permalink / raw)
  To: geert; +Cc: davem, linux-next, linux-sh, linux-kernel, netdev

> Thanks, I applied the series "[PATCH 0/4] Checksum fixes", and the fix
> above, but it doesn't help.
>
> Note that I'm also using NFS root, which doesn't seem to be affected.
> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.
>
Geert,

Thanks for your patience!

Can you try one more patch, below, with the series applied? Also, can you
look at 'netstat -s' to see if UDP checksum errors are being reported?

Thanks,
Tom 

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fdb510c..4b722bc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2820,8 +2820,8 @@ static inline __sum16 __skb_checksum_validate_complete(struct sk_buff *skb,
 	if (complete || skb->len <= CHECKSUM_BREAK) {
 		__sum16 csum;
 
+		/* skb->csum valid set in __skb_checksum_complete */
 		csum = __skb_checksum_complete(skb);
-		skb->csum_valid = !csum;
 		return csum;
 	}
 
diff --git a/net/core/datagram.c b/net/core/datagram.c
index cf6cc4e..488dd1a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -744,6 +744,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
 		    !skb->csum_complete_sw)
 			netdev_rx_csum_fault(skb->dev);
 	}
+	skb->csum_valid = !sum;
 	return sum;
 }
 EXPORT_SYMBOL(__skb_checksum_complete_head);
@@ -767,6 +768,7 @@ __sum16 __skb_checksum_complete(struct sk_buff *skb)
 	skb->csum = csum;
 	skb->ip_summed = CHECKSUM_COMPLETE;
 	skb->csum_complete_sw = 1;
+	skb->csum_valid = !sum;
 
 	return sum;
 }
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bf92824..9cd5344 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -689,6 +689,9 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->ooo_okay		= old->ooo_okay;
 	new->no_fcs		= old->no_fcs;
 	new->encapsulation	= old->encapsulation;
+	new->encap_hdr_csum	= old->encap_hdr_csum;
+	new->csum_valid		= old->csum_valid;
+	new->csum_complete_sw	= old->csum_complete_sw;
 #ifdef CONFIG_XFRM
 	new->sp			= secpath_get(old->sp);
 #endif
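
For readers following the patch above: it sets csum_valid to !sum, i.e. a packet is considered valid when the ones'-complement checksum over the data, including the transmitted checksum field, comes out as zero. The block below is a minimal userspace sketch of that property, assuming big-endian 16-bit words as in RFC 1071. The toy_* names are hypothetical and this is not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator to 16 bits and return its ones' complement. */
static uint16_t toy_csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Ones'-complement checksum over len bytes, read as big-endian 16-bit
 * words; an odd trailing byte is padded with zero. The 32-bit
 * accumulator is only safe for buffers up to about 128 KiB.
 */
static uint16_t toy_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;
	return toy_csum_fold(sum);
}

/* Mirrors "csum_valid = !sum": valid when the full checksum is zero. */
static int toy_csum_valid(const uint8_t *data, size_t len)
{
	return toy_checksum(data, len) == 0;
}
```

To try it, compute toy_checksum() over a buffer whose checksum field is zero, store the result in that field, and toy_csum_valid() should then return 1 until any bit of the buffer is changed.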


* Re: [-next] Regression: ssh log in slowdown
  2014-06-13 15:37 [-next] Regression: ssh log in slowdown Tom Herbert
@ 2014-06-15  8:42 ` Geert Uytterhoeven
  0 siblings, 0 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-15  8:42 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David S. Miller, Linux-Next, Linux-sh list, linux-kernel, netdev

Hi Tom,

On Fri, Jun 13, 2014 at 5:37 PM, Tom Herbert <therbert@google.com> wrote:
>> Thanks, I applied the series "[PATCH 0/4] Checksum fixes", and the fix
>> above, but it doesn't help.
>>
>> Note that I'm also using NFS root, which doesn't seem to be affected.
>> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.
>
>
> Can you try one more patch below with the series applied? Also, can you

This patch fixes the issue. Thanks!

> look at 'netstat -s' to see if UDP checksum errors are being reported.

Indeed, errors are reported under Udp/InCsumErrors.
With this new patch, no errors are reported.
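
A small sketch of how such a counter could be read programmatically, assuming it is exposed in the header-line/value-line pair layout used by /proc/net/snmp. The helper name toy_snmp_field is hypothetical, and any sample lines fed to it are the caller's own data, not output from this thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up the counter called name in a header/values line pair such as
 *   "Udp: InDatagrams NoPorts InErrors ..."
 *   "Udp: 120 3 7 ..."
 * Tokens are matched pairwise by position. Returns 0 and stores the
 * value in *out, or -1 if the field is missing or not a number.
 */
static int toy_snmp_field(const char *header, const char *values,
			  const char *name, unsigned long *out)
{
	size_t nlen = strlen(name);

	for (;;) {
		size_t hlen, vlen;

		/* Skip separators, then measure the next token on each line. */
		header += strspn(header, " \n");
		values += strspn(values, " \n");
		hlen = strcspn(header, " \n");
		vlen = strcspn(values, " \n");
		if (hlen == 0 || vlen == 0)
			return -1;
		if (hlen == nlen && strncmp(header, name, nlen) == 0)
			return sscanf(values, "%lu", out) == 1 ? 0 : -1;
		header += hlen;
		values += vlen;
	}
}
```

Calling it with the two "Udp:" lines and the name "InCsumErrors" before and after a test run gives the delta that indicates whether datagrams are still being dropped for checksum failures.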

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* RE: [-next] Regression: ssh log in slowdown
  2014-06-13  9:54     ` Geert Uytterhoeven
@ 2014-06-13 10:05       ` David Laight
  0 siblings, 0 replies; 8+ messages in thread
From: David Laight @ 2014-06-13 10:05 UTC (permalink / raw)
  To: 'Geert Uytterhoeven'
  Cc: Tom Herbert, David S. Miller, Linux-Next, Linux-sh list,
	linux-kernel, netdev

From: Geert Uytterhoeven
> Hi David,
> 
> On Fri, Jun 13, 2014 at 10:49 AM, David Laight <David.Laight@aculab.com> wrote:
> > From: Geert Uytterhoeven
> > ...
> >> Note that I'm also using NFS root, which doesn't seem to be affected.
> >> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.
> >
> > Are you sure that the delay during ssh login isn't just
> > a reverse DNS timeout?
> 
> Indeed, the ssh server sends a reverse DNS request twice, with 5s in between:
... 
> Interestingly, I don't see the forward DNS request after that, which
> does happen in the good case.

The forwards request is probably just copying some strange code from 'rshd'
that tried to verify the reverse lookup by doing a forwards lookup on the
result.
That in itself used to cause us grief.
The RDNS would (correctly) return host.bar.baz.co.uk. Since the 'domain'
in /etc/resolv.conf was bar.baz.co.uk, the forwards lookup first tried
host.bar.baz.co.uk.bar.baz.co.uk, then host.bar.baz.co.uk.baz.co.uk,
one of which always timed out :-(
(When the 'search' command was added we could avoid the request that
timed out.)
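
The name expansion described above can be sketched as follows. This is a simplified model that only appends the configured domain, or one of its parent domains, to the looked-up name. It is not the real resolver code and ignores its other rules. The function name toy_search_candidate is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the candidate name obtained by appending to name the domain
 * with its first `level` labels stripped (level 0 is the domain itself,
 * level 1 its parent, and so on).
 * Returns 0 on success, -1 if there is no such parent or buf is too small.
 */
static int toy_search_candidate(const char *name, const char *domain,
				unsigned int level, char *buf, size_t buflen)
{
	const char *suffix = domain;
	int n;

	while (level--) {
		suffix = strchr(suffix, '.');
		if (!suffix)
			return -1;	/* ran out of parent domains */
		suffix++;		/* skip the dot itself */
	}
	n = snprintf(buf, buflen, "%s.%s", name, suffix);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With name "host.bar.baz.co.uk" and domain "bar.baz.co.uk", levels 0 and 1 give the two bogus names mentioned above.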

	David


* Re: [-next] Regression: ssh log in slowdown
  2014-06-13  8:49   ` David Laight
@ 2014-06-13  9:54     ` Geert Uytterhoeven
  2014-06-13 10:05       ` David Laight
  0 siblings, 1 reply; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-13  9:54 UTC (permalink / raw)
  To: David Laight
  Cc: Tom Herbert, David S. Miller, Linux-Next, Linux-sh list,
	linux-kernel, netdev

Hi David,

On Fri, Jun 13, 2014 at 10:49 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Geert Uytterhoeven
> ...
>> Note that I'm also using NFS root, which doesn't seem to be affected.
>> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.
>
> Are you sure that the delay during ssh login isn't just
> a reverse DNS timeout?

Indeed, the ssh server sends a reverse DNS request twice, with 5s in between:

01:01:16.334257 2e:09:0a:00:6d:85 > 00:0c:42:12:75:1d, ethertype IPv4 (0x0800), length 86: <ssh-server-ip>.55318 > <dns-server-ip>.53: 2907+ PTR? <ssh-client-rev-ip>.in-addr.arpa. (44)
01:01:16.337282 00:0c:42:12:75:1d > 2e:09:0a:00:6d:85, ethertype IPv4 (0x0800), length 114: <dns-server-ip>.53 > <ssh-server-ip>.55318: 2907* 1/0/0 PTR <ssh-client-name>. (72)
01:01:16.356440 2e:09:0a:00:6d:85 > 74:d0:2b:c8:05:49, ethertype IPv4 (0x0800), length 66: <ssh-server-ip>.22 > <ssh-client-ip>.53536: Flags [.], ack 2194, win 272, options [nop,nop,TS val 4294938665 ecr 480138541], length 0

01:01:21.342072 2e:09:0a:00:6d:85 > 00:0c:42:12:75:1d, ethertype IPv4 (0x0800), length 86: <ssh-server-ip>.55318 > <dns-server-ip>.53: 2907+ PTR? <ssh-client-rev-ip>.in-addr.arpa. (44)
01:01:21.344844 00:0c:42:12:75:1d > 2e:09:0a:00:6d:85, ethertype IPv4 (0x0800), length 114: <dns-server-ip>.53 > <ssh-server-ip>.55318: 2907* 1/0/0 PTR <ssh-client-name>. (72)

01:01:26.353667 2e:09:0a:00:6d:85 > 74:d0:2b:c8:05:49, ethertype IPv4 (0x0800), length 118: <ssh-server-ip>.22 > <ssh-client-ip>.53536: Flags [P.], seq 1999:2051, ack 2194, win 272, options [nop,nop,TS val 4294939944 ecr 480138541], length 52

Interestingly, I don't see the forward DNS request after that, which
does happen in the good case.
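
For reference, the PTR query name seen in the trace is the client's IPv4 address with its octets reversed, followed by in-addr.arpa. A minimal sketch of that mapping follows, using the POSIX inet_pton() call. The helper name toy_ptr_name is hypothetical, and since the addresses in the trace are redacted, any address passed in is the caller's own example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the reverse-lookup name for a dotted-quad IPv4 address into buf,
 * without the trailing root dot that tcpdump prints.
 * Returns 0 on success, -1 on a malformed address or a too-small buffer.
 */
static int toy_ptr_name(const char *ipv4, char *buf, size_t buflen)
{
	struct in_addr addr;
	unsigned char b[4];
	int n;

	if (inet_pton(AF_INET, ipv4, &addr) != 1)
		return -1;
	/* s_addr is in network byte order, so b[0] is the first octet. */
	memcpy(b, &addr.s_addr, sizeof(b));
	n = snprintf(buf, buflen, "%u.%u.%u.%u.in-addr.arpa",
		     (unsigned)b[3], (unsigned)b[2],
		     (unsigned)b[1], (unsigned)b[0]);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For the documentation address 192.0.2.10 this produces 10.2.0.192.in-addr.arpa.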

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* RE: [-next] Regression: ssh log in slowdown
  2014-06-13  8:43 ` Geert Uytterhoeven
@ 2014-06-13  8:49   ` David Laight
  2014-06-13  9:54     ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: David Laight @ 2014-06-13  8:49 UTC (permalink / raw)
  To: 'Geert Uytterhoeven', Tom Herbert
  Cc: David S. Miller, Linux-Next, Linux-sh list, linux-kernel, netdev

From: Geert Uytterhoeven
...
> Note that I'm also using NFS root, which doesn't seem to be affected.
> I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.

Are you sure that the delay during ssh login isn't just
a reverse DNS timeout?

	David


* Re: [-next] Regression: ssh log in slowdown
  2014-06-13  8:21 Tom Herbert
@ 2014-06-13  8:43 ` Geert Uytterhoeven
  2014-06-13  8:49   ` David Laight
  0 siblings, 1 reply; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-13  8:43 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David S. Miller, Linux-Next, Linux-sh list, linux-kernel, netdev

Hi Tom,

On Fri, Jun 13, 2014 at 10:21 AM, Tom Herbert <therbert@google.com> wrote:
>> I assume this is the series "[PATCH 0/4] Checksum fixes"
>> (marc.info/?l=linux-netdev&m=140261417832399&w=2)?
>>
> Yes.
>
>> As I'm not subscribed to netdev, I cannot reply to that thread.
>>
>> "[PATCH 1/4] net: Fix save software checksum complete" fixes the issue
>> for me.
>> However, "[PATCH 2/4] udp: call __skb_checksum_complete when doing full
>> checksum" reintroduces the exact same bad behavior :-(
>>
> This implies the problem is happening in UDP path. AFAICT skb->csum is
> correct, and I don't seem to be able to reproduce the issue on my setup.
> It is possible that an skb copy is happening in which case we don't copy
> csum_valid so that checksum_unnecessary would fail in this case.
>
> Can you try with the patch below. Thanks!
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index bf92824..9cd5344 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -689,6 +689,9 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
>         new->ooo_okay           = old->ooo_okay;
>         new->no_fcs             = old->no_fcs;
>         new->encapsulation      = old->encapsulation;
> +       new->encap_hdr_csum     = old->encap_hdr_csum;
> +       new->csum_valid         = old->csum_valid;
> +       new->csum_complete_sw   = old->csum_complete_sw;
>  #ifdef CONFIG_XFRM
>         new->sp                 = secpath_get(old->sp);
>  #endif

Thanks, I applied the series "[PATCH 0/4] Checksum fixes", and the fix
above, but it doesn't help.

Note that I'm also using NFS root, which doesn't seem to be affected.
I can happily run "ls -R /" on the serial console during the 10 s delay in ssh.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* [-next] Regression: ssh log in slowdown
@ 2014-06-13  8:21 Tom Herbert
  2014-06-13  8:43 ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Herbert @ 2014-06-13  8:21 UTC (permalink / raw)
  To: geert; +Cc: davem, linux-next, linux-sh, linux-kernel, netdev

>
> I assume this is the series "[PATCH 0/4] Checksum fixes"
> (marc.info/?l=linux-netdev&m=140261417832399&w=2)?
>
Yes.

> As I'm not subscribed to netdev, I cannot reply to that thread.
>
> "[PATCH 1/4] net: Fix save software checksum complete" fixes the issue
> for me.
> However, "[PATCH 2/4] udp: call __skb_checksum_complete when doing full
> checksum" reintroduces the exact same bad behavior :-(
>
This implies the problem is in the UDP path. AFAICT skb->csum is correct,
and I can't reproduce the issue on my setup. It is possible that an skb
copy is happening; in that case we don't copy csum_valid, so
checksum_unnecessary would fail.

Can you try the patch below? Thanks!

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bf92824..9cd5344 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -689,6 +689,9 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
 	new->ooo_okay		= old->ooo_okay;
 	new->no_fcs		= old->no_fcs;
 	new->encapsulation	= old->encapsulation;
+	new->encap_hdr_csum	= old->encap_hdr_csum;
+	new->csum_valid		= old->csum_valid;
+	new->csum_complete_sw	= old->csum_complete_sw;
 #ifdef CONFIG_XFRM
 	new->sp			= secpath_get(old->sp);
 #endif
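
The hypothesis behind this patch can be illustrated with a toy userspace model: if a header copy leaves the checksum state bits behind, the copy looks unverified even though the original was already checked. The struct and function names below are hypothetical stand-ins, not the kernel's sk_buff code, and the model only shows the suspected failure mode.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few sk_buff fields discussed here. */
struct toy_skb {
	unsigned int csum;
	unsigned int csum_valid:1;
	unsigned int csum_complete_sw:1;
};

/* Header copy that carries over the checksum but not its state bits. */
static void toy_copy_header_old(struct toy_skb *dst, const struct toy_skb *src)
{
	dst->csum = src->csum;
}

/* Header copy that also carries over the state bits, as in the patch. */
static void toy_copy_header_fixed(struct toy_skb *dst, const struct toy_skb *src)
{
	dst->csum = src->csum;
	dst->csum_valid = src->csum_valid;
	dst->csum_complete_sw = src->csum_complete_sw;
}

/* A consumer that trusts csum_valid and re-verifies otherwise. */
static bool toy_needs_recheck(const struct toy_skb *skb)
{
	return !skb->csum_valid;
}
```

Copying a verified toy_skb into a zero-initialized one with toy_copy_header_old() leaves toy_needs_recheck() true for the copy, while toy_copy_header_fixed() preserves the verified state.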


* Re: [-next] Regression: ssh log in slowdown
       [not found] ` <CA+mtBx9BSe2M7-ALT9AL1J0qPXzJP2LcFG5dCJ3DASnh6-p9gA@mail.gmail.com>
@ 2014-06-13  7:27   ` Geert Uytterhoeven
  0 siblings, 0 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2014-06-13  7:27 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David S. Miller, Linux-Next, Linux-sh list, linux-kernel

Hi Tom,

cc lkml, as this is now in mainline

On Fri, Jun 13, 2014 at 2:59 AM, Tom Herbert <therbert@google.com> wrote:
>>     net: Save software checksum complete
>>
>>     In skb_checksum complete, if we need to compute the checksum for the
>>     packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
>>     Subsequent checksum verification can use this.
>>
>>     Also, added csum_complete_sw flag to distinguish between software and
>>     hardware generated checksum complete, we should always be able to trust
>>     the software computation.
>>
>>     Signed-off-by: Tom Herbert <therbert@google.com>
>>     Signed-off-by: David S. Miller <davem@davemloft.net>
>>
>> Reverting this commit fixes the issue.
>>
>> Anyone with a clue?
>>
> Hi Geert,
>
> I'm very sorry that I seemed to have missed your initial bug report,
> thanks for bisecting the problem. I have posted a fix for this, please
> verify it if you can

Thanks for your patches!

I assume this is the series "[PATCH 0/4] Checksum fixes"
(marc.info/?l=linux-netdev&m=140261417832399&w=2)?

As I'm not subscribed to netdev, I cannot reply to that thread.

"[PATCH 1/4] net: Fix save software checksum complete" fixes the issue
for me.
However, "[PATCH 2/4] udp: call __skb_checksum_complete when doing full
checksum" reintroduces the exact same bad behavior :-(

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


Thread overview: 8+ messages
2014-06-13 15:37 [-next] Regression: ssh log in slowdown Tom Herbert
2014-06-15  8:42 ` Geert Uytterhoeven
  -- strict thread matches above, loose matches on Subject: below --
2014-06-13  8:21 Tom Herbert
2014-06-13  8:43 ` Geert Uytterhoeven
2014-06-13  8:49   ` David Laight
2014-06-13  9:54     ` Geert Uytterhoeven
2014-06-13 10:05       ` David Laight
     [not found] <CAMuHMdU6rrpx6xFw71HKexDJUZtfFtL+um6dxZ=EaycgVO312A@mail.gmail.com>
     [not found] ` <CA+mtBx9BSe2M7-ALT9AL1J0qPXzJP2LcFG5dCJ3DASnh6-p9gA@mail.gmail.com>
2014-06-13  7:27   ` Geert Uytterhoeven
