netdev.vger.kernel.org archive mirror
* kernel warning in tcp_fragment
@ 2015-07-22 18:55 Jovi Zhangwei
  2015-07-27 19:08 ` Jovi Zhangwei
  2015-07-27 23:19 ` Martin KaFai Lau
  0 siblings, 2 replies; 13+ messages in thread
From: Jovi Zhangwei @ 2015-07-22 18:55 UTC (permalink / raw)
  To: ncardwell, kafai, netdev, davem, kuznet, jmorris, yoshfuji, kaber

Hi Neal and Martin,

Sorry to disturb you. Our production systems (3.14 and 3.18 stable
kernels) are hitting many tcp_fragment warnings; the trace is the same
as the one below, which you discussed before.

http://comments.gmane.org/gmane.linux.network/365658

But I didn't find a final solution in that mail thread. Do you have
any new ideas or patches for this warning?

Many thanks.


[5184217.672290] WARNING: CPU: 9 PID: 2801 at net/ipv4/tcp_output.c:1081 tcp_fragment+0x34/0x230()
[5184217.680995] Modules linked in: sfc_char(O) sfc_resource(O)
sfc_affinity(O) nf_conntrack_netlink xt_connlimit xt_length xt_bpf
xt_hashlimit iptable_nat nf_nat_ipv4 nf_nat iptable_mangle xt_comment
ip6table_security ip6table_mangle ip_set_hash_netport 8021q garp bridge
stp llc ipmi_devintf nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6table_raw ip6_tables nf_conntrack_ipv4
nf_defrag_ipv4 xt_NFLOG nfnetlink_log xt_conntrack iptable_filter
xt_tcpudp xt_multiport xt_CT nf_conntrack xt_set iptable_raw ip_tables
x_tables ip_set_hash_net ip_set_hash_ip ip_set nfnetlink
rpcsec_gss_krb5 auth_rpcgss
oid_registry nfsv4 fuse nfsv3 nfs_acl nfs fscache lockd sunrpc
tcp_cubic sg sfc(O) mtd mdio igb dca i2c_algo_bit ptp pps_core sd_mod
crct10dif_generic crc_t10dif crct10dif_common x86_pkg_temp_thermal
acpi_cpufreq coretemp kvm_intel kvm crc32c_intel aesni_intel ablk_helper
cryptd lrw gf128mul glue_helper aes_x86_64 ahci libahci ehci_pci
libata ehci_hcd i2c_i801 i2c_core lpc_ich mfd_core usbcore scsi_mod
usb_common wmi evdev ipmi_si ipmi_msghandler tpm_tis tpm acpi_pad
processor thermal_sys button
[5184217.684098] CPU: 9 PID: 2801 Comm: rrdns Tainted: G        W  O 3.14.28-cloudflare #1
[5184217.684099] Hardware name: Quanta Computer Inc QuantaPlex T41S-2U/S2S-MB, BIOS S2S_3A14          09/18/2014
[5184217.684100]  0000000000000000 ffffffff81466263 0000000000000000 ffffffff8103bb34
[5184217.684101]  ffffffff813e07f2 ffff8818abebcc00 000000000000004a 0000000000000002
[5184217.684102]  0000000000000060 ffffffff813e07f2 0000003000004120 ffff8818abebcc00
[5184217.684104] Call Trace:
[5184217.684105]  <IRQ>  [<ffffffff81466263>] ? dump_stack+0x41/0x51
[5184217.684111]  [<ffffffff8103bb34>] ? warn_slowpath_common+0x74/0x89
[5184217.684115]  [<ffffffff813e07f2>] ? tcp_fragment+0x34/0x230
[5184217.684118]  [<ffffffff813e07f2>] ? tcp_fragment+0x34/0x230
[5184217.684119]  [<ffffffff813d98b7>] ? tcp_mark_head_lost+0x1bd/0x1d5
[5184217.684123]  [<ffffffff813ddb71>] ? tcp_fastretrans_alert+0x69f/0x71d
[5184217.684125]  [<ffffffff813de567>] ? tcp_ack+0x90f/0xb16
[5184217.684126]  [<ffffffff813df618>] ? tcp_rcv_state_process+0x5bd/0x9b8
[5184217.684128]  [<ffffffff8106d9c0>] ? __wake_up_sync_key+0x3a/0x4d
[5184217.684130]  [<ffffffff813920ed>] ? sk_wake_async+0x17/0x34
[5184217.684133]  [<ffffffff81440d13>] ? ipv6_skip_exthdr+0x28/0xc7
[5184217.684139]  [<ffffffff81418db6>] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
[5184217.684143]  [<ffffffff81435abe>] ? tcp_v6_do_rcv+0x3ac/0x4f1
[5184217.684146]  [<ffffffff81435eec>] ? tcp_v6_rcv+0x2e9/0x554
[5184217.684148]  [<ffffffff813c70d3>] ? nf_hook_slow+0x66/0xf1
[5184217.684150]  [<ffffffff81418db6>] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
[5184217.684167]  [<ffffffff81418f70>] ? ip6_input_finish+0x1ba/0x2a7
[5184217.684169]  [<ffffffff813a1c12>] ? __netif_receive_skb_core+0x422/0x494
[5184217.684172]  [<ffffffff813a283a>] ? netif_receive_skb_internal+0x37/0x6d
[5184217.684188]  [<ffffffffa09a2e40>] ? efx_ssr_try_merge+0x336/0x34e [sfc]
[5184217.684215]  [<ffffffffa09a4075>] ? __efx_ssr_end_of_burst+0x3e/0xd2 [sfc]
[5184217.684225]  [<ffffffffa098e3bd>] ? efx_process_channel+0x5d/0x71 [sfc]
[5184217.684243]  [<ffffffffa098f557>] ? efx_poll+0x6d/0x16b [sfc]
[5184217.684248]  [<ffffffff813a2e27>] ? net_rx_action+0xc6/0x191
[5184217.684250]  [<ffffffff8103f7ee>] ? __do_softirq+0x100/0x27c
[5184217.684254]  [<ffffffff8103fae6>] ? irq_exit+0x51/0xbc
[5184217.684255]  [<ffffffff81003e35>] ? do_IRQ+0x9d/0xb4
[5184217.684258]  [<ffffffff8146992a>] ? common_interrupt+0x6a/0x6a
[5184217.684261]  <EOI> <4>[5184217.684263] ---[ end trace 4f42d23abf1c890e ]---
[5184217.684460] ------------[ cut here ]------------


* Re: kernel warning in tcp_fragment
  2015-07-22 18:55 kernel warning in tcp_fragment Jovi Zhangwei
@ 2015-07-27 19:08 ` Jovi Zhangwei
  2015-07-27 23:19 ` Martin KaFai Lau
  1 sibling, 0 replies; 13+ messages in thread
From: Jovi Zhangwei @ 2015-07-27 19:08 UTC (permalink / raw)
  To: Neal Cardwell, kafai, netdev, davem, Alexey Kuznetsov, jmorris,
	yoshfuji, Patrick McHardy

ping...

On Wed, Jul 22, 2015 at 11:55 AM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
> Hi Neal and Martin,
>
> Sorry for disturbing, our production system(3.14 and 3.18 stable
> kernel) have many tcp_fragment warnings,
> the trace is same as below one which you discussed before.
>
> http://comments.gmane.org/gmane.linux.network/365658
>
> But I didn't found the final solution in that mail thread, do you have
> any new ideas or patches on this warning?
>
> Great thanks.
>
>
> [5184217.672290] WARNING: CPU: 9 PID: 2801 at net/ipv4/tcp_output.c:1081 tcp_fragment+0x34/0x230()
> [5184217.680995] Modules linked in: sfc_char(O) sfc_resource(O)
> sfc_affinity(O) nf_conntrack_netlink xt_connlimit xt_length xt_bpf
> xt_hashlimit iptable_nat nf_nat_ipv4 nf_nat iptable_mangle xt_comment
> ip6table_security ip6table_mangle ip_set_hash_netport 8021q garp bridge
> stp llc ipmi_devintf nf_conntrack_ipv6 nf_defrag_ipv6
> ip6table_filter ip6table_raw ip6_tables nf_conntrack_ipv4
> nf_defrag_ipv4 xt_NFLOG nfnetlink_log xt_conntrack iptable_filter
> xt_tcpudp xt_multiport xt_CT nf_conntrack xt_set iptable_raw ip_tables
> x_tables ip_set_hash_net ip_set_hash_ip ip_set nfnetlink
> rpcsec_gss_krb5 auth_rpcgss
> oid_registry nfsv4 fuse nfsv3 nfs_acl nfs fscache lockd sunrpc
> tcp_cubic sg sfc(O) mtd mdio igb dca i2c_algo_bit ptp pps_core sd_mod
> crct10dif_generic crc_t10dif crct10dif_common x86_pkg_temp_thermal
> acpi_cpufreq coretemp kvm_intel kvm crc32c_intel aesni_intel ablk_helper
> cryptd lrw gf128mul glue_helper aes_x86_64 ahci libahci ehci_pci
> libata ehci_hcd i2c_i801 i2c_core lpc_ich mfd_core usbcore scsi_mod
> usb_common wmi evdev ipmi_si ipmi_msghandler tpm_tis tpm acpi_pad
> processor thermal_sys button
> [5184217.684098] CPU: 9 PID: 2801 Comm: rrdns Tainted: G        W  O 3.14.28-cloudflare #1
> [5184217.684099] Hardware name: Quanta Computer Inc QuantaPlex T41S-2U/S2S-MB, BIOS S2S_3A14          09/18/2014
> [5184217.684100]  0000000000000000 ffffffff81466263 0000000000000000 ffffffff8103bb34
> [5184217.684101]  ffffffff813e07f2 ffff8818abebcc00 000000000000004a 0000000000000002
> [5184217.684102]  0000000000000060 ffffffff813e07f2 0000003000004120 ffff8818abebcc00
> [5184217.684104] Call Trace:
> [5184217.684105]  <IRQ>  [<ffffffff81466263>] ? dump_stack+0x41/0x51
> [5184217.684111]  [<ffffffff8103bb34>] ? warn_slowpath_common+0x74/0x89
> [5184217.684115]  [<ffffffff813e07f2>] ? tcp_fragment+0x34/0x230
> [5184217.684118]  [<ffffffff813e07f2>] ? tcp_fragment+0x34/0x230
> [5184217.684119]  [<ffffffff813d98b7>] ? tcp_mark_head_lost+0x1bd/0x1d5
> [5184217.684123]  [<ffffffff813ddb71>] ? tcp_fastretrans_alert+0x69f/0x71d
> [5184217.684125]  [<ffffffff813de567>] ? tcp_ack+0x90f/0xb16
> [5184217.684126]  [<ffffffff813df618>] ? tcp_rcv_state_process+0x5bd/0x9b8
> [5184217.684128]  [<ffffffff8106d9c0>] ? __wake_up_sync_key+0x3a/0x4d
> [5184217.684130]  [<ffffffff813920ed>] ? sk_wake_async+0x17/0x34
> [5184217.684133]  [<ffffffff81440d13>] ? ipv6_skip_exthdr+0x28/0xc7
> [5184217.684139]  [<ffffffff81418db6>] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
> [5184217.684143]  [<ffffffff81435abe>] ? tcp_v6_do_rcv+0x3ac/0x4f1
> [5184217.684146]  [<ffffffff81435eec>] ? tcp_v6_rcv+0x2e9/0x554
> [5184217.684148]  [<ffffffff813c70d3>] ? nf_hook_slow+0x66/0xf1
> [5184217.684150]  [<ffffffff81418db6>] ? NF_HOOK_THRESH.constprop.11+0x4a/0x4a
> [5184217.684167]  [<ffffffff81418f70>] ? ip6_input_finish+0x1ba/0x2a7
> [5184217.684169]  [<ffffffff813a1c12>] ? __netif_receive_skb_core+0x422/0x494
> [5184217.684172]  [<ffffffff813a283a>] ? netif_receive_skb_internal+0x37/0x6d
> [5184217.684188]  [<ffffffffa09a2e40>] ? efx_ssr_try_merge+0x336/0x34e [sfc]
> [5184217.684215]  [<ffffffffa09a4075>] ? __efx_ssr_end_of_burst+0x3e/0xd2 [sfc]
> [5184217.684225]  [<ffffffffa098e3bd>] ? efx_process_channel+0x5d/0x71 [sfc]
> [5184217.684243]  [<ffffffffa098f557>] ? efx_poll+0x6d/0x16b [sfc]
> [5184217.684248]  [<ffffffff813a2e27>] ? net_rx_action+0xc6/0x191
> [5184217.684250]  [<ffffffff8103f7ee>] ? __do_softirq+0x100/0x27c
> [5184217.684254]  [<ffffffff8103fae6>] ? irq_exit+0x51/0xbc
> [5184217.684255]  [<ffffffff81003e35>] ? do_IRQ+0x9d/0xb4
> [5184217.684258]  [<ffffffff8146992a>] ? common_interrupt+0x6a/0x6a
> [5184217.684261]  <EOI> <4>[5184217.684263] ---[ end trace 4f42d23abf1c890e ]---
> [5184217.684460] ------------[ cut here ]------------


* Re: kernel warning in tcp_fragment
  2015-07-22 18:55 kernel warning in tcp_fragment Jovi Zhangwei
  2015-07-27 19:08 ` Jovi Zhangwei
@ 2015-07-27 23:19 ` Martin KaFai Lau
  2015-07-31 18:04   ` Jovi Zhangwei
  1 sibling, 1 reply; 13+ messages in thread
From: Martin KaFai Lau @ 2015-07-27 23:19 UTC (permalink / raw)
  To: Jovi Zhangwei, Eric Dumazet
  Cc: ncardwell, netdev, davem, kuznet, jmorris, yoshfuji, kaber,
	FB Kernel Team

On Wed, Jul 22, 2015 at 11:55:35AM -0700, Jovi Zhangwei wrote:
> Sorry for disturbing, our production system(3.14 and 3.18 stable
> kernel) have many tcp_fragment warnings,
> the trace is same as below one which you discussed before.
> 
> http://comments.gmane.org/gmane.linux.network/365658
> 
> But I didn't found the final solution in that mail thread, do you have
> any new ideas or patches on this warning?

I think the following points to the last discussion.  We are currently using a
similar patch:
http://comments.gmane.org/gmane.linux.network/366549

Eric, any update on your findings? Or have you already pushed a fix?

Thanks,
--Martin


* Re: kernel warning in tcp_fragment
  2015-07-27 23:19 ` Martin KaFai Lau
@ 2015-07-31 18:04   ` Jovi Zhangwei
  2015-08-10 18:10     ` Jovi Zhangwei
  0 siblings, 1 reply; 13+ messages in thread
From: Jovi Zhangwei @ 2015-07-31 18:04 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Eric Dumazet, Neal Cardwell, netdev, davem, Alexey Kuznetsov,
	jmorris, yoshfuji, Patrick McHardy, FB Kernel Team

Hi Eric,

Would you share your thoughts on this bug? Many thanks.


On Mon, Jul 27, 2015 at 4:19 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> On Wed, Jul 22, 2015 at 11:55:35AM -0700, Jovi Zhangwei wrote:
>> Sorry for disturbing, our production system(3.14 and 3.18 stable
>> kernel) have many tcp_fragment warnings,
>> the trace is same as below one which you discussed before.
>>
>> http://comments.gmane.org/gmane.linux.network/365658
>>
>> But I didn't found the final solution in that mail thread, do you have
>> any new ideas or patches on this warning?
>
> I think the following points to the last discussion.  We are currently using a
> similar patch:
> http://comments.gmane.org/gmane.linux.network/366549
>
> Eric, any update on your findings? or you have already pushed a fix?
>
> Thanks,
> --Martin


* Re: kernel warning in tcp_fragment
  2015-07-31 18:04   ` Jovi Zhangwei
@ 2015-08-10 18:10     ` Jovi Zhangwei
  2015-08-10 18:35       ` Neal Cardwell
  0 siblings, 1 reply; 13+ messages in thread
From: Jovi Zhangwei @ 2015-08-10 18:10 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Eric Dumazet, Neal Cardwell, netdev, davem, Alexey Kuznetsov,
	jmorris, yoshfuji, Patrick McHardy, FB Kernel Team

Ping?

We are seeing a lot of these warnings in our production systems. It
would be greatly appreciated if someone could give us a fix for them. :)

On Fri, Jul 31, 2015 at 11:04 AM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
> Hi Eric,
>
> Would you like share your thought on this bug? great thanks.
>
>
> On Mon, Jul 27, 2015 at 4:19 PM, Martin KaFai Lau <kafai@fb.com> wrote:
>> On Wed, Jul 22, 2015 at 11:55:35AM -0700, Jovi Zhangwei wrote:
>>> Sorry for disturbing, our production system(3.14 and 3.18 stable
>>> kernel) have many tcp_fragment warnings,
>>> the trace is same as below one which you discussed before.
>>>
>>> http://comments.gmane.org/gmane.linux.network/365658
>>>
>>> But I didn't found the final solution in that mail thread, do you have
>>> any new ideas or patches on this warning?
>>
>> I think the following points to the last discussion.  We are currently using a
>> similar patch:
>> http://comments.gmane.org/gmane.linux.network/366549
>>
>> Eric, any update on your findings? or you have already pushed a fix?
>>
>> Thanks,
>> --Martin


* Re: kernel warning in tcp_fragment
  2015-08-10 18:10     ` Jovi Zhangwei
@ 2015-08-10 18:35       ` Neal Cardwell
  2015-08-10 21:53         ` Jovi Zhangwei
  2015-08-13  3:45         ` Martin KaFai Lau
  0 siblings, 2 replies; 13+ messages in thread
From: Neal Cardwell @ 2015-08-10 18:35 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Martin KaFai Lau, Eric Dumazet, Netdev, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team

[-- Attachment #1: Type: text/plain, Size: 687 bytes --]

On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
>
> Ping?
>
> We saw a lot of this warnings in our production system. It would be
> great appreciate if someone can give us the fix on this warnings. :)

What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
setting it to 0? Previous reports (
https://patchwork.ozlabs.org/patch/480882/ ) have shown that this gets
rid of at least one source of the warning. So that would provide a
useful data point.
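
For anyone checking this setting programmatically rather than with
sysctl(8), a minimal sketch follows. It assumes the usual procfs path
/proc/sys/net/ipv4/tcp_mtu_probing; read_int_file() is a hypothetical
helper name used only for illustration, not an existing API.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a procfs-style file.
 * Returns 0 on success and stores the value in *out;
 * returns -1 if the file cannot be opened or does not hold an integer. */
static int read_int_file(const char *path, int *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling read_int_file("/proc/sys/net/ipv4/tcp_mtu_probing", &v) and
checking for a 0 return would give the current value in v.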

Separately, you could also try the attached patch. This is against
3.14.39. It tries to attack a different possible source of this
warning. Please let us know if that patch helps.

Thanks!

neal

[-- Attachment #2: 0001-RFC-for-tests-on-v3.14.39-tcp-resegment-skbs-that-we.patch --]
[-- Type: application/octet-stream, Size: 4569 bytes --]

From 28f179e004adcaa2397c77882836fd7111ef61aa Mon Sep 17 00:00:00 2001
From: Neal Cardwell <ncardwell@google.com>
Date: Fri, 29 May 2015 20:05:23 -0400
Subject: [PATCH] [RFC for tests on v3.14.39] tcp: resegment skbs that we mark
 un-SACKed due to reneging

[This patch is for Linux v3.14.39 and is for testing a proposed fix
for the issue reported in the netdev thread "Recurring trace from
tcp_fragment()" from May 29, 2015. A slightly different patch would be
needed for more recent kernels.]

If we are removing a SACK mark due to reneging then we should check to
see if the pcount needs to be sanitized, since tcp_shifted_skb()
can join together SACKed skbs in a way that makes their pcount
unrepresentative of the length of the packet.

This is aimed at fixing scenarios like the one where
tcp_mark_head_lost() calls tcp_fragment() and we fire the following
warning:

         if (WARN_ON(len > skb->len))
                return -EINVAL;

Here is a theory as to how this could happen...

Suppose the MSS=1000, for simplicity.

(1) send packet A, 1001 bytes, pcount 2
(2) send packet B, 1001 bytes, pcount 2
(3) receive SACK for A
(4) receive SACK for A and B, shift B onto A.

When we shift B onto A, tcp_shifted_skb() just adds the pcounts of A
and B, so now A's pcount is 2+2=4. But its skb->len is 1001+1001 =
2002 bytes.  Now normally we would expect an skb with a pcount of 4 to
have somewhere between 3*MSS+1byte and 4*MSS (between 3001 and 4000
bytes). And tcp_mark_head_lost() and tcp_match_skb_to_sack()
implicitly assume this.

Suppose there is then SACK reneging, and we remove the SACKed bit from
this weird skb A with pcount 4 and skb->len 2002.  Then we get more
SACKs for packets beyond A, and the loss-marking rules say we should
be able to mark 3 packets starting at A as lost.  Then we try to chop
3MSS worth of bytes off of packet A, which only has 2.002MSS of data.
And the warning fires.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
 include/net/tcp.h     |  2 ++
 net/ipv4/tcp_input.c  |  9 ++++++++-
 net/ipv4/tcp_output.c | 14 ++++++++++++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1f0d847..4464312 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -543,6 +543,8 @@ void tcp_xmit_retransmit_queue(struct sock *);
 void tcp_simple_retransmit(struct sock *);
 int tcp_trim_head(struct sock *, struct sk_buff *, u32);
 int tcp_fragment(struct sock *, struct sk_buff *, u32, unsigned int);
+int tcp_reset_skb_tso_segs(struct sock *sk, struct sk_buff *skb,
+			   unsigned int mss_now);
 
 void tcp_send_probe0(struct sock *);
 void tcp_send_partial(struct sock *);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2291791..804713b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1915,6 +1915,7 @@ void tcp_enter_loss(struct sock *sk, int how)
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
 	bool new_recovery = false;
+	bool was_sacked;
 
 	/* Reduce ssthresh if it has not yet been made inside this window. */
 	if (icsk->icsk_ca_state <= TCP_CA_Disorder ||
@@ -1949,11 +1950,17 @@ void tcp_enter_loss(struct sock *sk, int how)
 			tp->undo_marker = 0;
 
 		TCP_SKB_CB(skb)->sacked &= (~TCPCB_TAGBITS)|TCPCB_SACKED_ACKED;
-		if (!(TCP_SKB_CB(skb)->sacked&TCPCB_SACKED_ACKED) || how) {
+		was_sacked = TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED;
+		if (!was_sacked || how) {
 			TCP_SKB_CB(skb)->sacked &= ~TCPCB_SACKED_ACKED;
 			TCP_SKB_CB(skb)->sacked |= TCPCB_LOST;
 			tp->lost_out += tcp_skb_pcount(skb);
 			tp->retransmit_high = TCP_SKB_CB(skb)->end_seq;
+
+			/* Clean up weird pcounts from tcp_shifted_skb(). */
+			if (was_sacked)
+				tcp_reset_skb_tso_segs(sk, skb,
+						       tcp_current_mss(sk));
 		}
 	}
 	tcp_verify_left_out(tp);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 96f64e5..74c8757 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1501,6 +1501,20 @@ static int tcp_init_tso_segs(const struct sock *sk, struct sk_buff *skb,
 	return tso_segs;
 }
 
+/* Recompute the TSO segmentation for an skb that was already sent. */
+int tcp_reset_skb_tso_segs(struct sock *sk, struct sk_buff *skb,
+			   unsigned int mss_now)
+{
+	int oldpcount = tcp_skb_pcount(skb);
+
+	if (skb_unclone(skb, GFP_ATOMIC))
+		return -ENOMEM;
+
+	tcp_set_skb_tso_segs(sk, skb, mss_now);
+	tcp_adjust_pcount(sk, skb, oldpcount - tcp_skb_pcount(skb));
+
+	return 0;
+}
 
 /* Return true if the Nagle test allows this packet to be
  * sent now.
-- 
2.2.0.rc0.207.ga3a616c
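
The arithmetic in the commit message above can be replayed with a small
toy model. This is only an illustrative sketch, not kernel code: struct
toy_skb and the toy_* helpers are hypothetical names that track just a
length and a pcount, and MSS=1000 is the simplifying assumption taken
from the commit message.

```c
/* Toy model of an skb: only the two fields the theory depends on. */
struct toy_skb {
	unsigned int len;    /* payload bytes */
	unsigned int pcount; /* segments this skb is accounted as */
};

/* pcount as it would be derived from the length alone: ceil(len / mss). */
static unsigned int toy_pcount_for_len(unsigned int len, unsigned int mss)
{
	return (len + mss - 1) / mss;
}

/* Model of the shift described in the commit message:
 * both the lengths and the pcounts are simply added together. */
static void toy_shift(struct toy_skb *prev, const struct toy_skb *skb)
{
	prev->len += skb->len;
	prev->pcount += skb->pcount;
}

/* Returns 1 if chopping `packets` MSS-sized segments off the head would
 * ask for more bytes than the skb holds, i.e. the condition checked by
 * WARN_ON(len > skb->len). */
static int toy_fragment_would_warn(const struct toy_skb *skb,
				   unsigned int packets, unsigned int mss)
{
	return packets * mss > skb->len;
}
```

With A and B both at len 1001 and pcount 2, toy_shift(&a, &b) leaves A
at len 2002 and pcount 4. toy_fragment_would_warn(&a, 3, 1000) is then
true because 3000 > 2002, whereas toy_pcount_for_len(2002, 1000) gives 3,
the pcount one would expect from the length alone.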



* Re: kernel warning in tcp_fragment
  2015-08-10 18:35       ` Neal Cardwell
@ 2015-08-10 21:53         ` Jovi Zhangwei
  2015-08-13  3:45         ` Martin KaFai Lau
  1 sibling, 0 replies; 13+ messages in thread
From: Jovi Zhangwei @ 2015-08-10 21:53 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Martin KaFai Lau, Eric Dumazet, Netdev, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team

Hi Neal,

Many thanks for your reply; we will arrange testing of that patch.

On Mon, Aug 10, 2015 at 11:35 AM, Neal Cardwell <ncardwell@google.com> wrote:
> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
>>
>> Ping?
>>
>> We saw a lot of this warnings in our production system. It would be
>> great appreciate if someone can give us the fix on this warnings. :)
>
> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
> setting it to 0? Previous reports (
> https://patchwork.ozlabs.org/patch/480882/ ) have shown that this gets
> rid of at least one source of the warning. So that would provide a
> useful data point.
>
> Separately, you could also try the attached patch. This is against
> 3.14.39. It tries to attack a different possible source of this
> warning. Please let us know if that patch helps.
>
> Thanks!
>
> neal


* Re: kernel warning in tcp_fragment
  2015-08-10 18:35       ` Neal Cardwell
  2015-08-10 21:53         ` Jovi Zhangwei
@ 2015-08-13  3:45         ` Martin KaFai Lau
  2015-08-13 23:05           ` Jovi Zhangwei
  2015-09-01 23:02           ` Grant Zhang
  1 sibling, 2 replies; 13+ messages in thread
From: Martin KaFai Lau @ 2015-08-13  3:45 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Neal Cardwell, Eric Dumazet, Netdev, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team

On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
> >
> > Ping?
> >
> > We saw a lot of this warnings in our production system. It would be
> > great appreciate if someone can give us the fix on this warnings. :)
> 
> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
> setting it to 0? 

Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
patch we posted earlier a try: https://patchwork.ozlabs.org/patch/481609/
It is the same patch that I pointed out earlier. You can click
on the download link.

We are currently using a similar patch while keeping net.ipv4.tcp_mtu_probing=1.

Thanks,
--Martin


* Re: kernel warning in tcp_fragment
  2015-08-13  3:45         ` Martin KaFai Lau
@ 2015-08-13 23:05           ` Jovi Zhangwei
  2015-09-01 23:02           ` Grant Zhang
  1 sibling, 0 replies; 13+ messages in thread
From: Jovi Zhangwei @ 2015-08-13 23:05 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Neal Cardwell, Eric Dumazet, Netdev, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team

Hi,

On Wed, Aug 12, 2015 at 8:45 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
>> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
>> >
>> > Ping?
>> >
>> > We saw a lot of this warnings in our production system. It would be
>> > great appreciate if someone can give us the fix on this warnings. :)
>>
>> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
>> setting it to 0?
>
> Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
> patch we posted earlier a try: https://patchwork.ozlabs.org/patch/481609/
> It is the same patch that I pointed out earlier. You can click
> on the download link.
>
> We are currently using a similar patch while keeping net.ipv4.tcp_mtu_probing=1.
>
Our systems need net.ipv4.tcp_mtu_probing, so we cannot set it to 0.
We are testing the patch Neal provided earlier; I will let you know the result.

Thanks.


* Re: kernel warning in tcp_fragment
  2015-08-13  3:45         ` Martin KaFai Lau
  2015-08-13 23:05           ` Jovi Zhangwei
@ 2015-09-01 23:02           ` Grant Zhang
  2015-09-14 18:12             ` Martin KaFai Lau
       [not found]             ` <CABPcSqJMS3oYLbL=Ns71ciF_9rcZNuZ0VceV=noaLJgV=LTAQQ@mail.gmail.com>
  1 sibling, 2 replies; 13+ messages in thread
From: Grant Zhang @ 2015-09-01 23:02 UTC (permalink / raw)
  To: Martin KaFai Lau, Jovi Zhangwei
  Cc: Neal Cardwell, Eric Dumazet, Netdev, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team

Hi Martin,

I did try out your v2 patch on our production server and can confirm 
that the patch gets rid of the WARN_ON trace.

I would really like to see the issue fixed upstream (and backported
to the 3.14 longterm kernel tree), either by this patch or something
else. Is there a plan for this?

Thanks,

Grant

On 12/08/2015 20:45, Martin KaFai Lau wrote:
> On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
>> On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
>>>
>>> Ping?
>>>
>>> We saw a lot of this warnings in our production system. It would be
>>> great appreciate if someone can give us the fix on this warnings. :)
>>
>> What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
>> setting it to 0?
>
> Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
> patch we posted earlier a try: https://patchwork.ozlabs.org/patch/481609/
> It is the same patch that I pointed out earlier. You can click
> on the download link.
>
> We are currently using a similar patch while keeping net.ipv4.tcp_mtu_probing=1.
>
> Thanks,
> --Martin


* Re: kernel warning in tcp_fragment
  2015-09-01 23:02           ` Grant Zhang
@ 2015-09-14 18:12             ` Martin KaFai Lau
       [not found]             ` <CABPcSqJMS3oYLbL=Ns71ciF_9rcZNuZ0VceV=noaLJgV=LTAQQ@mail.gmail.com>
  1 sibling, 0 replies; 13+ messages in thread
From: Martin KaFai Lau @ 2015-09-14 18:12 UTC (permalink / raw)
  To: Grant Zhang
  Cc: Jovi Zhangwei, Neal Cardwell, Eric Dumazet, Netdev, David Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team

Hi Grant,

Thanks for testing it.  I will try to repost the patch.

Thanks,
Martin

On Tue, Sep 01, 2015 at 04:02:33PM -0700, Grant Zhang wrote:
> Hi Martin,
>
> I did try out your v2 patch on our production server and can confirm that
> the patch gets rid of the WARN_ON trace.
>
> I would really like to see the issue been fixed by upstream(and backported
> to kernel longterm tree 3.14)--either by this patch or something else. Is
> there a plan for this?
>
> Thanks,
>
> Grant
>
> On 12/08/2015 20:45, Martin KaFai Lau wrote:
> >On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote:
> >>On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
> >>>
> >>>Ping?
> >>>
> >>>We saw a lot of this warnings in our production system. It would be
> >>>great appreciate if someone can give us the fix on this warnings. :)
> >>
> >>What is your net.ipv4.tcp_mtu_probing setting? If 1, have you tried
> >>setting it to 0?
> >
> >Hi Jovi, If setting net.ipv4.tcp_mtu_probing=0 helps, can you give the
> >patch we posted earlier a try: https://patchwork.ozlabs.org/patch/481609/
> >It is the same patch that I pointed out earlier. You can click
> >on the download link.
> >
> >We are currently using a similar patch while keeping net.ipv4.tcp_mtu_probing=1.
> >
> >Thanks,
> >--Martin


* Re: kernel warning in tcp_fragment
       [not found]             ` <CABPcSqJMS3oYLbL=Ns71ciF_9rcZNuZ0VceV=noaLJgV=LTAQQ@mail.gmail.com>
@ 2015-09-14 18:15               ` Neal Cardwell
       [not found]                 ` <CABPcSqJEoVAXk+PAZCWgD6LFV0Nxz7ON3CUuZZMgrcRQFLK44w@mail.gmail.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Neal Cardwell @ 2015-09-14 18:15 UTC (permalink / raw)
  To: Jovi Zhangwei
  Cc: Grant Zhang, Martin KaFai Lau, Eric Dumazet, Netdev,
	David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team

On Mon, Sep 14, 2015 at 6:27 AM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
>
> Hi Neal,
>
> After several days testing on your patch, our system crashed. Dmesg attached.

Jovi -- Sorry about that... thank you for the testing and the data point.

neal


* Re: kernel warning in tcp_fragment
       [not found]                 ` <CABPcSqJEoVAXk+PAZCWgD6LFV0Nxz7ON3CUuZZMgrcRQFLK44w@mail.gmail.com>
@ 2015-10-19 10:57                   ` Jovi Zhangwei
  0 siblings, 0 replies; 13+ messages in thread
From: Jovi Zhangwei @ 2015-10-19 10:57 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Grant Zhang, Martin KaFai Lau, Eric Dumazet, Netdev,
	David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, FB Kernel Team, Marek Majkowski

Hi Martin and Eric,

Do we have a final solution or patch for this issue? We are seeing a
great many of these warnings in our production systems.

Thank you very much.

On Tue, Oct 13, 2015 at 4:55 PM, Jovi Zhangwei <jovi@cloudflare.com> wrote:
> Hi all,
>
> Is there a final patch to fix this issue? Thanks.
>
> On Mon, Sep 14, 2015 at 7:15 PM, Neal Cardwell <ncardwell@google.com> wrote:
>>
>> On Mon, Sep 14, 2015 at 6:27 AM, Jovi Zhangwei <jovi@cloudflare.com>
>> wrote:
>> >
>> > Hi Neal,
>> >
>> > After several days testing on your patch, our system crashed. Dmesg
>> > attached.
>>
>> Jovi -- Sorry about that... thank you for the testing and the data point.
>>
>> neal
>
>

