From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: (2) (2) (2) [Kernel][NET] Bug report on packet defragmenting
Date: Thu, 8 Nov 2018 07:12:29 -0800
Message-ID:
References: <1771721f-40fd-0042-b603-5ed763c54378@gmail.com>
 <91b43bec-cb19-b94b-8ee3-26979e3a19d1@gmail.com>
 <20181108012927epcms1p47f719c1908da64a378690362901644ee@epcms1p4>
 <20181108020523epcms1p55a0c28d3e881a079231fe813258602f6@epcms1p5>
 <20181108041001epcms1p6c83831e3ef0d66b9591c2aca25d5841b@epcms1p6>
 <8b2209af-1221-f4f5-54e5-d9f5a503373e@gmail.com>
 <20181108075837epcms1p2747d212aee83ba0df60cc14ffac316aa@epcms1p2>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
To: soukjin.bae@samsung.com, Eric Dumazet, "netdev@vger.kernel.org"
Return-path:
Received: from mail-pf1-f196.google.com ([209.85.210.196]:34659 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726421AbeKIAs2 (ORCPT ); Thu, 8 Nov 2018 19:48:28 -0500
Received: by mail-pf1-f196.google.com with SMTP id y18-v6so7035266pfn.1 for ; Thu, 08 Nov 2018 07:12:31 -0800 (PST)
In-Reply-To: <20181108075837epcms1p2747d212aee83ba0df60cc14ffac316aa@epcms1p2>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 11/07/2018 11:58 PM, 배석진 wrote:
>> --------- Original Message ---------
>> Sender : Eric Dumazet
>> Date : 2018-11-08 15:13 (GMT+9)
>> Title : Re: (2) (2) [Kernel][NET] Bug report on packet defragmenting
>>
>> On 11/07/2018 08:26 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 11/07/2018 08:10 PM, 배석진 wrote:
>>>>> --------- Original Message ---------
>>>>> Sender : Eric Dumazet
>>>>> Date   : 2018-11-08 12:57 (GMT+9)
>>>>> Title  : Re: (2) [Kernel][NET] Bug report on packet defragmenting
>>>>>
>>>>> On 11/07/2018 07:24 PM, Eric Dumazet wrote:
>>>>>
>>>>>> Sure, it is better if RPS is smarter, but if there is a bug in IPv6 defrag unit
>>>>>> we must investigate and root-cause it.
>>>>>
>>>>> BTW, IPv4 defrag seems to have the same issue.
>>>>
>>>>
>>>> yes, it could be.
>>>> the key point isn't limited to ipv6.
>>>>
>>>> maybe because of the faster air-network and modem,
>>>> it looks like it occurs more often, and we noticed it.
>>>>
>>>> anyway,
>>>> we'll apply our patch to resolve this problem.
>>>
>>> Yeah, and I will fix the defrag units.
>>>
>>> We can not rely on other layers doing proper no-reorder logic for us.
>>>
>>> Problem here is that multiple cpus attempt concurrent rhashtable_insert_fast()
>>> and do not properly recover in case -EEXIST is returned.
>>>
>>> This is silly, of course :/
>>
>> Patch would be https://patchwork.ozlabs.org/patch/994658/
>
>
> Dear Dumazet,
>
> with your patch, the kernel panics when a packet is received.
> I double-checked with your patch disabled, and then there is no problem.
>
>
> <6>[ 119.702054] I[3: kworker/u18:1: 1705] LNK-RX(1464): 6b 80 00 00 05 90 2c 3e 20 01 44 30 00 05 04 01 ...
> <6>[ 119.702120] I[3: kworker/u18:1: 1705] __skb_flow_dissect: ports: 77500000
> <6>[ 119.702153] I[3: kworker/u18:1: 1705] get_rps_cpu: cpu:2, hash:2055028308
> <6>[ 119.702203] I[3: kworker/u18:1: 1705] LNK-RX(1212): 6b 80 00 00 04 94 2c 3e 20 01 44 30 00 05 04 01 ...
> <6>[ 119.702231] I[3: kworker/u18:1: 1705] __skb_flow_dissect: ports: 3c7e2c6b
> <6>[ 119.702258] I[3: kworker/u18:1: 1705] get_rps_cpu: cpu:1, hash:671343869
> <6>[ 119.702365] I[1: Binder:11369_2:11382] ipv6_rcv +++
> <6>[ 119.702375] I[2: swapper/2: 0] ipv6_rcv +++
> <6>[ 119.702406] I[2: swapper/2: 0] ipv6_defrag +++
> <6>[ 119.702425] I[1: Binder:11369_2:11382] ipv6_defrag +++
> <6>[ 119.702494] I[2: swapper/2: 0] ipv6_defrag: EINPROGRESS
> <6>[ 119.702522] I[2: swapper/2: 0] ipv6_rcv ---
> <6>[ 119.702628] I[1: Binder:11369_2:11382] ipv6_defrag ---
> <6>[ 119.702892] I[1: Binder:11369_2:11382] ipv6_defrag +++
> <6>[ 119.702922] I[1: Binder:11369_2:11382] ipv6_defrag ---
> <6>[ 119.702966] I[1: Binder:11369_2:11382] ipv6_rcv ---
> <0>[ 119.703792] [1: Binder:11369_2:11382] BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:518
> <3>[ 119.703826] [1: Binder:11369_2:11382] in_atomic(): 0, irqs_disabled(): 0, pid: 11382, name: Binder:11369_2
> <3>[ 119.703854] [1: Binder:11369_2:11382] Preemption disabled at:
> <4>[ 119.703888] [1: Binder:11369_2:11382] [] __do_softirq+0x68/0x3c4
> <4>[ 119.703934] [1: Binder:11369_2:11382] CPU: 1 PID: 11382 Comm: Binder:11369_2 Tainted: G S W 4.14.75-20181108-163447-eng #0
> <4>[ 119.703960] [1: Binder:11369_2:11382] Hardware name: Samsung BEYOND2LTE KOR SINGLE 19 board based on EXYNOS9820 (DT)
> <4>[ 119.703987] [1: Binder:11369_2:11382] Call trace:
> <4>[ 119.704015] [1: Binder:11369_2:11382] [] dump_backtrace+0x0/0x280
> <4>[ 119.704045] [1: Binder:11369_2:11382] [] show_stack+0x18/0x24
> <4>[ 119.704074] [1: Binder:11369_2:11382] [] dump_stack+0xb8/0xf8
> <4>[ 119.704104] [1: Binder:11369_2:11382] [] ___might_sleep+0x16c/0x178
> <4>[ 119.704132] [1: Binder:11369_2:11382] [] __might_sleep+0x4c/0x84
> <4>[ 119.704164] [1: Binder:11369_2:11382] [] do_page_fault+0x2e8/0x4b8
> <4>[ 119.704193] [1: Binder:11369_2:11382] [] do_translation_fault+0x7c/0x100
> <4>[ 119.704219] [1: Binder:11369_2:11382] [] do_mem_abort+0x4c/0x12c
> <4>[ 119.704243] [1: Binder:11369_2:11382] Exception stack(0xffffff8038bf3ec0 to 0xffffff8038bf4000)
> <4>[ 119.704266] [1: Binder:11369_2:11382] 3ec0: 00000077b8262600 00000077b1bd0800 00000000708fcae0 0000000000000018
> ...
> <4>[ 119.704459] [1: Binder:11369_2:11382] 3fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> <4>[ 119.704480] [1: Binder:11369_2:11382] [] el0_da+0x20/0x24
> <4>[ 119.704509] [1: Binder:11369_2:11382] ------------[ cut here ]------------
> <0>[ 119.704541] [1: Binder:11369_2:11382] kernel BUG at kernel/sched/core.c:6152!
> <2>[ 119.704563] [1: Binder:11369_2:11382] sec_debug_set_extra_info_fault = BUG / 0xffffff800811f180
> <0>[ 119.704603] [1: Binder:11369_2:11382] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
>
>

Thanks for testing.

This is not a pristine net-next tree, this dump seems unrelated to the patch?
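
As an aside on the race described earlier in this thread (several CPUs calling rhashtable_insert_fast() concurrently and not recovering when -EEXIST is returned), here is a minimal userspace sketch of the find-or-create pattern. It is not the patch linked above and it does not use the kernel rhashtable API: frag_table, frag_queue, table_lookup(), table_insert(), frag_find_or_create() and the race hook are all hypothetical names made up for illustration, and the toy table assumes keys smaller than TABLE_SLOTS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SLOTS 16

/* Hypothetical stand-in for a per-flow fragment queue. */
struct frag_queue {
	unsigned int key;
	int nfrags;
};

/* Toy table, one slot per key. It only mimics the -EEXIST behaviour. */
struct frag_table {
	struct frag_queue *slot[TABLE_SLOTS];
};

struct frag_queue *table_lookup(struct frag_table *t, unsigned int key)
{
	assert(key < TABLE_SLOTS);
	return t->slot[key];
}

/* Returns 0 on success, -EEXIST if the key is already present. */
int table_insert(struct frag_table *t, struct frag_queue *q)
{
	assert(q->key < TABLE_SLOTS);
	if (t->slot[q->key])
		return -EEXIST;
	t->slot[q->key] = q;
	return 0;
}

/* Optional hook run between lookup and insert to simulate another CPU. */
typedef void (*race_hook_t)(struct frag_table *t, unsigned int key);

struct frag_queue *frag_find_or_create(struct frag_table *t,
				       unsigned int key, race_hook_t hook)
{
	struct frag_queue *q = table_lookup(t, key);

	if (q)
		return q;

	q = calloc(1, sizeof(*q));
	if (!q)
		return NULL;
	q->key = key;

	if (hook)
		hook(t, key);	/* another CPU may insert the same key here */

	if (table_insert(t, q) == -EEXIST) {
		/* Lost the race: drop our copy and reuse the winner's queue. */
		free(q);
		return table_lookup(t, key);
	}
	return q;
}

/* Simulated competing CPU: inserts a queue that already holds one fragment. */
void other_cpu_wins(struct frag_table *t, unsigned int key)
{
	struct frag_queue *q = calloc(1, sizeof(*q));

	if (!q)
		abort();
	q->key = key;
	q->nfrags = 1;
	if (table_insert(t, q) != 0)
		abort();
}
```

Calling frag_find_or_create(&t, 3, other_cpu_wins) on an empty table returns the queue inserted by the simulated competitor instead of failing, which is the recovery step the quoted paragraph says was missing. In a real concurrent table the lookup after -EEXIST also has to cope with the winner's entry disappearing again, which this single-threaded sketch does not model.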