From: Ding Tianhong <dingtianhong@huawei.com>
To: <paulmck@linux.vnet.ibm.com>
Cc: <josh@joshtriplett.org>, <rostedt@goodmis.org>,
	<mathieu.desnoyers@efficios.com>, <jiangshanlai@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	<davem@davemloft.net>
Subject: Re: [PATCH] rcu: fix the OOM problem of huge IP abnormal packet traffic
Date: Sat, 19 Nov 2016 15:50:32 +0800
Message-ID: <809d327e-d4e2-51a5-bbfd-9ff143ee55da@huawei.com>
In-Reply-To: <20161118130144.GO3612@linux.vnet.ibm.com>



On 2016/11/18 21:01, Paul E. McKenney wrote:
> On Fri, Nov 18, 2016 at 08:40:09PM +0800, Ding Tianhong wrote:
>> Commit bedc196915 ("rcu: Fix soft lockup for rcu_nocb_kthread")
>> introduces a new problem: when huge abnormal IP packet traffic arrives,
>> it may cause OOM and break the kernel, like this:
>>
>> [   79.441538] mlx4_en: eth5: Leaving promiscuous mode steering mode:2
>> [  100.067032] ksoftirqd/0: page allocation failure: order:0, mode:0x120
>> [  100.067038] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G           OE  ----V-------   3.10.0-327.28.3.28.x86_64 #1
>> [  100.067039] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-20161018_184732-HGH1000003483 04/01/2014
>> [  100.067041]  0000000000000120 00000000b080d798 ffff8802afd5b968 ffffffff81638cb9
>> [  100.067045]  ffff8802afd5b9f8 ffffffff81171380 0000000000000010 0000000000000000
>> [  100.067048]  ffff8802befd8000 00000000ffffffff 0000000000000001 00000000b080d798
>> [  100.067050] Call Trace:
>> [  100.067057]  [<ffffffff81638cb9>] dump_stack+0x19/0x1b
>> [  100.067062]  [<ffffffff81171380>] warn_alloc_failed+0x110/0x180
>> [  100.067066]  [<ffffffff81175b16>] __alloc_pages_nodemask+0x9b6/0xba0
>> [  100.067070]  [<ffffffff8151e400>] ? skb_add_rx_frag+0x90/0xb0
>> [  100.067075]  [<ffffffff811b6fba>] alloc_pages_current+0xaa/0x170
>> [  100.067080]  [<ffffffffa06b9be0>] mlx4_alloc_pages.isra.24+0x40/0x170 [mlx4_en]
>> [  100.067083]  [<ffffffffa06b9dec>] mlx4_en_alloc_frags+0xdc/0x220 [mlx4_en]
>> [  100.067086]  [<ffffffff8152eeb8>] ? __netif_receive_skb+0x18/0x60
>> [  100.067088]  [<ffffffff8152ef40>] ? netif_receive_skb+0x40/0xc0
>> [  100.067092]  [<ffffffffa06bb521>] mlx4_en_process_rx_cq+0x5f1/0xec0 [mlx4_en]
>> [  100.067095]  [<ffffffff8131027d>] ? list_del+0xd/0x30
>> [  100.067098]  [<ffffffff8152c90f>] ? __napi_complete+0x1f/0x30
>> [  100.067101]  [<ffffffffa06bbeef>] mlx4_en_poll_rx_cq+0x9f/0x170 [mlx4_en]
>> [  100.067103]  [<ffffffff8152f372>] net_rx_action+0x152/0x240
>> [  100.067107]  [<ffffffff81084d1f>] __do_softirq+0xef/0x280
>> [  100.067109]  [<ffffffff81084ee0>] run_ksoftirqd+0x30/0x50
>> [  100.067114]  [<ffffffff810ae93f>] smpboot_thread_fn+0xff/0x1a0
>> [  100.067117]  [<ffffffff8163e269>] ? schedule+0x29/0x70
>> [  100.067120]  [<ffffffff810ae840>] ? lg_double_unlock+0x90/0x90
>> [  100.067122]  [<ffffffff810a5d4f>] kthread+0xcf/0xe0
>> [  100.067124]  [<ffffffff810a5c80>] ? kthread_create_on_node+0x140/0x140
>> [  100.067127]  [<ffffffff81649198>] ret_from_fork+0x58/0x90
>> [  100.067129]  [<ffffffff810a5c80>] ? kthread_create_on_node+0x140/0x140
>>
>> ================================cut here=====================================
>>
>> The reason is that the huge abnormal IP packet traffic is received into the net stack
>> and finally dropped by dst_release, which relies on the rcuos
>> callback-offload kthread to free the packets. But cond_resched_rcu_qs() ends up
>> calling do_softirq(), which receives more and more abnormal IP packets that are
>> later thrown into the RCU callbacks again. The number of packets received is much
>> greater than the number of packets freed, so memory is exhausted and the kernel goes OOM.
>> Not processing any pending softirqs in the rcuos callback-offload kthread
>> is therefore a more effective solution.
> 
> OK, but we could still have softirqs processed by the grace-period kthread
> as a result of any number of other events.  So this change might reduce
> the probability of this problem, but it doesn't eliminate it.
> 
> How huge are these huge IP packets?  Is the underlying problem that they
> are too large to use the memory-allocator fastpaths?
> 
> 							Thanx, Paul
> 

I am using a 40G Mellanox NIC to receive packets, and the test engine can send MAC-abnormal and
IP-abnormal packets at full speed.

The MAC-abnormal packets are dropped at a low level and never reach the net stack,
but the IP-abnormal packets trigger this problem: each packet is first treated as a new dst and is
released later by dst_release because it is meaningless.

dst_release->call_rcu(&dst->rcu_head, dst_destroy_rcu);
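
To make the deferral concrete, here is a minimal user-space sketch of that path. It is
only a model, not the kernel implementation: struct cb_head, struct fake_dst, defer_call(),
fake_dst_release() and run_callbacks() are hypothetical stand-ins for struct rcu_head, the
dst entry, call_rcu(), dst_release() and the callback-offload kthread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head: one queued callback. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Hypothetical stand-in for a dst entry with an embedded callback head. */
struct fake_dst {
	int id;
	struct cb_head head;
};

static struct cb_head *cb_list;             /* oldest pending callback */
static struct cb_head **cb_tail = &cb_list; /* where the next one is linked */
static size_t cb_pending;                   /* objects queued but not yet freed */

/* Model of call_rcu(): only queues the callback, frees nothing. */
static void defer_call(struct cb_head *head, void (*func)(struct cb_head *))
{
	head->func = func;
	head->next = NULL;
	*cb_tail = head;
	cb_tail = &head->next;
	cb_pending++;
}

/* Model of dst_destroy_rcu(): recover the enclosing object and free it. */
static void fake_dst_destroy(struct cb_head *head)
{
	struct fake_dst *dst =
		(struct fake_dst *)((char *)head - offsetof(struct fake_dst, head));

	free(dst);
}

/* Model of dst_release(): the memory stays allocated until the callback runs. */
static void fake_dst_release(struct fake_dst *dst)
{
	defer_call(&dst->head, fake_dst_destroy);
}

/* Model of the callback kthread: invoke at most max callbacks, return the count. */
static size_t run_callbacks(size_t max)
{
	size_t done = 0;

	while (cb_list && done < max) {
		struct cb_head *head = cb_list;

		cb_list = head->next;   /* unlink before func() frees the object */
		if (!cb_list)
			cb_tail = &cb_list;
		cb_pending--;
		head->func(head);
		done++;
	}
	return done;
}
```

In this model cb_pending only shrinks inside run_callbacks(), so anything that adds entries
faster than run_callbacks() drains them makes the backlog grow.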

So no packet is freed until the rcuos callback-offload kthread processes it. This becomes an infinite loop
when huge packet traffic keeps coming in, because do_softirq loads more and more packets onto the rcuos kthread.
I still have not found a better way to fix this. By the way, it is hard to say that the driver's allocations are
too large for the memory-allocator fastpaths; there is no memory leak, and ixgbe may hit the same problem too.
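
The loop described above can be illustrated with a toy model. This is only a sketch under
stated assumptions: the function name steps_until_exhausted() and every rate and budget
passed to it are hypothetical, not measurements from this test setup.

```c
#include <stddef.h>

/*
 * Toy model of the backlog seen by the callback kthread.
 *
 * Each step frees up to freed_per_step queued objects, then a softirq run
 * admits arrivals_per_step new ones. Returns the first step at which the
 * backlog exceeds budget, or 0 if that never happens within max_steps.
 * All parameters are illustrative assumptions.
 */
static size_t steps_until_exhausted(size_t arrivals_per_step,
				    size_t freed_per_step,
				    size_t budget,
				    size_t max_steps)
{
	size_t backlog = 0;

	for (size_t step = 1; step <= max_steps; step++) {
		/* Callback processing: cannot free more than is queued. */
		backlog -= (backlog < freed_per_step) ? backlog : freed_per_step;

		/* Softirq processing: new packets queue more callbacks. */
		backlog += arrivals_per_step;

		if (backlog > budget)
			return step;
	}
	return 0;
}
```

When arrivals_per_step is larger than freed_per_step the backlog grows by the difference on
every step, so any finite budget is eventually exceeded; when it is not larger, the backlog
stays bounded and the function returns 0.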

Thanks.
Ding


>> Fixes: bedc196915 ("rcu: Fix soft lockup for rcu_nocb_kthread")
>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>> ---
>>  kernel/rcu/tree_plugin.h | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>> index 85c5a88..760c3b5 100644
>> --- a/kernel/rcu/tree_plugin.h
>> +++ b/kernel/rcu/tree_plugin.h
>> @@ -2172,8 +2172,7 @@ static int rcu_nocb_kthread(void *arg)
>>  			if (__rcu_reclaim(rdp->rsp->name, list))
>>  				cl++;
>>  			c++;
>> -			local_bh_enable();
>> -			cond_resched_rcu_qs();
>> +			_local_bh_enable();
>>  			list = next;
>>  		}
>>  		trace_rcu_batch_end(rdp->rsp->name, c, !!list, 0, 0, 1);
>> -- 
>> 1.9.0
>>
>>
>>
> 
> 
> .
> 
