From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753050AbdKKKUG (ORCPT ); Sat, 11 Nov 2017 05:20:06 -0500
Received: from shards.monkeyblade.net ([184.105.139.130]:46282
        "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751349AbdKKKUE (ORCPT );
        Sat, 11 Nov 2017 05:20:04 -0500
Date: Sat, 11 Nov 2017 19:20:00 +0900 (KST)
Message-Id: <20171111.192000.1052079702439602290.davem@davemloft.net>
To: herbert.tencent@gmail.com
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] netdev: add netdev_pagefrag_enabled sysctl
From: David Miller
In-Reply-To: <05cb873f-3edf-f115-305c-81b5ace8d76e@gmail.com>
References: <05cb873f-3edf-f115-305c-81b5ace8d76e@gmail.com>
X-Mailer: Mew version 6.7 on Emacs 25.3 / Mule 6.0 (HANACHIRUSATO)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH, not delayed by
        milter-greylist-4.5.12 (shards.monkeyblade.net [149.20.54.216]);
        Sat, 11 Nov 2017 02:20:04 -0800 (PST)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Hongbo Li
Date: Thu, 9 Nov 2017 16:12:27 +0800

> From: Hongbo Li
>
> This patch solves a memory frag issue when allocating skb.
> I found this issue in a udp scenario, here is my test model:
> 1. About five hundred udp threads listen on the server,
> and five hundred client threads send udp pkts to them.
> Some threads send pkts at a faster speed than others.
> 2. The user processes on the server can't receive these
> pkts fast enough.
>
> Then I got the following result:
> 1. Some udp sockets' recv-q reach the queue's limit, others
> do not because of the global rmem limit.
> 2. The "free" command shows "used" memory is more than 62GB.
> But cat /proc/net/sockstat shows that udp uses only 12GB.
>
> This will confuse users about why the system consumes so
> much memory. This is caused by the memory frags in netdev layer.
> __netdev_alloc_frag() allocs a page block which has 8 pages.
>
> Then in this scenario, most skbs are freed when the recv-q
> is full, but if any skb in the same page block is queued to
> another recv-q which is not full, the whole page block can't
> be freed.
>
> So from the view of kernel, these pages are used, but from
> the view of tcp/udp, only the skbs in recv-q are used.
>
> To avoid exhausting memory in such a scenario, I add a sysctl
> that lets the user disable allocating skbs in page frag.
>
> Signed-off-by: Hongbo Li

When something like page fragments don't work properly, we fix them
rather than providing a way to disable them.

Thank you.
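
For readers following the thread, the pinning effect described in the quoted patch can be sketched with a small user-space model. This is only an illustration, not kernel code: the 8-page block comes from the patch description, while the 4 KiB page size, the 2048-byte fragment size, the `simulate` helper, and the "keep one packet in every N" pattern are assumptions made up for the example. It also ignores the extra reference the allocator itself holds on the block it is currently carving.

```python
PAGE_SIZE = 4096         # assumed page size in bytes
PAGES_PER_BLOCK = 8      # block size quoted in the patch description
BLOCK_SIZE = PAGE_SIZE * PAGES_PER_BLOCK
FRAG_SIZE = 2048         # hypothetical per-skb fragment size


def simulate(num_packets, keep_every):
    """Carve fragments sequentially out of page blocks.

    One packet in every `keep_every` stays queued (a socket whose
    recv-q is not full); all other packets are dropped and freed at once.
    Returns (bytes still queued, bytes pinned in page blocks).
    """
    frags_per_block = BLOCK_SIZE // FRAG_SIZE
    live = {}  # block index -> number of fragments still queued
    for i in range(num_packets):
        block = i // frags_per_block
        if i % keep_every == 0:
            live[block] = live.get(block, 0) + 1
    # A block can only go back to the page allocator once no fragment
    # in it is live, so every block present in `live` is pinned whole.
    queued_bytes = sum(live.values()) * FRAG_SIZE
    pinned_bytes = len(live) * BLOCK_SIZE
    return queued_bytes, pinned_bytes


queued, pinned = simulate(num_packets=16000, keep_every=16)
print(f"queued: {queued} bytes, pinned: {pinned} bytes")
```

In this model one surviving fragment per block is enough to keep the whole 32 KiB block allocated, which is the same kind of gap as the one reported between "free" and /proc/net/sockstat.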