From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pl0-f66.google.com ([209.85.160.66]:32924 "EHLO
 mail-pl0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S2388332AbeGXQVC (ORCPT ); Tue, 24 Jul 2018 12:21:02 -0400
Subject: Re: [PATCH RFC/RFT net-next 00/17] net: Convert neighbor tables to
 per-namespace
References: <1a3f59a9-0ba5-c83f-16a6-f9550a84f693@gmail.com>
 <1a27e301-3275-b349-a2f8-afdfdc02f04f@gmail.com>
 <20180718.125938.2271502580775162784.davem@davemloft.net>
 <28c30574-391c-b4bd-c337-51d3040d901a@gmail.com>
From: David Ahern
Message-ID: <5021d874-8e99-6eba-f24b-4257c62d4457@gmail.com>
Date: Tue, 24 Jul 2018 09:14:01 -0600
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-wpan-owner@vger.kernel.org
List-ID:
To: Cong Wang
Cc: David Miller, Linux Kernel Network Developers,
 nikita.leshchenko@oracle.com, Roopa Prabhu, Stephen Hemminger,
 Ido Schimmel, Jiri Pirko, Saeed Mahameed, Alexander Aring,
 linux-wpan@vger.kernel.org, NetFilter, LKML

On 7/19/18 11:12 AM, Cong Wang wrote:
> On Thu, Jul 19, 2018 at 9:16 AM David Ahern wrote:
>>
>> Chatting with Nikolay about this and he brought up a good corollary - ip
>> fragmentation. It really is a similar problem in that memory is consumed
>> as a result of packets received from an external entity. The ipfrag
>> sysctls are per namespace with a limit that non-init_net namespaces can
>> not set high_thresh > the current value of init_net. Potential memory
>> consumed by fragments scales with the number of namespaces which is the
>> primary concern with making neighbor tables per namespace.
>
> Nothing new, already discussed:
> https://marc.info/?l=linux-netdev&m=140391416215988&w=2
>
> :)
>

Neighbor tables, bridge fdbs, vxlan fdbs and ip fragments all consume
local memory resources due to received packets.
Bridge and vxlan fdbs are fairly straightforward analogs to neighbor
entries; they are per device with no limits on the number of entries.
Fragments have memory limits per namespace. So neighbor tables are the
only ones with this strict limitation and concern on memory consumption.

I get the impression there is no longer strong resistance against
moving the tables to per namespace, and that the open question is the
right approach to handle backwards compatibility. Correct?

Changing the accounting is inevitably going to be noticeable to some use
case(s), but with sysctl settings it is a simple runtime update once the
user knows to make the change.

Neighbor entries round up to 512 byte allocations, so with the current
gc_thresh defaults (128/512/1024) up to 512k can be consumed (1024
entries x 512 bytes). Using those limits per namespace seems high, which
is why I suggested a per-namespace default of (16/32/64), which amounts
to a 32k per-namespace limit by default (64 entries x 512 bytes).

Open to other suggestions as well.
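
For anyone wanting to check the numbers, the arithmetic above can be
sketched as follows. This is only a rough model, assuming gc_thresh3
acts as the hard cap on entries and counting a single table per
namespace; the function name and the 100-namespace host are
hypothetical, chosen purely for illustration.

```python
# Rough worst-case estimate of neighbor-table memory.
# Assumption: each entry costs one 512-byte allocation and a table
# never holds more than gc_thresh3 entries.
ENTRY_BYTES = 512


def worst_case_bytes(gc_thresh3, namespaces=1, entry_bytes=ENTRY_BYTES):
    """Upper bound if every namespace fills one table to gc_thresh3."""
    return gc_thresh3 * entry_bytes * namespaces


current = worst_case_bytes(1024)   # current default gc_thresh3
proposed = worst_case_bytes(64)    # suggested per-namespace gc_thresh3
print(f"current:  {current // 1024}k")    # 512k
print(f"proposed: {proposed // 1024}k")   # 32k

# Hypothetical host with 100 namespaces (illustrative number only).
for label, thresh3 in (("current", 1024), ("proposed", 64)):
    total = worst_case_bytes(thresh3, namespaces=100)
    print(f"{label} defaults x 100 namespaces: {total // 1024}k")
```

The last loop shows why the per-namespace default matters: the bound
scales linearly with the number of namespaces, so the smaller defaults
keep the aggregate far lower for the same namespace count.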