From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ed1-f67.google.com ([209.85.208.67]:44759 "EHLO mail-ed1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729714AbeGQVLz (ORCPT ); Tue, 17 Jul 2018 17:11:55 -0400
MIME-Version: 1.0
References: <20180717120651.15748-1-dsahern@kernel.org> <1a3f59a9-0ba5-c83f-16a6-f9550a84f693@gmail.com> <1a27e301-3275-b349-a2f8-afdfdc02f04f@gmail.com>
In-Reply-To: <1a27e301-3275-b349-a2f8-afdfdc02f04f@gmail.com>
From: Cong Wang
Date: Tue, 17 Jul 2018 13:37:26 -0700
Message-ID:
Subject: Re: [PATCH RFC/RFT net-next 00/17] net: Convert neighbor tables to per-namespace
Content-Type: text/plain; charset="UTF-8"
Sender: linux-wpan-owner@vger.kernel.org
List-ID:
To: David Ahern
Cc: Linux Kernel Network Developers , nikita.leshchenko@oracle.com, Roopa Prabhu , Stephen Hemminger , Ido Schimmel , Jiri Pirko , Saeed Mahameed , Alexander Aring , linux-wpan@vger.kernel.org, NetFilter , LKML

On Tue, Jul 17, 2018 at 12:02 PM David Ahern wrote:
> As for the per-namespace tables, it is 4 years later and over that time
> Linux supports a number of features: EVPN which is very mac heavy, VRR
> which doubles mac entries (one against the VRR device and one against
> the lower device) and NOS level features such as mlxsw which has to
> ensure mac entries for nexthop gateways stay active. In addition there
> are other features on the horizon - like the ability to use namespaces
> to create virtual switches (what Cisco calls a VDC) where you absolutely
> want isolation and not allowing entries from virtual switch to evict
> entries from another. And of course the continued proliferation of
> containerized workloads where isolation is desired.

As long as there is no change to the neigh table code base itself,
these can't address the concern people raised before.
>
> I understand the concern about global resource and limits: as it stands
> you have to increase the limits in init_net to the max expected and hope
> for the best. With per namespace limits you can lower the limits of each
> namespace to better control the total impact on the total memory used.

The problem is that the number of containers on a host is usually not
predictable. Of course, you can say containers limit kernel memory too,
but memcg is not part of netns.

I once told David Miller that cpuset is the mechanism for isolating
per-CPU softnet_data, and he didn't like it. Based on that, I don't
think you can convince him with memcg as a solution here.