In-Reply-To: <20190228170520.527ed6df@cakuba.netronome.com>
From: Siwei Liu
Date: Fri, 1 Mar 2019 16:30:37 -0800
Subject: Re: [virtio-dev] Re: net_failover slave udev renaming (was Re: [RFC PATCH net-next v6 4/4] netvsc: refactor notifier/event handling code to use the bypass framework)
To: Jakub Kicinski
Cc: "Michael S.
Tsirkin", si-wei liu, "Samudrala, Sridhar", Jiri Pirko,
    Stephen Hemminger, David Miller, Netdev,
    virtualization@lists.linux-foundation.org, "Brandeburg, Jesse",
    Alexander Duyck, Jason Wang, liran.alon@oracle.com
X-Mailing-List: netdev@vger.kernel.org

On Thu, Feb 28, 2019 at 5:05 PM Jakub Kicinski wrote:
>
> On Thu, 28 Feb 2019 16:20:28 -0800, Siwei Liu wrote:
> > On Thu, Feb 28, 2019 at 11:56 AM Jakub Kicinski wrote:
> > > On Thu, 28 Feb 2019 14:36:56 -0500, Michael S. Tsirkin wrote:
> > > > > It is a bit of a chicken or the egg situation ;) But users can
> > > > > just blacklist, too. Anyway, I think this is far better than module
> > > > > parameters
> > > >
> > > > Sorry I'm a bit confused. What is better than what?
> > >
> > > I mean that blacklist net_failover or module param to disable
> > > net_failover and handle in user space are better than trying to solve
> > > the renaming at kernel level (either by adding module params that make
> > > the kernel rename devices or letting user space change names of running
> > > devices if they are slaves).
> >
> > Before I was asked to revive this old mail thread, I knew the
> > discussion could end up with something like this. Yes, theoretically
> > there's a point - basically you don't believe kernel should take risk
> > in fixing the issue, so you push back the hope to something in
> > hypothesis that actually wasn't done and hard to get done in reality.
> > It's not too different than saying "hey, what you're asking for is
> > simply wrong, don't do it! Go back to modify userspace to create a
> > bond or team instead!" FWIW I want to emphasize that the debate for
> > what should be the right place to implement this failover facility:
> > userspace versus kernel, had been around for almost a decade, and no
> > real work ever happened in userspace to "standardize" this in the
> > Linux world.
>
> Let me offer you my very subjective opinion of why "no real work ever
> happened in user space". The actors who have primary interest to get
> the auto-bonding working are HW vendors trying to either convince
> customers to use SR-IOV, or being pressured by customers to make SR-IOV
> easier to consume. HW vendors hire driver developers, not user space
> developers. So the solution we arrive at is in the kernel for a non
> technical reason (Conway's law, sort of).
>
> $ cd NetworkManager/
> $ git log --pretty=format:"%ae" | \
>     grep '\(mellanox\|intel\|broadcom\|netronome\)' | sort | uniq -c
>      81 andrew.zaborowski@intel.com
>       2 David.Woodhouse@intel.com
>       2 ismo.puustinen@intel.com
>       1 michael.i.doherty@intel.com
>
> Andrew works on WiFi.
>

I'm sorry, but we don't use NetworkManager in our cloud images at all.
We suffered from lots of problems when booting from a remote iSCSI disk
with NetworkManager enabled, and it looks like those issues are still
there, since (my subjective impression) that is not something a network
config tool mainly targeting desktop and WiFi users ever cares about.
At the very least, it is a sign that this use case was not sufficiently
tested there.

From a cloud service provider's perspective, we always prefer a single
central solution over talking to various distro vendors, each with
their own network daemons/config tools and thus different solutions.
It's hard to coordinate all the efforts in one place. From my personal
perspective, the in-kernel auto-slave solution is in no way technically
inferior to any userspace implementation, and every major OS/cloud
provider chose to implement this in-kernel model for the same reason. I
don't want to argue further about whether there's value in net_failover
being in the Linux kernel; given that it's already there, I think it's
better to move on. We have done extensive work in reporting issues
(actually, fixing them internally before posting) to the dracut, udev,
initramfs-tools, and cloud-init communities.
Although the 3-netdev model is claimed to be transparent to userspace
in general, the reality is the opposite: the effort is no different
from bringing up a new type of virtual bond, rather than the regular
physical netdev that any existing userspace tool would otherwise
expect. If anyone is concerned about breaking userspace, I bet they
have never tried to actually use it. If they had, they would know what
I am saying. The duplicate MAC address setting and the plugging order
are totally new to userspace, so none of the userspace tools know how
to plumb the failover interface properly unless they are fixed one way
or another.

-Siwei

> I have asked the NetworkManager folks to implement this feature last
> year when net_failover got dangerously close to getting merged, and
> they said they were never approached with this request before, much less
> offered code that solve it. Unfortunately before they got around to it
> net_failover was merged already, and they didn't proceed.
>
> So to my knowledge nobody ever tried to solve this in user space.
> I don't think net_failover is particularly terrible, or that renaming
> of primary in the kernel is the end of the world, but I'd appreciate if
> you could point me to efforts to solve it upstream in user space
> components, or acknowledge that nobody actually tried that.