From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rob Miller
Date: Fri, 22 Feb 2019 04:58:15 -0800
Subject: Re: [virtio-dev] Re: net_failover slave udev renaming (was Re: [RFC PATCH net-next v6 4/4] netvsc: refactor notifier/event handling code to use the bypass framework)
To: si-wei liu
Cc: "Samudrala, Sridhar", "Michael S. Tsirkin", Siwei Liu, Jiri Pirko,
 Stephen Hemminger, David Miller, Netdev,
 virtualization@lists.linux-foundation.org, virtio-dev,
 "Brandeburg, Jesse", Alexander Duyck, Jakub Kicinski, Jason Wang,
 liran.alon@oracle.com
References: <1523386790-12396-1-git-send-email-sridhar.samudrala@intel.com>
 <1523386790-12396-5-git-send-email-sridhar.samudrala@intel.com>
 <20180410142608.50f15b45@xeon-e3>
 <20180411075334.GK2028@nanopsycho>
 <20190221203808-mutt-send-email-mst@kernel.org>
 <581e4399-3969-aecd-e923-03bbc0880733@oracle.com>
 <91d4cbb1-be7a-b53c-6b2a-99bef07e7c53@intel.com>
Content-Type: text/plain; charset="UTF-8"
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

I don't know enough about how they get named, but is it possible for
user space to suggest its interface name, such that the interface name
would be as unique as the VM name itself and is scoped to the network
boundary of an organization?

In other words, as a company, I decide to name my VMs co-vm-1 through
co-vm-xxx; I would leave off the location of the VM because that will
change. My interfaces then would be named co-vm-1.0 through co-vm-1.x.

Just thinking out loud.

Sent from my iPhone

> On Feb 22, 2019, at 2:55 AM, si-wei liu wrote:
>
>> On 2/21/2019 11:00 PM, Samudrala, Sridhar wrote:
>>
>>> On 2/21/2019 7:33 PM, si-wei liu wrote:
>>>
>>>> On 2/21/2019 5:39 PM, Michael S. Tsirkin wrote:
>>>>> On Thu, Feb 21, 2019 at 05:14:44PM -0800, Siwei Liu wrote:
>>>>> Sorry for replying to this ancient thread. There was some remaining
>>>>> issue that I don't think the initial net_failover patch addressed
>>>>> cleanly, see:
>>>>>
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1815268
>>>>>
>>>>> The renaming of 'eth0' to 'ens4' fails because the udev userspace was
>>>>> not specifically written for such kernel automatic enslavement.
>>>>> Specifically, if it is a bond or team, the slave would typically get
>>>>> renamed *before* the virtual device gets created; that's what udev can
>>>>> control (without getting the netdev opened early by the other part of
>>>>> the kernel), and other userspace components, e.g. initramfs and
>>>>> init-scripts, can coordinate well in between.
>>>>> The in-kernel
>>>>> auto-enslavement of net_failover breaks this userspace convention,
>>>>> which doesn't provide a solution if users care about consistent naming
>>>>> on the slave netdevs specifically.
>>>>>
>>>>> Previously this issue had been specifically called out when IFF_HIDDEN
>>>>> and the 1-netdev model were proposed, but no one has given a solution
>>>>> to this problem ever since. Please share your thoughts on how to
>>>>> proceed and solve this userspace issue if netdev does not welcome a
>>>>> 1-netdev model.
>>>> Above says:
>>>>
>>>> there's no motivation in the systemd/udevd community at
>>>> this point to refactor the rename logic and make it work well with
>>>> 3-netdev.
>>>>
>>>> What would the fix be? Skip slave devices?
>>>>
>>> There's nothing the user gains by just skipping slave devices - the name
>>> is still unchanged and unpredictable, e.g. eth0, or eth1 on the next
>>> reboot, while the rest may conform to the naming scheme (ens3 and such).
>>> There's no way one can fix this in userspace alone - when the failover
>>> is created, the enslaved netdev has already been opened by the kernel
>>> before userspace is made aware of it, and there's no negotiation
>>> protocol for the kernel to know when userspace has done the initial
>>> renaming of the interface. I would expect the netdev list to at least
>>> provide the general direction for how this can be solved...
>>>
>> Is there an issue if slave device names are not predictable? The user/admin scripts are expected
>> to only work with the master failover device.
> Where does this expectation come from?
>
> Admin users may have ethtool or tc configurations that need to deal with
> predictable interface names. Third-party apps that were built around
> specifying a certain interface name can't be modified to chase dynamic
> names.
>
> Specifically, we have a pre-canned image that uses ethtool to fine-tune VF
> offload settings post boot for a specific workload.
> Those images won't work well if the name is constantly changing after just
> a couple of rounds of live migration.
>
>> Moreover, you were suggesting hiding the lower slave devices anyway. There was some discussion
>> about moving them to a hidden network namespace so that they are not visible from the default namespace.
>> I looked into this some time back, but did not find the right kernel API to create a network namespace within
>> the kernel. If there were one, we could use this mechanism to simulate a 1-netdev model.
> Yes, that's one possible implementation (IMHO the key is to make the
> 1-netdev model as transparent to a real NIC as possible, while a hidden
> netns is just the vehicle). However, I recall there was resistance around
> this discussion, to the point that even the concept of hiding itself is
> taboo for Linux netdev. I would like to solicit potential alternatives
> before concluding too soon that 1-netdev is the only solution.
>
> Thanks,
> -Siwei
>
>>
>>> -Siwei
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
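
The naming idea floated at the top of this message (interfaces named co-vm-1.0 through co-vm-1.x after the VM) can be sketched as below. This is only an illustration under stated assumptions: the helper names `is_valid_ifname` and `vm_ifnames` are hypothetical, and the validity check approximates the kernel's basic interface-name rules (fewer than IFNAMSIZ = 16 bytes, no '/', ':' or whitespace, not "." or "..") rather than reproducing them. It does not address the renaming race discussed in the thread; it only shows that such names fit within the length limit for short VM names.

```python
from typing import List

IFNAMSIZ = 16  # Linux limit on interface names, including the trailing NUL


def is_valid_ifname(name: str) -> bool:
    """Approximate the kernel's basic name checks (assumption, not a copy)."""
    if not name or name in (".", ".."):
        return False
    if len(name.encode()) >= IFNAMSIZ:
        return False
    return not any(ch in "/:" or ch.isspace() for ch in name)


def vm_ifnames(vm_name: str, count: int) -> List[str]:
    """Build '<vm_name>.<index>' names (hypothetical scheme from the thread)."""
    names = [f"{vm_name}.{i}" for i in range(count)]
    bad = [n for n in names if not is_valid_ifname(n)]
    if bad:
        raise ValueError(f"names not usable as interface names: {bad}")
    return names


if __name__ == "__main__":
    print(vm_ifnames("co-vm-1", 3))
```

Note that the 15-character budget is tight: a VM name of 13 characters plus ".10" already exceeds it, so longer VM names would need to be shortened or hashed.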