From: Toke Høiland-Jørgensen
To: Jakub Kicinski
Cc: Marek Majtyka, Saeed Mahameed, David Ahern, Maciej Fijalkowski,
 John Fastabend, Jesper Dangaard Brouer, Daniel Borkmann,
 Maciej Fijalkowski, Björn Töpel, Andrii Nakryiko, Jonathan Lemon,
 Alexei Starovoitov, Network Development, "David S. Miller",
 hawk@kernel.org, bpf, intel-wired-lan, "Karlsson, Magnus",
 jeffrey.t.kirsher@intel.com
Subject: Re: [PATCH v2 bpf 1/5] net: ethtool: add xdp properties flag set
In-Reply-To: <20210210103135.38921f85@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
References: <20201204102901.109709-1-marekx.majtyka@intel.com>
 <5fd068c75b92d_50ce20814@john-XPS-13-9370.notmuch>
 <20201209095454.GA36812@ranger.igk.intel.com>
 <20201209125223.49096d50@carbon>
 <1e5e044c8382a68a8a547a1892b48fb21d53dbb9.camel@kernel.org>
 <6f8c23d4ac60525830399754b4891c12943b63ac.camel@kernel.org>
 <87h7mvsr0e.fsf@toke.dk>
 <87bld2smi9.fsf@toke.dk>
 <20210202113456.30cfe21e@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
 <20210203090232.4a259958@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
 <874kikry66.fsf@toke.dk>
 <20210210103135.38921f85@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Wed, 10 Feb 2021 23:52:39 +0100
Message-ID: <87czx7r0w8.fsf@toke.dk>
X-Mailing-List: bpf@vger.kernel.org

Jakub Kicinski writes:

> On Wed, 10 Feb 2021 11:53:53 +0100 Toke Høiland-Jørgensen wrote:
>> >> I am a bit confused now. Did you mean validation tests of those XDP
>> >> flags, which I am working on or some other validation tests?
>> >> What should these tests verify? Can you please elaborate more on the
>> >> topic - just a few sentences on how you see it?
>> >
>> > Conformance tests can be written for all features, whether they have
>> > an explicit capability in the uAPI or not. But for those that do IMO
>> > the tests should be required.
>> >
>> > Let me give you an example. This set adds a bit that says Intel NICs
>> > can do XDP_TX and XDP_REDIRECT, yet we both know of the Tx queue
>> > shenanigans. So can i40e do XDP_REDIRECT or can it not?
>> >
>> > If we have exhaustive conformance tests we can confidently answer that
>> > question. And the answer may not be "yes" or "no", it may actually be
>> > "we need more options because many implementations fall in between".
>> >
>> > I think readable (IOW not written in some insane DSL) tests can also
>> > be useful for users who want to check which features their program /
>> > deployment will require.
>>
>> While I do agree that that kind of conformance test would be great, I
>> don't think it has to hold up this series (the perfect being the enemy
>> of the good, and all that). We have a real problem today that userspace
>> can't tell if a given driver implements, say, XDP_REDIRECT, and so
>> people try to use it and spend days wondering which black hole their
>> packets disappear into. And for things like container migration we need
>> to be able to predict whether a given host supports a feature *before*
>> we start the migration and try to use it.
>
> Unless you have a strong definition of what XDP_REDIRECT means the flag
> itself is not worth much. We're not talking about normal ethtool feature
> flags which are primarily stack-driven, XDP is implemented mostly by
> the driver, each vendor can do their own thing. Maybe I've seen one
> vendor incompatibility too many at my day job to hope for the best...

I'm totally on board with documenting what a feature means. E.g., for
XDP_REDIRECT, whether it's acceptable to fail the redirect in some
situations even when it's active, or if there should always be a
slow-path fallback.

But I disagree that the flag is worthless without it. People are running
into real issues with trying to run XDP_REDIRECT programs on a driver
that doesn't support it at all, and it's incredibly confusing.
The latest example popped up literally yesterday:

https://lore.kernel.org/xdp-newbies/CAM-scZPPeu44FeCPGO=Qz=03CrhhfB1GdJ8FNEpPqP_G27c6mQ@mail.gmail.com/

>> I view the feature flags as a list of features *implemented* by the
>> driver. Which should be pretty static in a given kernel, but may be
>> different than the features currently *enabled* on a given system (due
>> to, e.g., the TX queue stuff).
>
> Hm, maybe I'm not being clear enough. The way XDP_REDIRECT (your
> example) is implemented across drivers differs in meaningful ways.
> Hence the need for conformance testing. We don't have a golden SW
> standard to fall back on, like we do with HW offloads.

I'm not disagreeing that we need to harmonise what "implementing a
feature" means. Maybe I'm just not sure what you mean by "conformance
testing"? What would that look like, specifically? A script in selftest
that sets up a redirect between two interfaces that we tell people to
run? Or what? How would you catch, say, that issue where if a machine
has more CPUs than the NIC has TXQs things start falling apart?

> Also IDK why those tests are considered such a huge ask. As I said most
> vendors probably already have them, and so I'd guess do good distros.
> So let's work together.

I guess what I'm afraid of is that this will end up delaying or stalling
a fix for a long-standing issue (which is what I consider this series,
as shown by the example above). Maybe you can alleviate that by
expanding a bit on what you mean?

-Toke