From: David Gow
Date: Tue, 23 Jun 2020 10:58:33 +0800
Subject: Re: RFC - kernel selftest result documentation (KTAP)
To: Frank Rowand
Cc: Paolo Bonzini, "Bird, Tim", Kees Cook, "shuah@kernel.org", "linux-kselftest@vger.kernel.org", Brendan Higgins, "linux-kernel@vger.kernel.org"
X-Mailing-List: linux-kselftest@vger.kernel.org

On Sat, Jun 20, 2020 at 11:03 PM Frank Rowand wrote:
>
> On 2020-06-20 01:44, David Gow wrote:
> > On Sat, Jun 20, 2020 at 1:58 AM Frank Rowand wrote:
> >>
> >> On 2020-06-16 07:08, Paolo Bonzini wrote:
> >>> On 15/06/20 21:07, Bird,
> >>> Tim wrote:
> >
> >>>>>> Finally,
> >>>>>> - Should a SKIP result be 'ok' (TAP13 spec) or 'not ok' (current kselftest practice)?
> >>>>>> See https://testanything.org/tap-version-13-specification.html
> >>>>>
> >>>>> Oh! I totally missed this. Uhm. I think "not ok" makes sense to me "it
> >>>>> did not run successfully". ... but ... Uhhh ... how do XFAIL and SKIP
> >>>>> relate? Neither SKIP nor XFAIL count toward failure, though, so both
> >>>>> should be "ok"? I guess we should change it to "ok".
> >>>
> >>> See above for XFAIL.
> >>>
> >>> I initially raised the issue with "SKIP" because I have a lot of tests
> >>> that depend on hardware availability---for example, a test that does not
> >>> run on some processor kinds (e.g. on AMD, or old Intel)---and for those
> >>> SKIP should be considered a success.
> >>
> >> No, SKIP should not be considered a success. It should also not be considered
> >> a failure. Please do not blur the lines between success, failure, and
> >> skipped.
> >
> >
> > I agree that skipped tests should be their own thing, separate from
> > success and failure, but the way they tend to behave tends to be
> > closer to a success than a failure.
> >
> > I guess the important note here is that a suite of tests, some of
> > which are SKIPped, can be listed as having passed, so long as none of
> > them failed. So, the rule for "bubbling up" test results is that any
> > failures cause the parent to fail, the parent is marked as skipped if
> > _all_ subtests are skipped, and otherwise is marked as having
> > succeeded. (Reversing the last part: having a suite be marked as
> > skipped if _any_ of the subtests are skipped also makes sense, and has
> > its advantages, but anecdotally seems less common in other systems.)
>
> That really caught my attention as something to be captured in the spec.
>
> My initial response was that bubbling up results is the domain of the
> test analysis tools, not the test code.

KUnit is actually sitting in the middle.
Results are bubbled up from individual tests to the test suites in-kernel
(by the common KUnit code), as the suites are TAP tests (individual test
cases being subtests), and so need to provide results. The kunit.py
script then bubbles those results up (using the same rules) to print a
summary.

> If I were writing a test analysis tool, I would want the user to have
> the ability to configure the bubble up rules. Different use cases
> would desire different rules.

I tend to agree: it'd be nice if test analysis tools could implement
different rules here. If we're using TAP subtests, though, the parent
tests do need to return a result in the test code, so either that needs
to be test-specific (if the parent test is not just a simple union of
its subtests), or it could be ignored by an analysis tool which would
follow its own rules. (In either case, it may make sense to be able to
configure a test analysis tool to always fail or mark tests with failed
or skipped subtests, even if their result is "ok", but not vice-versa -- a
test which failed would stay failed, even if all its subtests passed.)

> My second response was to start thinking about whether the tests
> themselves should have any sort of bubble up implemented. I think
> it is a very interesting question. My current mindset is that
> each test is independent, and there is not a concept of an umbrella
> test that is the union of a set of subtests. But maybe there is
> value to umbrella tests. If there is a concept of umbrella tests
> then I think the spec should define how skip bubbles up.
>

KUnit suites are definitely that kind of "umbrella test" at the moment.

> >
> > The other really brave thing one could do to break from the TAP
> > specification would be to add a "skipped" value alongside "ok" and
> > "not ok", and get rid of the whole "SKIP" directive/comment stuff.
> > Possibly not worth the departure from the spec, but it would sidestep
> > part of the problem.
>
> I like being brave in this case.
> Elevating SKIP to be a peer of
> "ok" and "not ok" provides a more clear model that SKIP is a first
> class citizen. It also removes the muddled thinking that the
> current model promotes.
> >
> >
> > Cheers,
> > -- David
> >
>
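
For what it's worth, the "bubbling up" rule discussed above (any failure
fails the parent; the parent is skipped only if _all_ subtests are
skipped; otherwise it passes) could be sketched roughly as below. This is
only an illustration, not the actual KUnit or kunit.py code: the names
Result, bubble_up and skip_if_any are made up for this example, the
three-state Result mirrors the "skipped alongside ok / not ok" idea, and
treating a parent with no subtests as passed is an assumption.

```python
from enum import Enum
from typing import Iterable


class Result(Enum):
    # Hypothetical three-state result, with SKIP as a peer of pass/fail.
    PASS = "ok"
    FAIL = "not ok"
    SKIP = "skipped"


def bubble_up(subtests: Iterable[Result], skip_if_any: bool = False) -> Result:
    """Combine subtest results into a single parent result.

    Default rule: any failure fails the parent; the parent is skipped
    only if every subtest was skipped; otherwise the parent passes.
    With skip_if_any=True (the alternative rule), a single skipped
    subtest is enough to mark the parent as skipped.
    """
    results = list(subtests)

    # A failure always wins, whichever skip rule is selected.
    if any(r is Result.FAIL for r in results):
        return Result.FAIL

    skipped = [r for r in results if r is Result.SKIP]
    if skip_if_any and skipped:
        return Result.SKIP
    if results and len(skipped) == len(results):
        return Result.SKIP

    # Assumption: a parent with no subtests counts as passed.
    return Result.PASS
```

An analysis tool wanting stricter reporting could call
bubble_up(results, skip_if_any=True) on the same data, which is one way
the rules could be left configurable without changing what the tests
themselves emit.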