From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20200713095740.mi3cnx7tccoetxgc@mrinalpandey>
From: Mrinal Pandey
Date: Sun, 19 Jul 2020 11:57:56 +0530
To: Lukas Bulwahn, Shuah Khan, Linux-kernel-mentees@lists.linuxfoundation.org
Subject: Re: [Linux-kernel-mentees] [PATCH] checkpatch: Fix SPDX license check for scripts
Content-Type: multipart/mixed; boundary="===============4724218569413256286=="

--===============4724218569413256286==
Content-Type: multipart/alternative; boundary="000000000000591dcc05aac57e19"

--000000000000591dcc05aac57e19
Content-Type: text/plain; charset="UTF-8"

On Fri, Jul 17, 2020 at 5:18 PM Lukas Bulwahn wrote:

> On Fri, Jul 17, 2020 at 11:54 AM Mrinal Pandey wrote:
>
>> On Thu, Jul 16, 2020, 11:01 Lukas Bulwahn wrote:
>>
>>> On Thu, Jul 16, 2020 at 7:15 AM Mrinal Pandey wrote:
>>>
>>>> On Tue, Jul 14, 2020 at 11:33 AM Lukas Bulwahn wrote:
>>>>
>>>>> On Tue, 14 Jul 2020, Mrinal Pandey wrote:
>>>>>
>>>>> > On Tue, Jul 14, 2020 at 1:16 AM Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
>>>>> >
>>>>> >       On Mon, 13 Jul 2020, Mrinal Pandey wrote:
>>>>> >
>>>>> >       > In all the scripts, the SPDX
>>>>> >       > license should be on the second line,
>>>>> >       > the first line being the "sh-bang", but checkpatch issues a warning
>>>>> >       > "Misplaced SPDX-License-Identifier tag - use line 1 instead" for the
>>>>> >       > scripts that have SPDX license in the second line.
>>>>> >       >
>>>>> >       > However, this warning is not issued when checkpatch is run on a
>>>>> >       > file using `-f` option. The case for files has been handled
>>>>> >       > gracefully by changing `$checklicenseline` to `2` but a
>>>>> >       > corresponding check when running checkpatch on a commit hash is
>>>>> >       > missing.
>>>>> >       >
>>>>> >       > I noticed this false positive while running checkpatch on the set of
>>>>> >       > commits from v5.7 to v5.8-rc1 of the kernel on the commits which
>>>>> >       > modified a script file.
>>>>> >       >
>>>>> >       > This check is missing in checkpatch since commit a8da38a9cf0e
>>>>> >       > ("checkpatch: add test for SPDX-License-Identifier on wrong line #")
>>>>> >       > when the corresponding rule was first committed.
>>>>> >       >
>>>>> >       > Fix this by setting `$checklicenseline` to `2` when the diff content
>>>>> >       > that is being checked originates from a script, thus, informing
>>>>> >       > checkpatch that the SPDX license should be on the second line.
>>>>> >       >
>>>>> >       > Signed-off-by: Mrinal Pandey
>>>>> >       > ---
>>>>> >       >  scripts/checkpatch.pl | 3 +++
>>>>> >       >  1 file changed, 3 insertions(+)
>>>>> >       >
>>>>> >       > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>>>>> >       > index 4c820607540b..bbffd0c4449d 100755
>>>>> >       > --- a/scripts/checkpatch.pl
>>>>> >       > +++ b/scripts/checkpatch.pl
>>>>> >       > @@ -3218,6 +3218,9 @@ sub process {
>>>>> >       >                 next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);
>>>>> >       >
>>>>> >       >  # check for using SPDX-License-Identifier on the wrong line number
>>>>> >       > +               if ($realfile =~ /^scripts/) {
>>>>> >       > +                       $checklicenseline = 2;
>>>>> >       > +               }
>>>>> >
>>>>> >       I think this is somehow wrong here.
>>>>> >       The check for checklicenseline = 2
>>>>> >       looks very different above.
>>>>> >
>>>>> >       Why does -f work and using a patch file not work?
>>>>> >
>>>>> >
>>>>> > Sir,
>>>>> >
>>>>> > I am going to explain my observation based on file
>>>>> > `scripts/atomic/gen-atomic-fallback.sh` and commit hash `37f8173dd849`.
>>>>> >
>>>>> > If we are checking against the file, `checklicenseline` is set to 1, and
>>>>> > when `realline` is 1 the above `if` block is triggered. We then check
>>>>> > whether this line is of the form `#!/` using the regular expression
>>>>> > `^[ \+]\s*\#\!\s*\/`. If it is, we set `checklicenseline` to `2`,
>>>>> > informing checkpatch that it should expect the license on the second
>>>>> > line, and this works fine for a file.
>>>>> > The `if` block below my proposed changes evaluates to false in this case,
>>>>> > and thus it emits no false warning.
>>>>> >
>>>>> > However, if we are checking diff content, the above `if` block is not
>>>>> > triggered at all. This is because `realline` stores the actual line
>>>>> > number, in the file, of the diff line we are currently checking.
>>>>> > This value is 2 because the SPDX identifier is indeed on the second line
>>>>> > of the file, but `checklicenseline` is still `1`.
>>>>> > `realline` will never become equal to 1 again, and thus the above `if`
>>>>> > condition will never be true in this case.
>>>>> > Even if the above `if` block were triggered, it would not update
>>>>> > `checklicenseline` to 2, as the regular expression is not satisfied: the
>>>>> > diff content has just the SPDX tag and no sh-bang.
>>>>> > Without my change, the `if` block below evaluates to true when `realline`
>>>>> > is 2 and `checklicenseline` is `1`, leading to the emission of a false
>>>>> > warning.
>>>>> >
>>>>>
>>>>> So, maybe this whole logic needs to be reworked.
>>>>> If you do not know the
>>>>> first line, you need to have a different criterion in the first place
>>>>> to determine if you expect the license tag in the first or the second
>>>>> line, e.g., the file extension, and then checking line 1 for a shebang
>>>>> is just sanity checking. If it is of a specific file extension, you know
>>>>> line 1 and it is not a shebang, that is probably worth noting as a
>>>>> different recommendation in checkpatch.pl anyway.
>>>>>
>>>>
>>>> Sir,
>>>>
>>>> When we know the first line, i.e. we are running checkpatch against a
>>>> file, the existing logic works fine. We probably don't want to introduce
>>>> any changes there.
>>>>
>>>
>>> Why not? Do you think we would break things there? Then we should not
>>> touch the code at all.
>>> Do you think we cannot test it properly after the change? Then we should
>>> think about how we make a proper regression test suite for that.
>>>
>>
>> Sir,
>>
>> No, breaking code or not being able to test is not why I suggest this. I
>> feel that the existing logic handles the cases of "Improper or malformed
>> SPDX tag" and "Misplaced SPDX tag" for files, i.e. when the first line is
>> known. Anyway, the logic for "Misplaced SPDX tag" is written as a different
>> rule. We just need to add in the logic for patches there.
>> I tried to do this by checking for the scripts directory, which was wrong.
>> If I check instead for the file being a script, that would make much more
>> sense.
>> Please let me know if you suggest something else.
>>
>
> Well, you are going to add a different way of checking as you suggested,
> right? So are you suggesting to have two duplicated ways of checking for
> the same thing? That seems strange to me.
>
> Go ahead, make a suitable first proposal, then we will see if and how to
> refactor.
>

Sir,

No, I would not want to duplicate things. Yes, let me first send you a patch,
and then we can refactor it if needed.
>
>>
>>>> But when we don't know the first line, if I am not wrong, it would go
>>>> somewhat like:
>>>> if (the file is a script) {
>>>>     if (the first line is shebang) {
>>>>         if (the second line is SPDX) {
>>>>             All good
>>>>         } else {
>>>>             Issue a misplaced or missing SPDX tag warning
>>>>         }
>>>>     } else {
>>>>         Issue a missing shebang warning
>>>>     }
>>>> } else {
>>>>     if (the first line is SPDX) {
>>>>         All good
>>>>     } else {
>>>>         Issue a misplaced or missing SPDX tag warning
>>>>     }
>>>> }
>>>>
>>>
>>> Basically agree, but that logic applies when you know the first line as
>>> well (and only, right?). What if you do not know the first line, how would
>>> you check "the first line is shebang" if you do not know the first line?
>>>
>>> The missing shebang warning probably needs to go elsewhere in the whole
>>> script.
>>>
>>
>> By not knowing the first line I mean to say that the first line doesn't
>> show up in the diff content of the patch. But what if we open the file at
>> that point in the commit history and check whether the first line is a
>> shebang?
>> Would it be okay to do that? Once we check the first line we can then
>> continue as suggested.
>>
>
> I think that is essentially against the general design decision of
> checkpatch.pl; checkpatch.pl takes the patch but it does not check
> anything in the current working tree, nor does it know what the "parent" of
> that patch in the git history really is.
>
> Maybe Shuah can confirm? Otherwise, I suggest to look if you find checking
> a file beyond the patch happening anywhere in the current code of
> checkpatch.pl and documented in the sparse documentation on checkpatch.pl.
>
> So, I believe you need to make it work without checking the first line (if
> that is not in the patch).
>

Yes.
We can't open a file for checking; you are correct here. But we need to check
that the shebang is on the first line only if that line appears in the diff
content, or when we check a complete file. Otherwise, we should not check it
anyway, since checkpatch only checks the patch or the complete file.
Am I correct here?

>
>>
>>>>> > So, what I did was to check if the diff content we are checking
>>>>> > actually comes from a script; if yes, we can set
>>>>> > `checklicenseline` to `2` to avoid this confusion.
>>>>> >
>>>>>
>>>>> Why would you think that scripts are only in scripts?
>>>>>
>>>>> How about first listing all files where the SPDX tag is in line 2 in the
>>>>> current repository, e.g., v5.8-rc5?
>>>>>
>>>>> Then, we look at that list and determine a suitable criterion for
>>>>> looking in line 2 for the SPDX tag.
>>>>>
>>>>
>>>> Yes, the scripts are not only in scripts. I have listed all the files
>>>> where the SPDX tag should be on the second line. I've attached the list
>>>> for reference. We should probably be checking the file extension to
>>>> determine if the tag needs to be on the second line or not.
>>>> The documentation says the SPDX tag should be present in all source
>>>> files. Do these source files include Documentation files too?
>>>>
>>>
>>> How did you create that list?
>>> Agree (if the way you created that list makes sense). File extension
>>> seems to cover all cases, and checking for the directory 'scripts' does not.
>>>
>>
>> I issued the command `find . -regex ".*\.\(py\|sh\|pl\)"` to make this
>> list. I should have included awk, YAML and tc files too since they are
>> scripts too.
>>
>
> Why not look for all files that have a shebang in the first line?
>

We could be checking a file or a patch. I suggest we take the name of the file
we are checking (or the name of the file that the diff content comes from) and
then run a regex, similar to the one above, on it to determine whether it is a
script.
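
As a rough illustration of that file-name check, here is a minimal sketch in
Python rather than checkpatch's Perl. The function name `expected_spdx_line`,
the `first_line` parameter, and the extension list (taken from the `find`
command quoted above) are all assumptions for illustration; none of them exist
in checkpatch.pl, and the final extension list is still under discussion.

```python
import re

# Assumption: extensions copied from the `find` command quoted in this thread.
SCRIPT_NAME = re.compile(r"\.(py|sh|pl)$")
# A raw line 1 that starts with "#!" followed by a path (no diff prefix here).
SHEBANG = re.compile(r"^#!\s*/")


def expected_spdx_line(filename, first_line=None):
    """Return the line number (1 or 2) where the SPDX tag is expected.

    first_line is the text of line 1 when it is visible (a complete file, or
    a hunk that includes line 1) and None when the patch does not show it.
    """
    if first_line is not None:
        # Line 1 is visible: decide from what is actually there.
        return 2 if SHEBANG.match(first_line) else 1
    # Line 1 is not in the diff: fall back to the file name alone.
    return 2 if SCRIPT_NAME.search(filename) else 1


if __name__ == "__main__":
    # Patch that touches only line 2 of a shell script: line 1 is unknown.
    print(expected_spdx_line("scripts/atomic/gen-atomic-fallback.sh"))
    # C source file: the tag is expected on line 1.
    print(expected_spdx_line("kernel/fork.c"))
```

The point of the fallback is that nothing outside the patch has to be opened:
when line 1 is not part of the diff, the decision rests on the file name only.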
If we instead check whether the first line of the current file is a shebang,
would that not require us to cat or open the file?

Thank you.

>
> Lukas
>

--000000000000591dcc05aac57e19
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--000000000000591dcc05aac57e19--

--===============4724218569413256286==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

--===============4724218569413256286==--