On So., 19. Juli 2020 at 08:28, Mrinal Pandey wrote: > > > On Fri, Jul 17, 2020 at 5:18 PM Lukas Bulwahn > wrote: > >> >> >> On Fri, Jul 17, 2020 at 11:54 AM Mrinal Pandey >> wrote: >> >>> >>> >>> On Thu, Jul 16, 2020, 11:01 Lukas Bulwahn >>> wrote: >>> >>>> >>>> >>>> On Thu, Jul 16, 2020 at 7:15 AM Mrinal Pandey >>>> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Jul 14, 2020 at 11:33 AM Lukas Bulwahn < >>>>> lukas.bulwahn@gmail.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, 14 Jul 2020, Mrinal Pandey wrote: >>>>>> >>>>>> > >>>>>> > >>>>>> > On Tue, Jul 14, 2020 at 1:16 AM Lukas Bulwahn < >>>>>> lukas.bulwahn@gmail.com> wrote: >>>>>> > >>>>>> > >>>>>> > On Mon, 13 Jul 2020, Mrinal Pandey wrote: >>>>>> > >>>>>> > > In all the scripts, the SPDX license should be on the >>>>>> second line, >>>>>> > > the first line being the "sh-bang", but checkpatch issues a >>>>>> warning >>>>>> > > "Misplaced SPDX-License-Identifier tag - use line 1 >>>>>> instead" for the >>>>>> > > scripts that have SPDX license in the second line. >>>>>> > > >>>>>> > > However, this warning is not issued when checkpatch is run >>>>>> on a file using >>>>>> > > `-f` option. The case for files has been handled gracefully >>>>>> by changing >>>>>> > > `$checklicenseline` to `2` but a corresponding check when >>>>>> running checkpatch >>>>>> > > on a commit hash is missing. >>>>>> > > >>>>>> > > I noticed this false positive while running checkpatch on >>>>>> the set of >>>>>> > > commits from v5.7 to v5.8-rc1 of the kernel on the commits >>>>>> which modified >>>>>> > > a script file. >>>>>> > > >>>>>> > > This check is missing in checkpatch since commit >>>>>> a8da38a9cf0e >>>>>> > > ("checkpatch: add test for SPDX-License-Identifier on wrong >>>>>> line #") >>>>>> > > when the corresponding rule was first commited. >>>>>> > > >>>>>> > > Fix this by setting `$checklicenseline` to `2` when the >>>>>> diff content that >>>>>> > > is being checked originates from a script, thus, informing >>>>>> checkpatch that >>>>>> > > the SPDX license should be on the second line. >>>>>> > > >>>>>> > > Signed-off-by: Mrinal Pandey >>>>>> > > --- >>>>>> > > scripts/checkpatch.pl | 3 +++ >>>>>> > > 1 file changed, 3 insertions(+) >>>>>> > > >>>>>> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl >>>>>> > > index 4c820607540b..bbffd0c4449d 100755 >>>>>> > > --- a/scripts/checkpatch.pl >>>>>> > > +++ b/scripts/checkpatch.pl >>>>>> > > @@ -3218,6 +3218,9 @@ sub process { >>>>>> > > next if ($realfile !~ >>>>>> /\.(h|c|s|S|sh|dtsi|dts)$/); >>>>>> > > >>>>>> > > # check for using SPDX-License-Identifier on the wrong >>>>>> line number >>>>>> > > + if ($realfile =~ /^scripts/) { >>>>>> > > + $checklicenseline = 2; >>>>>> > > + } >>>>>> > >>>>>> > I think this is somehow wrong here. The check for >>>>>> checklicenseline = 2 >>>>>> > looks very different above. >>>>>> > >>>>>> > Why does -f work and using a patch file not work? >>>>>> > >>>>>> > >>>>>> > Sir, >>>>>> > >>>>>> > I am going to explain my observation based on file >>>>>> `scripts/atomic/gen-atomic-fallback.sh` and >>>>>> > commit hash `37f8173dd849`. >>>>>> > >>>>>> > If we are checking against the file, `checklicenseline` is set to 1 >>>>>> and when `realline` is 1 the above >>>>>> > `if` block is triggered, then we check if this line is of the form >>>>>> `#!/` using the regular expression >>>>>> > `^[ \+]\s*\#\!\s*\/`. If this is the case we set `checklicenseline` >>>>>> to `2` informing checkpatch that it should >>>>>> > expect license on the second line and this works all fine for a >>>>>> file. >>>>>> > The `if` block below my proposed changes evaluates to false in this >>>>>> case and thus it emits no false warning. >>>>>> > >>>>>> > However, If we are checking a diff content, the above `if` block is >>>>>> not triggered at all. This is >>>>>> > because `realline` stores the actual line number of the line we are >>>>>> checking currently out of diff content. >>>>>> > This value is 2 because SPDX identifier is indeed at the second >>>>>> line in the file but `checklicenseline` is still >>>>>> > `1`. >>>>>> > `realline` will never become equal to 1 again and thus the above >>>>>> `if` condition will never be true in this case. >>>>>> > Even if the above `if` block is triggered it would not update >>>>>> `checklicenseline` to 2 as the regular expression >>>>>> > is not satisfied since we don't have sh-bang in diff content and >>>>>> just the SPDX tag. >>>>>> > If we don't do this, the `if` block below evaluates to true when >>>>>> `realline` is 2 and `checklicensline` is `1` >>>>>> > leading >>>>>> > to the emission of a false warning. >>>>>> > >>>>>> >>>>>> So, maybe this whole logic needs to be reworked. If you do not know >>>>>> the >>>>>> first line, you need to have a different criteria in the first place >>>>>> to determine if you expect the license tag in the first or the >>>>>> second, >>>>>> e.g., the file extension, and then checking line 1 for a shebang is >>>>>> just >>>>>> sanity checking. If it is of a specific file extension, you know line >>>>>> 1 >>>>>> and it is not a shebang, that is probably worth noting as a different >>>>>> recommendation in checkpatch.pl anyway. >>>>>> >>>>> >>>>> Sir, >>>>> >>>>> When we know the first line, i.e. we are running checkpatch against a >>>>> file, the existing logic >>>>> works fine. We probably don't want to induce any changes there. >>>>> >>>>> >>>> Why not? Do you think we would break things there? Then we should not >>>> touch the code at all. >>>> Do you think we cannot test it properly after the change? Then we >>>> should think about how we make a proper regression test suite for that. >>>> >>> >>> Sir, >>> >>> No, breaking code or not being able to test is not why I suggest this. I >>> feel that the existing logic handles the case of >>> "Improper or malformed SPDX tag" and "Misplaced SPDX tag" for files i.e. >>> when the first line is known. Anyway, the logic >>> for "Misplaced SPDX tag" is written as a different rule. We just need to >>> add in the logic for patches there. >>> I tried to do this by checking for the scripts directory which was >>> wrong. If I check instead for the file being a script that would make much >>> more sense. >>> Please let me know if you suggest something else. >>> >> >> Well, you are going to add a different way of checking as you suggested, >> right? So are you suggesting to have two duplicated ways of checking for >> the same thing? That seems strange to me. >> > >> Go ahead, make a suitable first proposal, then we will see if and how to >> refactor. >> > > Sir, > > No, I would not want to duplicate things. Yes, let me first send a patch > to you first, and then we can refactor it if needed. > >> >> >>> >>>> But when we don't know the first line, if am not wrong, it would go >>>>> somewhat like: >>>>> if (the file is a script) { >>>>> if (the first line is shebang) { >>>>> if (the second line is SPDX) { >>>>> All good >>>>> } else { >>>>> Issue a misplaced or missing SPDX tag warning >>>>> } >>>>> } else { >>>>> Issue a missing shebang warning >>>>> } >>>>> } else { >>>>> if (the first line is SPDX) { >>>>> All good >>>>> } else { >>>>> Issue a misplaced or missing SPDX tag warning >>>>> } >>>>> } >>>>> >>>>> >>>> Basically agree, but that logic applies when you know the first line as >>>> well (and only, right?). What if you do not know the first line, how would >>>> you check "the first line is shebang" if you do not know the first line? >>>> >>>> >>>> The missing shebang warning probably needs to go elsewhere in the whole >>>> script. >>>> >>> >>> By not knowing the first line I mean to say that the first line doesn't >>> show up in diff content of the patch but >>> what if we open the file at that point in the commit history and check >>> for the first line to be a shebang? >>> Would it be okay to do that? Once we check the first line we can then >>> continue as suggested. >>> >> >> I think that is essentially against the general design decision of >> checkpatch.pl; checkpatch.pl takes the patch but it does not check >> anything in the current working tree, nor does it know what the "parent" of >> that patch in the git history really is. >> >> Maybe Shuah can confirm? Otherwise, I suggest to look if you find >> checking a file beyond the patch happening anywhere in the current code of >> checkpatch.pl and documented in the sparse documentation on checkpatch.pl >> . >> >> So, I believe you need to make it work without checking the first line >> (if that is not in the patch). >> > > Yes. We can't open a file for checking, you are correct here but we need > to check for shebang to be on the first line only if it appears > in the diff content or when we check a complete file, otherwise, we should > anyway not chek it since checkpatch only checks the patch or the complete > file. > Am I correct here? > >> >> >>> >>>>>> > So, what I did was to check if the diff content we are checking >>>>>> actually comes from a script, if yes, we can set >>>>>> > `checklicenseline` to `2` to avoid this confusion. >>>>>> > >>>>>> >>>>>> Why would you think that scripts are only in scripts? >>>>>> >>>>>> How about first listing all files where the SPDX tag is in line 2 in >>>>>> the >>>>>> current repository, e.g., v5.8-rc5? >>>>>> >>>>>> Then, we look at that list and determine a suitable criteria for >>>>>> looking >>>>>> in line 2 for the SPDX tag. >>>>>> >>>>> >>>>> Yes, the scripts are not only in scripts. I have listed all the files >>>>> where the SPDX tag should be >>>>> on the second line. I've attached the list for reference. We should >>>>> probably be checking the file >>>>> extension to determine if the tag needs to be on the second line or >>>>> not. >>>>> The documentation says the SPDX tag should be present in all source >>>>> files. Do these source files include >>>>> Documentation files too? >>>>> >>>>> >>>> How did you create that list? >>>> Agree (if the way you created that list makes sense). File extension >>>> seems to cover all cases, and checking for the directory 'scripts' does not. >>>> >>>> I issued the command `find . -regex ".*\.\(py\|sh\|pl\)"` to make this >>> list. I should have included awk, YAML and tc files too since they are >>> scripts too. >>> >>> >> Why not look for all files that have a shebang in the first line? >> > > We could be checking a file or a patch. I suggest we take the name of the > file we are checking(or the name of the file from which the diff content > comes from) > and then run a regex, similar to above, on it to determine if it is a > script. If we instead go for checking the first line to be shebang in the > current file would it not require to cat or open > the file? > All good. Create a patch and then we will see. Lukas >