On 20/08/24 10:56AM, Lukas Bulwahn wrote:
>
>
> On Mon, 24 Aug 2020, Mrinal Pandey wrote:
>
> > On 20/08/22 09:25PM, Lukas Bulwahn wrote:
> > >
> > >
> > > On Sat, 22 Aug 2020, Lukas Bulwahn wrote:
> > >
> > > >
> > > >
> > > > On Thu, 20 Aug 2020, Mrinal Pandey wrote:
> > > >
> > > > > On 20/08/09 12:52PM, Mrinal Pandey wrote:
> > > > > > On 20/08/04 09:37PM, Lukas Bulwahn wrote:
> > > > > > >
> > > > > > >
> > > > > > > On Tue, 4 Aug 2020, Mrinal Pandey wrote:
> > > > > > >
> > > > > > > > On 20/08/03 12:59PM, Lukas Bulwahn wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, 3 Aug 2020, Mrinal Pandey wrote:
> > > > > > > > >
> > > > > > > > > > The diff content includes the SPDX licensing information but excludes the
> > > > > > > > > > shebang when a change is made to a script file in commit 37f8173dd849
> > > > > > > > > > ("locking/atomics: Flip fallbacks and instrumentation") and commit
> > > > > > > > > > 075c8aa79d54 ("selftests: forwarding: tc_actions.sh: add matchall mirror
> > > > > > > > > > test"). In these cases checkpatch issues a false positive warning:
> > > > > > > > > > "Misplaced SPDX-License-Identifier tag - use line 1 instead".
> > > > > > > > > >
> > > > > > > > > > Currently, if checkpatch finds a shebang in line 1, it expects the
> > > > > > > > > > license identifier in line 2. However, this doesn't work when a shebang
> > > > > > > > > > isn't found on the line 1.
> > > > > > > > >
> > > > > > > > > It does not work when the diff does not contain line 1, but only line 2,
> > > > > > > > > because then the shebang check for line 1 cannot work.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I noticed this false positive, while running checkpatch on the set of
> > > > > > > > > > commits from v5.7 to v5.8-rc1 of the kernel, on the said commits.
> > > > > > > > > > This false positive exists in checkpatch since commit a8da38a9cf0e
> > > > > > > > > > ("checkpatch: add test for SPDX-License-Identifier on wrong line #")
> > > > > > > > > > when the corresponding rule was first added.
> > > > > > > > > >
> > > > > > > > > > The alternatives considered to improve this check were looking the file
> > > > > > > > > > to be a script by either examining the file extension or file permissions.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Make this sentence shorter. Try.
> > > > > > > > >
> > > > > > > > > > The evaluation on former option resulted in 120 files which had a shebang
> > > > > > > > > > in the first line but no file extension. This didn't look like a promising
> > > > > > > > > > result and hence I dropped the idea of using this approach.
> > > > > > > > > >
> > > > > > > > > > The evaluation on the latter approach shows that there are 53 files in the
> > > > > > > > > > kernel which have an executable bit set but don't have a shebang in the
> > > > > > > > > > first line.
> > > > > > > > > >
> > > > > > > > > > At the first sight on these 53 files, it seems that they either have a
> > > > > > > > > > wrong file permission set or could be reasonably extended with a shebang
> > > > > > > > > > and SPDX license information. Thus, further cleanup in the repository
> > > > > > > > > > would make the latter approach to work even more precisely.
> > > > > > > > > >
> > > > > > > > > > Hence, I chose to check the file permissions to determine if the file is a
> > > > > > > > > > script and notify checkpatch to expect SPDX on second line for such files.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > There is no notification here. Think about better wording.
> > > > > > > > >
> > > > > > > > > > Signed-off-by: Mrinal Pandey
> > > > > > > > > > ---
> > > > > > > > > >  scripts/checkpatch.pl | 3 +++
> > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > >
> > > > > > > > > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > > > > > > > > > index 4c820607540b..bae1dd824518 100755
> > > > > > > > > > --- a/scripts/checkpatch.pl
> > > > > > > > > > +++ b/scripts/checkpatch.pl
> > > > > > > > > > @@ -3166,6 +3166,9 @@ sub process {
> > > > > > > > > >      }
> > > > > > > > > >
> > > > > > > > > >  # check for using SPDX license tag at beginning of files
> > > > > > > > > > +    if ($line =~ /^index\ .*\.\..*\ .*[7531]\d{0,2}$/) {
> > > > > > > > > > +        $checklicenseline = 2;
> > > > > > > > > > +    }
> > > > > > > > >
> > > > > > > > > That check looks good now.
> > > > > > > > >
> > > > > > > > > >      if ($realline == $checklicenseline) {
> > > > > > > > > >          if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {
> > > > > > > > > >              $checklicenseline = 2;
> > > > > > > > >
> > > > > > > > > This is probably broken now. It should check for shebang in line 1 and
> > > > > > > > > then set checklicenseline to line 2, right?
> > > > > > > >
> > > > > > > > Sir,
> > > > > > > >
> > > > > > > > Should we remove this check? Earlier when I checked for file extension
> > > > > > > > we had 120 cases where this check was also needed but now we have a
> > > > > > > > better heuristic which is going to work for all cases where license
> > > > > > > > should be on line 2 irrespective of the fact that we know the first line
> > > > > > > > or not.
> > > > > > > >
> > > > > > >
> > > > > > > Are you sure about that? Where is the evaluation that proves your point?
> > > > > > >
> > > > > > > E.g., are all files that contain a shebang really with an executable flag?
> > > > > > >
> > > > > > > Which commands did you run to check this?
> > > > > > >
> > > > > > > > If I am missing out on something and we should not be removing this check,
> > > > > > > > then I suggest placing the new heuristics below this block so that it doesn't
> > > > > > > > interfere with the existing logic.
> > > > > > > >
> > > > > > > > Please let me know which path I should take and then I shall resend
> > > > > > > > the patch with the modified commit message.
> > > > > > > >
> > > > > > >
> > > > > > > Think about the strengths and weaknesses of the potential solutions, then
> > > > > > > show with some commands (as I did for example, for finding the first
> > > > > > > lines previously) that you can show that it practically makes a
> > > > > > > difference and you can get numbers on those differences.
> > > > > > >
> > > > > > > When you have done that, send a new patch.
> > > > > > >
> > > > > > > Lukas
> > > > > > >
> > > > > > Sir,
> > > > > >
> > > > > > I ran the evaluation as:
> > > > > >
> > > > > > mrinalpandey@mrinalpandey:~/linux/linux$ cat get_permissions.sh
> > > > > > #!/bin/bash
> > > > > >
> > > > > > for file in $(git ls-files)
> > > > > > do
> > > > > >     permissions="$(stat -c "%a %n" $file)"
> > > > > >     echo "$permissions"
> > > > > > done
> > > > > >
> > > > > > mrinalpandey@mrinalpandey:~/linux/linux$ sh get_permissions.sh | grep ^[7531] > temp
> > > > > >
> > > > > > mrinalpandey@mrinalpandey:~/linux/linux$ cut -d ' ' -f 2 temp > executables
> > > > > >
> > > > > > mrinalpandey@mrinalpandey:~/linux/linux$ cat first_line.sh
> > > > > > #!/bin/bash
> > > > > > file="executables"
> > > > > > while IFS= read -r line
> > > > > > do
> > > > > >     firstline=`head -n 1 $line`
> > > > > >     printf '%s:%s\n' "$firstline" "$line"
> > > > > > done <"$file"
> > > > > >
> > > > > > mrinalpandey@mrinalpandey:~/linux/linux$ cat executables | wc -l
> > > > > > 611
> > > > > >
> > > > > > mrinalpandey@mrinalpandey:~/linux/linux$ sh first_line.sh | grep ^#! | wc -l
> > > > > > head: error reading 'scripts/dtc/include-prefixes/arc': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/arm': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/arm64': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/c6x': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/dt-bindings': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/h8300': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/microblaze': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/mips': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/nios2': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/openrisc': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/powerpc': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/sh': Is a directory
> > > > > > head: error reading 'scripts/dtc/include-prefixes/xtensa': Is a directory
> > > > > > 540
> > > > > >
> > > > > > We can see that there are 71 files where the executable bit is set but
> > > > > > the first line is not a shebang. These include 13 directories which
> > > > > > throw the error above. The remaining 58 files (earlier the number was 53)
> > > > > > could be cleaned so that this heuristic works better as we saw. So, by
> > > > > > checking only for the executable bit we can say that license should be
> > > > > > on second line, we probably don't need to check for the shebang on line
> > > > > > 1.
> > > > > > Please let me know if the evaluation makes sense.
> > > > > >
> > > >
> > > > This evaluation makes sense to find the cases that should be cleaned up.
> > > >
> > > > Either the executable flag is simply set wrongly and should be dropped or
> > > > it is actually a script and should get a shebang in the beginning.
> > > >
> > > > I actually already started cleaning up. See:
> > > >
> > > > https://lore.kernel.org/lkml/20200819081808.26796-1-lukas.bulwahn@gmail.com/
> > > >
> > > > We can discuss how to continue this cleanup.
> > > >
> > >
> > > I had another look at the results of your script.
> > >
> > > Just a minor improvement to that resulting list:
> > >
> > > I think symbolic links in the repository are always of permission 777, and
> > > I think that is reasonable.
> > >
> > > So maybe you can filter out the symbolic links in your get_permissions.sh?
> > >
> > > Then, the list is probably down to a few 20 to 30 cases that should
> > > probably really be cleaned up.
> > >
> > > Can you share that script and the results? Then, let us start cleaning
> > > up.
> >
> > Sir,
> >
> > Here is what I ran:
> >
> > mrinalpandey@mrinalpandey:~/linux/linux$ cat get_permissions.sh
> > #!/bin/bash
> >
> > for file in $(git ls-files)
> > do
> >     permissions="$(stat -c '%a %n' $file)"
> >     details="$(ls -l $file)"
> >     echo "$permissions $details"
> > done
> > mrinalpandey@mrinalpandey:~/linux/linux$ sh get_permissions.sh | grep ^[7531] | grep -v "\->" > temp
> > mrinalpandey@mrinalpandey:~/linux/linux$ cut -d ' ' -f 2 temp > executables
> > mrinalpandey@mrinalpandey:~/linux/linux$ cat executables | wc -l
> > 574
> > mrinalpandey@mrinalpandey:~/linux/linux$ cat first_line.sh
> > #!/bin/bash
> > file="executables"
> > while IFS= read -r line
> > do
> >     firstline=`head -n 1 $line`
> >     printf '%s:%s\n' "$firstline" "$line"
> > done <"$file"
> > mrinalpandey@mrinalpandey:~/linux/linux$ sh first_line.sh | grep ^#! | wc -l
> > 539
> >
> > Hence, there are only 35 cases to be cleaned up.
> >
>
> Can you share those 35 cases you identified?
>
> Then, we can discuss the individual changes for those 35 cases.

Sir,

The list is attached herewith and is in the format: :

Thank you.

>
> Lukas
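
The evaluation above (executable files, minus symbolic links, minus files that start with a shebang) could probably also be run in one pass over the git index. What follows is only a sketch under a few assumptions, not the script used in this thread: it reads the file mode from `git ls-files -s`, where git records 100755 for executable regular files and 120000 for symbolic links, and it does not use `stat` or `ls -l`. The function names `needs_cleanup` and `list_cleanup_candidates` are hypothetical, and paths containing tabs, newlines or characters that git quotes are not handled.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical and are not part of
# checkpatch or of the scripts quoted in this thread.

# Print "cleanup" if an index entry is an executable regular file whose
# content does not start with "#!", and "ok" otherwise.
#   $1: git mode as printed by `git ls-files -s` (e.g. 100755, 120000)
#   $2: leading bytes (or the first line) of the file
needs_cleanup() {
    case "$1" in
        100755) ;;                  # executable regular file: keep checking
        *) echo "ok"; return 0 ;;   # symlink, gitlink or non-executable file
    esac
    case "$2" in
        '#!'*) echo "ok" ;;         # starts with a shebang
        *)     echo "cleanup" ;;    # executable bit set, but no shebang
    esac
}

# Walk the index of the current repository and print every path that
# needs_cleanup flags. Assumes simple path names (no tabs or newlines).
list_cleanup_candidates() {
    tab=$(printf '\t')
    git ls-files -s | while IFS= read -r entry; do
        entry_mode=${entry%% *}         # first space-separated field
        entry_path=${entry#*"$tab"}     # everything after the TAB
        [ -f "$entry_path" ] || continue
        magic=$(head -c 2 -- "$entry_path" 2>/dev/null)
        if [ "$(needs_cleanup "$entry_mode" "$magic")" = "cleanup" ]; then
            printf '%s\n' "$entry_path"
        fi
    done
}
```

Sourcing this file and calling `list_cleanup_candidates` from the top of the kernel tree should print one candidate path per line. Because symbolic links carry mode 120000 in the index, they drop out without a separate `grep -v "\->"` step, and directories never appear because git only tracks files.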