* git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
From: Joe Perches @ 2020-10-06 23:13 UTC
To: LKML; +Cc: Linus Torvalds, Jiri Kosina
Almost all source files in the kernel use a standardized SPDX header
at line 1 with a comment /* initiator and terminator */:
/* SPDX-License-Identifier: <license> */
$ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
wc -l
17847
$ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
grep ":1:" | cut -f1 -d":" | grep -oP '\.\w+$' | \
sort | uniq -c | sort -rn
16769 .h
972 .S
87 .c
6 .lds
3 .l
2 .y
2 .py
2 .dtsi
1 .sh
1 .dts
1 .cpp
1 .bc
But about 2% of the files do not terminate the comment on
line 1 and use either:
/* SPDX-License-Identifier: <license>
* additional comment or blank
or
/* SPDX-License-Identifier: <license>
<blank line>
$ git grep -PHn '^/\* SPDX-License-Identifier:(?!.*\*/\s*$)' | \
wc -l
407
$ git grep -PHn '^/\* SPDX-License-Identifier:(?!.*\*/\s*$)' | \
grep ':1:' | cut -f1 -d':' | grep -oP '\.\w+$' | \
sort | uniq -c | sort -rn
357 .h
34 .S
16 .c
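As a rough illustration of the two patterns above, the sketch below classifies a single line without needing PCRE. A line counts as "unterminated" exactly when it starts with the SPDX prefix but fails the terminated pattern, which is what the negative lookahead expresses. The function name classify_spdx_line is hypothetical, and [[:space:]] stands in for \s.

```shell
# Hypothetical helper (not part of the original mail): classify one line
# as a terminated or unterminated C90-style SPDX comment.
classify_spdx_line() {
    if printf '%s\n' "$1" |
        grep -qE '^/\* SPDX-License-Identifier:.*\*/[[:space:]]*$'; then
        # Comment opens and closes on the same line.
        echo terminated
    elif printf '%s\n' "$1" |
        grep -qE '^/\* SPDX-License-Identifier:'; then
        # Prefix matches, but there is no trailing "*/" on this line.
        echo unterminated
    else
        # "//", "#", or anything else is outside the scope of both patterns.
        echo other
    fi
}

classify_spdx_line '/* SPDX-License-Identifier: GPL-2.0 */'   # prints: terminated
classify_spdx_line '/* SPDX-License-Identifier: GPL-2.0'      # prints: unterminated
classify_spdx_line '// SPDX-License-Identifier: GPL-2.0'      # prints: other
```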
Here's a trivial script to convert and standardize the
first and second lines of these 407 files to make it easier
to categorize and sort.
$ git grep -PHn '^/\* SPDX-License-Identifier:(?!.*\*/\s*$)' | \
grep ':1:' | cut -f1 -d":" | \
xargs sed -i -e '1s@[[:space:]]*$@ */@' -r -e '2s@^( \*|)@/*@'
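To see what the two sed expressions do, here is a small sketch that applies the same substitutions to a throwaway file instead of the kernel tree. It assumes GNU sed (for -i without a suffix and for -r); the wrapper name standardize_spdx_header and the sample file contents are made up for the demonstration.

```shell
# Hypothetical wrapper around the sed invocation from the mail.
# $1: path to a file whose first line is an unterminated SPDX comment.
standardize_spdx_header() {
    # Line 1: replace trailing whitespace with " */" to close the comment.
    # Line 2: turn a leading " *" (or an empty prefix) into "/*" so the
    #         remainder of the old comment is reopened as its own comment.
    sed -i -e '1s@[[:space:]]*$@ */@' -r -e '2s@^( \*|)@/*@' "$1"
}

demo_dir=$(mktemp -d)
printf '%s\n' \
    '/* SPDX-License-Identifier: GPL-2.0' \
    ' * Example header' \
    ' */' > "$demo_dir/example.h"

standardize_spdx_header "$demo_dir/example.h"
cat "$demo_dir/example.h"
# Output:
# /* SPDX-License-Identifier: GPL-2.0 */
# /* Example header
#  */

rm -r "$demo_dir"
```

When line 2 is blank, the empty alternative in the second expression matches and the line becomes a bare "/*", so the result is only sensible if the following lines really were a continuation of the original comment.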
* Re: git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
From: Linus Torvalds @ 2020-10-11 18:42 UTC
To: Joe Perches; +Cc: LKML, Jiri Kosina
On Tue, Oct 6, 2020 at 4:13 PM Joe Perches <joe@perches.com> wrote:
>
> Almost all source files in the kernel use a standardized SPDX header
> at line 1 with a comment /* initiator and terminator */:
>
> /* SPDX-License-Identifier: <license> */
>
> $ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
> wc -l
> 17847
That grep pattern makes zero sense.
Why would */ be special at all? It isn't.
$ git grep SPDX-License-Identifier: | wc -l
52418
and a *LOT* of those are shell scripts and use "#", or are C sources
and use "//" etc.
So your "standardization" is completely pointless. Anybody who expects
that pattern is just doing something fundamentally wrong, because the
pattern you want to standardize around is simply not valid.
Linus
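For a rough sense of how the SPDX lines split across comment styles, a sketch along these lines could tally them. The function name count_spdx_styles and the four bucket names are hypothetical; it reads matching lines on stdin and makes no claim about the actual distribution in the tree.

```shell
# Hypothetical helper: tally SPDX lines by the comment style that
# introduces them. Reads lines on stdin, prints one summary line.
count_spdx_styles() {
    awk '
        /^[ \t]*\/\*[ \t]*SPDX-License-Identifier:/ { block++; next }  # /* ... */
        /^[ \t]*\/\/[ \t]*SPDX-License-Identifier:/ { line++;  next }  # // ...
        /^[ \t]*#[ \t]*SPDX-License-Identifier:/    { hash++;  next }  # # ...
        /SPDX-License-Identifier:/                  { other++ }        # anything else
        END { printf "block=%d line=%d hash=%d other=%d\n", block, line, hash, other }
    '
}

# Small self-contained demonstration on four made-up lines.
printf '%s\n' \
    '/* SPDX-License-Identifier: GPL-2.0 */' \
    '// SPDX-License-Identifier: GPL-2.0' \
    '# SPDX-License-Identifier: GPL-2.0' \
    '.. SPDX-License-Identifier: GPL-2.0' | count_spdx_styles
# prints: block=1 line=1 hash=1 other=1
```

In a kernel checkout it could be fed with `git grep -h 'SPDX-License-Identifier:' | count_spdx_styles`, where -h drops the file name prefix from each match.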
* Re: git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
From: Joe Perches @ 2020-10-11 18:47 UTC
To: Linus Torvalds; +Cc: LKML, Jiri Kosina
On Sun, 2020-10-11 at 11:42 -0700, Linus Torvalds wrote:
> On Tue, Oct 6, 2020 at 4:13 PM Joe Perches <joe@perches.com> wrote:
> > Almost all source files in the kernel use a standardized SPDX header
> > at line 1 with a comment /* initiator and terminator */:
> >
> > /* SPDX-License-Identifier: <license> */
> >
> > $ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
> > wc -l
> > 17847
>
> That grep pattern makes zero sense.
>
> Why would */ be special at all? It isn't.
>
> $ git grep SPDX-License-Identifier: | wc -l
> 52418
>
> and a *LOT* of those are shell scripts and use "#", or are C sources
> and use "//" etc.
>
> So your "standardization" is completely pointless. Anybody who expects
> that pattern is just doing something fundamentally wrong, because the
> pattern you want to standardize around is simply not valid.
It's just a trivial grep pattern to determine whether the C90-style
SPDX-License-Identifier sits in its own single-line comment.
Almost all do.
* Re: git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
From: Joe Perches @ 2020-10-11 20:29 UTC
To: Linus Torvalds; +Cc: LKML, Jiri Kosina
On Sun, 2020-10-11 at 11:47 -0700, Joe Perches wrote:
> On Sun, 2020-10-11 at 11:42 -0700, Linus Torvalds wrote:
> > On Tue, Oct 6, 2020 at 4:13 PM Joe Perches <joe@perches.com> wrote:
> > > Almost all source files in the kernel use a standardized SPDX header
> > > at line 1 with a comment /* initiator and terminator */:
> > >
> > > /* SPDX-License-Identifier: <license> */
> > >
> > > $ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
> > > wc -l
> > > 17847
> >
> > That grep pattern makes zero sense.
> >
> > Why would */ be special at all? It isn't.
> >
> > $ git grep SPDX-License-Identifier: | wc -l
> > 52418
> >
> > and a *LOT* of those are shell scripts and use "#", or are C sources
> > and use "//" etc.
> >
> > So your "standardization" is completely pointless. Anybody who expects
> > that pattern is just doing something fundamentally wrong, because the
> > pattern you want to standardize around is simply not valid.
btw:
The script would merely change these c90 comments to use the style
mandated by/proposed in Documentation/process/license-rules.rst