linux-kernel.vger.kernel.org archive mirror
* git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
@ 2020-10-06 23:13 Joe Perches
  2020-10-11 18:42 ` Linus Torvalds
  0 siblings, 1 reply; 4+ messages in thread
From: Joe Perches @ 2020-10-06 23:13 UTC (permalink / raw)
  To: LKML; +Cc: Linus Torvalds, Jiri Kosina

Almost all source files in the kernel use a standardized SPDX header
at line 1 with a comment /* initiator and terminator */:

/* SPDX-License-Identifier: <license> */

$ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
  wc -l
17847

$ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
  grep ":1:" | cut -f1 -d":" | grep -oP '\.\w+$' | \
  sort | uniq -c | sort -rn
  16769 .h
    972 .S
     87 .c
      6 .lds
      3 .l
      2 .y
      2 .py
      2 .dtsi
      1 .sh
      1 .dts
      1 .cpp
      1 .bc

But about 2% of the files do not terminate the comment on
line 1 and use either:

/* SPDX-License-Identifier: <license>
 * additional comment or blank

or

/* SPDX-License-Identifier: <license>
<blank line>

$ git grep -PHn '^/\* SPDX-License-Identifier:(?!.*\*/\s*$)' | \
  wc -l
407

$ git grep -PHn '^/\* SPDX-License-Identifier:(?!.*\*/\s*$)' | \
  grep '\:1:' | cut -f1 -d':' | grep -oP '\.\w+$' | \
  sort | uniq -c | sort -rn
    357 .h
     34 .S
     16 .c

Here's a trivial script to convert and standardize the
first and second lines of these 407 files to make it easier
to categorize and sort.

$ git grep -PHn '^/\* SPDX-License-Identifier:(?!.*\*/\s*$)' | \
  grep ':1:' | cut -f1 -d":" | \
  xargs sed -i -e '1s@[[:space:]]*$@ */@' -r -e '2s@^( \*|)@/*@'
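
To see what the two sed expressions do before running them on a tree,
here is a minimal sketch that applies them to throwaway files. The
function name standardize_spdx and the sample file contents are made up
for illustration; GNU sed is assumed (for -i and -r).

```shell
#!/bin/sh
# Sketch only: same two expressions as the one-liner above, wrapped in a
# hypothetical helper so they can be tried on scratch files first.
standardize_spdx() {
    # Line 1: replace trailing whitespace (possibly none) with " */".
    # Line 2: replace a leading " *", or nothing at all, with "/*".
    sed -i -r -e '1s@[[:space:]]*$@ */@' -e '2s@^( \*|)@/*@' "$@"
}

tmpdir=$(mktemp -d)

# Case 1: line 2 continues the comment with " * ...".
printf '%s\n' '/* SPDX-License-Identifier: GPL-2.0' \
              ' * Sample header' ' */' > "$tmpdir/a.h"

# Case 2: line 2 is blank.
printf '%s\n' '/* SPDX-License-Identifier: GPL-2.0' \
              '' ' * Sample header' ' */' > "$tmpdir/b.h"

standardize_spdx "$tmpdir/a.h" "$tmpdir/b.h"

# Show the first two lines of each converted file.
head -n 2 "$tmpdir/a.h" "$tmpdir/b.h"

rm -rf "$tmpdir"
```

In both cases line 1 ends up terminated with " */" and line 2 opens a
new comment with "/*", so the comment that follows stays well-formed.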



* Re: git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
  2020-10-06 23:13 git grep/sed to standardize "/* SPDX-License-Identifier: <license>" Joe Perches
@ 2020-10-11 18:42 ` Linus Torvalds
  2020-10-11 18:47   ` Joe Perches
  0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2020-10-11 18:42 UTC (permalink / raw)
  To: Joe Perches; +Cc: LKML, Jiri Kosina

On Tue, Oct 6, 2020 at 4:13 PM Joe Perches <joe@perches.com> wrote:
>
> Almost all source files in the kernel use a standardized SPDX header
> at line 1 with a comment /* initiator and terminator */:
>
> /* SPDX-License-Identifier: <license> */
>
> $ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
>   wc -l
> 17847

That grep pattern makes zero sense.

Why would */ be special at all? It isn't.

  $ git grep SPDX-License-Identifier: | wc -l
  52418

and a *LOT* of those are shell scripts and use "#", or are C sources
and use "//" etc.
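
A rough way to see that split is to tally the SPDX lines by their
leading comment marker. This is only a sketch: the helper name
spdx_styles is made up, and the buckets are a simplification (anything
that is not "/*", "//" or "#" lands in "other").

```shell
#!/bin/sh
# Sketch only: read SPDX lines on stdin and count them per comment marker.
spdx_styles() {
    awk '
        /^[ \t]*\/\*[ \t]*SPDX-License-Identifier:/ { n["/*"]++; next }
        /^[ \t]*\/\/[ \t]*SPDX-License-Identifier:/ { n["//"]++; next }
        /^[ \t]*#[ \t]*SPDX-License-Identifier:/    { n["#"]++;  next }
        /SPDX-License-Identifier:/                  { n["other"]++ }
        END { for (k in n) printf "%7d %s\n", n[k], k }
    ' | sort -rn
}

# Usage inside a kernel checkout (-h drops the "file:" prefix so the
# comment marker is at the start of each line):
#   git grep -h 'SPDX-License-Identifier:' | spdx_styles
```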

So your "standardization" is completely pointless. Anybody who expects
that pattern is just doing something fundamentally wrong, because the
pattern you want to standardize around is simply not valid.

             Linus


* Re: git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
  2020-10-11 18:42 ` Linus Torvalds
@ 2020-10-11 18:47   ` Joe Perches
  2020-10-11 20:29     ` Joe Perches
  0 siblings, 1 reply; 4+ messages in thread
From: Joe Perches @ 2020-10-11 18:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LKML, Jiri Kosina

On Sun, 2020-10-11 at 11:42 -0700, Linus Torvalds wrote:
> On Tue, Oct 6, 2020 at 4:13 PM Joe Perches <joe@perches.com> wrote:
> > Almost all source files in the kernel use a standardized SPDX header
> > at line 1 with a comment /* initiator and terminator */:
> > 
> > /* SPDX-License-Identifier: <license> */
> > 
> > $ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
> >   wc -l
> > 17847
> 
> That grep pattern makes zero sense.
> 
> Why would */ be special at all? It isn't.
> 
>   $ git grep SPDX-License-Identifier: | wc -l
>   52418
> 
> and a *LOT* of those are shell scripts and use "#", or are C sources
> and use "//" etc.
> 
> So your "standardization" is completely pointless. Anybody who expects
> that pattern is just doing something fundamentally wrong, because the
> pattern you want to standardize around is simply not valid.

It's just a trivial grep pattern to determine if the c90 style
SPDX-License-Identifier is in an individual single line comment.

Almost all are.
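
Spelled out per line, the check amounts to something like the sketch
below. The function name spdx_c90_kind and its three labels are made up
for illustration; it uses plain grep -E rather than -P so it does not
depend on PCRE support.

```shell
#!/bin/sh
# Sketch only: classify one line of text.
#   terminated   - c90 SPDX comment closed on the same line
#   unterminated - c90 SPDX comment left open
#   other        - anything else ("//", "#", no SPDX tag, ...)
spdx_c90_kind() {
    case $1 in
        '/* SPDX-License-Identifier:'*) ;;
        *) echo other; return ;;
    esac
    if printf '%s\n' "$1" | grep -Eq '\*/[[:space:]]*$'; then
        echo terminated
    else
        echo unterminated
    fi
}

spdx_c90_kind '/* SPDX-License-Identifier: GPL-2.0 */'
spdx_c90_kind '/* SPDX-License-Identifier: GPL-2.0'
spdx_c90_kind '// SPDX-License-Identifier: GPL-2.0'
```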




* Re: git grep/sed to standardize "/* SPDX-License-Identifier: <license>"
  2020-10-11 18:47   ` Joe Perches
@ 2020-10-11 20:29     ` Joe Perches
  0 siblings, 0 replies; 4+ messages in thread
From: Joe Perches @ 2020-10-11 20:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: LKML, Jiri Kosina

On Sun, 2020-10-11 at 11:47 -0700, Joe Perches wrote:
> On Sun, 2020-10-11 at 11:42 -0700, Linus Torvalds wrote:
> > On Tue, Oct 6, 2020 at 4:13 PM Joe Perches <joe@perches.com> wrote:
> > > Almost all source files in the kernel use a standardized SPDX header
> > > at line 1 with a comment /* initiator and terminator */:
> > > 
> > > /* SPDX-License-Identifier: <license> */
> > > 
> > > $ git grep -PHn '^/\* SPDX-License-Identifier:.*\*/\s*$' | \
> > >   wc -l
> > > 17847
> > 
> > That grep pattern makes zero sense.
> > 
> > Why would */ be special at all? It isn't.
> > 
> >   $ git grep SPDX-License-Identifier: | wc -l
> >   52418
> > 
> > and a *LOT* of those are shell scripts and use "#", or are C sources
> > and use "//" etc.
> > 
> > So your "standardization" is completely pointless. Anybody who expects
> > that pattern is just doing something fundamentally wrong, because the
> > pattern you want to standardize around is simply not valid.

btw:

The script would merely change these c90 comments to use the style
mandated by/proposed in Documentation/process/license-rules.rst
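
For reference, that document lists one comment style per file type. A
small sketch of the mapping follows; the function name
spdx_expected_style is made up, and keying on the file extension is my
simplification of the per-type list in license-rules.rst, so treat the
extension list as an assumption rather than as complete.

```shell
#!/bin/sh
# Sketch only: print the SPDX line style expected for a given file name,
# following the per-type styles listed in license-rules.rst.
spdx_expected_style() {
    case $1 in
        *.c|*.dts|*.dtsi) echo '// SPDX-License-Identifier: <license>' ;;
        *.h|*.S)          echo '/* SPDX-License-Identifier: <license> */' ;;
        *.sh|*.py|*.pl)   echo '# SPDX-License-Identifier: <license>' ;;
        *.rst)            echo '.. SPDX-License-Identifier: <license>' ;;
        *)                echo 'unknown' ;;
    esac
}

spdx_expected_style include/linux/sample.h
spdx_expected_style drivers/sample/sample.c
```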




