All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tom Rix <trix@redhat.com>
To: Nick Desaulniers <ndesaulniers@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	clang-built-linux <clang-built-linux@googlegroups.com>,
	linux-toolchains@vger.kernel.org, Joe Perches <joe@perches.com>,
	Julia.Lawall@lip6.fr,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Nathan Huckleberry <nhuck15@gmail.com>
Subject: Re: Subject: [RFC] clang tooling cleanups
Date: Tue, 27 Oct 2020 12:33:59 -0700	[thread overview]
Message-ID: <8abd1e5a-511a-e4f6-6f2c-a045d33fa2aa@redhat.com> (raw)
In-Reply-To: <CAKwvOd=83v0Sv-NhQ5xgqdNSRm2b=pOJDziX8axZ9t2YyYwz-A@mail.gmail.com>


On 10/27/20 11:42 AM, Nick Desaulniers wrote:
> (cutting down the CC list to something more intimate)
>
> Tom, I'm over the moon to see clang-tidy being used this way.  I
> totally forgot it could automatically apply fixits.  I'm glad Nathan
> and Masahiro were able to get the foundation laid for running
> clang-tidy on the kernel this summer.
>
> On Tue, Oct 27, 2020 at 9:43 AM <trix@redhat.com> wrote:
>> This rfc will describe
>> An upcoming treewide cleanup.
>> How clang tooling was used to programatically do the clean up.
>> Solicit opinions on how to generally use clang tooling.
>>
>> The clang warning -Wextra-semi-stmt produces about 10k warnings.
>> Reviewing these, a subset of semicolon after a switch looks safe to
>> fix all the time.  An example problem
>>
>> void foo(int a) {
>>      switch(a) {
>>                case 1:
>>                ...
>>      }; <--- extra semicolon
>> }
>>
>> Treewide, there are about 100 problems in 50 files for x86_64 allyesconfig.
>> These fixes will be the upcoming cleanup.
>>
>> clang already supports fixing this problem. Add to your command line
>>
>>   clang -c -Wextra-semi-stmt -Xclang -fixit foo.c
>>
>>   foo.c:8:3: warning: empty expression statement has no effect;
>>     remove unnecessary ';' to silence this warning [-Wextra-semi-stmt]
>>         };
>>          ^
>>   foo.c:8:3: note: FIX-IT applied suggested code changes
>>   1 warning generated.
> Ah, doesn't that rely on clang-tidy to apply the fixit?  (oh, what,
> maybe not: https://stackoverflow.com/a/49749277)
>
> And doesn't that require your patch to clang-tidy to land?
> https://reviews.llvm.org/D90180
>
> Now I'm confused; if clang can apply the fixit for warnings, why do we
> need another patch to clang-tidy?

No, this shows where the fixer is upstream.

I am in the process of pushing out the patches.

Long term the clang-tidy part of the build will change once it lands.

globbing the checker to -checker=-*,linuxkernel* would be easiest on the kernel

but that may not be where the checker lands.

>> The big problem is using this treewide is it will fix all 10k problems.
>> 10k changes to analyze and upstream is not practical.
>>
>> Another problem is the generic fixer only removes the semicolon.
>> So empty lines with some tabs need to be manually cleaned.
>>
>> What is needed is a more precise fixer.
>>
>> Enter clang-tidy.
>> https://clang.llvm.org/extra/clang-tidy/
>>
>> Already part of the static checker infrastructure, invoke on the clang
>> build with
>>   make clang-tidy
>>
>> It is only a matter of coding up a specific checker for the cleanup.
>> Upstream this is review is happening here
>> https://reviews.llvm.org/D90180
> Sorry, I still don't understand how the clang-tidy checker wont also
> produce 10k fixes?

I am interested in treewide fixes.

Cleaning them up (maybe me not doing all the patches) and keeping them clean.

The clang -Wextra-semi-stmt -fixit will fix all 10,000 problems

This clang tidy fixer will fix only the 100 problems that are 'switch() {};'

When doing a treewide cleanup, batching a bunch of fixes that are the same problem and fix

is much easier on everyone to review and more likely to be accepted.


Long term, a c/i system would keep the tree clean by running the switch-semi checker/fixer.

And we can all move onto the next problem.

>
>> The development of a checker/fixer is
>> Start with a reproducer
>>
>> void foo (int a) {
>>   switch (a) {};
>> }
>>
>> Generate the abstract syntax tree (AST)
>>
>>   clang -Xclang -ast-dump foo.c
>>
>> `-FunctionDecl
>>   |-ParmVarDecl
>>   `-CompoundStmt
>>     |-SwitchStmt
>>     | |-ImplicitCastExpr
>>     | | `-DeclRefExpr
>>     | `-CompoundStmt
>>     `-NullStmt
>>
>> Write a matcher to get you most of the way
>>
>> void SwitchSemiCheck::registerMatchers(MatchFinder *Finder) {
>>   Finder->addMatcher(
>>       compoundStmt(has(switchStmt().bind("switch"))).bind("comp"), this);
>> }
>>
>> The 'bind' method is important, it allows a string to be associated
>> with a node in the AST.  In this case these are
>>
>> `-FunctionDecl
>>   |-ParmVarDecl
>>   `-CompoundStmt <-------- comp
>>     |-SwitchStmt <-------- switch
>>     | |-ImplicitCastExpr
>>     | | `-DeclRefExpr
>>     | `-CompoundStmt
>>     `-NullStmt
>>
>> When a match is made the 'check' method will be called.
>>
>>   void SwitchSemiCheck::check(const MatchFinder::MatchResult &Result) {
>>     auto *C = Result.Nodes.getNodeAs<CompoundStmt>("comp");
>>     auto *S = Result.Nodes.getNodeAs<SwitchStmt>("switch");
>>
>> This is where the string in the bind calls are changed to nodes
>>
>> `-FunctionDecl
>>   |-ParmVarDecl
>>   `-CompoundStmt <-------- comp, C
>>     |-SwitchStmt <-------- switch, S
>>     | |-ImplicitCastExpr
>>     | | `-DeclRefExpr
>>     | `-CompoundStmt
>>     `-NullStmt <---------- looking for N
>>
>> And then more logic to find the NullStmt
>>
>>   auto Current = C->body_begin();
>>   auto Next = Current;
>>   Next++;
>>   while (Next != C->body_end()) {
>>     if (*Current == S) {
>>       if (const auto *N = dyn_cast<NullStmt>(*Next)) {
>>
>> When it is found, a warning is printed and a FixItHint is proposed.
>>
>>   auto H = FixItHint::CreateReplacement(
>>     SourceRange(S->getBody()->getEndLoc(), N->getSemiLoc()), "}");
>>   diag(N->getSemiLoc(), "unneeded semicolon") << H;
>>
>> This fixit replaces from the end of switch to the semicolon with a
>> '}'.  Because the end of the switch is '}' this has the effect of
>> removing all the whitespace as well as the semicolon.
>>
>> Because of the checker's placement in clang-tidy existing linuxkernel
>> checkers, all that was needed to fix the tree was to add a '-fix'to the
>> build's clang-tidy call.
> I wonder if there's a way to differentiate existing checks we'd prefer
> to run continuously vs newer noisier ones?  Drowning in a sea of 10k
> -Wextra-semi-stmt doesn't sound like fun.  Maybe a new target for make
> to differentiate reporting vs auto fixing?
>
>> I am looking for opinions on what we want to do specifically with
>> cleanups and generally about other source-to-source programmatic
>> changes to the code base.
>>
>> For cleanups, I think we need a new toplevel target
>>
>> clang-tidy-fix
> ah, yep, I agree.  Though I'm curious now that I know that clang can
> be used as the driver to apply fixits rather than clang-tidy, how else
> we can leverage clang over manually writing clang-tidy checks.  Unless
> I have something confused there?

Nope.

IMO clang fixits are too coarse and will never work treewide.

Comparing my recent treewide fixing of unneeded breaks and semi's, I would much rather write a tool

than manually look at or fix anything treewide.

Tom

>
>> And an explicit list of fixers that have a very high (100%?) fix rate.
>>
>> Ideally a bot should make the changes, but a bot could also nag folks.
>> Is there interest in a bot making the changes? Does one already exist?
> Most recently Joe sent a treewide fix for section attributes that
> Linux pulled just after the merge window closed, IIUC.  Maybe that
> would be the best time, since automation makes it trivial for anyone
> to run the treewide fixit whenever.
>
>> The general source-to-source is a bit blue sky.  Ex/ could automagicly
>> refactor api, outline similar cut-n-pasted functions etc. Anything on
>> someone's wishlist you want to try out ?
>>
>> Signed-off-by: Tom Rix <trix@redhat.com>
>>
>> --


  reply	other threads:[~2020-10-27 19:34 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-27 16:42 Subject: [RFC] clang tooling cleanups trix
2020-10-27 16:42 ` trix
2020-10-27 16:42 ` trix
2020-10-27 16:42 ` trix
2020-10-27 16:42 ` trix
2020-10-27 16:42 ` trix
2020-10-27 18:42 ` Nick Desaulniers
2020-10-27 19:33   ` Tom Rix [this message]
2020-10-27 19:51     ` Joe Perches
2020-10-27 20:50       ` Nick Desaulniers
2020-10-27 21:21         ` Tom Rix
2020-10-27 21:09       ` Tom Rix
2020-10-27 21:34         ` Joe Perches
2020-10-27 19:39   ` Linus Torvalds
2020-10-28  3:26 ` Finn Thain
2020-10-28  3:26   ` Finn Thain
2020-10-28  3:26   ` Finn Thain
2020-10-28  3:26   ` Finn Thain
2020-10-28  3:26   ` Finn Thain
2020-10-28  3:26   ` Finn Thain
2020-11-10  2:52 ` Subject: " Joe Perches
2020-11-10  2:52   ` Joe Perches
2020-11-10  2:52   ` Joe Perches
2020-11-10  2:52   ` Joe Perches
2020-11-10  2:52   ` Joe Perches
2020-11-10  2:52   ` [Cocci] " Joe Perches
2020-11-10  2:52   ` Joe Perches
2020-11-10 13:12   ` Tom Rix
2020-11-10 13:12     ` Tom Rix
2020-11-10 13:12     ` Tom Rix
2020-11-10 13:12     ` Tom Rix
2020-11-10 13:12     ` Tom Rix
2020-11-10 13:12     ` [Cocci] " Tom Rix
2020-11-10 13:12     ` Tom Rix

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8abd1e5a-511a-e4f6-6f2c-a045d33fa2aa@redhat.com \
    --to=trix@redhat.com \
    --cc=Julia.Lawall@lip6.fr \
    --cc=clang-built-linux@googlegroups.com \
    --cc=joe@perches.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=nhuck15@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.