All of lore.kernel.org
 help / color / mirror / Atom feed
* Unused #include statements
@ 2015-01-15  3:43 Zoltan Klinger
  2015-01-15  4:14 ` Robert Schiele
  0 siblings, 1 reply; 8+ messages in thread
From: Zoltan Klinger @ 2015-01-15  3:43 UTC (permalink / raw)
  To: GIT Mailing-list

Hello there,

Since reading a post [1] about removing some unnecessary #include statements
from a git C source file I've been intrigued to see how many more might be
lurking in the code base.

After a bit of digging around, my brute force approach of 'remove as many
#includes as possible while making sure the code still successfully compiles'
has returned the following results:


$ git diff --stat
 alloc.c                                                    |  2 --
 archive-zip.c                                              |  1 -
 archive.c                                                  |  1 -
 argv-array.c                                               |  1 -
 bisect.c                                                   |  9 ---------
 block-sha1/sha1.c                                          |  2 --
 branch.c                                                   |  1 -
 builtin/add.c                                              |  7 -------
 builtin/annotate.c                                         |  1 -
 builtin/apply.c                                            |  8 --------
 builtin/archive.c                                          |  3 ---
 builtin/bisect--helper.c                                   |  2 --
 builtin/blame.c                                            | 13 -------------
 builtin/branch.c                                           |  8 --------
 builtin/bundle.c                                           |  1 -
 builtin/cat-file.c                                         |  3 ---
 builtin/check-attr.c                                       |  2 --
 builtin/check-ignore.c                                     |  2 --
 builtin/check-mailmap.c                                    |  2 --
 builtin/check-ref-format.c                                 |  2 --
 builtin/checkout-index.c                                   |  2 --
 builtin/checkout.c                                         | 11 -----------
 builtin/clean.c                                            |  3 ---
 builtin/clone.c                                            | 10 ----------
 builtin/column.c                                           |  3 ---
 builtin/commit-tree.c                                      |  4 ----
 builtin/commit.c                                           | 13 -------------
 builtin/config.c                                           |  1 -
 builtin/count-objects.c                                    |  2 --
 builtin/credential.c                                       |  1 -
 builtin/describe.c                                         |  5 -----
 builtin/diff-files.c                                       |  4 ----
 builtin/diff-index.c                                       |  4 ----
 builtin/diff-tree.c                                        |  4 ----
 builtin/diff.c                                             |  6 ------
 builtin/fast-export.c                                      |  8 --------
 builtin/fetch.c                                            |  8 --------
 builtin/fmt-merge-msg.c                                    |  6 ------
 builtin/for-each-ref.c                                     |  6 ------
 builtin/fsck.c                                             |  7 -------
 builtin/gc.c                                               |  3 ---
 builtin/get-tar-commit-id.c                                |  3 ---
 builtin/grep.c                                             |  6 ------
 builtin/hash-object.c                                      |  2 --
 builtin/help.c                                             |  2 --
 builtin/index-pack.c                                       |  5 -----
 builtin/init-db.c                                          |  1 -
 builtin/interpret-trailers.c                               |  3 ---
 builtin/log.c                                              | 14 --------------
 builtin/ls-files.c                                         |  3 ---
 builtin/ls-remote.c                                        |  3 ---
 builtin/ls-tree.c                                          |  3 ---
 builtin/mailinfo.c                                         |  2 --
 builtin/mailsplit.c                                        |  3 ---
 builtin/merge-base.c                                       |  5 -----
 builtin/merge-file.c                                       |  2 --
 builtin/merge-index.c                                      |  1 -
 builtin/merge-ours.c                                       |  1 -
 builtin/merge-recursive.c                                  |  3 ---
 builtin/merge-tree.c                                       |  2 --
 builtin/merge.c                                            | 10 ----------
 builtin/mktree.c                                           |  2 --
 builtin/mv.c                                               |  4 ----
 builtin/name-rev.c                                         |  3 ---
 builtin/notes.c                                            |  4 ----
 builtin/pack-objects.c                                     | 14 --------------
 builtin/prune-packed.c                                     |  1 -
 builtin/prune.c                                            |  5 -----
 builtin/push.c                                             |  6 ------
 builtin/read-tree.c                                        |  4 ----
 builtin/receive-pack.c                                     | 12 ------------
 builtin/reflog.c                                           |  5 -----
 builtin/remote-ext.c                                       |  1 -
 builtin/remote-fd.c                                        |  1 -
 builtin/remote.c                                           |  6 ------
 builtin/repack.c                                           |  6 ------
 builtin/replace.c                                          |  1 -
 builtin/rerere.c                                           |  4 ----
 builtin/reset.c                                            |  7 -------
 builtin/rev-list.c                                         |  7 -------
 builtin/rev-parse.c                                        |  5 -----
 builtin/revert.c                                           |  4 ----
 builtin/rm.c                                               |  4 ----
 builtin/send-pack.c                                        |  7 -------
 builtin/shortlog.c                                         |  7 -------
 builtin/show-branch.c                                      |  3 ---
 builtin/show-ref.c                                         |  5 -----
 builtin/stripspace.c                                       |  1 -
 builtin/symbolic-ref.c                                     |  1 -
 builtin/tag.c                                              |  4 ----
 builtin/unpack-objects.c                                   |  7 -------
 builtin/update-index.c                                     |  7 -------
 builtin/update-ref.c                                       |  3 ---
 builtin/update-server-info.c                               |  1 -
 builtin/upload-archive.c                                   |  5 -----
 builtin/verify-commit.c                                    |  5 -----
 builtin/verify-pack.c                                      |  1 -
 builtin/verify-tag.c                                       |  5 -----
 builtin/write-tree.c                                       |  2 --
 bulk-checkin.c                                             |  3 ---
 bundle.c                                                   |  5 -----
 cache-tree.c                                               |  2 --
 check-racy.c                                               |  1 -
 column.c                                                   |  2 --
 combine-diff.c                                             |  6 ------
 commit.c                                                   |  7 -------
 compat/basename.c                                          |  1 -
 compat/fopen.c                                             |  1 -
 compat/gmtime.c                                            |  1 -
 compat/hstrerror.c                                         |  3 ---
 compat/inet_ntop.c                                         |  1 -
 compat/inet_pton.c                                         |  1 -
 compat/mingw.c                                             |  7 -------
 compat/mkdir.c                                             |  1 -
 compat/mkdtemp.c                                           |  1 -
 compat/mmap.c                                              |  1 -
 compat/msvc.c                                              |  5 -----
 compat/nedmalloc/nedmalloc.c                               |  2 --
 compat/obstack.c                                           |  1 -
 compat/poll/poll.c                                         |  6 ------
 compat/pread.c                                             |  1 -
 compat/precompose_utf8.c                                   |  1 -
 compat/qsort.c                                             |  1 -
 compat/regex/regex.c                                       |  2 --
 compat/setenv.c                                            |  1 -
 compat/snprintf.c                                          |  1 -
 compat/strcasestr.c                                        |  1 -
 compat/strlcpy.c                                           |  1 -
 compat/strtoimax.c                                         |  1 -
 compat/strtoumax.c                                         |  1 -
 compat/terminal.c                                          |  2 --
 compat/unsetenv.c                                          |  1 -
 compat/win32/dirent.c                                      |  1 -
 compat/win32/pthread.c                                     |  4 ----
 compat/win32/syslog.c                                      |  1 -
 compat/win32mmap.c                                         |  1 -
 compat/winansi.c                                           |  3 ---
 config.c                                                   |  4 ----
 connect.c                                                  |  4 ----
 connected.c                                                |  2 --
 contrib/convert-objects/convert-objects.c                  |  4 ----
 .../gnome-keyring/git-credential-gnome-keyring.c           |  6 ------
 .../credential/osxkeychain/git-credential-osxkeychain.c    |  4 ----
 contrib/credential/wincred/git-credential-wincred.c        |  4 ----
 contrib/examples/builtin-fetch--tool.c                     |  5 -----
 contrib/svn-fe/svn-fe.c                                    |  2 --
 convert.c                                                  |  2 --
 credential-cache--daemon.c                                 |  2 --
 credential-cache.c                                         |  3 ---
 credential-store.c                                         |  1 -
 credential.c                                               |  1 -
 csum-file.c                                                |  1 -
 daemon.c                                                   |  3 ---
 diff-delta.c                                               |  1 -
 diff-lib.c                                                 |  5 -----
 155 files changed, 562 deletions(-)


So my questions are as follows:

(1) Is it worth turning this into a proper patch?

(2) Given the large number of files (150+) what would be the best
    approach in preparing the patch?

        (a) One commit containing all the changes? Would it be a bit too much
            to digest in one go?

        (b) One commit per file changed? Feels a bit over the top also it
            would flood the mailing list.

        (c) One commit each for changes in the root directory, builtin,
            compat and contrib directories?

Thanks,
Zoltan

[1] http://article.gmane.org/gmane.comp.version-control.git/262402

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Unused #include statements
  2015-01-15  3:43 Unused #include statements Zoltan Klinger
@ 2015-01-15  4:14 ` Robert Schiele
  2015-01-15  6:33   ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Schiele @ 2015-01-15  4:14 UTC (permalink / raw)
  To: Zoltan Klinger; +Cc: GIT Mailing-list

Hi Zoltan,

I can't make a statement for the git project but I consider this kind
of brute-force removal a very problematic approach for languages like
C and C++. The reason for that is simple: Often header files include
other header files since their content depends on those other header
files. Let's assume a header file a.h includes b.h. Now consider a
file c.c includes both a.h and b.h since they are actually using stuff
from both of them. Your brute-force approach would now remove b.h
since it is indirectly pulled in through a.h. While the removal would
no lead to technially incorrect code it would no longer reflect the
semantical situation (since c.c still uses stuff from b.h). Even worse
is that if you later modify c.c to no longer use stuff from a.h you
could no longer remove a.h since that way you would break the chain to
b.h.

Thus doing those kind of brute-force removals generally makes the
include structure in a project very fragile. The analysis itself you
did is still useful to identify header files that can potentially be
removed but removing them without further analysis I would consider
problematic.

Robert

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Unused #include statements
  2015-01-15  4:14 ` Robert Schiele
@ 2015-01-15  6:33   ` Jeff King
  2015-01-15 18:50     ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2015-01-15  6:33 UTC (permalink / raw)
  To: Robert Schiele; +Cc: Zoltan Klinger, GIT Mailing-list

On Thu, Jan 15, 2015 at 05:14:39AM +0100, Robert Schiele wrote:

> Thus doing those kind of brute-force removals generally makes the
> include structure in a project very fragile. The analysis itself you
> did is still useful to identify header files that can potentially be
> removed but removing them without further analysis I would consider
> problematic.

I would second that. Besides leading to a potentially fragile result,
this analysis was done only for a particular platform with a particular
set of config knobs.

One of our rules is that git-compat-util.h (or one of the well-known
headers which includes, cache.h or builtin.h) is included first in any
translation unit. This gives git-compat-util the cleanest environment
possible for making decisions, and lets macros it defines effect the
rest of the code consistently. I suspect on modern platforms like
Linux/glibc that it is not a huge deal to include git-compat-util a
little late, simply because it does not have all that much to do. But
on Solaris 8? Who knows.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Unused #include statements
  2015-01-15  6:33   ` Jeff King
@ 2015-01-15 18:50     ` Junio C Hamano
  2015-01-15 22:38       ` Jeff King
  2015-01-20  2:08       ` Zoltan Klinger
  0 siblings, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2015-01-15 18:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list

Jeff King <peff@peff.net> writes:

> One of our rules is that git-compat-util.h (or one of the well-known
> headers which includes, cache.h or builtin.h) is included first in any
> translation unit.

Perhaps by now we can spell it out a bit more explicitly and update
CodingGuidelines.  We have:

 - The first #include in C files, except in platform specific
   compat/ implementations, should be git-compat-util.h or another
   header file that includes it, such as cache.h or builtin.h.

"such as" might be making things unnecessarily vague; I do not think
a valid reason why we should say a .c file that includes "advice.h"
as the first thing satisfies this requirement, for example.

Because:

 - A command that interacts with the object store, config subsystem,
   the index, or the working tree cannot do anything without using
   what is declared in "cache.h".

 - A built-in command must be declared in "builtin.h", so anything
   in builtin/*.c must include it.

it may be reasonable to say the first *.h file included must be one
of git-compat-util.h, cache.h or builtin.h (and then we make sure
that compat-util is the first thing included in either of the latter
two).

While I very much agree with the principle Robert alluded to [*1*],
we may want to loosen that for .c files that include "builtin.h" or
"cache.h" for the sake of brevity.  For example, if you are builtin
(hence you start by #include "builtin.h"), it may be reasonable to
allow you to take whatever is in "cache.h" for granted [*2*].

So the rule might be:

 - The first #include in C files, except in platform specific
   compat/ implementations, must be either git-compat-util.h,
   cache.h or builtin.h.

 - A C file must directly include the header files that declare the
   functions and the types it uses, except for the functions and
   types that are made available to it by including one of the
   header files it must include by the previous rule.

Optionally, 

 - A C file must include only one of "git-compat-util.h", "cache.h"
   or "builtin.h"; e.g. if you include "builtin.h", do not include
   the other two, but it can consider what is availble in "cache.h"
   available to it.

Thoughts?  I am not looking forward to a torrent of patches whose
sole purpose is to make the existing C files conform to any such
rule, though.  Clean-up patches that trickle in at a low rate is
tolerable, but a torrent is too distracting.


[Footnote]

*1* For example, even though "diff.h" may include "tree-walk.h" for
its own use, a .c file that includes "diff.h" without including
"tree-walk.h" that uses update_tree_entry() or anything that is
declared in the latter is very iffy from semantic point of view.

*2* Because that facility is so widely used inside the codebase,
"builtin.h" includes "strbuf.h", so in addition to what are in
"cache.h", we may want to allow builtin implementations to take
strbufs for granted as well.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Unused #include statements
  2015-01-15 18:50     ` Junio C Hamano
@ 2015-01-15 22:38       ` Jeff King
  2015-01-15 23:20         ` Junio C Hamano
  2015-01-20  2:08       ` Zoltan Klinger
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff King @ 2015-01-15 22:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list

On Thu, Jan 15, 2015 at 10:50:45AM -0800, Junio C Hamano wrote:

> So the rule might be:
> 
>  - The first #include in C files, except in platform specific
>    compat/ implementations, must be either git-compat-util.h,
>    cache.h or builtin.h.
> 
>  - A C file must directly include the header files that declare the
>    functions and the types it uses, except for the functions and
>    types that are made available to it by including one of the
>    header files it must include by the previous rule.

Yeah, that makes sense (and is what I took away from the existing rule
in CodingGuidelines, but I agree what is there is not very rigorous).

> Optionally, 
> 
>  - A C file must include only one of "git-compat-util.h", "cache.h"
>    or "builtin.h"; e.g. if you include "builtin.h", do not include
>    the other two, but it can consider what is availble in "cache.h"
>    available to it.
> 
> Thoughts?  I am not looking forward to a torrent of patches whose
> sole purpose is to make the existing C files conform to any such
> rule, though.  Clean-up patches that trickle in at a low rate is
> tolerable, but a torrent is too distracting.

I don't think the "optionally" one above is that necessary. Not because
I don't agree with it, but because I do not know that we want to get
into the business of laying out every minute detail and implication.
The CodingGuidelines document is meant to be guidelines, and I do not
want to see arguments like "well, the guidelines do not explicitly
_disallow_ this, so you must accept it or add something to the
guideline". That is a waste of everybody's time.

A general philosophy + good taste (from the submitter and the
maintainer) should ideally be enough. And hopefully would stop a torrent
of "but this file doesn't conform to the letter of CodingGuidelines!".
Maybe it does not, but if there is no tangible benefit besides blindly
following some rules, it is not worth the precious time of developers.

Which isn't to say we shouldn't clarify the document when need be. But I
think what I quoted at the top already is probably a good improvement
over what is there.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Unused #include statements
  2015-01-15 22:38       ` Jeff King
@ 2015-01-15 23:20         ` Junio C Hamano
  2015-01-16  0:00           ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2015-01-15 23:20 UTC (permalink / raw)
  To: Jeff King; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list

Jeff King <peff@peff.net> writes:

> On Thu, Jan 15, 2015 at 10:50:45AM -0800, Junio C Hamano wrote:
>>  ...
>> Thoughts?  I am not looking forward to a torrent of patches whose
>> sole purpose is to make the existing C files conform to any such
>> rule, though.  Clean-up patches that trickle in at a low rate is
>> tolerable, but a torrent is too distracting.
>
> I don't think the "optionally" one above is that necessary. Not because
> I don't agree with it, but because I do not know that we want to get
> into the business of laying out every minute detail and implication.
> The CodingGuidelines document is meant to be guidelines, and I do not
> want to see arguments like "well, the guidelines do not explicitly
> _disallow_ this, so you must accept it or add something to the
> guideline". That is a waste of everybody's time.

Totally.  I know we do not want to get into that business.

> A general philosophy + good taste (from the submitter and the
> maintainer) should ideally be enough.

Yes, "ideally" ;-)

> Which isn't to say we shouldn't clarify the document when need be. But I
> think what I quoted at the top already is probably a good improvement
> over what is there.

OK, thanks.  Let's queue something like this for post 2.3 cycle,
then.

-- >8 --
Subject: CodingGuidelines: clarify C #include rules

Even though "advice.h" includes "git-compat-util.h", it is not
sensible to have it as the first #include and indirectly satisify
the "You must give git-compat-util.h a clean environment to set up
feature test macros before including any of the system headers are
included", which is the real requirement.

Because:

 - A command that interacts with the object store, config subsystem,
   the index, or the working tree cannot do anything without using
   what is declared in "cache.h";

 - A built-in command must be declared in "builtin.h", so anything
   in builtin/*.c must include it;

 - These two headers both include "git-compat-util.h" as the first
   thing; and

 - Almost all our *.c files (outside compat/ and borrowed files in
   xdiff/) need some Git-ness from "cache.h" to do something
   Git-ish.

let's explicitly specify that one of these three header files must
be the first thing that is included.

Any of our *.c file should include the header file that directly
declares what it uses, instead of relying on the fact that some *.h
file it includes happens to include another *.h file that declares
the necessary function or type.  Spell it out as another guideline
item.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/CodingGuidelines | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 894546d..578d07c 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -328,9 +328,14 @@ For C programs:
 
  - When you come up with an API, document it.
 
- - The first #include in C files, except in platform specific
-   compat/ implementations, should be git-compat-util.h or another
-   header file that includes it, such as cache.h or builtin.h.
+ - The first #include in C files, except in platform specific compat/
+   implementations, must be either "git-compat-util.h", "cache.h" or
+   "builtin.h".  You do not have to include more than one of these.
+
+ - A C file must directly include the header files that declare the
+   functions and the types it uses, except for the functions and types
+   that are made available to it by including one of the header files
+   it must include by the previous rule.
 
  - If you are planning a new command, consider writing it in shell
    or perl first, so that changes in semantics can be easily

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Unused #include statements
  2015-01-15 23:20         ` Junio C Hamano
@ 2015-01-16  0:00           ` Jeff King
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2015-01-16  0:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Robert Schiele, Zoltan Klinger, GIT Mailing-list

On Thu, Jan 15, 2015 at 03:20:09PM -0800, Junio C Hamano wrote:

> OK, thanks.  Let's queue something like this for post 2.3 cycle,
> then.
> 
> -- >8 --
> Subject: CodingGuidelines: clarify C #include rules
> [...]

Thanks, this looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Unused #include statements
  2015-01-15 18:50     ` Junio C Hamano
  2015-01-15 22:38       ` Jeff King
@ 2015-01-20  2:08       ` Zoltan Klinger
  1 sibling, 0 replies; 8+ messages in thread
From: Zoltan Klinger @ 2015-01-20  2:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Robert Schiele, GIT Mailing-list

Robert, Peff and Junio.

Thank you all for your feedback. It's clear now what sort of analysis I should
aim towards.

Thanks,
Zoltan

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-01-20  2:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-15  3:43 Unused #include statements Zoltan Klinger
2015-01-15  4:14 ` Robert Schiele
2015-01-15  6:33   ` Jeff King
2015-01-15 18:50     ` Junio C Hamano
2015-01-15 22:38       ` Jeff King
2015-01-15 23:20         ` Junio C Hamano
2015-01-16  0:00           ` Jeff King
2015-01-20  2:08       ` Zoltan Klinger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.