From: Kyle Lippincott <spectral@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Calvin Wan <calvinwan@google.com>,
	git@vger.kernel.org,  Jonathan Tan <jonathantanmy@google.com>,
	phillip.wood123@gmail.com
Subject: Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
Date: Mon, 26 Feb 2024 10:59:38 -0800
Message-ID: <CAO_smVi76TbmHd5w2rpBEEYbaw46SNTrekFHE-ohDC6-=dk6DA@mail.gmail.com>
In-Reply-To: <xmqqh6i0cgyo.fsf@gitster.g>

On Thu, Feb 22, 2024 at 1:44 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Calvin Wan <calvinwan@google.com> writes:
>
> > From: Jonathan Tan <jonathantanmy@google.com>
> >
> > pager.h uses uintmax_t but does not include stdint.h. Therefore, add
> > this include statement.
> >
> > This was discovered when writing a stub pager.c file.
> >
> > Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> > ---
> >  pager.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/pager.h b/pager.h
> > index b77433026d..015bca95e3 100644
> > --- a/pager.h
> > +++ b/pager.h
> > @@ -1,6 +1,8 @@
> >  #ifndef PAGER_H
> >  #define PAGER_H
> >
> > +#include <stdint.h>
> > +
> >  struct child_process;
> >
> >  const char *git_pager(int stdout_is_tty);
>
> This is not going in a sensible direction from our portability
> standard's point of view.
>
> The reason why we do not include these system headers directly to
> our source files, and instead make it a rule to include
> <git-compat-util.h> as the first header instead, is exactly because
> there are curiosities in various platforms that Git wants to run on
> which system include headers give us the declarations for types and
> functions we rely on, in what order they must be included, and after
> what feature macros (the ones that give adjustment to what the
> system headers do, like _POSIX_C_SOURCE) are defined, etc.
>
> Given that in <git-compat-util.h>, inclusion of <stdint.h> is
> conditional behind some #ifdef's, it does not look like a sensible
> change.  It is not very likely for <inttypes.h> and <stdint.h> to
> declare uintmax_t in an incompatible way, but on a platform where
> <git-compat-util.h> decides to include <inttypes.h> and use its
> definition of what uintmax_t is, we should follow the same choice
> and be consistent.

Speaking of this specific header file inclusion and the oddities that
have gotten us to where we are:
- Originally, it seems that we were including stdint.h.
- 17 years ago, to work around Solaris providing inttypes.h but not
stdint.h, it was switched to being just inttypes.h, with the
explanation being that inttypes.h is a superset of stdint.h.
(https://github.com/git/git/commit/007e2ba65902b484fc65a313e54594a009841740)
- 13 years ago, to work around some platforms not having inttypes.h,
it was made conditional.
(https://github.com/git/git/commit/2844923d62a4c408bd59ddb2caacca4aa7eb86bc)

The condition added 13 years ago was, IMHO, backwards from what it
should have been. The intent is to have stdint.h included. We should
include stdint.h. I suspect that 17 years is enough time for that
platform to start conforming to what is now a 25-year-old standard,
and I don't know how we can verify that and have this stop being a
haunted graveyard without just trying it and seeing if the build bots
or maintainers identify it as a continuing issue. If it's still an
issue (and only if), we should reintroduce a conditional, but invert
it: if there's no stdint.h, THEN include inttypes.h.

Oh, no, that doesn't work. I tried that, and the build bots told me
that doesn't work, because we're using things from inttypes.h (PRIuMAX
showed up several times in the errors, there may be others). This
makes me wonder how the platforms with no inttypes.h work at all. I
still think we should do something here because it's a 13-year-old
compatibility fix that "shouldn't" be needed anymore, and causes
confusion/concerns like this thread. Maybe just see if we can get away
with always including inttypes.h in git-compat-util.h, or maybe _both_
inttypes.h and stdint.h (in either order), just to be really obvious
that it's acceptable to include stdint.h.
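
For what it's worth, here is a minimal sketch of the "include both"
option. It is not the actual contents of git-compat-util.h, and
format_umax() is a made-up helper used only for illustration. The
point is just that uintmax_t and PRIuMAX are both usable whichever
order the two headers appear in, since C99 specifies that inttypes.h
includes stdint.h:

```c
#include <assert.h>
#include <inttypes.h> /* PRIuMAX; C99 has this header include <stdint.h> */
#include <stdint.h>   /* uintmax_t; repeating it is harmless (include guards) */
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): format a uintmax_t into buf
 * and return the number of characters written.
 */
int format_umax(char *buf, size_t len, uintmax_t value)
{
	int n = snprintf(buf, len, "%" PRIuMAX, value);

	/* The caller is expected to supply a large enough buffer. */
	assert(n >= 0 && (size_t)n < len);
	return n;
}
```

Swapping the two #include lines should make no difference here, which
is the property we would want to make obvious to readers.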

>
> If there is a feature macro that affects sizes of the integer on a
> platform, this patch will break it even more badly.  Perhaps there
> is a platform whose C-library header requires you to define a
> feature macro to use 64-bit, and we may define that feature macro
> in <git-compat-util.h> before including either <inttypes.h> or
> <stdint.h>, but by including <stdint.h> directly like the above
> patch does, only this file and the sources that include only this
> file, refusing to include <git-compat-util.h> as everybody in the
> Git source tree should, will end up using different notion of what
> the integral type with maximum width is from everybody else.

I agree that for pager.h, something like the patch in your next email
would resolve that particular problem. The stub library is of
basically the same stature as git-std-lib: it's code that is provided
by the Git project, compiled by Makefile rules owned and maintained by
the Git project, and should conform to the Git coding standards. The
.c files in the stubs library should include git-compat-util.h;
there's basically no reason not to.

However, I believe that we'll need a good policy for what to do with
libified headers + sources in general. I can see many potential
categorizations of source; there's no need to formally define all of
them and assign files to each category, but the main categories are
basically:
1. files that have code that is an internal part of Git, one of the
helper binaries, or one of its tests, whether it's a library or not.
These should include git-compat-util.h at the top of the .c files like
they do today. The .h files for these translation units are also
considered "internal". These header files should assume that
git-compat-util.h has been included properly, and don't need to be
self-contained, because they're _only_ included by things in this
category.
2. files that have code that defines the "library interface", probably
only the ones defining the library interface _as used by external
projects_. I think that we'll likely need to be principled about
defining these, and having them be as minimal and compatible as
possible.
3. code in external projects that directly uses libraries from the Git
project, and thus includes Git headers from category 2.
4. the rest of the code in external projects (the code that does not
directly use libraries from the Git project).

A hypothetical git-compat-core.h being included at the top of the .c
files in category 2 is feasible, but needs to be carefully handled due
to potential symbol collision (which we're discussing in another
thread and I may have a possible solution for, at least on some
platforms). On the other hand, a git-compat-core.h being included at
the top of the .h files belonging to category 2 doesn't work, because
when these .h files are included by code in category 3, it's too late.

In this example, gnu-source-header.h below is a system header that
changes behavior depending on _GNU_SOURCE (effectively the same
concern as you were raising in the quoted paragraph):

external-project-category3.c:
#include <stdint.h>
#include <gnu-source-header.h>
#include <git/some-lib-interface.h>

git/some-lib-interface.h:
#include <git/git-compat-core.h>

We can't do anything in git/git-compat-core.h that relies on careful
inclusion order, requiring that various things are #defined prior to
the first inclusion of certain headers, etc. stdint.h and
gnu-source-header.h are already included, and so it's too late for us
to #define things that change their behavior, because that won't have
any effect. I don't think it's reasonable to expect the external
project to #include <git/git-compat-core.h> at the top of their files
that are in category 3. It's definitely not reasonable to require the
external project to do that for all their files (category 3 and
category 4). It's slightly more reasonable to have them do some set of
#defines for their binaries and libraries, but still quite awkward and
potentially not feasible (for things like _FILE_OFFSET_BITS). This is
why I split it into 4 categories.
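
To make the "too late" point concrete, here is a single-file sketch.
Everything in it is hypothetical: FAKE_SYS_HEADER_H, FAKE_FEATURE_64
and fake_off_t stand in for a real system header, a feature macro
like _FILE_OFFSET_BITS, and a type like off_t. The header body is
pasted twice to simulate two #include lines:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * First "inclusion" of a hypothetical system header, as done by the
 * external project before any Git header is seen.
 */
#ifndef FAKE_SYS_HEADER_H
#define FAKE_SYS_HEADER_H
#ifdef FAKE_FEATURE_64
typedef int64_t fake_off_t;
#else
typedef int32_t fake_off_t; /* chosen: FAKE_FEATURE_64 is not set yet */
#endif
#endif

/* What a late git-compat-core.h would try to do. */
#define FAKE_FEATURE_64 1

/*
 * Second "inclusion": the include guard turns it into a no-op, so the
 * #define above never reaches the typedef.
 */
#ifndef FAKE_SYS_HEADER_H
#define FAKE_SYS_HEADER_H
#ifdef FAKE_FEATURE_64
typedef int64_t fake_off_t;
#else
typedef int32_t fake_off_t;
#endif
#endif

/* Width in bits of the type the rest of the file actually sees. */
int fake_off_bits(void)
{
#ifdef FAKE_FEATURE_64
	/* The macro is defined here, yet the type is still the narrow one. */
	assert(sizeof(fake_off_t) * CHAR_BIT == 32);
#endif
	return (int)(sizeof(fake_off_t) * CHAR_BIT);
}
```

The feature macro ends up defined for the rest of the file, but the
typedef was already fixed by the first inclusion.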

I believe that category 2 files need to be maximally compatible, both
to platforms [where we provide support for libraries, and I think this
probably will end up being a subset of all the platforms, especially
at first] and to the environment they're being #included in and
interacting with. So they need to be self-contained: they can't rely
on stdint.h having been included, but they also can't rely on it _not_
having been included already. The category 2 .h files need to be
minimal: just what we want in this external library interface, and
ideally nothing else. The category 2 .h and .c files need to be
compatible with common #defines being set OR not set. The point of a
category 2 .c file is to bridge the gap between the category 1 and
category 3 environments. This likely means that we need to be careful
about certain structs and typedefs defined by the system (vs. structs
defined by the category 2 headers themselves) being passed between the
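different environments. For example, suppose we had a library
interface that included a `struct stat`, and the category 3 files were
compiled without _FILE_OFFSET_BITS=64 while the category 1 files were
compiled with it. On platforms where that macro changes the size of
off_t, the two sides would disagree about the struct's layout. Instant
breakage.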
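
A rough sketch of that failure mode, and of one way around it. The
struct and function names below are all invented for illustration;
the two fake_stat_* structs only mimic how a system struct can change
shape under a feature macro, and lib_file_info is an assumption about
what a struct defined by a category 2 header could look like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for `struct stat` as seen WITHOUT _FILE_OFFSET_BITS=64. */
struct fake_stat_narrow {
	int32_t size; /* 32-bit off_t */
	int32_t mtime;
};

/* Stand-in for the same struct as seen WITH _FILE_OFFSET_BITS=64. */
struct fake_stat_wide {
	int64_t size; /* 64-bit off_t */
	int32_t mtime;
};

/* Returns 1 if both sides would find `mtime` at the same offset. */
int fake_stat_layouts_agree(void)
{
	return offsetof(struct fake_stat_narrow, mtime) ==
	       offsetof(struct fake_stat_wide, mtime);
}

/*
 * A struct owned by the (hypothetical) library header: fixed-width
 * members only, so its layout does not depend on feature macros.
 */
struct lib_file_info {
	int64_t size;
	int64_t mtime;
};

/* Category 2 .c code would do the conversion on the Git side. */
struct lib_file_info lib_file_info_from_wide(const struct fake_stat_wide *st)
{
	struct lib_file_info info;

	assert(st != NULL);
	info.size = st->size;
	info.mtime = st->mtime;
	return info;
}
```

Passing the system struct across the boundary means each side reads
`mtime` from a different offset; passing lib_file_info does not have
that problem, at the cost of a copy.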

This aspect of this discussion probably should happen on the next
patch, or in a separate thread :) But since I'm here, I'll summarize
these thoughts: basically, the next patch, IMHO, doesn't go far
enough, but is a very good first step that we can build on. We need to
define what belongs to the "external interface" of the various
libraries (category 2 above) and what is considered category 1.
pager.h is pretty obviously category 1. strbuf.h, abspath.h, etc? I'm
not sure. git-std-lib is weird, because it's so low level and there's
not really much "internal" code to this library. So maybe we declare
those as category 2. But I don't know how that will actually work in
practice. I'll try to find time to write up these thoughts on that
patch.

>
> What this patch _wants_ to do is of course sympathizable, and we
> have "make hdr-check" rule to enforce "a header must include the
> headers that declare what it uses", except that it lets the header
> files being tested assume that the things made available by
> including <git-compat-util.h> are always available.
>
> I think a sensible direction to go for libification purposes is to
> also make sure that sources that are compiled into gitstdlib.a, and
> the headers that makes what is in gitstdlib.a available, include the
> <git-compat-util.h> header file.  There may be things declared in
> the <git-compat-util.h> header that are _too_ specific to what ought
> to be linked into the final "git" binary and unwanted by library
> clients that are not "git" binary, and the right way to deal with it
> is to split <git-compat-util.h> into two parts, i.e. what makes
> system services available like its conditional inclusion of
> <stdint.h> vs <inttypes.h>, definition of feature macros, order in
> which the current <git-compat-util.h> includes system headers, etc.,






> excluding those that made you write this patch to avoid assuming
> that the client code would have included <git-compat-util.h> before
> <pager.h>, would be the new <git-compat-core.h>.  And everything
> else will remain in <git-compat-util.h>, which will include the
> <git-compat-core.h>.  The <pager.h> header for library clients would
> include <git-compat-core.h> instead, to still allow them to use the
> same types as "git" binary itself that way.