* [RFC PATCH 0/8] Introduce Git Standard Library
@ 2023-06-27 19:52 Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
                   ` (11 more replies)
  0 siblings, 12 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Introduction / Pre-reading
================

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.
This cover letter will explain the rationale behind having a root
dependency that encompasses many files in the form of a standard library
rather than many root dependencies/libraries of those files. This does
not mean that the Git Standard Library will be the only possible root
dependency in the future, but rather the most significant and widely
used one. I will also explain why each file was chosen to be a part of
Git Standard Library v1. I will not explain entirely why we would like
to libify parts of Git -- see here[1] for that context.

Before looking at this series, it probably makes sense to look at the
other series that this is built on top of since that is the state I will
be referring to in this cover letter:

  - Elijah's final cache.h cleanup series[2]
  - my strbuf cleanup series[3]
  - my git-compat-util cleanup series[4]

Most importantly, in the git-compat-util series, the declarations for
functions implemented in wrapper.c and usage.c have been moved to their
respective header files, wrapper.h and usage.h, from git-compat-util.h.
Also config.[ch] had its general parsing code moved to parse.[ch].

Dependency graph in libified Git
================

If you look in the Git Makefile, all of the objects defined in the Git
library are compiled and archived into a singular file, libgit.a, which
is linked against by common-main.o with other external dependencies and
turned into the Git executable. In other words, the Git executable has
dependencies on libgit.a and a couple of external libraries. While our
efforts to libify Git will not affect this current build flow, they will
provide an alternate method for building Git.

With our current method of building Git, we can imagine the dependency
graph as such:

        Git
         /\
        /  \
       /    \
  libgit.a   ext deps

In libifying parts of Git, we want to shrink the dependency graph to
only the minimal set of dependencies, so libraries should not use
libgit.a. Instead, it would look like:

                Git
                /\
               /  \
              /    \
          libgit.a  ext deps
             /\
            /  \
           /    \
object-store.a  (other lib)
      |        /
      |       /
      |      /
 config.a   / 
      |    /
      |   /
      |  /
git-std-lib.a

Instead of containing all of the objects in Git, libgit.a would contain
objects that are not built by libraries it links against. Consequently,
if someone wanted their own custom build of Git with their own custom
implementation of the object store, they would only have to swap out
object-store.a rather than do a hard fork of Git.

Rationale behind Git Standard Library
================

The rationale behind Git Standard Library is essentially the result of
two observations about the Git codebase: every file includes
git-compat-util.h, which declares functions implemented in a couple of
different files, and wrapper.c + usage.c have difficult-to-separate
circular dependencies with each other and with other files.

Ubiquity of git-compat-util.h and circular dependencies
========

Every file in the Git codebase includes git-compat-util.h. It serves as
"a compatibility aid that isolates the knowledge of platform specific
inclusion order and what feature macros to define before including which
system header" (Junio[5]). Since every file includes git-compat-util.h, and
git-compat-util.h includes wrapper.h and usage.h, it would make sense
for wrapper.c and usage.c to be a part of the root library. They have
difficult-to-separate circular dependencies with each other, so they
can't be independent libraries. wrapper.c has dependencies on parse.c,
abspath.c, and strbuf.c, which in turn have dependencies on usage.c and
wrapper.c -- more circular dependencies.

Tradeoff between swappability and refactoring
========

From the above dependency graph, we can see that git-std-lib.a could be
many smaller libraries rather than a singular library. So why choose a
singular library when multiple libraries can be individually easier to
swap and are more modular? A singular library requires less work, since
circular dependencies within it do not have to be separated out, so the
question becomes a tradeoff between work and reward. While there may be
a point in the future where a file like usage.c would want its own
library so that someone can have a custom die() or error(), the work
required to refactor out the circular dependencies in some files would
be enormous due to their ubiquity, so I believe it is not worth the
tradeoff currently. Additionally, we can choose to do this refactor in
the future and change the API for the library if there is enough of a
reason to do so (remember we are avoiding promising stability of the
interfaces of those libraries).

Reuse of compatibility functions in git-compat-util.h
========

Most functions defined in git-compat-util.h are implemented in compat/
and have dependencies limited to strbuf.h and wrapper.h so they can be
easily included in git-std-lib.a, which as a root dependency means that
higher level libraries do not have to worry about compatibility files in
compat/. The rest of the functions defined in git-compat-util.h are
implemented in top level files and, in this patch set, are hidden behind
an #ifdef if their implementation is not in git-std-lib.a.

Rationale summary
========

The Git Standard Library allows us to get the libification ball rolling
with other libraries in Git (for example, Glen's removal of global state
from config iteration[6] prepares the way for a config library). By not
spending many more months attempting to refactor difficult circular
dependencies, and instead spending that time getting to a state where we
can test out swapping a library such as config or object store, we can
prove the viability of Git libification on a much faster time scale.
Additionally, the code cleanups that have happened so far have been
minor and beneficial for the codebase. It is probable that making large
movements would negatively affect code clarity.

Git Standard Library boundary
================

While I have described above some useful heuristics for identifying
potential candidates for git-std-lib.a, a standard library should not
have a shaky definition for what belongs in it. The following belong in
git-std-lib.a:

 - Low-level files (i.e. files that operate only on primitive types)
   that are used everywhere within the codebase (wrapper.c, usage.c,
   strbuf.c)
   - Dependencies that are low-level and widely used
     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
 - Low-level git/* files with functions defined in git-compat-util.h
   (ctype.c)
 - compat/*

There are other files that might fit this definition, but that does not
mean they should belong in git-std-lib.a. Those files should start as
their own separate libraries, since any file added to git-std-lib.a
loses the flexibility of being easily swappable.

Files inside of Git Standard Library
================

The initial set of files in git-std-lib.a is:
abspath.c
ctype.c
date.c
hex-ll.c
parse.c
strbuf.c
usage.c
utf8.c
wrapper.c
relevant compat/ files

Pitfalls
================

In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
certain function headers. As other parts of Git are libified, if we
have to use more ifdefs for each different library, then the codebase
will become uglier and harder to understand. 
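
To make the concern concrete, here is a minimal standalone sketch of the
two uses of such a guard: hiding a declaration and stubbing out a call.
It is not taken from patch 7. DEMO_STD_LIB and the demo_* functions are
hypothetical names chosen for illustration (the series itself uses
GIT_STD_LIB).

```c
/*
 * Illustrative only: DEMO_STD_LIB stands in for a per-library macro
 * such as GIT_STD_LIB. Leave it undefined for the "full" build, or
 * define it (e.g. -DDEMO_STD_LIB) for the "library" build.
 */

#ifndef DEMO_STD_LIB
/*
 * Hidden in the library build: in a real codebase this would be a
 * declaration whose implementation lives outside the library.
 */
static int demo_repo_feature(void)
{
	return 42;
}
#endif

/* Present in both builds, but its body differs per build. */
static int demo_entry_point(void)
{
#ifdef DEMO_STD_LIB
	return 0; /* stubbed out: the dependency is not linked in */
#else
	return demo_repo_feature();
#endif
}
```

With one guard the code is still readable. The worry is what happens
once every library needs its own macro and the same function body has
to account for several of them at once.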

There is a small number of files under compat/* that have dependencies
not inside of git-std-lib.a. While those functions are not called on
Linux, other OSes might call those problematic functions. I don't see
this as a major problem, merely an observation that libification in
general may also require some minor compatibility work in the future.

Testing
================

Patch 8 introduces a temporary test file which will be replaced with
unit tests once a unit testing framework is decided upon[7]. It simply
shows that none of the functions in git-std-lib.a has a missing
dependency and that the library can stand on its own.

I have not yet tested building Git with git-std-lib.a (basically
removing the objects in git-std-lib.a from LIB_OBJS and linking against
git-std-lib.a instead), but I intend to test this in a future version
of this series. As an RFC, I want to showcase git-std-lib.a as an
experimental dependency that other executables can include in order to
use Git binaries. Internally we have tested building and calling
functions in git-std-lib.a from other programs.

Unit tests should catch any breakages caused by changes to files in
git-std-lib.a (e.g. the introduction of an out-of-scope dependency), and
new functions introduced to git-std-lib.a will require unit tests to be
written for them.

Series structure
================

While my strbuf and git-compat-util series can stand alone, they also
function as preparatory patches for this series. There are more cleanup
patches in this series, but since most of them have marginal benefits
that are probably not worth the churn on their own, I decided not to
split them into a separate series like with strbuf and git-compat-util.
As an RFC, I am looking for comments on whether the rationale behind
git-std-lib makes sense, as well as whether there are better ways to
build and enable git-std-lib in patch 7, specifically regarding Makefile
rules and the usage of ifdefs to stub out certain functions and headers.

The patch series is structured as follows:

Patches 1-6 are cleanup patches to remove the last few extraneous
dependencies from git-std-lib.a. Here's a short summary of the
dependencies that are specifically removed from git-std-lib.a since some
of the commit messages and diffs showcase dependency cleanups for other
files not directly related to git-std-lib.a:
 - Patch 1 removes trace2.h and repository.h dependencies from wrapper.c
 - Patch 2 removes the repository.h dependency from strbuf.c inherited from
   hex.c by separating it into hex-ll.c and hex.c
 - Patch 3 removes the object.h dependency from wrapper.c
 - Patch 4 is a bug fix that sets up the next patch. This importantly
   removes the git_config_bool() call from git_env_bool() so that env
   parsing can go in a separate file
 - Patch 5 removes the config.h dependency from wrapper.c and swaps it
   with a dependency to parse.h, which doesn't have extraneous
   dependencies to files outside of git-std-lib.a
 - Patch 6 removes the pager.h dependency from date.c

Patch 7 introduces Git standard library.

Patch 8 introduces a temporary test file for Git standard library. The
test file directly or indirectly calls all functions in git-std-lib.a to
showcase that the functions don't reference missing objects and that
git-std-lib.a can stand on its own.

[1] https://lore.kernel.org/git/CAJoAoZ=Cig_kLocxKGax31sU7Xe4==BGzC__Bg2_pr7krNq6MA@mail.gmail.com/
[2] https://lore.kernel.org/git/pull.1525.v3.git.1684218848.gitgitgadget@gmail.com/
[3] https://lore.kernel.org/git/20230606194720.2053551-1-calvinwan@google.com/
[4] https://lore.kernel.org/git/20230606170711.912972-1-calvinwan@google.com/
[5] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
[6] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com/
[7] https://lore.kernel.org/git/8afdb215d7e10ca16a2ce8226b4127b3d8a2d971.1686352386.git.steadmon@google.com/

Calvin Wan (8):
  trace2: log fsync stats in trace2 rather than wrapper
  hex-ll: split out functionality from hex
  object: move function to object.c
  config: correct bad boolean env value error message
  parse: create new library for parsing strings and env values
  pager: remove pager_in_use()
  git-std-lib: introduce git standard library
  git-std-lib: add test file to call git-std-lib.a functions

 Documentation/technical/git-std-lib.txt | 182 ++++++++++++++++++
 Makefile                                |  30 ++-
 attr.c                                  |   2 +-
 builtin/log.c                           |   2 +-
 color.c                                 |   4 +-
 column.c                                |   2 +-
 config.c                                | 173 +----------------
 config.h                                |  14 +-
 date.c                                  |   4 +-
 git-compat-util.h                       |   7 +-
 git.c                                   |   2 +-
 hex-ll.c                                |  49 +++++
 hex-ll.h                                |  27 +++
 hex.c                                   |  47 -----
 hex.h                                   |  24 +--
 mailinfo.c                              |   2 +-
 object.c                                |   5 +
 object.h                                |   6 +
 pack-objects.c                          |   2 +-
 pack-revindex.c                         |   2 +-
 pager.c                                 |   5 -
 pager.h                                 |   1 -
 parse-options.c                         |   3 +-
 parse.c                                 | 182 ++++++++++++++++++
 parse.h                                 |  20 ++
 pathspec.c                              |   2 +-
 preload-index.c                         |   2 +-
 progress.c                              |   2 +-
 prompt.c                                |   2 +-
 rebase.c                                |   2 +-
 strbuf.c                                |   2 +-
 symlinks.c                              |   2 +
 t/Makefile                              |   4 +
 t/helper/test-env-helper.c              |   2 +-
 t/stdlib-test.c                         | 239 ++++++++++++++++++++++++
 trace2.c                                |  13 ++
 trace2.h                                |   5 +
 unpack-trees.c                          |   2 +-
 url.c                                   |   2 +-
 urlmatch.c                              |   2 +-
 usage.c                                 |   8 +
 wrapper.c                               |  25 +--
 wrapper.h                               |   9 +-
 write-or-die.c                          |   2 +-
 44 files changed, 813 insertions(+), 311 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h
 create mode 100644 parse.c
 create mode 100644 parse.h
 create mode 100644 t/stdlib-test.c

-- 
2.41.0.162.gfafddb0af9-goog



* [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28  2:05   ` Victoria Dye
  2023-07-11 20:07   ` Jeff Hostetler
  2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

As a library boundary, wrapper.c should not directly log trace2
statistics, but instead provide those statistics upon
request. Therefore, move the trace2 logging code to trace2.[ch]. This
also allows wrapper.c to not be dependent on trace2.h and repository.h.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 trace2.c  | 13 +++++++++++++
 trace2.h  |  5 +++++
 wrapper.c | 17 ++++++-----------
 wrapper.h |  4 ++--
 4 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/trace2.c b/trace2.c
index 0efc4e7b95..f367a1ce31 100644
--- a/trace2.c
+++ b/trace2.c
@@ -915,3 +915,16 @@ const char *trace2_session_id(void)
 {
 	return tr2_sid_get();
 }
+
+static void log_trace_fsync_if(const char *key)
+{
+	intmax_t value = get_trace_git_fsync_stats(key);
+	if (value)
+		trace2_data_intmax("fsync", the_repository, key, value);
+}
+
+void trace_git_fsync_stats(void)
+{
+	log_trace_fsync_if("fsync/writeout-only");
+	log_trace_fsync_if("fsync/hardware-flush");
+}
diff --git a/trace2.h b/trace2.h
index 4ced30c0db..689e9a4027 100644
--- a/trace2.h
+++ b/trace2.h
@@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason);
 
 const char *trace2_session_id(void);
 
+/*
+ * Writes out trace statistics for fsync
+ */
+void trace_git_fsync_stats(void);
+
 #endif /* TRACE2_H */
diff --git a/wrapper.c b/wrapper.c
index 22be9812a7..bd7f0a9752 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -6,9 +6,7 @@
 #include "config.h"
 #include "gettext.h"
 #include "object.h"
-#include "repository.h"
 #include "strbuf.h"
-#include "trace2.h"
 
 static intmax_t count_fsync_writeout_only;
 static intmax_t count_fsync_hardware_flush;
@@ -600,16 +598,13 @@ int git_fsync(int fd, enum fsync_action action)
 	}
 }
 
-static void log_trace_fsync_if(const char *key, intmax_t value)
+intmax_t get_trace_git_fsync_stats(const char *key)
 {
-	if (value)
-		trace2_data_intmax("fsync", the_repository, key, value);
-}
-
-void trace_git_fsync_stats(void)
-{
-	log_trace_fsync_if("fsync/writeout-only", count_fsync_writeout_only);
-	log_trace_fsync_if("fsync/hardware-flush", count_fsync_hardware_flush);
+	if (!strcmp(key, "fsync/writeout-only"))
+		return count_fsync_writeout_only;
+	if (!strcmp(key, "fsync/hardware-flush"))
+		return count_fsync_hardware_flush;
+	return 0;
 }
 
 static int warn_if_unremovable(const char *op, const char *file, int rc)
diff --git a/wrapper.h b/wrapper.h
index c85b1328d1..db1bc109ed 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -88,9 +88,9 @@ enum fsync_action {
 int git_fsync(int fd, enum fsync_action action);
 
 /*
- * Writes out trace statistics for fsync using the trace2 API.
+ * Returns the fsync statistic named by `key`, or 0 for an unknown key.
  */
-void trace_git_fsync_stats(void);
+intmax_t get_trace_git_fsync_stats(const char *key);
 
 /*
  * Preserves errno, prints a message, but gives no warning for ENOENT.
-- 
2.41.0.162.gfafddb0af9-goog
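
The boundary described in the commit message above can be sketched
outside of Git as follows. This is a simplified standalone illustration,
not the patch itself: the low-level side only counts and answers
queries, and the caller decides how to report. All demo_* names are
hypothetical, and plain fprintf() stands in for the trace2 call.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* ---- low-level side (plays the role of wrapper.c) ---- */

static intmax_t demo_count_writeout_only;
static intmax_t demo_count_hardware_flush;

/* Record one fsync of the given kind; no logging happens here. */
static void demo_record_fsync(int hardware_flush)
{
	if (hardware_flush)
		demo_count_hardware_flush++;
	else
		demo_count_writeout_only++;
}

/* Hand out a counter on request; unknown keys report 0. */
static intmax_t demo_get_fsync_stats(const char *key)
{
	if (!strcmp(key, "fsync/writeout-only"))
		return demo_count_writeout_only;
	if (!strcmp(key, "fsync/hardware-flush"))
		return demo_count_hardware_flush;
	return 0;
}

/* ---- higher-level side (plays the role of trace2.c) ---- */

/* Log a counter only if it is non-zero; returns 1 if a line was written. */
static int demo_log_fsync_if(FILE *out, const char *key)
{
	intmax_t value = demo_get_fsync_stats(key);

	if (!value)
		return 0;
	fprintf(out, "%s=%jd\n", key, value);
	return 1;
}
```

A caller would invoke demo_record_fsync() wherever an fsync happens and
demo_log_fsync_if(stderr, "fsync/hardware-flush") at exit. The low-level
half then needs only the C library, which is the property the patch is
after for wrapper.c.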



* [RFC PATCH 2/8] hex-ll: split out functionality from hex
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28 13:15   ` Phillip Wood
  2023-06-27 19:52 ` [RFC PATCH 3/8] object: move function to object.c Calvin Wan
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Separate out hex functionality that doesn't require a hash algo into
hex-ll.[ch]. Since the hash algo is currently a global that sits in
repository, this separation removes that dependency for files that only
need basic hex manipulation functions.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile   |  1 +
 color.c    |  2 +-
 hex-ll.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 hex-ll.h   | 27 +++++++++++++++++++++++++++
 hex.c      | 47 -----------------------------------------------
 hex.h      | 24 +-----------------------
 mailinfo.c |  2 +-
 strbuf.c   |  2 +-
 url.c      |  2 +-
 urlmatch.c |  2 +-
 10 files changed, 83 insertions(+), 75 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h

diff --git a/Makefile b/Makefile
index 045e2187c4..83b385b0be 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index 83abb11eda..f3c0a4659b 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
 #include "color.h"
 #include "editor.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "pager.h"
 #include "strbuf.h"
 
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-6f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+	for (; len; len--, hex += 2) {
+		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+		if (val & ~0xff)
+			return -1;
+		*binary++ = val;
+	}
+	return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+	return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+	unsigned int val = hexval(s[0]);
+	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits.
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 7bb440e794..03e55841ed 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
 #include "hash.h"
 #include "hex.h"
 
-const signed char hexval_table[256] = {
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
-	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
-	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
-	for (; len; len--, hex += 2) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
-		if (val & ~0xff)
-			return -1;
-		*binary++ = val;
-	}
-	return 0;
-}
-
 static int get_hash_hex_algop(const char *hex, unsigned char *hash,
 			      const struct git_hash_algo *algop)
 {
diff --git a/hex.h b/hex.h
index 7df4b3c460..c07c8b34c2 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
 #define HEX_H
 
 #include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
-	return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char.  Return a
- * negative value on error.  Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
-	unsigned int val = hexval(s[0]);
-	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
 
 /*
  * Try to read a SHA1 in hexadecimal format from the 40 characters
@@ -32,13 +17,6 @@ int get_oid_hex(const char *hex, struct object_id *sha1);
 /* Like get_oid_hex, but for an arbitrary hash algorithm. */
 int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
 
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
 /*
  * Convert a binary hash in "unsigned char []" or an object name in
  * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 2aeb20e5e6..eb34c30be7 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
 #include "git-compat-util.h"
 #include "config.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "utf8.h"
 #include "strbuf.h"
 #include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 8dac52b919..a2a05fe168 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "string-list.h"
 #include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "url.h"
 
diff --git a/urlmatch.c b/urlmatch.c
index eba0bdd77f..f1aa87d1dd 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "urlmatch.h"
 
-- 
2.41.0.162.gfafddb0af9-goog
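
For readers unfamiliar with the helpers that move into hex-ll.[ch], the
following standalone sketch mirrors their behavior without any Git
headers. It is an illustration only: demo_hexval() is a hypothetical
stand-in that computes the digit value instead of using the
hexval_table lookup, and the demo_ prefix marks every name as not being
part of Git.

```c
#include <stddef.h>

/*
 * Stand-in for hexval(): returns 0..15 for a hex digit, or a value
 * with all bits set for anything else (like the table's -1 entries
 * once converted to unsigned int).
 */
static unsigned int demo_hexval(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return (unsigned int)-1;
}

/*
 * Convert two hex digits into one value. The second character is only
 * read when the first is a valid digit, so an empty string is safe.
 * An invalid digit leaves all bits set, which reads as a negative int
 * on the usual two's complement platforms.
 */
static int demo_hex2chr(const char *s)
{
	unsigned int val = demo_hexval(s[0]);

	return (val & ~0xfu) ? (int)val
			     : (int)((val << 4) | demo_hexval(s[1]));
}

/* Decode `len` pairs of hex digits into `len` bytes; -1 on bad input. */
static int demo_hex_to_bytes(unsigned char *binary, const char *hex,
			     size_t len)
{
	for (; len; len--, hex += 2) {
		unsigned int val = (demo_hexval(hex[0]) << 4) |
				   demo_hexval(hex[1]);

		if (val & ~0xffu)
			return -1;
		*binary++ = (unsigned char)val;
	}
	return 0;
}
```

Note that `len` counts output bytes, not input characters, so decoding
an 8-character string takes len == 4. None of this needs a hash
algorithm or a repository, which is why it can be split out from hex.c.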



* [RFC PATCH 3/8] object: move function to object.c
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 4/8] config: correct bad boolean env value error message Calvin Wan
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

While remove_or_warn() is just a ternary expression that calls one of
two other wrapper functions, it creates an unnecessary dependency on
object.h in wrapper.c. Therefore move the function to object.[ch], where
the concept of GITLINKs is first defined.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 object.c  | 5 +++++
 object.h  | 6 ++++++
 wrapper.c | 6 ------
 wrapper.h | 5 -----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/object.c b/object.c
index 60f954194f..cb29fcc304 100644
--- a/object.c
+++ b/object.c
@@ -617,3 +617,8 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
 	FREE_AND_NULL(o->object_state);
 	FREE_AND_NULL(o->shallow_stat);
 }
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/object.h b/object.h
index 5871615fee..e908ef6515 100644
--- a/object.h
+++ b/object.h
@@ -284,4 +284,10 @@ void clear_object_flags(unsigned flags);
  */
 void repo_clear_commit_marks(struct repository *r, unsigned int flags);
 
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
 #endif /* OBJECT_H */
diff --git a/wrapper.c b/wrapper.c
index bd7f0a9752..62c04aeb17 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "config.h"
 #include "gettext.h"
-#include "object.h"
 #include "strbuf.h"
 
 static intmax_t count_fsync_writeout_only;
@@ -642,11 +641,6 @@ int rmdir_or_warn(const char *file)
 	return warn_if_unremovable("rmdir", file, rmdir(file));
 }
 
-int remove_or_warn(unsigned int mode, const char *file)
-{
-	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
 static int access_error_is_ok(int err, unsigned flag)
 {
 	return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index db1bc109ed..166740ae60 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -111,11 +111,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
  * not exist.
  */
 int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
 
 /*
  * Call access(2), but warn for any error except "missing file"
-- 
2.41.0.162.gfafddb0af9-goog



* [RFC PATCH 4/8] config: correct bad boolean env value error message
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (2 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 3/8] object: move function to object.c Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
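Note for readers: the new control flow can be sketched outside of Git as
follows. This is only an illustration under stated assumptions:
demo_parse_maybe_bool() and demo_env_bool() are hypothetical stand-ins
for git_parse_maybe_bool() and git_env_bool(), the integer fallback is
simplified (no unit suffixes, no int range check), and the error is
written to a buffer instead of calling die().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h> /* strcasecmp() */

/*
 * Hypothetical stand-in for git_parse_maybe_bool(): returns 1 or 0 for
 * a recognized spelling, -1 if the value cannot be parsed as a boolean.
 */
int demo_parse_maybe_bool(const char *value)
{
	char *end;
	long n;

	assert(value != NULL);
	if (!*value)
		return 0;
	if (!strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on"))
		return 1;
	if (!strcasecmp(value, "false") || !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off"))
		return 0;

	/* Simplified integer fallback: any nonzero integer is true. */
	errno = 0;
	n = strtol(value, &end, 0);
	if (errno == ERANGE || end == value || *end)
		return -1;
	return n != 0;
}

/*
 * Hypothetical stand-in for git_env_bool(). Instead of die(), it writes
 * the environment-specific message into 'err' and returns -1.
 */
int demo_env_bool(const char *name, int def, char *err, size_t errlen)
{
	const char *v = getenv(name);
	int val;

	if (errlen)
		err[0] = '\0';
	if (!v)
		return def;
	val = demo_parse_maybe_bool(v);
	if (val < 0)
		snprintf(err, errlen,
			 "bad boolean environment value '%s' for '%s'",
			 v, name);
	return val;
}
```

The point of the sketch is that the environment path no longer goes
through the config-flavored error message; only the parsing helper is
shared.
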
 config.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 09851a6909..5b71ef1624 100644
--- a/config.c
+++ b/config.c
@@ -2172,7 +2172,14 @@ void git_global_config(char **user_out, char **xdg_out)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
 }
 
 /*
-- 
2.41.0.162.gfafddb0af9-goog



* [RFC PATCH 5/8] parse: create new library for parsing strings and env values
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (3 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 4/8] config: correct bad boolean env value error message Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 22:58   ` Junio C Hamano
  2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

While string and environment value parsing is mainly consumed by
config.c, there are other files that only need parsing functionality and
not config functionality. By separating out string and environment value
parsing from config, those files can instead be dependent on parse,
which has a much smaller dependency chain than config.

Move general string and env parsing functions from config.[ch] to
parse.[ch].

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
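Note for readers: the unit-suffix parsing that moves into parse.c can be
sketched as a standalone program fragment. This is a simplified
illustration, not the code in this patch: demo_unit_factor() and
demo_parse_unsigned() are hypothetical names, and the overflow check
uses a division where the patch uses unsigned_mult_overflows().

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h> /* uintmax_t, strtoumax() */
#include <string.h>   /* strchr() */
#include <strings.h>  /* strcasecmp() */

/* Map an optional k/m/g suffix to its factor; 0 means "unknown suffix". */
uintmax_t demo_unit_factor(const char *end)
{
	if (!*end)
		return 1;
	if (!strcasecmp(end, "k"))
		return 1024;
	if (!strcasecmp(end, "m"))
		return 1024 * 1024;
	if (!strcasecmp(end, "g"))
		return 1024 * 1024 * 1024;
	return 0;
}

/*
 * Parse 'value' as an unsigned integer with an optional unit suffix.
 * Returns 1 and stores the result in '*ret' on success; returns 0 and
 * sets errno to EINVAL or ERANGE on failure.
 */
int demo_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
{
	char *end;
	uintmax_t val, factor;

	assert(ret != NULL);
	/* strtoumax() would silently accept negative input, so reject it. */
	if (!value || !*value || strchr(value, '-')) {
		errno = EINVAL;
		return 0;
	}
	errno = 0;
	val = strtoumax(value, &end, 0);
	if (errno == ERANGE)
		return 0;
	if (end == value) {
		errno = EINVAL;
		return 0;
	}
	factor = demo_unit_factor(end);
	if (!factor) {
		errno = EINVAL;
		return 0;
	}
	/* val * factor > max, checked without overflowing. */
	if (val > max / factor) {
		errno = ERANGE;
		return 0;
	}
	*ret = val * factor;
	return 1;
}
```

A caller that only needs this kind of parsing has no reason to pull in
the config machinery, which is the motivation for the split.
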
 Makefile                   |   1 +
 attr.c                     |   2 +-
 config.c                   | 180 +-----------------------------------
 config.h                   |  14 +--
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 wrapper.c                  |   2 +-
 write-or-die.c             |   2 +-
 18 files changed, 219 insertions(+), 205 deletions(-)
 create mode 100644 parse.c
 create mode 100644 parse.h

diff --git a/Makefile b/Makefile
index 83b385b0be..e9ad9f9ef1 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index e9c81b6e07..cb047b4618 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
  */
 
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "exec-cmd.h"
 #include "attr.h"
diff --git a/config.c b/config.c
index 5b71ef1624..cdd70999aa 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
 #include "date.h"
 #include "branch.h"
 #include "config.h"
+#include "parse.h"
 #include "convert.h"
 #include "environment.h"
 #include "gettext.h"
@@ -1204,129 +1205,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
 	return error_return;
 }
 
-static uintmax_t get_unit_factor(const char *end)
-{
-	if (!*end)
-		return 1;
-	else if (!strcasecmp(end, "k"))
-		return 1024;
-	else if (!strcasecmp(end, "m"))
-		return 1024 * 1024;
-	else if (!strcasecmp(end, "g"))
-		return 1024 * 1024 * 1024;
-	return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		intmax_t val;
-		intmax_t factor;
-
-		if (max < 0)
-			BUG("max must be a positive integer");
-
-		errno = 0;
-		val = strtoimax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if ((val < 0 && -max / factor > val) ||
-		    (val > 0 && max / factor < val)) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		uintmax_t val;
-		uintmax_t factor;
-
-		/* negative values would be accepted by strtoumax */
-		if (strchr(value, '-')) {
-			errno = EINVAL;
-			return 0;
-		}
-		errno = 0;
-		val = strtoumax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if (unsigned_mult_overflows(factor, val) ||
-		    factor * val > max) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
-	uintmax_t tmp;
-	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
 static int reader_config_name(struct config_reader *reader, const char **out);
 static int reader_origin_type(struct config_reader *reader,
 			      enum config_origin_type *type);
@@ -1404,23 +1282,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value)
 	return ret;
 }
 
-static int git_parse_maybe_bool_text(const char *value)
-{
-	if (!value)
-		return 1;
-	if (!*value)
-		return 0;
-	if (!strcasecmp(value, "true")
-	    || !strcasecmp(value, "yes")
-	    || !strcasecmp(value, "on"))
-		return 1;
-	if (!strcasecmp(value, "false")
-	    || !strcasecmp(value, "no")
-	    || !strcasecmp(value, "off"))
-		return 0;
-	return -1;
-}
-
 static const struct fsync_component_name {
 	const char *name;
 	enum fsync_component component_bits;
@@ -1495,16 +1356,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-int git_parse_maybe_bool(const char *value)
-{
-	int v = git_parse_maybe_bool_text(value);
-	if (0 <= v)
-		return v;
-	if (git_parse_int(value, &v))
-		return !!v;
-	return -1;
-}
-
 int git_config_bool_or_int(const char *name, const char *value, int *is_bool)
 {
 	int v = git_parse_maybe_bool_text(value);
@@ -2165,35 +2016,6 @@ void git_global_config(char **user_out, char **xdg_out)
 	*xdg_out = xdg_config;
 }
 
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
-	const char *v = getenv(k);
-	int val;
-	if (!v)
-		return def;
-	val = git_parse_maybe_bool(v);
-	if (val < 0)
-		die(_("bad boolean environment value '%s' for '%s'"),
-		    v, k);
-	return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
-	const char *v = getenv(k);
-	if (v && !git_parse_ulong(v, &val))
-		die(_("failed to parse %s"), k);
-	return val;
-}
-
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 247b572b37..7a7f53e503 100644
--- a/config.h
+++ b/config.h
@@ -3,7 +3,7 @@
 
 #include "hashmap.h"
 #include "string-list.h"
-
+#include "parse.h"
 
 /**
  * The config API gives callers a way to access Git configuration files
@@ -205,16 +205,6 @@ int config_with_options(config_fn_t fn, void *,
  * The following helper functions aid in parsing string values
  */
 
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
 /**
  * Parse the string to an integer, including unit factors. Dies on error;
  * otherwise, returns the parsed result.
@@ -343,8 +333,6 @@ int git_config_rename_section(const char *, const char *);
 int git_config_rename_section_in_file(const char *, const char *, const char *);
 int git_config_copy_section(const char *, const char *);
 int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
 int git_config_system(void);
 int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
 #include "pack.h"
 #include "pack-objects.h"
 #include "packfile.h"
-#include "config.h"
+#include "parse.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 					 const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
 #include "packfile.h"
 #include "strbuf.h"
 #include "trace2.h"
-#include "config.h"
+#include "parse.h"
 #include "midx.h"
 #include "csum-file.h"
 
diff --git a/parse-options.c b/parse-options.c
index f8a155ee13..9f542950a7 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
 #include "git-compat-util.h"
 #include "parse-options.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "commit.h"
 #include "color.h"
 #include "gettext.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+	if (!*end)
+		return 1;
+	else if (!strcasecmp(end, "k"))
+		return 1024;
+	else if (!strcasecmp(end, "m"))
+		return 1024 * 1024;
+	else if (!strcasecmp(end, "g"))
+		return 1024 * 1024 * 1024;
+	return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		intmax_t val;
+		intmax_t factor;
+
+		if (max < 0)
+			BUG("max must be a positive integer");
+
+		errno = 0;
+		val = strtoimax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if ((val < 0 && -max / factor > val) ||
+		    (val > 0 && max / factor < val)) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		uintmax_t val;
+		uintmax_t factor;
+
+		/* negative values would be accepted by strtoumax */
+		if (strchr(value, '-')) {
+			errno = EINVAL;
+			return 0;
+		}
+		errno = 0;
+		val = strtoumax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if (unsigned_mult_overflows(factor, val) ||
+		    factor * val > max) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+	if (!value)
+		return 1;
+	if (!*value)
+		return 0;
+	if (!strcasecmp(value, "true")
+	    || !strcasecmp(value, "yes")
+	    || !strcasecmp(value, "on"))
+		return 1;
+	if (!strcasecmp(value, "false")
+	    || !strcasecmp(value, "no")
+	    || !strcasecmp(value, "off"))
+		return 0;
+	return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+	int v = git_parse_maybe_bool_text(value);
+	if (0 <= v)
+		return v;
+	if (git_parse_int(value, &v))
+		return !!v;
+	return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+	const char *v = getenv(k);
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && !git_parse_ulong(v, &val))
+		die(_("failed to parse %s"), k);
+	return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 4991455281..39337999d4 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
 #include "environment.h"
 #include "fsmonitor.h"
 #include "gettext.h"
-#include "config.h"
+#include "parse.h"
 #include "preload-index.h"
 #include "progress.h"
 #include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
 #include "trace.h"
 #include "trace2.h"
 #include "utf8.h"
-#include "config.h"
+#include "parse.h"
 
 #define TP_IDX_MAX      8
 
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "run-command.h"
 #include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "rebase.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 
 /*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
 #include "test-tool.h"
-#include "config.h"
+#include "parse.h"
 #include "parse-options.h"
 
 static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
 #include "advice.h"
 #include "strvec.h"
 #include "repository.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 62c04aeb17..3e554f50c6 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
  */
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 #include "strbuf.h"
 
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
-- 
2.41.0.162.gfafddb0af9-goog



* [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (4 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-27 23:00   ` Junio C Hamano
  2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

pager_in_use() is simply a wrapper around
git_env_bool("GIT_PAGER_IN_USE", 0). Other places that call
git_env_bool() in this fashion do not have a wrapper function around
it. By removing pager_in_use(), we can also get rid of the pager.h
dependency from a few files.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 builtin/log.c | 2 +-
 color.c       | 2 +-
 column.c      | 2 +-
 date.c        | 4 ++--
 git.c         | 2 +-
 pager.c       | 5 -----
 pager.h       | 1 -
 7 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 03954fb749..d5e979932f 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -82,7 +82,7 @@ struct line_opt_callback_data {
 
 static int session_is_interactive(void)
 {
-	return isatty(1) || pager_in_use();
+	return isatty(1) || git_env_bool("GIT_PAGER_IN_USE", 0);
 }
 
 static int auto_decoration_style(void)
diff --git a/color.c b/color.c
index f3c0a4659b..dd6f26b8db 100644
--- a/color.c
+++ b/color.c
@@ -388,7 +388,7 @@ static int check_auto_color(int fd)
 	int *is_tty_p = fd == 1 ? &color_stdout_is_tty : &color_stderr_is_tty;
 	if (*is_tty_p < 0)
 		*is_tty_p = isatty(fd);
-	if (*is_tty_p || (fd == 1 && pager_in_use() && pager_use_color)) {
+	if (*is_tty_p || (fd == 1 && git_env_bool("GIT_PAGER_IN_USE", 0) && pager_use_color)) {
 		if (!is_terminal_dumb())
 			return 1;
 	}
diff --git a/column.c b/column.c
index ff2f0abf39..e15ca70f36 100644
--- a/column.c
+++ b/column.c
@@ -214,7 +214,7 @@ int finalize_colopts(unsigned int *colopts, int stdout_is_tty)
 		if (stdout_is_tty < 0)
 			stdout_is_tty = isatty(1);
 		*colopts &= ~COL_ENABLE_MASK;
-		if (stdout_is_tty || pager_in_use())
+		if (stdout_is_tty || git_env_bool("GIT_PAGER_IN_USE", 0))
 			*colopts |= COL_ENABLED;
 	}
 	return 0;
diff --git a/date.c b/date.c
index 619ada5b20..95c0f568ba 100644
--- a/date.c
+++ b/date.c
@@ -7,7 +7,7 @@
 #include "git-compat-util.h"
 #include "date.h"
 #include "gettext.h"
-#include "pager.h"
+#include "parse.h"
 #include "strbuf.h"
 
 /*
@@ -1009,7 +1009,7 @@ void parse_date_format(const char *format, struct date_mode *mode)
 
 	/* "auto:foo" is "if tty/pager, then foo, otherwise normal" */
 	if (skip_prefix(format, "auto:", &p)) {
-		if (isatty(1) || pager_in_use())
+		if (isatty(1) || git_env_bool("GIT_PAGER_IN_USE", 0))
 			format = p;
 		else
 			format = "default";
diff --git a/git.c b/git.c
index eb69f4f997..3bfb673a4c 100644
--- a/git.c
+++ b/git.c
@@ -131,7 +131,7 @@ static void commit_pager_choice(void)
 
 void setup_auto_pager(const char *cmd, int def)
 {
-	if (use_pager != -1 || pager_in_use())
+	if (use_pager != -1 || git_env_bool("GIT_PAGER_IN_USE", 0))
 		return;
 	use_pager = check_pager_config(cmd);
 	if (use_pager == -1)
diff --git a/pager.c b/pager.c
index 63055d0873..9b392622d2 100644
--- a/pager.c
+++ b/pager.c
@@ -149,11 +149,6 @@ void setup_pager(void)
 	atexit(wait_for_pager_atexit);
 }
 
-int pager_in_use(void)
-{
-	return git_env_bool("GIT_PAGER_IN_USE", 0);
-}
-
 /*
  * Return cached value (if set) or $COLUMNS environment variable (if
  * set and positive) or ioctl(1, TIOCGWINSZ).ws_col (if positive),
diff --git a/pager.h b/pager.h
index b77433026d..6832c6168d 100644
--- a/pager.h
+++ b/pager.h
@@ -5,7 +5,6 @@ struct child_process;
 
 const char *git_pager(int stdout_is_tty);
 void setup_pager(void);
-int pager_in_use(void);
 int term_columns(void);
 void term_clear_line(void);
 int decimal_width(uintmax_t);
-- 
2.41.0.162.gfafddb0af9-goog



* [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (5 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28 13:27   ` Phillip Wood
  2023-06-27 19:52 ` [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.

Add Documentation/technical/git-std-lib.txt to further explain the
design and rationale.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
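Note for readers: the #ifdef technique this patch uses in
git-compat-util.h can be sketched in isolation as below. All names are
hypothetical: DEMO_STD_LIB stands in for GIT_STD_LIB, and
demo_common_exit() only loosely mirrors common_exit(). It is a sketch
of the pattern, not of the actual header.

```c
#include <assert.h>
#include <stdio.h>

#ifndef DEMO_STD_LIB
/*
 * Full build: exit codes are routed through a hook that reports where
 * the exit was requested.
 */
int demo_common_exit(const char *file, int line, int code)
{
	fprintf(stderr, "exit(%d) requested at %s:%d\n", code, file, line);
	return code;
}
#define DEMO_EXIT_CODE(code) demo_common_exit(__FILE__, __LINE__, (code))
#else
/*
 * Library build: the hook lives outside the library, so its declaration
 * is hidden and the macro collapses to the plain value.
 */
#define DEMO_EXIT_CODE(code) (code)
#endif

/* Compiles in both builds; only the full build calls the hook. */
int demo_finish(int code)
{
	assert(code >= 0);
	return DEMO_EXIT_CODE(code);
}
```

Building the same file with -DDEMO_STD_LIB drops the hook entirely.
Every further library that needs its own macro adds another branch like
this, which is the readability cost described under "Pitfalls" in the
document added below.
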
 Documentation/technical/git-std-lib.txt | 182 ++++++++++++++++++++++++
 Makefile                                |  28 +++-
 git-compat-util.h                       |   7 +-
 symlinks.c                              |   2 +
 usage.c                                 |   8 ++
 5 files changed, 225 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt

diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..3dce36c9f9
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,182 @@
+Git Standard Library
+================
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one.
+
+Dependency graph in libified Git
+================
+
+If you look in the Git Makefile, all of the objects defined in the Git
+library are compiled and archived into a singular file, libgit.a, which
+is linked against by common-main.o with other external dependencies and
+turned into the Git executable. In other words, the Git executable has
+dependencies on libgit.a and a couple of external libraries. The
+libification of Git will not affect this current build flow, but instead
+will provide an alternate method for building Git.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+        Git
+         /\
+        /  \
+       /    \
+  libgit.a   ext deps
+
+In libifying parts of Git, we want to shrink the dependency graph to
+only the minimal set of dependencies, so libraries should not use
+libgit.a. Instead, it would look like:
+
+                Git
+                /\
+               /  \
+              /    \
+          libgit.a  ext deps
+             /\
+            /  \
+           /    \
+object-store.a  (other lib)
+      |        /
+      |       /
+      |      /
+ config.a   /
+      |    /
+      |   /
+      |  /
+git-std-lib.a
+
+Instead of containing all of the objects in Git, libgit.a would contain
+objects that are not built by libraries it links against. Consequently,
+if someone wanted their own custom build of Git with their own custom
+implementation of the object store, they would only have to swap out
+object-store.a rather than do a hard fork of Git.
+
+Rationale behind Git Standard Library
+================
+
+The rationale behind the Git Standard Library is the result of two
+observations within the Git codebase: every file includes
+git-compat-util.h, which declares functions implemented in a couple of
+different files, and wrapper.c + usage.c have difficult-to-separate
+circular dependencies with each other and other files.
+
+Ubiquity of git-compat-util.h and circular dependencies
+========
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h, and
+git-compat-util.h includes wrapper.h and usage.h, it would make sense
+for wrapper.c and usage.c to be a part of the root library. They have
+difficult-to-separate circular dependencies with each other, so they
+can't be independent libraries. wrapper.c has dependencies on parse.c,
+abspath.c, and strbuf.c, which in turn also have dependencies on
+usage.c and wrapper.c -- more circular dependencies.
+
+Tradeoff between swappability and refactoring
+========
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a singular library. So why choose a
+singular library when multiple libraries can be individually easier to
+swap and are more modular? A singular library requires less work to
+separate out circular dependencies within itself so it becomes a
+tradeoff question between work and reward. While there may be a point in
+the future where a file like usage.c would want its own library so that
+someone can have custom die() or error(), the work required to refactor
+out the circular dependencies in some files would be enormous due to
+their ubiquity, so I believe it is not currently worth the tradeoff.
+Additionally, we can choose to do this refactor in the future and
+change the API for the library if there is enough of a reason to do so
+(remember that we are avoiding promising stability of the interfaces
+of those libraries).
+
+Reuse of compatibility functions in git-compat-util.h
+========
+
+Most functions declared in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h so they can be
+easily included in git-std-lib.a, which as a root dependency means that
+higher level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions declared in git-compat-util.h are
+implemented in top level files and, in this patch set, are hidden behind
+an #ifdef if their implementation is not in git-std-lib.a.
+
+Rationale summary
+========
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many more months
+attempting to refactor difficult circular dependencies and instead
+spending that time getting to a state where we can test out swapping a
+library out such as config or object store, we can prove the viability
+of Git libification on a much faster time scale. Additionally, the code
+cleanups that have happened so far have been minor and beneficial for
+the codebase. It is probable that making large movements would
+negatively affect code clarity.
+
+Git Standard Library boundary
+================
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it.
+
+ - Low-level files (i.e. operating only on primitive types) that are
+   used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+   - Dependencies that are low-level and widely used
+     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - Low-level git/* files with functions declared in git-compat-util.h
+   (ctype.c)
+ - compat/*
+
+There are other files that might fit this definition, but that does not
+mean they should belong in git-std-lib.a. Those files should start as
+their own separate library since any file added to git-std-lib.a loses
+its flexibility of being easily swappable.
+
+Files inside of Git Standard Library
+================
+
+The initial set of files in git-std-lib.a is:
+abspath.c
+ctype.c
+date.c
+hex-ll.c
+parse.c
+strbuf.c
+usage.c
+utf8.c
+wrapper.c
+relevant compat/ files
+
+Pitfalls
+================
+
+In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
+certain function headers. As other parts of Git are libified, if we
+have to use more ifdefs for each different library, then the codebase
+will become uglier and harder to understand.
+
+There are a small number of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, just more of an observation that libification
+in general may also require some minor compatibility work in the future.
+
+Testing
+================
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (i.e. introduction of an out-of-scope dependency) and new
+functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
diff --git a/Makefile b/Makefile
index e9ad9f9ef1..255bd10b82 100644
--- a/Makefile
+++ b/Makefile
@@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
 	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
 endif
 
+ifdef GIT_STD_LIB
+	BASIC_CFLAGS += -DGIT_STD_LIB
+	BASIC_CFLAGS += -DNO_GETTEXT
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -3654,7 +3659,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) po/git.pot po/git-core.pot
 	$(RM) git.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
@@ -3834,3 +3839,24 @@ $(FUZZ_PROGRAMS): all
 		$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 
 fuzz-all: $(FUZZ_PROGRAMS)
+
+### Libified Git rules
+
+# git-std-lib
+# `make git-std-lib GIT_STD_LIB=YesPlease`
+STD_LIB = git-std-lib.a
+
+GIT_STD_LIB_OBJS += abspath.o
+GIT_STD_LIB_OBJS += ctype.o
+GIT_STD_LIB_OBJS += date.o
+GIT_STD_LIB_OBJS += hex-ll.o
+GIT_STD_LIB_OBJS += parse.o
+GIT_STD_LIB_OBJS += strbuf.o
+GIT_STD_LIB_OBJS += usage.o
+GIT_STD_LIB_OBJS += utf8.o
+GIT_STD_LIB_OBJS += wrapper.o
+
+$(STD_LIB): $(GIT_STD_LIB_OBJS) $(COMPAT_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+git-std-lib: $(STD_LIB)
diff --git a/git-compat-util.h b/git-compat-util.h
index 481dac22b0..75aa9b263e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
 #define platform_core_config noop_core_config
 #endif
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 int lstat_cache_aware_rmdir(const char *path);
-#if !defined(__MINGW32__) && !defined(_MSC_VER)
 #define rmdir lstat_cache_aware_rmdir
 #endif
 
@@ -787,9 +787,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 #endif
 
 #ifdef NO_PTHREADS
+#ifndef GIT_STD_LIB
 #define atexit git_atexit
 int git_atexit(void (*handler)(void));
 #endif
+#endif
 
 /*
  * Limit size of IO chunks, because huge chunks only cause pain.  OS X
@@ -951,14 +953,17 @@ int git_access(const char *path, int mode);
 # endif
 #endif
 
+#ifndef GIT_STD_LIB
 int cmd_main(int, const char **);
 
 /*
  * Intercept all calls to exit() and route them to trace2 to
  * optionally emit a message before calling the real exit().
  */
+
 int common_exit(const char *file, int line, int code);
 #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
 
 /*
  * You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..bced721a0c 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -337,6 +337,7 @@ void invalidate_lstat_cache(void)
 	reset_lstat_cache(&default_cache);
 }
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 #undef rmdir
 int lstat_cache_aware_rmdir(const char *path)
 {
@@ -348,3 +349,4 @@ int lstat_cache_aware_rmdir(const char *path)
 
 	return ret;
 }
+#endif
diff --git a/usage.c b/usage.c
index 09f0ed509b..58994e0d5c 100644
--- a/usage.c
+++ b/usage.c
@@ -5,7 +5,15 @@
  */
 #include "git-compat-util.h"
 #include "gettext.h"
+
+#ifdef GIT_STD_LIB
+#undef trace2_cmd_name
+#undef trace2_cmd_error_va
+#define trace2_cmd_name(x) 
+#define trace2_cmd_error_va(x, y)
+#else
 #include "trace2.h"
+#endif
 
 static void vreportf(const char *prefix, const char *err, va_list params)
 {
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (6 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
@ 2023-06-27 19:52 ` Calvin Wan
  2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Add test file that directly or indirectly calls all functions defined in
git-std-lib.a object files to showcase that they do not reference
missing objects and that git-std-lib.a can stand on its own.

Certain functions that cause the program to exit or are already called
by other functions are commented out.

TODO: replace with unit tests
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 t/Makefile      |   4 +
 t/stdlib-test.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 243 insertions(+)
 create mode 100644 t/stdlib-test.c

diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..b6d0bc9daa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -150,3 +150,7 @@ perf:
 
 .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
 	check-chainlint clean-chainlint test-chainlint
+
+test-git-std-lib:
+	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a
+	./stdlib-test
diff --git a/t/stdlib-test.c b/t/stdlib-test.c
new file mode 100644
index 0000000000..0e4f6d5807
--- /dev/null
+++ b/t/stdlib-test.c
@@ -0,0 +1,239 @@
+#include "../git-compat-util.h"
+#include "../abspath.h"
+#include "../hex-ll.h"
+#include "../parse.h"
+#include "../strbuf.h"
+#include "../string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ */
+
+void abspath_funcs(void) {
+	struct strbuf sb = STRBUF_INIT;
+
+	fprintf(stderr, "calling abspath functions\n");
+	is_directory("foo");
+	strbuf_realpath(&sb, "foo", 0);
+	strbuf_realpath_forgiving(&sb, "foo", 0);
+	real_pathdup("foo", 0);
+	absolute_path("foo");
+	absolute_pathdup("foo");
+	prefix_filename("foo/", "bar");
+	prefix_filename_except_for_dash("foo/", "bar");
+	is_absolute_path("foo");
+	strbuf_add_absolute_path(&sb, "foo");
+	strbuf_add_real_path(&sb, "foo");
+}
+
+void hex_ll_funcs(void) {
+	unsigned char c;
+
+	fprintf(stderr, "calling hex-ll functions\n");
+
+	hexval('c');
+	hex2chr("A1");
+	hex_to_bytes(&c, "A1", 2);
+}
+
+void parse_funcs(void) {
+	intmax_t foo;
+	ssize_t foo1 = -1;
+	unsigned long foo2;
+	int foo3;
+	int64_t foo4;
+
+	fprintf(stderr, "calling parse functions\n");
+
+	git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+	git_parse_ssize_t("42", &foo1);
+	git_parse_ulong("42", &foo2);
+	git_parse_int("42", &foo3);
+	git_parse_int64("42", &foo4);
+	git_parse_maybe_bool("foo");
+	git_parse_maybe_bool_text("foo");
+	git_env_bool("foo", 1);
+	git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+	return 0;
+}
+
+void strbuf_funcs(void) {
+	struct strbuf *sb = xmalloc(sizeof(*sb));
+	struct strbuf *sb2 = xmalloc(sizeof(*sb2));
+	struct strbuf sb3 = STRBUF_INIT;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *buf = "foo";
+	struct strbuf_expand_dict_entry dict[] = {
+		{ "foo", NULL, },
+		{ "bar", NULL, },
+	};
+	int fd = open("/dev/null", O_RDONLY);
+
+	fprintf(stderr, "calling strbuf functions\n");
+
+	starts_with("foo", "bar");
+	istarts_with("foo", "bar");
+	// skip_to_optional_arg_default(const char *str, const char *prefix,
+	// 			 const char **arg, const char *def)
+	strbuf_init(sb, 0);
+	strbuf_init(sb2, 0);
+	strbuf_release(sb);
+	strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0); // calls strbuf_grow
+	strbuf_swap(sb, sb2);
+	strbuf_setlen(sb, 0);
+	strbuf_trim(sb); // calls strbuf_rtrim, strbuf_ltrim
+	// strbuf_rtrim() called by strbuf_trim()
+	// strbuf_ltrim() called by strbuf_trim()
+	strbuf_trim_trailing_dir_sep(sb);
+	strbuf_trim_trailing_newline(sb);
+	strbuf_reencode(sb, "foo", "bar");
+	strbuf_tolower(sb);
+	strbuf_add_separated_string_list(sb, " ", &list);
+	strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+	strbuf_cmp(sb, sb2);
+	strbuf_addch(sb, 1);
+	strbuf_splice(sb, 0, 1, "foo", 3);
+	strbuf_insert(sb, 0, "foo", 3);
+	// strbuf_vinsertf() called by strbuf_insertf
+	strbuf_insertf(sb, 0, "%s", "foo"); 
+	strbuf_remove(sb, 0, 1);
+	strbuf_add(sb, "foo", 3);
+	strbuf_addbuf(sb, sb2);
+	strbuf_join_argv(sb, 0, NULL, ' ');
+	strbuf_addchars(sb, 1, 1);
+	strbuf_addf(sb, "%s", "foo");
+	strbuf_add_commented_lines(sb, "foo", 3, '#');
+	strbuf_commented_addf(sb, '#', "%s", "foo");
+	// strbuf_vaddf() called by strbuf_addf()
+	strbuf_expand(sb, "%s", strbuf_expand_literal_cb, NULL);
+	strbuf_expand(sb, "%s", strbuf_expand_dict_cb, &dict);
+	// strbuf_expand_literal_cb() called by strbuf_expand()
+	// strbuf_expand_dict_cb() called by strbuf_expand()
+	strbuf_addbuf_percentquote(sb, &sb3);
+	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+	strbuf_fread(sb, 0, stdin);
+	strbuf_read(sb, fd, 0);
+	strbuf_read_once(sb, fd, 0);
+	strbuf_write(sb, stderr);
+	strbuf_readlink(sb, "/dev/null", 0);
+	strbuf_getcwd(sb);
+	strbuf_getwholeline(sb, stderr, '\n');
+	strbuf_appendwholeline(sb, stderr, '\n');
+	strbuf_getline(sb, stderr);
+	strbuf_getline_lf(sb, stderr);
+	strbuf_getline_nul(sb, stderr);
+	strbuf_getwholeline_fd(sb, fd, '\n');
+	strbuf_read_file(sb, "/dev/null", 0);
+	strbuf_add_lines(sb, "foo", "bar", 0);
+	strbuf_addstr_xml_quoted(sb, "foo");
+	strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+	strbuf_humanise_bytes(sb, 42);
+	strbuf_humanise_rate(sb, 42);
+	printf_ln("%s", sb->buf);
+	fprintf_ln(stderr, "%s", sb->buf);
+	xstrdup_tolower("foo");
+	xstrdup_toupper("foo");
+	// xstrvfmt() called by xstrfmt()
+	xstrfmt("%s", "foo");
+	// strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+	// 	     int tz_offset, int suppress_tz_name)
+	// strbuf_stripspace(struct strbuf *sb, char comment_line_char)
+	// strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
+	// strbuf_strip_file_from_path(struct strbuf *sb)
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
+
+void usage_funcs(void) {
+	fprintf(stderr, "calling usage functions\n");
+	// Functions that call exit() are commented out
+
+	// usage()
+	// usagef()
+	// die()
+	// die_errno();
+	error("foo");
+	error_errno("foo");
+	die_message("foo");
+	die_message_errno("foo");
+	warning("foo");
+	warning_errno("foo");
+
+	// set_die_routine();
+	get_die_message_routine();
+	set_error_routine(error_builtin);
+	get_error_routine();
+	set_warn_routine(warn_builtin);
+	get_warn_routine();
+	// set_die_is_recursing_routine();
+}
+
+void wrapper_funcs(void) {
+	void *ptr = xmalloc(1);
+	int fd = open("/dev/null", O_RDONLY);
+	struct strbuf sb = STRBUF_INIT;
+	int mode = 0444;
+	char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+	int tmp;
+	xsnprintf(path, sizeof(path), "out-XXXXXX");
+	xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+
+	fprintf(stderr, "calling wrapper functions\n");
+
+	xstrdup("foo");
+	xmalloc(1);
+	xmallocz(1);
+	xmallocz_gently(1);
+	xmemdupz("foo", 3);
+	xstrndup("foo", 3);
+	xrealloc(ptr, 2);
+	xcalloc(1, 1);
+	xsetenv("foo", "bar", 0);
+	xopen("/dev/null", O_RDONLY);
+	xread(fd, &sb, 1);
+	xwrite(fd, &sb, 1);
+	xpread(fd, &sb, 1, 0);
+	xdup(fd);
+	xfopen("/dev/null", "r");
+	xfdopen(fd, "r");
+	tmp = xmkstemp(path);
+	close(tmp);
+	unlink(path);
+	tmp = xmkstemp_mode(path1, mode);
+	close(tmp);
+	unlink(path1);
+	xgetcwd();
+	fopen_for_writing(path);
+	fopen_or_warn(path, "r");
+	xstrncmpz("foo", "bar", 3);
+	// xsnprintf() called above
+	xgethostname(host, 3);
+	tmp = git_mkstemps_mode(path, 1, mode);
+	close(tmp);
+	unlink(path);
+	tmp = git_mkstemp_mode(path, mode);
+	close(tmp);
+	unlink(path);
+	read_in_full(fd, &sb, 1);
+	write_in_full(fd, &sb, 1);
+	pread_in_full(fd, &sb, 1, 0);	
+}
+
+int main(void) {
+	abspath_funcs();
+	hex_ll_funcs();
+	parse_funcs();
+	strbuf_funcs();
+	usage_funcs();
+	wrapper_funcs();
+	fprintf(stderr, "all git-std-lib functions finished calling\n");
+	return 0;
+}
\ No newline at end of file
-- 
2.41.0.162.gfafddb0af9-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 5/8] parse: create new library for parsing strings and env values
  2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-06-27 22:58   ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-06-27 22:58 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, chooglen, johnathantanmy

Calvin Wan <calvinwan@google.com> writes:

> While string and environment value parsing is mainly consumed by
> config.c, there are other files that only need parsing functionality and
> not config functionality. By separating out string and environment value
> parsing from config, those files can instead be dependent on parse,
> which has a much smaller dependency chain than config.
>
> Move general string and env parsing functions from config.[ch] to
> parse.[ch].

Quite sensible and ...

>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  Makefile                   |   1 +
>  attr.c                     |   2 +-
>  config.c                   | 180 +-----------------------------------

... long overdue to have this.

>  config.h                   |  14 +--
>  pack-objects.c             |   2 +-
>  pack-revindex.c            |   2 +-
>  parse-options.c            |   3 +-
>  parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
>  parse.h                    |  20 ++++
>  pathspec.c                 |   2 +-
>  preload-index.c            |   2 +-
>  progress.c                 |   2 +-
>  prompt.c                   |   2 +-
>  rebase.c                   |   2 +-
>  t/helper/test-env-helper.c |   2 +-
>  unpack-trees.c             |   2 +-
>  wrapper.c                  |   2 +-
>  write-or-die.c             |   2 +-
>  18 files changed, 219 insertions(+), 205 deletions(-)
>  create mode 100644 parse.c
>  create mode 100644 parse.h

It is somewhat surprising and very pleasing to see so many *.c files
had and now can lose dependency on <config.h>.  Very nice.


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
@ 2023-06-27 23:00   ` Junio C Hamano
  2023-06-27 23:18     ` Calvin Wan
  2023-06-28  0:30     ` Glen Choo
  0 siblings, 2 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-06-27 23:00 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, chooglen, johnathantanmy

Calvin Wan <calvinwan@google.com> writes:

> pager_in_use() is simply a wrapper around
> git_env_bool("GIT_PAGER_IN_USE", 0). Other places that call
> git_env_bool() in this fashion also do not have a wrapper function
> around it. By removing pager_in_use(), we can also get rid of the
> pager.h dependency from a few files.
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  builtin/log.c | 2 +-
>  color.c       | 2 +-
>  column.c      | 2 +-
>  date.c        | 4 ++--
>  git.c         | 2 +-
>  pager.c       | 5 -----
>  pager.h       | 1 -
>  7 files changed, 6 insertions(+), 12 deletions(-)

With so many (read: more than 3) callsites, I am not sure if this is
an improvement.  pager_in_use() cannot be misspelt without getting
noticed by compilers, but git_env_bool("GIT_PAGOR_IN_USE", 0) will
go silently unnoticed.  Is there no other way to lose the dependency
you do not like?
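
A minimal sketch of the typo concern, under stated assumptions:
env_bool_sketch() is a hypothetical, much-simplified stand-in for
git_env_bool() (it only recognises "1" and "true"), and the two
callers are invented for illustration. A misspelt call to
pager_in_use_sketch() would fail to compile, while the misspelt
string literal compiles and quietly returns the default.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git_env_bool(); returns def when unset. */
int env_bool_sketch(const char *name, int def)
{
	const char *v = getenv(name);

	if (!v)
		return def;
	return !strcmp(v, "1") || !strcmp(v, "true");
}

/* The variable name is spelt in exactly one place. */
int pager_in_use_sketch(void)
{
	return env_bool_sketch("GIT_PAGER_IN_USE", 0);
}

/*
 * An inlined call site with a typo in the literal: the compiler
 * accepts it, and the lookup silently falls back to the default.
 */
int misspelt_literal_sketch(void)
{
	return env_bool_sketch("GIT_PAGOR_IN_USE", 0);
}
```

With GIT_PAGER_IN_USE set to "1" in the environment, the wrapper
reports the pager as in use and the misspelt call site does not.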


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 23:00   ` Junio C Hamano
@ 2023-06-27 23:18     ` Calvin Wan
  2023-06-28  0:30     ` Glen Choo
  1 sibling, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-27 23:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, nasamuffin, chooglen, Jonathan Tan

> With so many (read: more than 3) callsites, I am not sure if this is
> an improvement.  pager_in_use() cannot be misspelt without getting
> noticed by compilers, but git_env_bool("GIT_PAGOR_IN_USE", 0) will
> go silently unnoticed.  Is there no other way to lose the dependency
> you do not like?

I thought about only changing this call site, but that creates an
inconsistency that shouldn't exist. The other way is to move this
function into a different file, but it is also unclear to me which
file that would be. It would be awkward in parse.c and if it was in
environment.c then we would have many more inherited dependencies from
that. I agree that the value of this patch is dubious in and of
itself, which is why it's coupled together with this series rather
than in a separate standalone cleanup series.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 0/8] Introduce Git Standard Library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (7 preceding siblings ...)
  2023-06-27 19:52 ` [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-06-28  0:14 ` Glen Choo
  2023-06-28 16:30   ` Calvin Wan
  2023-06-30  7:01 ` Linus Arver
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 111+ messages in thread
From: Glen Choo @ 2023-06-28  0:14 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: Calvin Wan, nasamuffin, johnathantanmy

I see that this doesn't apply cleanly to 'master'. Do you have a base
commit that reviewers can easily apply this to?

Calvin Wan <calvinwan@google.com> writes:

> Before looking at this series, it probably makes sense to look at the
> other series that this is built on top of since that is the state I will
> be referring to in this cover letter:
>
>   - Elijah's final cache.h cleanup series[2]
>   - my strbuf cleanup series[3]
>   - my git-compat-util cleanup series[4]

Unfortunately, not all of these series apply cleanly to 'master' either,
so I went digging for the topic branches, which I think are:

- en/header-split-cache-h-part-3
- cw/header-compat-util-shuffle
- cw/strbuf-cleanup

And then I tried merging them, but it looks like they don't merge
cleanly either :/

(Btw Junio, I think cw/header-compat-util-shuffle didn't get called out
in What's Cooking.)

> [2] https://lore.kernel.org/git/pull.1525.v3.git.1684218848.gitgitgadget@gmail.com/
> [3] https://lore.kernel.org/git/20230606194720.2053551-1-calvinwan@google.com/
> [4] https://lore.kernel.org/git/20230606170711.912972-1-calvinwan@google.com/

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-27 23:00   ` Junio C Hamano
  2023-06-27 23:18     ` Calvin Wan
@ 2023-06-28  0:30     ` Glen Choo
  2023-06-28 16:37       ` Glen Choo
  2023-06-28 20:58       ` Junio C Hamano
  1 sibling, 2 replies; 111+ messages in thread
From: Glen Choo @ 2023-06-28  0:30 UTC (permalink / raw)
  To: Junio C Hamano, Calvin Wan; +Cc: git, nasamuffin, johnathantanmy

Junio C Hamano <gitster@pobox.com> writes:

>> pager_in_use() is simply a wrapper around
>> git_env_bool("GIT_PAGER_IN_USE", 0). Other places that call
>> git_env_bool() in this fashion also do not have a wrapper function
>> around it. By removing pager_in_use(), we can also get rid of the
>> pager.h dependency from a few files.
>
> With so many (read: more than 3) callsites, I am not sure if this is
> an improvement.  pager_in_use() cannot be misspelt without getting
> noticed by compilers, but git_env_bool("GIT_PAGOR_IN_USE", 0) will
> go silently unnoticed.  Is there no other way to lose the dependency
> you do not like?

Having the function isn't just nice for typo prevention - it's also a
reasonable boundary around the pager subsystem. We could imagine a
world where we wanted to track the pager status using a static
var instead of an env var (not that we'd even want that :P), and this
inlining makes that harder.

From the cover letter, it seems like we only need this to remove
"#include pager.h" from date.c, and that's only used in
parse_date_format(). Could we add a is_pager/pager_in_use to that
function and push the pager.h dependency upwards?

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
@ 2023-06-28  2:05   ` Victoria Dye
  2023-07-05 17:57     ` Calvin Wan
  2023-07-11 20:07   ` Jeff Hostetler
  1 sibling, 1 reply; 111+ messages in thread
From: Victoria Dye @ 2023-06-28  2:05 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy

Calvin Wan wrote:
> As a library boundary, wrapper.c should not directly log trace2
> statistics, but instead provide those statistics upon
> request. Therefore, move the trace2 logging code to trace2.[ch]. This
> also allows wrapper.c to not be dependent on trace2.h and repository.h.
> 

...

> diff --git a/trace2.h b/trace2.h
> index 4ced30c0db..689e9a4027 100644
> --- a/trace2.h
> +++ b/trace2.h
> @@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason);
>  
>  const char *trace2_session_id(void);
>  
> +/*
> + * Writes out trace statistics for fsync
> + */
> +void trace_git_fsync_stats(void);
> +

This function does not belong in 'trace2.h', IMO. The purpose of that file
is to contain the generic API for Trace2 (e.g., 'trace2_printf()',
'trace2_region_(enter|exit)'), whereas this function is effectively a
wrapper around a specific invocation of that API. 

You note in the commit message that "wrapper.c should not directly log
trace2 statistics" with the reasoning of "[it's] a library boundary," but I
suspect the unstated underlying reason is "because it tracks 'count_fsync_*'
in static variables." This case would be better handled, then, by replacing
the usage in 'wrapper.c' with a new Trace2 counter (API introduced in [1]).
That keeps this usage consistent with the API already established for
Trace2, rather than starting an unsustainable trend of creating ad-hoc,
per-metric wrappers in 'trace2.[c|h]'.
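
As a rough sketch of the shape being argued for (one generic counter
entry point instead of one exported function per metric), assuming
nothing about the real Trace2 counter API: every name below is
hypothetical and the storage is a plain static array, which the real
implementation need not resemble.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical counter ids; a new metric is a new enum value. */
enum counter_id_sketch {
	COUNTER_SKETCH_FSYNC_WRITEOUT,
	COUNTER_SKETCH_FSYNC_HARDWARE_FLUSH,
	COUNTER_SKETCH_NR /* must be last */
};

static const char *counter_names_sketch[COUNTER_SKETCH_NR] = {
	"fsync/writeout-only",
	"fsync/hardware-flush",
};
static uint64_t counter_values_sketch[COUNTER_SKETCH_NR];

/* Generic entry point: callers bump a counter, nothing else. */
void counter_add_sketch(enum counter_id_sketch id, uint64_t value)
{
	counter_values_sketch[id] += value;
}

uint64_t counter_get_sketch(enum counter_id_sketch id)
{
	return counter_values_sketch[id];
}

/* One reporting routine covers every counter, present and future. */
void counter_report_sketch(FILE *out)
{
	int i;

	for (i = 0; i < COUNTER_SKETCH_NR; i++)
		fprintf(out, "%s:%llu\n", counter_names_sketch[i],
			(unsigned long long)counter_values_sketch[i]);
}
```

In this arrangement the file doing the fsync only calls
counter_add_sketch(), and no per-metric function has to be added to
the tracing header.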

An added note re: the commit message - it's extremely important that
functions _anywhere in Git_ are able to use the Trace2 API directly. A
developer could reasonably want to measure performance, keep track of an
interesting metric, log when a region is entered in the larger trace,
capture error information, etc. for any function, regardless of where in
falls in the internal library organization. To that end, I think either the
commit message should be rephrased to remove that statement (if the issue is
really "we're using a static variable and we want to avoid that"), or the
libification effort should be updated to accommodate use of Trace2 anywhere
in Git. 

[1] https://lore.kernel.org/git/pull.1373.v4.git.1666618868.gitgitgadget@gmail.com/


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 2/8] hex-ll: split out functionality from hex
  2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
@ 2023-06-28 13:15   ` Phillip Wood
  2023-06-28 16:55     ` Calvin Wan
  0 siblings, 1 reply; 111+ messages in thread
From: Phillip Wood @ 2023-06-28 13:15 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy

Hi Calvin

On 27/06/2023 20:52, Calvin Wan wrote:
> Separate out hex functionality that doesn't require a hash algo into
> hex-ll.[ch]. Since the hash algo is currently a global that sits in
> repository, this separation removes that dependency for files that only
> need basic hex manipulation functions.
>
> diff --git a/hex.h b/hex.h
> index 7df4b3c460..c07c8b34c2 100644
> --- a/hex.h
> +++ b/hex.h
> @@ -2,22 +2,7 @@
>   #define HEX_H
>   
>   #include "hash-ll.h"
> -
> -extern const signed char hexval_table[256];
> -static inline unsigned int hexval(unsigned char c)
> -{
> -	return hexval_table[c];
> -}
> -
> -/*
> - * Convert two consecutive hexadecimal digits into a char.  Return a
> - * negative value on error.  Don't run over the end of short strings.
> - */
> -static inline int hex2chr(const char *s)
> -{
> -	unsigned int val = hexval(s[0]);
> -	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
> -}
> +#include "hex-ll.h"

I don't think any of the remaining declarations in hex.h depend on the 
ones that are moved to "hex-ll.h" so this include should probably be in 
"hex.c" rather than "hex.h"

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
@ 2023-06-28 13:27   ` Phillip Wood
  2023-06-28 21:15     ` Calvin Wan
  0 siblings, 1 reply; 111+ messages in thread
From: Phillip Wood @ 2023-06-28 13:27 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy

Hi Calvin

On 27/06/2023 20:52, Calvin Wan wrote:
> The Git Standard Library intends to serve as the foundational library
> and root dependency that other libraries in Git will be built off of.
> That is to say, suppose we have libraries X and Y; a user that wants to
> use X and Y would need to include X, Y, and this Git Standard Library.

I think having a library of commonly used functions and structures is a 
good idea. While I appreciate that we don't want to include everything 
I'm surprised to see it does not include things like "hashmap.c" and 
"string-list.c" that will be required by the config library as well as 
other code in "libgit.a". I don't think we want "libgitconfig.a" and 
"libgit.a" to both contain a copy of "hashmap.o" and "string-list.o"

> diff --git a/Makefile b/Makefile
> index e9ad9f9ef1..255bd10b82 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
>   	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
>   endif
>   
> +ifdef GIT_STD_LIB
> +	BASIC_CFLAGS += -DGIT_STD_LIB
> +	BASIC_CFLAGS += -DNO_GETTEXT

I can see other projects may want to build git-std-lib without gettext 
support but if we're going to use git-std-lib within git it needs to be 
able to be built with that support. The same goes for the trace 
functions that you are redefining in usage.c

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 481dac22b0..75aa9b263e 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
>   #define platform_core_config noop_core_config
>   #endif
>   
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>   int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>   #define rmdir lstat_cache_aware_rmdir
>   #endif

I'm not sure why the existing condition is being moved here

Thanks for posting this RFC. I've only really given it a quick glance 
but on the whole it seems to make sense.

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 0/8] Introduce Git Standard Library
  2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
@ 2023-06-28 16:30   ` Calvin Wan
  0 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-28 16:30 UTC (permalink / raw)
  To: Glen Choo; +Cc: git, nasamuffin, Jonathan Tan

Ah I failed to mention that this is built on top of 2.41. You can also
get this series with the correctly applied patches from:
https://github.com/calvin-wan-google/git/tree/git-std-lib-rfc

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28  0:30     ` Glen Choo
@ 2023-06-28 16:37       ` Glen Choo
  2023-06-28 16:44         ` Calvin Wan
  2023-06-28 20:58       ` Junio C Hamano
  1 sibling, 1 reply; 111+ messages in thread
From: Glen Choo @ 2023-06-28 16:37 UTC (permalink / raw)
  To: Junio C Hamano, Calvin Wan; +Cc: git, nasamuffin, johnathantanmy

Glen Choo <chooglen@google.com> writes:

>                      Could we add a is_pager/pager_in_use to that
> function and push the pager.h dependency upwards?

Bleh, I meant "Could we add a new is_pager/pager_in_use parameter to
that function?"

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28 16:37       ` Glen Choo
@ 2023-06-28 16:44         ` Calvin Wan
  2023-06-28 17:30           ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-06-28 16:44 UTC (permalink / raw)
  To: Glen Choo; +Cc: Junio C Hamano, git, nasamuffin, johnathantanmy

> Glen Choo <chooglen@google.com> writes:
>
> >                      Could we add a is_pager/pager_in_use to that
> > function and push the pager.h dependency upwards?
>
> Bleh, I meant "Could we add a new is_pager/pager_in_use parameter to
> that function?"

Refactoring the function signature to:

parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)

as you suggested is a much better solution, thanks! I'll make that
change in the next reroll.
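
A minimal sketch of what that parameter buys, with assumptions
labelled: resolve_auto_format_sketch() is a hypothetical helper, not
the real parse_date_format(), and it ignores any other condition the
real function may check when handling "auto:". The point is only
that the date code receives the pager state as an argument, so the
caller, not date.c, is the one that needs pager.h.

```c
#include <string.h>

/*
 * Hypothetical helper: given a format such as "auto:iso", return the
 * format that should actually be used. The caller supplies the pager
 * state, so this function has no dependency on the pager subsystem.
 */
const char *resolve_auto_format_sketch(const char *format, int pager_in_use)
{
	const char *prefix = "auto:";
	size_t len = strlen(prefix);

	if (strncmp(format, prefix, len))
		return format; /* no "auto:" prefix, use as given */
	return pager_in_use ? format + len : "default";
}
```

A caller that already knows about the pager would pass its own
pager_in_use() result as the second argument.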

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 2/8] hex-ll: split out functionality from hex
  2023-06-28 13:15   ` Phillip Wood
@ 2023-06-28 16:55     ` Calvin Wan
  0 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-06-28 16:55 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, nasamuffin, chooglen, Jonathan Tan

> I don't think any of the remaining declarations in hex.h depend on the
> ones that are moved to "hex-ll.h" so this include should probably be in
> "hex.c" rather than "hex.h"

The reason why hex-ll.h is included in hex.h isn't because there might
be other declarations in hex.h that depend on it. It is for files that
include hex.h to also inherit the inclusion of hex-ll.h. If we moved
the inclusion of hex-ll.h to hex.c rather than hex.h, then those files
would have to include both hex.h and hex-ll.h. It clarifies whether a
file needs all of hex or just the low level functionality of hex.
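
For a sense of what "just the low level functionality" covers, here
is a standalone sketch of the two helpers quoted above. It is an
illustration only: the _sketch names are hypothetical, and the digit
lookup is written with comparisons rather than the hexval_table[]
the real code uses. Nothing in it needs a hash algorithm or a
repository.

```c
/* Value of one hex digit, or -1 if c is not a hex digit. */
int hexval_sketch(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Convert two consecutive hex digits into a byte value. Returns a
 * negative value on error; s[1] is only read once s[0] is known to
 * be a digit, so a short string is never overrun.
 */
int hex2chr_sketch(const char *s)
{
	int hi = hexval_sketch((unsigned char)s[0]);
	int lo;

	if (hi < 0)
		return -1;
	lo = hexval_sketch((unsigned char)s[1]);
	if (lo < 0)
		return -1;
	return (hi << 4) | lo;
}
```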

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28 16:44         ` Calvin Wan
@ 2023-06-28 17:30           ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-06-28 17:30 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Glen Choo, git, nasamuffin, johnathantanmy

Calvin Wan <calvinwan@google.com> writes:

>> Glen Choo <chooglen@google.com> writes:
>>
>> >                      Could we add a is_pager/pager_in_use to that
>> > function and push the pager.h dependency upwards?
>>
>> Bleh, I meant "Could we add a new is_pager/pager_in_use parameter to
>> that function?"
>
> Refactoring the function signature to:
>
> parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)
>
> as you suggested is a much better solution, thanks! I'll make that
> change in the next reroll.

Yeah, the date format "auto:" that changes behaviour depending on the
output medium feels like a serious layering violation, but given the
constraints, it looks like the best thing to do.

Thanks.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 6/8] pager: remove pager_in_use()
  2023-06-28  0:30     ` Glen Choo
  2023-06-28 16:37       ` Glen Choo
@ 2023-06-28 20:58       ` Junio C Hamano
  1 sibling, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-06-28 20:58 UTC (permalink / raw)
  To: Glen Choo; +Cc: Calvin Wan, git, nasamuffin, jonathantanmy

Glen Choo <chooglen@google.com> writes:

> Having the function isn't just nice for typo prevention - it's also a
> reasonable boundary around the pager subsystem. We could imagine a
> world where we wanted to track the pager status using a static
> var instead of an env var (not that we'd even want that :P), and this
> inlining makes that harder.
>
> From the cover letter, it seems like we only need this to remove
> "#include pager.h" from date.c, and that's only used in
> parse_date_format(). Could we add a is_pager/pager_in_use to that
> function and push the pager.h dependency upwards?

Thanks---I think that may show a good direction.  parse_date_format()
reacts to "auto:foo" and as long as that feature needs to be there,
pager_in_use() must be available to the function.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-28 13:27   ` Phillip Wood
@ 2023-06-28 21:15     ` Calvin Wan
  2023-06-30 10:00       ` Phillip Wood
  0 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-06-28 21:15 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, nasamuffin, chooglen, Jonathan Tan

> On 27/06/2023 20:52, Calvin Wan wrote:
> > The Git Standard Library intends to serve as the foundational library
> > and root dependency that other libraries in Git will be built off of.
> > That is to say, suppose we have libraries X and Y; a user that wants to
> > use X and Y would need to include X, Y, and this Git Standard Library.
>
> I think having a library of commonly used functions and structures is a
> good idea. While I appreciate that we don't want to include everything
> I'm surprised to see it does not include things like "hashmap.c" and
> "string-list.c" that will be required by the config library as well as
> other code in "libgit.a". I don't think we want "libgitconfig.a" and
> "libgit.a" to both contain a copy of "hashmap.o" and "string-list.o"

I chose not to include hashmap and string-list in git-std-lib.a in the
first pass since they can exist as libraries built on top of
git-std-lib.a. There is no harm in starting off with more libraries
rather than fewer, other than having something like the config library
depend on lib-hashmap.a, lib-string-list.a, and git-std-lib.a rather
than only git-std-lib.a. They can always be added into git-std-lib.a in
the future. That being said, I do find it extremely unlikely that
someone would want to swap out the implementation for hashmap or
string-list, so it is also very reasonable to include them in
git-std-lib.a.

>
> > diff --git a/Makefile b/Makefile
> > index e9ad9f9ef1..255bd10b82 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
> >       COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
> >   endif
> >
> > +ifdef GIT_STD_LIB
> > +     BASIC_CFLAGS += -DGIT_STD_LIB
> > +     BASIC_CFLAGS += -DNO_GETTEXT
>
> I can see other projects may want to build git-std-lib without gettext
> support but if we're going to use git-std-lib within git it needs to be
> able to be built with that support. The same goes for the trace
> functions that you are redefining in usage.h

Taking a closer look at gettext.[ch], I believe I can also include it
into git-std-lib.a with a couple of minor changes. I'm currently
thinking about how the trace functions should interact with
git-std-lib.a since Victoria had similar comments on patch 1. I'll
reply to that thread when I come up with an answer.

>
> > diff --git a/git-compat-util.h b/git-compat-util.h
> > index 481dac22b0..75aa9b263e 100644
> > --- a/git-compat-util.h
> > +++ b/git-compat-util.h
> > @@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
> >   #define platform_core_config noop_core_config
> >   #endif
> >
> > +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
> >   int lstat_cache_aware_rmdir(const char *path);
> > -#if !defined(__MINGW32__) && !defined(_MSC_VER)
> >   #define rmdir lstat_cache_aware_rmdir
> >   #endif
>
> I'm not sure why the existing condition is being moved here

Ah I see that this changes behavior for callers of
lstat_cache_aware_rmdir if those conditions aren't satisfied. I
should've added an extra #if for GIT_STD_LIB instead of adding it to
the end of the current check and moving it. Thanks for spotting this.
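
As a rough illustration of "an extra #if", here is a self-contained
sketch for a POSIX system. SKETCH_STD_LIB stands in for GIT_STD_LIB and
sketch_cache_aware_rmdir() for lstat_cache_aware_rmdir(); both names,
and the call counter, are hypothetical. The point is only that the
function stays visible unconditionally, the platform check keeps its
original shape, and the library guard is nested inside it.

```c
#include <unistd.h>

int sketch_calls;	/* counts how often the wrapper was reached */

/* Always available, regardless of platform or library build. */
int sketch_cache_aware_rmdir(const char *path)
{
	sketch_calls++;
	return rmdir(path);	/* the real rmdir(); no macro is defined yet */
}

#if !defined(__MINGW32__) && !defined(_MSC_VER)
#ifndef SKETCH_STD_LIB
/* Only redirect rmdir() when not building the standalone library. */
#define rmdir sketch_cache_aware_rmdir
#endif
#endif
```

Code after this point that calls rmdir() goes through the wrapper
unless the file is compiled with -DSKETCH_STD_LIB.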

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 0/8] Introduce Git Standard Library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (8 preceding siblings ...)
  2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
@ 2023-06-30  7:01 ` Linus Arver
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
  11 siblings, 0 replies; 111+ messages in thread
From: Linus Arver @ 2023-06-30  7:01 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy

Hello Calvin,

Calvin Wan <calvinwan@google.com> writes:
> With our current method of building Git, we can imagine the dependency
> graph as such:
>
>         Git
>          /\
>         /  \
>        /    \
>   libgit.a   ext deps
>
> In libifying parts of Git, we want to shrink the dependency graph to
> only the minimal set of dependencies, so libraries should not use
> libgit.a. Instead, it would look like:
>
>                 Git
>                 /\
>                /  \
>               /    \
>           libgit.a  ext deps
>              /\
>             /  \
>            /    \
> object-store.a  (other lib)
>       |        /
>       |       /
>       |      /
>  config.a   / 
>       |    /
>       |   /
>       |  /
> git-std-lib.a
>
> Instead of containing all of the objects in Git, libgit.a would contain
> objects that are not built by libraries it links against. Consequently,
> if someone wanted their own custom build of Git with their own custom
> implementation of the object store, they would only have to swap out
> object-store.a rather than do a hard fork of Git.

What about the case where someone wants to build program Foo which just
pulls in some bits of Git? For example, I am thinking of trailer.[ch]
which could be refactored to expose a public API. Then the Foo program
could pull this public trailer manipulation API in as a library
dependency (so that Foo can parse trailers in commit messages without
re-implementing that logic in Foo's own codebase). With the proposed Git
Standard Library (GSL) model above, would my Foo program also have to
pull in GSL? If so, isn't this onerous because of the additional bloat?
The Foo developers just want the banana, not the gorilla holding the
banana in the jungle, so to speak.

> Rationale behind Git Standard Library
> ================
>
> The rationale behind Git Standard Library essentially is the result of
> two observations within the Git codebase: every file includes
> git-compat-util.h which defines functions in a couple of different
> files, and wrapper.c + usage.c have difficult-to-separate circular
> dependencies with each other and other files.
>
> Ubiquity of git-compat-util.h and circular dependencies
> ========
>
> Every file in the Git codebase includes git-compat-util.h. It serves as
> "a compatibility aid that isolates the knowledge of platform specific
> inclusion order and what feature macros to define before including which
> system header" (Junio[5]). Since every file includes git-compat-util.h, and
> git-compat-util.h includes wrapper.h and usage.h, it would make sense
> for wrapper.c and usage.c to be a part of the root library. They have
> difficult to separate circular dependencies with each other so they

s/difficult to separate/difficult-to-separate

> can't be independent libraries. Wrapper.c has dependencies on parse.c,
> abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
> wrapper.c -- more circular dependencies. 
>
> Tradeoff between swappability and refactoring
> ========
>
> From the above dependency graph, we can see that git-std-lib.a could be
> many smaller libraries rather than a singular library. So why choose a
> singular library when multiple libraries can be individually easier to
> swap and are more modular? A singular library requires less work to
> separate out circular dependencies within itself so it becomes a
> tradeoff question between work and reward. While there may be a point in
> the future where a file like usage.c would want its own library so that
> someone can have custom die() or error(), the work required to refactor
> out the circular dependencies in some files would be enormous due to
> their ubiquity so therefore I believe it is not worth the tradeoff
> currently. Additionally, we can in the future choose to do this refactor
> and change the API for the library if there becomes enough of a reason
> to do so (remember we are avoiding promising stability of the interfaces
> of those libraries).

Would going down the currently proposed path make it even more
difficult to do this refactor? If so, I think it's worth mentioning.

> Reuse of compatibility functions in git-compat-util.h
> ========
>
> Most functions defined in git-compat-util.h are implemented in compat/
> and have dependencies limited to strbuf.h and wrapper.h so they can be
> easily included in git-std-lib.a, which as a root dependency means that
> higher level libraries do not have to worry about compatibility files in
> compat/. The rest of the functions defined in git-compat-util.h are
> implemented in top level files and, in this patch set, are hidden behind
> an #ifdef if their implementation is not in git-std-lib.a.
>
> Rationale summary
> ========
>
> The Git Standard Library allows us to get the libification ball rolling
> with other libraries in Git (such as Glen's removal of global state from
> config iteration[6] prepares a config library). By not spending many
> more months attempting to refactor difficult circular dependencies and
> instead spending that time getting to a state where we can test out
> swapping a library out such as config or object store, we can prove the
> viability of Git libification on a much faster time scale. Additionally
> the code cleanups that have happened so far have been minor and
> beneficial for the codebase. It is probable that making large movements
> would negatively affect code clarity.

It sounds like the circular dependencies are so difficult to untangle that they
are the primary motivation behind grouping these tightly-coupled libraries
together under the Git Standard Library (GSL) banner. Still, I think it would
help reviewers if you explain what tradeoffs we are making by accepting the
circular dependencies as they are instead of untangling them. Conversely, if we
assume that there are no circular dependencies, what kind of benefits do we get
when designing the GSL from this (improved) position? Would there be little to
no additional benefits? If so, then I think it would be easier to support the
current approach (as removing the circularities would not give us significant
advantages for libification).

> Git Standard Library boundary
> ================
>
> While I have described above some useful heuristics for identifying
> potential candidates for git-std-lib.a, a standard library should not
> have a shaky definition for what belongs in it.
>
>  - Low-level files (aka operates only on other primitive types) that are
>    used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
>    - Dependencies that are low-level and widely used
>      (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
>  - low-level git/* files with functions defined in git-compat-util.h
>    (ctype.c)
>  - compat/*

I'm confused. Is the list above an example of a shaky definition, or the
opposite? IOW, do you mean that the list above should be the initial set
of content to include in the GSL? Or _not_ to include?

> Series structure
> ================
>
> While my strbuf and git-compat-util series can stand alone, they also
> function as preparatory patches for this series. There are more cleanup
> patches in this series, but since most of them have marginal benefits
> probably not worth the churn on its own, I decided not to split them
> into a separate series like with strbuf and git-compat-util. As an RFC,
> I am looking for comments on whether the rationale behind git-std-lib
> makes sense as well as whether there are better ways to build and enable
> git-std-lib in patch 7, specifically regarding Makefile rules and the
> usage of ifdef's to stub out certain functions and headers. 

If the cleanups are independent I think it would be simpler to put them
in a separate series.

In general, I think the doc would make a stronger case if it expanded
the discussions around alternative approaches to the one proposed, with
the reasons why they were rejected.

Minor nits:
- Documentation/technical/git-std-lib.txt: (style) prefer "we" over "I" ("we
  believe" instead of "I believe").
- There are some "\ No newline at end of file" warnings in this series.

Thanks,
Linus

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 7/8] git-std-lib: introduce git standard library
  2023-06-28 21:15     ` Calvin Wan
@ 2023-06-30 10:00       ` Phillip Wood
  0 siblings, 0 replies; 111+ messages in thread
From: Phillip Wood @ 2023-06-30 10:00 UTC (permalink / raw)
  To: Calvin Wan, phillip.wood; +Cc: git, nasamuffin, chooglen, Jonathan Tan

Hi Calvin

On 28/06/2023 22:15, Calvin Wan wrote:
>> On 27/06/2023 20:52, Calvin Wan wrote:
>>> The Git Standard Library intends to serve as the foundational library
>>> and root dependency that other libraries in Git will be built off of.
>>> That is to say, suppose we have libraries X and Y; a user that wants to
>>> use X and Y would need to include X, Y, and this Git Standard Library.
>>
>> I think having a library of commonly used functions and structures is a
>> good idea. While I appreciate that we don't want to include everything
>> I'm surprised to see it does not include things like "hashmap.c" and
>> "string-list.c" that will be required by the config library as well as
>> other code in "libgit.a". I don't think we want "libgitconfig.a" and
>> "libgit.a" to both contain a copy of "hashmap.o" and "string-list.o"
> 
> I chose not to include hashmap and string-list in git-std-lib.a in the
> first pass since they can exist as libraries built on top of
> git-std-lib.a. There is no harm starting off with more libraries than
> fewer besides having something like the config library be dependent on
> lib-hashmap.a, lib-string-list.a, and git-std-lib.a rather than only
> git-std-lib.a. They can always be added into git-std-lib.a in the
> future. That being said, I do find it extremely unlikely that someone
> would want to swap out the implementation for hashmap or string-list
> so it is also very reasonable to include them into git-std-lib.a

Finding the right boundary for git-std-lib is a bit of a judgement call. 
We certainly could have separate libraries for things like hashmap, 
string-list, strvec, strmap and wildmatch but there is some overhead 
adding each one to the Makefile. I think their use is common enough that 
it would be convenient to have them in git-std-lib but we can always add 
them later.

>>> diff --git a/Makefile b/Makefile
>>> index e9ad9f9ef1..255bd10b82 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
>>>        COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
>>>    endif
>>>
>>> +ifdef GIT_STD_LIB
>>> +     BASIC_CFLAGS += -DGIT_STD_LIB
>>> +     BASIC_CFLAGS += -DNO_GETTEXT
>>
>> I can see other projects may want to build git-std-lib without gettext
>> support but if we're going to use git-std-lib within git it needs to be
>> able to be built with that support. The same goes for the trace
>> functions that you are redefining in usage.h
> 
> Taking a closer look at gettext.[ch], I believe I can also include it
> into git-std-lib.a with a couple of minor changes.

That's great

> I'm currently
> thinking about how the trace functions should interact with
> git-std-lib.a since Victoria had similar comments on patch 1. I'll
> reply to that thread when I come up with an answer.

One thought I had was to have a compile time flag so someone building 
git-std-lib for an external project could build it with

	make git-std-lib NO_TRACE2=YesPlease

and then we'd either compile against a stub version of trace2 that does 
nothing or use some #define magic to get rid of the calls if that is not 
too invasive.
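
To make the "#define magic" variant concrete, here is a small sketch.
SKETCH_NO_TRACE and sketch_trace_printf() are made-up names for
illustration and are not part of Git's trace2 API; the real flag and
call sites would look different.

```c
#include <stdio.h>

#ifdef SKETCH_NO_TRACE
/* Tracing compiled out: the call and its arguments disappear. */
#define sketch_trace_printf(...) ((void)0)
#else
/* Default build: forward to a real sink (stderr in this sketch). */
#define sketch_trace_printf(...) fprintf(stderr, __VA_ARGS__)
#endif

/* Library code is written once and does not care which variant is used. */
int sketch_add(int a, int b)
{
	sketch_trace_printf("adding %d and %d\n", a, b);
	return a + b;
}
```

Building with -DSKETCH_NO_TRACE drops the trace call without touching
sketch_add() itself, which is the property an external project would
want from such a flag.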

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-28  2:05   ` Victoria Dye
@ 2023-07-05 17:57     ` Calvin Wan
  2023-07-05 18:22       ` Victoria Dye
  0 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-07-05 17:57 UTC (permalink / raw)
  To: Victoria Dye; +Cc: git, nasamuffin, chooglen, johnathantanmy

> This function does not belong in 'trace2.h', IMO. The purpose of that file
> is to contain the generic API for Trace2 (e.g., 'trace2_printf()',
> 'trace2_region_(enter|exit)'), whereas this function is effectively a
> wrapper around a specific invocation of that API.
>
> You note in the commit message that "wrapper.c should not directly log
> trace2 statistics" with the reasoning of "[it's] a library boundary," but I
> suspect the unstated underlying reason is "because it tracks 'count_fsync_*'
> in static variables." This case would be better handled, then, by replacing
> the usage in 'wrapper.c' with a new Trace2 counter (API introduced in [1]).
> That keeps this usage consistent with the API already established for
> Trace2, rather than starting an unsustainable trend of creating ad-hoc,
> per-metric wrappers in 'trace2.[c|h]'.

The underlying reason is for removing the trace2 dependency from
wrapper.c so that when git-std-lib is compiled, there isn't a missing
object for trace_git_fsync_stats(), resulting in a compilation error.
However, I do agree that the method I chose to do so by creating an
ad-hoc wrapper is unsustainable and I will come up with a better
method for doing so.

>
> An added note re: the commit message - it's extremely important that
> functions _anywhere in Git_ are able to use the Trace2 API directly. A
> developer could reasonably want to measure performance, keep track of an
> interesting metric, log when a region is entered in the larger trace,
> capture error information, etc. for any function, regardless of where in
> falls in the internal library organization.

I don't quite agree that functions _anywhere in Git_ are able to use
the Trace2 API directly for the same reason that we don't have the
ability to log functions in external libraries -- logging common,
low-level functionality creates an unnecessary amount of log churn and
those logs generally contain practically useless information. However,
that does not mean that all of the functions in git-std-lib fall into
that category (usage has certain functions definitely worth logging).
This means that files like usage.c could instead be separated into their
own library and git-std-lib would only contain files that we deem
"should never be logged".

> To that end, I think either the
> commit message should be rephrased to remove that statement (if the issue is
> really "we're using a static variable and we want to avoid that"), or the
> libification effort should be updated to accommodate use of Trace2 anywhere
> in Git.

Besides potentially redrawing the boundaries of git-std-lib to
accommodate Trace2, we're also looking into the possibility of
stubbing out tracing in git-std-lib so that it and other libraries can
be built and tested, and then when Trace2 is turned into a library,
its full functionality can be linked to.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-07-05 17:57     ` Calvin Wan
@ 2023-07-05 18:22       ` Victoria Dye
  0 siblings, 0 replies; 111+ messages in thread
From: Victoria Dye @ 2023-07-05 18:22 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, chooglen, johnathantanmy

Calvin Wan wrote:
>> An added note re: the commit message - it's extremely important that
>> functions _anywhere in Git_ are able to use the Trace2 API directly. A
>> developer could reasonably want to measure performance, keep track of an
>> interesting metric, log when a region is entered in the larger trace,
>> capture error information, etc. for any function, regardless of where in
>> falls in the internal library organization.
> 
> I don't quite agree that functions _anywhere in Git_ are able to use
> the Trace2 API directly for the same reason that we don't have the
> ability to log functions in external libraries -- logging common,
> low-level functionality creates an unnecessary amount of log churn and
> those logs generally contain practically useless information. 

That may be true in your use cases, but it isn't in mine and may not be for
others'. In fact, I was just using these exact fsync metrics a couple weeks
ago to do some performance analysis; I could easily imagine doing something
similar for another "low level" function. It's unreasonable - and unfair to
future development - to make an absolute declaration about "what's useful
vs. useless" and use that decision to justify severely limiting our future
flexibility on the matter.

> However,
> that does not mean that all of the functions in git-std-lib fall into
> that category (usage has certain functions definitely worth logging).
> This means that files like usage.c could instead be separated into its
> own library and git-std-lib would only contain files that we deem
> "should never be logged".

How do you make that determination? What about if/when someone realizes,
somewhere down the line, that one of those "should never be logged" files
would actually benefit from some aggregate metric, e.g. a Trace2 timer? This
isn't a case of extracting an extraneous dependency (where a function really
doesn't _need_ something it has access to); tracing & logging is a core
functionality in Git, and should not be artificially constrained in the name
of organization. 

>> To that end, I think either the
>> commit message should be rephrased to remove that statement (if the issue is
>> really "we're using a static variable and we want to avoid that"), or the
>> libification effort should be updated to accommodate use of Trace2 anywhere
>> in Git.
> 
> Besides potentially redrawing the boundaries of git-std-lib to
> accommodate Trace2, we're also looking into the possibility of
> stubbing out tracing in git-std-lib so that it and other libraries can
> be built and tested, and then when Trace2 is turned into a library,
> it's full functionality can be linked to.

If that allows you to meet your libification goals without limiting Trace2's
accessibility throughout the codebase, that works for me.


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
  2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
  2023-06-28  2:05   ` Victoria Dye
@ 2023-07-11 20:07   ` Jeff Hostetler
  1 sibling, 0 replies; 111+ messages in thread
From: Jeff Hostetler @ 2023-07-11 20:07 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, johnathantanmy



On 6/27/23 3:52 PM, Calvin Wan wrote:
> As a library boundary, wrapper.c should not directly log trace2
> statistics, but instead provide those statistics upon
> request. Therefore, move the trace2 logging code to trace2.[ch.]. This
> also allows wrapper.c to not be dependent on trace2.h and repository.h.
> 
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>   trace2.c  | 13 +++++++++++++
>   trace2.h  |  5 +++++
>   wrapper.c | 17 ++++++-----------
>   wrapper.h |  4 ++--
>   4 files changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/trace2.c b/trace2.c
> index 0efc4e7b95..f367a1ce31 100644
> --- a/trace2.c
> +++ b/trace2.c
> @@ -915,3 +915,16 @@ const char *trace2_session_id(void)
>   {
>   	return tr2_sid_get();
>   }
> +
> +static void log_trace_fsync_if(const char *key)
> +{
> +	intmax_t value = get_trace_git_fsync_stats(key);
> +	if (value)
> +		trace2_data_intmax("fsync", the_repository, key, value);
> +}
> +
> +void trace_git_fsync_stats(void)
> +{
> +	log_trace_fsync_if("fsync/writeout-only");
> +	log_trace_fsync_if("fsync/hardware-flush");
> +}
> diff --git a/trace2.h b/trace2.h
> index 4ced30c0db..689e9a4027 100644
> --- a/trace2.h
> +++ b/trace2.h
> @@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason);
>   
>   const char *trace2_session_id(void);
>   
> +/*
> + * Writes out trace statistics for fsync
> + */
> +void trace_git_fsync_stats(void);
> +
>   #endif /* TRACE2_H */

Sorry to be late to this party, but none of this belongs
in trace2.[ch].

As Victoria stated, you can/should use the new "timers and counters"
feature in Trace2 to collect and log these stats.

And then you don't need specific log_trace_* functions or wrappers
-- just use the trace2_timer_start()/stop() or trace2_counter_add()
functions as necessary around the various fsync operations.
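
The general shape of that suggestion can be sketched as follows. This
is not the Trace2 counter implementation: the enum, the table, and
every "sketch_" function are hypothetical stand-ins that only show a
low-level wrapper bumping a shared, generic counter instead of keeping
its own static variable and a dedicated reporting function.

```c
#include <stdint.h>

/* Hypothetical counter ids; a real API would define its own set. */
enum sketch_counter_id {
	SKETCH_COUNTER_FSYNC_WRITEOUT_ONLY,
	SKETCH_COUNTER_FSYNC_HARDWARE_FLUSH,
	SKETCH_COUNTER_NR
};

static uint64_t sketch_counters[SKETCH_COUNTER_NR];

/* Generic entry point: any file may add to any counter. */
void sketch_counter_add(enum sketch_counter_id id, uint64_t value)
{
	sketch_counters[id] += value;
}

uint64_t sketch_counter_get(enum sketch_counter_id id)
{
	return sketch_counters[id];
}

/* The wrapper only records the event; reporting lives elsewhere. */
int sketch_fsync_writeout_only(void)
{
	sketch_counter_add(SKETCH_COUNTER_FSYNC_WRITEOUT_ONLY, 1);
	return 0;	/* a real wrapper would perform the flush here */
}
```

With this layout there is no per-metric wrapper to add to the tracing
header; new metrics only need a new id.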


I haven't really followed the lib-ification effort, so I'm just going
to GUESS that all of the trace2_ and tr2_ prefixed functions and data
structures will need to be in the lowest-level .a so that it can be
called from the main .exe and any other .a's between them.

Jeff


^ permalink raw reply	[flat|nested] 111+ messages in thread

* [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (9 preceding siblings ...)
  2023-06-30  7:01 ` Linus Arver
@ 2023-08-10 16:33 ` Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
                     ` (8 more replies)
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
  11 siblings, 9 replies; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:33 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

Original cover letter:
https://lore.kernel.org/git/20230627195251.1973421-1-calvinwan@google.com/

In the initial RFC, I had a patch that removed the trace2 dependency
from usage.c so that git-std-lib.a would not have dependencies outside
of git-std-lib.a files. Consequently this meant that tracing would not
be possible in git-std-lib.a files for other developers of Git, and it
is not a good idea for the libification effort to close the door on
tracing in certain files for future development (thanks Victoria for
pointing this out). That patch has been removed and instead I introduce
stubbed out versions of repository.[ch] and trace2.[ch] that are swapped
in during compilation time (I'm no Makefile expert so any advice on how
I could do this better would be much appreciated). These stubbed out
files contain no implementations and therefore do not have any
additional dependencies, allowing git-std-lib.a to compile with only the
stubs as additional dependencies. This also has the added benefit of
removing `#ifdef GIT_STD_LIB` macros in C files for specific library
compilation rules. Libification shouldn't pollute C files with these
macros. The boundaries for git-std-lib.a have also been updated to
contain these stubbed out files.
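
For readers who have not looked at the stubs yet, the idea can be
sketched roughly as below. This is not the content of stubs/trace2.c;
the "sketch_" names are hypothetical and the three parts would live in
separate files in a real tree. Callers keep their tracing calls and
need no #ifdef, because the stub provides the same symbols with empty
bodies.

```c
/* Part 1: what a shared header would declare. */
void sketch_trace_region_enter(const char *category, const char *label);
void sketch_trace_region_leave(const char *category, const char *label);

/* Part 2: the stub implementation, doing nothing and depending on nothing. */
void sketch_trace_region_enter(const char *category, const char *label)
{
	(void)category;
	(void)label;
}

void sketch_trace_region_leave(const char *category, const char *label)
{
	(void)category;
	(void)label;
}

/* Part 3: library code, identical whether it links the stub or the real thing. */
int sketch_sum(const int *values, int nr)
{
	int i, total = 0;

	sketch_trace_region_enter("sketch", "sum");
	for (i = 0; i < nr; i++)
		total += values[i];
	sketch_trace_region_leave("sketch", "sum");
	return total;
}
```

Swapping in a full implementation then becomes a question of which
object file is linked, not of editing the callers.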

I have also made some additional changes to the Makefile to piggy back
off of our existing build rules for .c/.o targets and their
dependencies. As I learn more about Makefiles, I am continuing to look
for ways to improve these rules. Eventually I would like to have a set
of rules that future libraries can emulate and that scales, in the
sense of not creating additional toil for developers who are not
interested in libification.

Calvin Wan (7):
  hex-ll: split out functionality from hex
  object: move function to object.c
  config: correct bad boolean env value error message
  parse: create new library for parsing strings and env values
  date: push pager.h dependency up
  git-std-lib: introduce git standard library
  git-std-lib: add test file to call git-std-lib.a functions

 Documentation/technical/git-std-lib.txt | 186 ++++++++++++++++++
 Makefile                                |  64 ++++++-
 attr.c                                  |   2 +-
 builtin/blame.c                         |   2 +-
 builtin/log.c                           |   2 +-
 color.c                                 |   2 +-
 config.c                                | 173 +----------------
 config.h                                |  14 +-
 date.c                                  |   5 +-
 date.h                                  |   2 +-
 git-compat-util.h                       |   7 +-
 hex-ll.c                                |  49 +++++
 hex-ll.h                                |  27 +++
 hex.c                                   |  47 -----
 hex.h                                   |  24 +--
 mailinfo.c                              |   2 +-
 object.c                                |   5 +
 object.h                                |   6 +
 pack-objects.c                          |   2 +-
 pack-revindex.c                         |   2 +-
 parse-options.c                         |   3 +-
 parse.c                                 | 182 ++++++++++++++++++
 parse.h                                 |  20 ++
 pathspec.c                              |   2 +-
 preload-index.c                         |   2 +-
 progress.c                              |   2 +-
 prompt.c                                |   2 +-
 rebase.c                                |   2 +-
 ref-filter.c                            |   3 +-
 revision.c                              |   3 +-
 strbuf.c                                |   2 +-
 stubs/repository.c                      |   4 +
 stubs/repository.h                      |   8 +
 stubs/trace2.c                          |  22 +++
 stubs/trace2.h                          |  69 +++++++
 symlinks.c                              |   2 +
 t/Makefile                              |   4 +
 t/helper/test-date.c                    |   3 +-
 t/helper/test-env-helper.c              |   2 +-
 t/stdlib-test.c                         | 239 ++++++++++++++++++++++++
 unpack-trees.c                          |   2 +-
 url.c                                   |   2 +-
 urlmatch.c                              |   2 +-
 wrapper.c                               |   8 +-
 wrapper.h                               |   5 -
 write-or-die.c                          |   2 +-
 46 files changed, 925 insertions(+), 295 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h
 create mode 100644 parse.c
 create mode 100644 parse.h
 create mode 100644 stubs/repository.c
 create mode 100644 stubs/repository.h
 create mode 100644 stubs/trace2.c
 create mode 100644 stubs/trace2.h
 create mode 100644 t/stdlib-test.c

Range-diff against v1:
1:  f7abe7a239 < -:  ---------- trace2: log fsync stats in trace2 rather than wrapper
2:  c302ae0052 = 1:  78634bc406 hex-ll: split out functionality from hex
3:  74e8e35ae2 ! 2:  21ec1d276e object: move function to object.c
    @@ wrapper.c
      #include "config.h"
      #include "gettext.h"
     -#include "object.h"
    + #include "repository.h"
      #include "strbuf.h"
    - 
    - static intmax_t count_fsync_writeout_only;
    + #include "trace2.h"
     @@ wrapper.c: int rmdir_or_warn(const char *file)
      	return warn_if_unremovable("rmdir", file, rmdir(file));
      }
4:  419c702633 = 3:  41dcf8107c config: correct bad boolean env value error message
5:  a325002438 ! 4:  3e800a41c4 parse: create new library for parsing strings and env values
    @@ wrapper.c
     -#include "config.h"
     +#include "parse.h"
      #include "gettext.h"
    + #include "repository.h"
      #include "strbuf.h"
    - 
     
      ## write-or-die.c ##
     @@
6:  475190310a < -:  ---------- pager: remove pager_in_use()
-:  ---------- > 5:  7a4a088bc3 date: push pager.h dependency up
7:  d7f4d4a137 ! 6:  c9002734d0 git-std-lib: introduce git standard library
    @@ Documentation/technical/git-std-lib.txt (new)
     +easily included in git-std-lib.a, which as a root dependency means that
     +higher level libraries do not have to worry about compatibility files in
     +compat/. The rest of the functions defined in git-compat-util.h are
    -+implemented in top level files and, in this patch set, are hidden behind
    ++implemented in top level files and are hidden behind
     +an #ifdef if their implementation is not in git-std-lib.a.
     +
     +Rationale summary
    @@ Documentation/technical/git-std-lib.txt (new)
     + - low-level git/* files with functions defined in git-compat-util.h
     +   (ctype.c)
     + - compat/*
    ++ - stubbed out dependencies in stubs/ (stubs/repository.c, stubs/trace2.c)
     +
     +There are other files that might fit this definition, but that does not
     +mean it should belong in git-std-lib.a. Those files should start as
     +their own separate library since any file added to git-std-lib.a loses
     +its flexibility of being easily swappable.
     +
    ++Wrapper.c and usage.c have dependencies on repository and trace2 that are
    ++possible to remove at the cost of sacrificing the ability for standard Git
    ++to be able to trace functions in those files and other files in git-std-lib.a.
    ++In order for git-std-lib.a to compile with those dependencies, stubbed out
    ++versions of those files are implemented and swapped in during compilation time.
    ++
     +Files inside of Git Standard Library
     +================
     +
    @@ Documentation/technical/git-std-lib.txt (new)
     +usage.c
     +utf8.c
     +wrapper.c
    ++stubs/repository.c
    ++stubs/trace2.c
     +relevant compat/ files
     +
     +Pitfalls
     +================
     +
    -+In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
    -+certain function headers. As other parts of Git are libified, if we
    -+have to use more ifdefs for each different library, then the codebase
    -+will become uglier and harder to understand. 
    -+
     +There are a small amount of files under compat/* that have dependencies
     +not inside of git-std-lib.a. While those functions are not called on
     +Linux, other OSes might call those problematic functions. I don't see
    @@ Documentation/technical/git-std-lib.txt (new)
      \ No newline at end of file
     
      ## Makefile ##
    +@@ Makefile: FUZZ_PROGRAMS =
    + GIT_OBJS =
    + LIB_OBJS =
    + SCALAR_OBJS =
    ++STUB_OBJS =
    + OBJECTS =
    + OTHER_PROGRAMS =
    + PROGRAM_OBJS =
    +@@ Makefile: COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
    + 
    + LIB_H = $(FOUND_H_SOURCES)
    + 
    ++ifndef GIT_STD_LIB
    + LIB_OBJS += abspath.o
    + LIB_OBJS += add-interactive.o
    + LIB_OBJS += add-patch.o
    +@@ Makefile: LIB_OBJS += write-or-die.o
    + LIB_OBJS += ws.o
    + LIB_OBJS += wt-status.o
    + LIB_OBJS += xdiff-interface.o
    ++else ifdef GIT_STD_LIB
    ++LIB_OBJS += abspath.o
    ++LIB_OBJS += ctype.o
    ++LIB_OBJS += date.o
    ++LIB_OBJS += hex-ll.o
    ++LIB_OBJS += parse.o
    ++LIB_OBJS += strbuf.o
    ++LIB_OBJS += usage.o
    ++LIB_OBJS += utf8.o
    ++LIB_OBJS += wrapper.o
    ++
    ++ifdef STUB_REPOSITORY
    ++STUB_OBJS += stubs/repository.o
    ++endif
    ++
    ++ifdef STUB_TRACE2
    ++STUB_OBJS += stubs/trace2.o
    ++endif
    ++
    ++LIB_OBJS += $(STUB_OBJS)
    ++endif
    + 
    + BUILTIN_OBJS += builtin/add.o
    + BUILTIN_OBJS += builtin/am.o
     @@ Makefile: ifdef FSMONITOR_OS_SETTINGS
      	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
      endif
    @@ Makefile: $(FUZZ_PROGRAMS): all
     +### Libified Git rules
     +
     +# git-std-lib
    -+# `make git-std-lib GIT_STD_LIB=YesPlease`
    ++# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
     +STD_LIB = git-std-lib.a
     +
    -+GIT_STD_LIB_OBJS += abspath.o
    -+GIT_STD_LIB_OBJS += ctype.o
    -+GIT_STD_LIB_OBJS += date.o
    -+GIT_STD_LIB_OBJS += hex-ll.o
    -+GIT_STD_LIB_OBJS += parse.o
    -+GIT_STD_LIB_OBJS += strbuf.o
    -+GIT_STD_LIB_OBJS += usage.o
    -+GIT_STD_LIB_OBJS += utf8.o
    -+GIT_STD_LIB_OBJS += wrapper.o
    -+
    -+$(STD_LIB): $(GIT_STD_LIB_OBJS) $(COMPAT_OBJS)
    ++$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
     +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
     +
    -+git-std-lib: $(STD_LIB)
    ++TEMP_HEADERS = temp_headers/
    ++
    ++git-std-lib:
    ++# Move headers to temporary folder and replace them with stubbed headers.
    ++# After building, move headers and stubbed headers back.
    ++ifneq ($(STUB_OBJS),)
    ++	mkdir -p $(TEMP_HEADERS); \
    ++	for d in $(STUB_OBJS); do \
    ++		BASE=$${d%.*}; \
    ++		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
    ++		mv $${BASE}.h $${BASE##*/}.h; \
    ++	done; \
    ++	$(MAKE) $(STD_LIB); \
    ++	for d in $(STUB_OBJS); do \
    ++		BASE=$${d%.*}; \
    ++		mv $${BASE##*/}.h $${BASE}.h; \
    ++		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
    ++	done; \
    ++	rm -rf temp_headers
    ++else
    ++	$(MAKE) $(STD_LIB)
    ++endif
     
      ## git-compat-util.h ##
     @@ git-compat-util.h: static inline int noop_core_config(const char *var UNUSED,
    @@ git-compat-util.h: int git_access(const char *path, int mode);
      /*
       * You can mark a stack variable with UNLEAK(var) to avoid it being
     
    + ## stubs/repository.c (new) ##
    +@@
    ++#include "git-compat-util.h"
    ++#include "repository.h"
    ++
    ++struct repository *the_repository;
    +
    + ## stubs/repository.h (new) ##
    +@@
    ++#ifndef REPOSITORY_H
    ++#define REPOSITORY_H
    ++
    ++struct repository { int stub; };
    ++
    ++extern struct repository *the_repository;
    ++
    ++#endif /* REPOSITORY_H */
    +
    + ## stubs/trace2.c (new) ##
    +@@
    ++#include "git-compat-util.h"
    ++#include "trace2.h"
    ++
    ++void trace2_region_enter_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...) { }
    ++void trace2_region_leave_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...) { }
    ++void trace2_data_string_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   const char *value) { }
    ++void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
    ++void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
    ++			    va_list ap) { }
    ++void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
    ++void trace2_thread_start_fl(const char *file, int line,
    ++			    const char *thread_base_name) { }
    ++void trace2_thread_exit_fl(const char *file, int line) { }
    ++void trace2_data_intmax_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   intmax_t value) { }
    ++int trace2_is_enabled(void) { return 0; }
    ++void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
    +
    + ## stubs/trace2.h (new) ##
    +@@
    ++#ifndef TRACE2_H
    ++#define TRACE2_H
    ++
    ++struct child_process { int stub; };
    ++struct repository;
    ++struct json_writer { int stub; };
    ++
    ++void trace2_region_enter_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...);
    ++
    ++#define trace2_region_enter(category, label, repo) \
    ++	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
    ++
    ++void trace2_region_leave_fl(const char *file, int line, const char *category,
    ++			    const char *label, const struct repository *repo, ...);
    ++
    ++#define trace2_region_leave(category, label, repo) \
    ++	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
    ++
    ++void trace2_data_string_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   const char *value);
    ++
    ++#define trace2_data_string(category, repo, key, value)                       \
    ++	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
    ++			      (value))
    ++
    ++void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
    ++
    ++#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
    ++
    ++void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
    ++			    va_list ap);
    ++
    ++#define trace2_cmd_error_va(fmt, ap) \
    ++	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
    ++
    ++
    ++void trace2_cmd_name_fl(const char *file, int line, const char *name);
    ++
    ++#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
    ++
    ++void trace2_thread_start_fl(const char *file, int line,
    ++			    const char *thread_base_name);
    ++
    ++#define trace2_thread_start(thread_base_name) \
    ++	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
    ++
    ++void trace2_thread_exit_fl(const char *file, int line);
    ++
    ++#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
    ++
    ++void trace2_data_intmax_fl(const char *file, int line, const char *category,
    ++			   const struct repository *repo, const char *key,
    ++			   intmax_t value);
    ++
    ++#define trace2_data_intmax(category, repo, key, value)                       \
    ++	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
    ++			      (value))
    ++
    ++enum trace2_process_info_reason {
    ++	TRACE2_PROCESS_INFO_STARTUP,
    ++	TRACE2_PROCESS_INFO_EXIT,
    ++};
    ++int trace2_is_enabled(void);
    ++void trace2_collect_process_info(enum trace2_process_info_reason reason);
    ++
    ++#endif /* TRACE2_H */
    ++
    +
      ## symlinks.c ##
     @@ symlinks.c: void invalidate_lstat_cache(void)
      	reset_lstat_cache(&default_cache);
    @@ symlinks.c: int lstat_cache_aware_rmdir(const char *path)
      	return ret;
      }
     +#endif
    -
    - ## usage.c ##
    -@@
    -  */
    - #include "git-compat-util.h"
    - #include "gettext.h"
    -+
    -+#ifdef GIT_STD_LIB
    -+#undef trace2_cmd_name
    -+#undef trace2_cmd_error_va
    -+#define trace2_cmd_name(x) 
    -+#define trace2_cmd_error_va(x, y)
    -+#else
    - #include "trace2.h"
    -+#endif
    - 
    - static void vreportf(const char *prefix, const char *err, va_list params)
    - {
8:  cb96e67774 ! 7:  0bead8f980 git-std-lib: add test file to call git-std-lib.a functions
    @@ t/stdlib-test.c (new)
     +	strbuf_splice(sb, 0, 1, "foo", 3);
     +	strbuf_insert(sb, 0, "foo", 3);
     +	// strbuf_vinsertf() called by strbuf_insertf
    -+	strbuf_insertf(sb, 0, "%s", "foo"); 
    ++	strbuf_insertf(sb, 0, "%s", "foo");
     +	strbuf_remove(sb, 0, 1);
     +	strbuf_add(sb, "foo", 3);
     +	strbuf_addbuf(sb, sb2);
    @@ t/stdlib-test.c (new)
     +	unlink(path);
     +	read_in_full(fd, &sb, 1);
     +	write_in_full(fd, &sb, 1);
    -+	pread_in_full(fd, &sb, 1, 0);	
    ++	pread_in_full(fd, &sb, 1, 0);
     +}
     +
     +int main() {
    @@ t/stdlib-test.c (new)
     +	fprintf(stderr, "all git-std-lib functions finished calling\n");
     +	return 0;
     +}
    - \ No newline at end of file
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply	[flat|nested] 111+ messages in thread

* [RFC PATCH v2 1/7] hex-ll: split out functionality from hex
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

Separate out hex functionality that doesn't require a hash algo into
hex-ll.[ch]. Since the hash algo is currently a global that sits in
repository, this separation removes that dependency for files that only
need basic hex manipulation functions.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile   |  1 +
 color.c    |  2 +-
 hex-ll.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 hex-ll.h   | 27 +++++++++++++++++++++++++++
 hex.c      | 47 -----------------------------------------------
 hex.h      | 24 +-----------------------
 mailinfo.c |  2 +-
 strbuf.c   |  2 +-
 url.c      |  2 +-
 urlmatch.c |  2 +-
 10 files changed, 83 insertions(+), 75 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h

diff --git a/Makefile b/Makefile
index 045e2187c4..83b385b0be 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index 83abb11eda..f3c0a4659b 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
 #include "color.h"
 #include "editor.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "pager.h"
 #include "strbuf.h"
 
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+	for (; len; len--, hex += 2) {
+		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+		if (val & ~0xff)
+			return -1;
+		*binary++ = val;
+	}
+	return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+	return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+	unsigned int val = hexval(s[0]);
+	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits).
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 7bb440e794..03e55841ed 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
 #include "hash.h"
 #include "hex.h"
 
-const signed char hexval_table[256] = {
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
-	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
-	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
-	for (; len; len--, hex += 2) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
-		if (val & ~0xff)
-			return -1;
-		*binary++ = val;
-	}
-	return 0;
-}
-
 static int get_hash_hex_algop(const char *hex, unsigned char *hash,
 			      const struct git_hash_algo *algop)
 {
diff --git a/hex.h b/hex.h
index 7df4b3c460..c07c8b34c2 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
 #define HEX_H
 
 #include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
-	return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char.  Return a
- * negative value on error.  Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
-	unsigned int val = hexval(s[0]);
-	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
 
 /*
  * Try to read a SHA1 in hexadecimal format from the 40 characters
@@ -32,13 +17,6 @@ int get_oid_hex(const char *hex, struct object_id *sha1);
 /* Like get_oid_hex, but for an arbitrary hash algorithm. */
 int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
 
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
 /*
  * Convert a binary hash in "unsigned char []" or an object name in
  * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 2aeb20e5e6..eb34c30be7 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
 #include "git-compat-util.h"
 #include "config.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "utf8.h"
 #include "strbuf.h"
 #include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 8dac52b919..a2a05fe168 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "string-list.h"
 #include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "url.h"
 
diff --git a/urlmatch.c b/urlmatch.c
index eba0bdd77f..f1aa87d1dd 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "urlmatch.h"
 
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread
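
For readers who want to see in isolation what the moved hex_to_bytes()
does, here is a minimal standalone sketch. The sketch_ names are
hypothetical and not part of the patch, and the digit lookup is computed
arithmetically rather than through the 256-entry hexval_table, so this
illustrates the technique and is not the code in hex-ll.c. As in the
patch, an invalid digit yields (unsigned)-1, which sets bits above 0xff
and trips the error check.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for hexval(): returns 0..15 for a hex digit,
 * or (unsigned int)-1 for anything else.
 */
static unsigned int sketch_hexval(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return (unsigned int)-1;
}

/*
 * Read `len` pairs of hex digits from `hex` and write `len` bytes to
 * `binary`. Returns 0 on success, -1 if a non-hex character is seen.
 * The caller must ensure `hex` holds at least 2 * len characters.
 */
static int sketch_hex_to_bytes(unsigned char *binary, const char *hex,
			       size_t len)
{
	for (; len; len--, hex += 2) {
		unsigned int val = (sketch_hexval(hex[0]) << 4) |
				   sketch_hexval(hex[1]);

		/* An invalid digit leaves bits set above the low byte. */
		if (val & ~0xffu)
			return -1;
		*binary++ = (unsigned char)val;
	}
	return 0;
}
```

Nothing here touches a hash algorithm or struct repository, which is
the property that lets this functionality live in its own file.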

* [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 20:32     ` Junio C Hamano
  2023-08-10 22:36     ` Glen Choo
  2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
                     ` (6 subsequent siblings)
  8 siblings, 2 replies; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

While remove_or_warn() is a simple ternary that dispatches to one of two
other wrapper functions, it creates an unnecessary dependency on
object.h in wrapper.c. Therefore move the function to object.[ch], where
the concept of GITLINKs is first defined.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 object.c  | 5 +++++
 object.h  | 6 ++++++
 wrapper.c | 6 ------
 wrapper.h | 5 -----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/object.c b/object.c
index 60f954194f..cb29fcc304 100644
--- a/object.c
+++ b/object.c
@@ -617,3 +617,8 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
 	FREE_AND_NULL(o->object_state);
 	FREE_AND_NULL(o->shallow_stat);
 }
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/object.h b/object.h
index 5871615fee..e908ef6515 100644
--- a/object.h
+++ b/object.h
@@ -284,4 +284,10 @@ void clear_object_flags(unsigned flags);
  */
 void repo_clear_commit_marks(struct repository *r, unsigned int flags);
 
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
 #endif /* OBJECT_H */
diff --git a/wrapper.c b/wrapper.c
index 22be9812a7..118d3033de 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "config.h"
 #include "gettext.h"
-#include "object.h"
 #include "repository.h"
 #include "strbuf.h"
 #include "trace2.h"
@@ -647,11 +646,6 @@ int rmdir_or_warn(const char *file)
 	return warn_if_unremovable("rmdir", file, rmdir(file));
 }
 
-int remove_or_warn(unsigned int mode, const char *file)
-{
-	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
 static int access_error_is_ok(int err, unsigned flag)
 {
 	return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index c85b1328d1..272795f863 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -111,11 +111,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
  * not exist.
  */
 int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
 
 /*
  * Call access(2), but warn for any error except "missing file"
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread
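
To illustrate why this one-liner drags object.h into wrapper.c, the
sketch below reproduces only the mode test behind the ternary. All
SKETCH_/sketch_ names are hypothetical. The constants 0160000 and
0170000 are assumptions spelled out locally so the example is
self-contained; they are meant to mirror S_IFGITLINK and the usual
S_IFMT mask. The real function calls rmdir_or_warn() or
unlink_or_warn(), whereas this version only reports which of the two
would be chosen.

```c
/* Assumed values, defined locally for illustration only. */
#define SKETCH_S_IFMT      0170000
#define SKETCH_S_IFGITLINK 0160000
#define SKETCH_S_ISGITLINK(m) \
	(((m) & SKETCH_S_IFMT) == SKETCH_S_IFGITLINK)

/*
 * Name the removal primitive that would be used for an entry with
 * the given mode: gitlinks are directories on disk, so they need
 * rmdir; everything else is unlinked.
 */
static const char *sketch_removal_op(unsigned int mode)
{
	return SKETCH_S_ISGITLINK(mode) ? "rmdir" : "unlink";
}
```

The gitlink test is the only object-specific piece; the two removal
helpers themselves stay in wrapper.[ch].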

* [RFC PATCH v2 3/7] config: correct bad boolean env value error message
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 20:36     ` Junio C Hamano
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
                     ` (5 subsequent siblings)
  8 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is misleading, since an environment value is not a config value.
Instead of calling git_config_bool() to parse the environment value,
mimic the functionality inside of git_config_bool() but with the correct
error message.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 config.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 09851a6909..5b71ef1624 100644
--- a/config.c
+++ b/config.c
@@ -2172,7 +2172,14 @@ void git_global_config(char **user_out, char **xdg_out)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
 }
 
 /*
-- 
2.41.0.640.ga95def55d0-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread
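
The control flow of the new git_env_bool() can be sketched standalone
as follows. The sketch_ names are hypothetical. The parser here accepts
only the textual spellings and omits the integer fallback that
git_parse_maybe_bool() also has, and die() is approximated with
fprintf() plus exit(128), so treat this as an illustration under those
assumptions rather than the code in config.c.

```c
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>

/*
 * Simplified boolean parser: 1 for true/yes/on, 0 for false/no/off
 * or the empty string, -1 for anything else.
 */
static int sketch_parse_maybe_bool(const char *value)
{
	if (!*value)
		return 0;
	if (!strcasecmp(value, "true") ||
	    !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on"))
		return 1;
	if (!strcasecmp(value, "false") ||
	    !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off"))
		return 0;
	return -1;
}

/*
 * Read environment variable `k` as a boolean, returning `def` when
 * it is unset. A malformed value is reported as an environment
 * value (not a config value) before exiting.
 */
static int sketch_env_bool(const char *k, int def)
{
	const char *v = getenv(k);
	int val;

	if (!v)
		return def;
	val = sketch_parse_maybe_bool(v);
	if (val < 0) {
		fprintf(stderr,
			"fatal: bad boolean environment value '%s' for '%s'\n",
			v, k);
		exit(128);
	}
	return val;
}
```

Parsing with the non-dying helper and reporting the failure at the
call site is what allows the message to name the right kind of value.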

* [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (2 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 23:21     ` Glen Choo
  2023-08-14 22:09     ` Jonathan Tan
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
                     ` (4 subsequent siblings)
  8 siblings, 2 replies; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

While string and environment value parsing is mainly consumed by
config.c, other files need only the parsing functionality and not the
config functionality. By separating string and environment value
parsing out of config, those files can instead depend on parse, which
has a much smaller dependency chain than config.

Move general string and env parsing functions from config.[ch] to
parse.[ch].

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile                   |   1 +
 attr.c                     |   2 +-
 config.c                   | 180 +-----------------------------------
 config.h                   |  14 +--
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 wrapper.c                  |   2 +-
 write-or-die.c             |   2 +-
 18 files changed, 219 insertions(+), 205 deletions(-)
 create mode 100644 parse.c
 create mode 100644 parse.h

diff --git a/Makefile b/Makefile
index 83b385b0be..e9ad9f9ef1 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index e9c81b6e07..cb047b4618 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
  */
 
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "exec-cmd.h"
 #include "attr.h"
diff --git a/config.c b/config.c
index 5b71ef1624..cdd70999aa 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
 #include "date.h"
 #include "branch.h"
 #include "config.h"
+#include "parse.h"
 #include "convert.h"
 #include "environment.h"
 #include "gettext.h"
@@ -1204,129 +1205,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
 	return error_return;
 }
 
-static uintmax_t get_unit_factor(const char *end)
-{
-	if (!*end)
-		return 1;
-	else if (!strcasecmp(end, "k"))
-		return 1024;
-	else if (!strcasecmp(end, "m"))
-		return 1024 * 1024;
-	else if (!strcasecmp(end, "g"))
-		return 1024 * 1024 * 1024;
-	return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		intmax_t val;
-		intmax_t factor;
-
-		if (max < 0)
-			BUG("max must be a positive integer");
-
-		errno = 0;
-		val = strtoimax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if ((val < 0 && -max / factor > val) ||
-		    (val > 0 && max / factor < val)) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		uintmax_t val;
-		uintmax_t factor;
-
-		/* negative values would be accepted by strtoumax */
-		if (strchr(value, '-')) {
-			errno = EINVAL;
-			return 0;
-		}
-		errno = 0;
-		val = strtoumax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if (unsigned_mult_overflows(factor, val) ||
-		    factor * val > max) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
-	uintmax_t tmp;
-	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
 static int reader_config_name(struct config_reader *reader, const char **out);
 static int reader_origin_type(struct config_reader *reader,
 			      enum config_origin_type *type);
@@ -1404,23 +1282,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value)
 	return ret;
 }
 
-static int git_parse_maybe_bool_text(const char *value)
-{
-	if (!value)
-		return 1;
-	if (!*value)
-		return 0;
-	if (!strcasecmp(value, "true")
-	    || !strcasecmp(value, "yes")
-	    || !strcasecmp(value, "on"))
-		return 1;
-	if (!strcasecmp(value, "false")
-	    || !strcasecmp(value, "no")
-	    || !strcasecmp(value, "off"))
-		return 0;
-	return -1;
-}
-
 static const struct fsync_component_name {
 	const char *name;
 	enum fsync_component component_bits;
@@ -1495,16 +1356,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-int git_parse_maybe_bool(const char *value)
-{
-	int v = git_parse_maybe_bool_text(value);
-	if (0 <= v)
-		return v;
-	if (git_parse_int(value, &v))
-		return !!v;
-	return -1;
-}
-
 int git_config_bool_or_int(const char *name, const char *value, int *is_bool)
 {
 	int v = git_parse_maybe_bool_text(value);
@@ -2165,35 +2016,6 @@ void git_global_config(char **user_out, char **xdg_out)
 	*xdg_out = xdg_config;
 }
 
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
-	const char *v = getenv(k);
-	int val;
-	if (!v)
-		return def;
-	val = git_parse_maybe_bool(v);
-	if (val < 0)
-		die(_("bad boolean environment value '%s' for '%s'"),
-		    v, k);
-	return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
-	const char *v = getenv(k);
-	if (v && !git_parse_ulong(v, &val))
-		die(_("failed to parse %s"), k);
-	return val;
-}
-
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 247b572b37..7a7f53e503 100644
--- a/config.h
+++ b/config.h
@@ -3,7 +3,7 @@
 
 #include "hashmap.h"
 #include "string-list.h"
-
+#include "parse.h"
 
 /**
  * The config API gives callers a way to access Git configuration files
@@ -205,16 +205,6 @@ int config_with_options(config_fn_t fn, void *,
  * The following helper functions aid in parsing string values
  */
 
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
 /**
  * Parse the string to an integer, including unit factors. Dies on error;
  * otherwise, returns the parsed result.
@@ -343,8 +333,6 @@ int git_config_rename_section(const char *, const char *);
 int git_config_rename_section_in_file(const char *, const char *, const char *);
 int git_config_copy_section(const char *, const char *);
 int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
 int git_config_system(void);
 int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
 #include "pack.h"
 #include "pack-objects.h"
 #include "packfile.h"
-#include "config.h"
+#include "parse.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 					 const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
 #include "packfile.h"
 #include "strbuf.h"
 #include "trace2.h"
-#include "config.h"
+#include "parse.h"
 #include "midx.h"
 #include "csum-file.h"
 
diff --git a/parse-options.c b/parse-options.c
index f8a155ee13..9f542950a7 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
 #include "git-compat-util.h"
 #include "parse-options.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "commit.h"
 #include "color.h"
 #include "gettext.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+	if (!*end)
+		return 1;
+	else if (!strcasecmp(end, "k"))
+		return 1024;
+	else if (!strcasecmp(end, "m"))
+		return 1024 * 1024;
+	else if (!strcasecmp(end, "g"))
+		return 1024 * 1024 * 1024;
+	return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		intmax_t val;
+		intmax_t factor;
+
+		if (max < 0)
+			BUG("max must be a positive integer");
+
+		errno = 0;
+		val = strtoimax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if ((val < 0 && -max / factor > val) ||
+		    (val > 0 && max / factor < val)) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		uintmax_t val;
+		uintmax_t factor;
+
+		/* negative values would be accepted by strtoumax */
+		if (strchr(value, '-')) {
+			errno = EINVAL;
+			return 0;
+		}
+		errno = 0;
+		val = strtoumax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if (unsigned_mult_overflows(factor, val) ||
+		    factor * val > max) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+	if (!value)
+		return 1;
+	if (!*value)
+		return 0;
+	if (!strcasecmp(value, "true")
+	    || !strcasecmp(value, "yes")
+	    || !strcasecmp(value, "on"))
+		return 1;
+	if (!strcasecmp(value, "false")
+	    || !strcasecmp(value, "no")
+	    || !strcasecmp(value, "off"))
+		return 0;
+	return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+	int v = git_parse_maybe_bool_text(value);
+	if (0 <= v)
+		return v;
+	if (git_parse_int(value, &v))
+		return !!v;
+	return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+	const char *v = getenv(k);
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && !git_parse_ulong(v, &val))
+		die(_("failed to parse %s"), k);
+	return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 4991455281..39337999d4 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
 #include "environment.h"
 #include "fsmonitor.h"
 #include "gettext.h"
-#include "config.h"
+#include "parse.h"
 #include "preload-index.h"
 #include "progress.h"
 #include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
 #include "trace.h"
 #include "trace2.h"
 #include "utf8.h"
-#include "config.h"
+#include "parse.h"
 
 #define TP_IDX_MAX      8
 
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "run-command.h"
 #include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "rebase.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 
 /*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
 #include "test-tool.h"
-#include "config.h"
+#include "parse.h"
 #include "parse-options.h"
 
 static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
 #include "advice.h"
 #include "strvec.h"
 #include "repository.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 118d3033de..a6249cc30e 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
  */
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 #include "repository.h"
 #include "strbuf.h"
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
-- 
2.41.0.640.ga95def55d0-goog



* [RFC PATCH v2 5/7] date: push pager.h dependency up
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (3 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-10 23:41     ` Glen Choo
  2023-08-14 22:17     ` Jonathan Tan
  2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
                     ` (3 subsequent siblings)
  8 siblings, 2 replies; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

For date.c to be included in git-std-lib, its dependency on pager.h must
be removed, since pager.h depends on many other files that are not in
git-std-lib. Achieve this by passing a boolean "pager_in_use" argument to
parse_date_format() rather than having it call pager_in_use() itself, so
that the pager.h dependency moves up to the callers.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 builtin/blame.c      | 2 +-
 builtin/log.c        | 2 +-
 date.c               | 5 ++---
 date.h               | 2 +-
 ref-filter.c         | 3 ++-
 revision.c           | 3 ++-
 t/helper/test-date.c | 3 ++-
 7 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 9a3f9facea..665511570d 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -714,7 +714,7 @@ static int git_blame_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "blame.date")) {
 		if (!value)
 			return config_error_nonbool(var);
-		parse_date_format(value, &blame_date_mode);
+		parse_date_format(value, &blame_date_mode, pager_in_use());
 		return 0;
 	}
 	if (!strcmp(var, "blame.ignorerevsfile")) {
diff --git a/builtin/log.c b/builtin/log.c
index 03954fb749..a72ce30c2e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -185,7 +185,7 @@ static void cmd_log_init_defaults(struct rev_info *rev)
 	rev->diffopt.flags.allow_textconv = 1;
 
 	if (default_date_mode)
-		parse_date_format(default_date_mode, &rev->date_mode);
+		parse_date_format(default_date_mode, &rev->date_mode, pager_in_use());
 }
 
 static void set_default_decoration_filter(struct decoration_filter *decoration_filter)
diff --git a/date.c b/date.c
index 619ada5b20..55f73ce2e0 100644
--- a/date.c
+++ b/date.c
@@ -7,7 +7,6 @@
 #include "git-compat-util.h"
 #include "date.h"
 #include "gettext.h"
-#include "pager.h"
 #include "strbuf.h"
 
 /*
@@ -1003,13 +1002,13 @@ static enum date_mode_type parse_date_type(const char *format, const char **end)
 	die("unknown date format %s", format);
 }
 
-void parse_date_format(const char *format, struct date_mode *mode)
+void parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)
 {
 	const char *p;
 
 	/* "auto:foo" is "if tty/pager, then foo, otherwise normal" */
 	if (skip_prefix(format, "auto:", &p)) {
-		if (isatty(1) || pager_in_use())
+		if (isatty(1) || pager_in_use)
 			format = p;
 		else
 			format = "default";
diff --git a/date.h b/date.h
index 6136212a19..d9bd6dc09f 100644
--- a/date.h
+++ b/date.h
@@ -53,7 +53,7 @@ const char *show_date(timestamp_t time, int timezone, const struct date_mode *mo
  * be used with strbuf_addftime(), in which case you'll need to call
  * date_mode_release() later.
  */
-void parse_date_format(const char *format, struct date_mode *mode);
+void parse_date_format(const char *format, struct date_mode *mode, int pager_in_use);
 
 /**
  * Release a "struct date_mode", currently only required if
diff --git a/ref-filter.c b/ref-filter.c
index 2ed0ecf260..1b96bb7822 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -28,6 +28,7 @@
 #include "worktree.h"
 #include "hashmap.h"
 #include "strvec.h"
+#include "pager.h"
 
 static struct ref_msg {
 	const char *gone;
@@ -1323,7 +1324,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	formatp = strchr(atomname, ':');
 	if (formatp) {
 		formatp++;
-		parse_date_format(formatp, &date_mode);
+		parse_date_format(formatp, &date_mode, pager_in_use());
 	}
 
 	if (!eoemail)
diff --git a/revision.c b/revision.c
index 985b8b2f51..c7efd11914 100644
--- a/revision.c
+++ b/revision.c
@@ -46,6 +46,7 @@
 #include "resolve-undo.h"
 #include "parse-options.h"
 #include "wildmatch.h"
+#include "pager.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -2577,7 +2578,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->date_mode.type = DATE_RELATIVE;
 		revs->date_mode_explicit = 1;
 	} else if ((argcount = parse_long_opt("date", argv, &optarg))) {
-		parse_date_format(optarg, &revs->date_mode);
+		parse_date_format(optarg, &revs->date_mode, pager_in_use());
 		revs->date_mode_explicit = 1;
 		return argcount;
 	} else if (!strcmp(arg, "--log-size")) {
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 0683d46574..b3927a95b3 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -1,5 +1,6 @@
 #include "test-tool.h"
 #include "date.h"
+#include "pager.h"
 #include "trace.h"
 
 static const char *usage_msg = "\n"
@@ -37,7 +38,7 @@ static void show_dates(const char **argv, const char *format)
 {
 	struct date_mode mode = DATE_MODE_INIT;
 
-	parse_date_format(format, &mode);
+	parse_date_format(format, &mode, pager_in_use());
 	for (; *argv; argv++) {
 		char *arg;
 		timestamp_t t;
-- 
2.41.0.640.ga95def55d0-goog



* [RFC PATCH v2 6/7] git-std-lib: introduce git standard library
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (4 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-14 22:26     ` Jonathan Tan
  2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
                     ` (2 subsequent siblings)
  8 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.

Add Documentation/technical/git-std-lib.txt to further explain the
design and rationale.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Documentation/technical/git-std-lib.txt | 186 ++++++++++++++++++++++++
 Makefile                                |  62 +++++++-
 git-compat-util.h                       |   7 +-
 stubs/repository.c                      |   4 +
 stubs/repository.h                      |   8 +
 stubs/trace2.c                          |  22 +++
 stubs/trace2.h                          |  69 +++++++++
 symlinks.c                              |   2 +
 8 files changed, 358 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 stubs/repository.c
 create mode 100644 stubs/repository.h
 create mode 100644 stubs/trace2.c
 create mode 100644 stubs/trace2.h

diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..3d901a89b0
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,186 @@
+Git Standard Library
+================
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one.
+
+Dependency graph in libified Git
+================
+
+If you look in the Git Makefile, all of the objects defined in the Git
+library are compiled and archived into a singular file, libgit.a, which
+is linked against by common-main.o with other external dependencies and
+turned into the Git executable. In other words, the Git executable has
+dependencies on libgit.a and a couple of external libraries. The
+libification of Git will not affect this current build flow, but instead
+will provide an alternate method for building Git.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+        Git
+         /\
+        /  \
+       /    \
+  libgit.a   ext deps
+
+In libifying parts of Git, we want to shrink the dependency graph to
+only the minimal set of dependencies, so libraries should not use
+libgit.a. Instead, it would look like:
+
+                Git
+                /\
+               /  \
+              /    \
+          libgit.a  ext deps
+             /\
+            /  \
+           /    \
+object-store.a  (other lib)
+      |        /
+      |       /
+      |      /
+ config.a   /
+      |    /
+      |   /
+      |  /
+git-std-lib.a
+
+Instead of containing all of the objects in Git, libgit.a would contain
+objects that are not built by libraries it links against. Consequently,
+if someone wanted their own custom build of Git with their own custom
+implementation of the object store, they would only have to swap out
+object-store.a rather than do a hard fork of Git.
+
+Rationale behind Git Standard Library
+================
+
+The rationale behind the Git Standard Library is essentially the result of
+two observations about the Git codebase: every file includes
+git-compat-util.h, which declares functions implemented in several
+different files, and wrapper.c + usage.c have difficult-to-separate
+circular dependencies with each other and with other files.
+
+Ubiquity of git-compat-util.h and circular dependencies
+========
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h, and
+git-compat-util.h includes wrapper.h and usage.h, it would make sense
+for wrapper.c and usage.c to be a part of the root library. They have
+difficult-to-separate circular dependencies with each other, so they
+can't be independent libraries. wrapper.c depends on parse.c,
+abspath.c, and strbuf.c, which in turn depend on usage.c and
+wrapper.c -- more circular dependencies.
+
+Tradeoff between swappability and refactoring
+========
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a singular library. So why choose a
+singular library when multiple libraries can be individually easier to
+swap and are more modular? A singular library requires less work to
+separate out circular dependencies within itself, so the question
+becomes a tradeoff between work and reward. There may be a point in
+the future where a file like usage.c would want its own library so that
+someone can have a custom die() or error(). However, the work required
+to refactor out the circular dependencies in some files would be
+enormous due to their ubiquity, so I believe it is not worth the
+tradeoff currently. Additionally, we can choose to do this refactor in
+the future and change the API for the library if there is enough of a
+reason to do so (remember we are avoiding promising stability of the
+interfaces of those libraries).
+
+Reuse of compatibility functions in git-compat-util.h
+========
+
+Most functions declared in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h, so they can be
+easily included in git-std-lib.a. As a root dependency, this means that
+higher-level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions declared in git-compat-util.h are
+implemented in top-level files and are hidden behind an #ifdef if
+their implementation is not in git-std-lib.a.
+
+Rationale summary
+========
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many
+more months attempting to refactor difficult circular dependencies and
+instead spending that time getting to a state where we can test out
+swapping a library out such as config or object store, we can prove the
+viability of Git libification on a much faster time scale. Additionally,
+the code cleanups that have happened so far have been minor and
+beneficial for the codebase. Larger code movements, by contrast, would
+probably hurt code clarity.
+
+Git Standard Library boundary
+================
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it.
+
+ - Low-level files (i.e. files that operate only on primitive types) that
+   are used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+   - Dependencies that are low-level and widely used
+     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - Low-level git/* files with functions declared in git-compat-util.h
+   (ctype.c)
+ - compat/*
+ - stubbed out dependencies in stubs/ (stubs/repository.c, stubs/trace2.c)
+
+There are other files that might fit this definition, but that does not
+mean they should belong in git-std-lib.a. Those files should start as
+their own separate libraries, since any file added to git-std-lib.a loses
+the flexibility of being easily swappable.
+
+wrapper.c and usage.c have dependencies on repository and trace2. These
+could be removed, but only by sacrificing standard Git's ability to trace
+functions in those files and in other files in git-std-lib.a. So that
+git-std-lib.a can compile with those dependencies, stubbed-out versions
+of those files are implemented and swapped in at compile time.
+
+Files inside of Git Standard Library
+================
+
+The initial set of files in git-std-lib.a are:
+abspath.c
+ctype.c
+date.c
+hex-ll.c
+parse.c
+strbuf.c
+usage.c
+utf8.c
+wrapper.c
+stubs/repository.c
+stubs/trace2.c
+relevant compat/ files
+
+Pitfalls
+================
+
+There are a small number of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, just an observation that libification in
+general may also require some minor compatibility work in the future.
+
+Testing
+================
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (e.g. the introduction of an out-of-scope dependency), and
+new functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
diff --git a/Makefile b/Makefile
index e9ad9f9ef1..82510cf50e 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ FUZZ_PROGRAMS =
 GIT_OBJS =
 LIB_OBJS =
 SCALAR_OBJS =
+STUB_OBJS =
 OBJECTS =
 OTHER_PROGRAMS =
 PROGRAM_OBJS =
@@ -956,6 +957,7 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
 
 LIB_H = $(FOUND_H_SOURCES)
 
+ifndef GIT_STD_LIB
 LIB_OBJS += abspath.o
 LIB_OBJS += add-interactive.o
 LIB_OBJS += add-patch.o
@@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
 LIB_OBJS += ws.o
 LIB_OBJS += wt-status.o
 LIB_OBJS += xdiff-interface.o
+else ifdef GIT_STD_LIB
+LIB_OBJS += abspath.o
+LIB_OBJS += ctype.o
+LIB_OBJS += date.o
+LIB_OBJS += hex-ll.o
+LIB_OBJS += parse.o
+LIB_OBJS += strbuf.o
+LIB_OBJS += usage.o
+LIB_OBJS += utf8.o
+LIB_OBJS += wrapper.o
+
+ifdef STUB_REPOSITORY
+STUB_OBJS += stubs/repository.o
+endif
+
+ifdef STUB_TRACE2
+STUB_OBJS += stubs/trace2.o
+endif
+
+LIB_OBJS += $(STUB_OBJS)
+endif
 
 BUILTIN_OBJS += builtin/add.o
 BUILTIN_OBJS += builtin/am.o
@@ -2162,6 +2185,11 @@ ifdef FSMONITOR_OS_SETTINGS
 	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
 endif
 
+ifdef GIT_STD_LIB
+	BASIC_CFLAGS += -DGIT_STD_LIB
+	BASIC_CFLAGS += -DNO_GETTEXT
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -3654,7 +3682,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) po/git.pot po/git-core.pot
 	$(RM) git.res
 	$(RM) $(OBJECTS)
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
@@ -3834,3 +3862,35 @@ $(FUZZ_PROGRAMS): all
 		$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 
 fuzz-all: $(FUZZ_PROGRAMS)
+
+### Libified Git rules
+
+# git-std-lib
+# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
+STD_LIB = git-std-lib.a
+
+$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+TEMP_HEADERS = temp_headers/
+
+git-std-lib:
+# Move headers to temporary folder and replace them with stubbed headers.
+# After building, move headers and stubbed headers back.
+ifneq ($(STUB_OBJS),)
+	mkdir -p $(TEMP_HEADERS); \
+	for d in $(STUB_OBJS); do \
+		BASE=$${d%.*}; \
+		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
+		mv $${BASE}.h $${BASE##*/}.h; \
+	done; \
+	$(MAKE) $(STD_LIB); \
+	for d in $(STUB_OBJS); do \
+		BASE=$${d%.*}; \
+		mv $${BASE##*/}.h $${BASE}.h; \
+		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
+	done; \
+	rm -rf $(TEMP_HEADERS)
+else
+	$(MAKE) $(STD_LIB)
+endif
diff --git a/git-compat-util.h b/git-compat-util.h
index 481dac22b0..75aa9b263e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
 #define platform_core_config noop_core_config
 #endif
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 int lstat_cache_aware_rmdir(const char *path);
-#if !defined(__MINGW32__) && !defined(_MSC_VER)
 #define rmdir lstat_cache_aware_rmdir
 #endif
 
@@ -787,9 +787,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 #endif
 
 #ifdef NO_PTHREADS
+#ifndef GIT_STD_LIB
 #define atexit git_atexit
 int git_atexit(void (*handler)(void));
 #endif
+#endif
 
 /*
  * Limit size of IO chunks, because huge chunks only cause pain.  OS X
@@ -951,14 +953,17 @@ int git_access(const char *path, int mode);
 # endif
 #endif
 
+#ifndef GIT_STD_LIB
 int cmd_main(int, const char **);
 
 /*
  * Intercept all calls to exit() and route them to trace2 to
  * optionally emit a message before calling the real exit().
  */
+
 int common_exit(const char *file, int line, int code);
 #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
 
 /*
  * You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git a/stubs/repository.c b/stubs/repository.c
new file mode 100644
index 0000000000..f81520d083
--- /dev/null
+++ b/stubs/repository.c
@@ -0,0 +1,4 @@
+#include "git-compat-util.h"
+#include "repository.h"
+
+struct repository *the_repository;
diff --git a/stubs/repository.h b/stubs/repository.h
new file mode 100644
index 0000000000..18262d748e
--- /dev/null
+++ b/stubs/repository.h
@@ -0,0 +1,8 @@
+#ifndef REPOSITORY_H
+#define REPOSITORY_H
+
+struct repository { int stub; };
+
+extern struct repository *the_repository;
+
+#endif /* REPOSITORY_H */
diff --git a/stubs/trace2.c b/stubs/trace2.c
new file mode 100644
index 0000000000..efc3f9c1f3
--- /dev/null
+++ b/stubs/trace2.c
@@ -0,0 +1,22 @@
+#include "git-compat-util.h"
+#include "trace2.h"
+
+void trace2_region_enter_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_region_leave_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_data_string_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   const char *value) { }
+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
+			    va_list ap) { }
+void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
+void trace2_thread_start_fl(const char *file, int line,
+			    const char *thread_base_name) { }
+void trace2_thread_exit_fl(const char *file, int line) { }
+void trace2_data_intmax_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   intmax_t value) { }
+int trace2_is_enabled(void) { return 0; }
+void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
diff --git a/stubs/trace2.h b/stubs/trace2.h
new file mode 100644
index 0000000000..88ad7387ff
--- /dev/null
+++ b/stubs/trace2.h
@@ -0,0 +1,69 @@
+#ifndef TRACE2_H
+#define TRACE2_H
+
+struct child_process { int stub; };
+struct repository;
+struct json_writer { int stub; };
+
+void trace2_region_enter_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...);
+
+#define trace2_region_enter(category, label, repo) \
+	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
+
+void trace2_region_leave_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...);
+
+#define trace2_region_leave(category, label, repo) \
+	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
+
+void trace2_data_string_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   const char *value);
+
+#define trace2_data_string(category, repo, key, value)                       \
+	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
+			      (value))
+
+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
+
+#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
+
+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
+			    va_list ap);
+
+#define trace2_cmd_error_va(fmt, ap) \
+	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
+
+
+void trace2_cmd_name_fl(const char *file, int line, const char *name);
+
+#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
+
+void trace2_thread_start_fl(const char *file, int line,
+			    const char *thread_base_name);
+
+#define trace2_thread_start(thread_base_name) \
+	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
+
+void trace2_thread_exit_fl(const char *file, int line);
+
+#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
+
+void trace2_data_intmax_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   intmax_t value);
+
+#define trace2_data_intmax(category, repo, key, value)                       \
+	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
+			      (value))
+
+enum trace2_process_info_reason {
+	TRACE2_PROCESS_INFO_STARTUP,
+	TRACE2_PROCESS_INFO_EXIT,
+};
+int trace2_is_enabled(void);
+void trace2_collect_process_info(enum trace2_process_info_reason reason);
+
+#endif /* TRACE2_H */
+
diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..bced721a0c 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -337,6 +337,7 @@ void invalidate_lstat_cache(void)
 	reset_lstat_cache(&default_cache);
 }
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 #undef rmdir
 int lstat_cache_aware_rmdir(const char *path)
 {
@@ -348,3 +349,4 @@ int lstat_cache_aware_rmdir(const char *path)
 
 	return ret;
 }
+#endif
-- 
2.41.0.640.ga95def55d0-goog



* [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (5 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
@ 2023-08-10 16:36   ` Calvin Wan
  2023-08-14 22:28     ` Jonathan Tan
  2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
  2023-08-15  9:41   ` Phillip Wood
  8 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-08-10 16:36 UTC (permalink / raw)
  To: git
  Cc: Calvin Wan, nasamuffin, chooglen, jonathantanmy, linusa,
	phillip.wood123, vdye

Add test file that directly or indirectly calls all functions defined in
git-std-lib.a object files to showcase that they do not reference
missing objects and that git-std-lib.a can stand on its own.

Certain functions that cause the program to exit or are already called
by other functions are commented out.

TODO: replace with unit tests
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 t/Makefile      |   4 +
 t/stdlib-test.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 243 insertions(+)
 create mode 100644 t/stdlib-test.c

diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..b6d0bc9daa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -150,3 +150,7 @@ perf:
 
 .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
 	check-chainlint clean-chainlint test-chainlint
+
+test-git-std-lib:
+	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a
+	./stdlib-test
diff --git a/t/stdlib-test.c b/t/stdlib-test.c
new file mode 100644
index 0000000000..a5d7374e2f
--- /dev/null
+++ b/t/stdlib-test.c
@@ -0,0 +1,239 @@
+#include "../git-compat-util.h"
+#include "../abspath.h"
+#include "../hex-ll.h"
+#include "../parse.h"
+#include "../strbuf.h"
+#include "../string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ */
+
+void abspath_funcs(void) {
+	struct strbuf sb = STRBUF_INIT;
+
+	fprintf(stderr, "calling abspath functions\n");
+	is_directory("foo");
+	strbuf_realpath(&sb, "foo", 0);
+	strbuf_realpath_forgiving(&sb, "foo", 0);
+	real_pathdup("foo", 0);
+	absolute_path("foo");
+	absolute_pathdup("foo");
+	prefix_filename("foo/", "bar");
+	prefix_filename_except_for_dash("foo/", "bar");
+	is_absolute_path("foo");
+	strbuf_add_absolute_path(&sb, "foo");
+	strbuf_add_real_path(&sb, "foo");
+}
+
+void hex_ll_funcs(void) {
+	unsigned char c;
+
+	fprintf(stderr, "calling hex-ll functions\n");
+
+	hexval('c');
+	hex2chr("A1");
+	hex_to_bytes(&c, "A1", 2);
+}
+
+void parse_funcs(void) {
+	intmax_t foo;
+	ssize_t foo1 = -1;
+	unsigned long foo2;
+	int foo3;
+	int64_t foo4;
+
+	fprintf(stderr, "calling parse functions\n");
+
+	git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+	git_parse_ssize_t("42", &foo1);
+	git_parse_ulong("42", &foo2);
+	git_parse_int("42", &foo3);
+	git_parse_int64("42", &foo4);
+	git_parse_maybe_bool("foo");
+	git_parse_maybe_bool_text("foo");
+	git_env_bool("foo", 1);
+	git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+	return 0;
+}
+
+void strbuf_funcs(void) {
+	struct strbuf *sb = xmalloc(sizeof(*sb));
+	struct strbuf *sb2 = xmalloc(sizeof(*sb2));
+	struct strbuf sb3 = STRBUF_INIT;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *buf = "foo";
+	struct strbuf_expand_dict_entry dict[] = {
+		{ "foo", NULL, },
+		{ "bar", NULL, },
+	};
+	int fd = open("/dev/null", O_RDONLY);
+
+	fprintf(stderr, "calling strbuf functions\n");
+
+	starts_with("foo", "bar");
+	istarts_with("foo", "bar");
+	// skip_to_optional_arg_default(const char *str, const char *prefix,
+	// 			 const char **arg, const char *def)
+	strbuf_init(sb, 0);
+	strbuf_init(sb2, 0);
+	strbuf_release(sb);
+	strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0); // calls strbuf_grow
+	strbuf_swap(sb, sb2);
+	strbuf_setlen(sb, 0);
+	strbuf_trim(sb); // calls strbuf_rtrim, strbuf_ltrim
+	// strbuf_rtrim() called by strbuf_trim()
+	// strbuf_ltrim() called by strbuf_trim()
+	strbuf_trim_trailing_dir_sep(sb);
+	strbuf_trim_trailing_newline(sb);
+	strbuf_reencode(sb, "foo", "bar");
+	strbuf_tolower(sb);
+	strbuf_add_separated_string_list(sb, " ", &list);
+	strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+	strbuf_cmp(sb, sb2);
+	strbuf_addch(sb, 1);
+	strbuf_splice(sb, 0, 1, "foo", 3);
+	strbuf_insert(sb, 0, "foo", 3);
+	// strbuf_vinsertf() called by strbuf_insertf
+	strbuf_insertf(sb, 0, "%s", "foo");
+	strbuf_remove(sb, 0, 1);
+	strbuf_add(sb, "foo", 3);
+	strbuf_addbuf(sb, sb2);
+	strbuf_join_argv(sb, 0, NULL, ' ');
+	strbuf_addchars(sb, 1, 1);
+	strbuf_addf(sb, "%s", "foo");
+	strbuf_add_commented_lines(sb, "foo", 3, '#');
+	strbuf_commented_addf(sb, '#', "%s", "foo");
+	// strbuf_vaddf() called by strbuf_addf()
+	strbuf_expand(sb, "%s", strbuf_expand_literal_cb, NULL);
+	strbuf_expand(sb, "%s", strbuf_expand_dict_cb, &dict);
+	// strbuf_expand_literal_cb() called by strbuf_expand()
+	// strbuf_expand_dict_cb() called by strbuf_expand()
+	strbuf_addbuf_percentquote(sb, &sb3);
+	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+	strbuf_fread(sb, 0, stdin);
+	strbuf_read(sb, fd, 0);
+	strbuf_read_once(sb, fd, 0);
+	strbuf_write(sb, stderr);
+	strbuf_readlink(sb, "/dev/null", 0);
+	strbuf_getcwd(sb);
+	strbuf_getwholeline(sb, stderr, '\n');
+	strbuf_appendwholeline(sb, stderr, '\n');
+	strbuf_getline(sb, stderr);
+	strbuf_getline_lf(sb, stderr);
+	strbuf_getline_nul(sb, stderr);
+	strbuf_getwholeline_fd(sb, fd, '\n');
+	strbuf_read_file(sb, "/dev/null", 0);
+	strbuf_add_lines(sb, "foo", "bar", 0);
+	strbuf_addstr_xml_quoted(sb, "foo");
+	strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+	strbuf_humanise_bytes(sb, 42);
+	strbuf_humanise_rate(sb, 42);
+	printf_ln("%s", sb->buf);
+	fprintf_ln(stderr, "%s", sb->buf);
+	xstrdup_tolower("foo");
+	xstrdup_toupper("foo");
+	// xstrvfmt() called by xstrfmt()
+	xstrfmt("%s", "foo");
+	// strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+	// 	     int tz_offset, int suppress_tz_name)
+	// strbuf_stripspace(struct strbuf *sb, char comment_line_char)
+	// strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
+	// strbuf_strip_file_from_path(struct strbuf *sb)
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
+
+void usage_funcs(void) {
+	fprintf(stderr, "calling usage functions\n");
+	// Functions that call exit() are commented out
+
+	// usage()
+	// usagef()
+	// die()
+	// die_errno();
+	error("foo");
+	error_errno("foo");
+	die_message("foo");
+	die_message_errno("foo");
+	warning("foo");
+	warning_errno("foo");
+
+	// set_die_routine();
+	get_die_message_routine();
+	set_error_routine(error_builtin);
+	get_error_routine();
+	set_warn_routine(warn_builtin);
+	get_warn_routine();
+	// set_die_is_recursing_routine();
+}
+
+void wrapper_funcs(void) {
+	void *ptr = xmalloc(1);
+	int fd = open("/dev/null", O_RDONLY);
+	struct strbuf sb = STRBUF_INIT;
+	int mode = 0444;
+	char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+	int tmp;
+	xsnprintf(path, sizeof(path), "out-XXXXXX");
+	xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+
+	fprintf(stderr, "calling wrapper functions\n");
+
+	xstrdup("foo");
+	xmalloc(1);
+	xmallocz(1);
+	xmallocz_gently(1);
+	xmemdupz("foo", 3);
+	xstrndup("foo", 3);
+	xrealloc(ptr, 2);
+	xcalloc(1, 1);
+	xsetenv("foo", "bar", 0);
+	xopen("/dev/null", O_RDONLY);
+	xread(fd, &sb, 1);
+	xwrite(fd, &sb, 1);
+	xpread(fd, &sb, 1, 0);
+	xdup(fd);
+	xfopen("/dev/null", "r");
+	xfdopen(fd, "r");
+	tmp = xmkstemp(path);
+	close(tmp);
+	unlink(path);
+	tmp = xmkstemp_mode(path1, mode);
+	close(tmp);
+	unlink(path1);
+	xgetcwd();
+	fopen_for_writing(path);
+	fopen_or_warn(path, "r");
+	xstrncmpz("foo", "bar", 3);
+	// xsnprintf() called above
+	xgethostname(host, 3);
+	tmp = git_mkstemps_mode(path, 1, mode);
+	close(tmp);
+	unlink(path);
+	tmp = git_mkstemp_mode(path, mode);
+	close(tmp);
+	unlink(path);
+	read_in_full(fd, &sb, 1);
+	write_in_full(fd, &sb, 1);
+	pread_in_full(fd, &sb, 1, 0);
+}
+
+int main() {
+	abspath_funcs();
+	hex_ll_funcs();
+	parse_funcs();
+	strbuf_funcs();
+	usage_funcs();
+	wrapper_funcs();
+	fprintf(stderr, "all git-std-lib functions finished calling\n");
+	return 0;
+}
-- 
2.41.0.640.ga95def55d0-goog



* Re: [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
@ 2023-08-10 20:32     ` Junio C Hamano
  2023-08-10 22:36     ` Glen Choo
  1 sibling, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-08-10 20:32 UTC (permalink / raw)
  To: Calvin Wan
  Cc: git, nasamuffin, chooglen, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> While remove_or_warn() is a simple ternary operator to call two other
> wrapper functions, it creates an unnecessary dependency to object.h in
> wrapper.c. Therefore move the function to object.[ch] where the concept
> of GITLINKs is first defined.

An untold assumption here is that we would want to make wrapper.[ch]
independent of Git's internals?

If so, where the thing is moved to (i.e. object.c) is much less
interesting than the fact that the goal of this move is to make
wrapper.[ch] less dependent on Git, so the title should reflect
that, no?

> +/*
> + * Calls the correct function out of {unlink,rmdir}_or_warn based on
> + * the supplied file mode.
> + */
> +int remove_or_warn(unsigned int mode, const char *path);

OK.  That "file mode" thing is not a regular "struct stat .st_mode",
but knows Git's internals, hence it makes sense to have it on our
side, not on the wrapper.[ch] side.  That makes sense.

>  #endif /* OBJECT_H */
> diff --git a/wrapper.c b/wrapper.c
> index 22be9812a7..118d3033de 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -5,7 +5,6 @@
>  #include "abspath.h"
>  #include "config.h"
>  #include "gettext.h"
> -#include "object.h"
>  #include "repository.h"
>  #include "strbuf.h"
>  #include "trace2.h"
> @@ -647,11 +646,6 @@ int rmdir_or_warn(const char *file)
>  	return warn_if_unremovable("rmdir", file, rmdir(file));
>  }
>  
> -int remove_or_warn(unsigned int mode, const char *file)
> -{
> -	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
> -}
> -
>  static int access_error_is_ok(int err, unsigned flag)
>  {
>  	return (is_missing_file_error(err) ||
> diff --git a/wrapper.h b/wrapper.h
> index c85b1328d1..272795f863 100644
> --- a/wrapper.h
> +++ b/wrapper.h
> @@ -111,11 +111,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
>   * not exist.
>   */
>  int rmdir_or_warn(const char *path);
> -/*
> - * Calls the correct function out of {unlink,rmdir}_or_warn based on
> - * the supplied file mode.
> - */
> -int remove_or_warn(unsigned int mode, const char *path);
>  
>  /*
>   * Call access(2), but warn for any error except "missing file"


* Re: [RFC PATCH v2 3/7] config: correct bad boolean env value error message
  2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
@ 2023-08-10 20:36     ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-08-10 20:36 UTC (permalink / raw)
  To: Calvin Wan
  Cc: git, nasamuffin, chooglen, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> An incorrectly defined boolean environment value would result in the
> following error message:
>
> bad boolean config value '%s' for '%s'
>
> This is a misnomer since environment value != config value. Instead of
> calling git_config_bool() to parse the environment value, mimic the
> functionality inside of git_config_bool() but with the correct error
> message.

Makes sense.

>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  config.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/config.c b/config.c
> index 09851a6909..5b71ef1624 100644
> --- a/config.c
> +++ b/config.c
> @@ -2172,7 +2172,14 @@ void git_global_config(char **user_out, char **xdg_out)
>  int git_env_bool(const char *k, int def)
>  {
>  	const char *v = getenv(k);
> -	return v ? git_config_bool(k, v) : def;
> +	int val;
> +	if (!v)
> +		return def;
> +	val = git_parse_maybe_bool(v);
> +	if (val < 0)
> +		die(_("bad boolean environment value '%s' for '%s'"),
> +		    v, k);
> +	return val;
>  }
>  
>  /*
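
For readers without the tree at hand, the behaviour after this patch can
be approximated by the minimal sketch below. It is not the code from
config.c: parse_maybe_bool_sketch() is a simplified stand-in for
git_parse_maybe_bool() (its word list is an assumption, and the real
function accepts more inputs), and env_bool_sketch() reports the problem
through a caller-supplied buffer instead of calling die().

```c
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>

/*
 * Simplified stand-in for git_parse_maybe_bool(): 1 or 0 for a
 * recognized boolean word, -1 for anything else.
 */
static int parse_maybe_bool_sketch(const char *v)
{
	static const char *const yes[] = { "true", "yes", "on", "1" };
	static const char *const no[] = { "false", "no", "off", "0", "" };
	size_t i;

	for (i = 0; i < sizeof(yes) / sizeof(yes[0]); i++)
		if (!strcasecmp(v, yes[i]))
			return 1;
	for (i = 0; i < sizeof(no) / sizeof(no[0]); i++)
		if (!strcasecmp(v, no[i]))
			return 0;
	return -1;
}

/*
 * Returns def when the variable is unset, 0 or 1 when it parses, and
 * -1 (with a message in err) when the value is not a boolean.
 */
static int env_bool_sketch(const char *k, int def, char *err, size_t errlen)
{
	const char *v = getenv(k);
	int val;

	if (!v)
		return def;
	val = parse_maybe_bool_sketch(v);
	if (val < 0)
		snprintf(err, errlen,
			 "bad boolean environment value '%s' for '%s'", v, k);
	return val;
}
```

The point of the patch is visible in the message text: the value came
from the environment, so the message says "environment value" rather
than "config value".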


* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (6 preceding siblings ...)
  2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-08-10 22:05   ` Glen Choo
  2023-08-15  9:20     ` Phillip Wood
  2023-08-15  9:41   ` Phillip Wood
  8 siblings, 1 reply; 111+ messages in thread
From: Glen Choo @ 2023-08-10 22:05 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> Calvin Wan (7):
>   hex-ll: split out functionality from hex
>   object: move function to object.c
>   config: correct bad boolean env value error message
>   parse: create new library for parsing strings and env values
>   date: push pager.h dependency up
>   git-std-lib: introduce git standard library
>   git-std-lib: add test file to call git-std-lib.a functions

This doesn't seem to apply to 'master'. Do you have a base commit that
reviewers could apply the patches to?


* Re: [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
  2023-08-10 20:32     ` Junio C Hamano
@ 2023-08-10 22:36     ` Glen Choo
  2023-08-10 22:43       ` Junio C Hamano
  1 sibling, 1 reply; 111+ messages in thread
From: Glen Choo @ 2023-08-10 22:36 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> While remove_or_warn() is a simple ternary operator to call two other
> wrapper functions, it creates an unnecessary dependency to object.h in
> wrapper.c. Therefore move the function to object.[ch] where the concept
> of GITLINKs is first defined.

As Junio mentioned elsewhere, I think we need to establish that
wrapper.c should be free of Git-specific internals.

> diff --git a/object.c b/object.c
> index 60f954194f..cb29fcc304 100644
> --- a/object.c
> +++ b/object.c
> @@ -617,3 +617,8 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
>  	FREE_AND_NULL(o->object_state);
>  	FREE_AND_NULL(o->shallow_stat);
>  }
> +
> +int remove_or_warn(unsigned int mode, const char *file)
> +{
> +	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
> +}

Since this function really needs S_ISGITLINK (I tried to see if we could
just replace it with S_ISDIR and get the same behavior, but we can't),
this really is a Git-specific thing, so yes, this should be moved out of
wrapper.c.

Minor point: I think a better home might be entry.[ch], because those
files care about performing changes on the worktree based on the
Git-specific file modes in the index, whereas object.[ch] seems more
concerned about the format of objects.


* Re: [RFC PATCH v2 2/7] object: move function to object.c
  2023-08-10 22:36     ` Glen Choo
@ 2023-08-10 22:43       ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-08-10 22:43 UTC (permalink / raw)
  To: Glen Choo
  Cc: Calvin Wan, git, nasamuffin, jonathantanmy, linusa,
	phillip.wood123, vdye

Glen Choo <chooglen@google.com> writes:

> Minor point: I think a better home might be entry.[ch], because those
> files care about performing changes on the worktree based on the
> Git-specific file modes in the index, whereas object.[ch] seems more
> concerned about the format of objects.

Yeah, I wasn't paying much attention to that point while reading the
patch, and I do agree with you that entry.[ch] may be a better fit.

Thanks.



* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-08-10 23:21     ` Glen Choo
  2023-08-10 23:43       ` Junio C Hamano
  2023-08-14 22:15       ` Jonathan Tan
  2023-08-14 22:09     ` Jonathan Tan
  1 sibling, 2 replies; 111+ messages in thread
From: Glen Choo @ 2023-08-10 23:21 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> While string and environment value parsing is mainly consumed by
> config.c, there are other files that only need parsing functionality and
> not config functionality. By separating out string and environment value
> parsing from config, those files can instead be dependent on parse,
> which has a much smaller dependency chain than config.
>
> Move general string and env parsing functions from config.[ch] to
> parse.[ch].

An unstated purpose of this patch is that parse.[ch] becomes part of
git-std-lib, but not config.[ch], right?

I think it's reasonable to have the string value parsing logic in
git-std-lib, e.g. this parsing snippet from diff.c seems like a good
thing to put into a library that wants to accept user input:

  static int parse_color_moved(const char *arg)
  {
    switch (git_parse_maybe_bool(arg)) {
    case 0:
      return COLOR_MOVED_NO;
    case 1:
      return COLOR_MOVED_DEFAULT;
    default:
      break;
    }

    if (!strcmp(arg, "no"))
      return COLOR_MOVED_NO;
    else if (!strcmp(arg, "plain"))
      return COLOR_MOVED_PLAIN;
    else if (!strcmp(arg, "blocks"))
      return COLOR_MOVED_BLOCKS;
    /* ... */
  }

But, I don't see why a non-Git caller would want environment value
parsing in git-std-lib. I wouldn't think that libraries should be
reading Git-formatted environment variables. If I had to guess, you
arranged it this way because you want to keep xmalloc in git-std-lib,
which has a dependency on env var parsing here:

  static int memory_limit_check(size_t size, int gentle)
  {
    static size_t limit = 0;
    if (!limit) {
      limit = git_env_ulong("GIT_ALLOC_LIMIT", 0);
      if (!limit)
        limit = SIZE_MAX;
    }
    if (size > limit) {
      if (gentle) {
        error("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
              (uintmax_t)size, (uintmax_t)limit);
        return -1;
      } else
        die("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
            (uintmax_t)size, (uintmax_t)limit);
    }
    return 0;
  }

If we libified this as-is, wouldn't our caller start paying attention to
the GIT_ALLOC_LIMIT environment variable? That seems like an undesirable
side effect.

I see later in the series that you have "stubs", which are presumably
entrypoints for the caller to specify their own implementations of
Git-specific things. If so, then an alternative would be to provide a
"stub" to get the memory limit, something like:

  /* wrapper.h aka the things to stub */
  size_t git_get_memory_limit(void);

  /* stub-wrapper-or-something.c aka Git's implementation of the stub */

  #include "wrapper.h"
  size_t git_get_memory_limit(void)
  {
      return git_env_ulong("GIT_ALLOC_LIMIT", 0);
  }

  /* wrapper.c aka the thing in git-std-lib */
  static int memory_limit_check(size_t size, int gentle)
  {
    static size_t limit = 0;
    if (!limit) {
      limit = git_get_memory_limit();
      if (!limit)
        limit = SIZE_MAX;
    }
    if (size > limit) {
      if (gentle) {
        error("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
              (uintmax_t)size, (uintmax_t)limit);
        return -1;
      } else
        die("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
            (uintmax_t)size, (uintmax_t)limit);
    }
    return 0;
  }
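
To make the other side of that boundary concrete, here is a rough sketch
of what a non-Git caller might link in place of Git's implementation.
git_get_memory_limit() is the hypothetical hook proposed above, not an
existing function; the 1 MiB figure is arbitrary, and
memory_limit_check_sketch() is a cut-down version of the check that
returns -1 instead of calling error() or die().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical hook, supplied here by the embedding application
 * rather than read from GIT_ALLOC_LIMIT.
 */
size_t git_get_memory_limit(void)
{
	return 1024 * 1024; /* this caller caps single allocations at 1 MiB */
}

/* Cut-down check: 0 when the size is allowed, -1 when over the limit. */
static int memory_limit_check_sketch(size_t size)
{
	static size_t limit;

	if (!limit) {
		limit = git_get_memory_limit();
		if (!limit)
			limit = SIZE_MAX; /* 0 means "no limit" */
	}
	return size > limit ? -1 : 0;
}
```

With this arrangement the environment variable is never consulted
unless the caller's own git_get_memory_limit() chooses to read it.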


* Re: [RFC PATCH v2 5/7] date: push pager.h dependency up
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
@ 2023-08-10 23:41     ` Glen Choo
  2023-08-14 22:17     ` Jonathan Tan
  1 sibling, 0 replies; 111+ messages in thread
From: Glen Choo @ 2023-08-10 23:41 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> In order for date.c to be included in git-std-lib, the dependency to
> pager.h must be removed since it has dependencies on many other files
> not in git-std-lib.

Dependencies aside, I doubt callers of Git libraries want Git's
pager-handling logic bundled in git-std-lib ;)

> @@ -1003,13 +1002,13 @@ static enum date_mode_type parse_date_type(const char *format, const char **end)
>  	die("unknown date format %s", format);
>  }
>  
> -void parse_date_format(const char *format, struct date_mode *mode)
> +void parse_date_format(const char *format, struct date_mode *mode, int pager_in_use)
>  {
>  	const char *p;
>  
>  	/* "auto:foo" is "if tty/pager, then foo, otherwise normal" */
>  	if (skip_prefix(format, "auto:", &p)) {
> -		if (isatty(1) || pager_in_use())
> +		if (isatty(1) || pager_in_use)
>  			format = p;
>  		else
>  			format = "default";

Hm, it feels odd to ship a parsing option that changes based on whether
the caller isatty or not. Ideally we would stub this "switch the value
of auto" logic too.
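
As a sketch of what moving the whole decision to the caller could look
like (resolve_auto_format() and its use_auto parameter are hypothetical
names, not anything in the patch):

```c
#include <string.h>

/*
 * Resolve an "auto:foo" date format. The caller decides whether the
 * "auto" branch applies; this function never looks at the tty or pager.
 */
static const char *resolve_auto_format(const char *format, int use_auto)
{
	static const char prefix[] = "auto:";
	size_t len = sizeof(prefix) - 1;

	if (strncmp(format, prefix, len))
		return format; /* not an "auto:" format, pass through */
	return use_auto ? format + len : "default";
}
```

Git itself could then pass `isatty(1) || pager_in_use()` as use_auto,
while a library caller would pass whatever fits its own output.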

Without reading ahead, I'm not sure if there are other sorts of "library
influencing process-wide" oddities like the one here and in the previous
patch. I think it would be okay for us to merge this series with these,
as long as we advertise to callers that the library boundary isn't very
clean yet, and we eventually clean it up.


* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 23:21     ` Glen Choo
@ 2023-08-10 23:43       ` Junio C Hamano
  2023-08-14 22:15       ` Jonathan Tan
  1 sibling, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-08-10 23:43 UTC (permalink / raw)
  To: Glen Choo
  Cc: Calvin Wan, git, nasamuffin, jonathantanmy, linusa,
	phillip.wood123, vdye

Glen Choo <chooglen@google.com> writes:

> I think it's reasonable to have the string value parsing logic in
> git-std-lib, e.g. this parsing snippet from diff.c seems like a good
> thing to put into a library that wants to accept user input:
>
>   static int parse_color_moved(const char *arg)
>   {
>     switch (git_parse_maybe_bool(arg)) {
>     case 0:
>       return COLOR_MOVED_NO;
>     case 1:
>       return COLOR_MOVED_DEFAULT;
>     default:
>       break;
>     }
>
>     if (!strcmp(arg, "no"))
>       return COLOR_MOVED_NO;
>     else if (!strcmp(arg, "plain"))
>       return COLOR_MOVED_PLAIN;
>     else if (!strcmp(arg, "blocks"))
>       return COLOR_MOVED_BLOCKS;
>     /* ... */
>   }
>
> But, I don't see why a non-Git caller would want environment value
> parsing in git-std-lib.

It also is debatable why a non-Git caller wants to parse the value
to the "--color-moved" option (or a configuration variable) to begin
with.  Its vocabulary is closely tied to what the diff machinery in
Git can do, isn't it?


* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
  2023-08-10 23:21     ` Glen Choo
@ 2023-08-14 22:09     ` Jonathan Tan
  2023-08-14 22:19       ` Junio C Hamano
  1 sibling, 1 reply; 111+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:09 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> While string and environment value parsing is mainly consumed by
> config.c, there are other files that only need parsing functionality and
> not config functionality. By separating out string and environment value
> parsing from config, those files can instead be dependent on parse,
> which has a much smaller dependency chain than config.
> 
> Move general string and env parsing functions from config.[ch] to
> parse.[ch].
> 
> Signed-off-by: Calvin Wan <calvinwan@google.com>

Thanks - I think that patches 1 through 4 are worth merging even now.
One thing we hoped to accomplish through the libification effort is to
make changes that are beneficial even outside the libification context,
and it seems that this is one of them. Previously, code needed to
include config.h even when it didn't use the main functionality that
config.h provides (config), but now it no longer needs to do so. (And
same argument for hex, although on a smaller scale.)


* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-10 23:21     ` Glen Choo
  2023-08-10 23:43       ` Junio C Hamano
@ 2023-08-14 22:15       ` Jonathan Tan
  1 sibling, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:15 UTC (permalink / raw)
  To: Glen Choo
  Cc: Jonathan Tan, Calvin Wan, git, nasamuffin, linusa, phillip.wood123, vdye

Glen Choo <chooglen@google.com> writes:
> But, I don't see why a non-Git caller would want environment value
> parsing in git-std-lib. I wouldn't think that libraries should be
> reading Git-formatted environment variables.

I think environment parsing in git-std-lib is fine, at least for the
short term. First, currently we expect a lot from a user of our library
(including tolerating breaking changes in API), so I think it is
reasonable for such a user to be aware that some functionality can be
changed by an environment variable. Second, the purpose of the library
is to provide functionality that currently is only accessible through
CLI in library form, and if the CLI deems that some functionality
should be accessible through an environment variable instead of a config
variable or CLI parameter for whatever reason, we should reflect that in
the library as well.
 


* Re: [RFC PATCH v2 5/7] date: push pager.h dependency up
  2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
  2023-08-10 23:41     ` Glen Choo
@ 2023-08-14 22:17     ` Jonathan Tan
  1 sibling, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:17 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> In order for date.c to be included in git-std-lib, the dependency to
> pager.h must be removed since it has dependencies on many other files
> not in git-std-lib. We achieve this by passing a boolean for
> "pager_in_use", rather than checking for it in parse_date_format() so
> callers of the function will have that dependency.

Instead of doing it as you describe here, could this be another stub
instead? That way, we don't need to change the code here.
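
Roughly, and assuming a new file along the lines of stubs/pager.c (a
hypothetical name, not part of this series), the stub could be as small
as the sketch below. auto_format_applies() is only there to show the
unchanged condition from parse_date_format() in isolation.

```c
#include <unistd.h>

/* Hypothetical stub: a library build never has Git's pager running. */
int pager_in_use(void)
{
	return 0;
}

/* The condition parse_date_format() used before this patch. */
static int auto_format_applies(void)
{
	return isatty(1) || pager_in_use();
}
```

With the stub linked in, the "auto:" decision reduces to the isatty(1)
check and date.c needs no signature change.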

I don't feel strongly about this, though, so if other reviewers think
that the approach in this patch makes the code better, I'm OK with that.
 


* Re: [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values
  2023-08-14 22:09     ` Jonathan Tan
@ 2023-08-14 22:19       ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-08-14 22:19 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Calvin Wan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Jonathan Tan <jonathantanmy@google.com> writes:

> Thanks - I think that patches 1 through 4 are worth merging even now.
> One thing we hoped to accomplish through the libification effort is to
> make changes that are beneficial even outside the libification context,
> and it seems that this is one of them. Previously, code needed to
> include config.h even when it didn't use the main functionality that
> config.h provides (config), but now it no longer needs to do so. (And
> same argument for hex, although on a smaller scale.)

Thanks for writing this down.  The parser is shared across handling
data that come from config, environ, and options, and separating it
as a component different from the config does make sense.
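
As a rough sketch of why such a parser can stand apart from config, the
helper below parses boolean text using only libc. The name
sketch_parse_bool_text() is hypothetical, and the handling of NULL and
the empty string is an assumption made for the example, not a statement
about what parse.c does.

```c
#include <stddef.h>
#include <strings.h>

/*
 * Hypothetical helper, not code from this series. Returns 1 for true,
 * 0 for false, and -1 if the text is not a recognized boolean.
 * Treating NULL as true and "" as false is an assumption.
 */
int sketch_parse_bool_text(const char *value)
{
	if (!value)
		return 1;
	if (!*value)
		return 0;
	if (!strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on"))
		return 1;
	if (!strcasecmp(value, "false") || !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off"))
		return 0;
	return -1;
}
```

Nothing here needs a config file, a repository, or any other Git
internals, so callers that read an environment variable or a
command-line option can share it.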



* Re: [RFC PATCH v2 6/7] git-std-lib: introduce git standard library
  2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
@ 2023-08-14 22:26     ` Jonathan Tan
  0 siblings, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:26 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> +Rationale behind Git Standard Library
> +================

Would it be clearer to write "Rationale behind what's in and what's not
in the Git Standard Library"? Or maybe that is too much of a mouthful.

> +Files inside of Git Standard Library
> +================
> +
> +The initial set of files in git-std-lib.a are:
> +abspath.c
> +ctype.c
> +date.c
> +hex-ll.c
> +parse.c
> +strbuf.c
> +usage.c
> +utf8.c
> +wrapper.c
> +stubs/repository.c
> +stubs/trace2.c
> +relevant compat/ files

I noticed that an earlier version did not have the "stubs" lines and
this version does, but could not find a comment about why these were
added. For me, what would make sense is to remove the "stubs" lines,
and then say "When these files are compiled together with the following
files (or user-provided files that provide the same functions), they
form a complete library", and then list the stubs after.

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 481dac22b0..75aa9b263e 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
>  #define platform_core_config noop_core_config
>  #endif
>  
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>  int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>  #define rmdir lstat_cache_aware_rmdir
>  #endif

(and other changes that use defined(GIT_STD_LIB))

One alternative is to add stubs for lstat_cache_aware_rmdir that call
the "real" rmdir, but I guess that would be unnecessarily confusing.
Also, it would be strange if a user included a header file that
redefined a standard library function, so I guess we do need such a
"defined()" guard.
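
To make the stub alternative concrete, it could look roughly like the
sketch below. This is not code from this series, and it only works if
it is compiled in a translation unit where rmdir has not been redefined
by the macro above.

```c
#include <unistd.h>

/*
 * Hypothetical stub, not code from this series: forwards to the
 * platform rmdir() without touching any lstat cache. It must live in a
 * translation unit where `rmdir` has NOT been redefined by a macro,
 * otherwise the call below would recurse into itself.
 */
int lstat_cache_aware_rmdir(const char *path)
{
	return rmdir(path);
}
```

The need to keep the macro out of this one file is part of what makes
the stub approach more confusing than the "defined()" guard.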


* Re: [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions
  2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-08-14 22:28     ` Jonathan Tan
  0 siblings, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-08-14 22:28 UTC (permalink / raw)
  To: Calvin Wan
  Cc: Jonathan Tan, git, nasamuffin, chooglen, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> Add test file that directly or indirectly calls all functions defined in
> git-std-lib.a object files to showcase that they do not reference
> missing objects and that git-std-lib.a can stand on its own.
> 
> Certain functions that cause the program to exit or are already called
> by other functions are commented out.
> 
> TODO: replace with unit tests
> Signed-off-by: Calvin Wan <calvinwan@google.com>

Thanks for this patch - it's useful for reviewers to see what this
patch set accomplishes (a way to compile a subset of files in Git that
can provide library functionality). I don't think we should merge it
as-is but should wait until we have a unit test that also exercises
functions, and then merge that instead (I think your TODO expresses the
same sentiment).
 


* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
@ 2023-08-15  9:20     ` Phillip Wood
  2023-08-16 17:17       ` Calvin Wan
  0 siblings, 1 reply; 111+ messages in thread
From: Phillip Wood @ 2023-08-15  9:20 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, jonathantanmy, linusa, vdye

On 10/08/2023 23:05, Glen Choo wrote:
> Calvin Wan <calvinwan@google.com> writes:
> 
>> Calvin Wan (7):
>>    hex-ll: split out functionality from hex
>>    object: move function to object.c
>>    config: correct bad boolean env value error message
>>    parse: create new library for parsing strings and env values
>>    date: push pager.h dependency up
>>    git-std-lib: introduce git standard library
>>    git-std-lib: add test file to call git-std-lib.a functions
> 
> This doesn't seem to apply to 'master'. Do you have a base commit that
> reviewers could apply the patches to?

I don't know what they are based on, but I did manage to apply them to 
master by using "am -3" and resolving the conflicts. The result is at 
https://github.com/phillipwood/git/tree/cw/git-std-lib/rfc-v2 if anyone 
is interested.

Best Wishes

Phillip



* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
                     ` (7 preceding siblings ...)
  2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
@ 2023-08-15  9:41   ` Phillip Wood
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
  8 siblings, 1 reply; 111+ messages in thread
From: Phillip Wood @ 2023-08-15  9:41 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, chooglen, jonathantanmy, linusa, vdye

Hi Calvin

On 10/08/2023 17:33, Calvin Wan wrote:
> Original cover letter:
> https://lore.kernel.org/git/20230627195251.1973421-1-calvinwan@google.com/
> 
> In the initial RFC, I had a patch that removed the trace2 dependency
> from usage.c so that git-std-lib.a would not have dependencies outside
> of git-std-lib.a files. Consequently this meant that tracing would not
> be possible in git-std-lib.a files for other developers of Git, and it
> is not a good idea for the libification effort to close the door on
> tracing in certain files for future development (thanks Victoria for
> pointing this out). That patch has been removed and instead I introduce
> stubbed out versions of repository.[ch] and trace2.[ch] that are swapped
> in during compilation time (I'm no Makefile expert so any advice on how
> I could do this better would be much appreciated). These stubbed out
> files contain no implementations and therefore do not have any
> additional dependencies, allowing git-std-lib.a to compile with only the
> stubs as additional dependencies.

I think stubbing out trace2 is a sensible approach. I don't think we
need separate headers when using the stub though, or a stub for
repository.c as we don't call any of the functions declared in that
header. I've appended a patch that shows a simplified stub. It also
removes the recursive make call as it no longer needs to juggle the
header files.

> This also has the added benefit of
> removing `#ifdef GIT_STD_LIB` macros in C files for specific library
> compilation rules. Libification shouldn't pollute C files with these
> macros. The boundaries for git-std-lib.a have also been updated to
> contain these stubbed out files.

Do you have any plans to support building with gettext support so we
can use git-std-lib.a as a dependency of libgit.a?
  
> I have also made some additional changes to the Makefile to piggy back
> off of our existing build rules for .c/.o targets and their
> dependencies. As I learn more about Makefiles, I am continuing to look
> for ways to improve these rules. Eventually I would like to be able to
> have a set of rules that future libraries can emulate and is scalable
> in the sense of not creating additional toil for developers that are not
> interested in libification.

I'm not sure reusing LIB_OBJS for different targets is a good idea.
Once libgit.a starts to depend on git-std-lib.a we'll want to build them
both with a single make invocation without resorting to recursive make
calls. I think we could perhaps make a template function to create the
compilation rules for each library - see the end of
https://wingolog.org/archives/2023/08/08/a-negative-result

Best Wishes

Phillip

---- >8 -----
 From 194403e42f116cc3c6ed8eb8b03d6933b24067e4 Mon Sep 17 00:00:00 2001
From: Phillip Wood <phillip.wood@dunelm.org.uk>
Date: Sat, 12 Aug 2023 17:27:23 +0100
Subject: [PATCH] git-std-lib: simplify stub implementation

The code in std-lib does not depend directly on the functions declared
in repository.h and so it does not need to provide stub
implementations of the functions declared in repository.h. There is a
transitive dependency on `struct repository` from the functions
declared in trace2.h but the stub implementation of those functions
can simply define its own stub for struct repository. There is also no
need to use different headers when compiling against the stub
implementation of trace2.

This means we can simplify the stub implementation by removing
stubs/{repository.[ch],trace2.h} and simplify the Makefile by removing
the code that replaces header files when compiling against the trace2
stub. git-std-lib.a can now be built by running

   make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease

There is one other small fixup in this commit:

  - `wrapper.c` includes `repository.h` but does not use any of the
    declarations.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
  Makefile           | 29 +-------------------
  stubs/repository.c |  4 ---
  stubs/repository.h |  8 ------
  stubs/trace2.c     |  5 ++++
  stubs/trace2.h     | 68 ----------------------------------------------
  wrapper.c          |  1 -
  6 files changed, 6 insertions(+), 109 deletions(-)
  delete mode 100644 stubs/repository.c
  delete mode 100644 stubs/repository.h
  delete mode 100644 stubs/trace2.h

diff --git a/Makefile b/Makefile
index a821d73c9d0..8eff4021025 100644
--- a/Makefile
+++ b/Makefile
@@ -1209,10 +1209,6 @@ LIB_OBJS += usage.o
  LIB_OBJS += utf8.o
  LIB_OBJS += wrapper.o
  
-ifdef STUB_REPOSITORY
-STUB_OBJS += stubs/repository.o
-endif
-
  ifdef STUB_TRACE2
  STUB_OBJS += stubs/trace2.o
  endif
@@ -3866,31 +3862,8 @@ fuzz-all: $(FUZZ_PROGRAMS)
  ### Libified Git rules
  
  # git-std-lib
-# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
+# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease`
  STD_LIB = git-std-lib.a
  
  $(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
  	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
-
-TEMP_HEADERS = temp_headers/
-
-git-std-lib:
-# Move headers to temporary folder and replace them with stubbed headers.
-# After building, move headers and stubbed headers back.
-ifneq ($(STUB_OBJS),)
-	mkdir -p $(TEMP_HEADERS); \
-	for d in $(STUB_OBJS); do \
-		BASE=$${d%.*}; \
-		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
-		mv $${BASE}.h $${BASE##*/}.h; \
-	done; \
-	$(MAKE) $(STD_LIB); \
-	for d in $(STUB_OBJS); do \
-		BASE=$${d%.*}; \
-		mv $${BASE##*/}.h $${BASE}.h; \
-		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
-	done; \
-	rm -rf temp_headers
-else
-	$(MAKE) $(STD_LIB)
-endif
diff --git a/stubs/repository.c b/stubs/repository.c
deleted file mode 100644
index f81520d083a..00000000000
--- a/stubs/repository.c
+++ /dev/null
@@ -1,4 +0,0 @@
-#include "git-compat-util.h"
-#include "repository.h"
-
-struct repository *the_repository;
diff --git a/stubs/repository.h b/stubs/repository.h
deleted file mode 100644
index 18262d748e5..00000000000
--- a/stubs/repository.h
+++ /dev/null
@@ -1,8 +0,0 @@
-#ifndef REPOSITORY_H
-#define REPOSITORY_H
-
-struct repository { int stub; };
-
-extern struct repository *the_repository;
-
-#endif /* REPOSITORY_H */
diff --git a/stubs/trace2.c b/stubs/trace2.c
index efc3f9c1f39..7d894822288 100644
--- a/stubs/trace2.c
+++ b/stubs/trace2.c
@@ -1,6 +1,10 @@
  #include "git-compat-util.h"
  #include "trace2.h"
  
+struct child_process { int stub; };
+struct repository { int stub; };
+struct json_writer { int stub; };
+
  void trace2_region_enter_fl(const char *file, int line, const char *category,
  			    const char *label, const struct repository *repo, ...) { }
  void trace2_region_leave_fl(const char *file, int line, const char *category,
@@ -19,4 +23,5 @@ void trace2_data_intmax_fl(const char *file, int line, const char *category,
  			   const struct repository *repo, const char *key,
  			   intmax_t value) { }
  int trace2_is_enabled(void) { return 0; }
+void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
  void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
diff --git a/stubs/trace2.h b/stubs/trace2.h
deleted file mode 100644
index 836a14797cc..00000000000
--- a/stubs/trace2.h
+++ /dev/null
@@ -1,68 +0,0 @@
-#ifndef TRACE2_H
-#define TRACE2_H
-
-struct child_process { int stub; };
-struct repository;
-struct json_writer { int stub; };
-
-void trace2_region_enter_fl(const char *file, int line, const char *category,
-			    const char *label, const struct repository *repo, ...);
-
-#define trace2_region_enter(category, label, repo) \
-	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
-
-void trace2_region_leave_fl(const char *file, int line, const char *category,
-			    const char *label, const struct repository *repo, ...);
-
-#define trace2_region_leave(category, label, repo) \
-	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
-
-void trace2_data_string_fl(const char *file, int line, const char *category,
-			   const struct repository *repo, const char *key,
-			   const char *value);
-
-#define trace2_data_string(category, repo, key, value)                       \
-	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
-			      (value))
-
-void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
-
-#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
-
-void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
-			    va_list ap);
-
-#define trace2_cmd_error_va(fmt, ap) \
-	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
-
-
-void trace2_cmd_name_fl(const char *file, int line, const char *name);
-
-#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
-
-void trace2_thread_start_fl(const char *file, int line,
-			    const char *thread_base_name);
-
-#define trace2_thread_start(thread_base_name) \
-	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
-
-void trace2_thread_exit_fl(const char *file, int line);
-
-#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
-
-void trace2_data_intmax_fl(const char *file, int line, const char *category,
-			   const struct repository *repo, const char *key,
-			   intmax_t value);
-
-#define trace2_data_intmax(category, repo, key, value)                       \
-	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
-			      (value))
-
-enum trace2_process_info_reason {
-	TRACE2_PROCESS_INFO_STARTUP,
-	TRACE2_PROCESS_INFO_EXIT,
-};
-int trace2_is_enabled(void);
-void trace2_collect_process_info(enum trace2_process_info_reason reason);
-
-#endif /* TRACE2_H */
diff --git a/wrapper.c b/wrapper.c
index 9eae4a8b3a0..e6facc5ff0c 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
  #include "abspath.h"
  #include "parse.h"
  #include "gettext.h"
-#include "repository.h"
  #include "strbuf.h"
  #include "trace2.h"
  
-- 
2.40.1.850.ge5e148ffb7d




* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-15  9:20     ` Phillip Wood
@ 2023-08-16 17:17       ` Calvin Wan
  2023-08-16 21:19         ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-08-16 17:17 UTC (permalink / raw)
  To: phillip.wood; +Cc: git, nasamuffin, jonathantanmy, linusa, vdye

Thanks for resolving the conflicts on master. I should've rebased
before sending out this v2 since it's built off of 2.41 with some of
my other patch cleanup series.


* Re: [RFC PATCH v2 0/7] Introduce Git Standard Library
  2023-08-16 17:17       ` Calvin Wan
@ 2023-08-16 21:19         ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-08-16 21:19 UTC (permalink / raw)
  To: Calvin Wan; +Cc: phillip.wood, git, nasamuffin, jonathantanmy, linusa, vdye

Calvin Wan <calvinwan@google.com> writes:

> Thanks for resolving the conflicts on master. I should've rebased
> before sending out this v2 since it's built off of 2.41 with some of
> my other patch cleanup series.

I think the freeze period before the release would be a good time to
rebuild on an updated base to prepare v3 for posting.

Thanks.


* [PATCH v3 0/6] Introduce Git Standard Library
  2023-08-15  9:41   ` Phillip Wood
@ 2023-09-08 17:41     ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
                         ` (6 more replies)
  0 siblings, 7 replies; 111+ messages in thread
From: Calvin Wan @ 2023-09-08 17:41 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Original cover letter:
https://lore.kernel.org/git/20230627195251.1973421-1-calvinwan@google.com/

I have taken this series out of RFC since there weren't any significant
concerns with the overall concept and design of this series. This reroll
incorporates some smaller changes such as dropping the "push pager
dependency" patch in favor of stubbing it out. The main change in this
reroll cleans up the Makefile rules and stubs, as suggested by
Phillip Wood (appreciate the help on this one)!

This series has been rebased onto 1fc548b2d6a: The sixth batch

Originally this series was built on other patches that have since been
merged, which is why the range-diff is shown removing many of them.

Calvin Wan (6):
  hex-ll: split out functionality from hex
  wrapper: remove dependency to Git-specific internal file
  config: correct bad boolean env value error message
  parse: create new library for parsing strings and env values
  git-std-lib: introduce git standard library
  git-std-lib: add test file to call git-std-lib.a functions

 Documentation/technical/git-std-lib.txt | 191 ++++++++++++++++++++
 Makefile                                |  41 ++++-
 attr.c                                  |   2 +-
 color.c                                 |   2 +-
 config.c                                | 173 +-----------------
 config.h                                |  14 +-
 entry.c                                 |   5 +
 entry.h                                 |   6 +
 git-compat-util.h                       |   7 +-
 hex-ll.c                                |  49 +++++
 hex-ll.h                                |  27 +++
 hex.c                                   |  47 -----
 hex.h                                   |  24 +--
 mailinfo.c                              |   2 +-
 pack-objects.c                          |   2 +-
 pack-revindex.c                         |   2 +-
 parse-options.c                         |   3 +-
 parse.c                                 | 182 +++++++++++++++++++
 parse.h                                 |  20 ++
 pathspec.c                              |   2 +-
 preload-index.c                         |   2 +-
 progress.c                              |   2 +-
 prompt.c                                |   2 +-
 rebase.c                                |   2 +-
 strbuf.c                                |   2 +-
 stubs/pager.c                           |   6 +
 stubs/pager.h                           |   6 +
 stubs/trace2.c                          |  27 +++
 symlinks.c                              |   2 +
 t/Makefile                              |   4 +
 t/helper/test-env-helper.c              |   2 +-
 t/stdlib-test.c                         | 231 ++++++++++++++++++++++++
 unpack-trees.c                          |   2 +-
 url.c                                   |   2 +-
 urlmatch.c                              |   2 +-
 wrapper.c                               |   9 +-
 wrapper.h                               |   5 -
 write-or-die.c                          |   2 +-
 38 files changed, 824 insertions(+), 287 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h
 create mode 100644 parse.c
 create mode 100644 parse.h
 create mode 100644 stubs/pager.c
 create mode 100644 stubs/pager.h
 create mode 100644 stubs/trace2.c
 create mode 100644 t/stdlib-test.c

Range-diff against v2:
 1:  121788f263 <  -:  ---------- strbuf: clarify API boundary
 2:  5e91404ecd <  -:  ---------- strbuf: clarify dependency
 3:  5c05f40181 <  -:  ---------- abspath: move related functions to abspath
 4:  e1addc77e5 <  -:  ---------- credential-store: move related functions to credential-store file
 5:  62e8c42f59 <  -:  ---------- object-name: move related functions to object-name
 6:  0abba57acb <  -:  ---------- path: move related function to path
 7:  d33267a390 <  -:  ---------- strbuf: remove global variable
 8:  665d2c2089 <  -:  ---------- init-db: document existing bug with core.bare in template config
 9:  68d0a8ff16 <  -:  ---------- init-db: remove unnecessary global variable
10:  8c8ec85507 <  -:  ---------- init-db, clone: change unnecessary global into passed parameter
11:  d555e2b365 <  -:  ---------- setup: adopt shared init-db & clone code
12:  689a7bc8aa <  -:  ---------- read-cache: move shared commit and ls-files code
13:  392f8e75b7 <  -:  ---------- add: modify add_files_to_cache() to avoid globals
14:  49ce237013 <  -:  ---------- read-cache: move shared add/checkout/commit code
15:  c5d8370d40 <  -:  ---------- statinfo: move stat_{data,validity} functions from cache/read-cache
16:  90a72b6f86 <  -:  ---------- run-command.h: move declarations for run-command.c from cache.h
17:  f27516c780 <  -:  ---------- name-hash.h: move declarations for name-hash.c from cache.h
18:  895c38a050 <  -:  ---------- sparse-index.h: move declarations for sparse-index.c from cache.h
19:  8678d4ad20 <  -:  ---------- preload-index.h: move declarations for preload-index.c from elsewhere
20:  4a463abaae <  -:  ---------- diff.h: move declaration for global in diff.c from cache.h
21:  3440e762c7 <  -:  ---------- merge.h: move declarations for merge.c from cache.h
22:  e70853e398 <  -:  ---------- repository.h: move declaration of the_index from cache.h
23:  ccd2014d73 <  -:  ---------- read-cache*.h: move declarations for read-cache.c functions from cache.h
24:  d3a482afa9 <  -:  ---------- cache.h: remove this no-longer-used header
25:  eaa087f446 <  -:  ---------- log-tree: replace include of revision.h with simple forward declaration
26:  5d2b0a9c75 <  -:  ---------- repository: remove unnecessary include of path.h
27:  250f83014e <  -:  ---------- diff.h: remove unnecessary include of oidset.h
28:  d0f9913958 <  -:  ---------- list-objects-filter-options.h: remove unneccessary include
29:  03a2b2a515 <  -:  ---------- builtin.h: remove unneccessary includes
30:  15edc22d00 <  -:  ---------- git-compat-util.h: remove unneccessary include of wildmatch.h
31:  e4e1bec8bd <  -:  ---------- merge-ll: rename from ll-merge
32:  9185495fd0 <  -:  ---------- khash: name the structs that khash declares
33:  15fb05e453 <  -:  ---------- object-store-ll.h: split this header out of object-store.h
34:  2608fe4b23 <  -:  ---------- hash-ll, hashmap: move oidhash() to hash-ll
35:  5e8dc5b574 <  -:  ---------- fsmonitor-ll.h: split this header out of fsmonitor.h
36:  37d32fc3fd <  -:  ---------- git-compat-util: move strbuf.c funcs to its header
37:  6ed19d5fe2 <  -:  ---------- git-compat-util: move wrapper.c funcs to its header
38:  555d1b8942 <  -:  ---------- sane-ctype.h: create header for sane-ctype macros
39:  72d591e282 <  -:  ---------- kwset: move translation table from ctype
40:  5d1dc2a118 <  -:  ---------- common.h: move non-compat specific macros and functions
41:  33e07e552e <  -:  ---------- git-compat-util: move usage.c funcs to its header
42:  417a8aa733 <  -:  ---------- treewide: remove unnecessary includes for wrapper.h
43:  65e35d00c1 <  -:  ---------- common: move alloc macros to common.h
44:  78634bc406 !  1:  2f99eb2ca4 hex-ll: split out functionality from hex
    @@ hex.h
     +#include "hex-ll.h"
      
      /*
    -  * Try to read a SHA1 in hexadecimal format from the 40 characters
    -@@ hex.h: int get_oid_hex(const char *hex, struct object_id *sha1);
    +  * Try to read a hash (specified by the_hash_algo) in hexadecimal
    +@@ hex.h: int get_oid_hex(const char *hex, struct object_id *oid);
      /* Like get_oid_hex, but for an arbitrary hash algorithm. */
      int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
      
45:  21ec1d276e !  2:  7b2d123628 object: move function to object.c
    @@ Metadata
     Author: Calvin Wan <calvinwan@google.com>
     
      ## Commit message ##
    -    object: move function to object.c
    +    wrapper: remove dependency to Git-specific internal file
     
    -    While remove_or_warn() is a simple ternary operator to call two other
    -    wrapper functions, it creates an unnecessary dependency to object.h in
    -    wrapper.c. Therefore move the function to object.[ch] where the concept
    -    of GITLINKs is first defined.
    +    In order for wrapper.c to be built independently as part of a smaller
    +    library, it cannot have dependencies to other Git specific
    +    internals. remove_or_warn() creates an unnecessary dependency to
    +    object.h in wrapper.c. Therefore move the function to entry.[ch] which
    +    performs changes on the worktree based on the Git-specific file modes in
    +    the index.
     
    - ## object.c ##
    -@@ object.c: void parsed_object_pool_clear(struct parsed_object_pool *o)
    - 	FREE_AND_NULL(o->object_state);
    - 	FREE_AND_NULL(o->shallow_stat);
    + ## entry.c ##
    +@@ entry.c: void unlink_entry(const struct cache_entry *ce, const char *super_prefix)
    + 		return;
    + 	schedule_dir_for_removal(ce->name, ce_namelen(ce));
      }
     +
     +int remove_or_warn(unsigned int mode, const char *file)
    @@ object.c: void parsed_object_pool_clear(struct parsed_object_pool *o)
     +	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
     +}
     
    - ## object.h ##
    -@@ object.h: void clear_object_flags(unsigned flags);
    -  */
    - void repo_clear_commit_marks(struct repository *r, unsigned int flags);
    + ## entry.h ##
    +@@ entry.h: int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
    + void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
    + 			   struct stat *st);
      
     +/*
     + * Calls the correct function out of {unlink,rmdir}_or_warn based on
    @@ object.h: void clear_object_flags(unsigned flags);
     + */
     +int remove_or_warn(unsigned int mode, const char *path);
     +
    - #endif /* OBJECT_H */
    + #endif /* ENTRY_H */
     
      ## wrapper.c ##
     @@
46:  41dcf8107c =  3:  b37beb206a config: correct bad boolean env value error message
47:  3e800a41c4 !  4:  3a827cf45c parse: create new library for parsing strings and env values
    @@ Commit message
         config.c, there are other files that only need parsing functionality and
         not config functionality. By separating out string and environment value
         parsing from config, those files can instead be dependent on parse,
    -    which has a much smaller dependency chain than config.
    +    which has a much smaller dependency chain than config. This ultimately
    +    allows us to include parse.[ch] in an independent library since it
    +    doesn't have dependencies to Git-specific internals unlike in
    +    config.[ch].
     
         Move general string and env parsing functions from config.[ch] to
         parse.[ch].
    @@ config.c: static int git_parse_source(struct config_source *cs, config_fn_t fn,
     -	return 1;
     -}
     -
    - static int reader_config_name(struct config_reader *reader, const char **out);
    - static int reader_origin_type(struct config_reader *reader,
    - 			      enum config_origin_type *type);
    -@@ config.c: ssize_t git_config_ssize_t(const char *name, const char *value)
    + NORETURN
    + static void die_bad_number(const char *name, const char *value,
    + 			   const struct key_value_info *kvi)
    +@@ config.c: ssize_t git_config_ssize_t(const char *name, const char *value,
      	return ret;
      }
      
    @@ config.c: static enum fsync_component parse_fsync_components(const char *var, co
     -	return -1;
     -}
     -
    - int git_config_bool_or_int(const char *name, const char *value, int *is_bool)
    + int git_config_bool_or_int(const char *name, const char *value,
    + 			   const struct key_value_info *kvi, int *is_bool)
      {
    - 	int v = git_parse_maybe_bool_text(value);
     @@ config.c: void git_global_config(char **user_out, char **xdg_out)
      	*xdg_out = xdg_config;
      }
    @@ config.c: void git_global_config(char **user_out, char **xdg_out)
     
      ## config.h ##
     @@
    - 
      #include "hashmap.h"
      #include "string-list.h"
    + #include "repository.h"
     -
     +#include "parse.h"
      
48:  7a4a088bc3 <  -:  ---------- date: push pager.h dependency up
49:  c9002734d0 !  5:  f8e4ac50a0 git-std-lib: introduce git standard library
    @@ Documentation/technical/git-std-lib.txt (new)
     +Rationale behind Git Standard Library
     +================
     +
    -+The rationale behind Git Standard Library essentially is the result of
    -+two observations within the Git codebase: every file includes
    -+git-compat-util.h which defines functions in a couple of different
    -+files, and wrapper.c + usage.c have difficult-to-separate circular
    -+dependencies with each other and other files.
    ++The rationale behind what's in and what's not in the Git Standard
    ++Library essentially is the result of two observations within the Git
    ++codebase: every file includes git-compat-util.h which defines functions
    ++in a couple of different files, and wrapper.c + usage.c have
    ++difficult-to-separate circular dependencies with each other and other
    ++files.
     +
     +Ubiquity of git-compat-util.h and circular dependencies
     +========
    @@ Documentation/technical/git-std-lib.txt (new)
     + - low-level git/* files with functions defined in git-compat-util.h
     +   (ctype.c)
     + - compat/*
    -+ - stubbed out dependencies in stubs/ (stubs/repository.c, stubs/trace2.c)
    ++ - stubbed out dependencies in stubs/ (stubs/pager.c, stubs/trace2.c)
     +
     +There are other files that might fit this definition, but that does not
     +mean it should belong in git-std-lib.a. Those files should start as
     +their own separate library since any file added to git-std-lib.a loses
     +its flexibility of being easily swappable.
     +
    -+Wrapper.c and usage.c have dependencies on repository and trace2 that are
    ++Wrapper.c and usage.c have dependencies on pager and trace2 that are
     +possible to remove at the cost of sacrificing the ability for standard Git
     +to be able to trace functions in those files and other files in git-std-lib.a.
     +In order for git-std-lib.a to compile with those dependencies, stubbed out
    @@ Documentation/technical/git-std-lib.txt (new)
     +usage.c
     +utf8.c
     +wrapper.c
    -+stubs/repository.c
    -+stubs/trace2.c
     +relevant compat/ files
     +
    ++When these files are compiled together with the following files (or
    ++user-provided files that provide the same functions), they form a
    ++complete library:
    ++stubs/pager.c
    ++stubs/trace2.c
    ++
     +Pitfalls
     +================
     +
    @@ Makefile: LIB_OBJS += write-or-die.o
     +LIB_OBJS += utf8.o
     +LIB_OBJS += wrapper.o
     +
    -+ifdef STUB_REPOSITORY
    -+STUB_OBJS += stubs/repository.o
    -+endif
    -+
     +ifdef STUB_TRACE2
     +STUB_OBJS += stubs/trace2.o
     +endif
     +
    ++ifdef STUB_PAGER
    ++STUB_OBJS += stubs/pager.o
    ++endif
    ++
     +LIB_OBJS += $(STUB_OBJS)
     +endif
      
    @@ Makefile: ifdef FSMONITOR_OS_SETTINGS
      NO_TCLTK = NoThanks
      endif
     @@ Makefile: clean: profile-clean coverage-clean cocciclean
    - 	$(RM) po/git.pot po/git-core.pot
      	$(RM) git.res
      	$(RM) $(OBJECTS)
    + 	$(RM) headless-git.o
     -	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
     +	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE)
      	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
    @@ Makefile: $(FUZZ_PROGRAMS): all
     +### Libified Git rules
     +
     +# git-std-lib
    -+# `make git-std-lib GIT_STD_LIB=YesPlease STUB_REPOSITORY=YesPlease STUB_TRACE2=YesPlease`
    ++# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease STUB_PAGER=YesPlease`
     +STD_LIB = git-std-lib.a
     +
     +$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
     +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
    -+
    -+TEMP_HEADERS = temp_headers/
    -+
    -+git-std-lib:
    -+# Move headers to temporary folder and replace them with stubbed headers.
    -+# After building, move headers and stubbed headers back.
    -+ifneq ($(STUB_OBJS),)
    -+	mkdir -p $(TEMP_HEADERS); \
    -+	for d in $(STUB_OBJS); do \
    -+		BASE=$${d%.*}; \
    -+		mv $${BASE##*/}.h $(TEMP_HEADERS)$${BASE##*/}.h; \
    -+		mv $${BASE}.h $${BASE##*/}.h; \
    -+	done; \
    -+	$(MAKE) $(STD_LIB); \
    -+	for d in $(STUB_OBJS); do \
    -+		BASE=$${d%.*}; \
    -+		mv $${BASE##*/}.h $${BASE}.h; \
    -+		mv $(TEMP_HEADERS)$${BASE##*/}.h $${BASE##*/}.h; \
    -+	done; \
    -+	rm -rf temp_headers
    -+else
    -+	$(MAKE) $(STD_LIB)
    -+endif
     
      ## git-compat-util.h ##
     @@ git-compat-util.h: static inline int noop_core_config(const char *var UNUSED,
    @@ git-compat-util.h: const char *inet_ntop(int af, const void *src, char *dst, siz
      #endif
     +#endif
      
    - /*
    -  * Limit size of IO chunks, because huge chunks only cause pain.  OS X
    -@@ git-compat-util.h: int git_access(const char *path, int mode);
    - # endif
    - #endif
    + static inline size_t st_add(size_t a, size_t b)
    + {
    +@@ git-compat-util.h: static inline int is_missing_file_error(int errno_)
    + 	return (errno_ == ENOENT || errno_ == ENOTDIR);
    + }
      
     +#ifndef GIT_STD_LIB
      int cmd_main(int, const char **);
    @@ git-compat-util.h: int git_access(const char *path, int mode);
      /*
       * You can mark a stack variable with UNLEAK(var) to avoid it being
     
    - ## stubs/repository.c (new) ##
    + ## stubs/pager.c (new) ##
     @@
    -+#include "git-compat-util.h"
    -+#include "repository.h"
    ++#include "pager.h"
     +
    -+struct repository *the_repository;
    ++int pager_in_use(void)
    ++{
    ++	return 0;
    ++}
     
    - ## stubs/repository.h (new) ##
    + ## stubs/pager.h (new) ##
     @@
    -+#ifndef REPOSITORY_H
    -+#define REPOSITORY_H
    ++#ifndef PAGER_H
    ++#define PAGER_H
     +
    -+struct repository { int stub; };
    ++int pager_in_use(void);
     +
    -+extern struct repository *the_repository;
    -+
    -+#endif /* REPOSITORY_H */
    ++#endif /* PAGER_H */
     
      ## stubs/trace2.c (new) ##
     @@
     +#include "git-compat-util.h"
     +#include "trace2.h"
     +
    ++struct child_process { int stub; };
    ++struct repository { int stub; };
    ++struct json_writer { int stub; };
    ++
     +void trace2_region_enter_fl(const char *file, int line, const char *category,
     +			    const char *label, const struct repository *repo, ...) { }
     +void trace2_region_leave_fl(const char *file, int line, const char *category,
    @@ stubs/trace2.c (new)
     +			   const struct repository *repo, const char *key,
     +			   intmax_t value) { }
     +int trace2_is_enabled(void) { return 0; }
    ++void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
     +void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
     
    - ## stubs/trace2.h (new) ##
    -@@
    -+#ifndef TRACE2_H
    -+#define TRACE2_H
    -+
    -+struct child_process { int stub; };
    -+struct repository;
    -+struct json_writer { int stub; };
    -+
    -+void trace2_region_enter_fl(const char *file, int line, const char *category,
    -+			    const char *label, const struct repository *repo, ...);
    -+
    -+#define trace2_region_enter(category, label, repo) \
    -+	trace2_region_enter_fl(__FILE__, __LINE__, (category), (label), (repo))
    -+
    -+void trace2_region_leave_fl(const char *file, int line, const char *category,
    -+			    const char *label, const struct repository *repo, ...);
    -+
    -+#define trace2_region_leave(category, label, repo) \
    -+	trace2_region_leave_fl(__FILE__, __LINE__, (category), (label), (repo))
    -+
    -+void trace2_data_string_fl(const char *file, int line, const char *category,
    -+			   const struct repository *repo, const char *key,
    -+			   const char *value);
    -+
    -+#define trace2_data_string(category, repo, key, value)                       \
    -+	trace2_data_string_fl(__FILE__, __LINE__, (category), (repo), (key), \
    -+			      (value))
    -+
    -+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names);
    -+
    -+#define trace2_cmd_ancestry(v) trace2_cmd_ancestry_fl(__FILE__, __LINE__, (v))
    -+
    -+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
    -+			    va_list ap);
    -+
    -+#define trace2_cmd_error_va(fmt, ap) \
    -+	trace2_cmd_error_va_fl(__FILE__, __LINE__, (fmt), (ap))
    -+
    -+
    -+void trace2_cmd_name_fl(const char *file, int line, const char *name);
    -+
    -+#define trace2_cmd_name(v) trace2_cmd_name_fl(__FILE__, __LINE__, (v))
    -+
    -+void trace2_thread_start_fl(const char *file, int line,
    -+			    const char *thread_base_name);
    -+
    -+#define trace2_thread_start(thread_base_name) \
    -+	trace2_thread_start_fl(__FILE__, __LINE__, (thread_base_name))
    -+
    -+void trace2_thread_exit_fl(const char *file, int line);
    -+
    -+#define trace2_thread_exit() trace2_thread_exit_fl(__FILE__, __LINE__)
    -+
    -+void trace2_data_intmax_fl(const char *file, int line, const char *category,
    -+			   const struct repository *repo, const char *key,
    -+			   intmax_t value);
    -+
    -+#define trace2_data_intmax(category, repo, key, value)                       \
    -+	trace2_data_intmax_fl(__FILE__, __LINE__, (category), (repo), (key), \
    -+			      (value))
    -+
    -+enum trace2_process_info_reason {
    -+	TRACE2_PROCESS_INFO_STARTUP,
    -+	TRACE2_PROCESS_INFO_EXIT,
    -+};
    -+int trace2_is_enabled(void);
    -+void trace2_collect_process_info(enum trace2_process_info_reason reason);
    -+
    -+#endif /* TRACE2_H */
    -+
    -
      ## symlinks.c ##
     @@ symlinks.c: void invalidate_lstat_cache(void)
      	reset_lstat_cache(&default_cache);
    @@ symlinks.c: int lstat_cache_aware_rmdir(const char *path)
      	return ret;
      }
     +#endif
    +
    + ## wrapper.c ##
    +@@
    + #include "abspath.h"
    + #include "parse.h"
    + #include "gettext.h"
    +-#include "repository.h"
    + #include "strbuf.h"
    + #include "trace2.h"
    + 
50:  0bead8f980 !  6:  7840e1830a git-std-lib: add test file to call git-std-lib.a functions
    @@ t/stdlib-test.c (new)
     +	struct strbuf sb3 = STRBUF_INIT;
     +	struct string_list list = STRING_LIST_INIT_NODUP;
     +	char *buf = "foo";
    -+	struct strbuf_expand_dict_entry dict[] = {
    -+		{ "foo", NULL, },
    -+		{ "bar", NULL, },
    -+	};
     +	int fd = open("/dev/null", O_RDONLY);
     +
     +	fprintf(stderr, "calling strbuf functions\n");
    @@ t/stdlib-test.c (new)
     +	strbuf_add_commented_lines(sb, "foo", 3, '#');
     +	strbuf_commented_addf(sb, '#', "%s", "foo");
     +	// strbuf_vaddf() called by strbuf_addf()
    -+	strbuf_expand(sb, "%s", strbuf_expand_literal_cb, NULL);
    -+	strbuf_expand(sb, "%s", strbuf_expand_dict_cb, &dict);
    -+	// strbuf_expand_literal_cb() called by strbuf_expand()
    -+	// strbuf_expand_dict_cb() called by strbuf_expand()
     +	strbuf_addbuf_percentquote(sb, &sb3);
     +	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
     +	strbuf_fread(sb, 0, stdin);
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH v3 1/6] hex-ll: split out functionality from hex
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Separate out hex functionality that doesn't require a hash algo into
hex-ll.[ch]. Since the hash algo is currently a global that sits in
repository, this separation removes that dependency for files that only
need basic hex manipulation functions.
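
As a rough illustration of what "basic hex manipulation" means here, the
following standalone sketch decodes pairs of hex digits without touching
any hash algorithm or repository state. It is not the code from this
patch: hexval() below is a hypothetical stand-in that computes the digit
value directly rather than using hexval_table, and hex_to_bytes() only
mirrors the shape of the function being moved.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the table lookup in hex-ll.h: returns
 * 0..15 for a hex digit, or (unsigned int)-1 for anything else.
 */
static unsigned int hexval(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return (unsigned int)-1;
}

/*
 * Read `len` pairs of hex digits from `hex` into `binary`.
 * Returns 0 on success, -1 if a non-hex character is seen.
 */
static int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
{
	for (; len; len--, hex += 2) {
		/* An invalid digit sets bits above 0xff, which is caught below. */
		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);

		if (val & ~0xffu)
			return -1;
		*binary++ = (unsigned char)val;
	}
	return 0;
}
```

With this sketch, hex_to_bytes(out, "c0ffee", 3) stores 0xc0, 0xff, 0xee
in out and returns 0, while hex_to_bytes(out, "zz", 1) returns -1.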

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile   |  1 +
 color.c    |  2 +-
 hex-ll.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 hex-ll.h   | 27 +++++++++++++++++++++++++++
 hex.c      | 47 -----------------------------------------------
 hex.h      | 24 +-----------------------
 mailinfo.c |  2 +-
 strbuf.c   |  2 +-
 url.c      |  2 +-
 urlmatch.c |  2 +-
 10 files changed, 83 insertions(+), 75 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h

diff --git a/Makefile b/Makefile
index 5776309365..861e643708 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index b24b19566b..f663c06ac4 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
 #include "color.h"
 #include "editor.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "pager.h"
 #include "strbuf.h"
 
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+	for (; len; len--, hex += 2) {
+		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+		if (val & ~0xff)
+			return -1;
+		*binary++ = val;
+	}
+	return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+	return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+	unsigned int val = hexval(s[0]);
+	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits).
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 01f17fe5c9..d42262bdca 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
 #include "hash.h"
 #include "hex.h"
 
-const signed char hexval_table[256] = {
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
-	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
-	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
-	for (; len; len--, hex += 2) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
-		if (val & ~0xff)
-			return -1;
-		*binary++ = val;
-	}
-	return 0;
-}
-
 static int get_hash_hex_algop(const char *hex, unsigned char *hash,
 			      const struct git_hash_algo *algop)
 {
diff --git a/hex.h b/hex.h
index 87abf66602..e0b83f776f 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
 #define HEX_H
 
 #include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
-	return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char.  Return a
- * negative value on error.  Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
-	unsigned int val = hexval(s[0]);
-	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
 
 /*
  * Try to read a hash (specified by the_hash_algo) in hexadecimal
@@ -34,13 +19,6 @@ int get_oid_hex(const char *hex, struct object_id *oid);
 /* Like get_oid_hex, but for an arbitrary hash algorithm. */
 int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
 
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
 /*
  * Convert a binary hash in "unsigned char []" or an object name in
  * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 931505363c..a07d2da16d 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
 #include "git-compat-util.h"
 #include "config.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "utf8.h"
 #include "strbuf.h"
 #include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 4c9ac6dc5e..7827178d8e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "string-list.h"
 #include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "url.h"
 
diff --git a/urlmatch.c b/urlmatch.c
index 1c45f23adf..1d0254abac 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "urlmatch.h"
 
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-15 17:54         ` Jonathan Tan
  2023-09-08 17:44       ` [PATCH v3 3/6] config: correct bad boolean env value error message Calvin Wan
                         ` (4 subsequent siblings)
  6 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

In order for wrapper.c to be built independently as part of a smaller
library, it cannot depend on other Git-specific internals.
remove_or_warn() creates an unnecessary dependency on object.h in
wrapper.c. Therefore, move the function to entry.[ch], which performs
changes on the worktree based on the Git-specific file modes in the
index.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 entry.c   | 5 +++++
 entry.h   | 6 ++++++
 wrapper.c | 6 ------
 wrapper.h | 5 -----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 43767f9043..076e97eb89 100644
--- a/entry.c
+++ b/entry.c
@@ -581,3 +581,8 @@ void unlink_entry(const struct cache_entry *ce, const char *super_prefix)
 		return;
 	schedule_dir_for_removal(ce->name, ce_namelen(ce));
 }
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/entry.h b/entry.h
index 7329f918a9..ca3ed35bc0 100644
--- a/entry.h
+++ b/entry.h
@@ -62,4 +62,10 @@ int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 			   struct stat *st);
 
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
 #endif /* ENTRY_H */
diff --git a/wrapper.c b/wrapper.c
index 48065c4f53..453a20ed99 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "config.h"
 #include "gettext.h"
-#include "object.h"
 #include "repository.h"
 #include "strbuf.h"
 #include "trace2.h"
@@ -632,11 +631,6 @@ int rmdir_or_warn(const char *file)
 	return warn_if_unremovable("rmdir", file, rmdir(file));
 }
 
-int remove_or_warn(unsigned int mode, const char *file)
-{
-	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
 static int access_error_is_ok(int err, unsigned flag)
 {
 	return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index 79c7321bb3..1b2b047ea0 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -106,11 +106,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
  * not exist.
  */
 int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
 
 /*
  * Call access(2), but warn for any error except "missing file"
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH v3 3/6] config: correct bad boolean env value error message
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 4/6] parse: create new library for parsing strings and env values Calvin Wan
                         ` (3 subsequent siblings)
  6 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.
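
A minimal standalone sketch of that behavior, for readers without the
tree at hand. The names parse_maybe_bool_sketch() and env_bool_sketch()
are hypothetical, the integer handling is simplified (plain decimal via
strtol(), no unit suffixes), and fprintf() plus exit(128) stands in for
die(); treat it as an approximation, not the patched code.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>

/* Returns 1 for true, 0 for false, -1 if the text is not a boolean. */
static int parse_maybe_bool_sketch(const char *value)
{
	char *end;
	long n;

	if (!*value)
		return 0; /* an empty value counts as false */
	if (!strcasecmp(value, "true") ||
	    !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on"))
		return 1;
	if (!strcasecmp(value, "false") ||
	    !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off"))
		return 0;

	/* Simplified integer fallback: any non-zero decimal is true. */
	errno = 0;
	n = strtol(value, &end, 10);
	if (errno || end == value || *end)
		return -1;
	return n != 0;
}

/* Parse environment variable `k` as a boolean; use `def` if it is unset. */
static int env_bool_sketch(const char *k, int def)
{
	const char *v = getenv(k);
	int val;

	if (!v)
		return def;
	val = parse_maybe_bool_sketch(v);
	if (val < 0) {
		/* The message names an environment value, not a config value. */
		fprintf(stderr,
			"fatal: bad boolean environment value '%s' for '%s'\n",
			v, k);
		exit(128);
	}
	return val;
}
```

The point of the sketch is the error path: the caller parses the value
itself and reports it as an environment value, instead of borrowing the
config-flavored message.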

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 config.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 3846a37be9..7dde0aaa02 100644
--- a/config.c
+++ b/config.c
@@ -2133,7 +2133,14 @@ void git_global_config(char **user_out, char **xdg_out)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
 }
 
 /*
-- 
2.42.0.283.g2d96d420d3-goog



* [PATCH v3 4/6] parse: create new library for parsing strings and env values
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (2 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 3/6] config: correct bad boolean env value error message Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

While string and environment value parsing is mainly consumed by
config.c, there are other files that only need parsing functionality and
not config functionality. By separating out string and environment value
parsing from config, those files can instead depend on parse, which has
a much smaller dependency chain than config. This ultimately allows us
to include parse.[ch] in an independent library since, unlike
config.[ch], it doesn't depend on Git-specific internals.

Move general string and env parsing functions from config.[ch] to
parse.[ch].
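
To give a feel for the kind of function that ends up in parse.[ch], here
is a self-contained sketch of unsigned parsing with a k/m/g unit suffix.
It follows the logic of the helpers being moved but is not a copy of
them: unit_factor() and parse_ulong_sketch() are hypothetical names, and
the overflow check is written against ULONG_MAX directly instead of
using Git's helper macros.

```c
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <string.h>
#include <strings.h>

/* Map a trailing unit suffix to its multiplier; 0 means "unknown suffix". */
static uintmax_t unit_factor(const char *end)
{
	if (!*end)
		return 1;
	if (!strcasecmp(end, "k"))
		return 1024;
	if (!strcasecmp(end, "m"))
		return 1024 * 1024;
	if (!strcasecmp(end, "g"))
		return 1024 * 1024 * 1024;
	return 0;
}

/* Returns 1 and stores the result on success; returns 0 and sets errno on failure. */
static int parse_ulong_sketch(const char *value, unsigned long *ret)
{
	char *end;
	uintmax_t val, factor;

	/* strtoumax() would silently accept negative input, so reject it early. */
	if (!value || !*value || strchr(value, '-')) {
		errno = EINVAL;
		return 0;
	}
	errno = 0;
	val = strtoumax(value, &end, 0);
	if (errno == ERANGE)
		return 0;
	if (end == value) {
		errno = EINVAL;
		return 0;
	}
	factor = unit_factor(end);
	if (!factor) {
		errno = EINVAL;
		return 0;
	}
	/* Check before multiplying so the product cannot wrap around. */
	if (val > ULONG_MAX / factor) {
		errno = ERANGE;
		return 0;
	}
	*ret = (unsigned long)(val * factor);
	return 1;
}
```

Nothing here needs config.h, repository.h or any other Git-specific
header, which is the property that makes this code a candidate for an
independent library.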

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 Makefile                   |   1 +
 attr.c                     |   2 +-
 config.c                   | 180 +-----------------------------------
 config.h                   |  14 +--
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 wrapper.c                  |   2 +-
 write-or-die.c             |   2 +-
 18 files changed, 219 insertions(+), 205 deletions(-)
 create mode 100644 parse.c
 create mode 100644 parse.h

diff --git a/Makefile b/Makefile
index 861e643708..9226c719a0 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index 71c84fbcf8..3c0b4fb3d9 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
  */
 
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "exec-cmd.h"
 #include "attr.h"
diff --git a/config.c b/config.c
index 7dde0aaa02..c7bc21a25d 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
 #include "date.h"
 #include "branch.h"
 #include "config.h"
+#include "parse.h"
 #include "convert.h"
 #include "environment.h"
 #include "gettext.h"
@@ -1165,129 +1166,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
 	return error_return;
 }
 
-static uintmax_t get_unit_factor(const char *end)
-{
-	if (!*end)
-		return 1;
-	else if (!strcasecmp(end, "k"))
-		return 1024;
-	else if (!strcasecmp(end, "m"))
-		return 1024 * 1024;
-	else if (!strcasecmp(end, "g"))
-		return 1024 * 1024 * 1024;
-	return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		intmax_t val;
-		intmax_t factor;
-
-		if (max < 0)
-			BUG("max must be a positive integer");
-
-		errno = 0;
-		val = strtoimax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if ((val < 0 && -max / factor > val) ||
-		    (val > 0 && max / factor < val)) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		uintmax_t val;
-		uintmax_t factor;
-
-		/* negative values would be accepted by strtoumax */
-		if (strchr(value, '-')) {
-			errno = EINVAL;
-			return 0;
-		}
-		errno = 0;
-		val = strtoumax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if (unsigned_mult_overflows(factor, val) ||
-		    factor * val > max) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
-	uintmax_t tmp;
-	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
 NORETURN
 static void die_bad_number(const char *name, const char *value,
 			   const struct key_value_info *kvi)
@@ -1363,23 +1241,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value,
 	return ret;
 }
 
-static int git_parse_maybe_bool_text(const char *value)
-{
-	if (!value)
-		return 1;
-	if (!*value)
-		return 0;
-	if (!strcasecmp(value, "true")
-	    || !strcasecmp(value, "yes")
-	    || !strcasecmp(value, "on"))
-		return 1;
-	if (!strcasecmp(value, "false")
-	    || !strcasecmp(value, "no")
-	    || !strcasecmp(value, "off"))
-		return 0;
-	return -1;
-}
-
 static const struct fsync_component_name {
 	const char *name;
 	enum fsync_component component_bits;
@@ -1454,16 +1315,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-int git_parse_maybe_bool(const char *value)
-{
-	int v = git_parse_maybe_bool_text(value);
-	if (0 <= v)
-		return v;
-	if (git_parse_int(value, &v))
-		return !!v;
-	return -1;
-}
-
 int git_config_bool_or_int(const char *name, const char *value,
 			   const struct key_value_info *kvi, int *is_bool)
 {
@@ -2126,35 +1977,6 @@ void git_global_config(char **user_out, char **xdg_out)
 	*xdg_out = xdg_config;
 }
 
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
-	const char *v = getenv(k);
-	int val;
-	if (!v)
-		return def;
-	val = git_parse_maybe_bool(v);
-	if (val < 0)
-		die(_("bad boolean environment value '%s' for '%s'"),
-		    v, k);
-	return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
-	const char *v = getenv(k);
-	if (v && !git_parse_ulong(v, &val))
-		die(_("failed to parse %s"), k);
-	return val;
-}
-
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 6332d74904..14f881ecfa 100644
--- a/config.h
+++ b/config.h
@@ -4,7 +4,7 @@
 #include "hashmap.h"
 #include "string-list.h"
 #include "repository.h"
-
+#include "parse.h"
 
 /**
  * The config API gives callers a way to access Git configuration files
@@ -243,16 +243,6 @@ int config_with_options(config_fn_t fn, void *,
  * The following helper functions aid in parsing string values
  */
 
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
 /**
  * Parse the string to an integer, including unit factors. Dies on error;
  * otherwise, returns the parsed result.
@@ -385,8 +375,6 @@ int git_config_rename_section(const char *, const char *);
 int git_config_rename_section_in_file(const char *, const char *, const char *);
 int git_config_copy_section(const char *, const char *);
 int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
 int git_config_system(void);
 int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
 #include "pack.h"
 #include "pack-objects.h"
 #include "packfile.h"
-#include "config.h"
+#include "parse.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 					 const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
 #include "packfile.h"
 #include "strbuf.h"
 #include "trace2.h"
-#include "config.h"
+#include "parse.h"
 #include "midx.h"
 #include "csum-file.h"
 
diff --git a/parse-options.c b/parse-options.c
index e8e076c3a6..093eaf2db8 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
 #include "git-compat-util.h"
 #include "parse-options.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "commit.h"
 #include "color.h"
 #include "gettext.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+	if (!*end)
+		return 1;
+	else if (!strcasecmp(end, "k"))
+		return 1024;
+	else if (!strcasecmp(end, "m"))
+		return 1024 * 1024;
+	else if (!strcasecmp(end, "g"))
+		return 1024 * 1024 * 1024;
+	return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		intmax_t val;
+		intmax_t factor;
+
+		if (max < 0)
+			BUG("max must be a positive integer");
+
+		errno = 0;
+		val = strtoimax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if ((val < 0 && -max / factor > val) ||
+		    (val > 0 && max / factor < val)) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		uintmax_t val;
+		uintmax_t factor;
+
+		/* negative values would be accepted by strtoumax */
+		if (strchr(value, '-')) {
+			errno = EINVAL;
+			return 0;
+		}
+		errno = 0;
+		val = strtoumax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if (unsigned_mult_overflows(factor, val) ||
+		    factor * val > max) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+	if (!value)
+		return 1;
+	if (!*value)
+		return 0;
+	if (!strcasecmp(value, "true")
+	    || !strcasecmp(value, "yes")
+	    || !strcasecmp(value, "on"))
+		return 1;
+	if (!strcasecmp(value, "false")
+	    || !strcasecmp(value, "no")
+	    || !strcasecmp(value, "off"))
+		return 0;
+	return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+	int v = git_parse_maybe_bool_text(value);
+	if (0 <= v)
+		return v;
+	if (git_parse_int(value, &v))
+		return !!v;
+	return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+	const char *v = getenv(k);
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && !git_parse_ulong(v, &val))
+		die(_("failed to parse %s"), k);
+	return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 3a3a5724c4..7f88f1c02b 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
 #include "environment.h"
 #include "fsmonitor.h"
 #include "gettext.h"
-#include "config.h"
+#include "parse.h"
 #include "preload-index.h"
 #include "progress.h"
 #include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
 #include "trace.h"
 #include "trace2.h"
 #include "utf8.h"
-#include "config.h"
+#include "parse.h"
 
 #define TP_IDX_MAX      8
 
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "run-command.h"
 #include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "rebase.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 
 /*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
 #include "test-tool.h"
-#include "config.h"
+#include "parse.h"
 #include "parse-options.h"
 
 static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
 #include "advice.h"
 #include "strvec.h"
 #include "repository.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 453a20ed99..7da15a56da 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
  */
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 #include "repository.h"
 #include "strbuf.h"
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (3 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 4/6] parse: create new library for parsing strings and env values Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-11 13:22         ` Phillip Wood
                           ` (2 more replies)
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
  2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
  6 siblings, 3 replies; 111+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.

Add Documentation/technical/git-std-lib.txt to further explain the
design and rationale.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
---
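Note for reviewers (not part of the commit message): the stubbing idea
used for pager and trace2 can be sketched in isolation as below. This is
only an illustration under stated assumptions. pager_in_use() is the one
name taken from this patch; choose_output_mode() is a hypothetical
caller invented for the example and does not exist in Git. In the real
build the stub lives in its own object file (stubs/pager.o), so a
consumer can link a different definition instead.

```c
#include <string.h>

/* Stand-in for stubs/pager.h: the only symbol the library side needs. */
int pager_in_use(void);

/* Hypothetical library-side caller (assumption, not a Git function). */
static const char *choose_output_mode(void)
{
	return pager_in_use() ? "paged" : "plain";
}

/*
 * Stand-in for stubs/pager.c: always reports that no pager is active.
 * A consumer that does have a pager would provide its own definition
 * in place of this one at link time.
 */
int pager_in_use(void)
{
	return 0;
}
```

With the stub in place, choose_output_mode() always takes the "plain"
branch, which is the behavior a library consumer without a pager would
want.
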
 Documentation/technical/git-std-lib.txt | 191 ++++++++++++++++++++++++
 Makefile                                |  39 ++++-
 git-compat-util.h                       |   7 +-
 stubs/pager.c                           |   6 +
 stubs/pager.h                           |   6 +
 stubs/trace2.c                          |  27 ++++
 symlinks.c                              |   2 +
 wrapper.c                               |   1 -
 8 files changed, 276 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 stubs/pager.c
 create mode 100644 stubs/pager.h
 create mode 100644 stubs/trace2.c

diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..397c1da8c8
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,191 @@
+Git Standard Library
+================
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one.
+
+Dependency graph in libified Git
+================
+
+If you look in the Git Makefile, all of the objects defined in the Git
+library are compiled and archived into a single file, libgit.a, which
+is linked against by common-main.o with other external dependencies and
+turned into the Git executable. In other words, the Git executable has
+dependencies on libgit.a and a couple of external libraries. The
+libification of Git will not affect this current build flow, but instead
+will provide an alternate method for building Git.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+        Git
+         /\
+        /  \
+       /    \
+  libgit.a   ext deps
+
+In libifying parts of Git, we want to shrink the dependency graph to
+only the minimal set of dependencies, so libraries should not use
+libgit.a. Instead, it would look like:
+
+                Git
+                /\
+               /  \
+              /    \
+          libgit.a  ext deps
+             /\
+            /  \
+           /    \
+object-store.a  (other lib)
+      |        /
+      |       /
+      |      /
+ config.a   /
+      |    /
+      |   /
+      |  /
+git-std-lib.a
+
+Instead of containing all of the objects in Git, libgit.a would contain
+objects that are not built by libraries it links against. Consequently,
+if someone wanted their own custom build of Git with their own custom
+implementation of the object store, they would only have to swap out
+object-store.a rather than do a hard fork of Git.
+
+Rationale behind Git Standard Library
+================
+
+The rationale behind what's in and what's not in the Git Standard
+Library is essentially the result of two observations within the Git
+codebase: every file includes git-compat-util.h, which declares
+functions implemented in a couple of different files, and wrapper.c +
+usage.c have difficult-to-separate circular dependencies with each
+other and with other files.
+
+Ubiquity of git-compat-util.h and circular dependencies
+========
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h, and
+git-compat-util.h includes wrapper.h and usage.h, it would make sense
+for wrapper.c and usage.c to be a part of the root library. They have
+difficult-to-separate circular dependencies with each other, so they
+can't be independent libraries. wrapper.c has dependencies on parse.c,
+abspath.c, and strbuf.c, which in turn also have dependencies on usage.c
+and wrapper.c -- more circular dependencies.
+
+Tradeoff between swappability and refactoring
+========
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a single library. So why choose a
+single library when multiple libraries can be individually easier to
+swap and are more modular? A single library requires less work to
+separate out circular dependencies within itself, so it becomes a
+tradeoff between work and reward. While there may be a point in the
+future where a file like usage.c would want its own library so that
+someone can have a custom die() or error(), the work required to
+refactor out the circular dependencies in some files would be enormous
+due to their ubiquity, so I believe it is not worth the tradeoff
+currently. Additionally, we can choose to do this refactor in the
+future and change the API for the library if there is enough of a
+reason to do so (remember we are avoiding promising stability of the
+interfaces of those libraries).
+
+Reuse of compatibility functions in git-compat-util.h
+========
+
+Most functions declared in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h so they can be
+easily included in git-std-lib.a, which as a root dependency means that
+higher level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions declared in git-compat-util.h are
+implemented in top level files and are hidden behind
+an #ifdef if their implementation is not in git-std-lib.a.
+
+Rationale summary
+========
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many
+more months attempting to refactor difficult circular dependencies and
+instead spending that time getting to a state where we can test out
+swapping a library out such as config or object store, we can prove the
+viability of Git libification on a much faster time scale. Additionally,
+the code cleanups that have happened so far have been minor and
+beneficial for the codebase. It is probable that making large movements
+would negatively affect code clarity.
+
+Git Standard Library boundary
+================
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it. It contains:
+
+ - Low-level files (i.e. files that operate only on primitive types) that
+   are used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+   - Dependencies that are low-level and widely used
+     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - Low-level git/* files with functions declared in git-compat-util.h
+   (ctype.c)
+ - compat/*
+ - stubbed out dependencies in stubs/ (stubs/pager.c, stubs/trace2.c)
+
+There are other files that might fit this definition, but that does not
+mean they should belong in git-std-lib.a. Those files should start as
+their own separate library since any file added to git-std-lib.a loses
+its flexibility of being easily swappable.
+
+wrapper.c and usage.c have dependencies on pager and trace2 that are
+possible to remove at the cost of sacrificing the ability for standard Git
+to be able to trace functions in those files and other files in git-std-lib.a.
+In order for git-std-lib.a to compile with those dependencies, stubbed out
+versions of those files are implemented and swapped in at compile time.
+
+Files inside of Git Standard Library
+================
+
+The initial set of files in git-std-lib.a is:
+abspath.c
+ctype.c
+date.c
+hex-ll.c
+parse.c
+strbuf.c
+usage.c
+utf8.c
+wrapper.c
+relevant compat/ files
+
+When these files are compiled together with the following files (or
+user-provided files that provide the same functions), they form a
+complete library:
+stubs/pager.c
+stubs/trace2.c
+
+Pitfalls
+================
+
+There are a small number of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, but rather as an observation that libification
+in general may also require some minor compatibility work in the future.
+
+Testing
+================
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (e.g. introduction of an out-of-scope dependency), and new
+functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
diff --git a/Makefile b/Makefile
index 9226c719a0..0a2d1ae3cc 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ FUZZ_PROGRAMS =
 GIT_OBJS =
 LIB_OBJS =
 SCALAR_OBJS =
+STUB_OBJS =
 OBJECTS =
 OTHER_PROGRAMS =
 PROGRAM_OBJS =
@@ -956,6 +957,7 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
 
 LIB_H = $(FOUND_H_SOURCES)
 
+ifndef GIT_STD_LIB
 LIB_OBJS += abspath.o
 LIB_OBJS += add-interactive.o
 LIB_OBJS += add-patch.o
@@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
 LIB_OBJS += ws.o
 LIB_OBJS += wt-status.o
 LIB_OBJS += xdiff-interface.o
+else ifdef GIT_STD_LIB
+LIB_OBJS += abspath.o
+LIB_OBJS += ctype.o
+LIB_OBJS += date.o
+LIB_OBJS += hex-ll.o
+LIB_OBJS += parse.o
+LIB_OBJS += strbuf.o
+LIB_OBJS += usage.o
+LIB_OBJS += utf8.o
+LIB_OBJS += wrapper.o
+
+ifdef STUB_TRACE2
+STUB_OBJS += stubs/trace2.o
+endif
+
+ifdef STUB_PAGER
+STUB_OBJS += stubs/pager.o
+endif
+
+LIB_OBJS += $(STUB_OBJS)
+endif
 
 BUILTIN_OBJS += builtin/add.o
 BUILTIN_OBJS += builtin/am.o
@@ -2162,6 +2185,11 @@ ifdef FSMONITOR_OS_SETTINGS
 	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
 endif
 
+ifdef GIT_STD_LIB
+	BASIC_CFLAGS += -DGIT_STD_LIB
+	BASIC_CFLAGS += -DNO_GETTEXT
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -3668,7 +3696,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) git.res
 	$(RM) $(OBJECTS)
 	$(RM) headless-git.o
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
@@ -3849,3 +3877,12 @@ $(FUZZ_PROGRAMS): all
 		$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
 
 fuzz-all: $(FUZZ_PROGRAMS)
+
+### Libified Git rules
+
+# git-std-lib
+# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease STUB_PAGER=YesPlease`
+STD_LIB = git-std-lib.a
+
+$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
diff --git a/git-compat-util.h b/git-compat-util.h
index 3e7a59b5ff..14bf71c530 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -455,8 +455,8 @@ static inline int noop_core_config(const char *var UNUSED,
 #define platform_core_config noop_core_config
 #endif
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 int lstat_cache_aware_rmdir(const char *path);
-#if !defined(__MINGW32__) && !defined(_MSC_VER)
 #define rmdir lstat_cache_aware_rmdir
 #endif
 
@@ -966,9 +966,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
 #endif
 
 #ifdef NO_PTHREADS
+#ifndef GIT_STD_LIB
 #define atexit git_atexit
 int git_atexit(void (*handler)(void));
 #endif
+#endif
 
 static inline size_t st_add(size_t a, size_t b)
 {
@@ -1462,14 +1464,17 @@ static inline int is_missing_file_error(int errno_)
 	return (errno_ == ENOENT || errno_ == ENOTDIR);
 }
 
+#ifndef GIT_STD_LIB
 int cmd_main(int, const char **);
 
 /*
  * Intercept all calls to exit() and route them to trace2 to
  * optionally emit a message before calling the real exit().
  */
+
 int common_exit(const char *file, int line, int code);
 #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
 
 /*
  * You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git a/stubs/pager.c b/stubs/pager.c
new file mode 100644
index 0000000000..4f575cada7
--- /dev/null
+++ b/stubs/pager.c
@@ -0,0 +1,6 @@
+#include "pager.h"
+
+int pager_in_use(void)
+{
+	return 0;
+}
diff --git a/stubs/pager.h b/stubs/pager.h
new file mode 100644
index 0000000000..b797910881
--- /dev/null
+++ b/stubs/pager.h
@@ -0,0 +1,6 @@
+#ifndef PAGER_H
+#define PAGER_H
+
+int pager_in_use(void);
+
+#endif /* PAGER_H */
diff --git a/stubs/trace2.c b/stubs/trace2.c
new file mode 100644
index 0000000000..7d89482228
--- /dev/null
+++ b/stubs/trace2.c
@@ -0,0 +1,27 @@
+#include "git-compat-util.h"
+#include "trace2.h"
+
+struct child_process { int stub; };
+struct repository { int stub; };
+struct json_writer { int stub; };
+
+void trace2_region_enter_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_region_leave_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_data_string_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   const char *value) { }
+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
+			    va_list ap) { }
+void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
+void trace2_thread_start_fl(const char *file, int line,
+			    const char *thread_base_name) { }
+void trace2_thread_exit_fl(const char *file, int line) { }
+void trace2_data_intmax_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   intmax_t value) { }
+int trace2_is_enabled(void) { return 0; }
+void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
+void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..bced721a0c 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -337,6 +337,7 @@ void invalidate_lstat_cache(void)
 	reset_lstat_cache(&default_cache);
 }
 
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
 #undef rmdir
 int lstat_cache_aware_rmdir(const char *path)
 {
@@ -348,3 +349,4 @@ int lstat_cache_aware_rmdir(const char *path)
 
 	return ret;
 }
+#endif
diff --git a/wrapper.c b/wrapper.c
index 7da15a56da..eeac3741cf 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "parse.h"
 #include "gettext.h"
-#include "repository.h"
 #include "strbuf.h"
 #include "trace2.h"
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (4 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
@ 2023-09-08 17:44       ` Calvin Wan
  2023-09-09  5:26         ` Junio C Hamano
  2023-09-15 18:43         ` Jonathan Tan
  2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
  6 siblings, 2 replies; 111+ messages in thread
From: Calvin Wan @ 2023-09-08 17:44 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Add a test file that directly or indirectly calls all functions defined
in git-std-lib.a object files to showcase that they do not reference
missing objects and that git-std-lib.a can stand on its own.

Certain functions that cause the program to exit or are already called
by other functions are commented out.

TODO: replace with unit tests
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
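Note for reviewers (not part of the commit message): the test below only
feeds "42" to the parse functions, so the unit-suffix and range handling
in parse.c is not exercised. As a rough, self-contained sketch of that
logic, here is a simplified variant of the unsigned parser. The names
unit_factor() and parse_unsigned_sketch() are hypothetical, and the
range check uses a division instead of unsigned_mult_overflows() from
git-compat-util.h, so treat it as an illustration rather than the code
in this series.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Map an optional k/m/g suffix to its multiplier; 0 means "unknown". */
static uintmax_t unit_factor(const char *end)
{
	if (!*end)
		return 1;
	if (!strcasecmp(end, "k"))
		return 1024;
	if (!strcasecmp(end, "m"))
		return 1024 * 1024;
	if (!strcasecmp(end, "g"))
		return 1024 * 1024 * 1024;
	return 0;
}

/* Hypothetical helper: returns 1 on success, 0 with errno set on failure. */
static int parse_unsigned_sketch(const char *value, uintmax_t *ret,
				 uintmax_t max)
{
	char *end;
	uintmax_t val, factor;

	/* strtoumax() would silently accept negative input, so reject it. */
	if (!value || !*value || strchr(value, '-')) {
		errno = EINVAL;
		return 0;
	}
	errno = 0;
	val = strtoumax(value, &end, 0);
	if (errno == ERANGE)
		return 0;
	if (end == value) {
		errno = EINVAL;
		return 0;
	}
	factor = unit_factor(end);
	if (!factor) {
		errno = EINVAL;
		return 0;
	}
	/* Dividing first means the multiplication below cannot overflow. */
	if (val > max / factor) {
		errno = ERANGE;
		return 0;
	}
	*ret = val * factor;
	return 1;
}
```

Inputs such as "2k", "4g" with a 32-bit maximum, or "-1" would be
reasonable additions to the unit tests that eventually replace this
file.
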
 t/Makefile      |   4 +
 t/stdlib-test.c | 231 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 235 insertions(+)
 create mode 100644 t/stdlib-test.c

diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..b6d0bc9daa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -150,3 +150,7 @@ perf:
 
 .PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
 	check-chainlint clean-chainlint test-chainlint
+
+test-git-std-lib:
+	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a
+	./stdlib-test
diff --git a/t/stdlib-test.c b/t/stdlib-test.c
new file mode 100644
index 0000000000..76fed9ecbf
--- /dev/null
+++ b/t/stdlib-test.c
@@ -0,0 +1,231 @@
+#include "../git-compat-util.h"
+#include "../abspath.h"
+#include "../hex-ll.h"
+#include "../parse.h"
+#include "../strbuf.h"
+#include "../string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ */
+
+void abspath_funcs(void) {
+	struct strbuf sb = STRBUF_INIT;
+
+	fprintf(stderr, "calling abspath functions\n");
+	is_directory("foo");
+	strbuf_realpath(&sb, "foo", 0);
+	strbuf_realpath_forgiving(&sb, "foo", 0);
+	real_pathdup("foo", 0);
+	absolute_path("foo");
+	absolute_pathdup("foo");
+	prefix_filename("foo/", "bar");
+	prefix_filename_except_for_dash("foo/", "bar");
+	is_absolute_path("foo");
+	strbuf_add_absolute_path(&sb, "foo");
+	strbuf_add_real_path(&sb, "foo");
+}
+
+void hex_ll_funcs(void) {
+	unsigned char c;
+
+	fprintf(stderr, "calling hex-ll functions\n");
+
+	hexval('c');
+	hex2chr("A1");
+	hex_to_bytes(&c, "A1", 2);
+}
+
+void parse_funcs(void) {
+	intmax_t foo;
+	ssize_t foo1 = -1;
+	unsigned long foo2;
+	int foo3;
+	int64_t foo4;
+
+	fprintf(stderr, "calling parse functions\n");
+
+	git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+	git_parse_ssize_t("42", &foo1);
+	git_parse_ulong("42", &foo2);
+	git_parse_int("42", &foo3);
+	git_parse_int64("42", &foo4);
+	git_parse_maybe_bool("foo");
+	git_parse_maybe_bool_text("foo");
+	git_env_bool("foo", 1);
+	git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+	return 0;
+}
+
+void strbuf_funcs(void) {
+	struct strbuf *sb = xmalloc(sizeof(*sb));
+	struct strbuf *sb2 = xmalloc(sizeof(*sb2));
+	struct strbuf sb3 = STRBUF_INIT;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *buf = "foo";
+	int fd = open("/dev/null", O_RDONLY);
+
+	fprintf(stderr, "calling strbuf functions\n");
+
+	starts_with("foo", "bar");
+	istarts_with("foo", "bar");
+	// skip_to_optional_arg_default(const char *str, const char *prefix,
+	// 			 const char **arg, const char *def)
+	strbuf_init(sb, 0);
+	strbuf_init(sb2, 0);
+	strbuf_release(sb);
+	strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0); // calls strbuf_grow
+	strbuf_swap(sb, sb2);
+	strbuf_setlen(sb, 0);
+	strbuf_trim(sb); // calls strbuf_rtrim, strbuf_ltrim
+	// strbuf_rtrim() called by strbuf_trim()
+	// strbuf_ltrim() called by strbuf_trim()
+	strbuf_trim_trailing_dir_sep(sb);
+	strbuf_trim_trailing_newline(sb);
+	strbuf_reencode(sb, "foo", "bar");
+	strbuf_tolower(sb);
+	strbuf_add_separated_string_list(sb, " ", &list);
+	strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+	strbuf_cmp(sb, sb2);
+	strbuf_addch(sb, 1);
+	strbuf_splice(sb, 0, 1, "foo", 3);
+	strbuf_insert(sb, 0, "foo", 3);
+	// strbuf_vinsertf() called by strbuf_insertf
+	strbuf_insertf(sb, 0, "%s", "foo");
+	strbuf_remove(sb, 0, 1);
+	strbuf_add(sb, "foo", 3);
+	strbuf_addbuf(sb, sb2);
+	strbuf_join_argv(sb, 0, NULL, ' ');
+	strbuf_addchars(sb, 1, 1);
+	strbuf_addf(sb, "%s", "foo");
+	strbuf_add_commented_lines(sb, "foo", 3, '#');
+	strbuf_commented_addf(sb, '#', "%s", "foo");
+	// strbuf_vaddf() called by strbuf_addf()
+	strbuf_addbuf_percentquote(sb, &sb3);
+	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+	strbuf_fread(sb, 0, stdin);
+	strbuf_read(sb, fd, 0);
+	strbuf_read_once(sb, fd, 0);
+	strbuf_write(sb, stderr);
+	strbuf_readlink(sb, "/dev/null", 0);
+	strbuf_getcwd(sb);
+	strbuf_getwholeline(sb, stderr, '\n');
+	strbuf_appendwholeline(sb, stderr, '\n');
+	strbuf_getline(sb, stderr);
+	strbuf_getline_lf(sb, stderr);
+	strbuf_getline_nul(sb, stderr);
+	strbuf_getwholeline_fd(sb, fd, '\n');
+	strbuf_read_file(sb, "/dev/null", 0);
+	strbuf_add_lines(sb, "foo", "bar", 0);
+	strbuf_addstr_xml_quoted(sb, "foo");
+	strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+	strbuf_humanise_bytes(sb, 42);
+	strbuf_humanise_rate(sb, 42);
+	printf_ln("%s", sb->buf);
+	fprintf_ln(stderr, "%s", sb->buf);
+	xstrdup_tolower("foo");
+	xstrdup_toupper("foo");
+	// xstrvfmt() called by xstrfmt()
+	xstrfmt("%s", "foo");
+	// strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+	// 	     int tz_offset, int suppress_tz_name)
+	// strbuf_stripspace(struct strbuf *sb, char comment_line_char)
+	// strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
+	// strbuf_strip_file_from_path(struct strbuf *sb)
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
+
+void usage_funcs(void) {
+	fprintf(stderr, "calling usage functions\n");
+	// Functions that call exit() are commented out
+
+	// usage()
+	// usagef()
+	// die()
+	// die_errno();
+	error("foo");
+	error_errno("foo");
+	die_message("foo");
+	die_message_errno("foo");
+	warning("foo");
+	warning_errno("foo");
+
+	// set_die_routine();
+	get_die_message_routine();
+	set_error_routine(error_builtin);
+	get_error_routine();
+	set_warn_routine(warn_builtin);
+	get_warn_routine();
+	// set_die_is_recursing_routine();
+}
+
+void wrapper_funcs(void) {
+	void *ptr = xmalloc(1);
+	int fd = open("/dev/null", O_RDONLY);
+	struct strbuf sb = STRBUF_INIT;
+	int mode = 0444;
+	char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+	int tmp;
+
+	xsnprintf(path, sizeof(path), "out-XXXXXX");
+	xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+	fprintf(stderr, "calling wrapper functions\n");
+
+	xstrdup("foo");
+	xmalloc(1);
+	xmallocz(1);
+	xmallocz_gently(1);
+	xmemdupz("foo", 3);
+	xstrndup("foo", 3);
+	xrealloc(ptr, 2);
+	xcalloc(1, 1);
+	xsetenv("foo", "bar", 0);
+	xopen("/dev/null", O_RDONLY);
+	xread(fd, &sb, 1);
+	xwrite(fd, &sb, 1);
+	xpread(fd, &sb, 1, 0);
+	xdup(fd);
+	xfopen("/dev/null", "r");
+	xfdopen(fd, "r");
+	tmp = xmkstemp(path);
+	close(tmp);
+	unlink(path);
+	tmp = xmkstemp_mode(path1, mode);
+	close(tmp);
+	unlink(path1);
+	xgetcwd();
+	fopen_for_writing(path);
+	fopen_or_warn(path, "r");
+	xstrncmpz("foo", "bar", 3);
+	// xsnprintf() called above
+	xgethostname(host, 3);
+	tmp = git_mkstemps_mode(path, 1, mode);
+	close(tmp);
+	unlink(path);
+	tmp = git_mkstemp_mode(path, mode);
+	close(tmp);
+	unlink(path);
+	read_in_full(fd, &sb, 1);
+	write_in_full(fd, &sb, 1);
+	pread_in_full(fd, &sb, 1, 0);
+}
+
+int main() {
+	abspath_funcs();
+	hex_ll_funcs();
+	parse_funcs();
+	strbuf_funcs();
+	usage_funcs();
+	wrapper_funcs();
+	fprintf(stderr, "all git-std-lib functions finished calling\n");
+	return 0;
+}
-- 
2.42.0.283.g2d96d420d3-goog



* Re: [PATCH v3 0/6] Introduce Git Standard Library
  2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
                         ` (5 preceding siblings ...)
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-09-08 20:36       ` Junio C Hamano
  2023-09-08 21:30         ` Junio C Hamano
  6 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2023-09-08 20:36 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> I have taken this series out of RFC since there weren't any significant
> concerns with the overall concept and design of this series. This reroll
> incorporates some smaller changes such as dropping the "push pager
> dependency" patch in favor of stubbing it out. The main change in this
> reroll cleans up the Makefile rules and stubs, as suggested by
> Phillip Wood (appreciate the help on this one)!

What is your plan for the "config-parse" stuff?  The "create new library"
step in this series seems to aim for the same goal in a different way.

> This series has been rebased onto 1fc548b2d6a: The sixth batch
>
> Originally this series was built on other patches that have since been
> merged, which is why the range-diff is shown removing many of them.

Good.  Previous rounds did not really attract much interest from the
public if I recall correctly.  Let's see how well this round fares.

>  Documentation/technical/git-std-lib.txt | 191 ++++++++++++++++++++

It is interesting to see that there is no "std.*lib\.c" in the set
of source files, or "std.*lib\.a" target in the Makefile.



* Re: [PATCH v3 0/6] Introduce Git Standard Library
  2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
@ 2023-09-08 21:30         ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-09-08 21:30 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Junio C Hamano <gitster@pobox.com> writes:

> Calvin Wan <calvinwan@google.com> writes:
>
>> I have taken this series out of RFC since there weren't any significant
>> concerns with the overall concept and design of this series. This reroll
>> incorporates some smaller changes such as dropping the "push pager
>> dependency" patch in favor of stubbing it out. The main change in this
>> reroll cleans up the Makefile rules and stubs, as suggested by
>> Phillip Wood (appreciate the help on this one)!
>
> What is your plan for the "config-parse" stuff?  The "create new library"
> step in this series seems to aim for the same goal in a different way.

Actually, this one is far less ambitious in touching "config"
subsystem, in that it only deals with parsing strings as values.
The other one knows how a config file is laid out, what key the
value we are about to read is expected for, etc., and it will
benefit by having the "parse" code separated out by this series, but
they are more or less orthogonal.

Queued.  Thanks.



* Re: [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
@ 2023-09-09  5:26         ` Junio C Hamano
  2023-09-15 18:43         ` Jonathan Tan
  1 sibling, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-09-09  5:26 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, nasamuffin, jonathantanmy, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:

> +
> +test-git-std-lib:
> +	cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a

Yuck, no.  Try to share as much with the main Makefile one level up.

> +	./stdlib-test
> diff --git a/t/stdlib-test.c b/t/stdlib-test.c
> new file mode 100644
> index 0000000000..76fed9ecbf
> --- /dev/null
> +++ b/t/stdlib-test.c
> @@ -0,0 +1,231 @@
> +#include "../git-compat-util.h"
> +#include "../abspath.h"
> +#include "../hex-ll.h"
> +#include "../parse.h"
> +#include "../strbuf.h"
> +#include "../string-list.h"

Use -I.. or something, to match what the main Makefile does, so that
you do not have to have these "../".  With -I.., you could even say

    #include <hex-ll.h>
    #include <parse.h>

etc.


> +	// skip_to_optional_arg_default(const char *str, const char *prefix,
> +	// 			 const char **arg, const char *def)

No // comments in this codebase, please.

> +	strbuf_addchars(sb, 1, 1);
> +	strbuf_addf(sb, "%s", "foo");

https://github.com/git/git/actions/runs/6126669144/job/16631124765#step:4:657



* Re: [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
@ 2023-09-11 13:22         ` Phillip Wood
  2023-09-27 14:14           ` Phillip Wood
  2023-09-15 18:39         ` Jonathan Tan
  2023-09-26 14:23         ` phillip.wood123
  2 siblings, 1 reply; 111+ messages in thread
From: Phillip Wood @ 2023-09-11 13:22 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, jonathantanmy, linusa, vdye

Hi Calvin

On 08/09/2023 18:44, Calvin Wan wrote:
> +ifndef GIT_STD_LIB
>   LIB_OBJS += abspath.o
>   LIB_OBJS += add-interactive.o
>   LIB_OBJS += add-patch.o
> @@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
>   LIB_OBJS += ws.o
>   LIB_OBJS += wt-status.o
>   LIB_OBJS += xdiff-interface.o
> +else ifdef GIT_STD_LIB
> +LIB_OBJS += abspath.o
> +LIB_OBJS += ctype.o
> +LIB_OBJS += date.o
> +LIB_OBJS += hex-ll.o
> +LIB_OBJS += parse.o
> +LIB_OBJS += strbuf.o
> +LIB_OBJS += usage.o
> +LIB_OBJS += utf8.o
> +LIB_OBJS += wrapper.o

It is still not clear to me how re-using LIB_OBJS like this is 
compatible with building libgit.a and git-std-lib.a in a single make 
process c.f. [1].

> +ifdef GIT_STD_LIB
> +	BASIC_CFLAGS += -DGIT_STD_LIB
> +	BASIC_CFLAGS += -DNO_GETTEXT

As I've said before [2] I think that being able to build git-std-lib.a 
with gettext support is a prerequisite for using it to build git (just 
like trace2 support is). If we cannot build git using git-std-lib then 
the latter is likely to bit rot and so I don't think git-std-lib should 
be merged until there is a demonstration of building git using it.


> +### Libified Git rules
> +
> +# git-std-lib
> +# `make git-std-lib.a GIT_STD_LIB=YesPlease STUB_TRACE2=YesPlease STUB_PAGER=YesPlease`
> +STD_LIB = git-std-lib.a
> +
> +$(STD_LIB): $(LIB_OBJS) $(COMPAT_OBJS) $(STUB_OBJS)
> +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^

This is much nicer than the previous version.

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 3e7a59b5ff..14bf71c530 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -455,8 +455,8 @@ static inline int noop_core_config(const char *var UNUSED,
>   #define platform_core_config noop_core_config
>   #endif
>   
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>   int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>   #define rmdir lstat_cache_aware_rmdir
>   #endif

I thought we'd agreed that this represents a change in behavior that 
should be fixed c.f. [2]

> @@ -1462,14 +1464,17 @@ static inline int is_missing_file_error(int errno_)
>   	return (errno_ == ENOENT || errno_ == ENOTDIR);
>   }
>   
> +#ifndef GIT_STD_LIB
>   int cmd_main(int, const char **);
>   
>   /*
>    * Intercept all calls to exit() and route them to trace2 to
>    * optionally emit a message before calling the real exit().
>    */
> +

Nit: this blank line seems unnecessary

>   int common_exit(const char *file, int line, int code);
>   #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
> +#endif
>   
>   /*
>    * You can mark a stack variable with UNLEAK(var) to avoid it being
> diff --git a/stubs/pager.c b/stubs/pager.c

> diff --git a/stubs/pager.h b/stubs/pager.h
> new file mode 100644
> index 0000000000..b797910881
> --- /dev/null
> +++ b/stubs/pager.h
> @@ -0,0 +1,6 @@
> +#ifndef PAGER_H
> +#define PAGER_H
> +
> +int pager_in_use(void);
> +
> +#endif /* PAGER_H */

Is this file actually used for anything? pager_in_use() is already 
declared in pager.h in the project root directory.

> diff --git a/wrapper.c b/wrapper.c
> index 7da15a56da..eeac3741cf 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -5,7 +5,6 @@
>   #include "abspath.h"
>   #include "parse.h"
>   #include "gettext.h"
> -#include "repository.h"

It is probably worth splitting this change out with a commit message 
explaining why the include is unneeded.

This is looking good, it would be really nice to see a demonstration of 
building git using git-std-lib (with gettext support) in the next iteration.

Best Wishes

Phillip


[1] 
https://lore.kernel.org/git/a0f04bd7-3a1e-b303-fd52-eee2af4d38b3@gmail.com/
[2] 
https://lore.kernel.org/git/CAFySSZBMng9nEdCkuT5+fc6rfFgaFfU2E0NP3=jUQC1yRcUE6Q@mail.gmail.com/


* Re: [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file
  2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
@ 2023-09-15 17:54         ` Jonathan Tan
  0 siblings, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-09-15 17:54 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Jonathan Tan, git, nasamuffin, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> In order for wrapper.c to be built independently as part of a smaller
> library, it cannot have dependencies to other Git specific
> internals. remove_or_warn() creates an unnecessary dependency to
> object.h in wrapper.c. Therefore move the function to entry.[ch] which
> performs changes on the worktree based on the Git-specific file modes in
> the index.

Looking at remove_or_warn(), it's only used from entry.c and apply.c
(which already includes entry.h for another reason) so moving it to
entry.c looks fine.


* Re: [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
  2023-09-11 13:22         ` Phillip Wood
@ 2023-09-15 18:39         ` Jonathan Tan
  2023-09-26 14:23         ` phillip.wood123
  2 siblings, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-09-15 18:39 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Jonathan Tan, git, nasamuffin, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> diff --git a/Makefile b/Makefile
> index 9226c719a0..0a2d1ae3cc 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -669,6 +669,7 @@ FUZZ_PROGRAMS =
>  GIT_OBJS =
>  LIB_OBJS =
>  SCALAR_OBJS =
> +STUB_OBJS =
>  OBJECTS =
>  OTHER_PROGRAMS =
>  PROGRAM_OBJS =

I don't think stubs should be compiled into git-std-lib.a - I would
expect a consumer of this library to be able to specify their own
implementations if needed (e.g. their own trace2).

> @@ -956,6 +957,7 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
>  
>  LIB_H = $(FOUND_H_SOURCES)
>  
> +ifndef GIT_STD_LIB
>  LIB_OBJS += abspath.o
>  LIB_OBJS += add-interactive.o
>  LIB_OBJS += add-patch.o
> @@ -1196,6 +1198,27 @@ LIB_OBJS += write-or-die.o
>  LIB_OBJS += ws.o
>  LIB_OBJS += wt-status.o
>  LIB_OBJS += xdiff-interface.o
> +else ifdef GIT_STD_LIB
> +LIB_OBJS += abspath.o
> +LIB_OBJS += ctype.o
> +LIB_OBJS += date.o
> +LIB_OBJS += hex-ll.o
> +LIB_OBJS += parse.o
> +LIB_OBJS += strbuf.o
> +LIB_OBJS += usage.o
> +LIB_OBJS += utf8.o
> +LIB_OBJS += wrapper.o

This means that LIB_OBJS (in this patch, used both by git-std-lib and
as part of compiling the regular Git binary) can differ based on the
GIT_STD_LIB variable. It does seem that we cannot avoid GIT_STD_LIB
for now, because the git-std-lib can only be compiled without GETTEXT
(so we need a variable to make sure that none of these .o files are
compiled with GETTEXT), but we should still minimize the changes between
compiling with GIT_STD_LIB and without it, at least to minimize future
work. Could we have two separate lists? So, leave LIB_OBJS alone and
make a new STD_LIB_OBJS.

> diff --git a/git-compat-util.h b/git-compat-util.h
> index 3e7a59b5ff..14bf71c530 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -455,8 +455,8 @@ static inline int noop_core_config(const char *var UNUSED,
>  #define platform_core_config noop_core_config
>  #endif
>  
> +#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
>  int lstat_cache_aware_rmdir(const char *path);
> -#if !defined(__MINGW32__) && !defined(_MSC_VER)
>  #define rmdir lstat_cache_aware_rmdir
>  #endif

I think we still want to keep the idea of "the code should still be good
even if we have no use for git-std-lib" as much as possible, so could we
stub lstat_cache_aware_rmdir() instead? We could have a new git-compat-
util-stub.c (or whatever we want to call it).
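
A stub along those lines could be as small as the sketch below. It
assumes the stub simply forwards to the real rmdir(2) because a library
consumer has no lstat cache to invalidate. The file name is only the
placeholder suggested above, and the body is an illustration rather
than code from this series.

```c
/*
 * Sketch of a possible git-compat-util-stub.c (hypothetical file).
 * Assumption: there is no lstat cache on the consumer side, so the
 * stub only needs to forward to the real rmdir(2).
 */
#include <unistd.h>

/*
 * In-tree, git-compat-util.h may define rmdir as a macro expanding to
 * lstat_cache_aware_rmdir; drop it so the call below reaches libc.
 * Undefining a macro that is not defined is harmless.
 */
#undef rmdir

int lstat_cache_aware_rmdir(const char *path)
{
	return rmdir(path);
}
```

With a definition like this available at link time, the "#define rmdir
lstat_cache_aware_rmdir" line could stay unconditional instead of being
hidden behind GIT_STD_LIB.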

> @@ -966,9 +966,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
>  #endif
>  
>  #ifdef NO_PTHREADS
> +#ifdef GIT_STD_LIB
>  #define atexit git_atexit
>  int git_atexit(void (*handler)(void));
>  #endif
> +#endif
>  
>  static inline size_t st_add(size_t a, size_t b)
>  {

Same for git_atexit().

> @@ -1462,14 +1464,17 @@ static inline int is_missing_file_error(int errno_)
>  	return (errno_ == ENOENT || errno_ == ENOTDIR);
>  }
>  
> +#ifndef GIT_STD_LIB
>  int cmd_main(int, const char **);
>  
>  /*
>   * Intercept all calls to exit() and route them to trace2 to
>   * optionally emit a message before calling the real exit().
>   */
> +
>  int common_exit(const char *file, int line, int code);
>  #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
> +#endif
>  
>  /*
>   * You can mark a stack variable with UNLEAK(var) to avoid it being

And for common_exit().

As for cmd_main(), that seems to be a convenience so that we can link
common_main.o with various other files (e.g. http-backend.c). I think
the right thing to do is to define a new cmd-main.h that declares only
cmd_main(), and then have only the files that need it (common_main.c and
all the files that define cmd_main()) include it. This cleanup patch can
be done before this patch. I think this is a good change that we would
want even without libification.
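
To make the cmd-main.h idea concrete, here is a rough single-file
sketch. Only the cmd_main() prototype comes from git-compat-util.h as
quoted above; the header guard name, the toy cmd_main() body, and
toy_main() (standing in for the main() of the file that calls
cmd_main()) are assumptions made up for illustration.

```c
/* ---- cmd-main.h (proposed header; contents are an assumption) ---- */
#ifndef CMD_MAIN_H
#define CMD_MAIN_H

/* Entry point defined by each program that shares the common main(). */
int cmd_main(int argc, const char **argv);

#endif /* CMD_MAIN_H */

/* ---- a program file would include cmd-main.h and define it ---- */
int cmd_main(int argc, const char **argv)
{
	(void)argv;
	/* Toy behavior: succeed only when at least one argument is given. */
	return argc > 1 ? 0 : 1;
}

/*
 * ---- the file providing main() would be the only caller ----
 * toy_main() is a hypothetical stand-in for that main().
 */
int toy_main(int argc, const char **argv)
{
	return cmd_main(argc, argv);
}
```

Files that neither define nor call cmd_main() would then never see its
declaration, so git-compat-util.h would need no #ifndef around it.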
 


* Re: [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
  2023-09-09  5:26         ` Junio C Hamano
@ 2023-09-15 18:43         ` Jonathan Tan
  2023-09-15 20:22           ` Junio C Hamano
  1 sibling, 1 reply; 111+ messages in thread
From: Jonathan Tan @ 2023-09-15 18:43 UTC (permalink / raw)
  To: Calvin Wan; +Cc: Jonathan Tan, git, nasamuffin, linusa, phillip.wood123, vdye

Calvin Wan <calvinwan@google.com> writes:
> Add test file that directly or indirectly calls all functions defined in
> git-std-lib.a object files to showcase that they do not reference
> missing objects and that git-std-lib.a can stand on its own.
> 
> Certain functions that cause the program to exit or are already called
> by other functions are commented out.
> 
> TODO: replace with unit tests
> Signed-off-by: Calvin Wan <calvinwan@google.com>

I think the TODO should go into the code, so that when we add a unit
test that also deletes stdlib-test.c, we can see what's happening just
from the diff. The TODO should also explain what stdlib-test.c is hoping
to do, and why replacing it is OK. (Also, do we need to invoke all the
functions? I thought that missing functions are checked at link time, or
at the very latest, when the executable is run. No need to change this,
though - invoking all the functions we can is fine.)
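
On the link-time point, one way to get a "does every symbol resolve"
check without executing anything is to take each function's address,
since a reference alone is enough for the linker to demand a
definition. A minimal sketch follows, using two libc functions as
stand-ins for git-std-lib.a entry points; the table and helper names
are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Generic function-pointer type; entries are never called through it. */
typedef void (*generic_fn)(void);

/*
 * Each initializer references a symbol, so linking fails if any of
 * them is undefined. strlen and free are stand-ins for real
 * git-std-lib.a functions. The array has external linkage so that it
 * is emitted even though nothing calls through it.
 */
const generic_fn referenced_funcs[] = {
	(generic_fn)strlen,
	(generic_fn)free,
};

/* Number of functions whose symbols the linker had to resolve. */
size_t referenced_func_count(void)
{
	return sizeof(referenced_funcs) / sizeof(referenced_funcs[0]);
}
```

This would also cover functions such as die() that cannot be called in
a test because they exit.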
 


* Re: [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions
  2023-09-15 18:43         ` Jonathan Tan
@ 2023-09-15 20:22           ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-09-15 20:22 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Calvin Wan, git, nasamuffin, linusa, phillip.wood123, vdye

Jonathan Tan <jonathantanmy@google.com> writes:

> Calvin Wan <calvinwan@google.com> writes:
>> Add test file that directly or indirectly calls all functions defined in
>> git-std-lib.a object files to showcase that they do not reference
>> missing objects and that git-std-lib.a can stand on its own.
>> 
>> Certain functions that cause the program to exit or are already called
>> by other functions are commented out.
>> 
>> TODO: replace with unit tests
>> Signed-off-by: Calvin Wan <calvinwan@google.com>
>
> I think the TODO should go into the code, so that when we add a unit
> test that also deletes stdlib-test.c, we can see what's happening just
> from the diff. The TODO should also explain what stdlib-test.c is hoping
> to do, and why replacing it is OK. (Also, do we need to invoke all the
> functions? I thought that missing functions are checked at link time, or
> at the very latest, when the executable is run. No need to change this,
> though - invoking all the functions we can is fine.)
>  

Thanks for excellent reviews (not just against this 6/6 but others,
too).



* Re: [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
  2023-09-11 13:22         ` Phillip Wood
  2023-09-15 18:39         ` Jonathan Tan
@ 2023-09-26 14:23         ` phillip.wood123
  2 siblings, 0 replies; 111+ messages in thread
From: phillip.wood123 @ 2023-09-26 14:23 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, jonathantanmy, linusa, vdye

Hi Calvin

On 08/09/2023 18:44, Calvin Wan wrote:
> The Git Standard Library intends to serve as the foundational library
> and root dependency that other libraries in Git will be built off of.
> That is to say, suppose we have libraries X and Y; a user that wants to
> use X and Y would need to include X, Y, and this Git Standard Library.
> 
> Add Documentation/technical/git-std-lib.txt to further explain the
> design and rationale.
> 
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Helped-by: Phillip Wood <phillip.wood123@gmail.com>
> ---
>   Documentation/technical/git-std-lib.txt | 191 ++++++++++++++++++++++++

I need the following diff to build the html documentation.

Best Wishes

Phillip

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 3f2383a12c..f1dc673838 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -110,6 +110,7 @@ TECH_DOCS += SubmittingPatches
  TECH_DOCS += ToolsForGit
  TECH_DOCS += technical/bitmap-format
  TECH_DOCS += technical/bundle-uri
+TECH_DOCS += technical/git-std-lib
  TECH_DOCS += technical/hash-function-transition
  TECH_DOCS += technical/long-running-process-protocol
  TECH_DOCS += technical/multi-pack-index
diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
index d03b8565b4..28e6cdac2a 100644
--- a/Documentation/technical/git-std-lib.txt
+++ b/Documentation/technical/git-std-lib.txt
@@ -1,5 +1,4 @@
-Git Standard Library
-================
+= Git Standard Library
  
  The Git Standard Library intends to serve as the foundational library
  and root dependency that other libraries in Git will be built off of.
@@ -9,8 +8,7 @@ This does not mean that the Git Standard Library will be the only
  possible root dependency in the future, but rather the most significant
  and widely used one.
  
-Dependency graph in libified Git
-================
+== Dependency graph in libified Git
  
  If you look in the Git Makefile, all of the objects defined in the Git
  library are compiled and archived into a singular file, libgit.a, which
@@ -57,8 +55,7 @@ if someone wanted their own custom build of Git with their own custom
  implementation of the object store, they would only have to swap out
  object-store.a rather than do a hard fork of Git.
  
-Rationale behind Git Standard Library
-================
+== Rationale behind Git Standard Library
  
  The rationale behind what's in and what's not in the Git Standard
  Library essentially is the result of two observations within the Git
@@ -67,8 +64,7 @@ in a couple of different files, and wrapper.c + usage.c have
  difficult-to-separate circular dependencies with each other and other
  files.
  
-Ubiquity of git-compat-util.h and circular dependencies
-========
+=== Ubiquity of git-compat-util.h and circular dependencies
  
  Every file in the Git codebase includes git-compat-util.h. It serves as
  "a compatibility aid that isolates the knowledge of platform specific
@@ -79,10 +75,9 @@ for wrapper.c and usage.c to be a part of the root library. They have
  difficult to separate circular dependencies with each other so they
  can't be independent libraries. Wrapper.c has dependencies on parse.c,
  abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
-wrapper.c -- more circular dependencies.
+wrapper.c - more circular dependencies.
  
-Tradeoff between swappability and refactoring
-========
+=== Tradeoff between swappability and refactoring
  
  From the above dependency graph, we can see that git-std-lib.a could be
  many smaller libraries rather than a singular library. So why choose a
@@ -99,8 +94,7 @@ and change the API for the library if there becomes enough of a reason
  to do so (remember we are avoiding promising stability of the interfaces
  of those libraries).
  
-Reuse of compatibility functions in git-compat-util.h
-========
+=== Reuse of compatibility functions in git-compat-util.h
  
  Most functions defined in git-compat-util.h are implemented in compat/
  and have dependencies limited to strbuf.h and wrapper.h so they can be
@@ -110,8 +104,7 @@ compat/. The rest of the functions defined in git-compat-util.h are
  implemented in top level files and are hidden behind
  an #ifdef if their implementation is not in git-std-lib.a.
  
-Rationale summary
-========
+=== Rationale summary
  
  The Git Standard Library allows us to get the libification ball rolling
  with other libraries in Git. By not spending many
@@ -123,8 +116,7 @@ the code cleanups that have happened so far have been minor and
  beneficial for the codebase. It is probable that making large movements
  would negatively affect code clarity.
  
-Git Standard Library boundary
-================
+== Git Standard Library boundary
  
  While I have described above some useful heuristics for identifying
  potential candidates for git-std-lib.a, a standard library should not
@@ -150,8 +142,7 @@ to be able to trace functions in those files and other files in git-std-lib.a.
  In order for git-std-lib.a to compile with those dependencies, stubbed out
  versions of those files are implemented and swapped in during compilation time.
  
-Files inside of Git Standard Library
-================
+== Files inside of Git Standard Library
  
  The initial set of files in git-std-lib.a are:
  abspath.c
@@ -171,21 +162,19 @@ complete library:
  stubs/pager.c
  stubs/trace2.c
  
-Pitfalls
-================
+== Pitfalls
  
  There are a small amount of files under compat/* that have dependencies
  not inside of git-std-lib.a. While those functions are not called on
  Linux, other OSes might call those problematic functions. I don't see
  this as a major problem, just moreso an observation that libification in
  general may also require some minor compatibility work in the future.
  
-Testing
-================
+== Testing
  
  Unit tests should catch any breakages caused by changes to files in
  git-std-lib.a (i.e. introduction of a out of scope dependency) and new
  functions introduced to git-std-lib.a will require unit tests written
  for them.
  
-[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/


* Re: [PATCH v3 5/6] git-std-lib: introduce git standard library
  2023-09-11 13:22         ` Phillip Wood
@ 2023-09-27 14:14           ` Phillip Wood
  0 siblings, 0 replies; 111+ messages in thread
From: Phillip Wood @ 2023-09-27 14:14 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: nasamuffin, jonathantanmy, linusa, vdye

Hi Calvin

On 11/09/2023 14:22, Phillip Wood wrote:
> This is looking good, it would be really nice to see a demonstration of 
> building git using git-std-lib (with gettext support) in the next 
> iteration.

I've just pushed a couple of patches on top of this series to 
https://github.com/phillipwood/git/tree/cw/git-std-lib/v3 that show a 
possible way to do that.

Rather than re-using LIB_OBJS when building git-std-lib.a it uses a 
separate variable to hold the list of sources. git is linked against 
git-std-lib.a so we don't have to worry about having the same source 
file in two different libraries. The stub objects are moved into their 
own library. This enables git to be built using git-std-lib.a with

     make git

which will link git against git-std-lib.a built with gettext and tracing 
support. To use git-std-lib.a in other programs compile it and 
git-stub-lib.a with

     make NO_GETTEXT=YesPlease git-std-lib.a git-stub-lib.a

and link your program against both libraries.

The GIT_STD_LIB define is also removed in favor of more stubbing to 
avoid the complications of conditional compilation.
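
For readers unfamiliar with the stubbing approach, a stub is just a
minimal definition that satisfies the linker. Below is a sketch for
pager_in_use(), whose prototype appears in stubs/pager.h earlier in the
thread. The return value and the caller are assumptions for
illustration, not the contents of stubs/pager.c or of the branch
mentioned above.

```c
/* Stub: assume a library consumer never runs a pager. */
int pager_in_use(void)
{
	return 0;
}

/*
 * Hypothetical caller standing in for library code that consults the
 * pager state; it is not a function from this series.
 */
int want_color_sketch(int stdout_is_tty)
{
	return stdout_is_tty || pager_in_use();
}
```

Because the stub lives in its own library, a program that does have a
pager can link its own pager_in_use() instead and leave the stub
library out of the link.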

Best Wishes

Phillip


* [PATCH v4 0/4] Preliminary patches before git-std-lib
  2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
                   ` (10 preceding siblings ...)
  2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
@ 2023-09-29 21:20 ` Jonathan Tan
  2023-09-29 21:20   ` [PATCH v4 1/4] hex-ll: separate out non-hash-algo functions Jonathan Tan
                     ` (8 more replies)
  11 siblings, 9 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-09-29 21:20 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, Calvin Wan, phillip.wood123, Junio C Hamano

Calvin will be away for a few weeks and I'll be handling the git-std-lib
effort in the meantime. My goals will be:

- Get the preliminary patches in Calvin's patch set (patches 1-4) merged
first.

- Update patches 5-6 based on reviewer feedback (including my
feedback). I have several aims including reducing or eliminating the
need for the GIT_STD_LIB preprocessor variable, and making stubs a test-
only concern (I think Phillip has some similar ideas [1] but I haven't
looked at their repo on GitHub yet).

[1] https://lore.kernel.org/git/98f3edcf-7f37-45ff-abd2-c0038d4e0589@gmail.com/

This patch set is in service of the first goal. Because the libification
patches are no longer included in this patch set, I have rewritten the
commit messages to justify the patches in terms of code organization.
There are no changes in the code itself. Also, I have retained Calvin's
name as the author.

Putting on my reviewer hat, if I was reviewing hex.h and config.h from
scratch, I would have not thought twice about requesting the changes
in these patches. But since we are not creating them from scratch but
modifying existing files, a question does arise about whether it's
worth the additional noise that someone looking through history needs
to handle. In this case, I think it's worth it - I think the future code
delver would appreciate being able to see the evolution of hex, hash
algo, config, and parse functions as their own files rather than, when
looking at one of them, having to filter out unrelated changes.

Besides that, as Calvin has described in his other emails, these patches
are prerequisites to being able to independently compile and use a
certain subset of the .c files. With patches that solely refactor, there
is sometimes a worry that the benefits are nebulous and that we would
be moving code around for nothing, but I don't think that that applies
here: there is still more work to be done on patches 5 and 6, but what
we have in patches 5 and 6 now shows that the benefits are concrete and
within reach.

Calvin Wan (4):
  hex-ll: separate out non-hash-algo functions
  wrapper: reduce scope of remove_or_warn()
  config: correct bad boolean env value error message
  parse: separate out parsing functions from config.h

 Makefile                   |   2 +
 attr.c                     |   2 +-
 color.c                    |   2 +-
 config.c                   | 173 +----------------------------------
 config.h                   |  14 +--
 entry.c                    |   5 +
 entry.h                    |   6 ++
 hex-ll.c                   |  49 ++++++++++
 hex-ll.h                   |  27 ++++++
 hex.c                      |  47 ----------
 hex.h                      |  24 +----
 mailinfo.c                 |   2 +-
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 strbuf.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 url.c                      |   2 +-
 urlmatch.c                 |   2 +-
 wrapper.c                  |   8 +-
 wrapper.h                  |   5 -
 write-or-die.c             |   2 +-
 30 files changed, 313 insertions(+), 284 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h
 create mode 100644 parse.c
 create mode 100644 parse.h

Range-diff against v3:
1:  fcce01bc19 ! 1:  02ecc00e9c hex-ll: split out functionality from hex
    @@ Metadata
     Author: Calvin Wan <calvinwan@google.com>
     
      ## Commit message ##
    -    hex-ll: split out functionality from hex
    +    hex-ll: separate out non-hash-algo functions
     
    -    Separate out hex functionality that doesn't require a hash algo into
    -    hex-ll.[ch]. Since the hash algo is currently a global that sits in
    -    repository, this separation removes that dependency for files that only
    -    need basic hex manipulation functions.
    +    In order to further reduce all-in-one headers, separate out functions in
    +    hex.h that do not operate on object hashes into its own file, hex-ll.h,
    +    and update the include directives in the .c files that need only such
    +    functions accordingly.
     
         Signed-off-by: Calvin Wan <calvinwan@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
      ## Makefile ##
     @@ Makefile: LIB_OBJS += hash-lookup.o
2:  95a369d02b ! 2:  c9e7cd7857 wrapper: remove dependency to Git-specific internal file
    @@ Metadata
     Author: Calvin Wan <calvinwan@google.com>
     
      ## Commit message ##
    -    wrapper: remove dependency to Git-specific internal file
    +    wrapper: reduce scope of remove_or_warn()
     
    -    In order for wrapper.c to be built independently as part of a smaller
    -    library, it cannot have dependencies to other Git specific
    -    internals. remove_or_warn() creates an unnecessary dependency to
    -    object.h in wrapper.c. Therefore move the function to entry.[ch] which
    -    performs changes on the worktree based on the Git-specific file modes in
    -    the index.
    +    remove_or_warn() is only used by entry.c and apply.c, but it is
    +    currently declared and defined in wrapper.{h,c}, so it has a scope much
    +    greater than it needs. This needlessly large scope also causes wrapper.c
    +    to need to include object.h, when this file is largely unconcerned with
    +    Git objects.
    +
    +    Move remove_or_warn() to entry.{h,c}. The file apply.c still has access
    +    to it, since it already includes entry.h for another reason.
     
         Signed-off-by: Calvin Wan <calvinwan@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
      ## entry.c ##
     @@ entry.c: void unlink_entry(const struct cache_entry *ce, const char *super_prefix)
3:  5348528865 = 3:  e4c20a81f9 config: correct bad boolean env value error message
4:  b5a8945c5c ! 4:  5d9f0b3de0 parse: create new library for parsing strings and env values
    @@ Metadata
     Author: Calvin Wan <calvinwan@google.com>
     
      ## Commit message ##
    -    parse: create new library for parsing strings and env values
    +    parse: separate out parsing functions from config.h
     
    -    While string and environment value parsing is mainly consumed by
    -    config.c, there are other files that only need parsing functionality and
    -    not config functionality. By separating out string and environment value
    -    parsing from config, those files can instead be dependent on parse,
    -    which has a much smaller dependency chain than config. This ultimately
    -    allows us to include parse.[ch] in an independent library since it
    -    doesn't have dependencies to Git-specific internals unlike in
    -    config.[ch].
    +    The files config.{h,c} contain functions that have to do with parsing,
    +    but not config.
     
    -    Move general string and env parsing functions from config.[ch] to
    -    parse.[ch].
    +    In order to further reduce all-in-one headers, separate out functions in
    +    config.c that do not operate on config into their own file, parse.h,
    +    and update the include directives in the .c files that need only such
    +    functions accordingly.
     
         Signed-off-by: Calvin Wan <calvinwan@google.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
      ## Makefile ##
     @@ Makefile: LIB_OBJS += pack-write.o
-- 
2.42.0.582.g8ccd20d70d-goog


* [PATCH v4 1/4] hex-ll: separate out non-hash-algo functions
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
@ 2023-09-29 21:20   ` Jonathan Tan
  2023-10-21  4:14     ` Linus Arver
  2023-09-29 21:20   ` [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn() Jonathan Tan
                     ` (7 subsequent siblings)
  8 siblings, 1 reply; 111+ messages in thread
From: Jonathan Tan @ 2023-09-29 21:20 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, phillip.wood123, Junio C Hamano, Jonathan Tan

From: Calvin Wan <calvinwan@google.com>

In order to further reduce all-in-one headers, separate out functions in
hex.h that do not operate on object hashes into their own file, hex-ll.h,
and update the include directives in the .c files that need only such
functions accordingly.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
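
For readers who want to try the contract of the split-out helpers without
building Git, here is a minimal standalone sketch. It is not the code in
this patch: the sketch_* names are hypothetical, and it uses explicit
range comparisons (assuming an ASCII-compatible character set) instead of
the hexval_table lookup. It only aims to mirror the documented behaviour
of hex2chr() and hex_to_bytes().

```c
#include <stddef.h>

/* Hypothetical stand-in for hexval(): 0-15 for a hex digit, -1 otherwise. */
static int sketch_hexval(unsigned char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Mirrors the documented contract of hex2chr(): the byte value on
 * success, a negative value on error. The second character is only
 * read when the first one is a hex digit, so a short string is not
 * overrun.
 */
static int sketch_hex2chr(const char *s)
{
	int hi = sketch_hexval((unsigned char)s[0]);
	int lo;

	if (hi < 0)
		return -1;
	lo = sketch_hexval((unsigned char)s[1]);
	if (lo < 0)
		return -1;
	return (hi << 4) | lo;
}

/*
 * Mirrors the documented contract of hex_to_bytes(): read `len` pairs
 * of hex digits and write `len` bytes; 0 on success, -1 on bad input.
 */
static int sketch_hex_to_bytes(unsigned char *binary, const char *hex,
			       size_t len)
{
	for (; len; len--, hex += 2) {
		int val = sketch_hex2chr(hex);

		if (val < 0)
			return -1;
		*binary++ = (unsigned char)val;
	}
	return 0;
}
```

A caller that only needs this kind of decoding (a URL percent-decoder,
for instance) has no use for the hash-algo machinery, which is the
motivation for letting it include hex-ll.h alone.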
 Makefile   |  1 +
 color.c    |  2 +-
 hex-ll.c   | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 hex-ll.h   | 27 +++++++++++++++++++++++++++
 hex.c      | 47 -----------------------------------------------
 hex.h      | 24 +-----------------------
 mailinfo.c |  2 +-
 strbuf.c   |  2 +-
 url.c      |  2 +-
 urlmatch.c |  2 +-
 10 files changed, 83 insertions(+), 75 deletions(-)
 create mode 100644 hex-ll.c
 create mode 100644 hex-ll.h

diff --git a/Makefile b/Makefile
index 5776309365..861e643708 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index b24b19566b..f663c06ac4 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
 #include "color.h"
 #include "editor.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "pager.h"
 #include "strbuf.h"
 
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
+	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
+	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
+	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-6f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
+	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+	for (; len; len--, hex += 2) {
+		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+		if (val & ~0xff)
+			return -1;
+		*binary++ = val;
+	}
+	return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+	return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char.  Return a
+ * negative value on error.  Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+	unsigned int val = hexval(s[0]);
+	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits.
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 01f17fe5c9..d42262bdca 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
 #include "hash.h"
 #include "hex.h"
 
-const signed char hexval_table[256] = {
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 00-07 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 08-0f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 10-17 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 18-1f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 20-27 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 28-2f */
-	  0,  1,  2,  3,  4,  5,  6,  7,		/* 30-37 */
-	  8,  9, -1, -1, -1, -1, -1, -1,		/* 38-3f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 40-47 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 48-4f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 50-57 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 58-5f */
-	 -1, 10, 11, 12, 13, 14, 15, -1,		/* 60-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 68-67 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 70-77 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 78-7f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 80-87 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 88-8f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 90-97 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* 98-9f */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a0-a7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* a8-af */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b0-b7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* b8-bf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c0-c7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* c8-cf */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d0-d7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* d8-df */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e0-e7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* e8-ef */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f0-f7 */
-	 -1, -1, -1, -1, -1, -1, -1, -1,		/* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
-	for (; len; len--, hex += 2) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
-		if (val & ~0xff)
-			return -1;
-		*binary++ = val;
-	}
-	return 0;
-}
-
 static int get_hash_hex_algop(const char *hex, unsigned char *hash,
 			      const struct git_hash_algo *algop)
 {
diff --git a/hex.h b/hex.h
index 87abf66602..e0b83f776f 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
 #define HEX_H
 
 #include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
-	return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char.  Return a
- * negative value on error.  Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
-	unsigned int val = hexval(s[0]);
-	return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
 
 /*
  * Try to read a hash (specified by the_hash_algo) in hexadecimal
@@ -34,13 +19,6 @@ int get_oid_hex(const char *hex, struct object_id *oid);
 /* Like get_oid_hex, but for an arbitrary hash algorithm. */
 int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
 
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
 /*
  * Convert a binary hash in "unsigned char []" or an object name in
  * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 931505363c..a07d2da16d 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
 #include "git-compat-util.h"
 #include "config.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "utf8.h"
 #include "strbuf.h"
 #include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 4c9ac6dc5e..7827178d8e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "string-list.h"
 #include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "url.h"
 
diff --git a/urlmatch.c b/urlmatch.c
index 1c45f23adf..1d0254abac 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
 #include "strbuf.h"
 #include "urlmatch.h"
 
-- 
2.42.0.582.g8ccd20d70d-goog


* [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn()
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
  2023-09-29 21:20   ` [PATCH v4 1/4] hex-ll: separate out non-hash-algo functions Jonathan Tan
@ 2023-09-29 21:20   ` Jonathan Tan
  2023-10-10  9:59     ` phillip.wood123
  2023-09-29 21:20   ` [PATCH v4 3/4] config: correct bad boolean env value error message Jonathan Tan
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 111+ messages in thread
From: Jonathan Tan @ 2023-09-29 21:20 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, phillip.wood123, Junio C Hamano, Jonathan Tan

From: Calvin Wan <calvinwan@google.com>

remove_or_warn() is only used by entry.c and apply.c, but it is
currently declared and defined in wrapper.{h,c}, so it has a scope much
greater than it needs. This needlessly large scope also causes wrapper.c
to need to include object.h, when this file is largely unconcerned with
Git objects.

Move remove_or_warn() to entry.{h,c}. The file apply.c still has access
to it, since it already includes entry.h for another reason.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
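
To illustrate why this helper is Git-specific rather than a generic
wrapper, here is a standalone sketch of the dispatch it performs. The
SKETCH_* macros and sketch_* functions are hypothetical names for this
note only. The gitlink mode value 0160000 is the one Git uses for
submodule entries; the warning text and the handling of ENOENT are a
simplified approximation of the *_or_warn() helpers in wrapper.c, not a
copy of them.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical local copies of the mode bits involved. */
#define SKETCH_S_IFMT      0170000
#define SKETCH_S_IFGITLINK 0160000
#define SKETCH_S_ISGITLINK(m) (((m) & SKETCH_S_IFMT) == SKETCH_S_IFGITLINK)

/* Warn on failure, except when the path was already gone. */
static int sketch_warn_if_unremovable(const char *op, const char *path,
				      int rc)
{
	if (!rc || errno == ENOENT)
		return 0;	/* removed, or nothing to remove */
	fprintf(stderr, "warning: unable to %s '%s': %s\n",
		op, path, strerror(errno));
	return rc;
}

/*
 * A gitlink (submodule) entry is a directory in the worktree, so it is
 * removed with rmdir(); every other entry is removed with unlink().
 */
static int sketch_remove_or_warn(unsigned int mode, const char *path)
{
	return SKETCH_S_ISGITLINK(mode)
		? sketch_warn_if_unremovable("rmdir", path, rmdir(path))
		: sketch_warn_if_unremovable("unlink", path, unlink(path));
}
```

The only Git knowledge here is the mode test, which is why the function
sits more naturally next to the index-driven code in entry.c.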
 entry.c   | 5 +++++
 entry.h   | 6 ++++++
 wrapper.c | 6 ------
 wrapper.h | 5 -----
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 43767f9043..076e97eb89 100644
--- a/entry.c
+++ b/entry.c
@@ -581,3 +581,8 @@ void unlink_entry(const struct cache_entry *ce, const char *super_prefix)
 		return;
 	schedule_dir_for_removal(ce->name, ce_namelen(ce));
 }
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/entry.h b/entry.h
index 7329f918a9..ca3ed35bc0 100644
--- a/entry.h
+++ b/entry.h
@@ -62,4 +62,10 @@ int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
 void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 			   struct stat *st);
 
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
 #endif /* ENTRY_H */
diff --git a/wrapper.c b/wrapper.c
index 48065c4f53..453a20ed99 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
 #include "abspath.h"
 #include "config.h"
 #include "gettext.h"
-#include "object.h"
 #include "repository.h"
 #include "strbuf.h"
 #include "trace2.h"
@@ -632,11 +631,6 @@ int rmdir_or_warn(const char *file)
 	return warn_if_unremovable("rmdir", file, rmdir(file));
 }
 
-int remove_or_warn(unsigned int mode, const char *file)
-{
-	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
 static int access_error_is_ok(int err, unsigned flag)
 {
 	return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index 79c7321bb3..1b2b047ea0 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -106,11 +106,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
  * not exist.
  */
 int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
 
 /*
  * Call access(2), but warn for any error except "missing file"
-- 
2.42.0.582.g8ccd20d70d-goog


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v4 3/4] config: correct bad boolean env value error message
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
  2023-09-29 21:20   ` [PATCH v4 1/4] hex-ll: separate out non-hash-algo functions Jonathan Tan
  2023-09-29 21:20   ` [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn() Jonathan Tan
@ 2023-09-29 21:20   ` Jonathan Tan
  2023-09-29 23:03     ` Junio C Hamano
  2023-09-29 21:20   ` [PATCH v4 4/4] parse: separate out parsing functions from config.h Jonathan Tan
                     ` (5 subsequent siblings)
  8 siblings, 1 reply; 111+ messages in thread
From: Jonathan Tan @ 2023-09-29 21:20 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, phillip.wood123, Junio C Hamano, Jonathan Tan

From: Calvin Wan <calvinwan@google.com>

An incorrectly defined boolean environment value would result in the
following error message:

bad boolean config value '%s' for '%s'

This is misleading because an environment value is not a config value.
Instead of calling git_config_bool() to parse the environment value,
mimic the functionality inside git_config_bool(), but with the correct
error message.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
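
The behaviour described above can be tried outside of Git with the
following standalone sketch. It is an approximation under stated
assumptions: the sketch_* names are hypothetical, numeric values are
parsed with plain strtol() (so unit suffixes are not accepted here), and
an error is reported through the return value instead of calling die().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>

/* 1 or 0 for a recognised spelling, -1 for anything else. */
static int sketch_parse_maybe_bool(const char *value)
{
	char *end;
	long n;

	if (!*value)
		return 0;
	if (!strcasecmp(value, "true") ||
	    !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on"))
		return 1;
	if (!strcasecmp(value, "false") ||
	    !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off"))
		return 0;

	/* Fall back to an integer: zero is false, non-zero is true. */
	errno = 0;
	n = strtol(value, &end, 0);
	if (errno == ERANGE || end == value || *end)
		return -1;
	return !!n;
}

/*
 * Store the boolean value of environment variable `k` in `*result`,
 * or `def` if it is unset. Returns 0 on success and -1 on a bad value,
 * after printing a message that names it an environment value.
 */
static int sketch_env_bool(const char *k, int def, int *result)
{
	const char *v = getenv(k);
	int val;

	if (!v) {
		*result = def;
		return 0;
	}
	val = sketch_parse_maybe_bool(v);
	if (val < 0) {
		fprintf(stderr,
			"fatal: bad boolean environment value '%s' for '%s'\n",
			v, k);
		return -1;
	}
	*result = val;
	return 0;
}
```

The point of the patch is visible in the error path: the message talks
about an environment value, and the parser used to get there does not
need anything from the config machinery.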
 config.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 3846a37be9..7dde0aaa02 100644
--- a/config.c
+++ b/config.c
@@ -2133,7 +2133,14 @@ void git_global_config(char **user_out, char **xdg_out)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
 }
 
 /*
-- 
2.42.0.582.g8ccd20d70d-goog


* [PATCH v4 4/4] parse: separate out parsing functions from config.h
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
                     ` (2 preceding siblings ...)
  2023-09-29 21:20   ` [PATCH v4 3/4] config: correct bad boolean env value error message Jonathan Tan
@ 2023-09-29 21:20   ` Jonathan Tan
  2023-10-10 10:00     ` phillip.wood123
  2023-10-10 10:05   ` [PATCH v4 0/4] Preliminary patches before git-std-lib phillip.wood123
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 111+ messages in thread
From: Jonathan Tan @ 2023-09-29 21:20 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, phillip.wood123, Junio C Hamano, Jonathan Tan

From: Calvin Wan <calvinwan@google.com>

The files config.{h,c} contain functions that have to do with parsing,
but not config.

In order to further reduce all-in-one headers, separate out functions in
config.c that do not operate on config into their own file, parse.h,
and update the include directives in the .c files that need only such
functions accordingly.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
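
As a quick illustration of what kind of code is being moved, here is a
standalone sketch of unit-suffix integer parsing in the style of
git_parse_unsigned(). The sketch_* names are hypothetical, and the
overflow check is written as a division rather than with
unsigned_mult_overflows(), so treat it as an approximation of the moved
code and not as a copy of it.

```c
#include <errno.h>
#include <inttypes.h>
#include <string.h>
#include <strings.h>

/* Multiplier for an optional k/m/g suffix; 0 means "unknown suffix". */
static uintmax_t sketch_unit_factor(const char *end)
{
	if (!*end)
		return 1;
	if (!strcasecmp(end, "k"))
		return 1024;
	if (!strcasecmp(end, "m"))
		return 1024 * 1024;
	if (!strcasecmp(end, "g"))
		return 1024 * 1024 * 1024;
	return 0;
}

/*
 * Parse `value` as an unsigned integer with an optional unit suffix.
 * Returns 1 and stores the result on success; returns 0 and sets errno
 * to EINVAL (malformed input) or ERANGE (result above `max`) otherwise.
 */
static int sketch_parse_unsigned(const char *value, uintmax_t *ret,
				 uintmax_t max)
{
	char *end;
	uintmax_t val, factor;

	/* strtoumax() would silently accept a leading minus sign. */
	if (!value || !*value || strchr(value, '-')) {
		errno = EINVAL;
		return 0;
	}
	errno = 0;
	val = strtoumax(value, &end, 0);
	if (errno == ERANGE)
		return 0;
	if (end == value) {
		errno = EINVAL;
		return 0;
	}
	factor = sketch_unit_factor(end);
	if (!factor) {
		errno = EINVAL;
		return 0;
	}
	/* val * factor > max, checked without performing the multiply. */
	if (val > max / factor) {
		errno = ERANGE;
		return 0;
	}
	*ret = val * factor;
	return 1;
}
```

Nothing in this sketch needs a repository, a config set, or any other
Git data structure, which is the argument for giving such functions a
header of their own.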
 Makefile                   |   1 +
 attr.c                     |   2 +-
 config.c                   | 180 +-----------------------------------
 config.h                   |  14 +--
 pack-objects.c             |   2 +-
 pack-revindex.c            |   2 +-
 parse-options.c            |   3 +-
 parse.c                    | 182 +++++++++++++++++++++++++++++++++++++
 parse.h                    |  20 ++++
 pathspec.c                 |   2 +-
 preload-index.c            |   2 +-
 progress.c                 |   2 +-
 prompt.c                   |   2 +-
 rebase.c                   |   2 +-
 t/helper/test-env-helper.c |   2 +-
 unpack-trees.c             |   2 +-
 wrapper.c                  |   2 +-
 write-or-die.c             |   2 +-
 18 files changed, 219 insertions(+), 205 deletions(-)
 create mode 100644 parse.c
 create mode 100644 parse.h

diff --git a/Makefile b/Makefile
index 861e643708..9226c719a0 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index 71c84fbcf8..3c0b4fb3d9 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
  */
 
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "exec-cmd.h"
 #include "attr.h"
diff --git a/config.c b/config.c
index 7dde0aaa02..c7bc21a25d 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
 #include "date.h"
 #include "branch.h"
 #include "config.h"
+#include "parse.h"
 #include "convert.h"
 #include "environment.h"
 #include "gettext.h"
@@ -1165,129 +1166,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
 	return error_return;
 }
 
-static uintmax_t get_unit_factor(const char *end)
-{
-	if (!*end)
-		return 1;
-	else if (!strcasecmp(end, "k"))
-		return 1024;
-	else if (!strcasecmp(end, "m"))
-		return 1024 * 1024;
-	else if (!strcasecmp(end, "g"))
-		return 1024 * 1024 * 1024;
-	return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		intmax_t val;
-		intmax_t factor;
-
-		if (max < 0)
-			BUG("max must be a positive integer");
-
-		errno = 0;
-		val = strtoimax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if ((val < 0 && -max / factor > val) ||
-		    (val > 0 && max / factor < val)) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
-	if (value && *value) {
-		char *end;
-		uintmax_t val;
-		uintmax_t factor;
-
-		/* negative values would be accepted by strtoumax */
-		if (strchr(value, '-')) {
-			errno = EINVAL;
-			return 0;
-		}
-		errno = 0;
-		val = strtoumax(value, &end, 0);
-		if (errno == ERANGE)
-			return 0;
-		if (end == value) {
-			errno = EINVAL;
-			return 0;
-		}
-		factor = get_unit_factor(end);
-		if (!factor) {
-			errno = EINVAL;
-			return 0;
-		}
-		if (unsigned_mult_overflows(factor, val) ||
-		    factor * val > max) {
-			errno = ERANGE;
-			return 0;
-		}
-		val *= factor;
-		*ret = val;
-		return 1;
-	}
-	errno = EINVAL;
-	return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
-	uintmax_t tmp;
-	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
-	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
-		return 0;
-	*ret = tmp;
-	return 1;
-}
-
 NORETURN
 static void die_bad_number(const char *name, const char *value,
 			   const struct key_value_info *kvi)
@@ -1363,23 +1241,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value,
 	return ret;
 }
 
-static int git_parse_maybe_bool_text(const char *value)
-{
-	if (!value)
-		return 1;
-	if (!*value)
-		return 0;
-	if (!strcasecmp(value, "true")
-	    || !strcasecmp(value, "yes")
-	    || !strcasecmp(value, "on"))
-		return 1;
-	if (!strcasecmp(value, "false")
-	    || !strcasecmp(value, "no")
-	    || !strcasecmp(value, "off"))
-		return 0;
-	return -1;
-}
-
 static const struct fsync_component_name {
 	const char *name;
 	enum fsync_component component_bits;
@@ -1454,16 +1315,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
 	return (current & ~negative) | positive;
 }
 
-int git_parse_maybe_bool(const char *value)
-{
-	int v = git_parse_maybe_bool_text(value);
-	if (0 <= v)
-		return v;
-	if (git_parse_int(value, &v))
-		return !!v;
-	return -1;
-}
-
 int git_config_bool_or_int(const char *name, const char *value,
 			   const struct key_value_info *kvi, int *is_bool)
 {
@@ -2126,35 +1977,6 @@ void git_global_config(char **user_out, char **xdg_out)
 	*xdg_out = xdg_config;
 }
 
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
-	const char *v = getenv(k);
-	int val;
-	if (!v)
-		return def;
-	val = git_parse_maybe_bool(v);
-	if (val < 0)
-		die(_("bad boolean environment value '%s' for '%s'"),
-		    v, k);
-	return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
-	const char *v = getenv(k);
-	if (v && !git_parse_ulong(v, &val))
-		die(_("failed to parse %s"), k);
-	return val;
-}
-
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 6332d74904..14f881ecfa 100644
--- a/config.h
+++ b/config.h
@@ -4,7 +4,7 @@
 #include "hashmap.h"
 #include "string-list.h"
 #include "repository.h"
-
+#include "parse.h"
 
 /**
  * The config API gives callers a way to access Git configuration files
@@ -243,16 +243,6 @@ int config_with_options(config_fn_t fn, void *,
  * The following helper functions aid in parsing string values
  */
 
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
 /**
  * Parse the string to an integer, including unit factors. Dies on error;
  * otherwise, returns the parsed result.
@@ -385,8 +375,6 @@ int git_config_rename_section(const char *, const char *);
 int git_config_rename_section_in_file(const char *, const char *, const char *);
 int git_config_copy_section(const char *, const char *);
 int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
 int git_config_system(void);
 int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
 #include "pack.h"
 #include "pack-objects.h"
 #include "packfile.h"
-#include "config.h"
+#include "parse.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 					 const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
 #include "packfile.h"
 #include "strbuf.h"
 #include "trace2.h"
-#include "config.h"
+#include "parse.h"
 #include "midx.h"
 #include "csum-file.h"
 
diff --git a/parse-options.c b/parse-options.c
index e8e076c3a6..093eaf2db8 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
 #include "git-compat-util.h"
 #include "parse-options.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "commit.h"
 #include "color.h"
 #include "gettext.h"
 #include "strbuf.h"
+#include "string-list.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+	if (!*end)
+		return 1;
+	else if (!strcasecmp(end, "k"))
+		return 1024;
+	else if (!strcasecmp(end, "m"))
+		return 1024 * 1024;
+	else if (!strcasecmp(end, "g"))
+		return 1024 * 1024 * 1024;
+	return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		intmax_t val;
+		intmax_t factor;
+
+		if (max < 0)
+			BUG("max must be a positive integer");
+
+		errno = 0;
+		val = strtoimax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if ((val < 0 && -max / factor > val) ||
+		    (val > 0 && max / factor < val)) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+	if (value && *value) {
+		char *end;
+		uintmax_t val;
+		uintmax_t factor;
+
+		/* negative values would be accepted by strtoumax */
+		if (strchr(value, '-')) {
+			errno = EINVAL;
+			return 0;
+		}
+		errno = 0;
+		val = strtoumax(value, &end, 0);
+		if (errno == ERANGE)
+			return 0;
+		if (end == value) {
+			errno = EINVAL;
+			return 0;
+		}
+		factor = get_unit_factor(end);
+		if (!factor) {
+			errno = EINVAL;
+			return 0;
+		}
+		if (unsigned_mult_overflows(factor, val) ||
+		    factor * val > max) {
+			errno = ERANGE;
+			return 0;
+		}
+		val *= factor;
+		*ret = val;
+		return 1;
+	}
+	errno = EINVAL;
+	return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+	uintmax_t tmp;
+	if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+	intmax_t tmp;
+	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+		return 0;
+	*ret = tmp;
+	return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+	if (!value)
+		return 1;
+	if (!*value)
+		return 0;
+	if (!strcasecmp(value, "true")
+	    || !strcasecmp(value, "yes")
+	    || !strcasecmp(value, "on"))
+		return 1;
+	if (!strcasecmp(value, "false")
+	    || !strcasecmp(value, "no")
+	    || !strcasecmp(value, "off"))
+		return 0;
+	return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+	int v = git_parse_maybe_bool_text(value);
+	if (0 <= v)
+		return v;
+	if (git_parse_int(value, &v))
+		return !!v;
+	return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+	const char *v = getenv(k);
+	int val;
+	if (!v)
+		return def;
+	val = git_parse_maybe_bool(v);
+	if (val < 0)
+		die(_("bad boolean environment value '%s' for '%s'"),
+		    v, k);
+	return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && !git_parse_ulong(v, &val))
+		die(_("failed to parse %s"), k);
+	return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 3a3a5724c4..7f88f1c02b 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
 #include "environment.h"
 #include "fsmonitor.h"
 #include "gettext.h"
-#include "config.h"
+#include "parse.h"
 #include "preload-index.h"
 #include "progress.h"
 #include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
 #include "trace.h"
 #include "trace2.h"
 #include "utf8.h"
-#include "config.h"
+#include "parse.h"
 
 #define TP_IDX_MAX      8
 
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "environment.h"
 #include "run-command.h"
 #include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
 #include "git-compat-util.h"
 #include "rebase.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 
 /*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
 #include "test-tool.h"
-#include "config.h"
+#include "parse.h"
 #include "parse-options.h"
 
 static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
 #include "advice.h"
 #include "strvec.h"
 #include "repository.h"
-#include "config.h"
+#include "parse.h"
 #include "dir.h"
 #include "environment.h"
 #include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 453a20ed99..7da15a56da 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
  */
 #include "git-compat-util.h"
 #include "abspath.h"
-#include "config.h"
+#include "parse.h"
 #include "gettext.h"
 #include "repository.h"
 #include "strbuf.h"
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
 #include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
 #include "run-command.h"
 #include "write-or-die.h"
 
-- 
2.42.0.582.g8ccd20d70d-goog



* Re: [PATCH v4 3/4] config: correct bad boolean env value error message
  2023-09-29 21:20   ` [PATCH v4 3/4] config: correct bad boolean env value error message Jonathan Tan
@ 2023-09-29 23:03     ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-09-29 23:03 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Calvin Wan, phillip.wood123

Jonathan Tan <jonathantanmy@google.com> writes:

> From: Calvin Wan <calvinwan@google.com>
>
> An incorrectly defined boolean environment value would result in the
> following error message:
>
> bad boolean config value '%s' for '%s'
>
> This is a misnomer since environment value != config value. Instead of
> calling git_config_bool() to parse the environment value, mimic the
> functionality inside of git_config_bool() but with the correct error
> message.
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  config.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Makes sense.

I briefly wondered if there are ways to share more code, but this
seems to be the best we can do.  The duplication is not too bad to
begin with anyway.

Looking good.  Will queue.


> diff --git a/config.c b/config.c
> index 3846a37be9..7dde0aaa02 100644
> --- a/config.c
> +++ b/config.c
> @@ -2133,7 +2133,14 @@ void git_global_config(char **user_out, char **xdg_out)
>  int git_env_bool(const char *k, int def)
>  {
>  	const char *v = getenv(k);
> -	return v ? git_config_bool(k, v) : def;
> +	int val;
> +	if (!v)
> +		return def;
> +	val = git_parse_maybe_bool(v);
> +	if (val < 0)
> +		die(_("bad boolean environment value '%s' for '%s'"),
> +		    v, k);
> +	return val;
>  }
>  
>  /*

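For anyone reading along, here is a rough standalone sketch of the
behaviour the quoted hunk describes. It is not code from the patch:
sketch_parse_maybe_bool() and sketch_env_bool() are hypothetical
stand-ins, the integer fallback ignores unit suffixes, and the error is
written into a buffer instead of calling die() so that the wording can
be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>

/* Simplified stand-in for git_parse_maybe_bool(): 1, 0, or -1 if unparsable. */
static int sketch_parse_maybe_bool(const char *value)
{
	char *end;
	long n;

	if (!*value)
		return 0;
	if (!strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on"))
		return 1;
	if (!strcasecmp(value, "false") || !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off"))
		return 0;
	n = strtol(value, &end, 0); /* no unit suffixes in this sketch */
	if (end == value || *end)
		return -1;
	return !!n;
}

/*
 * Stand-in for git_env_bool(). Where Git would die(), this returns -1 and
 * writes the message into 'err'; 'err' stays empty on success.
 */
static int sketch_env_bool(const char *k, int def, char *err, size_t errlen)
{
	const char *v;
	int val;

	assert(k && err && errlen);
	err[0] = '\0';
	v = getenv(k);
	if (!v)
		return def; /* variable missing: use the default */
	val = sketch_parse_maybe_bool(v);
	if (val < 0)
		snprintf(err, errlen,
			 "bad boolean environment value '%s' for '%s'", v, k);
	return val;
}
```

The point of the change is visible in the message text: it names the
environment variable and its value and no longer talks about config.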

* Re: [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn()
  2023-09-29 21:20   ` [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn() Jonathan Tan
@ 2023-10-10  9:59     ` phillip.wood123
  2023-10-10 16:13       ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: phillip.wood123 @ 2023-10-10  9:59 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Calvin Wan, Junio C Hamano

Hi Jonathan

On 29/09/2023 22:20, Jonathan Tan wrote:
> From: Calvin Wan <calvinwan@google.com>
> 
> remove_or_warn() is only used by entry.c and apply.c, but it is
> currently declared and defined in wrapper.{h,c}, so it has a scope much
> greater than it needs. This needlessly large scope also causes wrapper.c
> to need to include object.h, when this file is largely unconcerned with
> Git objects.
> 
> Move remove_or_warn() to entry.{h,c}. The file apply.c still has access
> to it, since it already includes entry.h for another reason.

This looks good. On a related note, wrapper.c includes repository.h but
does not use anything declared in that header.

Best Wishes

Phillip


* Re: [PATCH v4 4/4] parse: separate out parsing functions from config.h
  2023-09-29 21:20   ` [PATCH v4 4/4] parse: separate out parsing functions from config.h Jonathan Tan
@ 2023-10-10 10:00     ` phillip.wood123
  2023-10-10 17:43       ` Jonathan Tan
  0 siblings, 1 reply; 111+ messages in thread
From: phillip.wood123 @ 2023-10-10 10:00 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Calvin Wan, Junio C Hamano

Hi Jonathan

On 29/09/2023 22:20, Jonathan Tan wrote:
> diff --git a/parse.h b/parse.h
> new file mode 100644
> index 0000000000..07d2193d69
> --- /dev/null
> +++ b/parse.h
> @@ -0,0 +1,20 @@
> +#ifndef PARSE_H
> +#define PARSE_H
> +
> +int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);

Previously this function was private to config.c, now it needs to be 
public because it is still called by 
git_config_get_expiry_date_in_days(). As this is essentially an internal 
helper for git_parse_int() and friends it is a bit unfortunate that it 
is now public. Perhaps we should change 
git_config_get_expiry_date_in_days() to call git_parse_int() instead.
Then we can keep git_parse_signed() and git_parse_unsigned() private to 
parse.c.

> +int git_parse_ssize_t(const char *, ssize_t *);
> +int git_parse_ulong(const char *, unsigned long *);
> +int git_parse_int(const char *value, int *ret);
> +int git_parse_int64(const char *value, int64_t *ret);

This was previously private but I think it makes sense for it to be 
publicly available.

> +/**
> + * Same as `git_config_bool`, except that it returns -1 on error rather
> + * than dying.
> + */
> +int git_parse_maybe_bool(const char *);
> +int git_parse_maybe_bool_text(const char *value);

This used to be private to config.c and now has callers in parse.c and 
config.c. We should make it clear that non-config code is likely to want 
git_parse_maybe_bool() rather than this function.

Best Wishes

Phillip


* Re: [PATCH v4 0/4] Preliminary patches before git-std-lib
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
                     ` (3 preceding siblings ...)
  2023-09-29 21:20   ` [PATCH v4 4/4] parse: separate out parsing functions from config.h Jonathan Tan
@ 2023-10-10 10:05   ` phillip.wood123
  2023-10-10 16:21     ` Jonathan Tan
  2024-02-22 17:50   ` [PATCH v5 0/3] Introduce Git Standard Library Calvin Wan
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 111+ messages in thread
From: phillip.wood123 @ 2023-10-10 10:05 UTC (permalink / raw)
  To: Jonathan Tan, git; +Cc: Calvin Wan, Junio C Hamano

Hi Jonathan

On 29/09/2023 22:20, Jonathan Tan wrote:
> Calvin will be away for a few weeks and I'll be handling the git-std-lib
> effort in the meantime. My goals will be:
> 
> - Get the preliminary patches in Calvin's patch set (patches 1-4) merged
> first.
> 
> - Updating patches 5-6 based on reviewer feedback (including my
> feedback). I have several aims including reducing or eliminating the
> need for the GIT_STD_LIB preprocessor variable, and making stubs a test-
> only concern (I think Phillip has some similar ideas [1] but I haven't
> looked at their repo on GitHub yet).

It sounds like we're thinking along similar lines. Do feel free to get in
touch on or off the list if you want to ask anything about the patches
I pushed to GitHub.

> [1] https://lore.kernel.org/git/98f3edcf-7f37-45ff-abd2-c0038d4e0589@gmail.com/
> 
> This patch set is in service of the first goal. Because the libification
> patches are no longer included in this patch set, I have rewritten the
> commit messages to justify the patches in terms of code organization.
> There are no changes in the code itself. Also, I have retained Calvin's
> name as the author.

I agree it makes sense to get the preliminary patches merged on their 
own. I think the argument that they reduce the scope of includes is a 
reasonable justification on its own. I've left a couple of comments but 
they're looking pretty good.

Best Wishes

Phillip


* Re: [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn()
  2023-10-10  9:59     ` phillip.wood123
@ 2023-10-10 16:13       ` Junio C Hamano
  2023-10-10 17:38         ` Jonathan Tan
  0 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2023-10-10 16:13 UTC (permalink / raw)
  To: phillip.wood123; +Cc: Jonathan Tan, git, Calvin Wan

phillip.wood123@gmail.com writes:

> Hi Jonathan
>
> On 29/09/2023 22:20, Jonathan Tan wrote:
>> From: Calvin Wan <calvinwan@google.com>
>> remove_or_warn() is only used by entry.c and apply.c, but it is
>> currently declared and defined in wrapper.{h,c}, so it has a scope much
>> greater than it needs. This needlessly large scope also causes wrapper.c
>> to need to include object.h, when this file is largely unconcerned with
>> Git objects.
>> Move remove_or_warn() to entry.{h,c}. The file apply.c still has
>> access
>> to it, since it already includes entry.h for another reason.
>
> This looks good. On a related note wrapper.c includes repository.h but
> does not use anything declared in that header.
>
> Best Wishes
>
> Phillip

Thanks for a review.  I just checked 'master', 'next', and 'seen'
and in all '#include <repository.h>' can safely be dropped from
there, it seems.  It may be too trivial even for a microproject,
but nevertheless a nice clean-up.



* Re: [PATCH v4 0/4] Preliminary patches before git-std-lib
  2023-10-10 10:05   ` [PATCH v4 0/4] Preliminary patches before git-std-lib phillip.wood123
@ 2023-10-10 16:21     ` Jonathan Tan
  0 siblings, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-10-10 16:21 UTC (permalink / raw)
  To: phillip.wood123; +Cc: Jonathan Tan, git, Calvin Wan, Junio C Hamano

phillip.wood123@gmail.com writes:
> Hi Jonathan
> 
> On 29/09/2023 22:20, Jonathan Tan wrote:
> > Calvin will be away for a few weeks and I'll be handling the git-std-lib
> > effort in the meantime. My goals will be:
> > 
> > - Get the preliminary patches in Calvin's patch set (patches 1-4) merged
> > first.
> > 
> > - Updating patches 5-6 based on reviewer feedback (including my
> > feedback). I have several aims including reducing or eliminating the
> > need for the GIT_STD_LIB preprocessor variable, and making stubs a test-
> > only concern (I think Phillip has some similar ideas [1] but I haven't
> > looked at their repo on GitHub yet).
> 
> It sounds like we're thinking along similar lines, do feel free to get in
> touch on or off the list if you want to ask anything about those patches 
> I pushed to github.

Thanks. I'm updating patches 5-6 now and basing on your work, in fact.

> > [1] https://lore.kernel.org/git/98f3edcf-7f37-45ff-abd2-c0038d4e0589@gmail.com/
> > 
> > This patch set is in service of the first goal. Because the libification
> > patches are no longer included in this patch set, I have rewritten the
> > commit messages to justify the patches in terms of code organization.
> > There are no changes in the code itself. Also, I have retained Calvin's
> > name as the author.
> 
> I agree it makes sense to get the preliminary patches merged on their 
> own. I think the argument that they reduce the scope of includes is a 
> reasonable justification on its own. I've left a couple of comments but 
> they're looking pretty good.
> 
> Best Wishes
> 
> Phillip

Thanks.


* Re: [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn()
  2023-10-10 16:13       ` Junio C Hamano
@ 2023-10-10 17:38         ` Jonathan Tan
  0 siblings, 0 replies; 111+ messages in thread
From: Jonathan Tan @ 2023-10-10 17:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, phillip.wood123, git, Calvin Wan

Junio C Hamano <gitster@pobox.com> writes:
> phillip.wood123@gmail.com writes:
> 
> > Hi Jonathan
> >
> > On 29/09/2023 22:20, Jonathan Tan wrote:
> >> From: Calvin Wan <calvinwan@google.com>
> >> remove_or_warn() is only used by entry.c and apply.c, but it is
> >> currently declared and defined in wrapper.{h,c}, so it has a scope much
> >> greater than it needs. This needlessly large scope also causes wrapper.c
> >> to need to include object.h, when this file is largely unconcerned with
> >> Git objects.
> >> Move remove_or_warn() to entry.{h,c}. The file apply.c still has
> >> access
> >> to it, since it already includes entry.h for another reason.
> >
> > This looks good. On a related note wrapper.c includes repository.h but
> > does not use anything declared in that header.
> >
> > Best Wishes
> >
> > Phillip
> 
> Thanks for a review.  I just checked 'master', 'next', and 'seen'
> and in all '#include <repository.h>' can safely be dropped from
> there, it seems.  It may be too trivial even for a microproject,
> but nevertheless a nice clean-up.

Ah, Calvin fixed this in one of the subsequent patches, but I'll put it
into its own patch in my updated version.


* Re: [PATCH v4 4/4] parse: separate out parsing functions from config.h
  2023-10-10 10:00     ` phillip.wood123
@ 2023-10-10 17:43       ` Jonathan Tan
  2023-10-10 17:58         ` Phillip Wood
  0 siblings, 1 reply; 111+ messages in thread
From: Jonathan Tan @ 2023-10-10 17:43 UTC (permalink / raw)
  To: phillip.wood123; +Cc: Jonathan Tan, git, Calvin Wan, Junio C Hamano

phillip.wood123@gmail.com writes:
> Hi Jonathan
> 
> On 29/09/2023 22:20, Jonathan Tan wrote:
> > diff --git a/parse.h b/parse.h
> > new file mode 100644
> > index 0000000000..07d2193d69
> > --- /dev/null
> > +++ b/parse.h
> > @@ -0,0 +1,20 @@
> > +#ifndef PARSE_H
> > +#define PARSE_H
> > +
> > +int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
> 
> Previously this function was private to config.c, now it needs to be 
> public because it is still called by 
> git_config_get_expiry_date_in_days(). As this is essentially an internal 
> helper for git_parse_int() and friends it is a bit unfortunate that it 
> is now public. Perhaps we should change 
> git_config_get_expiry_date_in_days() to call git_parse_int() instead.
> Then we can keep git_parse_signed() and git_parse_unsigned() private to 
> parse.c.

It could be argued also that it fits in with the rest of
the parsing functions - this one parses intmax, and we have
others of various signedness and size. I'm open to changing
git_config_get_expiry_date_in_days() too, though...we probably don't
need so many days.

> > +/**
> > + * Same as `git_config_bool`, except that it returns -1 on error rather
> > + * than dying.
> > + */
> > +int git_parse_maybe_bool(const char *);
> > +int git_parse_maybe_bool_text(const char *value);
> 
> This used to be private to config.c and now has callers in parse.c and 
> config.c. We should make it clear that non-config code is likely to want 
> git_parse_maybe_bool() rather than this function.
> 
> Best Wishes
> 
> Phillip

The difference between these 2 functions here is that bool_text supports
only the textual forms (used, for example, in git_config_bool_or_int()
which accepts both boolean strings and integers), which might be useful
elsewhere too. But it could be better documented, yes.

Looking at "What's Cooking", this series is about to be merged to
master. We could hold off merging that, but I think we don't need to
- it could be argued that git_parse_maybe_bool_text() could be better
documented, but even if we wrote it from scratch, I would probably put
the extra documentation in its own patch anyway (so one patch for moving
the code, and another for adding documentation).

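To make the distinction between the two helpers concrete, here is a
small standalone sketch. The sketch_* names are hypothetical, the text
matching follows the function bodies shown in the patch, and the
integer fallback is simplified to plain strtol() (the real code goes
through git_parse_int(), which this sketch does not reproduce).

```c
#include <assert.h>
#include <stdlib.h>
#include <strings.h>

/* Textual spellings only: returns 1, 0, or -1 if 'value' is not one of them. */
static int sketch_maybe_bool_text(const char *value)
{
	if (!value)
		return 1;
	if (!*value)
		return 0;
	if (!strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	    !strcasecmp(value, "on"))
		return 1;
	if (!strcasecmp(value, "false") || !strcasecmp(value, "no") ||
	    !strcasecmp(value, "off"))
		return 0;
	return -1;
}

/* Textual spellings first, then fall back to treating the value as an integer. */
static int sketch_maybe_bool(const char *value)
{
	int v = sketch_maybe_bool_text(value);
	char *end;
	long n;

	if (0 <= v)
		return v;
	/* 'value' is non-NULL and non-empty here, or we would have returned above. */
	n = strtol(value, &end, 0);
	if (end == value || *end)
		return -1;
	return !!n;
}

/* Usage: the two helpers agree on text and differ only on numeric input. */
static void sketch_demo(void)
{
	assert(sketch_maybe_bool_text("on") == 1);
	assert(sketch_maybe_bool("on") == 1);
	assert(sketch_maybe_bool_text("1") == -1); /* not a textual spelling */
	assert(sketch_maybe_bool("1") == 1);       /* integer fallback */
	assert(sketch_maybe_bool("maybe") == -1);  /* neither text nor integer */
}
```

A caller that must tell "1" apart from "true" wants the _text variant;
most other callers probably want the combined one.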

* Re: [PATCH v4 4/4] parse: separate out parsing functions from config.h
  2023-10-10 17:43       ` Jonathan Tan
@ 2023-10-10 17:58         ` Phillip Wood
  2023-10-10 20:57           ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Phillip Wood @ 2023-10-10 17:58 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Calvin Wan, Junio C Hamano

On 10/10/2023 18:43, Jonathan Tan wrote:
> phillip.wood123@gmail.com writes:
>> Hi Jonathan
>>
>> On 29/09/2023 22:20, Jonathan Tan wrote:
>>> diff --git a/parse.h b/parse.h
>>> new file mode 100644
>>> index 0000000000..07d2193d69
>>> --- /dev/null
>>> +++ b/parse.h
>>> @@ -0,0 +1,20 @@
>>> +#ifndef PARSE_H
>>> +#define PARSE_H
>>> +
>>> +int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
>>
>> Previously this function was private to config.c, now it needs to be
>> public because it is still called by
>> git_config_get_expiry_date_in_days(). As this is essentially an internal
>> helper for git_parse_int() and friends it is a bit unfortunate that it
>> is now public. Perhaps we should change
>> git_config_get_expiry_date_in_days() to call git_parse_int() instead.
>> Then we can keep git_parse_signed() and git_parse_unsigned() private to
>> parse.c.
> 
> It could be argued also that it fits in with the rest of
> the parsing functions - this one parses intmax, and we have
> others of various signedness and size.

This one differs from the others because it expects the caller to pass a 
maximum value, the intmax_t equivalent to git_parse_int() would be

int git_parse_intmax(const char*, intmax_t*);

We now expose git_parse_int64() which covers a similar case.

> I'm open to changing
> git_config_get_expiry_date_in_days() too, though...we probably don't
> need so many days.

Indeed, the existing code passes maximum_signed_value_of_type(int) as 
the third argument to limit it to INT_MAX already.

>>> +/**
>>> + * Same as `git_config_bool`, except that it returns -1 on error rather
>>> + * than dying.
>>> + */
>>> +int git_parse_maybe_bool(const char *);
>>> +int git_parse_maybe_bool_text(const char *value);
>>
>> This used to be private to config.c and now has callers in parse.c and
>> config.c. We should make it clear that non-config code is likely to want
>> git_parse_maybe_bool() rather than this function.
>>
>> Best Wishes
>>
>> Phillip
> 
> The difference between these 2 functions here is that bool_text supports
> only the textual forms (used, for example, in git_config_bool_or_int()
> which accepts both boolean strings and integers), which might be useful
> elsewhere too. But it could be better documented, yes.
> 
> Looking at "What's Cooking", this series is about to be merged to
> master. We could hold off merging that, but I think we don't need to
> - it could be argued that git_parse_maybe_bool_text() could be better
> documented, but even if we wrote it from scratch, I would probably put
> the extra documentation in its own patch anyway (so one patch for moving
> the code, and another for adding documentation).

I agree it's not worth re-rolling just to add some documentation here.

Best Wishes

Phillip


* Re: [PATCH v4 4/4] parse: separate out parsing functions from config.h
  2023-10-10 17:58         ` Phillip Wood
@ 2023-10-10 20:57           ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2023-10-10 20:57 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Jonathan Tan, git, Calvin Wan

Phillip Wood <phillip.wood123@gmail.com> writes:

>> I'm open to changing
>> git_config_get_expiry_date_in_days() too, though...we probably don't
>> need so many days.
>
> Indeed, the existing code passes maximum_signed_value_of_type(int) as
> the third argument to limit it to INT_MAX already.

Yeah, in other words, the current implementation does not allow us
to express more days than an int can hold, so it is a no-brainer to
switch to git_parse_int().  Allowing a longer expiration period
is obviously outside the scope of this series.

Thanks.

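As a rough illustration of why the switch loses nothing, the sketch
below shows a bounded signed parse called with INT_MAX next to an
int-returning wrapper around it. Both functions are hypothetical,
simplified stand-ins (no unit suffixes, trailing characters rejected),
not the code in parse.c.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <limits.h>

/*
 * Simplified stand-in for git_parse_signed(): parse a signed integer and
 * reject anything outside [-max, max].
 */
static int sketch_parse_signed(const char *value, intmax_t *ret, intmax_t max)
{
	char *end;
	intmax_t val;

	assert(max >= 0);
	if (!value || !*value) {
		errno = EINVAL;
		return 0;
	}
	errno = 0;
	val = strtoimax(value, &end, 0);
	if (errno == ERANGE)
		return 0;
	if (end == value || *end) {
		errno = EINVAL;
		return 0;
	}
	if (val > max || val < -max) {
		errno = ERANGE;
		return 0;
	}
	*ret = val;
	return 1;
}

/* Simplified stand-in for git_parse_int(): the same parse, capped at INT_MAX. */
static int sketch_parse_int(const char *value, int *ret)
{
	intmax_t tmp;

	if (!sketch_parse_signed(value, &tmp, INT_MAX))
		return 0;
	*ret = (int)tmp;
	return 1;
}
```

With the cap fixed at INT_MAX, both entry points accept and reject the
same inputs; the wrapper only narrows the result type.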

* Re: [PATCH v4 1/4] hex-ll: separate out non-hash-algo functions
  2023-09-29 21:20   ` [PATCH v4 1/4] hex-ll: separate out non-hash-algo functions Jonathan Tan
@ 2023-10-21  4:14     ` Linus Arver
  0 siblings, 0 replies; 111+ messages in thread
From: Linus Arver @ 2023-10-21  4:14 UTC (permalink / raw)
  To: Jonathan Tan, git
  Cc: Calvin Wan, phillip.wood123, Junio C Hamano, Jonathan Tan

Jonathan Tan <jonathantanmy@google.com> writes:

> From: Calvin Wan <calvinwan@google.com>
>
> In order to further reduce all-in-one headers, separate out functions in
> hex.h that do not operate on object hashes into its own file, hex-ll.h,

Nit: I was wondering what the "-ll" in "hex-ll.h" meant, then found
d1cbe1e6d8 (hash-ll.h: split out of hash.h to remove dependency on
repository.h, 2023-04-22) which seems to have set the precedent for this
naming style. Might be worth including here.


* [PATCH v5 0/3] Introduce Git Standard Library
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
                     ` (4 preceding siblings ...)
  2023-10-10 10:05   ` [PATCH v4 0/4] Preliminary patches before git-std-lib phillip.wood123
@ 2024-02-22 17:50   ` Calvin Wan
  2024-02-22 17:50   ` [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used Calvin Wan
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 111+ messages in thread
From: Calvin Wan @ 2024-02-22 17:50 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, Jonathan Tan, phillip.wood123, Junio C Hamano

While it has been a while since the last reroll of this series[1], the
contents and boundaries of git-std-lib have not changed. The focus of
this reroll is on improvements to the Makefile, test file, and
documentation. Patch 1 contains a small fix for a missing include
discovered by Jonathan Tan. Patch 2 introduces the Git Standard Library.
And patch 3 introduces preliminary testing and usage examples for the
Git Standard Library.

One important piece of feedback I received from the previous series is
that Git should be the first consumer of its own libraries. Objects in
git-std-lib.a are no longer contained in LIB_OBJS, but rather directly
built into git-std-lib.a and then linked into git.a. There is some
functionality that is used by git-std-lib.a that's not included in
git-std-lib.a, such as tracing support. These have been stubbed out into
git-stub-lib.a and can be built with git-std-lib.a to be used
externally. Thank you to Phillip Wood for these suggestions[2]!

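To illustrate the stubbing idea in the previous paragraph, here is a
minimal sketch with made-up names (sketch_trace_region() and
sketch_count_bytes() are not functions in this series). Library code is
compiled against a tracing declaration, and a no-op definition from a
stub archive satisfies the linker when the real tracing code is absent.
Everything sits in one file here only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Declaration the library code is compiled against. In a full build the
 * real tracing implementation would provide the definition.
 */
void sketch_trace_region(const char *category, const char *label);

/* Library-side code: calls into tracing without knowing who implements it. */
size_t sketch_count_bytes(const char *buf)
{
	assert(buf);
	sketch_trace_region("sketch", "count_bytes");
	return strlen(buf);
}

/*
 * Stub definition: does nothing, but lets external users link the library
 * without pulling in tracing. It would live in the separate stub archive.
 */
void sketch_trace_region(const char *category, const char *label)
{
	(void)category;
	(void)label;
}
```
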
The test file and Makefile have been updated to include cleanups
suggested by Junio. Since git-std-lib.a is now a dependency of Git, the
test file has also been included as part of the test suite. The series
has been rebased onto a recent version of `next` and function calls that
have been added/changed since the last reroll have also been included
into the test file.

Finally, through our libification syncs, there have been various topics
and questions brought up that would be better clarified with additional
documentation in technical/git-std-lib.txt.

[1]
https://lore.kernel.org/git/20230908174134.1026823-1-calvinwan@google.com/
[2]
https://lore.kernel.org/git/98f3edcf-7f37-45ff-abd2-c0038d4e0589@gmail.com/


Calvin Wan (2):
  git-std-lib: introduce Git Standard Library
  test-stdlib: show that git-std-lib is independent

Jonathan Tan (1):
  pager: include stdint.h because uintmax_t is used

 Documentation/Makefile                  |   1 +
 Documentation/technical/git-std-lib.txt | 170 +++++++++++++++
 Makefile                                |  71 +++++--
 pager.h                                 |   2 +
 strbuf.h                                |   2 +
 stubs/misc.c                            |  34 +++
 stubs/pager.c                           |   6 +
 stubs/trace2.c                          |  27 +++
 t/helper/.gitignore                     |   1 +
 t/helper/test-stdlib.c                  | 266 ++++++++++++++++++++++++
 t/t0082-std-lib.sh                      |  11 +
 11 files changed, 575 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 stubs/misc.c
 create mode 100644 stubs/pager.c
 create mode 100644 stubs/trace2.c
 create mode 100644 t/helper/test-stdlib.c
 create mode 100755 t/t0082-std-lib.sh

Range-diff against v4:
1:  2f99eb2ca4 < -:  ---------- hex-ll: split out functionality from hex
2:  7b2d123628 < -:  ---------- wrapper: remove dependency to Git-specific internal file
3:  b37beb206a < -:  ---------- config: correct bad boolean env value error message
4:  3a827cf45c < -:  ---------- parse: create new library for parsing strings and env values
5:  f8e4ac50a0 < -:  ---------- git-std-lib: introduce git standard library
6:  7840e1830a < -:  ---------- git-std-lib: add test file to call git-std-lib.a functions
-:  ---------- > 1:  57b751a497 pager: include stdint.h because uintmax_t is used
-:  ---------- > 2:  e64f3c73c2 git-std-lib: introduce Git Standard Library
-:  ---------- > 3:  e2d930f729 test-stdlib: show that git-std-lib is independent
-- 
2.44.0.rc0.258.g7320e95886-goog



* [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
                     ` (5 preceding siblings ...)
  2024-02-22 17:50   ` [PATCH v5 0/3] Introduce Git Standard Library Calvin Wan
@ 2024-02-22 17:50   ` Calvin Wan
  2024-02-22 21:43     ` Junio C Hamano
  2024-02-24  1:33     ` Kyle Lippincott
  2024-02-22 17:50   ` [PATCH v5 2/3] git-std-lib: introduce Git Standard Library Calvin Wan
  2024-02-22 17:50   ` [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent Calvin Wan
  8 siblings, 2 replies; 111+ messages in thread
From: Calvin Wan @ 2024-02-22 17:50 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, Calvin Wan, phillip.wood123, Junio C Hamano

From: Jonathan Tan <jonathantanmy@google.com>

pager.h uses uintmax_t but does not include stdint.h. Therefore, add
this include statement.

This was discovered when writing a stub pager.c file.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 pager.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/pager.h b/pager.h
index b77433026d..015bca95e3 100644
--- a/pager.h
+++ b/pager.h
@@ -1,6 +1,8 @@
 #ifndef PAGER_H
 #define PAGER_H
 
+#include <stdint.h>
+
 struct child_process;
 
 const char *git_pager(int stdout_is_tty);
-- 
2.44.0.rc0.258.g7320e95886-goog



* [PATCH v5 2/3] git-std-lib: introduce Git Standard Library
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
                     ` (6 preceding siblings ...)
  2024-02-22 17:50   ` [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used Calvin Wan
@ 2024-02-22 17:50   ` Calvin Wan
  2024-02-29 11:16     ` Phillip Wood
  2024-02-22 17:50   ` [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent Calvin Wan
  8 siblings, 1 reply; 111+ messages in thread
From: Calvin Wan @ 2024-02-22 17:50 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, Jonathan Tan, phillip.wood123, Junio C Hamano

This commit contains:
- Makefile rules for git-std-lib.a
- code and Makefile rules for git-stub-lib.a
- description and rationale of the above in Documentation/

Quoting from documentation introduced in this commit:

  The Git Standard Library intends to serve as the foundational library
  and root dependency that other libraries in Git will be built off
  of. That is to say, suppose we have libraries X and Y; a user that
  wants to use X and Y would need to include X, Y, and this Git Standard
  Library.

Code demonstrating the use of git-std-lib.a and git-stub-lib.a will be
in a subsequent commit.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/Makefile                  |   1 +
 Documentation/technical/git-std-lib.txt | 170 ++++++++++++++++++++++++
 Makefile                                |  48 +++++--
 stubs/misc.c                            |  33 +++++
 stubs/pager.c                           |   6 +
 stubs/trace2.c                          |  27 ++++
 6 files changed, 274 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/technical/git-std-lib.txt
 create mode 100644 stubs/misc.c
 create mode 100644 stubs/pager.c
 create mode 100644 stubs/trace2.c

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 3f2383a12c..f1dc673838 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -110,6 +110,7 @@ TECH_DOCS += SubmittingPatches
 TECH_DOCS += ToolsForGit
 TECH_DOCS += technical/bitmap-format
 TECH_DOCS += technical/bundle-uri
+TECH_DOCS += technical/git-std-lib
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/long-running-process-protocol
 TECH_DOCS += technical/multi-pack-index
diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..3d9aa121ac
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,170 @@
+= Git Standard Library
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one. Git itself is also built off of the Git Standard
+Library.
+
+== Dependency graph in libified Git
+
+Before the introduction of the Git Standard Library, all objects defined
+in the Git library are compiled and archived into a singular file,
+libgit.a, which is then linked against by common-main.o with other
+external dependencies and turned into the Git executable. In other
+words, the Git executable has dependencies on libgit.a and a couple of
+external libraries. The libification of Git slightly alters this build
+flow by separating out libgit.a into libgit.a and git-std-lib.a.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+	Git
+	 /\
+	/  \
+       /    \
+  libgit.a   ext deps
+
+We want to separate out potential libraries from libgit.a and have
+libgit.a depend on them, which would possibly look like:
+
+		Git
+		/\
+	       /  \
+	      /    \
+	  libgit.a  ext deps
+	     /\
+	    /  \
+	   /    \
+object-store.a  (other lib)
+      |        /
+      |       /
+      |      /
+      |     /
+      |    /
+      |   /
+      |  /
+git-std-lib.a
+
+Instead of containing all objects in Git, libgit.a would contain objects
+that are not built by libraries it links against. Consequently, if
+someone wanted a custom build of Git with a custom implementation of the
+object store, they would only have to swap out object-store.a rather
+than do a hard fork of Git.
+
+== Rationale behind Git Standard Library
+
+The rationale behind the selected object files in the Git Standard
+Library is the result of two observations within the Git
+codebase:
+  1. every file includes git-compat-util.h which defines functions
+     in a couple of different files
+  2. wrapper.c + usage.c have difficult-to-separate circular
+     dependencies with each other and other files.
+
+=== Ubiquity of git-compat-util.h and circular dependencies
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h,
+and git-compat-util.h includes wrapper.h and usage.h, it would make
+sense for wrapper.c and usage.c to be a part of the root library. They
+have difficult-to-separate circular dependencies with each other, so it
+would be impractical for them to be independent libraries. wrapper.c
+has dependencies on parse.c, abspath.c, and strbuf.c, which in turn have
+dependencies on usage.c and wrapper.c: more circular dependencies.
+
+=== Tradeoff between swappability and refactoring
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a singular library. So why choose a
+singular library when multiple libraries can be individually easier to
+swap and are more modular? A singular library requires less work to
+separate out circular dependencies within itself, so it becomes a
+tradeoff between work and reward. While there may be a point in the
+future where a file like usage.c would want its own library so that
+someone can have a custom die() or error(), the work required to
+refactor out the circular dependencies in some files would be enormous
+due to their ubiquity, so I believe it is not worth the tradeoff
+currently. Additionally, we can choose to do this refactor in the
+future and change the API for the library if there is enough of a
+reason to do so (remember we are avoiding promising stability of the
+interfaces of those libraries).
+
+=== Reuse of compatibility functions in git-compat-util.h
+
+Most functions defined in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h so they can be
+easily included in git-std-lib.a, which as a root dependency means that
+higher level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions defined in git-compat-util.h are
+implemented in top level files and are hidden behind
+an #ifdef if their implementation is not in git-std-lib.a.
+
+=== Rationale summary
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many more months attempting
+to refactor difficult circular dependencies and instead spending that
+time getting to a state where we can test out swapping a library out
+such as config or object store, we can prove the viability of Git
+libification on a much faster time scale. Additionally, the code cleanups
+that have happened so far have been minor and beneficial for the
+codebase. It is probable that making large movements would negatively
+affect code clarity.
+
+== Git Standard Library boundary
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it.
+
+ - Low-level files (i.e. ones operating only on primitive types) that are
+   used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+   - Dependencies that are low-level and widely used
+     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - Low-level git/* files with functions defined in git-compat-util.h
+   (ctype.c)
+ - compat/*
+
+There are other files that might fit this definition, but that does not
+mean they should belong in git-std-lib.a. Those files should start as
+their own separate library since any file added to git-std-lib.a loses
+its flexibility of being easily swappable.
+
+wrapper.c and usage.c have dependencies on pager and trace2 that could be
+removed, but only by sacrificing the ability of standard Git to trace
+functions in those files and in other files in git-std-lib.a. So that
+git-std-lib.a can be linked without pulling in those dependencies, stubbed
+out versions of those files are implemented and swapped in at link time
+(see STUB_LIB_OBJS in the Makefile).
+
+== Files inside of Git Standard Library
+
+The set of files in git-std-lib.a can be found in STD_LIB_OBJS and COMPAT_OBJS
+in the Makefile.
+
+When these files are compiled together with the files in STUB_LIB_OBJS (or
+user-provided files that provide the same functions), they form a complete
+library.
+
+== Pitfalls
+
+There are a small number of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, just an observation that libification in
+general may also require some minor compatibility work in the future.
+
+== Testing
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (e.g. introduction of an out-of-scope dependency), and new
+functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
diff --git a/Makefile b/Makefile
index 4e255c81f2..d37ea9d34b 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,8 @@ FUZZ_PROGRAMS =
 GIT_OBJS =
 LIB_OBJS =
 SCALAR_OBJS =
+STD_LIB_OBJS =
+STUB_LIB_OBJS =
 OBJECTS =
 OTHER_PROGRAMS =
 PROGRAM_OBJS =
@@ -923,6 +925,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
 
 LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
+STD_LIB_FILE = git-std-lib.a
+STUB_LIB_FILE = git-stub-lib.a
 REFTABLE_LIB = reftable/libreftable.a
 REFTABLE_TEST_LIB = reftable/libreftable_test.a
 
@@ -962,7 +966,6 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
 
 LIB_H = $(FOUND_H_SOURCES)
 
-LIB_OBJS += abspath.o
 LIB_OBJS += add-interactive.o
 LIB_OBJS += add-patch.o
 LIB_OBJS += advice.o
@@ -1004,8 +1007,6 @@ LIB_OBJS += convert.o
 LIB_OBJS += copy.o
 LIB_OBJS += credential.o
 LIB_OBJS += csum-file.o
-LIB_OBJS += ctype.o
-LIB_OBJS += date.o
 LIB_OBJS += decorate.o
 LIB_OBJS += delta-islands.o
 LIB_OBJS += diagnose.o
@@ -1046,7 +1047,6 @@ LIB_OBJS += hash-lookup.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
-LIB_OBJS += hex-ll.o
 LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
@@ -1097,7 +1097,6 @@ LIB_OBJS += pack-write.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pager.o
 LIB_OBJS += parallel-checkout.o
-LIB_OBJS += parse.o
 LIB_OBJS += parse-options-cb.o
 LIB_OBJS += parse-options.o
 LIB_OBJS += patch-delta.o
@@ -1152,7 +1151,6 @@ LIB_OBJS += sparse-index.o
 LIB_OBJS += split-index.o
 LIB_OBJS += stable-qsort.o
 LIB_OBJS += statinfo.o
-LIB_OBJS += strbuf.o
 LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
 LIB_OBJS += strmap.o
@@ -1189,21 +1187,32 @@ LIB_OBJS += unpack-trees.o
 LIB_OBJS += upload-pack.o
 LIB_OBJS += url.o
 LIB_OBJS += urlmatch.o
-LIB_OBJS += usage.o
 LIB_OBJS += userdiff.o
-LIB_OBJS += utf8.o
 LIB_OBJS += varint.o
 LIB_OBJS += version.o
 LIB_OBJS += versioncmp.o
 LIB_OBJS += walker.o
 LIB_OBJS += wildmatch.o
 LIB_OBJS += worktree.o
-LIB_OBJS += wrapper.o
 LIB_OBJS += write-or-die.o
 LIB_OBJS += ws.o
 LIB_OBJS += wt-status.o
 LIB_OBJS += xdiff-interface.o
 
+STD_LIB_OBJS += abspath.o
+STD_LIB_OBJS += ctype.o
+STD_LIB_OBJS += date.o
+STD_LIB_OBJS += hex-ll.o
+STD_LIB_OBJS += parse.o
+STD_LIB_OBJS += strbuf.o
+STD_LIB_OBJS += usage.o
+STD_LIB_OBJS += utf8.o
+STD_LIB_OBJS += wrapper.o
+
+STUB_LIB_OBJS += stubs/trace2.o
+STUB_LIB_OBJS += stubs/pager.o
+STUB_LIB_OBJS += stubs/misc.o
+
 BUILTIN_OBJS += builtin/add.o
 BUILTIN_OBJS += builtin/am.o
 BUILTIN_OBJS += builtin/annotate.o
@@ -1352,7 +1361,7 @@ UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
 
 # xdiff and reftable libs may in turn depend on what is in libgit.a
-GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
+GITLIBS = common-main.o $(STD_LIB_FILE) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
 EXTLIBS =
 
 GIT_USER_AGENT = git/$(GIT_VERSION)
@@ -2693,6 +2702,8 @@ OBJECTS += $(XDIFF_OBJS)
 OBJECTS += $(FUZZ_OBJS)
 OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS)
 OBJECTS += $(UNIT_TEST_OBJS)
+OBJECTS += $(STD_LIB_OBJS)
+OBJECTS += $(STUB_LIB_OBJS)
 
 ifndef NO_CURL
 	OBJECTS += http.o http-walker.o remote-curl.o
@@ -3686,7 +3697,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) git.res
 	$(RM) $(OBJECTS)
 	$(RM) headless-git.o
-	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE) $(STUB_LIB_FILE)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
@@ -3878,3 +3889,18 @@ $(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_DIR)/
 build-unit-tests: $(UNIT_TEST_PROGS)
 unit-tests: $(UNIT_TEST_PROGS)
 	$(MAKE) -C t/ unit-tests
+
+### Libified Git rules
+
+# git-std-lib.a
+# Programs other than git should compile this with
+#     make NO_GETTEXT=YesPlease git-std-lib.a
+# and link against git-stub-lib.a (if the default no-op functionality is fine)
+# or a custom .a file with the same interface as git-stub-lib.a (if custom
+# functionality is needed) as well.
+$(STD_LIB_FILE): $(STD_LIB_OBJS) $(COMPAT_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+# git-stub-lib.a
+$(STUB_LIB_FILE): $(STUB_LIB_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
diff --git a/stubs/misc.c b/stubs/misc.c
new file mode 100644
index 0000000000..92da76fd46
--- /dev/null
+++ b/stubs/misc.c
@@ -0,0 +1,33 @@
+#include <assert.h>
+#include <stdlib.h>
+
+#ifndef NO_GETTEXT
+/*
+ * NEEDSWORK: This is enough to link our unit tests against
+ * git-std-lib.a built with gettext support. We don't really support
+ * programs other than git using git-std-lib.a with gettext support
+ * yet. To do that we need to start using dgettext() rather than
+ * gettext() in our code.
+ */
+int git_gettext_enabled = 0;
+#endif
+
+int common_exit(const char *file, int line, int code);
+
+int common_exit(const char *file, int line, int code)
+{
+	exit(code);
+}
+
+#if !defined(__MINGW32__) && !defined(_MSC_VER)
+int lstat_cache_aware_rmdir(const char *path);
+
+int lstat_cache_aware_rmdir(const char *path)
+{
+	/*
+	 * This function should not be called by programs linked
+	 * against git-stub-lib.a
+	 */
+	assert(0);
+}
+#endif
diff --git a/stubs/pager.c b/stubs/pager.c
new file mode 100644
index 0000000000..4f575cada7
--- /dev/null
+++ b/stubs/pager.c
@@ -0,0 +1,6 @@
+#include "pager.h"
+
+int pager_in_use(void)
+{
+	return 0;
+}
diff --git a/stubs/trace2.c b/stubs/trace2.c
new file mode 100644
index 0000000000..7d89482228
--- /dev/null
+++ b/stubs/trace2.c
@@ -0,0 +1,27 @@
+#include "git-compat-util.h"
+#include "trace2.h"
+
+struct child_process { int stub; };
+struct repository { int stub; };
+struct json_writer { int stub; };
+
+void trace2_region_enter_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_region_leave_fl(const char *file, int line, const char *category,
+			    const char *label, const struct repository *repo, ...) { }
+void trace2_data_string_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   const char *value) { }
+void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
+void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
+			    va_list ap) { }
+void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
+void trace2_thread_start_fl(const char *file, int line,
+			    const char *thread_base_name) { }
+void trace2_thread_exit_fl(const char *file, int line) { }
+void trace2_data_intmax_fl(const char *file, int line, const char *category,
+			   const struct repository *repo, const char *key,
+			   intmax_t value) { }
+int trace2_is_enabled(void) { return 0; }
+void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
+void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
-- 
2.44.0.rc0.258.g7320e95886-goog
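
The Makefile comment in this patch mentions linking against "a custom .a
file with the same interface as git-stub-lib.a" when the no-op stubs are
not enough. A minimal sketch of what one object in such a replacement
might look like, assuming only pager_in_use() needs real behaviour.
mytool_set_pager_active() is a hypothetical helper that is not part of
Git, and the prototype is repeated locally instead of coming from pager.h
so that the fragment stands alone:

```c
#include <assert.h>

/* Same signature as the stub in stubs/pager.c. */
int pager_in_use(void);

/* Hypothetical, client-owned state; not part of Git. */
static int pager_active;

/* Hypothetical setter the client calls once it has started its own pager. */
void mytool_set_pager_active(int active)
{
	assert(active == 0 || active == 1);
	pager_active = active;
}

/* Replaces the always-zero stub with the client's own answer. */
int pager_in_use(void)
{
	return pager_active;
}
```

Such an object would presumably be archived together with equivalents of
the remaining stubs (trace2, misc) and linked in place of git-stub-lib.a.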


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent
  2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
                     ` (7 preceding siblings ...)
  2024-02-22 17:50   ` [PATCH v5 2/3] git-std-lib: introduce Git Standard Library Calvin Wan
@ 2024-02-22 17:50   ` Calvin Wan
  2024-02-22 22:24     ` Junio C Hamano
  2024-03-07 21:13     ` Junio C Hamano
  8 siblings, 2 replies; 111+ messages in thread
From: Calvin Wan @ 2024-02-22 17:50 UTC (permalink / raw)
  To: git; +Cc: Calvin Wan, Jonathan Tan, phillip.wood123, Junio C Hamano

Add a test file that calls some functions defined in git-std-lib.a
object files to showcase that they do not reference missing objects and
that, together with git-stub-lib.a, git-std-lib.a can stand on its own.

As described in test-stdlib.c, this can probably be removed once we have
unit tests.

The variable TEST_PROGRAMS is moved lower in the Makefile after
NO_POSIX_GOODIES is defined in config.make.uname. TEST_PROGRAMS isn't
used earlier than that so this change should be safe.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Makefile               |  23 +++-
 strbuf.h               |   2 +
 stubs/misc.c           |   1 +
 t/helper/.gitignore    |   1 +
 t/helper/test-stdlib.c | 266 +++++++++++++++++++++++++++++++++++++++++
 t/t0082-std-lib.sh     |  11 ++
 6 files changed, 299 insertions(+), 5 deletions(-)
 create mode 100644 t/helper/test-stdlib.c
 create mode 100755 t/t0082-std-lib.sh

diff --git a/Makefile b/Makefile
index d37ea9d34b..1d762ce13a 100644
--- a/Makefile
+++ b/Makefile
@@ -870,9 +870,7 @@ TEST_BUILTINS_OBJS += test-xml-encode.o
 # Do not add more tests here unless they have extra dependencies. Add
 # them in TEST_BUILTINS_OBJS above.
 TEST_PROGRAMS_NEED_X += test-fake-ssh
-TEST_PROGRAMS_NEED_X += test-tool
-
-TEST_PROGRAMS = $(patsubst %,t/helper/%$X,$(TEST_PROGRAMS_NEED_X))
+TEST_PROGRAMS_NEED_X += $(info tpnxnpg=$(NO_POSIX_GOODIES))test-tool
 
 # List built-in command $C whose implementation cmd_$C() is not in
 # builtin/$C.o but is linked in as part of some other command.
@@ -2678,6 +2676,16 @@ REFTABLE_TEST_OBJS += reftable/stack_test.o
 REFTABLE_TEST_OBJS += reftable/test_framework.o
 REFTABLE_TEST_OBJS += reftable/tree_test.o
 
+ifndef NO_POSIX_GOODIES
+TEST_PROGRAMS_NEED_X += test-stdlib
+MY_VAR = not_else
+$(info insideifndefnpg=$(NO_POSIX_GOODIES))
+else
+MY_VAR = else
+endif
+
+TEST_PROGRAMS = $(info tptpnx=$(TEST_PROGRAMS_NEED_X) myvar=$(MY_VAR))$(patsubst %,t/helper/%$X,$(TEST_PROGRAMS_NEED_X))
+
 TEST_OBJS := $(patsubst %$X,%.o,$(TEST_PROGRAMS)) $(patsubst %,t/helper/%,$(TEST_BUILTINS_OBJS))
 
 .PHONY: test-objs
@@ -3204,7 +3212,11 @@ GIT-PYTHON-VARS: FORCE
             fi
 endif
 
-test_bindir_programs := $(patsubst %,bin-wrappers/%,$(BINDIR_PROGRAMS_NEED_X) $(BINDIR_PROGRAMS_NO_X) $(TEST_PROGRAMS_NEED_X))
+test_bindir_programs := $(info tbptpnx=$(TEST_PROGRAMS_NEED_X))$(patsubst %,bin-wrappers/%,$(BINDIR_PROGRAMS_NEED_X) $(BINDIR_PROGRAMS_NO_X) $(TEST_PROGRAMS_NEED_X))
+
+t/helper/test-stdlib$X: t/helper/test-stdlib.o GIT-LDFLAGS $(STD_LIB_FILE) $(STUB_LIB_FILE) $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
+		$< $(STD_LIB_FILE) $(STUB_LIB_FILE) $(EXTLIBS)
 
 all:: $(TEST_PROGRAMS) $(test_bindir_programs) $(UNIT_TEST_PROGS)
 
@@ -3635,7 +3647,8 @@ ifneq ($(INCLUDE_DLLS_IN_ARTIFACTS),)
 OTHER_PROGRAMS += $(shell echo *.dll t/helper/*.dll t/unit-tests/bin/*.dll)
 endif
 
-artifacts-tar:: $(ALL_COMMANDS_TO_INSTALL) $(SCRIPT_LIB) $(OTHER_PROGRAMS) \
+# Added an info for debugging
+artifacts-tar:: $(info npg=$(NO_POSIX_GOODIES) cc=$(COMPAT_CFLAGS) tp=$(TEST_PROGRAMS))$(ALL_COMMANDS_TO_INSTALL) $(SCRIPT_LIB) $(OTHER_PROGRAMS) \
 		GIT-BUILD-OPTIONS $(TEST_PROGRAMS) $(test_bindir_programs) \
 		$(UNIT_TEST_PROGS) $(MOFILES)
 	$(QUIET_SUBDIR0)templates $(QUIET_SUBDIR1) \
diff --git a/strbuf.h b/strbuf.h
index e959caca87..f775416307 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -1,6 +1,8 @@
 #ifndef STRBUF_H
 #define STRBUF_H
 
+#include "git-compat-util.h"
+
 /*
  * NOTE FOR STRBUF DEVELOPERS
  *
diff --git a/stubs/misc.c b/stubs/misc.c
index 92da76fd46..8d80581e39 100644
--- a/stubs/misc.c
+++ b/stubs/misc.c
@@ -9,6 +9,7 @@
  * yet. To do that we need to start using dgettext() rather than
  * gettext() in our code.
  */
+#include "gettext.h"
 int git_gettext_enabled = 0;
 #endif
 
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 8c2ddcce95..5cec3b357f 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,2 +1,3 @@
 /test-tool
 /test-fake-ssh
+/test-stdlib
diff --git a/t/helper/test-stdlib.c b/t/helper/test-stdlib.c
new file mode 100644
index 0000000000..460b472fb4
--- /dev/null
+++ b/t/helper/test-stdlib.c
@@ -0,0 +1,266 @@
+#include "git-compat-util.h"
+#include "abspath.h"
+#include "hex-ll.h"
+#include "parse.h"
+#include "strbuf.h"
+#include "string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ *
+ * NEEDSWORK: The purpose of this file is to show that an executable can be
+ * built with git-std-lib.a and git-stub-lib.a, and then executed. If there
+ * is another executable that demonstrates this (for example, a unit test that
+ * takes the form of an executable compiled with git-std-lib.a and git-stub-
+ * lib.a), this file can be removed.
+ */
+
+static void abspath_funcs(void) {
+	struct strbuf sb = STRBUF_INIT;
+
+	fprintf(stderr, "calling abspath functions\n");
+	is_directory("foo");
+	strbuf_realpath(&sb, "foo", 0);
+	strbuf_realpath_forgiving(&sb, "foo", 0);
+	real_pathdup("foo", 0);
+	absolute_path("foo");
+	absolute_pathdup("foo");
+	prefix_filename("foo/", "bar");
+	prefix_filename_except_for_dash("foo/", "bar");
+	is_absolute_path("foo");
+	strbuf_add_absolute_path(&sb, "foo");
+	strbuf_add_real_path(&sb, "foo");
+}
+
+static void hex_ll_funcs(void) {
+	unsigned char c;
+
+	fprintf(stderr, "calling hex-ll functions\n");
+
+	hexval('c');
+	hex2chr("A1");
+	hex_to_bytes(&c, "A1", 1);
+}
+
+static void parse_funcs(void) {
+	intmax_t foo;
+	ssize_t foo1 = -1;
+	unsigned long foo2;
+	int foo3;
+	int64_t foo4;
+
+	fprintf(stderr, "calling parse functions\n");
+
+	git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+	git_parse_ssize_t("42", &foo1);
+	git_parse_ulong("42", &foo2);
+	git_parse_int("42", &foo3);
+	git_parse_int64("42", &foo4);
+	git_parse_maybe_bool("foo");
+	git_parse_maybe_bool_text("foo");
+	git_env_bool("foo", 1);
+	git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+	return 0;
+}
+
+static void strbuf_funcs(void) {
+	struct strbuf *sb = xmalloc(sizeof(*sb));
+	struct strbuf *sb2 = xmalloc(sizeof(*sb2));
+	struct strbuf sb3 = STRBUF_INIT;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	int fd = open("/dev/null", O_RDONLY);
+
+	fprintf(stderr, "calling strbuf functions\n");
+
+	fprintf(stderr, "at line %d\n", __LINE__);
+	starts_with("foo", "bar");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	istarts_with("foo", "bar");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_init(sb, 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_init(sb2, 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_release(sb);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_swap(sb, sb2);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_setlen(sb, 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_trim(sb);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_trim_trailing_dir_sep(sb);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_trim_trailing_newline(sb);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_reencode(sb, "foo", "bar");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_tolower(sb);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_add_separated_string_list(sb, " ", &list);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_cmp(sb, sb2);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_addch(sb, 1);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_splice(sb, 0, 1, "foo", 3);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_insert(sb, 0, "foo", 3);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_insertf(sb, 0, "%s", "foo");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_remove(sb, 0, 1);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_add(sb, "foo", 3);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_addbuf(sb, sb2);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_join_argv(sb, 0, NULL, ' ');
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_addchars(sb, 1, 1);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_addstr(sb, "foo");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_add_commented_lines(sb, "foo", 3, '#');
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_commented_addf(sb, '#', "%s", "foo");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_addbuf_percentquote(sb, &sb3);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_fread(sb, 0, stdin);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_read(sb, fd, 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_read_once(sb, fd, 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_write(sb, stderr);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_readlink(sb, "/dev/null", 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_getcwd(sb);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_getwholeline(sb, stderr, '\n');
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_appendwholeline(sb, stderr, '\n');
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_getline(sb, stderr);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_getline_lf(sb, stderr);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_getline_nul(sb, stderr);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_getwholeline_fd(sb, fd, '\n');
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_read_file(sb, "/dev/null", 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_add_lines(sb, "foo", "bar", 0);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_addstr_xml_quoted(sb, "foo");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_humanise_bytes(sb, 42);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	strbuf_humanise_rate(sb, 42);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	printf_ln("%s", sb->buf);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	fprintf_ln(stderr, "%s", sb->buf);
+	fprintf(stderr, "at line %d\n", __LINE__);
+	xstrdup_tolower("foo");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	xstrdup_toupper("foo");
+	fprintf(stderr, "at line %d\n", __LINE__);
+	xstrfmt("%s", "foo");
+	fprintf(stderr, "at line %d\n", __LINE__);
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static void usage_funcs(void) {
+	fprintf(stderr, "calling usage functions\n");
+	error("foo");
+	error_errno("foo");
+	die_message("foo");
+	die_message_errno("foo");
+	warning("foo");
+	warning_errno("foo");
+
+	get_die_message_routine();
+	set_error_routine(error_builtin);
+	get_error_routine();
+	set_warn_routine(warn_builtin);
+	get_warn_routine();
+}
+
+static void wrapper_funcs(void) {
+	int tmp;
+	void *ptr = xmalloc(1);
+	int fd = open("/dev/null", O_RDONLY);
+	struct strbuf sb = STRBUF_INIT;
+	int mode = 0444;
+	char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+	xsnprintf(path, sizeof(path), "out-XXXXXX");
+	xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+
+	fprintf(stderr, "calling wrapper functions\n");
+
+	xstrdup("foo");
+	xmalloc(1);
+	xmallocz(1);
+	xmallocz_gently(1);
+	xmemdupz("foo", 3);
+	xstrndup("foo", 3);
+	xrealloc(ptr, 2);
+	xcalloc(1, 1);
+	xsetenv("foo", "bar", 0);
+	xopen("/dev/null", O_RDONLY);
+	xread(fd, &sb, 1);
+	xwrite(fd, &sb, 1);
+	xpread(fd, &sb, 1, 0);
+	xdup(fd);
+	xfopen("/dev/null", "r");
+	xfdopen(fd, "r");
+	tmp = xmkstemp(path);
+	close(tmp);
+	unlink(path);
+	tmp = xmkstemp_mode(path1, mode);
+	close(tmp);
+	unlink(path1);
+	xgetcwd();
+	fopen_for_writing(path);
+	fopen_or_warn(path, "r");
+	xstrncmpz("foo", "bar", 3);
+	xgethostname(host, 3);
+	tmp = git_mkstemps_mode(path, 1, mode);
+	close(tmp);
+	unlink(path);
+	tmp = git_mkstemp_mode(path, mode);
+	close(tmp);
+	unlink(path);
+	read_in_full(fd, &sb, 1);
+	write_in_full(fd, &sb, 1);
+	pread_in_full(fd, &sb, 1, 0);
+}
+
+int main(int argc, const char **argv) {
+	abspath_funcs();
+	hex_ll_funcs();
+	parse_funcs();
+	strbuf_funcs();
+	usage_funcs();
+	wrapper_funcs();
+	fprintf(stderr, "all git-std-lib functions finished calling\n");
+	return 0;
+}
diff --git a/t/t0082-std-lib.sh b/t/t0082-std-lib.sh
new file mode 100755
index 0000000000..0d5a024deb
--- /dev/null
+++ b/t/t0082-std-lib.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+test_description='Test git-std-lib compilation'
+
+. ./test-lib.sh
+
+test_expect_success !WINDOWS 'stdlib-test compiles and runs' '
+	test-stdlib
+'
+
+test_done
-- 
2.44.0.rc0.258.g7320e95886-goog
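
One detail in wrapper_funcs() above may be worth a note: mkstemp()-style
functions overwrite the trailing XXXXXX of the buffer in place, so the
later calls that reuse `path` no longer receive a template and presumably
fail instead of creating a file. A small sketch of re-seeding the buffer
before each call, using plain POSIX mkstemp() rather than the Git
wrappers; the helper names and the /tmp path in the usage note are
illustrative assumptions:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a temp file from tmpl and remove it again; 0 on success. */
static int make_and_remove_tempfile(char *buf, size_t buflen, const char *tmpl)
{
	int fd;

	assert(buf && tmpl);
	if (strlen(tmpl) >= buflen)
		return -1;
	/* Re-seed the buffer: mkstemp() rewrites the XXXXXX in place. */
	strcpy(buf, tmpl);
	fd = mkstemp(buf);
	if (fd < 0)
		return -1;
	close(fd);
	return unlink(buf);
}

/* Use the same buffer twice, restoring the template in between. */
int tempfile_twice(const char *tmpl)
{
	char path[4096];

	if (make_and_remove_tempfile(path, sizeof(path), tmpl))
		return -1;
	return make_and_remove_tempfile(path, sizeof(path), tmpl);
}
```

Calling tempfile_twice("/tmp/out-XXXXXX") exercises both creations; the
second one only works because the template is copied back first.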


^ permalink raw reply related	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-22 17:50   ` [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used Calvin Wan
@ 2024-02-22 21:43     ` Junio C Hamano
  2024-02-26 18:59       ` Kyle Lippincott
  2024-02-24  1:33     ` Kyle Lippincott
  1 sibling, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2024-02-22 21:43 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, Jonathan Tan, phillip.wood123

Calvin Wan <calvinwan@google.com> writes:

> From: Jonathan Tan <jonathantanmy@google.com>
>
> pager.h uses uintmax_t but does not include stdint.h. Therefore, add
> this include statement.
>
> This was discovered when writing a stub pager.c file.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  pager.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/pager.h b/pager.h
> index b77433026d..015bca95e3 100644
> --- a/pager.h
> +++ b/pager.h
> @@ -1,6 +1,8 @@
>  #ifndef PAGER_H
>  #define PAGER_H
>  
> +#include <stdint.h>
> +
>  struct child_process;
>  
>  const char *git_pager(int stdout_is_tty);

This is not going in a sensible direction from our portability
standard's point of view.

The reason why we do not include these system headers directly in
our source files, and instead make it a rule to include
<git-compat-util.h> as the first header, is exactly that the
platforms Git wants to run on differ in curious ways: which system
headers give us the declarations for types and functions we rely on,
in what order they must be included, which feature macros (the ones
that adjust what the system headers do, like _POSIX_C_SOURCE) must be
defined before them, etc.

Given that in <git-compat-util.h>, inclusion of <stdint.h> is
conditional behind some #ifdef's, it does not look like a sensible
change.  It is not very likely for <inttypes.h> and <stdint.h> to
declare uintmax_t in an incompatible way, but on a platform where
<git-compat-util.h> decides to include <inttypes.h> and use its
definition of what uintmax_t is, we should follow the same choice
and be consistent.

If there is a feature macro that affects the sizes of integer types on
a platform, this patch will break things even more badly.  Perhaps
there is a platform whose C-library headers require you to define a
feature macro to get 64-bit types, and we may define that feature
macro in <git-compat-util.h> before including either <inttypes.h> or
<stdint.h>.  If <stdint.h> is included directly, as the above patch
does, then this file, and any source that includes only this file
while refusing to include <git-compat-util.h> as everybody in the Git
source tree should, will end up with a different notion of what the
integral type with maximum width is from everybody else.
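
To make the concern concrete, here is an analogous case rather than the
exact one described: on 32-bit glibc systems, off_t is 32 bits wide
unless _FILE_OFFSET_BITS is defined to 64 before the first system header
is included. The sketch below is only an illustration under that
assumption; it concerns off_t, not uintmax_t, and the function names are
hypothetical.

```c
/* Must precede every system header to have any effect. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Width of off_t as seen by this translation unit. */
size_t off_t_width(void)
{
	return sizeof(off_t);
}

/*
 * Compare against the width reported by another translation unit.
 * A unit that included <sys/types.h> without the macro could report
 * a different number on an affected platform.
 */
int same_off_t_width(size_t other_unit_width)
{
	assert(other_unit_width > 0);
	return other_unit_width == off_t_width();
}
```

Two object files that disagree here would still link, which is why a
single header that fixes the macros and the inclusion order matters.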

What this patch _wants_ to do is of course understandable, and we
have a "make hdr-check" rule to enforce "a header must include the
headers that declare what it uses", except that it lets the header
files being tested assume that the things made available by
including <git-compat-util.h> are always available.

I think a sensible direction to go for libification purposes is to
also make sure that sources that are compiled into gitstdlib.a, and
the headers that makes what is in gitstdlib.a available, include the
<git-compat-util.h> header file.  There may be things declared in
the <git-compat-util.h> header that are _too_ specific to what ought
to be linked into the final "git" binary and unwanted by library
clients that are not "git" binary, and the right way to deal with it
is to split <git-compat-util.h> into two parts, i.e. what makes
system services available like its conditional inclusion of
<stdint.h> vs <inttypes.h>, definition of feature macros, order in
which the current <git-compat-util.h> includes system headers, etc.,
excluding those that made you write this patch to avoid assuming
that the client code would have included <git-compat-util.h> before
<pager.h>, would be the new <git-compat-core.h>.  And everything
else will remain in <git-compat-util.h>, which will include the
<git-compat-core.h>.  The <pager.h> header for library clients would
include <git-compat-core.h> instead, to still allow them to use the
same types as "git" binary itself that way.
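
A rough sketch of that layering, collapsed into a single file purely for
illustration.  The name <git-compat-core.h> comes from the text above;
the guard macro names, the choice of feature macro shown, and the helper
compat_core_uintmax_max() are assumptions made for this sketch and are
not part of any patch in this series.

```c
/* ---- hypothetical git-compat-core.h ---- */
#ifndef GIT_COMPAT_CORE_H
#define GIT_COMPAT_CORE_H

/* Feature macros must be set before the first system header is seen. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

/* One place decides which header provides uintmax_t and friends. */
#ifndef NO_INTTYPES_H
#include <inttypes.h>
#else
#include <stdint.h>
#endif

#include <stddef.h>

/* Hypothetical helper, only here so the sketch has something to call. */
static inline uintmax_t compat_core_uintmax_max(void)
{
	return UINTMAX_MAX;
}

#endif /* GIT_COMPAT_CORE_H */

/* ---- git-compat-util.h would then start with ---- */
#ifndef GIT_COMPAT_UTIL_H
#define GIT_COMPAT_UTIL_H

/* #include "git-compat-core.h"  (already "included" above in this sketch) */

/* ... pieces wanted only by the "git" binary stay here ... */

#endif /* GIT_COMPAT_UTIL_H */
```

In-tree sources would keep including <git-compat-util.h> first, while
library-facing headers would only need the core part.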






^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent
  2024-02-22 17:50   ` [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent Calvin Wan
@ 2024-02-22 22:24     ` Junio C Hamano
  2024-03-07 21:13     ` Junio C Hamano
  1 sibling, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2024-02-22 22:24 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, Jonathan Tan, phillip.wood123

Calvin Wan <calvinwan@google.com> writes:

> diff --git a/Makefile b/Makefile
> index d37ea9d34b..1d762ce13a 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -870,9 +870,7 @@ TEST_BUILTINS_OBJS += test-xml-encode.o
>  # Do not add more tests here unless they have extra dependencies. Add
>  # them in TEST_BUILTINS_OBJS above.
>  TEST_PROGRAMS_NEED_X += test-fake-ssh
> -TEST_PROGRAMS_NEED_X += test-tool
> -
> -TEST_PROGRAMS = $(patsubst %,t/helper/%$X,$(TEST_PROGRAMS_NEED_X))
> +TEST_PROGRAMS_NEED_X += $(info tpnxnpg=$(NO_POSIX_GOODIES))test-tool

Is this version meant to be ready for reviewing?  $(info) used like
this does not look like a good fit for production code.

> diff --git a/strbuf.h b/strbuf.h
> index e959caca87..f775416307 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -1,6 +1,8 @@
>  #ifndef STRBUF_H
>  #define STRBUF_H
>  
> +#include "git-compat-util.h"
> +
>  /*
>   * NOTE FOR STRBUF DEVELOPERS
>   *

The same comment about header inclusion I made on [1/3] applies
here, too.  I am open to hearing better ideas to handle system
headers, but my preference is to allow any and all headers assume
<git-compat-util.h> (or its moral equivalent that may be stripped
down by moving non-essential things out) is already included, which
in turn means those *.c files (like t/helper/test-stdlib.c we see
below) would include <git-compat-util.h> (or its trimmed down
version) as the first header, before including <strbuf.h>.

In any case, this change, if we were to make it (and I do not think
we should), should be treated like the change to pager.h in [1/3],
i.e. part of making the existing headers ready to be shared with the
"stdlib" effort.  It does not belong to this [3/3] step, where we
are supposed to be demonstrating the use of "stdlib", which has
become (minimally) usable with the steps before this one.

> diff --git a/stubs/misc.c b/stubs/misc.c
> index 92da76fd46..8d80581e39 100644
> --- a/stubs/misc.c
> +++ b/stubs/misc.c
> @@ -9,6 +9,7 @@
>   * yet. To do that we need to start using dgettext() rather than
>   * gettext() in our code.
>   */
> +#include "gettext.h"
>  int git_gettext_enabled = 0;
>  #endif

This change should have happened before this [3/3] step, whose point
is to demonstrate "stdlib" that has already been made (minimally)
usable with steps before this one.

> diff --git a/t/helper/.gitignore b/t/helper/.gitignore
> index 8c2ddcce95..5cec3b357f 100644
> --- a/t/helper/.gitignore
> +++ b/t/helper/.gitignore
> @@ -1,2 +1,3 @@
>  /test-tool
>  /test-fake-ssh
> +/test-stdlib
> diff --git a/t/helper/test-stdlib.c b/t/helper/test-stdlib.c
> new file mode 100644
> index 0000000000..460b472fb4
> --- /dev/null
> +++ b/t/helper/test-stdlib.c
> @@ -0,0 +1,266 @@
> +#include "git-compat-util.h"
> +#include "abspath.h"
> +#include "hex-ll.h"
> +#include "parse.h"
> +#include "strbuf.h"
> +#include "string-list.h"
> +
> +/*
> + * Calls all functions from git-std-lib
> + * Some inline/trivial functions are skipped
> + *
> + * NEEDSWORK: The purpose of this file is to show that an executable can be
> + * built with git-std-lib.a and git-stub-lib.a, and then executed. If there
> + * is another executable that demonstrates this (for example, a unit test that
> + * takes the form of an executable compiled with git-std-lib.a and git-stub-
> + * lib.a), this file can be removed.
> + */

Or alternatively, this "random list of function calls" can be
turned into more realistic test helpers in place.  "stdlib"
will hopefully gain more coverage of the features of low level
helpers "git" binary proper uses, and I do not think it is
far-fetched to migrate the "test-tool date" subcommands all to not
link directly with "libgit.a" but with gitstdlib instead and the
things should work, right?  Right now, the "random list of function
calls" does not do anything useful, but that does not have to be the
case.  It should offer us more value than "It links!" ;-).

Having said that, the most valuable part in this [3/3] step is how
this t/helper/test-stdlib is linked, i.e. this part from the
Makefile:

> +t/helper/test-stdlib$X: t/helper/test-stdlib.o GIT-LDFLAGS $(STD_LIB_FILE) $(STUB_LIB_FILE) $(GITLIBS)
> +	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
> +		$< $(STD_LIB_FILE) $(STUB_LIB_FILE) $(EXTLIBS)

where we have no $(LIB_FILE) (aka libgit.a).  Especially if we can
grow the capability in $(STD_LIB_FILE) without adding too much stuff
to $(STUB_LIB_FILE), this is a major achievement.  Very nice.


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-22 17:50   ` [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used Calvin Wan
  2024-02-22 21:43     ` Junio C Hamano
@ 2024-02-24  1:33     ` Kyle Lippincott
  2024-02-24  7:58       ` Junio C Hamano
  1 sibling, 1 reply; 111+ messages in thread
From: Kyle Lippincott @ 2024-02-24  1:33 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, Jonathan Tan, phillip.wood123, Junio C Hamano

On Thu, Feb 22, 2024 at 9:51 AM Calvin Wan <calvinwan@google.com> wrote:
>
> From: Jonathan Tan <jonathantanmy@google.com>
>
> pager.h uses uintmax_t but does not include stdint.h. Therefore, add
> this include statement.
>
> This was discovered when writing a stub pager.c file.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  pager.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/pager.h b/pager.h
> index b77433026d..015bca95e3 100644
> --- a/pager.h
> +++ b/pager.h
> @@ -1,6 +1,8 @@
>  #ifndef PAGER_H
>  #define PAGER_H
>
> +#include <stdint.h>
> +
>  struct child_process;
>
>  const char *git_pager(int stdout_is_tty);
> --
> 2.44.0.rc0.258.g7320e95886-goog
>
>

As far as I can tell, we need pager.h because of the `pager_in_use`
symbol. We need that symbol because of its use in date.c's
`parse_date_format`. I wonder if we can side step the `#include
<stdint.h>` concerns by splitting pager.h into pager.h and
pager_in_use.h, and have pager.h include pager_in_use.h instead. This
way pager.h (and its [unused] forward declarations) aren't part of
git-std-lib at all. I believe this was done for things like hex-ll.h,
so maybe we call it pager-ll.h. The goal being to (a) not need the
`#include <stdint.h>` because that's currently contentious, but also
(b) to identify the minimum set of symbols needed for the stubs
library, and not declare things that we don't have any intention of
actually providing / stubbing out.

I have some more thoughts on this, but they're much more appropriate
for the next patch in the series, so I'll leave them there.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-24  1:33     ` Kyle Lippincott
@ 2024-02-24  7:58       ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2024-02-24  7:58 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: Calvin Wan, git, Jonathan Tan, phillip.wood123

Kyle Lippincott <spectral@google.com> writes:

> As far as I can tell, we need pager.h because of the `pager_in_use`
> symbol. We need that symbol because of its use in date.c's
> `parse_date_format`. I wonder if we can side step the `#include
> <stdint.h>` concerns by splitting pager.h into pager.h and
> pager_in_use.h, and have pager.h include pager_in_use.h instead. This
> way pager.h (and its [unused] forward declarations) aren't part of
> git-std-lib at all.

Step back a bit.  Why do you even need to touch pager.h in the first
place?  Whatever thing that needs to define a mock version of
pager_in_use() would need to be able to find out that it is supposed
to take nothing as arguments and return an integer, and it can
include <pager.h> without modification.  Just like everybody else,
it has to include <git-compat-util.h> so that the system header that
gives us uintmax_t gets included appropriately in a platform-dependent
way, no?  Why do we even need to butcher pager.h into two pieces in
the first place?

If you just include <git-compat-util.h> and then <pager.h> in
stubs/pager.c, you're OK, no?
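
A minimal, self-contained sketch of such a stub.  The prototype below
stands in for what the real tree would get from including
<git-compat-util.h> followed by "pager.h"; returning 0 ("no pager in
use") is an assumption about what a stub would want to report, not
something taken from the series.

```c
/*
 * Stand-in for the declaration in pager.h: no arguments, returns int.
 * In the real tree this comes from the headers, not from a local copy.
 */
int pager_in_use(void);

/* Stub: always report that no pager is active (assumed behaviour). */
int pager_in_use(void)
{
	return 0;
}
```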

If anything, as I already said, I think it is more reasonable to
tweak what <git-compat-util.h> does.  For example, it might be
unwieldy for gitstdlib's purpose that it unconditionally overrides
exit(), in which case it may be OK to introduce some conditional
compilation macros to omit that override when building stub code.
Or even split parts of the <git-compat-util.h> that both Git's use
and gitstdlib's purpose are OK with into a separate header file
<git-compat-core.h>, while leaving (hopefully a very minor) other
parts in <git-compat-util.h> *and* include <git-compat-core.h> in
<git-compat-util.h>.  That way, the sources of Git can continue
including <git-compat-util.h> while stub code can include
<git-compat-core.h>, and we will get system library symbols and
system defined types like uintmax_t in a consistent way, both in Git
itself and in gitstdlib.

But once such a sanitization is done on the compat-util header,
other "ordinary" header files should not have to care about
portability (because they can assume that inclusion of
git-compat-util.h will give them access to system types and symbols
without having to worry about portability issues) and should not
have to include system header files themselves.

At least, that is the idea behind <git-compat-util.h> in the first
place.  Including any system headers directly in ordinary headers,
or splitting ordinary headers at an arbitrary and artificial
boundary, should not be necessary.  I'd have to say that such
changes are tail wagging the dog.

I do not have sufficient cycles to spend actually splitting
git-compat-util.h into two myself, but as an illustration, here is
how I would tweak cw/git-std-lib topic to make it build without
breaking our headers and including system header files directly.

 git-compat-util.h | 2 ++
 pager.h           | 2 --
 stubs/misc.c      | 4 ++--
 stubs/pager.c     | 1 +
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git c/git-compat-util.h w/git-compat-util.h
index 7c2a6538e5..981d526d18 100644
--- c/git-compat-util.h
+++ w/git-compat-util.h
@@ -1475,12 +1475,14 @@ static inline int is_missing_file_error(int errno_)
 
 int cmd_main(int, const char **);
 
+#ifndef _GIT_NO_OVERRIDE_EXIT
 /*
  * Intercept all calls to exit() and route them to trace2 to
  * optionally emit a message before calling the real exit().
  */
 int common_exit(const char *file, int line, int code);
 #define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
 
 /*
  * You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git c/pager.h w/pager.h
index 015bca95e3..b77433026d 100644
--- c/pager.h
+++ w/pager.h
@@ -1,8 +1,6 @@
 #ifndef PAGER_H
 #define PAGER_H
 
-#include <stdint.h>
-
 struct child_process;
 
 const char *git_pager(int stdout_is_tty);
diff --git c/stubs/misc.c w/stubs/misc.c
index 8d80581e39..d0379dcb69 100644
--- c/stubs/misc.c
+++ w/stubs/misc.c
@@ -1,5 +1,5 @@
-#include <assert.h>
-#include <stdlib.h>
+#define _GIT_NO_OVERRIDE_EXIT
+#include <git-compat-util.h>
 
 #ifndef NO_GETTEXT
 /*
diff --git c/stubs/pager.c w/stubs/pager.c
index 4f575cada7..04517aad4c 100644
--- c/stubs/pager.c
+++ w/stubs/pager.c
@@ -1,3 +1,4 @@
+#include <git-compat-util.h>
 #include "pager.h"
 
 int pager_in_use(void)



^ permalink raw reply related	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-22 21:43     ` Junio C Hamano
@ 2024-02-26 18:59       ` Kyle Lippincott
  2024-02-27  0:20         ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Kyle Lippincott @ 2024-02-26 18:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Calvin Wan, git, Jonathan Tan, phillip.wood123

On Thu, Feb 22, 2024 at 1:44 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Calvin Wan <calvinwan@google.com> writes:
>
> > From: Jonathan Tan <jonathantanmy@google.com>
> >
> > pager.h uses uintmax_t but does not include stdint.h. Therefore, add
> > this include statement.
> >
> > This was discovered when writing a stub pager.c file.
> >
> > Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> > ---
> >  pager.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/pager.h b/pager.h
> > index b77433026d..015bca95e3 100644
> > --- a/pager.h
> > +++ b/pager.h
> > @@ -1,6 +1,8 @@
> >  #ifndef PAGER_H
> >  #define PAGER_H
> >
> > +#include <stdint.h>
> > +
> >  struct child_process;
> >
> >  const char *git_pager(int stdout_is_tty);
>
> This is not going in a sensible direction from our portability
> standard's point of view.
>
> The reason why we do not include these system headers directly to
> our source files, and instead make it a rule to include
> <git-compat-util.h> as the first header instead, is exactly because
> there are curiosities in various platforms that Git wants to run on
> which system include headers give us the declarations for types and
> functions we rely on, in what order they must be included, and after
> what feature macros (the ones that give adjustment to what the
> system headers do, like _POSIX_C_SOURCE) are defined, etc.
>
> Given that in <git-compat-util.h>, inclusion of <stdint.h> is
> conditional behind some #ifdef's, it does not look like a sensible
> change.  It is not very likely for <inttypes.h> and <stdint.h> to
> declare uintmax_t in an incompatible way, but on a platform where
> <git-compat-util.h> decides to include <inttypes.h> and use its
> definition of what uintmax_t is, we should follow the same choice
> and be consistent.

Speaking of this specific header file inclusion and the oddities that
have gotten us to where we are:
- Originally, it seems that we were including stdint.h
- 17 years ago, to work around Solaris not providing stdint.h, but
providing inttypes.h, it was switched to being just inttypes.h, with
the explanation being that inttypes is a superset of stdint.
https://github.com/git/git/commit/007e2ba65902b484fc65a313e54594a009841740
- 13 years ago, to work around some platforms not having inttypes.h,
it was made conditional.
(https://github.com/git/git/commit/2844923d62a4c408bd59ddb2caacca4aa7eb86bc)

The condition added 13 years ago was, IMHO, backwards from what it
should have been. The intent is to have stdint.h included. We should
include stdint.h. I suspect that 17 years is enough time for that
platform to start conforming to what is now a 25 year old standard,
and I don't know how we can verify that and have this stop being a
haunted graveyard without just trying it and seeing if the build bots
or maintainers identify it as a continuing issue. If it's still an
issue (and only if), we should reintroduce a conditional, but invert
it: if there's no stdint.h, THEN include inttypes.h.

Oh, no, that doesn't work. I tried that, and the build bots told me
that doesn't work, because we're using things from inttypes.h (PRIuMAX
showed up several times in the errors, there may be others). This
makes me wonder how the platforms with no inttypes.h work at all. I
still think we should do something here because it's a 13-year-old
compatibility fix that "shouldn't" be needed anymore, and causes
confusion/concerns like this thread. Maybe just see if we can get away
with always including inttypes.h in git-compat-util.h, or maybe _both_
inttypes.h and stdint.h (in either order), just to be really obvious
that it's acceptable to include stdint.h.
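
To make that dependency concrete, here is a minimal sketch.
format_umax() is a made-up helper for illustration, not a function in
Git; the point is only that the PRIuMAX format macro is provided by
<inttypes.h>, so code using it cannot get by with <stdint.h> alone.

```c
#include <inttypes.h>	/* PRIuMAX is defined here, not in <stdint.h> */
#include <stdio.h>

/*
 * Hypothetical helper: format a uintmax_t into buf.
 * Returns whatever snprintf() returns (the length it wanted to write).
 */
static int format_umax(char *buf, size_t len, uintmax_t value)
{
	return snprintf(buf, len, "%" PRIuMAX, value);
}
```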

>
> If there is a feature macro that affects sizes of the integer on a
> platform, this patch will break it even more badly.  Perhaps there
> is a platform whose C-library header requires you to define a
> feature macro to use 64-bit, and we may define that feature macro
> in <git-compat-util.h> before including either <inttypes.h> or
> <stdint.h>, but by including <stdint.h> directly like the above
> patch does, only this file and the sources that include only this
> file, refusing to include <git-compat-util.h> as everybody in the
> Git source tree should, will end up using a different notion of what
> the integral type with maximum width is from everybody else.

I agree that for pager.h, something like the patch in your next email
would resolve that particular problem. The stub library is of
basically the same stature as git-std-lib: it's code that is provided
by the Git project, compiled by Makefile rules owned and maintained by
the Git project, and should conform to the Git coding standards. The
.c files in the stubs library should include git-compat-util.h,
there's basically no reason not to.

However, I believe that we'll need a good policy for what to do with
libified headers + sources in general. I can see many potential
categorizations of source; there's no need to formally define all of
them and assign files to each category, but the main categories are
basically:
1. files that have code that is an internal part of Git, one of the
helper binaries, or one of its tests, whether it's a library or not.
These should include git-compat-util.h at the top of the .c files like
they do today. The .h files for these translation units are also
considered "internal". These header files should assume that
git-compat-util.h has been included properly, and don't need to be
self-contained, because they're _only_ included by things in this
category.
2. files that have code that define the "library interface", probably
only the ones defining the library interface _as used by external
projects_. I think that we'll likely need to be principled about
defining these, and having them be as minimal and compatible as
possible.
3. code in external projects that directly uses libraries from the Git
project, and thus includes Git headers from category 2
4. the rest of the code in external projects (the code that does not
directly use libraries from the Git project)

A hypothetical git-compat-core.h being included at the top of the .c
files in category 2 is feasible, but needs to be carefully handled due
to potential symbol collision (which we're discussing in another
thread and I may have a possible solution for, at least on some
platforms). On the other hand, a git-compat-core.h being included at
the top of the .h files belonging to category 2 doesn't work, because
when these .h files are included by code in category 3, it's too late.

In this example, gnu-source-header.h below is a system header that
changes behavior depending on _GNU_SOURCE (effectively the same
concern as you were raising in the quoted paragraph):

external-project-category3.c:
#include <stdint.h>
#include <gnu-source-header.h>
#include <git/some-lib-interface.h>

git/some-lib-interface.h:
#include <git/git-compat-core.h>

We can't do anything in git/git-compat-core.h that relies on careful
inclusion order, requiring that various things are #defined prior to
the first inclusion of certain headers, etc. stdint.h and
gnu-source-header.h are already included, and so it's too late for us
to #define things that change their behavior, because that won't have
any effect. I don't think it's reasonable to expect the external
project to #include <git/git-compat-core.h> at the top of their files
that are in category 3. It's definitely not reasonable to require the
external project to do that for all their files (category 3 and
category 4). It's slightly more reasonable to have them do some set of
#defines for their binaries and libraries, but still quite awkward and
potentially not feasible (for things like _FILE_OFFSET_BITS). This is
why I split it into 4 categories.

I believe that category 2 files need to be maximally compatible, both
to platforms [where we provide support for libraries, and I think this
probably will end up being a subset of all the platforms, especially
at first] and to the environment they're being #included in and
interacting with. So they need to be self-contained: they can't rely
on stdint.h having been included, but they also can't rely on it _not_
having been included already. The category 2 .h files need to be
minimal: just what we want in this external library interface, and
ideally nothing else. The category 2 .h and .c files need to be
compatible with common #defines being set OR not set. The point of a
category 2 .c file is to bridge the gap between the category 1 and
category 3 environments. This likely means that we need to be careful
about certain structs and typedefs defined by the system (vs. structs
defined by the category 2 headers themselves) being passed between the
different environments. For example, if we were to have a library
interface that included a `struct stat`, and the category 3 files
didn't have _FILE_OFFSET_BITS 64, but the category 1 files do? Instant
breakage.
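
One way to at least detect that kind of mismatch is to have the library
report the size it was compiled with and let the caller compare.  This
is only a sketch under assumptions: lib_sizeof_off_t() and
lib_off_t_matches() are hypothetical names, and in a real setup the two
functions would live in the library's translation unit while the caller
passes its own sizeof(off_t) from a separately compiled file.

```c
#include <stddef.h>
#include <sys/types.h>	/* off_t */

/*
 * Library side (hypothetical): report the off_t width this
 * translation unit was compiled with.
 */
static size_t lib_sizeof_off_t(void)
{
	return sizeof(off_t);
}

/*
 * Returns 1 when the caller's off_t has the same size as the
 * library's, 0 otherwise.
 */
static int lib_off_t_matches(size_t caller_sizeof_off_t)
{
	return caller_sizeof_off_t == lib_sizeof_off_t();
}
```

A size check like this catches only width differences; it does not make
passing `struct stat` across the boundary safe, which is an argument for
keeping such system-defined types out of the interface altogether.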

This aspect of this discussion probably should happen on the next
patch, or in a separate thread :) But since I'm here, I'll summarize
these thoughts: basically, the next patch, imho, doesn't go far
enough, but is a very good first step that we can build on. We need to
define what belongs to the "external interface" of the various
libraries (category 2 above) and what is considered category 1.
pager.h is pretty obviously category 1. strbuf.h, abspath.h, etc? I'm
not sure. git-std-lib is weird, because it's so low level and there's
not really much "internal" code to this library. So maybe we declare
those as category 2. But I don't know how that will actually work in
practice. I'll try to find time to write up these thoughts on that
patch.

>
> What this patch _wants_ to do is of course sympathizable, and we
> have "make hdr-check" rule to enforce "a header must include the
> headers that declare what it uses", except that it lets the header
> files being tested assume that the things made available by
> including <git-compat-util.h> are always available.
>
> I think a sensible direction to go for libification purposes is to
> also make sure that sources that are compiled into gitstdlib.a, and
> the headers that make what is in gitstdlib.a available, include the
> <git-compat-util.h> header file.  There may be things declared in
> the <git-compat-util.h> header that are _too_ specific to what ought
> to be linked into the final "git" binary and unwanted by library
> clients that are not "git" binary, and the right way to deal with it
> is to split <git-compat-util.h> into two parts, i.e. what makes
> system services available like its conditional inclusion of
> <stdint.h> vs <inttypes.h>, definition of feature macros, order in
> which the current <git-compat-util.h> includes system headers, etc.,
> excluding those that made you write this patch to avoid assuming
> that the client code would have included <git-compat-util.h> before
> <pager.h>, would be the new <git-compat-core.h>.  And everything
> else will remain in <git-compat-util.h>, which will include the
> <git-compat-core.h>.  The <pager.h> header for library clients would
> include <git-compat-core.h> instead, to still allow them to use the
> same types as "git" binary itself that way.
>
>
>
>
>
>

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-26 18:59       ` Kyle Lippincott
@ 2024-02-27  0:20         ` Junio C Hamano
  2024-02-27  0:56           ` Kyle Lippincott
  0 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2024-02-27  0:20 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: Calvin Wan, git, Jonathan Tan, phillip.wood123

Kyle Lippincott <spectral@google.com> writes:

> The condition added 13 years ago was, IMHO, backwards from what it
> should have been. The intent is to have stdint.h included. We should
> include stdint.h. I suspect that 17 years is enough time for that
> platform to start conforming to what is now a 25 year old standard,
> and I don't know how we can verify that and have this stop being a
> haunted graveyard without just trying it and seeing if the build bots
> or maintainers identify it as a continuing issue.

The nightmare of Solaris might be luckily behind us, but the world
does not only run Linux and GNU libc, and it is not just <stdint.h>
vs <inttypes.h>.  This is about general code hygiene.

> If it's still an
> issue (and only if), we should reintroduce a conditional, but invert
> it: if there's no stdint.h, THEN include inttypes.h.

But it would give the wrong order in general in the modern world,
where <inttypes.h> is supposed to include <stdint.h> and extend it.

We use inttypes.h by default because the C standard already talks
about it, and fall back to stdint.h when the platform lacks it.  But
what I suspect is that nobody compiles with NO_INTTYPES_H and we
would unknowingly (but not "unintentionally") start using the
extended types that are only available in <inttypes.h> but not in
<stdint.h> sometime in the future.  It might already have happened,
but I do not know.  I haven't compiled with NO_INTTYPES_H for some
time (to experiment), and I haven't met a platform that actually
requires NO_INTTYPES_H defined to build.  Once such a change
is made without anybody knowingly breaking some rare platform,
if nobody complains, we can just drop the support to allow us to
limit ourselves to <stdint.h>, but since we hear nobody complaining,
we should be OK with the current rule of including system header
files that is embodied in <git-compat-util.h> header file.

In any case, your sources should not include a standard library
header directly yourself, period.  Instead let <git-compat-util.h>
take care of the details of how we need to obtain what we need out
of the system on various platforms.

Thanks.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-27  0:20         ` Junio C Hamano
@ 2024-02-27  0:56           ` Kyle Lippincott
  2024-02-27  2:45             ` Junio C Hamano
  2024-02-27  8:45             ` Jeff King
  0 siblings, 2 replies; 111+ messages in thread
From: Kyle Lippincott @ 2024-02-27  0:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Calvin Wan, git, Jonathan Tan, phillip.wood123

On Mon, Feb 26, 2024 at 4:20 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Kyle Lippincott <spectral@google.com> writes:
>
> > The condition added 13 years ago was, IMHO, backwards from what it
> > should have been. The intent is to have stdint.h included. We should
> > include stdint.h. I suspect that 17 years is enough time for that
> > platform to start conforming to what is now a 25 year old standard,
> > and I don't know how we can verify that and have this stop being a
> > haunted graveyard without just trying it and seeing if the build bots
> > or maintainers identify it as a continuing issue.
>
> The nightmare of Solaris might be luckily behind us, but the world
> does not only run Linux and GNU libc, and it is not just <stdint.h>
> vs <inttypes.h>.  This is about general code hygiene.
>
> > If it's still an
> > issue (and only if), we should reintroduce a conditional, but invert
> > it: if there's no stdint.h, THEN include inttypes.h.
>
> But it would give the wrong order in general in the modern world,
> where <inttypes.h> is supposed to include <stdint.h> and extend it.
>
> We use inttypes.h by default because the C standard already talks
> about it, and fall back to stdint.h when the platform lacks it.  But
> what I suspect is that nobody compiles with NO_INTTYPES_H and we
> would unknowingly (but not "unintentionally") start using the
> extended types that are only available in <inttypes.h> but not in
> <stdint.h> sometime in the future.  It might already have happened,

It has. We use PRIuMAX, which is from inttypes.h. I think it's only
"accidentally" working if anyone uses NO_INTTYPES_H. I changed my
stance halfway through this investigation in my previous email, I
apologize for not going back and editing it to make it clear at the
beginning that I'd done so. My current stance is that
<git-compat-util.h> should be either (a) including only inttypes.h
(since it includes stdint.h), or (b) including both inttypes.h and
stdint.h (in either order), just to demonstrate that we can.

> but I do not know.  I haven't compiled with NO_INTTYPES_H for some
> time (to experiment), and I haven't met a platform that actually
> requires NO_INTTYPES_H defined to build.  Once such a change
> is made without anybody knowingly breaking some rare platform,
> if nobody complains, we can just drop the support to allow us to
> limit ourselves to <stdint.h>, but since we hear nobody complaining,
> we should be OK with the current rule of including system header
> files that is embodied in <git-compat-util.h> header file.
>
> In any case, your sources should not include a standard library
> header directly yourself, period.  Instead let <git-compat-util.h>
> take care of the details of how we need to obtain what we need out
> of the system on various platforms.

I disagree with this statement. We _can't_ use a magic compatibility
header file in the library interfaces, for the reasons I outlined
further below in my previous message. For those headers, the ones that
might be included by code that's not under the Git project's control,
they need to be self-contained, minimal, and maximally compatible.

>
> Thanks.

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-27  0:56           ` Kyle Lippincott
@ 2024-02-27  2:45             ` Junio C Hamano
  2024-02-27 22:29               ` Kyle Lippincott
  2024-02-27  8:45             ` Jeff King
  1 sibling, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2024-02-27  2:45 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: Calvin Wan, git, Jonathan Tan, phillip.wood123

Kyle Lippincott <spectral@google.com> writes:

>> In any case, your sources should not include a standard library
>> header directly yourself, period.  Instead let <git-compat-util.h>
>> take care of the details of how we need to obtain what we need out
>> of the system on various platforms.
>
> I disagree with this statement. We _can't_ use a magic compatibility
> header file in the library interfaces, for the reasons I outlined
> further below in my previous message. For those headers, the ones that
> might be included by code that's not under the Git project's control,
> they need to be self-contained, minimal, and maximally compatible.

Note that I am not talking about your random outside program that
happens to link with gitstdlib.a; it would want to include a header
file <gitstdlib.h> that comes with the library.

Earlier I suggested that you may want to take a subset of
<git-compat-util.h>, because <git-compat-util.h> may have a lot more
than what is minimally necessary to allow our sources to be
insulated from details of platform dependence.  You can think of
that subset as a good starting point to build the <gitstdlib.h>
header file to be given to the library customers.

But the sources that go to the library, as gitstdlib.a is supposed
to serve as a subset of libgit.a to our internal codebase when
building the git binary, should still follow our header inclusion
rules.

Because we would want to make sure that the sources that are made
into gitstdlib.a, the sources to the rest of libgit.a, and the
sources to the rest of git, all agree on what system features we ask
from the system, feature macros that must be defined to certain
values before we include system library files (like _XOPEN_SOURCE
and _FILE_OFFSET_BITS) must be defined consistently across all of
these three pieces.  One way to do so may be to ensure that the
definition of them would be migrated to <gitstdlib.h> when we
separate a subset out of <git-compat-util.h> to it (and of course,
we make <git-compat-util.h> include <gitstdlib.h> so that it
would be still sufficient for our in-tree users to include the
<git-compat-util.h>).
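
A minimal sketch of that arrangement, assuming a hypothetical
<gitstdlib.h> whose contents are inlined here so the example is a
single file. The guard name GITSTDLIB_H and the helper
gitstdlib_sizeof_off_t() are illustrative only and do not exist in
any current header; 64 is the conventional value used to request a
64-bit off_t. The point is only the ordering: feature macros first,
system headers after.

```c
/* --- contents of the hypothetical <gitstdlib.h> --- */
#ifndef GITSTDLIB_H
#define GITSTDLIB_H

/* Feature macros must be set before the first system header is seen. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <stddef.h>
#include <sys/types.h>	/* off_t, ssize_t */

/* Illustrative helper: the off_t size this translation unit was built with. */
size_t gitstdlib_sizeof_off_t(void);

#endif /* GITSTDLIB_H */

/* --- a source file that includes it --- */
size_t gitstdlib_sizeof_off_t(void)
{
	return sizeof(off_t);
}
```

If a system header had already been included before this one, the
macro would arrive too late to have any effect, which is why the
inclusion order matters.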

<gitstdlib.h> may have to expose an API function that uses some
extended types only available by including system header files,
e.g. some function may return ssize_t as its value or take an off_t
value as its argument.

Whether our header should include system headers to make these types
available to our definitions is probably open to discussion.  It is
harder to do so portably, unless your world is limited to POSIX.1
and ISO C, than making it the responsibility of library users.

But if the platform headers and libraries support feature macros
that allow you to tweak these sizes (e.g. the size of off_t may be
controlled by setting the _FILE_OFFSET_BITS to an appropriate
value), it may be irresponsible to leave that to the library users,
as they MUST make sure to define such feature macros exactly the
same way as we define for our code, which currently is done in
<git-compat-util.h>, before they include their system headers to
obtain off_t so that they can use <gitstdlib.h>.

So the rules for library clients (random outside programs that
happen to link with gitstdlib.a) may not be that they must include
<git-compat-util.h> as the first thing, but they probably still have
to include <gitstdlib.h> fairly early before including any of their
system headers, I would suspect, unless they are willing to accept
such responsibility fully to ensure they compile the same way as the
gitstdlib library, I would think.




^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-27  0:56           ` Kyle Lippincott
  2024-02-27  2:45             ` Junio C Hamano
@ 2024-02-27  8:45             ` Jeff King
  2024-02-27  9:05               ` Jeff King
  2024-02-27 20:10               ` Kyle Lippincott
  1 sibling, 2 replies; 111+ messages in thread
From: Jeff King @ 2024-02-27  8:45 UTC (permalink / raw)
  To: Kyle Lippincott
  Cc: Junio C Hamano, Calvin Wan, git, Jonathan Tan, phillip.wood123

On Mon, Feb 26, 2024 at 04:56:28PM -0800, Kyle Lippincott wrote:

> > We use inttypes.h by default because the C standard already talks
> > about it, and fall back to stdint.h when the platform lacks it.  But
> > what I suspect is that nobody compiles with NO_INTTYPES_H and we
> > would unknowingly (but not "unintentionally") start using the
> > extended types that are only available in <inttypes.h> but not in
> > <stdint.h> sometime in the future.  It might already have happened,
> 
> It has. We use PRIuMAX, which is from inttypes.h.

Is it always, though? That's what C99 says, but if you have a system
that does not have inttypes.h in the first place, but does have
stdint.h, it seems possible that it provides conversion macros elsewhere
(either via stdint.h, or possibly just as part of stdio.h).

So it might be that things have been horribly broken on NO_INTTYPES_H
systems for a while, and nobody is checking. But I don't think you can
really say so without looking at such a system.

And looking at config.mak.uname, it looks like Windows is such a system.
Does it really have inttypes.h and it is getting included from somewhere
else, making format conversion macros work? Or does it provide those
macros elsewhere, and really needs stdint? It does look like
compat/mingw.h includes it, but I think we wouldn't use that for msvc
builds.

> I think it's only
> "accidentally" working if anyone uses NO_INTTYPES_H. I changed my
> stance halfway through this investigation in my previous email, I
> apologize for not going back and editing it to make it clear at the
> beginning that I'd done so. My current stance is that
> <git-compat-util.h> should be either (a) including only inttypes.h
> (since it includes stdint.h), or (b) including both inttypes.h and
> stdint.h (in either order), just to demonstrate that we can.

It is good to clean up old conditionals if they are no longer
applicable, as they are a burden to reason about later (as this
discussion shows). But I am not sure about your "just to demonstrate we
can". It is good to try it out, but it looks like there is a non-zero
chance that MSVC on Windows might break. It is probably better to try
building there or looping in folks who can, rather than just making a
change and seeing if anybody screams.

I think the "win+VS" test in the GitHub Actions CI job might cover this
case. It is not run by default (because it was considered to be mostly
redundant with the mingw build), but it shouldn't be too hard to enable
it for a one-off test.

-Peff

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-27  8:45             ` Jeff King
@ 2024-02-27  9:05               ` Jeff King
  2024-02-27 20:10               ` Kyle Lippincott
  1 sibling, 0 replies; 111+ messages in thread
From: Jeff King @ 2024-02-27  9:05 UTC (permalink / raw)
  To: Kyle Lippincott
  Cc: Sven Strickroth, Junio C Hamano, Calvin Wan, git, Jonathan Tan,
	phillip.wood123

On Tue, Feb 27, 2024 at 03:45:29AM -0500, Jeff King wrote:

> It is good to clean up old conditionals if they are no longer
> applicable, as they are a burden to reason about later (as this
> discussion shows). But I am not sure about your "just to demonstrate we
> can". It is good to try it out, but it looks like there is a non-zero
> chance that MSVC on Windows might break. It is probably better to try
> building there or looping in folks who can, rather than just making a
> change and seeing if anybody screams.
> 
> I think the "win+VS" test in the GitHub Actions CI job might cover this
> case. It is not run by default (because it was considered to be mostly
> redundant with the mingw build), but it shouldn't be too hard to enable
> it for a one-off test.

Here's a successful run with the NO_INTTYPES_H line removed:

  https://github.com/peff/git/actions/runs/8062063219

Blaming the code around the inttypes.h reference in compat/mingw.h
turned up 0ef60afdd4 (MSVC: use shipped headers instead of fallback
definitions, 2016-03-30), which claims that inttypes.h was added to
VS2013.

That sounds old-ish in 2024, but I don't know how old is normal in the
Windows world.

All of which to me says that cleaning this up is something that should
involve Windows folks, who can make the judgement for their platform.

-Peff

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-27  8:45             ` Jeff King
  2024-02-27  9:05               ` Jeff King
@ 2024-02-27 20:10               ` Kyle Lippincott
  1 sibling, 0 replies; 111+ messages in thread
From: Kyle Lippincott @ 2024-02-27 20:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Calvin Wan, git, Jonathan Tan, phillip.wood123

On Tue, Feb 27, 2024 at 12:45 AM Jeff King <peff@peff.net> wrote:
>
> On Mon, Feb 26, 2024 at 04:56:28PM -0800, Kyle Lippincott wrote:
>
> > > We use inttypes.h by default because the C standard already talks
> > > about it, and fall back to stdint.h when the platform lacks it.  But
> > > what I suspect is that nobody compiles with NO_INTTYPES_H and we
> > > would unknowingly (but not "unintentionally") start using the
> > > extended types that are only available in <inttypes.h> but not in
> > > <stdint.h> sometime in the future.  It might already have happened,
> >
> > It has. We use PRIuMAX, which is from inttypes.h.
>
> Is it always, though? That's what C99 says, but if you have a system
> that does not have inttypes.h in the first place, but does have
> stdint.h, it seems possible that it provides conversion macros elsewhere
> (either via stdint.h, or possibly just as part of stdio.h).

It's of course possible that on some platforms, stdio.h or stdint.h
defines these macros (or includes inttypes.h internally, which defines
them). However, I think that to be "correct" and for a compiler
to claim it supports C99 (and the compiler _must_ claim that because
of the guard in <git-compat-util.h>), inttypes.h must exist, and it
must cause these symbols to appear. If there are platforms that are
claiming to be C99 and inttypes.h doesn't exist or doesn't provide the
symbols it should, I don't think we should try to support them - they
can maintain platform-specific patches for whatever not-actually-C99
language the platform supports. Basically what git for windows is
already doing (presumably for other reasons), as far as I can tell.
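
A minimal sketch of the dependency under discussion, assuming the
NO_INTTYPES_H conditional has the shape described earlier in the
thread. The helper format_uintmax() is a hypothetical name used only
for illustration. The #error turns "accidentally working" into an
explicit build failure when the format macro is missing.

```c
/* Mirrors the conditional discussed in this thread. */
#ifndef NO_INTTYPES_H
#include <inttypes.h>
#else
#include <stdint.h>
#endif
#include <stdio.h>

/* Fail loudly instead of "accidentally" working. */
#ifndef PRIuMAX
#error "PRIuMAX is missing: <inttypes.h> (or an equivalent) is required"
#endif

/* Illustrative helper: format a uintmax_t in decimal; returns snprintf()'s result. */
int format_uintmax(char *buf, size_t len, uintmax_t value)
{
	return snprintf(buf, len, "%" PRIuMAX, value);
}
```

Building this with -DNO_INTTYPES_H on a given platform is one cheap
way to see whether that platform supplies PRIuMAX from somewhere
other than <inttypes.h>.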

>
> So it might be that things have been horribly broken on NO_INTTYPES_H
> systems for a while, and nobody is checking. But I don't think you can
> really say so without looking at such a system.
>
> And looking at config.mak.uname, it looks like Windows is such a system.
> Does it really have inttypes.h and it is getting included from somewhere
> else, making format conversion macros work? Or does it provide those
> macros elsewhere, and really needs stdint? It does look like
> compat/mingw.h includes it, but I think we wouldn't use that for msvc
> builds.
>
> > I think it's only
> > "accidentally" working if anyone uses NO_INTTYPES_H. I changed my
> > stance halfway through this investigation in my previous email, I
> > apologize for not going back and editing it to make it clear at the
> > beginning that I'd done so. My current stance is that
> > <git-compat-util.h> should be either (a) including only inttypes.h
> > (since it includes stdint.h), or (b) including both inttypes.h and
> > stdint.h (in either order), just to demonstrate that we can.
>
> It is good to clean up old conditionals if they are no longer
> applicable, as they are a burden to reason about later (as this
> discussion shows). But I am not sure about your "just to demonstrate we
> can".

Yeah, I'm also not convinced the "just to demonstrate we can" has much
value. I was trying to get ahead of future discussions where we claim
it's important to never include stdint.h (because people remember this
discussion and how contentious it was) and think it might misbehave,
and instead just include it in <git-compat-util.h> to prove it
_doesn't_ misbehave, and thus start to allow usage in self-contained
headers when necessary.

> It is good to try it out, but it looks like there is a non-zero
> chance that MSVC on Windows might break. It is probably better to try
> building there or looping in folks who can, rather than just making a
> change and seeing if anybody screams.

I think I miscommunicated here, or had too many assumptions about the
current state of things that I didn't actually verify. When I wrote
"and seeing if the build bots or maintainers identify it as a
continuing issue", I was assuming that we had build bots for all major
platforms (including windows, with however it gets built: mingw or VC
or whatever), and I included the "maintainers" part of it for the long
tail of esoteric platforms that we either don't know about, or can't
have automated builds on for whatever reason. I agree that making
changes that have a high likelihood of breaking supported platforms
(which gets back to that platform support thread that was started a
few weeks ago) should not be done lightly, and it's not reasonable to
make the change and wait for maintainers of these "supported
platforms" to complain. I was relying on the build bots covering the
"supported platforms" and stopping me from even sending such a patch
to the mailing list :)

>
> I think the "win+VS" test in the GitHub Actions CI job might cover this
> case. It is not run by default (because it was considered to be mostly
> redundant with the mingw build), but it shouldn't be too hard to enable
> it for a one-off test.
>
> -Peff

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-27  2:45             ` Junio C Hamano
@ 2024-02-27 22:29               ` Kyle Lippincott
  2024-02-27 23:25                 ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Kyle Lippincott @ 2024-02-27 22:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Calvin Wan, git, Jonathan Tan, phillip.wood123

On Mon, Feb 26, 2024 at 6:45 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Kyle Lippincott <spectral@google.com> writes:
>
> >> In any case, your sources should not include a standard library
> >> header directly yourself, period.  Instead let <git-compat-util.h>
> >> take care of the details of how we need to obtain what we need out
> >> of the system on various platforms.
> >
> > I disagree with this statement. We _can't_ use a magic compatibility
> > header file in the library interfaces, for the reasons I outlined
> > further below in my previous message. For those headers, the ones that
> > might be included by code that's not under the Git project's control,
> > they need to be self-contained, minimal, and maximally compatible.
>
> Note that I am not talking about your random outside program that
> happens to link with gitstdlib.a; it would want to include a header
> file <gitstdlib.h> that comes with the library.

I agree with this.

>
> Earlier I suggested that you may want to take a subset of
> <git-compat-util.h>, because <git-compat-util.h> may have a lot more
> than what is minimally necessary to allow our sources to be
> insulated from details of platform dependence.  You can think of
> that subset as a good starting point to build the <gitstdlib.h>
> header file to be given to the library customers.
>
> But the sources that go to the library, as gitstdlib.a is supposed
> to serve as a subset of libgit.a to our internal codebase when
> building the git binary, should still follow our header inclusion
> rules.

If I'm understanding this correctly, I agree with it. The .c files
still include <git-compat-util.h>, and don't change. The internal-only
.h files (ones that a pre-built-library consumer doesn't need to even
have in the filesystem) still assume that <git-compat-util.h> was
included, and don't change. <pager.h> falls into this category.

>
> Because we would want to make sure that the sources that are made
> into gitstdlib.a, the sources to the rest of libgit.a, and the
> sources to the rest of git, all agree on what system features we ask
> from the system, feature macros that must be defined to certain
> values before we include system library files (like _XOPEN_SOURCE
> and _FILE_OFFSET_BITS) must be defined consistently across all of
> these three pieces.  One way to do so may be to ensure that the
> definition of them would be migrated to <gitstdlib.h> when we
> separate a subset out of <git-compat-util.h> to it (and of course,
> we make <git-compat-util.h> to include <gitstdlib.h> so that it
> would be still sufficient for our in-tree users to include the
> <git-compat-util.h>)
>
> <gitstdlib.h> may have to expose an API function that uses some
> extended types only available by including system header files,
> e.g. some function may return ssize_t as its value or take an off_t
> value as its argument.

I agree that these types will be necessary (specifically ssize_t and
int##_t, but less so off_t) in the "external" (used by projects other
than Git) library interfaces.

>
> If our header should include system headers to make these types
> available to our definitions is probably open to discussion.  It is
> harder to do so portably, unless your world is limited to POSIX.1
> and ISO C, than making it the responsibility of library users.

I think I'm probably missing the nuance here, and may be making this
discussion much harder because of it. My understanding is that Git is
using C99; is that different from ISO C? There's something at the top
of <git-compat-util.h> that enforces that we're using C99. Therefore,
I'm assuming that any compiler that claims to be C99 and passes that
check at the top of <git-compat-util.h> will support inttypes.h,
stdint.h, stdbool.h, and the other headers defined by the C99 standard
that provide types we need in our .h files, and that these can be
included without reservation. To flip it around: any compiler/platform that's
missing inttypes.h, or is missing stdint.h, or raises errors if both
are included, or requires other headers to be included before them
_isn't a C99 compiler_, and _isn't supported_. I'm picking on these
files because I think they will be necessary for the external library
interfaces. I'm intentionally ignoring any file not mentioned in the
C99 standard, because those are platform specific. I acknowledge that
there may be some functionality in these files that's only enabled if
certain #defines are set. Our external interfaces should strive to not
use that functionality, and only do so if we are able to test for this
functionality and refuse to compile if it's not available. I have an
example with uintmax_t below.

>
> But if the platform headers and libraries support feature macros
> that allows you to tweak these sizes (e.g. the size of off_t may be
> controlled by setting the _FILE_OFFSET_BITS to an appropriate
> value), it may be irresponsible to leave that to the library users,
> as they MUST make sure to define such feature macros exactly the
> same way as we define for our code, which currently is done in
> <git-compat-util.h>, before they include their system headers to
> obtain off_t so that they can use <gitstdlib.h>.

I think the only viable solution to this is to not use these types
that depend on #defines in the interface available to non-git
projects. We can't set _FILE_OFFSET_BITS in the library's external
(used by non-Git projects) interface header, as there's a high
likelihood that it's either too late (external project #included
something that relies on _FILE_OFFSET_BITS already), or, if not, we
create the "off_t is a different size" problem for their code.

This means that we can't use off_t in these external interface headers
(and in the .c files that support them, if any). We can't use `struct
stat`. We likely need to limit ourselves to just the typedefs from
stdint.h, and probably will need some additional checks that enforce
that we have the types and sizes we expect (ex: I could imagine that
some platforms define uintmax_t as 32-bit, or 128-bit. Either we can't
use it in these external interfaces, or we have to enforce somehow
that the simplest file we can imagine (#include <stdint.h>) gets a
definition of uintmax_t that is the exact same as the one we'd get if
we included <git-compat-util.h>). The external interface headers don't
need to be as platform-compatible as the rest of the git code base,
because not every platform is going to be a supported target for using
the library in non-git projects, especially at first. The external
interface headers _do_ need to be as tolerant and well behaved as
possible when being included by external projects, which I'm asserting
means they need to be self-contained and minimal. If that means these
external interfaces don't get to use off_t at all, so be it. If it
means they can only be included if sizeof(off_t) == 8, and we have a
way of enforcing that at compile time, that's fine with me too. But we
can't #define _FILE_OFFSET_BITS ourselves in this external interface
to get that behavior, because it just doesn't work.
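
A minimal sketch of the "refuse to compile" option, assuming a
C99-only toolchain (so no _Static_assert). The macro
EXT_STATIC_ASSERT and the helper ext_off_t_bits() are hypothetical
names, not anything in the current tree. Nothing here defines
_FILE_OFFSET_BITS; the header only checks what the including project
already chose.

```c
#include <limits.h>	/* CHAR_BIT */
#include <stdint.h>	/* uintmax_t */
#include <sys/types.h>	/* off_t; POSIX, not ISO C */

/* C99-compatible compile-time check: a negative array size is an error. */
#define EXT_STATIC_ASSERT(cond, name) \
	typedef char ext_static_assert_##name[(cond) ? 1 : -1]

/* Refuse to compile rather than silently disagree with the library. */
EXT_STATIC_ASSERT(sizeof(off_t) * CHAR_BIT == 64, off_t_is_64_bit);
EXT_STATIC_ASSERT(sizeof(uintmax_t) * CHAR_BIT >= 64, uintmax_t_is_at_least_64_bit);

/* Illustrative helper so callers can also inspect the width at run time. */
int ext_off_t_bits(void)
{
	return (int)(sizeof(off_t) * CHAR_BIT);
}
```

On a platform where off_t is 32 bits unless _FILE_OFFSET_BITS is set,
this fails at compile time in the including project, which is the
intended behavior under this approach.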

I'm making some assumptions here. I'm assuming that the git binary
uses a different interface to a hypothetical libgitobjstore.a than an
external project would (i.e. that there'd be some
git-obj-store-interface.h that gets included by non-Git projects, but
not by git itself). Is git-std-lib an obvious counterexample to this
assumption? Yes and no. No one (besides Git itself) is going to
include libgitstdlib.a in their project any time soon, so there's no
real "external interface" to define right now. Eventually, having
git-std-lib types in the hypothetical git-obj-store-interface.h _may_
happen, or it may not. I don't know.

...

But I think we're in agreement that pager.h isn't part of
git-std-lib's (currently undefined/non-existent) external interface,
and so doesn't need to be self-contained, and this patch should
probably be dropped?
>
> So the rules for library clients (random outside programs that
> happen to link with gitstdlib.a) may not be that they must include
> <git-compat-util.h> as the first thing, but they probably still have
> to include <gitstdlib.h> fairly early before including any of their
> system headers, I would suspect, unless they are willing to accept
> such responsibility fully to ensure they compile the same way as the
> gitstdlib library, I would think.
>
>
>

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used
  2024-02-27 22:29               ` Kyle Lippincott
@ 2024-02-27 23:25                 ` Junio C Hamano
  0 siblings, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2024-02-27 23:25 UTC (permalink / raw)
  To: Kyle Lippincott; +Cc: Calvin Wan, git, Jonathan Tan, phillip.wood123

Kyle Lippincott <spectral@google.com> writes:

> of <git-compat-util.h> that enforces that we're using C99. Therefore,
> I'm assuming that any compiler that claims to be C99 and passes that
> check at the top of <git-compat-util.h> will support inttypes.h,
> stdint.h, stdbool.h, and other files defined by the C99 standard to
> include types that we need in our .h files are able to be included
> without reservation.

We at the git project are much more conservative than to trust the
compilers and take their "claim" to support a standard at face
value, though ;-).  As our CodingGuidelines say, we honor real world
constraints more than what's written on paper as "standard", and
would ...

> To flip it around: any compiler/platform that's
> missing inttypes.h, or is missing stdint.h, or raises errors if both
> are included, or requires other headers to be included before them
> _isn't a C99 compiler_, and _isn't supported_.

... refrain from taking such a position as much as possible.

> I think the only viable solution to this is to not use these types
> that depend on #defines in the interface available to non-git
> projects.

OK.  Now we have that behind us, can we start outlining what kind of
things _are_ exported out in the library to the outside world?  The
only example of the C source file that is a client of the library we
have is t/helper/test-stdlib.c but it includes 

    <abspath.h>
    <hex-ll.h>
    <parse.h>
    <strbuf.h>
    <string-list.h>

after including <git-compat-util.h>.  Are the services offered by
these five header files the initial set of the library, minimally
viable demonstration?  Has somebody already analyzed and enumerated
what kind of system definitions we need to support the interface
these five header files offer?

Once we know that, perhaps the next task would be to create a
<git-absolute-minimum-portability-requirement.h> header by taking a
subset of <git-compat-util.h>, and have test-stdlib.c include that
instead of <git-compat-util.h>.  <git-compat-util.h> will of course
include that header to replace the parts it lost to the new header.

It does *not* make it a requirement for client programs to include
the <git-absolute-minimum-portability-requirement.h>, though.  They
can include the system headers in the way they like, as long as they
do not let them define symbols in ways that contradict what our
headers expect.  <git-absolute-minimum-portability-requirement.h> is
merely a way for us to tell those who write these client programs
what the system symbols we rely on are.

So, that's one (or two? first to analyse and enumerate, second to
split a new header file out of compat-util) actionable task, I
guess.


^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v5 2/3] git-std-lib: introduce Git Standard Library
  2024-02-22 17:50   ` [PATCH v5 2/3] git-std-lib: introduce Git Standard Library Calvin Wan
@ 2024-02-29 11:16     ` Phillip Wood
  2024-02-29 17:23       ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Phillip Wood @ 2024-02-29 11:16 UTC (permalink / raw)
  To: Calvin Wan, git; +Cc: Jonathan Tan, Junio C Hamano

Hi Calvin

On 22/02/2024 17:50, Calvin Wan wrote:
> This commit contains:
> - Makefile rules for git-std-lib.a
> - code and Makefile rules for git-stub-lib.a
> - description and rationale of the above in Documentation/

We tend to avoid lists like this in our commit messages. Starting with 
the motivation for adding git-std-lib would be more helpful to the 
reader I think.

> Quoting from documentation introduced in this commit:
> 
>    The Git Standard Library intends to serve as the foundational library
>    and root dependency that other libraries in Git will be built off
>    of. That is to say, suppose we have libraries X and Y; a user that
>    wants to use X and Y would need to include X, Y, and this Git Standard
>    Library.
> 
> Code demonstrating the use of git-std-lib.a and git-stub-lib.a will be
> in a subsequent commit.
> 
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Helped-by: Phillip Wood <phillip.wood123@gmail.com>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>

I think virtually all the changes since the last version apart from 
rewording the documentation come from my fixup at [1]. I'm happy to offer 
my Signed-off-by: for those. I cannot offer a review of the changes from 
that fixup though I'm still happy with the approach. I do however think 
I should have included git-compat-util.h in the stub implementations and 
used BUG() instead of assert(). I've left some comments on the 
documentation below.

[1] 
https://github.com/phillipwood/git/commit/0f8393d2189a4c73d3f00f5ae74d3972677309d0
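
A minimal sketch of the two kinds of stub mentioned above. All names
here (STUB_BUG, stub_pager_in_use, stub_setup_pager) are hypothetical
stand-ins so that the example compiles on its own; a real stub would
include git-compat-util.h and call Git's BUG() rather than the local
macro.

```c
#include <stdio.h>
#include <stdlib.h>

/* Local stand-in for Git's BUG(): report the call site and abort. */
#define STUB_BUG(msg) \
	do { \
		fprintf(stderr, "BUG: %s:%d: %s\n", __FILE__, __LINE__, (msg)); \
		abort(); \
	} while (0)

/* A stub that is safe to call: the library build never has a pager. */
int stub_pager_in_use(void)
{
	return 0;
}

/* A stub that must never be reached in a library build. */
void stub_setup_pager(void)
{
	STUB_BUG("pager support is not available in this build");
}
```

Unlike assert(), a BUG()-style stub still fires when NDEBUG is
defined, so an unexpected call is never silently ignored.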

> diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
> new file mode 100644
> index 0000000000..3d9aa121ac
> --- /dev/null
> +++ b/Documentation/technical/git-std-lib.txt
> @@ -0,0 +1,170 @@
> += Git Standard Library
> +
> +The Git Standard Library intends to serve as the foundational library
> +and root dependency that other libraries in Git will be built off of.
> +That is to say, suppose we have libraries X and Y; a user that wants to
> +use X and Y would need to include X, Y, and this Git Standard Library.
> +This does not mean that the Git Standard Library will be the only
> +possible root dependency in the future, but rather the most significant
> +and widely used one. Git itself is also built off of the Git Standard
> +Library.
> +
> +== Dependency graph in libified Git
> +
> +Before the introduction of the Git Standard Library, all objects defined
> +in the Git library are compiled and archived into a singular file,

s/singular/single/. There are some instances of "singular" later on as 
well, all of which would be better as "single" I think.

> +libgit.a, which is then linked against by common-main.o with other
> +external dependencies and turned into the Git executable.

I found this description a bit confusing. As I understand it, to build 
git we link git.o against common-main.o, libgit.a, xdiff/lib.a, 
reftable/libreftable.a and libpcre etc.

> In other
> +words, the Git executable has dependencies on libgit.a and a couple of
> +external libraries. The libification of Git slightly alters this build
> +flow by separating out libgit.a into libgit.a and git-std-lib.a.
> +
> +With our current method of building Git, we can imagine the dependency

s/imagine/visualize/

> +graph as such:
> +
> +	Git
> +	 /\
> +	/  \
> +       /    \
> +  libgit.a   ext deps
> +
> +We want to separate out potential libraries from libgit.a and have
> +libgit.a depend on them, which would possibly look like:
> +
> +		Git
> +		/\
> +	       /  \
> +	      /    \
> +	  libgit.a  ext deps
> +	     /\
> +	    /  \
> +	   /    \
> +object-store.a  (other lib)
> +      |        /
> +      |       /
> +      |      /
> +      |     /
> +      |    /
> +      |   /
> +      |  /
> +git-std-lib.a
> +
> +Instead of containing all objects in Git, libgit.a would contain objects
> +that are not built by libraries it links against. Consequently, if

s/by libraries/by the libraries/

> +someone wanted a custom build of Git with a custom implementation of the
> +object store, they would only have to swap out object-store.a rather
> +than do a hard fork of Git.
> +
> +== Rationale behind Git Standard Library
> +
> +The rationale behind the selected object files in the Git Standard
> +Library is the result of two observations within the Git
> +codebase:
> +  1. every file includes git-compat-util.h which defines functions
> +     in a couple of different files
> +  2. wrapper.c + usage.c have difficult-to-separate circular

s/+/and/

> +     dependencies with each other and other files.
> +
> +=== Ubiquity of git-compat-util.h and circular dependencies
> +
> +Every file in the Git codebase includes git-compat-util.h. It serves as
> +"a compatibility aid that isolates the knowledge of platform specific
> +inclusion order and what feature macros to define before including which
> +system header" (Junio[1]). Since every file includes git-compat-util.h,
> +and git-compat-util.h includes wrapper.h and usage.h, it would make
> +sense for wrapper.c and usage.c to be a part of the root library. They
> +have difficult to separate circular dependencies with each other so it
> +would be impractical for them to be independent libraries. Wrapper.c has
> +dependencies on parse.c, abspath.c, strbuf.c, which in turn also have
> +dependencies on usage.c and wrapper.c - more circular dependencies.
> +
> +=== Tradeoff between swappability and refactoring
> +
> +From the above dependency graph, we can see that git-std-lib.a could be
> +many smaller libraries rather than a singular library. So why choose a
> +singular library when multiple libraries can be individually easier to
> +swap and are more modular? A singular library requires less work to
> +separate out circular dependencies within itself so it becomes a
> +tradeoff question between work and reward. While there may be a point in
> +the future where a file like usage.c would want its own library so that
> +someone can have custom die() or error(), the work required to refactor
> +out the circular dependencies in some files would be enormous due to
> +their ubiquity so therefore I believe it is not worth the tradeoff

I'm not sure if we want to use the first person in our technical 
documentation, unlike the cover letter to a patch series it is not 
immediately obvious to the reader who "I" is. This applies to the passages 
in the first person below as well.

> +currently. Additionally, we can in the future choose to do this refactor
> +and change the API for the library if there becomes enough of a reason
> +to do so (remember we are avoiding promising stability of the interfaces
> +of those libraries).
> +
> +=== Reuse of compatibility functions in git-compat-util.h
> +
> +Most functions defined in git-compat-util.h are implemented in compat/
> +and have dependencies limited to strbuf.h and wrapper.h so they can be
> +easily included in git-std-lib.a, which as a root dependency means that
> +higher level libraries do not have to worry about compatibility files in
> +compat/. The rest of the functions defined in git-compat-util.h are
> +implemented in top level files and are hidden behind
> +an #ifdef if their implementation is not in git-std-lib.a.

I think the reference to #ifdef is out of date now that we've moved to more stubs.

> +=== Rationale summary
> +
> +The Git Standard Library allows us to get the libification ball rolling
> +with other libraries in Git. By not spending many more months attempting
> +to refactor difficult circular dependencies and instead spending that
> +time getting to a state where we can test out swapping a library out
> +such as config or object store, we can prove the viability of Git
> +libification on a much faster time scale. Additionally the code cleanups
> +that have happened so far have been minor and beneficial for the
> +codebase. It is probable that making large movements would negatively
> +affect code clarity.
> +
> +== Git Standard Library boundary
> +
> +While I have described above some useful heuristics for identifying
> +potential candidates for git-std-lib.a, a standard library should not
> +have a shaky definition for what belongs in it.

Maybe "we need a more precise definition" rather than the "shaky 
definition" bit

> + - Low-level files (aka operates only on other primitive types) that are
> +   used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
> +   - Dependencies that are low-level and widely used
> +     (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
> + - low-level git/* files with functions defined in git-compat-util.h
> +   (ctype.c)
> + - compat/*
> +
> +There are other files that might fit this definition, but that does not
> +mean it should belong in git-std-lib.a. Those files should start as
> +their own separate library since any file added to git-std-lib.a loses
> +its flexibility of being easily swappable.
> +
> +Wrapper.c and usage.c have dependencies on pager and trace2 that are
> +possible to remove at the cost of sacrificing the ability for standard Git
> +to be able to trace functions in those files and other files in git-std-lib.a.
> +In order for git-std-lib.a to compile with those dependencies, stubbed out
> +versions of those files are implemented and swapped in during compilation time
> +(see STUB_LIB_OBJS in the Makefile).
> +
> +== Files inside of Git Standard Library
> +
> +The set of files in git-std-lib.a can be found in STD_LIB_OBJS and COMPAT_OBJS
> +in the Makefile.
> +
> +When these files are compiled together with the files in STUB_LIB_OBJS (or
> +user-provided files that provide the same functions), they form a complete
> +library.
> +
> +== Pitfalls
> +
> +There are a small amount of files under compat/* that have dependencies

s/amount/number/ as files are countable

> +not inside of git-std-lib.a. While those functions are not called on
> +Linux, other OSes might call those problematic functions. I don't see
> +this as a major problem, just moreso an observation that libification in
> +general may also require some minor compatibility work in the future.
> +
> +== Testing
> +
> +Unit tests should catch any breakages caused by changes to files in
> +git-std-lib.a (i.e. introduction of a out of scope dependency) and new
> +functions introduced to git-std-lib.a will require unit tests written

s/tests written/tests to be written/

> +for them.
> +
> +[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/

It is nice to see us compiling git using git-std-lib.

Best Wishes

Phillip

> diff --git a/Makefile b/Makefile
> index 4e255c81f2..d37ea9d34b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -669,6 +669,8 @@ FUZZ_PROGRAMS =
>   GIT_OBJS =
>   LIB_OBJS =
>   SCALAR_OBJS =
> +STD_LIB_OBJS =
> +STUB_LIB_OBJS =
>   OBJECTS =
>   OTHER_PROGRAMS =
>   PROGRAM_OBJS =
> @@ -923,6 +925,8 @@ TEST_SHELL_PATH = $(SHELL_PATH)
>   
>   LIB_FILE = libgit.a
>   XDIFF_LIB = xdiff/lib.a
> +STD_LIB_FILE = git-std-lib.a
> +STUB_LIB_FILE = git-stub-lib.a
>   REFTABLE_LIB = reftable/libreftable.a
>   REFTABLE_TEST_LIB = reftable/libreftable_test.a
>   
> @@ -962,7 +966,6 @@ COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
>   
>   LIB_H = $(FOUND_H_SOURCES)
>   
> -LIB_OBJS += abspath.o
>   LIB_OBJS += add-interactive.o
>   LIB_OBJS += add-patch.o
>   LIB_OBJS += advice.o
> @@ -1004,8 +1007,6 @@ LIB_OBJS += convert.o
>   LIB_OBJS += copy.o
>   LIB_OBJS += credential.o
>   LIB_OBJS += csum-file.o
> -LIB_OBJS += ctype.o
> -LIB_OBJS += date.o
>   LIB_OBJS += decorate.o
>   LIB_OBJS += delta-islands.o
>   LIB_OBJS += diagnose.o
> @@ -1046,7 +1047,6 @@ LIB_OBJS += hash-lookup.o
>   LIB_OBJS += hashmap.o
>   LIB_OBJS += help.o
>   LIB_OBJS += hex.o
> -LIB_OBJS += hex-ll.o
>   LIB_OBJS += hook.o
>   LIB_OBJS += ident.o
>   LIB_OBJS += json-writer.o
> @@ -1097,7 +1097,6 @@ LIB_OBJS += pack-write.o
>   LIB_OBJS += packfile.o
>   LIB_OBJS += pager.o
>   LIB_OBJS += parallel-checkout.o
> -LIB_OBJS += parse.o
>   LIB_OBJS += parse-options-cb.o
>   LIB_OBJS += parse-options.o
>   LIB_OBJS += patch-delta.o
> @@ -1152,7 +1151,6 @@ LIB_OBJS += sparse-index.o
>   LIB_OBJS += split-index.o
>   LIB_OBJS += stable-qsort.o
>   LIB_OBJS += statinfo.o
> -LIB_OBJS += strbuf.o
>   LIB_OBJS += streaming.o
>   LIB_OBJS += string-list.o
>   LIB_OBJS += strmap.o
> @@ -1189,21 +1187,32 @@ LIB_OBJS += unpack-trees.o
>   LIB_OBJS += upload-pack.o
>   LIB_OBJS += url.o
>   LIB_OBJS += urlmatch.o
> -LIB_OBJS += usage.o
>   LIB_OBJS += userdiff.o
> -LIB_OBJS += utf8.o
>   LIB_OBJS += varint.o
>   LIB_OBJS += version.o
>   LIB_OBJS += versioncmp.o
>   LIB_OBJS += walker.o
>   LIB_OBJS += wildmatch.o
>   LIB_OBJS += worktree.o
> -LIB_OBJS += wrapper.o
>   LIB_OBJS += write-or-die.o
>   LIB_OBJS += ws.o
>   LIB_OBJS += wt-status.o
>   LIB_OBJS += xdiff-interface.o
>   
> +STD_LIB_OBJS += abspath.o
> +STD_LIB_OBJS += ctype.o
> +STD_LIB_OBJS += date.o
> +STD_LIB_OBJS += hex-ll.o
> +STD_LIB_OBJS += parse.o
> +STD_LIB_OBJS += strbuf.o
> +STD_LIB_OBJS += usage.o
> +STD_LIB_OBJS += utf8.o
> +STD_LIB_OBJS += wrapper.o
> +
> +STUB_LIB_OBJS += stubs/trace2.o
> +STUB_LIB_OBJS += stubs/pager.o
> +STUB_LIB_OBJS += stubs/misc.o
> +
>   BUILTIN_OBJS += builtin/add.o
>   BUILTIN_OBJS += builtin/am.o
>   BUILTIN_OBJS += builtin/annotate.o
> @@ -1352,7 +1361,7 @@ UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
>   UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
>   
>   # xdiff and reftable libs may in turn depend on what is in libgit.a
> -GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
> +GITLIBS = common-main.o $(STD_LIB_FILE) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
>   EXTLIBS =
>   
>   GIT_USER_AGENT = git/$(GIT_VERSION)
> @@ -2693,6 +2702,8 @@ OBJECTS += $(XDIFF_OBJS)
>   OBJECTS += $(FUZZ_OBJS)
>   OBJECTS += $(REFTABLE_OBJS) $(REFTABLE_TEST_OBJS)
>   OBJECTS += $(UNIT_TEST_OBJS)
> +OBJECTS += $(STD_LIB_OBJS)
> +OBJECTS += $(STUB_LIB_OBJS)
>   
>   ifndef NO_CURL
>   	OBJECTS += http.o http-walker.o remote-curl.o
> @@ -3686,7 +3697,7 @@ clean: profile-clean coverage-clean cocciclean
>   	$(RM) git.res
>   	$(RM) $(OBJECTS)
>   	$(RM) headless-git.o
> -	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
> +	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE) $(STUB_LIB_FILE)
>   	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
>   	$(RM) $(TEST_PROGRAMS)
>   	$(RM) $(FUZZ_PROGRAMS)
> @@ -3878,3 +3889,18 @@ $(UNIT_TEST_PROGS): $(UNIT_TEST_BIN)/%$X: $(UNIT_TEST_DIR)/%.o $(UNIT_TEST_DIR)/
>   build-unit-tests: $(UNIT_TEST_PROGS)
>   unit-tests: $(UNIT_TEST_PROGS)
>   	$(MAKE) -C t/ unit-tests
> +
> +### Libified Git rules
> +
> +# git-std-lib.a
> +# Programs other than git should compile this with
> +#     make NO_GETTEXT=YesPlease git-std-lib.a
> +# and link against git-stub-lib.a (if the default no-op functionality is fine)
> +# or a custom .a file with the same interface as git-stub-lib.a (if custom
> +# functionality is needed) as well.
> +$(STD_LIB_FILE): $(STD_LIB_OBJS) $(COMPAT_OBJS)
> +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
> +
> +# git-stub-lib.a
> +$(STUB_LIB_FILE): $(STUB_LIB_OBJS)
> +	$(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
> diff --git a/stubs/misc.c b/stubs/misc.c
> new file mode 100644
> index 0000000000..92da76fd46
> --- /dev/null
> +++ b/stubs/misc.c
> @@ -0,0 +1,33 @@
> +#include <assert.h>
> +#include <stdlib.h>
> +
> +#ifndef NO_GETTEXT
> +/*
> + * NEEDSWORK: This is enough to link our unit tests against
> + * git-std-lib.a built with gettext support. We don't really support
> + * programs other than git using git-std-lib.a with gettext support
> + * yet. To do that we need to start using dgettext() rather than
> + * gettext() in our code.
> + */
> +int git_gettext_enabled = 0;
> +#endif
> +
> +int common_exit(const char *file, int line, int code);
> +
> +int common_exit(const char *file, int line, int code)
> +{
> +	exit(code);
> +}
> +
> +#if !defined(__MINGW32__) && !defined(_MSC_VER)
> +int lstat_cache_aware_rmdir(const char *path);
> +
> +int lstat_cache_aware_rmdir(const char *path)
> +{
> +	/*
> +	 * This function should not be called by programs linked
> +	 * against git-stub-lib.a
> +	 */
> +	assert(0);
> +}
> +#endif
> diff --git a/stubs/pager.c b/stubs/pager.c
> new file mode 100644
> index 0000000000..4f575cada7
> --- /dev/null
> +++ b/stubs/pager.c
> @@ -0,0 +1,6 @@
> +#include "pager.h"
> +
> +int pager_in_use(void)
> +{
> +	return 0;
> +}
> diff --git a/stubs/trace2.c b/stubs/trace2.c
> new file mode 100644
> index 0000000000..7d89482228
> --- /dev/null
> +++ b/stubs/trace2.c
> @@ -0,0 +1,27 @@
> +#include "git-compat-util.h"
> +#include "trace2.h"
> +
> +struct child_process { int stub; };
> +struct repository { int stub; };
> +struct json_writer { int stub; };
> +
> +void trace2_region_enter_fl(const char *file, int line, const char *category,
> +			    const char *label, const struct repository *repo, ...) { }
> +void trace2_region_leave_fl(const char *file, int line, const char *category,
> +			    const char *label, const struct repository *repo, ...) { }
> +void trace2_data_string_fl(const char *file, int line, const char *category,
> +			   const struct repository *repo, const char *key,
> +			   const char *value) { }
> +void trace2_cmd_ancestry_fl(const char *file, int line, const char **parent_names) { }
> +void trace2_cmd_error_va_fl(const char *file, int line, const char *fmt,
> +			    va_list ap) { }
> +void trace2_cmd_name_fl(const char *file, int line, const char *name) { }
> +void trace2_thread_start_fl(const char *file, int line,
> +			    const char *thread_base_name) { }
> +void trace2_thread_exit_fl(const char *file, int line) { }
> +void trace2_data_intmax_fl(const char *file, int line, const char *category,
> +			   const struct repository *repo, const char *key,
> +			   intmax_t value) { }
> +int trace2_is_enabled(void) { return 0; }
> +void trace2_counter_add(enum trace2_counter_id cid, uint64_t value) { }
> +void trace2_collect_process_info(enum trace2_process_info_reason reason) { }
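
To make the "custom .a file with the same interface as git-stub-lib.a"
option from the Makefile comment more concrete, here is a minimal sketch
of what a replacement for a few of the trace2 stubs might look like. The
function signatures are copied from stubs/trace2.c above; the counters
(regions_entered, regions_left) and the stderr logging are hypothetical
and only there for illustration. struct repository is forward-declared
instead of including trace2.h so that the sketch compiles on its own,
which is an assumption a real replacement would not make.

```c
#include <stdio.h>

/* Opaque here: these functions never look inside the repository. */
struct repository;

/* Hypothetical counters, not part of Git. */
int regions_entered;
int regions_left;

/* Same signature as the stub, but records and logs the event. */
void trace2_region_enter_fl(const char *file, int line, const char *category,
			    const char *label, const struct repository *repo, ...)
{
	(void)repo;
	regions_entered++;
	fprintf(stderr, "enter %s/%s at %s:%d\n", category, label, file, line);
}

void trace2_region_leave_fl(const char *file, int line, const char *category,
			    const char *label, const struct repository *repo, ...)
{
	(void)repo;
	regions_left++;
	fprintf(stderr, "leave %s/%s at %s:%d\n", category, label, file, line);
}

/* Unlike the no-op stub, report that tracing is active. */
int trace2_is_enabled(void)
{
	return 1;
}
```

A complete replacement would still have to define every other symbol
that git-stub-lib.a provides, otherwise the final link would fail with
undefined references.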

* Re: [PATCH v5 2/3] git-std-lib: introduce Git Standard Library
  2024-02-29 11:16     ` Phillip Wood
@ 2024-02-29 17:23       ` Junio C Hamano
  2024-02-29 18:27         ` Linus Arver
  0 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2024-02-29 17:23 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Calvin Wan, git, Jonathan Tan

Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Calvin
> On 22/02/2024 17:50, Calvin Wan wrote:

Thanks for reviewing -- I held mine back as I expected this will see
another reroll soonish, but you already have raised many points I
also had trouble with, so I do not have to ;-) 

Below, I'll liberally omit everything you wrote that I agree with.

>> +libgit.a, which is then linked against by common-main.o with other
>> +external dependencies and turned into the Git executable.
>
> I found this description a bit confusing. As I understand it to build
> git we link git.o against common-main.o, libgit.a, xdiff/lib.a,
> reftable/libreftable.a and libpcre etc.

In addition, there is no single "the Git executable", simply because
not everything is a builtin command.  The purpose of using libgit.a is
because we are too lazy to list and maintain all the internal
dependencies to link final executables like 'git' (which has all the
built-in command implementations) and 'git-remote-curl' (which is a
standalone program).  Instead of feeding the exact list of object files
to "$(CC) -o git" command line, we throw everything into libgit.a
and let the linker pick what is needed.  To link "git", we may
include all builtin/*.o, git.o, common-main.o, libgit.a and the
external [*] library dependencies they have.  To link "git-daemon",
we may not link builtin/*.o and git.o and link daemon.o instead.

	Side note: here I am counting xdiff/lib.a as an external
	library as it is mostly borrowed code.

In other words, libgit.a is not a true library in the sense that it
was designed to be _used_ as a library.  It was merely a detail of
how we implemented lazy dependency management in our build process,
which happened with 0a02ce72 (Clean up the Makefile a bit.,
2005-04-18) whose commit log message uses air-quotes around the word
"library".

>> +From the above dependency graph, we can see that git-std-lib.a could be
>> +many smaller libraries rather than a singular library. So why choose a
>> +singular library when multiple libraries can be individually easier to
>> +swap and are more modular? A singular library requires less work to
>> +separate out circular dependencies within itself so it becomes a
>> +tradeoff question between work and reward. While there may be a point in
>> +the future where a file like usage.c would want its own library so that
>> +someone can have custom die() or error(), the work required to refactor
>> +out the circular dependencies in some files would be enormous due to
>> +their ubiquity so therefore I believe it is not worth the tradeoff
>
> I'm not sure if we want to use the first person in our technical
> documentation, unlike the cover letter to a patch series it is not
> immediately obvious to the reader who "I" is. This applies to the
> passages in the first person below as well.

I found it highly annoying while reading it, too.  If the document
(not the commit that introduced the document) were signed and
written to state the position of one author, as opposed to spell out
the position the project will collectively take, it would have been
OK, but this document is meant to set a course of the project (and
discussion on it is the process to decide which course to take), the
first person singular "I" did not sit well for me.

Another thing to consider that I do not think you covered is the
name of the resulting .a archive.  By starting it with "lib", a
customer can find your libstdgit.a with -lstdgit on the command
line, once libstdgit.a is installed at an appropriate location (or
the build-time library path is configured to point the location you
have libstdgit.a at).  libgit.a that wasn't really designed to be
used as such a library did not have to follow the naming convention,
but if the thing being proposed is meant to be eventually used as a
library by external entities, git-std-lib.a is a rather poor name
for it.


PS.

I seem to have been hit by a power outage and am on UPS, so I'll
probably be offline until the power comes back.




* Re: [PATCH v5 2/3] git-std-lib: introduce Git Standard Library
  2024-02-29 17:23       ` Junio C Hamano
@ 2024-02-29 18:27         ` Linus Arver
  2024-02-29 18:54           ` Junio C Hamano
  0 siblings, 1 reply; 111+ messages in thread
From: Linus Arver @ 2024-02-29 18:27 UTC (permalink / raw)
  To: Junio C Hamano, Phillip Wood; +Cc: Calvin Wan, git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> On 22/02/2024 17:50, Calvin Wan wrote:
>
>>> +libgit.a, which is then linked against by common-main.o with other
>>> +external dependencies and turned into the Git executable.
>>
>> I found this description a bit confusing. As I understand it to build
>> git we link git.o against common-main.o, libgit.a, xdiff/lib.a,
>> reftable/libreftable.a and libpcre etc.
>
> In addition, there is no single "the Git executable", simply because
> not everything is a builtin command.  The purpose of using libgit.a is
> because we are too lazy to list and maintain all the internal
> dependencies to link final executables like 'git' (which has all the
> built-in command implementations) and 'git-remote-curl' (which is a
> standalone program).  Instead of feeding the exact list of object files
> to "$(CC) -o git" command line, we throw everything into libgit.a
> and let the linker pick what is needed.  To link "git", we may
> include all builtin/*.o, git.o, common-main.o, libgit.a and the
> external [*] library dependencies they have.  To link "git-daemon",
> we may not link builtin/*.o and git.o and link daemon.o instead.
>
> 	Side note: here I am counting xdiff/lib.a as an external
> 	library as it is mostly borrowed code.
>
> In other words, libgit.a is not a true library in the sense that it
> was designed to be _used_ as a library.  It was merely a detail of
> how we implemented lazy dependency management in our build process,
> which happened with 0a02ce72 (Clean up the Makefile a bit.,
> 2005-04-18) whose commit log message uses air-quotes around the word
> "library".

Somehow I did not realize that this was going on. Thank you for pointing
this out!

It does make me wonder if we should stop being lazy and do the work that
the linker has been doing for us "for free" ourselves. IOW, stop linking
against a monolithic libgit.a. That way we would replace implicit
dependencies with explicit ones, which might help us understand which
things need what.

But maybe doing that is super painful, so, maybe it's not worth it. IDK.

* Re: [PATCH v5 2/3] git-std-lib: introduce Git Standard Library
  2024-02-29 18:27         ` Linus Arver
@ 2024-02-29 18:54           ` Junio C Hamano
  2024-02-29 20:03             ` Linus Arver
  0 siblings, 1 reply; 111+ messages in thread
From: Junio C Hamano @ 2024-02-29 18:54 UTC (permalink / raw)
  To: Linus Arver; +Cc: Phillip Wood, Calvin Wan, git, Jonathan Tan

Linus Arver <linusa@google.com> writes:

> It does make me wonder if we should stop being lazy and do the
> work that the linker has been doing for us "for free"
> ourselves. IOW, stop linking against a monolithic libgit.a.
> ... which might help us understand which things need what.

Sorry, but I fail to see a point in such an exercise.  If a tool is
available to help us and if there is no downside of using the tool,
we should keep using it.  If you are proposing to move away from the
current build practice because you have a concrete downside of the
approach and avoid that, then it might be a good proposal, though.

And "we do not learn otherwise" is not a downside of the approach;
"we do not learn" comes from your not learning, the tools do not
force you to be ignorant.  We do not propose to use more __asm__ in
our C sources only because compilers were doing that for us "for
free" and because the compilers were somehow robbing us of the
opportunity to learn micro-optimization techniques, do we?

A small downside I can immediately think of is possible in a
situation where we have been throwing an object file into libgit.a
archive that is no longer used by any final executable.  In such a
scenario, if you change the source file that is compiled into such
an unused object file, your next "make" will update libgit.a to
replace the unused object file with its new version with your
updates, and that would cause the final build product to be linked
again with objects needed from libgit.a, but there shouldn't be any
change because we are talking about an object that is *not* used by
them but still is in libgit.a due to being listed in the LIB_OBJS variable.

But that is a purely theoretical downside. It may be the case that
we haven't done our spring cleaning recently and we haven't noticed
that a source file or two are now unused but are still listed on
LIB_OBJS to be included in the libgit.a archive.  But even if that
were the case, it is implausible that you are touching such an
unused source file in the first place.

* Re: [PATCH v5 2/3] git-std-lib: introduce Git Standard Library
  2024-02-29 18:54           ` Junio C Hamano
@ 2024-02-29 20:03             ` Linus Arver
  0 siblings, 0 replies; 111+ messages in thread
From: Linus Arver @ 2024-02-29 20:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phillip Wood, Calvin Wan, git, Jonathan Tan

Junio C Hamano <gitster@pobox.com> writes:

> Linus Arver <linusa@google.com> writes:
>
>> It does make me wonder if we should stop being lazy and do the
>> work that the linker has been doing for us "for free"
>> ourselves. IOW, stop linking against a monolithic libgit.a.
>> ... which might help us understand which things need what.
>
> [...] If a tool is
> available to help us and if there is no downside of using the tool,
> we should keep using it.

Of course, if there is no downside, we should use the tool as is.

> If you are proposing to move away from the
> current build practice because you have a concrete downside of the
> approach and avoid that, then it might be a good proposal, though.

Right. I was just wondering if the "explicit dependencies declared in
the Makefile" would provide some value WRT libification. Currently IDK
the answer to that.

> And "we do not learn otherwise" is not a downside of the approach;
> "we do not learn" comes from your not learning, the tools do not
> force you to be ignorant.  We do not propose to use more __asm__ in
> our C sources only because compilers were doing that for us "for
> free" and because the compilers were somehow robbing us of the
> opportunity to learn micro-optimization techniques, do we?

True.

> A small downside I can immediately think of is possible in a
> situation where we have been throwing an object file into libgit.a
> archive that is no longer used by any final executable.  In such a
> scenario, if you change the source file that is compiled into such
> an unused object file, your next "make" will update libgit.a to
> replace the unused object file with its new version with your
> updates, and that would cause the final build product to be linked
> again with objects needed from libgit.a, but there shouldn't be any
> change because we are talking about an object that is *not* used by
> them but still is in libgit.a due to being listed in the LIB_OBJS variable.

IIUC, this (theoretical) downside will result in Make thinking that it
needs to rebuild libgit.a when it actually doesn't need to (because the
updated change is for an unused object). So it could slow down the build
unnecessarily. Makes sense.

> But that is a purely theoretical downside. It may be the case that
> we haven't done our spring cleaning recently and we haven't noticed
> that a source file or two are now unused but are still listed on
> LIB_OBJS to be included in the libgit.a archive.  But even if that
> were the case, it is implausible that you are touching such an
> unused source file in the first place.

As you noted, libgit.a is not a true library; it's just a big archive of
everything and we let the linker figure out what the executables need
out of it. But I was under the impression that with Git libification, we
would want to create a real library in the fullest sense of the word ---
such that our executables also need this library ("-lgit") to be built
in the exact same way that external programs would need this same
(single) Git library. For example, I believe this is how curl is built
(it first builds libcurl, and then links against it as an internal user
for generating the "curl" executable).

Going back to libgit.a, I was just wondering if the exercise of breaking
it up into smaller pieces would have been helpful in figuring out this
"-lgit" library (or what a smaller version of it would look like).

I sense that I may be missing large pieces of context around the
git-std-lib discussions though, so I apologize if my points above are
not new and moot. Thanks.

* Re: [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent
  2024-02-22 17:50   ` [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent Calvin Wan
  2024-02-22 22:24     ` Junio C Hamano
@ 2024-03-07 21:13     ` Junio C Hamano
  1 sibling, 0 replies; 111+ messages in thread
From: Junio C Hamano @ 2024-03-07 21:13 UTC (permalink / raw)
  To: Calvin Wan; +Cc: git, Jonathan Tan, phillip.wood123, Jeff King

Calvin Wan <calvinwan@google.com> writes:

> +	strbuf_commented_addf(sb, '#', "%s", "foo");

Of course, this will need to be adjusted when it meets the "let's
allow more than one byte for a comment character" series by Peff.
It should now read

	strbuf_commented_addf(sb, "#", "%s", "foo");

of course.
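
For anyone unfamiliar with why the argument type matters, here is a
minimal, self-contained sketch of a helper that takes the comment prefix
as a string rather than a single char, which is what allows prefixes
longer than one byte. add_commented_line() is a hypothetical function
written for this illustration; it is not Git's strbuf API, and the exact
"<prefix> <text>" output layout is an assumption.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: write "<prefix> <text>\n" into
 * out. Returns the number of bytes written (excluding the trailing
 * NUL), or -1 if out is too small to hold the whole line.
 */
int add_commented_line(char *out, size_t outlen,
		       const char *comment_prefix, const char *text)
{
	int n = snprintf(out, outlen, "%s %s\n", comment_prefix, text);

	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return n;
}
```

With a string parameter, a caller can pass "#" or a multi-byte prefix
such as "//" through the same interface, whereas a char parameter would
only ever fit the former.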

This is a usual "a function changes in one topic, while the other
topic adds more callers to it" problem a maintainer is expected to
handle fine, so there is nothing special that needs to be done by
contributors, but just giving you a heads-up when you yourself test
your updated version to ensure it works well with other topics in
flight.


end of thread, other threads:[~2024-03-07 21:13 UTC | newest]

Thread overview: 111+ messages (download: mbox.gz / follow: Atom feed)
2023-06-27 19:52 [RFC PATCH 0/8] Introduce Git Standard Library Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper Calvin Wan
2023-06-28  2:05   ` Victoria Dye
2023-07-05 17:57     ` Calvin Wan
2023-07-05 18:22       ` Victoria Dye
2023-07-11 20:07   ` Jeff Hostetler
2023-06-27 19:52 ` [RFC PATCH 2/8] hex-ll: split out functionality from hex Calvin Wan
2023-06-28 13:15   ` Phillip Wood
2023-06-28 16:55     ` Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 3/8] object: move function to object.c Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 4/8] config: correct bad boolean env value error message Calvin Wan
2023-06-27 19:52 ` [RFC PATCH 5/8] parse: create new library for parsing strings and env values Calvin Wan
2023-06-27 22:58   ` Junio C Hamano
2023-06-27 19:52 ` [RFC PATCH 6/8] pager: remove pager_in_use() Calvin Wan
2023-06-27 23:00   ` Junio C Hamano
2023-06-27 23:18     ` Calvin Wan
2023-06-28  0:30     ` Glen Choo
2023-06-28 16:37       ` Glen Choo
2023-06-28 16:44         ` Calvin Wan
2023-06-28 17:30           ` Junio C Hamano
2023-06-28 20:58       ` Junio C Hamano
2023-06-27 19:52 ` [RFC PATCH 7/8] git-std-lib: introduce git standard library Calvin Wan
2023-06-28 13:27   ` Phillip Wood
2023-06-28 21:15     ` Calvin Wan
2023-06-30 10:00       ` Phillip Wood
2023-06-27 19:52 ` [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
2023-06-28  0:14 ` [RFC PATCH 0/8] Introduce Git Standard Library Glen Choo
2023-06-28 16:30   ` Calvin Wan
2023-06-30  7:01 ` Linus Arver
2023-08-10 16:33 ` [RFC PATCH v2 0/7] " Calvin Wan
2023-08-10 16:36   ` [RFC PATCH v2 1/7] hex-ll: split out functionality from hex Calvin Wan
2023-08-10 16:36   ` [RFC PATCH v2 2/7] object: move function to object.c Calvin Wan
2023-08-10 20:32     ` Junio C Hamano
2023-08-10 22:36     ` Glen Choo
2023-08-10 22:43       ` Junio C Hamano
2023-08-10 16:36   ` [RFC PATCH v2 3/7] config: correct bad boolean env value error message Calvin Wan
2023-08-10 20:36     ` Junio C Hamano
2023-08-10 16:36   ` [RFC PATCH v2 4/7] parse: create new library for parsing strings and env values Calvin Wan
2023-08-10 23:21     ` Glen Choo
2023-08-10 23:43       ` Junio C Hamano
2023-08-14 22:15       ` Jonathan Tan
2023-08-14 22:09     ` Jonathan Tan
2023-08-14 22:19       ` Junio C Hamano
2023-08-10 16:36   ` [RFC PATCH v2 5/7] date: push pager.h dependency up Calvin Wan
2023-08-10 23:41     ` Glen Choo
2023-08-14 22:17     ` Jonathan Tan
2023-08-10 16:36   ` [RFC PATCH v2 6/7] git-std-lib: introduce git standard library Calvin Wan
2023-08-14 22:26     ` Jonathan Tan
2023-08-10 16:36   ` [RFC PATCH v2 7/7] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
2023-08-14 22:28     ` Jonathan Tan
2023-08-10 22:05   ` [RFC PATCH v2 0/7] Introduce Git Standard Library Glen Choo
2023-08-15  9:20     ` Phillip Wood
2023-08-16 17:17       ` Calvin Wan
2023-08-16 21:19         ` Junio C Hamano
2023-08-15  9:41   ` Phillip Wood
2023-09-08 17:41     ` [PATCH v3 0/6] " Calvin Wan
2023-09-08 17:44       ` [PATCH v3 1/6] hex-ll: split out functionality from hex Calvin Wan
2023-09-08 17:44       ` [PATCH v3 2/6] wrapper: remove dependency to Git-specific internal file Calvin Wan
2023-09-15 17:54         ` Jonathan Tan
2023-09-08 17:44       ` [PATCH v3 3/6] config: correct bad boolean env value error message Calvin Wan
2023-09-08 17:44       ` [PATCH v3 4/6] parse: create new library for parsing strings and env values Calvin Wan
2023-09-08 17:44       ` [PATCH v3 5/6] git-std-lib: introduce git standard library Calvin Wan
2023-09-11 13:22         ` Phillip Wood
2023-09-27 14:14           ` Phillip Wood
2023-09-15 18:39         ` Jonathan Tan
2023-09-26 14:23         ` phillip.wood123
2023-09-08 17:44       ` [PATCH v3 6/6] git-std-lib: add test file to call git-std-lib.a functions Calvin Wan
2023-09-09  5:26         ` Junio C Hamano
2023-09-15 18:43         ` Jonathan Tan
2023-09-15 20:22           ` Junio C Hamano
2023-09-08 20:36       ` [PATCH v3 0/6] Introduce Git Standard Library Junio C Hamano
2023-09-08 21:30         ` Junio C Hamano
2023-09-29 21:20 ` [PATCH v4 0/4] Preliminary patches before git-std-lib Jonathan Tan
2023-09-29 21:20   ` [PATCH v4 1/4] hex-ll: separate out non-hash-algo functions Jonathan Tan
2023-10-21  4:14     ` Linus Arver
2023-09-29 21:20   ` [PATCH v4 2/4] wrapper: reduce scope of remove_or_warn() Jonathan Tan
2023-10-10  9:59     ` phillip.wood123
2023-10-10 16:13       ` Junio C Hamano
2023-10-10 17:38         ` Jonathan Tan
2023-09-29 21:20   ` [PATCH v4 3/4] config: correct bad boolean env value error message Jonathan Tan
2023-09-29 23:03     ` Junio C Hamano
2023-09-29 21:20   ` [PATCH v4 4/4] parse: separate out parsing functions from config.h Jonathan Tan
2023-10-10 10:00     ` phillip.wood123
2023-10-10 17:43       ` Jonathan Tan
2023-10-10 17:58         ` Phillip Wood
2023-10-10 20:57           ` Junio C Hamano
2023-10-10 10:05   ` [PATCH v4 0/4] Preliminary patches before git-std-lib phillip.wood123
2023-10-10 16:21     ` Jonathan Tan
2024-02-22 17:50   ` [PATCH v5 0/3] Introduce Git Standard Library Calvin Wan
2024-02-22 17:50   ` [PATCH v5 1/3] pager: include stdint.h because uintmax_t is used Calvin Wan
2024-02-22 21:43     ` Junio C Hamano
2024-02-26 18:59       ` Kyle Lippincott
2024-02-27  0:20         ` Junio C Hamano
2024-02-27  0:56           ` Kyle Lippincott
2024-02-27  2:45             ` Junio C Hamano
2024-02-27 22:29               ` Kyle Lippincott
2024-02-27 23:25                 ` Junio C Hamano
2024-02-27  8:45             ` Jeff King
2024-02-27  9:05               ` Jeff King
2024-02-27 20:10               ` Kyle Lippincott
2024-02-24  1:33     ` Kyle Lippincott
2024-02-24  7:58       ` Junio C Hamano
2024-02-22 17:50   ` [PATCH v5 2/3] git-std-lib: introduce Git Standard Library Calvin Wan
2024-02-29 11:16     ` Phillip Wood
2024-02-29 17:23       ` Junio C Hamano
2024-02-29 18:27         ` Linus Arver
2024-02-29 18:54           ` Junio C Hamano
2024-02-29 20:03             ` Linus Arver
2024-02-22 17:50   ` [PATCH v5 3/3] test-stdlib: show that git-std-lib is independent Calvin Wan
2024-02-22 22:24     ` Junio C Hamano
2024-03-07 21:13     ` Junio C Hamano
