git.vger.kernel.org archive mirror
* [DISCUSS] Introducing Rust into the Git project
@ 2024-01-10 20:16 Taylor Blau
  2024-01-10 21:57 ` Dragan Simic
                   ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Taylor Blau @ 2024-01-10 20:16 UTC (permalink / raw)
  To: git

Over the holiday break at the end of last year I spent some time
thinking on what it would take to introduce Rust into the Git project.

There is significant work underway to introduce Rust into the Linux
kernel (see [1], [2]). Among their stated goals, I think there are a few
which could be potentially relevant to the Git project:

  - Lower risk of memory safety bugs, data races, memory leaks, etc.
    thanks to the language's safety guarantees.

  - Easier to gain confidence when refactoring or introducing new code
    in Rust (assuming little to no use of the language's `unsafe`
    feature).

  - Contributing to Git becomes easier and accessible to a broader group
    of programmers by relying on a more modern language.

Given the allure of these benefits, I think it's at least worth
considering and discussing how Rust might make its way into Junio's
tree.

I imagine that the transition state would involve some parts of the
project being built in C and calling into Rust code via FFI (and perhaps
vice-versa, with Rust code calling back into the existing C codebase).
Luckily for us, Rust's FFI provides a zero-cost abstraction [3], meaning
there is no performance impact when calling code from one language in
the other.
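
To make the C-calling-Rust direction concrete, here is a minimal sketch.
The function name `git_rs_count_newlines` and its C prototype are
hypothetical and not anything in the tree today. The point is only that
an `extern "C"` function is an ordinary symbol using the C ABI, so the
call needs no marshalling layer:

```rust
use std::os::raw::c_char;
use std::slice;

/// Hypothetical helper exported to C. The matching C prototype would be:
///     size_t git_rs_count_newlines(const char *buf, size_t len);
///
/// # Safety
/// `buf` must point to `len` readable bytes, or be NULL with `len == 0`.
// Note: on the 2024 edition this attribute is spelled #[unsafe(no_mangle)].
#[no_mangle]
pub unsafe extern "C" fn git_rs_count_newlines(buf: *const c_char, len: usize) -> usize {
    if buf.is_null() || len == 0 {
        return 0;
    }
    // Borrow the caller's memory in place; nothing is copied.
    let bytes = unsafe { slice::from_raw_parts(buf as *const u8, len) };
    bytes.iter().filter(|&&b| b == b'\n').count()
}
```

The `unsafe` here is confined to the boundary itself; code behind it
can stay in safe Rust.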

Some open questions from me, at least to get the discussion going are:

  1. Platform support. The Rust compiler (rustc) does not enjoy the same
     widespread availability that C compilers do. For instance, I
     suspect that NonStop, AIX, Solaris, among others may not be
     supported.

     One possible alternative is to have those platforms use a Rust
     front-end for a compiler that they do support. The gccrs [4]
     project would allow us to compile Rust anywhere where GCC is
     available. The rustc_codegen_gcc [5] project uses GCC's libgccjit
     API to target GCC from rustc itself.

  2. Migration. What parts of Git are easiest to convert to Rust? My
     hunch is that the answer is any stand-alone libraries, like
     strbuf.h. I'm not sure how we should identify these, though, and in
     what order we would want to move them over.

  3. Interaction with the lib-ification effort. There is lots of work
     going on in an effort to lib-ify much of the Git codebase done by
     Google. I'm not sure how this would interact with that effort, but
     we should make sure that one isn't a blocker for the other.

I'm curious to hear what others think about this. I think that this
would be an exciting and worthwhile direction for the project. Let's
see!

Thanks,
Taylor

[1]: https://rust-for-linux.com/
[2]: https://lore.kernel.org/rust-for-linux/20210414184604.23473-1-ojeda@kernel.org/
[3]: https://blog.rust-lang.org/2015/04/24/Rust-Once-Run-Everywhere.html#c-talking-to-rust
[4]: https://github.com/Rust-GCC/gccrs
[5]: https://github.com/rust-lang/rustc_codegen_gcc


* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 20:16 [DISCUSS] Introducing Rust into the Git project Taylor Blau
@ 2024-01-10 21:57 ` Dragan Simic
  2024-01-10 22:11   ` Junio C Hamano
  2024-01-11  0:33   ` Elijah Newren
  2024-01-11  0:12 ` Elijah Newren
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 38+ messages in thread
From: Dragan Simic @ 2024-01-10 21:57 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On 2024-01-10 21:16, Taylor Blau wrote:
> Over the holiday break at the end of last year I spent some time
> thinking on what it would take to introduce Rust into the Git project.
> 
> There is significant work underway to introduce Rust into the Linux
> kernel (see [1], [2]). Among their stated goals, I think there are a few
> which could be potentially relevant to the Git project:
> 
>   - Lower risk of memory safety bugs, data races, memory leaks, etc.
>     thanks to the language's safety guarantees.
> 
>   - Easier to gain confidence when refactoring or introducing new code
>     in Rust (assuming little to no use of the language's `unsafe`
>     feature).
> 
>   - Contributing to Git becomes easier and accessible to a broader group
>     of programmers by relying on a more modern language.
> 
> Given the allure of these benefits, I think it's at least worth
> considering and discussing how Rust might make its way into Junio's
> tree.

Quite frankly, that would only complicate things and cause 
fragmentation.  The goal of introducing Rust into the Linux kernel is 
to, possibly, have some new "leafs" written in Rust, such as some new 
device drivers.  No existing kernel code, AFAIK, has been planned to be 
rewritten in Rust.

Thus, Git should probably follow the same approach of not converting the 
already existing code, but frankly, I don't see what would actually be 
the "new leafs" written in Rust.

> I imagine that the transition state would involve some parts of the
> project being built in C and calling into Rust code via FFI (and perhaps
> vice-versa, with Rust code calling back into the existing C codebase).
> Luckily for us, Rust's FFI provides a zero-cost abstraction [3], meaning
> there is no performance impact when calling code from one language in
> the other.
> 
> Some open questions from me, at least to get the discussion going are:
> 
>   1. Platform support. The Rust compiler (rustc) does not enjoy the same
>      widespread availability that C compilers do. For instance, I
>      suspect that NonStop, AIX, Solaris, among others may not be
>      supported.
> 
>      One possible alternative is to have those platforms use a Rust
>      front-end for a compiler that they do support. The gccrs [4]
>      project would allow us to compile Rust anywhere where GCC is
>      available. The rustc_codegen_gcc [5] project uses GCC's libgccjit
>      API to target GCC from rustc itself.
> 
>   2. Migration. What parts of Git are easiest to convert to Rust? My
>      hunch is that the answer is any stand-alone libraries, like
>      strbuf.h. I'm not sure how we should identify these, though, and in
>      what order we would want to move them over.
> 
>   3. Interaction with the lib-ification effort. There is lots of work
>      going on in an effort to lib-ify much of the Git codebase done by
>      Google. I'm not sure how this would interact with that effort, but
>      we should make sure that one isn't a blocker for the other.
> 
> I'm curious to hear what others think about this. I think that this
> would be an exciting and worthwhile direction for the project. Let's
> see!
> 
> Thanks,
> Taylor
> 
> [1]: https://rust-for-linux.com/
> [2]: https://lore.kernel.org/rust-for-linux/20210414184604.23473-1-ojeda@kernel.org/
> [3]: https://blog.rust-lang.org/2015/04/24/Rust-Once-Run-Everywhere.html#c-talking-to-rust
> [4]: https://github.com/Rust-GCC/gccrs
> [5]: https://github.com/rust-lang/rustc_codegen_gcc


* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 21:57 ` Dragan Simic
@ 2024-01-10 22:11   ` Junio C Hamano
  2024-01-10 22:15     ` rsbecker
  2024-01-10 23:40     ` [DISCUSS] Introducing Rust into the Git project brian m. carlson
  2024-01-11  0:33   ` Elijah Newren
  1 sibling, 2 replies; 38+ messages in thread
From: Junio C Hamano @ 2024-01-10 22:11 UTC (permalink / raw)
  To: Dragan Simic; +Cc: Taylor Blau, git

Dragan Simic <dsimic@manjaro.org> writes:

> Thus, Git should probably follow the same approach of not converting
> the already existing code, but frankly, I don't see what would
> actually be the "new leafs" written in Rust.

A few obvious ones that come to my mind are that you should be able
to write a new merge strategy and link the resulting binary into Git
without much hassle.  You might even want to make that a dynamically
loaded object.  The interface into a merge strategy is fairly narrow
IIRC.  Or possibly a new remote helper.
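
As a rough illustration of how small the remote-helper surface can be,
here is a sketch of the command loop, written from the protocol as
described in gitremote-helpers(7): Git writes one command per line to
the helper's stdin, and the reply to `capabilities` is one capability
per line followed by a blank line. The function name `serve` is
hypothetical, and advertising only `fetch` is an assumption made to
keep the example short:

```rust
use std::io::{self, BufRead, Write};

/// Answer remote-helper commands read from `input`, writing replies to
/// `output`. Only `capabilities` is handled; a real helper would also
/// need `list` plus `fetch`/`push` or `import`/`export`.
pub fn serve<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        match line.as_str() {
            "capabilities" => {
                // One capability per line, terminated by a blank line.
                writeln!(output, "fetch")?;
                writeln!(output)?;
                output.flush()?;
            }
            // In this sketch a blank line ends the session.
            "" => break,
            other => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidInput,
                    format!("unsupported command: {other}"),
                ));
            }
        }
    }
    Ok(())
}
```

A `main` for such a helper would just call `serve` with locked stdin
and stdout; taking generic readers and writers keeps the loop testable
without spawning a process.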

Adding a new refs backend may need to wait for the work Patrick is
doing to add reftable support, but once the abstraction gets to the
point to sufficiently hide the differences between files and reftables
backends, I do not see a reason why you cannot add the third one.

And more into the future, we might want to have an object DB
abstraction, similar to how we abstracted refs API over time, at
which time you might be writing code that stores objects to and
retrieves objects from persistent redis and whatnot in your favorite
language.


* RE: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 22:11   ` Junio C Hamano
@ 2024-01-10 22:15     ` rsbecker
  2024-01-10 22:26       ` Taylor Blau
  2024-01-10 23:40     ` [DISCUSS] Introducing Rust into the Git project brian m. carlson
  1 sibling, 1 reply; 38+ messages in thread
From: rsbecker @ 2024-01-10 22:15 UTC (permalink / raw)
  To: 'Junio C Hamano', 'Dragan Simic'
  Cc: 'Taylor Blau', git

On Wednesday, January 10, 2024 5:12 PM, Junio C Hamano wrote:
>Dragan Simic <dsimic@manjaro.org> writes:
>
>> Thus, Git should probably follow the same approach of not converting
>> the already existing code, but frankly, I don't see what would
>> actually be the "new leafs" written in Rust.
>
>A few obvious ones that come to my mind are that you should be able to write a
>new merge strategy and link the resulting binary into Git without much hassle.  You
>might even want to make that a dynamically loaded object.  The interface into a
>merge strategy is fairly narrow IIRC.  Or possibly a new remote helper.
>
>Adding a new refs backend may need to wait for the work Patrick is doing to add
>reftable support, but once the abstraction gets to the point to sufficiently hide the
>differences between files and reftables backends, I do not see a reason why you
>cannot add the third one.
>
>And more into the future, we might want to have an object DB abstraction, similar
>to how we abstracted refs API over time, at which time you might be writing code
>that stores objects to and retrieves objects from persistent redis and whatnot in
>your favorite language.

Just a brief concern: Rust is not broadly portable. Adding another
dependency to git will remove many existing platforms from future releases.
Please consider this carefully before going down this path.
--Randall



* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 22:15     ` rsbecker
@ 2024-01-10 22:26       ` Taylor Blau
  2024-01-10 23:52         ` rsbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Taylor Blau @ 2024-01-10 22:26 UTC (permalink / raw)
  To: rsbecker; +Cc: 'Junio C Hamano', 'Dragan Simic', git

Hi Randall,

On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
> Just a brief concern: Rust is not broadly portable. Adding another
> dependency to git will remove many existing platforms from future releases.
> Please consider this carefully before going down this path.

I was hoping to hear from you as one of the few (only?) folks who
participate on the list and represent HPE NonStop users.

I'm curious which if any of the compiler frontends that I listed in my
earlier email would work for you.

Thanks,
Taylor


* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 22:11   ` Junio C Hamano
  2024-01-10 22:15     ` rsbecker
@ 2024-01-10 23:40     ` brian m. carlson
  1 sibling, 0 replies; 38+ messages in thread
From: brian m. carlson @ 2024-01-10 23:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dragan Simic, Taylor Blau, git

On 2024-01-10 at 22:11:34, Junio C Hamano wrote:
> A few obvious ones that come to my mind are that you should be able
> to write a new merge strategy and link the resulting binary into Git
> without much hassle.  You might even want to make that a dynamically
> loaded object.  The interface into a merge strategy is fairly narrow
> IIRC.  Or possibly a new remote helper.
> 
> Adding a new refs backend may need to wait for the work Patrick is
> doing to add reftable support, but once the abstraction gets to the
> point to sufficiently hide the differences between files and reftables
> backends, I do not see a reason why you cannot add the third one.
> 
> And more into the future, we might want to have an object DB
> abstraction, similar to how we abstracted refs API over time, at
> which time you might be writing code that stores objects to and
> retrieves objects from persistent redis and whatnot in your favorite
> language.

This is definitely a thing people will want to do.  I think Microsoft
had some code for Azure DevOps that stored their code in the cloud and
the refs database in a real database.  I can imagine that being a
valuable set of features people would want to implement in a variety of
environments, with all of the benefits of basing on upstream Git.

I also feel that I would absolutely not want to write those things in C.
Rust is much more ergonomic when writing these things because freeing
resources (freeing memory, rolling back transactions, closing files,
etc.) becomes as easy as implementing the Drop trait and you write less
boilerplate.
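
A minimal sketch of what I mean, using a hypothetical `Transaction`
guard (none of these names exist in Git; the "transaction" here only
appends to an in-memory log so the example stays self-contained).
Every early return rolls back without a `goto cleanup`-style path:

```rust
use std::cell::RefCell;

/// Hypothetical transaction guard: rolls back on drop unless committed.
struct Transaction<'a> {
    log: &'a RefCell<Vec<String>>,
    committed: bool,
}

impl<'a> Transaction<'a> {
    fn begin(log: &'a RefCell<Vec<String>>) -> Self {
        log.borrow_mut().push("begin".to_string());
        Transaction { log, committed: false }
    }

    fn commit(mut self) {
        self.committed = true;
        self.log.borrow_mut().push("commit".to_string());
        // `self` is dropped here; Drop sees `committed` and does nothing.
    }
}

impl<'a> Drop for Transaction<'a> {
    fn drop(&mut self) {
        if !self.committed {
            self.log.borrow_mut().push("rollback".to_string());
        }
    }
}

/// Returns early on failure; the rollback happens when `tx` goes out of scope.
fn update(log: &RefCell<Vec<String>>, fail: bool) -> Result<(), String> {
    let tx = Transaction::begin(log);
    if fail {
        return Err("update failed".to_string());
    }
    tx.commit();
    Ok(())
}
```
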
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA



* RE: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 22:26       ` Taylor Blau
@ 2024-01-10 23:52         ` rsbecker
  2024-01-11  0:59           ` Elijah Newren
                             ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: rsbecker @ 2024-01-10 23:52 UTC (permalink / raw)
  To: 'Taylor Blau'
  Cc: 'Junio C Hamano', 'Dragan Simic', git

On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
>On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
>> Just a brief concern: Rust is not broadly portable. Adding another
>> dependency to git will remove many existing platforms from future releases.
>> Please consider this carefully before going down this path.
>
>I was hoping to hear from you as one of the few (only?) folks who participate on
>the list and represent HPE NonStop users.
>
>I'm curious which if any of the compiler frontends that I listed in my earlier email
>would work for you.

Unfortunately, none of the compiler frontends listed previously can be
built for NonStop. They all appear to require gcc, either directly or
transitively, and gcc cannot be ported to NonStop. I do not expect this
to change any time soon, and it is outside of my control anyway. An
attempt was made to port Rust, but it did not succeed, primarily because
of that dependency. Similarly, Golang is not portable to NonStop because
of architecture assumptions made by the Go team that cannot be satisfied
on NonStop at this time. If some of the memory/pointer issues are the
primary concern, C11 might be something acceptable with smart pointers.
C17 will eventually be deployable, but it is not available on most
currently supported OS versions on the platform.



* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 20:16 [DISCUSS] Introducing Rust into the Git project Taylor Blau
  2024-01-10 21:57 ` Dragan Simic
@ 2024-01-11  0:12 ` Elijah Newren
  2024-01-11  5:33   ` Dragan Simic
  2024-01-11  1:56 ` brian m. carlson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: Elijah Newren @ 2024-01-11  0:12 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Wed, Jan 10, 2024 at 12:18 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> Over the holiday break at the end of last year I spent some time
> thinking on what it would take to introduce Rust into the Git project.

I'm very happy to see this email.

> There is significant work underway to introduce Rust into the Linux
> kernel (see [1], [2]). Among their stated goals, I think there are a few
> which could be potentially relevant to the Git project:
>
>   - Lower risk of memory safety bugs, data races, memory leaks, etc.
>     thanks to the language's safety guarantees.
>
>   - Easier to gain confidence when refactoring or introducing new code
>     in Rust (assuming little to no use of the language's `unsafe`
>     feature).
>
>   - Contributing to Git becomes easier and accessible to a broader group
>     of programmers by relying on a more modern language.
>
> Given the allure of these benefits, I think it's at least worth
> considering and discussing how Rust might make its way into Junio's
> tree.

I think there are other benefits as well; I'll list them at the end of
the email to avoid side-tracking too much[6].

> I imagine that the transition state would involve some parts of the
> project being built in C and calling into Rust code via FFI (and perhaps
> vice-versa, with Rust code calling back into the existing C codebase).
> Luckily for us, Rust's FFI provides a zero-cost abstraction [3], meaning
> there is no performance impact when calling code from one language in
> the other.

I agree with the zero-cost abstraction, but there is a funny caveat
with measuring it if anyone is curious[7].

> Some open questions from me, at least to get the discussion going are:
>
>   1. Platform support. The Rust compiler (rustc) does not enjoy the same
>      widespread availability that C compilers do. For instance, I
>      suspect that NonStop, AIX, Solaris, among others may not be
>      supported.
>
>      One possible alternative is to have those platforms use a Rust
>      front-end for a compiler that they do support. The gccrs [4]
>      project would allow us to compile Rust anywhere where GCC is
>      available. The rustc_codegen_gcc [5] project uses GCC's libgccjit
>      API to target GCC from rustc itself.

Another alternative (as discussed at Git Merge when we were last
talking about Rust[8]), is requiring all Rust code to be optional for
now.  If we choose to go that route, I think that means that (a) for
existing components, we have both a Rust and a C implementation
available, and (b) for new components (e.g. new top-level commands
like git-replay), they can be Rust-only and those compiling without
Rust just don't get them.

>   2. Migration. What parts of Git are easiest to convert to Rust? My
>      hunch is that the answer is any stand-alone libraries, like
>      strbuf.h. I'm not sure how we should identify these, though, and in
>      what order we would want to move them over.

If we're happy to allow Rust, I'd like to rewrite git-replay in Rust
as a testcase.  It's almost certainly not "easiest", but I think it's
an interesting testcase because it's a new top-level command that
hasn't appeared in any release yet.  Further, it is currently only
designed for server-side usecases, so would likely not be affected by
more limited platform support.  (I haven't started on this; my
previous experiments were with diffcore-delta.)

> I'm curious to hear what others think about this. I think that this
> would be an exciting and worthwhile direction for the project. Let's
> see!

:-)

>
> Thanks,
> Taylor
>
> [1]: https://rust-for-linux.com/
> [2]: https://lore.kernel.org/rust-for-linux/20210414184604.23473-1-ojeda@kernel.org/
> [3]: https://blog.rust-lang.org/2015/04/24/Rust-Once-Run-Everywhere.html#c-talking-to-rust
> [4]: https://github.com/Rust-GCC/gccrs
> [5]: https://github.com/rust-lang/rustc_codegen_gcc

[6] Here are some additional benefits I see:

 - Parallel performance.  We avoid making things parallel in Git because
   debugging/maintaining/reviewing parallel code in C often isn't worth
   the squeeze.  Rust was designed to greatly reduce this effort (the
   whole "fearless concurrency" thing).

 - Single-threaded Performance.  Multiple factors:

   - We had (and might still have) O(N^2) stuff in a lot of places in
     our codebase, because we tend to over-use arrays.  (e.g. with
     string_list, or with insertions and deletions into the index
     during a merge, etc.)

   - Relatedly, using hashes in C is quite onerous, to the point that
     we often simply avoid it.  I know I have, and I also know that
     even after I introduced strmap and tried to use it outside of
     merge-ort, that I got pushback because "string hash-maps are not
     really typical for a C program. I'm sure they are the best choice
     for an advanced merge algorithm but they are not really necessary
     [here; let's use sorted arrays instead]..."  I then had to go
     through multiple rounds of responses and ended up reimplementing
     everything as suggested (before finally convincing others to just
     use the strmap implementation after all).

   - We use QSORT() which basically calls libc's qsort().  Due to the
     design of this function (where the comparator is a separate
     function call), it is slow.  When languages avoid making the
     comparator a separate function call, they can speed sorts up by a
     factor of 2 (or even by 3 when an unstable sort is good enough
     and the platform's qsort() is stable).

   - Difficulty of incorporating other libraries.  For example, our
     hashmap.[ch] make use of FNV, but picking something else is a big
     amount of effort.  Now, while FNV is faster than Rust's default
     of SipHash, cargo makes it easy to pull in alternatives like
     FnvHashMap or FxHashMap, which we can then use where it matters.
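
To illustrate the sorting and hashing points, here is a small sketch
using only the standard library (the function names are made up for the
example). The comparator is a closure, so the compiler is free to
inline it into the sort rather than calling through a function pointer
as qsort() does, and a hash set takes a couple of lines to use:

```rust
use std::collections::HashSet;

/// Sort paths by length, then bytewise. `sort_unstable_by` is generic
/// over the closure type, so the comparison can be inlined.
fn sort_paths(paths: &mut [String]) {
    paths.sort_unstable_by(|a, b| a.len().cmp(&b.len()).then_with(|| a.cmp(b)));
}

/// Drop duplicates while keeping first-seen order, with one hash lookup
/// per item instead of a linear scan over everything seen so far.
fn dedup_keep_order(items: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        .filter(|item| seen.insert(item.clone()))
        .collect()
}
```

Stability is also explicit in the name: `sort_by` is stable and
`sort_unstable_by` is not, on every platform.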

I'm also tempted to include bullet points for having a unit testing
framework built in, and for potentially fewer platform-dependent issues
(e.g. forgetting to use STABLE_QSORT when required can go unnoticed
because qsort is stable in some libc implementations, whereas Rust
defines its sorts more carefully so that they behave consistently
across platforms), but I'm not sure these additional advantages are big
enough to merit a full bullet point.

[7] If you ignore Rust for a moment, and simply divide your files into
different libraries (e.g. introducing a new.c file, moving some
functions to it, and then compiling new.c into a new library,
libnew.a, and linking both libgit.a and libnew.a into git), you can
sometimes measure some small performance differences.  At least, I
did.  What this scenario has to do with Rust is that if we start
moving some code to Rust, that will naturally likely result in a
different division of files into libraries.  Thus, for me to verify
that Rust did provide zero-cost abstractions with my experiments, in
order to compare the performance of my Rust changes, I had to compare
to a version of git where I split some functions out into a separate
library.  When I did that, the performance overhead was actually 0.
Otherwise, there was a tiny performance degradation in the particular
splitting I employed.  However, while splitting did give me a small
performance drop, it was completely outweighed by the performance
advantages I got elsewhere in the things I converted to Rust.

[8] https://lore.kernel.org/git/ZRrfN2lbg14IOLiK@nand.local/


* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 21:57 ` Dragan Simic
  2024-01-10 22:11   ` Junio C Hamano
@ 2024-01-11  0:33   ` Elijah Newren
  2024-01-11  5:39     ` Dragan Simic
  1 sibling, 1 reply; 38+ messages in thread
From: Elijah Newren @ 2024-01-11  0:33 UTC (permalink / raw)
  To: Dragan Simic; +Cc: Taylor Blau, git

On Wed, Jan 10, 2024 at 1:57 PM Dragan Simic <dsimic@manjaro.org> wrote:
>
> Thus, Git should probably follow the same approach of not converting the
> already existing code

I disagree with this.  I saw significant performance improvements
through converting some existing Git code to Rust.  Granted, it was
only a small amount of code, but the performance benefits I saw
suggested we'd see more by also doing similar conversions elsewhere.
(Note that I kept the old C code and then conditionally compiled
either Rust or C versions of what I was converting.)

Further, I found a really old bug from this effort as well[1], and I
find it extremely unlikely that I would have found that bug otherwise.
So, converting to Rust can even improve our existing C code.

>, but frankly, I don't see what would actually be
> the "new leafs" written in Rust.

In addition to some of the examples Junio mentioned elsewhere, I think
new toplevel commands, like git-replay, would qualify.


[1] Yeah, I really need to dig the patch out and send it in.  I'll do
so shortly.


* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 23:52         ` rsbecker
@ 2024-01-11  0:59           ` Elijah Newren
  2024-01-11  1:44             ` rsbecker
  2024-01-11  2:55           ` brian m. carlson
  2024-01-22 23:17           ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Emily Shaffer
  2 siblings, 1 reply; 38+ messages in thread
From: Elijah Newren @ 2024-01-11  0:59 UTC (permalink / raw)
  To: rsbecker; +Cc: Taylor Blau, Junio C Hamano, Dragan Simic, git

On Wed, Jan 10, 2024 at 3:52 PM <rsbecker@nexbridge.com> wrote:
>
> On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
> >On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
> >> Just a brief concern: Rust is not broadly portable. Adding another
> >> dependency to git will remove many existing platforms from future releases.
> >> Please consider this carefully before going down this path.
> >
> >I was hoping to hear from you as one of the few (only?) folks who participate on
> >the list and represent HPE NonStop users.
> >
> >I'm curious which if any of the compiler frontends that I listed in my earlier email
> >would work for you.
>
> Unfortunately, none of the compiler frontends listed previously can be built for NonStop. These appear to all require gcc either directly or transitively, which cannot be ported to NonStop. I do not expect this to change any time soon - and is outside of my control anyway. An attempt was made to port Rust but it did not succeed primarily because of that dependency. Similarly, Golang is also not portable to NonStop because of architecture assumptions made by the Go team that cannot be satisfied on NonStop at this time. If some of the memory/pointer issues are the primary concern, c11 might be something acceptable with smart pointers. C17 will eventually be deployable, but is not available on most currently supported OS versions on the platform.

Would you be okay with the following alternative: requiring that all
Rust code be optional for now?

(In other words, allow you to build with USE_RUST=0, or something like
that.  And then we have both a Rust and a C implementation of anything
that is required for backward compatibility, while any new Rust-only
stuff would not be included in your build.)


* RE: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  0:59           ` Elijah Newren
@ 2024-01-11  1:44             ` rsbecker
  2024-01-11  2:21               ` Elijah Newren
  0 siblings, 1 reply; 38+ messages in thread
From: rsbecker @ 2024-01-11  1:44 UTC (permalink / raw)
  To: 'Elijah Newren'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic',
	git

On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
>On Wed, Jan 10, 2024 at 3:52 PM <rsbecker@nexbridge.com> wrote:
>>
>> On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
>> >On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
>> >> Just a brief concern: Rust is not broadly portable. Adding another
>> >> dependency to git will remove many existing platforms from future releases.
>> >> Please consider this carefully before going down this path.
>> >
>> >I was hoping to hear from you as one of the few (only?) folks who
>> >participate on the list and represent HPE NonStop users.
>> >
>> >I'm curious which if any of the compiler frontends that I listed in
>> >my earlier email would work for you.
>>
>> Unfortunately, none of the compiler frontends listed previously can be built for
>NonStop. These appear to all require gcc either directly or transitively, which cannot
>be ported to NonStop. I do not expect this to change any time soon - and is outside
>of my control anyway. An attempt was made to port Rust but it did not succeed
>primarily because of that dependency. Similarly, Golang is also not portable to
>NonStop because of architecture assumptions made by the Go team that cannot be
>satisfied on NonStop at this time. If some of the memory/pointer issues are the
>primary concern, c11 might be something acceptable with smart pointers. C17 will
>eventually be deployable, but is not available on most currently supported OS
>versions on the platform.
>
>Would you be okay with the following alternative: requiring that all Rust code be
>optional for now?
>
>(In other words, allow you to build with USE_RUST=0, or something like that.  And
>then we have both a Rust and a C implementation of anything that is required for
>backward compatibility, while any new Rust-only stuff would not be included in
>your build.)

To address the point immediately above: I assume this means that
platform maintainers would be responsible for developing non-portable
implementations that duplicate the Rust functionality, which arguably
may not be possible. We do have $DAYJOBS, and the expectation that
duplicate implementations are cost-effective, or even viable, is a huge
assumption that may not hold.

One of the key benefits of git is the ability to deploy it virtually
anywhere, on virtually any platform, and to mirror repositories anywhere
for resiliency purposes. It currently runs on (almost) every current
platform because it does not depend on Linux-only compilers and tools.
Except for LFS, which is written in Golang and which I do not have
access to, anyone with a C compiler can deploy git processes in their
environment. Adding Rust (or any other gcc-only dependency) eliminates
that primary benefit.

I am honestly very disappointed with this direction and think it
detracts significantly from the primary value proposition that git
offers: specifically, that we can take any developer from any platform
and move them anywhere else without having to design new processes or
teach them new processes for their workflows (this comes up at every
major customer with whom I interact). I think this direction is a
fundamental mistake and will rapidly limit (or eliminate) git's
long-term viability. To be honest, if I saw this direction when deciding
which VCS to deploy, I would reconsider git and start looking around for
another, more portable option. It hurts to even contemplate this
direction. Please do not do this.



* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 20:16 [DISCUSS] Introducing Rust into the Git project Taylor Blau
  2024-01-10 21:57 ` Dragan Simic
  2024-01-11  0:12 ` Elijah Newren
@ 2024-01-11  1:56 ` brian m. carlson
  2024-01-11 11:45 ` Sam James
  2024-01-11 23:53 ` Trevor Gross
  4 siblings, 0 replies; 38+ messages in thread
From: brian m. carlson @ 2024-01-11  1:56 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On 2024-01-10 at 20:16:53, Taylor Blau wrote:
> Over the holiday break at the end of last year I spent some time
> thinking on what it would take to introduce Rust into the Git project.
> 
> There is significant work underway to introduce Rust into the Linux
> kernel (see [1], [2]). Among their stated goals, I think there are a few
> which could be potentially relevant to the Git project:
> 
>   - Lower risk of memory safety bugs, data races, memory leaks, etc.
>     thanks to the language's safety guarantees.
> 
>   - Easier to gain confidence when refactoring or introducing new code
>     in Rust (assuming little to no use of the language's `unsafe`
>     feature).

I agree with both of these points.  We've found that making our code
thread safe in Git is hard and it's much easier in Rust, because, for
the most part, the code doesn't compile if it would have a data race.
Unit tests are also easy and built-in, and I think that's a major
advantage.
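
To make the data-race point concrete, here is a minimal sketch (the function name and the counting task are invented for illustration and have nothing to do with Git's actual code). Sharing the counter through `Arc<Mutex<_>>` compiles; handing each thread a plain mutable reference to the same integer would be rejected at compile time. The `#[test]` function at the end is the built-in unit testing mentioned above.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment one shared counter from several worker threads.
/// Hypothetical example: the name and workload are illustrative only.
fn count_in_parallel(workers: u64, per_worker: u64) -> u64 {
    // Arc gives shared ownership across threads, Mutex gives exclusive access.
    let total = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    // Wait for every worker before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

// Unit tests live next to the code and run with `cargo test`.
#[test]
fn counts_every_increment() {
    assert_eq!(count_in_parallel(4, 1_000), 4_000);
}
```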

We also get nice things for free, like sets, maps, lists, and a variety
of other collections that are all type-safe.  Error handling is also a
huge benefit: we'll get typed errors with the ability to pass data back.
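
As a rough sketch of what typed errors carrying data could look like, assuming a made-up `RefError` type and `resolve_ref` function (neither exists in Git), with a standard `HashMap` standing in for a ref store:

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type: each variant carries the data a caller needs.
#[derive(Debug, PartialEq)]
enum RefError {
    NotFound { name: String },
    InvalidName { name: String, reason: &'static str },
}

impl fmt::Display for RefError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RefError::NotFound { name } => write!(f, "ref '{}' not found", name),
            RefError::InvalidName { name, reason } => {
                write!(f, "invalid ref name '{}': {}", name, reason)
            }
        }
    }
}

impl std::error::Error for RefError {}

/// Look a ref up in a type-safe map; the validation rule is a simplification.
fn resolve_ref(refs: &HashMap<String, String>, name: &str) -> Result<String, RefError> {
    if name.is_empty() || name.contains("..") {
        return Err(RefError::InvalidName {
            name: name.to_string(),
            reason: "empty or contains '..'",
        });
    }
    refs.get(name)
        .cloned()
        .ok_or_else(|| RefError::NotFound { name: name.to_string() })
}
```

A caller can match on the variant and use the attached data instead of parsing an error string or checking a global.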

>   - Contributing to Git becomes easier and accessible to a broader group
>     of programmers by relying on a more modern language.

I think this can't be overstated.  One of the biggest hurdles for
people contributing is that our code requires expert knowledge of C.  We
do all sorts of weird things with pointer arithmetic that even I have
trouble understanding, and I'd really appreciate not having to worry
about memory leaks or freeing resources.[0]  Rust has nice things like the
Drop trait that make resource management easy.
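
A small sketch of how `Drop` handles cleanup, using a hypothetical `LockFile` type that only records what it does in a log (it touches no real files and is not Git's lockfile API). The release runs on every exit path, including the early error return:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a resource such as a lock file.
struct LockFile {
    path: String,
    log: Rc<RefCell<Vec<String>>>,
}

impl LockFile {
    fn acquire(path: &str, log: Rc<RefCell<Vec<String>>>) -> LockFile {
        log.borrow_mut().push(format!("acquire {}", path));
        LockFile { path: path.to_string(), log }
    }
}

impl Drop for LockFile {
    // Called automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.path));
    }
}

/// Illustrative operation: no explicit cleanup code on either return path.
fn update_index(log: Rc<RefCell<Vec<String>>>, fail: bool) -> Result<(), String> {
    let _lock = LockFile::acquire("index.lock", log);
    if fail {
        return Err("could not write index".to_string()); // _lock is released here
    }
    Ok(()) // and here
}
```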

Rust is also a language that people _want_ to use.  I really like it and
would probably contribute more if Git were in it.  I don't really want
to write more C, and outside of Git, won't use it on more than a de
minimis basis unless paid.

I can confirm that, having partially ported our service that serves Git
traffic to Rust from C (without the public having noticed), it's a much
nicer environment to work in.  I'm also much more efficient at making
changes.

> Given the allure of these benefits, I think it's at least worth
> considering and discussing how Rust might make its way into Junio's
> tree.

A couple of things which I think are worth discussing are as follows:

The Rust project emits a new release every six weeks and doesn't provide
LTS versions.  What versions of Rust are supported by crates vary
widely, and we'll absolutely need to choose our dependencies wisely.  We
may also want to ask crate authors if they'll be willing to commit to
our version policy before using them; oftentimes, that can work.

The approach that I aim for is supporting the version of Rust in the
latest Debian stable, plus the version in Debian's previous stable
release until the latest stable has been out for a year.  (Thus, if
Debian 12 was released on 2023-06-10, then I'd support Rust 1.48, Debian
11's version, until 2024-06-10, and then support would move to 1.63,
Debian 12's version.)  This provides about three years of support for a
compiler version, which I think is fair.
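
The rule above can be written down as a tiny function. This is only a sketch of the arithmetic: the dates and versions are the ones from the example, the function name is made up, and treating the anniversary day itself as the switch-over day is an assumption.

```rust
/// (year, month, day); tuples compare lexicographically, which orders dates.
type Date = (u16, u8, u8);
/// (major, minor)
type RustVersion = (u32, u32);

/// Hypothetical helper: minimum supported Rust version on a given day.
fn minimum_supported_rust(today: Date) -> RustVersion {
    const DEBIAN_12_RELEASE: Date = (2023, 6, 10);
    const DEBIAN_11_RUST: RustVersion = (1, 48);
    const DEBIAN_12_RUST: RustVersion = (1, 63);

    // The previous stable's compiler stays supported for one year
    // after the new stable release.
    let cutoff = (
        DEBIAN_12_RELEASE.0 + 1,
        DEBIAN_12_RELEASE.1,
        DEBIAN_12_RELEASE.2,
    );
    if today < cutoff {
        DEBIAN_11_RUST
    } else {
        DEBIAN_12_RUST
    }
}
```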

Note that none of this means that we're dropping support for older
systems; newer versions of Rust will be available for most targets,
even often after OSes go end of life.

We'll also probably need to continue to rely on some C libraries.  For
example, reqwest, the main Rust HTTP client, doesn't support any
authentication other than Basic, and I assure you from my experience as
the Git LFS maintainer, we don't want to implement things like NTLM and
Kerberos on our own.  libcurl is almost certainly going to continue to
be a dependency, as will PCRE.  The Rust regex crate doesn't support
backreferences, and we've basically tied lots of our regexes to POSIX,
so we'll need to either rely on PCRE or some call out to a
POSIX-compatible interface.  gettext is likely to be another issue,
although its thread-safety is potentially a problem; we could try using
the `tr` crate instead, which also provides a Rust-specific string
ripper.

> I imagine that the transition state would involve some parts of the
> project being built in C and calling into Rust code via FFI (and perhaps
> vice-versa, with Rust code calling back into the existing C codebase).
> Luckily for us, Rust's FFI provides a zero-cost abstraction [3], meaning
> there is no performance impact when calling code from one language in
> the other.

Moreover, there are even ways to generate Rust bindings for C code and C
headers for Rust code automatically.  (These are bindgen and cbindgen,
respectively.)  I've used both, and while it's clearly an FFI case, it's
still very ergonomic.
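
For a sense of what the Rust side of such a boundary looks like, here is a minimal sketch of a function exported with the C ABI. The name `rs_count_slashes` is hypothetical; a tool like cbindgen could emit a matching C prototype, roughly `size_t rs_count_slashes(const char *path);`.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Count '/' bytes in a NUL-terminated C string.
/// Hypothetical example function, callable from C once linked in.
#[no_mangle]
pub extern "C" fn rs_count_slashes(path: *const c_char) -> usize {
    if path.is_null() {
        return 0;
    }
    // SAFETY: the caller must pass a valid, NUL-terminated string that
    // stays alive for the duration of this call.
    let bytes = unsafe { CStr::from_ptr(path) }.to_bytes();
    bytes.iter().filter(|&&b| b == b'/').count()
}
```

The `unsafe` block is confined to the one place where a raw pointer from C is trusted; everything after it is ordinary safe Rust.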

> Some open questions from me, at least to get the discussion going are:
> 
>   1. Platform support. The Rust compiler (rustc) does not enjoy the same
>      widespread availability that C compilers do. For instance, I
>      suspect that NonStop, AIX, Solaris, among others may not be
>      supported.
> 
>      One possible alternative is to have those platforms use a Rust
>      front-end for a compiler that they do support. The gccrs [4]
>      project would allow us to compile Rust anywhere where GCC is
>      available. The rustc_codegen_gcc [5] project uses GCC's libgccjit
>      API to target GCC from rustc itself.

I think this is probably the biggest stumbling point.  I know GCC is
highly portable and works on AIX, as well as virtually every
architecture.  gccrs is still incomplete, but I believe
rustc_codegen_gcc is mature, and should be a viable option for most
platforms.  (Solaris is already supported on Rust[1].)

My main concerns are with NonStop, since the Rust standard library
requires threading and a CSPRNG (although that can definitely be RDRAND,
and is for some targets).  I seem to recall that neither GCC nor LLVM
are present there, although I see no reason why GCC could not be ported
(LLVM lacks support for ia64, I believe, which would make it a bigger
lift).

I suspect that if we go forward, though, a lot of the work for
architecture support in Rust upstream will already have been done, since
I'm pretty sure the Debian porters for architectures like alpha, hppa,
and ia64 are going to want to continue to use Git.  NetBSD porters may
also have useful patches in pkgsrc.

I am also very sympathetic to the difficulties of running on less common
systems, having had a PowerPC Mac running Linux as my first laptop and
several UltraSPARC machines.  I have sent in numerous patches to a wide
variety of code so that it works gracefully on lots of architectures,
and I've also dealt with lots of broken software.  I do, however, think
it's up to the porters of an OS to keep it running and healthy, and that
means making sure it has suitable compiler toolchains for building,
including for modern, extremely popular languages like Rust and Go.  I'm
okay with dropping support for systems where nobody upstream wants to or
is capable of maintaining that tooling.

I feel that once Rust is running on a system, it's actually
easier to write portable code, since you don't have alignment issues and
endianness must be handled explicitly, and most safe Rust code just
works out of the box.
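
As an illustration of explicit endianness handling, this sketch reads a big-endian 32-bit field from a byte buffer. The helper name is made up; the usage example below assumes a buffer shaped like the start of an index file (a "DIRC" signature followed by a 4-byte version in network byte order).

```rust
use std::convert::TryInto;

/// Read a big-endian u32 at `offset`, or None if the buffer is too short.
/// Works on any byte offset, so alignment never comes into it.
fn read_be_u32(buf: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes: [u8; 4] = buf.get(offset..end)?.try_into().ok()?;
    // The byte order is spelled out in the call, not implied by the host CPU.
    Some(u32::from_be_bytes(bytes))
}
```

For example, `read_be_u32(b"DIRC\x00\x00\x00\x02", 4)` yields `Some(2)` on both little-endian and big-endian hosts.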

>   2. Migration. What parts of Git are easiest to convert to Rust? My
>      hunch is that the answer is any stand-alone libraries, like
>      strbuf.h. I'm not sure how we should identify these, though, and in
>      what order we would want to move them over.

strbuf.h is tricky because it uses variadic arguments, which are not
stable in Rust.  My approach would be to start by getting the main
function up and running, and then we can incrementally port things over.
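
For what a Rust-native replacement for a printf-style append might look like, here is a sketch built on `format_args!` rather than C variadics. The `StrBuf` type and its `addf` method are hypothetical and only loosely modelled on the idea of strbuf; this does not address calling such a function from C.

```rust
use std::fmt;
use std::io::Write;

/// Hypothetical growable byte buffer.
#[derive(Default)]
struct StrBuf {
    buf: Vec<u8>,
}

impl StrBuf {
    /// Append formatted text; callers pass `format_args!(...)`,
    /// which is checked against its arguments at compile time.
    fn addf(&mut self, args: fmt::Arguments<'_>) {
        self.buf
            .write_fmt(args)
            .expect("writing to a Vec<u8> cannot fail");
    }
}
```

Usage would be along the lines of `sb.addf(format_args!("{} {}", "commit", 42))`.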

We could, for example, use the `sha256` crate for our SHA-256 code
(which would also dynamically use accelerated hardware implementations
where available).  There are other things which are libraries which
could well work, though.  Porting over our hashmap implementation might
be a thing to do, for example.  The repository structure might also be
a good idea, since that will allow us to write safe wrappers for its
contents.

>   3. Interaction with the lib-ification effort. There is lots of work
>      going on in an effort to lib-ify much of the Git codebase done by
>      Google. I'm not sure how this would interact with that effort, but
>      we should make sure that one isn't a blocker for the other.

I think it's going to work together nicely.  We can and should consider
building a C library from Rust to expose a lot of what we write.

Also, in my view, the biggest enemy to libification in our codebase is
our copious and improvident use of globals.  Mutating static variables
in Rust is unsafe, so as part of the port, we'll need to get rid of
them, which seems like a nice common goal.
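
A sketch of the usual replacement for a mutable global: put the state in a struct and pass it explicitly. The `Config` type and `abbreviate` function are invented for illustration.

```rust
/// Hypothetical settings that C code might keep in file-scope globals.
/// Writing to a `static mut` equivalent would need an `unsafe` block;
/// passing a reference needs none.
#[derive(Debug, Default)]
struct Config {
    abbrev_len: usize,
}

/// Shorten a hex object name according to the configuration passed in.
fn abbreviate<'a>(cfg: &Config, hex: &'a str) -> &'a str {
    let len = cfg.abbrev_len.min(hex.len());
    // `get` returns None instead of panicking on a bad boundary.
    hex.get(..len).unwrap_or(hex)
}
```

Because the function depends only on its arguments, two callers with different settings can coexist in one process, which is what a library needs.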

> I'm curious to hear what others think about this. I think that this
> would be an exciting and worthwhile direction for the project. Let's
> see!

I'm very much in favour of this.  I think I brought it up at the
contributor's summit and it caught some attention, but I don't think it
should be too controversial and it will offer us a lot of advantages.

[0] And before people say, "Well, you just need to spend more time with
C," I've been writing it since I was 10 and I think we can all agree
that with the SHA-256 work I've spent plenty of time with it.
[1] rustc --print target-list is a great way to see what's supported.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  1:44             ` rsbecker
@ 2024-01-11  2:21               ` Elijah Newren
  2024-01-11  2:57                 ` rsbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Elijah Newren @ 2024-01-11  2:21 UTC (permalink / raw)
  To: rsbecker; +Cc: Taylor Blau, Junio C Hamano, Dragan Simic, git

On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
>
> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
[...]
> >Would you be okay with the following alternative: requiring that all Rust code be
> >optional for now?
> >
> >(In other words, allow you to build with USE_RUST=0, or something like that.  And
> >then we have both a Rust and a C implementation of anything that is required for
> >backward compatibility, while any new Rust-only stuff would not be included in
> >your build.)
>
> To address the immediate above, I assume this means that platform maintainers will be responsible for developing non-portable implementations that duplicate Rust functionality

This doesn't at all sound like what I thought I said.  The whole
proposal was so that folks like NonStop could continue using Git with
no more work than setting USE_RUST=0 at build time.

Why do you feel you'd need to duplicate any functionality?

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 23:52         ` rsbecker
  2024-01-11  0:59           ` Elijah Newren
@ 2024-01-11  2:55           ` brian m. carlson
  2024-01-11  3:24             ` rsbecker
  2024-01-22 23:17           ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Emily Shaffer
  2 siblings, 1 reply; 38+ messages in thread
From: brian m. carlson @ 2024-01-11  2:55 UTC (permalink / raw)
  To: rsbecker
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic',
	git

On 2024-01-10 at 23:52:21, rsbecker@nexbridge.com wrote:
> Unfortunately, none of the compiler frontends listed previously can be
> built for NonStop. These appear to all require gcc either directly or
> transitively, which cannot be ported to NonStop. I do not expect this
> to change any time soon - and is outside of my control anyway. An
> attempt was made to port Rust but it did not succeed primarily because
> of that dependency.

Can you tell us what the technical limitations are that prevent GCC from
being ported so we can understand better?  I know LLVM doesn't support
ia64, which you do support, but GCC is very likely the most portable
compiler on the planet and supports architectures and OSes I've never
otherwise heard of.

I strongly suspect that if GCC did end up on NonStop, Rust would be able
to be ported, too, and you'd also get access to gccgo, which would make
Git LFS possible on NonStop as well[0].

I'm not capable of porting GCC, but I have done some portability work in
the Rust ecosystem, and I'd be willing to provide context and some
assistance (within my time and capabilities) to help get Rust working on
NonStop if you want.

[0] For the record, as a maintainer of Git LFS, I'm happy to accept
portability patches for virtually any OS.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

* RE: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  2:21               ` Elijah Newren
@ 2024-01-11  2:57                 ` rsbecker
  2024-01-11  5:06                   ` Elijah Newren
  0 siblings, 1 reply; 38+ messages in thread
From: rsbecker @ 2024-01-11  2:57 UTC (permalink / raw)
  To: 'Elijah Newren'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic',
	git

On Wednesday, January 10, 2024 9:21 PM, Elijah Newren wrote:
>On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
>>
>> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
>[...]
>> >Would you be okay with the following alternative: requiring that all
>> >Rust code be optional for now?
>> >
>> >(In other words, allow you to build with USE_RUST=0, or something
>> >like that.  And then we have both a Rust and a C implementation of
>> >anything that is required for backward compatibility, while any new
>> >Rust-only stuff would not be included in your build.)
>>
>> To address the immediate above, I assume this means that platform
>> maintainers will be responsible for developing non-portable
>> implementations that duplicate Rust functionality
>
>This doesn't at all sound like what I thought I said.  The whole proposal was so that
>folks like NonStop could continue using Git with no more work than setting
>USE_RUST=0 at build time.
>
>Why do you feel you'd need to duplicate any functionality?

I think I misunderstood. What I took from this is that all new functionality would be in Rust, which would require a custom implementation in C for platforms that did not have Rust available - if that is even practical. Did I get that wrong?


* RE: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  2:55           ` brian m. carlson
@ 2024-01-11  3:24             ` rsbecker
  2024-01-11 20:07               ` Trevor Gross
  0 siblings, 1 reply; 38+ messages in thread
From: rsbecker @ 2024-01-11  3:24 UTC (permalink / raw)
  To: 'brian m. carlson'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic',
	git

On Wednesday, January 10, 2024 9:56 PM, brian m. carlson wrote:
>On 2024-01-10 at 23:52:21, rsbecker@nexbridge.com wrote:
>> Unfortunately, none of the compiler frontends listed previously can be
>> built for NonStop. These appear to all require gcc either directly or
>> transitively, which cannot be ported to NonStop. I do not expect this
>> to change any time soon - and is outside of my control anyway. An
>> attempt was made to port Rust but it did not succeed primarily because
>> of that dependency.
>
>Can you tell us what the technical limitations are that prevent GCC from being
>ported so we can understand better?  I know LLVM doesn't support ia64, which you
>do support, but GCC is very likely the most portable compiler on the planet and
>supports architectures and OSes I've never otherwise heard of.
>
>I strongly suspect that if GCC did end up on NonStop, Rust would be able to be
>ported, too, and you'd also get access to gccgo, which would make Git LFS possible
>on NonStop as well[0].
>
>I'm not capable of porting GCC, but I have done some portability work in the Rust
>ecosystem, and I'd be willing to provide context and some assistance (within my
>time and capabilities) to help get Rust working on NonStop if you want.
>
>[0] For the record, as a maintainer of Git LFS, I'm happy to accept portability
>patches for virtually any OS.

There are a number of issues for porting gcc (and Go). The list is fairly long, but the summary of what I encountered directly (on the last funded effort of 3) is:
1. There are C syntax constructs required to do anything useful (required for access to the OS API) on NonStop that are not in gcc. I can hand code the parser for that, but it would take time.
2. The Big Endian x86 architecture is weird to gcc and making that work is not easy.
3. There is no assembler on NonStop.
4. The ELF header is very different from standard.
5. The symbol table structure is radically different, so debugging would be (nearly) impossible or impractical. gdb was ported to account for the platform differences.
6. The linkage structure is similar but different from standard.
7. The external fixup structure is radically different.
8. The loader does not work the same way, so there are required sections of the ELF files on NonStop that are not generated by gcc.

There are more, but I just did not get to the point of hitting them. Part of my own issue is that I have expertise in parsing and semantic passes of compilers, but my code generation skills are not where I want them to be for taking on this effort. Our last funded attempt had a code generation expert and he gave up in frustration.

If I was hired on to do this, it might have a chance, but at an estimate (not mine) of 4-5 person years for a gcc port, best case, my $DAYJOB will not permit it.

If gcc could be ported to NonStop, it would solve so many problems. I have heard of numerous failed efforts beyond what was officially funded by various companies, so this is considered a high-risk project.


* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  2:57                 ` rsbecker
@ 2024-01-11  5:06                   ` Elijah Newren
  2024-01-11  6:56                     ` Patrick Steinhardt
  2024-01-11 13:07                     ` rsbecker
  0 siblings, 2 replies; 38+ messages in thread
From: Elijah Newren @ 2024-01-11  5:06 UTC (permalink / raw)
  To: rsbecker; +Cc: Taylor Blau, Junio C Hamano, Dragan Simic, git

On Wed, Jan 10, 2024 at 6:57 PM <rsbecker@nexbridge.com> wrote:
>
> On Wednesday, January 10, 2024 9:21 PM, Elijah Newren wrote:
> >On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
> >>
> >> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
> >[...]
> >> >Would you be okay with the following alternative: requiring that all
> >> >Rust code be optional for now?
> >> >
> >> >(In other words, allow you to build with USE_RUST=0, or something
> >> >like that.  And then we have both a Rust and a C implementation of
> >> >anything that is required for backward compatibility, while any new
> >> >Rust-only stuff would not be included in your build.)
> >>
> >> To address the immediate above, I assume this means that platform
> >> maintainers will be responsible for developing non-portable
> >> implementations that duplicate Rust functionality
> >
> >This doesn't at all sound like what I thought I said.  The whole proposal was so that
> >folks like NonStop could continue using Git with no more work than setting
> >USE_RUST=0 at build time.
> >
> >Why do you feel you'd need to duplicate any functionality?
>
> I think I misunderstood. What I took from this is that all new functionality would be in Rust, which would require a custom implementation in C for platforms that did not have Rust available - if that is even practical. Did I get that wrong?

I think you somehow missed the word optional?

I did say that new functionality should be allowed to be Rust only
(unlike existing functionality), but I'm not sure how you leaped to
assuming that all new functionality would be in Rust.  Further, I also
don't understand why you jump to assuming that all new functionality
needs to be supported on all platforms.  The point of the word
"optional" in my proposal is that it is not required.  So, say, if
git-replay is in Rust, well you've never had git-replay before in any
release, so you haven't lost any functionality by it being implemented
in Rust.  And existing things (merge, cherry-pick, rebase, etc.)
continue working with C-only code.  But you may have one less optional
addition.

At least that was _my_ proposal -- that Rust be optional for now.  It
does differ from what I think Taylor was originally proposing, but
that's why I brought it up as an alternative proposal.
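
To sketch the "optional" idea in code: a build without Rust support keeps the existing commands and simply lacks the Rust-only additions. The feature name `rust` and the function below are assumptions for illustration, not a proposed build interface (the real switch might be a Makefile knob such as USE_RUST).

```rust
/// Hypothetical command list. Existing commands are always present;
/// a Rust-only command appears only when the "rust" feature is enabled.
fn available_commands() -> Vec<&'static str> {
    let mut commands = vec!["merge", "cherry-pick", "rebase"];
    if cfg!(feature = "rust") {
        commands.push("replay");
    }
    commands
}
```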

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  0:12 ` Elijah Newren
@ 2024-01-11  5:33   ` Dragan Simic
  0 siblings, 0 replies; 38+ messages in thread
From: Dragan Simic @ 2024-01-11  5:33 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Taylor Blau, git

On 2024-01-11 01:12, Elijah Newren wrote:
> Another alternative (as discussed at Git Merge when we were last
> talking about Rust[8]), is requiring all Rust code to be optional for
> now.  If we choose to go that route, I think that means that (a) for
> existing components, we have both a Rust and a C implementation
> available, and (b) for new components (e.g. new top-level commands
> like git-replay), they can be Rust-only and those compiling without
> Rust just don't get them.

To me, this sounds like a horrible option, which is exactly what
I earlier referred to as introducing fragmentation.

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  0:33   ` Elijah Newren
@ 2024-01-11  5:39     ` Dragan Simic
  2024-01-11 16:57       ` Elijah Newren
  0 siblings, 1 reply; 38+ messages in thread
From: Dragan Simic @ 2024-01-11  5:39 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Taylor Blau, git

On 2024-01-11 01:33, Elijah Newren wrote:
> On Wed, Jan 10, 2024 at 1:57 PM Dragan Simic <dsimic@manjaro.org> 
> wrote:
>> 
>> Thus, Git should probably follow the same approach of not converting 
>> the
>> already existing code
> 
> I disagree with this.  I saw significant performance improvements
> through converting some existing Git code to Rust.  Granted, it was
> only a small amount of code, but the performance benefits I saw
> suggested we'd see more by also doing similar conversions elsewhere.
> (Note that I kept the old C code and then conditionally compiled
> either Rust or C versions of what I was converting.)

Well, it's also possible that improving the old C code could also result 
in some performance improvements.  Thus, quite frankly, I don't see that 
as a valid argument to rewrite some existing C code in Rust.

> Further, I found a really old bug from this effort as well[1], and I
> find it extremely unlikely that I would have found that bug otherwise.
> So, converting to Rust can even improve our existing C code.
> 
>> , but frankly, I don't see what would actually be
>> the "new leafs" written in Rust.
> 
> In addition to some of the examples Junio mentioned elsewhere, I think
> new toplevel commands, like git-replay, would qualify.
> 
> 
> [1] Yeah, I really need to dig the patch out and send it in.  I'll do
> so shortly.

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  5:06                   ` Elijah Newren
@ 2024-01-11  6:56                     ` Patrick Steinhardt
  2024-01-11 13:07                     ` rsbecker
  1 sibling, 0 replies; 38+ messages in thread
From: Patrick Steinhardt @ 2024-01-11  6:56 UTC (permalink / raw)
  To: Elijah Newren; +Cc: rsbecker, Taylor Blau, Junio C Hamano, Dragan Simic, git

On Wed, Jan 10, 2024 at 09:06:23PM -0800, Elijah Newren wrote:
> On Wed, Jan 10, 2024 at 6:57 PM <rsbecker@nexbridge.com> wrote:
> >
> > On Wednesday, January 10, 2024 9:21 PM, Elijah Newren wrote:
> > >On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
> > >>
> > >> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
> > >[...]
> > >> >Would you be okay with the following alternative: requiring that all
> > >> >Rust code be optional for now?
> > >> >
> > >> >(In other words, allow you to build with USE_RUST=0, or something
> > >> >like that.  And then we have both a Rust and a C implementation of
> > >> >anything that is required for backward compatibility, while any new
> > >> >Rust-only stuff would not be included in your build.)
> > >>
> > >> To address the immediate above, I assume this means that platform
> > >> maintainers will be responsible for developing non-portable
> > >> implementations that duplicate Rust functionality
> > >
> > >This doesn't at all sound like what I thought I said.  The whole proposal was so that
> > >folks like NonStop could continue using Git with no more work than setting
> > >USE_RUST=0 at build time.
> > >
> > >Why do you feel you'd need to duplicate any functionality?
> >
> > I think I misunderstood. What I took from this is that all new functionality would be in Rust, which would require a custom implementation in C for platforms that did not have Rust available - if that is even practical. Did I get that wrong?
> 
> I think you somehow missed the word optional?
> 
> I did say that new functionality should be allowed to be Rust only
> (unlike existing functionality), but I'm not sure how you leaped to
> assuming that all new functionality would be in Rust.  Further, I also
> don't understand why you jump to assuming that all new functionality
> needs to be supported on all platforms.  The point of the word
> "optional" in my proposal is that it is not required.  So, say, if
> git-replay is in Rust, well you've never had git-replay before in any
> release, so you haven't lost any functionality by it being implemented
> in Rust.  And existing things (merge, cherry-pick, rebase, etc.)
> continue working with C-only code.  But you may have one less optional
> addition.
> 
> At least that was _my_ proposal -- that Rust be optional for now.  It
> does differ from what I think Taylor was originally proposing, but
> that's why I brought it up as an alternative proposal.

There are two ways to do this that I can see:

  - New features may not be available on some platforms. I think this is
    what Elijah had in mind.

  - New features may require two implementations, one in C and one in
    Rust. I think this is what Randall understood.

Ultimately, I think both alternatives would end up demoting platforms
that do not support Rust to become second-class citizens eventually.
This demotion is rather obvious in the case where new features may not
be available. But I also think that the second approach, where we
provide two implementations, would lead to a demotion of the Rust-less
platform because the alternate implementation in C would likely end up
receiving less attention than the Rust-based one. It's thus likely that
the implementation receiving less attention will deteriorate in code
quality.

I also think that once we start to accept Rust code, it will only be a
matter of time before we want to start using it in central code paths.
Rust does provide interfaces which are a lot nicer to use than the C
based ones, but it's hard to really reap the benefits unless we start to
embrace Rust fully. Also, the most complex interfaces tend to be those
which are deep inside our code base, like for example the object
database. They are thus also the most profitable targets for a Rust
conversion, even though likely also the hardest to realize.

To me this feels like a slippery slope, and the deeper we go the more
incentive we will have to drop platforms which do not support Rust
altogether. So I can certainly see where Randall is coming from and why
this proposal is not something that he is thrilled about.

Patrick

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 20:16 [DISCUSS] Introducing Rust into the Git project Taylor Blau
                   ` (2 preceding siblings ...)
  2024-01-11  1:56 ` brian m. carlson
@ 2024-01-11 11:45 ` Sam James
  2024-01-11 23:48   ` brian m. carlson
  2024-01-11 23:53 ` Trevor Gross
  4 siblings, 1 reply; 38+ messages in thread
From: Sam James @ 2024-01-11 11:45 UTC (permalink / raw)
  To: me; +Cc: git

Something I'm a bit concerned about is that right now, neither
rustc_codegen_gcc nor gccrs are ready for use here.

We've had trouble getting things wired up for rustc_codegen_gcc
- which is not to speak against their wonderful efforts - because
the Rust community hasn't yet figured out how to handle things which
pure rustc doesn't support yet. See
e.g. https://github.com/rust-lang/libc/pull/3032.

I think care should be taken in citing rustc_codegen_gcc and gccrs
as options for alternative platforms for now. They will hopefully
be great options in the future, but they aren't today, and they probably
won't be in the next 6 months at the least.

We also do use git heavily on platforms where rustc isn't supported
yet.

thanks,
sam

* RE: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  5:06                   ` Elijah Newren
  2024-01-11  6:56                     ` Patrick Steinhardt
@ 2024-01-11 13:07                     ` rsbecker
  1 sibling, 0 replies; 38+ messages in thread
From: rsbecker @ 2024-01-11 13:07 UTC (permalink / raw)
  To: 'Elijah Newren'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic',
	git

On Thursday, January 11, 2024 12:06 AM, Elijah Newren wrote:
>On Wed, Jan 10, 2024 at 6:57 PM <rsbecker@nexbridge.com> wrote:
>>
>> On Wednesday, January 10, 2024 9:21 PM, Elijah Newren wrote:
>> >On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
>> >>
>> >> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
>> >[...]
>> >> >Would you be okay with the following alternative: requiring that
>> >> >all Rust code be optional for now?
>> >> >
>> >> >(In other words, allow you to build with USE_RUST=0, or something
>> >> >like that.  And then we have both a Rust and a C implementation of
>> >> >anything that is required for backward compatibility, while any
>> >> >new Rust-only stuff would not be included in your build.)
>> >>
>> >> To address the immediate above, I assume this means that platform
>> >> maintainers will be responsible for developing non-portable
>> >> implementations that duplicate Rust functionality
>> >
>> >This doesn't at all sound like what I thought I said.  The whole
>> >proposal was so that folks like NonStop could continue using Git with
>> >no more work than setting
>> >USE_RUST=0 at build time.
>> >
>> >Why do you feel you'd need to duplicate any functionality?
>>
>> I think I misunderstood. What I took from this is that all new functionality would
>be in Rust, which would require a custom implementation in C for platforms that did
>not have Rust available - if that is even practical. Did I get that wrong?
>
>I think you somehow missed the word optional?
>
>I did say that new functionality should be allowed to be Rust only (unlike existing
>functionality), but I'm not sure how you leaped to assuming that all new
>functionality would be in Rust.  Further, I also don't understand why you jump to
>assuming that all new functionality needs to be supported on all platforms.  The
>point of the word "optional" in my proposal is that it is not required.  So, say, if git-
>replay is in Rust, well you've never had git-replay before in any release, so you
>haven't lost any functionality by it being implemented in Rust.  And existing things
>(merge, cherry-pick, rebase, etc.) continue working with C-only code.  But you may
>have one less optional addition.
>
>At least that was _my_ proposal -- that Rust be optional for now.  It does differ from
>what I think Taylor was originally proposing, but that's why I brought it up as an
>alternative proposal.

Thank you for the clarification.
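
For reference, the optional-Rust arrangement discussed above could look roughly like the sketch below on the Rust side. It assumes a hypothetical Cargo feature named `use_rust` (standing in for the USE_RUST build knob) and a made-up `count_lines` function; neither exists in Git, and both code paths are written in Rust here only to keep the example self-contained.

```rust
/// Portable fallback, standing in for the existing C-only code path.
pub fn count_lines_portable(data: &[u8]) -> usize {
    let mut n = 0;
    for &b in data {
        if b == b'\n' {
            n += 1;
        }
    }
    n
}

/// Selected only when the hypothetical `use_rust` feature is enabled.
#[cfg(feature = "use_rust")]
pub fn count_lines(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == b'\n').count()
}

/// Selected when the feature is off, i.e. the USE_RUST=0 style build.
#[cfg(not(feature = "use_rust"))]
pub fn count_lines(data: &[u8]) -> usize {
    count_lines_portable(data)
}
```

Callers only ever see `count_lines`, so a platform building without the feature keeps the same behavior through the fallback.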


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  5:39     ` Dragan Simic
@ 2024-01-11 16:57       ` Elijah Newren
  2024-01-17 21:30         ` Dragan Simic
  0 siblings, 1 reply; 38+ messages in thread
From: Elijah Newren @ 2024-01-11 16:57 UTC (permalink / raw)
  To: Dragan Simic; +Cc: Taylor Blau, git

Hi Dragan,

On Wed, Jan 10, 2024 at 9:39 PM Dragan Simic <dsimic@manjaro.org> wrote:
>
> On 2024-01-11 01:33, Elijah Newren wrote:
> > On Wed, Jan 10, 2024 at 1:57 PM Dragan Simic <dsimic@manjaro.org>
> > wrote:
> >>
> >> Thus, Git should probably follow the same approach of not converting
> >> the
> >> already existing code
> >
> > I disagree with this.  I saw significant performance improvements
> > through converting some existing Git code to Rust.  Granted, it was
> > only a small amount of code, but the performance benefits I saw
> > suggested we'd see more by also doing similar conversions elsewhere.
> > (Note that I kept the old C code and then conditionally compiled
> > either Rust or C versions of what I was converting.)
>
> Well, it's also possible that improving the old C code could also result
> in some performance improvements.  Thus, quite frankly, I don't see that
> as a valid argument to rewrite some existing C code in Rust.

Yes, and I've made many performance improvements in the C code in git.
Sometimes I make some of the code 5% or 20% faster.  Sometimes 1-3
orders of magnitude faster.  Once over 60 orders of magnitude
faster.[1]  Look around in git's history; I've done a fair amount of
performance stuff.

And I'm specifically arguing that I feel limited in some of the
performance work that can be done by remaining in C.  Part of my
reason for interest in Rust is exactly because I think it can help us
improve performance in ways that are far more difficult to achieve in
C.  And this isn't just guesswork, I've done some trials with it.
Further, I even took the time to document some of these reasons
elsewhere in this thread[2].  Arguing that some performance
improvements can be done in C is thus entirely missing the point.

If you want to dismiss the performance angle of argument for Rust, you
should take the time to address the actual reasons raised for why it
could make it easier to improve performance relative to continuing in
C.

Also, as a heads up since you seem to be relatively new to the list:
your position will probably carry more weight with others if you take
the time to understand, acknowledge, and/or address counterpoints of
the other party.  It is certainly fine to simply express some concerns
without doing so (Randall and Patrick did a good job of this in this
thread), but when you simply assert that the benefits others point out
simply don't exist (e.g. your "Quite frankly, that would _only_
complicate things and cause fragmentation." (emphasis added) from your
first email in this thread[3], and which this latest email of yours
somewhat looks like as well), others may well start applying a
discount to any positions you state.  Granted, it's totally up to you,
but I'm just giving a hint about how I think you might be able to be
more persuasive.


Hope that helps,
Elijah

[1] A couple examples: 6a5fb966720 ("Change default merge backend from
recursive to ort", 2021-08-04) and 8d92fb29270 ("dir: replace
exponential algorithm with a linear one", 2020-04-01)
[2] Footnote 6 of
https://lore.kernel.org/git/CABPp-BFOmwV-xBtjvtenb6RFz9wx2VWVpTeho0k=D8wsCCVwqQ@mail.gmail.com/
[3] https://lore.kernel.org/git/b2651b38a4f7edaf1c5ffee72af00e46@manjaro.org/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11  3:24             ` rsbecker
@ 2024-01-11 20:07               ` Trevor Gross
  2024-01-11 21:28                 ` rsbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Trevor Gross @ 2024-01-11 20:07 UTC (permalink / raw)
  To: rsbecker; +Cc: brian m. carlson, Taylor Blau, Junio C Hamano, Dragan Simic, git

On Wed, Jan 10, 2024 at 10:24 PM <rsbecker@nexbridge.com> wrote:
>
> There are a number of issues for porting gcc (and Go). The list is fairly long, but the summary of what I encountered directly (on the last funded effort of 3) is:
> 1. There are C syntax constructs required to do anything useful (required for access to the OS API) on NonStop that are not in gcc. I can hand code the parser for that, but it would take time.
> 2. The Big Endian x86 architecture is weird to gcc and making that work is not easy.
> 3. There is no assembler on NonStop.
> 4. The ELF header is very different from standard.
> 5. The symbol table structure is radically different, so debugging would be (nearly) impossible or impractical. gdb was ported to account for the platform differences.
> 6. The linkage structure is similar but different from standard.
> 7. The external fixup structure is radically different.
> 8. The loader does not work the same way, so there are required sections of the ELF files on NonStop that are not generated by gcc.
>
> There are more, but I just did not get to the point of hitting them. Part of my own issue is that I have expertise in parsing and semantic passes of compilers, but my code generation skills are not where I want them to be for taking on this effort. Our last funded attempt had a code generation expert and he gave up in frustration.
>
> If I was hired on to do this, it might have a chance, but at an estimate (not mine) of 4-5 person years for a gcc port, best case, my $DAYJOB will not permit it.
>
> If gcc could be ported to NonStop, it would solve so many problems. I have heard of numerous failed efforts beyond what was officially funded by various companies, so this is considered a high-risk project.

Out of curiosity - does the Tandem compiler (assuming that is the
correct name) have a backend that is usable as a library or via an IR?

If so, maybe it would be possible to write a rustc_codegen_tandem
backend like the three that exist (rustc_codegen_{llvm,gcc,cranelift}
at [1]. GCC and cranelift are still under development). This way you
sidestep a lot of the codegen-specific problems listed above.

I am, of course, not suggesting this as a solution for git and am sure
you would rather have GCC support. But I wonder how feasible this
would be if Rust on NonStop is desired at some point.

[1]: https://github.com/rust-lang/rust/tree/062e7c6a951c1e4f33c0a6f6761755949cde15ec/compiler

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [DISCUSS] Introducing Rust into the Git project
  2024-01-11 20:07               ` Trevor Gross
@ 2024-01-11 21:28                 ` rsbecker
  2024-01-11 23:23                   ` Trevor Gross
  0 siblings, 1 reply; 38+ messages in thread
From: rsbecker @ 2024-01-11 21:28 UTC (permalink / raw)
  To: 'Trevor Gross'
  Cc: 'brian m. carlson', 'Taylor Blau',
	'Junio C Hamano', 'Dragan Simic',
	git

On Thursday, January 11, 2024 3:08 PM, Trevor Gross wrote:
>On Wed, Jan 10, 2024 at 10:24 PM <rsbecker@nexbridge.com> wrote:
>>
>> There are a number of issues for porting gcc (and Go). The list is fairly long, but the
>summary of what I encountered directly (on the last funded effort of 3) is:
>> 1. There are C syntax constructs required to do anything useful (required for
>access to the OS API) on NonStop that are not in gcc. I can hand code the parser for
>that, but it would take time.
>> 2. The Big Endian x86 architecture is weird to gcc and making that work is not
>easy.
>> 3. There is no assembler on NonStop.
>> 4. The ELF header is very different from standard.
>> 5. The symbol table structure is radically different, so debugging would be (nearly)
>impossible or impractical. gdb was ported to account for the platform differences.
>> 6. The linkage structure is similar but different from standard.
>> 7. The external fixup structure is radically different.
>> 8. The loader does not work the same way, so there are required sections of the
>ELF files on NonStop that are not generated by gcc.
>>
>> There are more, but I just did not get to the point of hitting them. Part of my own
>issue is that I have expertise in parsing and semantic passes of compilers, but my
>code generation skills are not where I want them to be for taking on this effort. Our
>last funded attempt had a code generation expert and he gave up in frustration.
>>
>> If I was hired on to do this, it might have a chance, but at an estimate (not mine)
>of 4-5 person years for a gcc port, best case, my $DAYJOB will not permit it.
>>
>> If gcc could be ported to NonStop, it would solve so many problems. I have heard
>of numerous failed efforts beyond what was officially funded by various companies,
>so this is considered a high-risk project.
>
>Out of curiosity - does the Tandem compiler (assuming that is the correct name)
>have a backend that is usable as a library or via an IR?
>
>If so, maybe it would be possible to write a rustc_codegen_tandem backend like the
>three that exist (rustc_codegen_{llvm,gcc,cranelift}
>at [1]. GCC and cranelift are still under development). This way you sidestep a lot of
>the codegen-specific problems listed above.
>
>I am, of course, not suggesting this as a solution for git and am sure you would
>rather have GCC support. But I wonder how feasible this would be if Rust on
>NonStop is desired at some point.

The usable compilers and interpreters on NonStop are c89, c99 (what we use for git), c11, perl, and python3 (for the x86 only). The perl and python do not have sufficient modules to do what would be needed by git. The compilers are invoked using a CLI and are not callable using a library. gcc is, for all intents and purposes, not possible - so anything requiring gcc (for example, Rust), cannot be built.  There is no back-end pluggable component for any of the compilers.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11 21:28                 ` rsbecker
@ 2024-01-11 23:23                   ` Trevor Gross
  0 siblings, 0 replies; 38+ messages in thread
From: Trevor Gross @ 2024-01-11 23:23 UTC (permalink / raw)
  To: rsbecker; +Cc: brian m. carlson, Taylor Blau, Junio C Hamano, Dragan Simic, git

On Thu, Jan 11, 2024 at 4:28 PM <rsbecker@nexbridge.com> wrote:
>
> The usable compilers and interpreters on NonStop are c89, c99 (what we use for git), c11, perl, and python3 (for the x86 only). The perl and python do not have sufficient modules to do what would be needed by git. The compilers are invoked using a CLI and are not callable using a library. gcc is, for all intents and purposes, not possible - so anything requiring gcc (for example, Rust), cannot be built.  There is no back-end pluggable component for any of the compilers.
>

Ah, no pluggable backend is unfortunate. Rust only uses GCC to build
the LLVM backend; it isn't actually needed for the language. It does
link libgcc_s for unwinding and I believe some math symbols, but
unwinding can be disabled and other symbols can come from anywhere.
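
As a rough illustration of the point about unwinding: Rust code can check at compile time which panic strategy it was built with (passing `-C panic=abort` to rustc disables unwinding). The function names below are made up for the example.

```rust
use std::panic;

/// Reports the panic strategy this code was compiled with.
pub fn panic_strategy() -> &'static str {
    if cfg!(panic = "unwind") {
        "unwind"
    } else {
        "abort"
    }
}

/// Runs `f` and returns its result, or `None` if it panicked.
/// A panic can only be caught when the build uses the "unwind" strategy;
/// with "abort" the process terminates instead.
pub fn run_guarded<F>(f: F) -> Option<i32>
where
    F: FnOnce() -> i32 + panic::UnwindSafe,
{
    panic::catch_unwind(f).ok()
}
```

With unwinding disabled there is no need for the unwinder's runtime support, at the cost of not being able to recover from a panic.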

If you can build mrustc (C++ program) [1] then you can use it to
transpile Rust to C. This is one way rustc can be bootstrapped without
an existing Rust toolchain, and would be how you bring it up with a
different backend on a new platform.

Still, this probably wouldn't be a solution for git.

[1]: https://github.com/thepowersgang/mrustc

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11 11:45 ` Sam James
@ 2024-01-11 23:48   ` brian m. carlson
  2024-01-12  8:24     ` Sam James
  0 siblings, 1 reply; 38+ messages in thread
From: brian m. carlson @ 2024-01-11 23:48 UTC (permalink / raw)
  To: Sam James; +Cc: me, git

[-- Attachment #1: Type: text/plain, Size: 1251 bytes --]

On 2024-01-11 at 11:45:07, Sam James wrote:
> Something I'm a bit concerned about is that right now, neither
> rustc_codegen_gcc nor gccrs are ready for use here.
> 
> We've had trouble getting things wired up for rustc_codegen_gcc
> - which is not to speak against their wonderful efforts - because
> the Rust community hasn't yet figured out how to handle things which
> pure rustc doesn't support yet. See
> e.g. https://github.com/rust-lang/libc/pull/3032.

Is this simply library support in the libc crate?  That's very easy to add.

> I think care should be taken in citing rustc_codegen_gcc and gccrs
> as options for alternative platforms for now. They will hopefully
> be great options in the future, but they aren't today, and they probably
> won't be in the next 6 months at the least.

What specifically is missing for rustc_codegen_gcc?  I know gccrs is not
ready at the moment, but I was under the impression that
rustc_codegen_gcc was at least usable.  I'm aware it requires some
patches to GCC, but distros should be able to carry those.

If rustc_codegen_gcc isn't viable, then I agree we should avoid making
Rust mandatory, but I'd like to learn more.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-10 20:16 [DISCUSS] Introducing Rust into the Git project Taylor Blau
                   ` (3 preceding siblings ...)
  2024-01-11 11:45 ` Sam James
@ 2024-01-11 23:53 ` Trevor Gross
  4 siblings, 0 replies; 38+ messages in thread
From: Trevor Gross @ 2024-01-11 23:53 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Wed, Jan 10, 2024 at 3:19 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> Over the holiday break at the end of last year I spent some time
> thinking on what it would take to introduce Rust into the Git project.
>
> There is significant work underway to introduce Rust into the Linux
> kernel (see [1], [2]). Among their stated goals, I think there are a few
> which could be potentially relevant to the Git project:
>
>   - Lower risk of memory safety bugs, data races, memory leaks, etc.
>     thanks to the language's safety guarantees.
>
>   - Easier to gain confidence when refactoring or introducing new code
>     in Rust (assuming little to no use of the language's `unsafe`
>     feature).
>
>   - Contributing to Git becomes easier and accessible to a broader group
>     of programmers by relying on a more modern language.
>
> Given the allure of these benefits, I think it's at least worth
> considering and discussing how Rust might make its way into Junio's
> tree.
>
> I imagine that the transition state would involve some parts of the
> project being built in C and calling into Rust code via FFI (and perhaps
> vice-versa, with Rust code calling back into the existing C codebase).
> Luckily for us, Rust's FFI provides a zero-cost abstraction [3], meaning
> there is no performance impact when calling code from one language in
> the other.
>
> Some open questions from me, at least to get the discussion going are:
>
>   1. Platform support. The Rust compiler (rustc) does not enjoy the same
>      widespread availability that C compilers do. For instance, I
>      suspect that NonStop, AIX, Solaris, among others may not be
>      supported.
>
>      One possible alternative is to have those platforms use a Rust
>      front-end for a compiler that they do support. The gccrs [4]
>      project would allow us to compile Rust anywhere where GCC is
>      available. The rustc_codegen_gcc [5] project uses GCC's libgccjit
>      API to target GCC from rustc itself.
>
>   2. Migration. What parts of Git are easiest to convert to Rust? My
>      hunch is that the answer is any stand-alone libraries, like
>      strbuf.h. I'm not sure how we should identify these, though, and in
>      what order we would want to move them over.
>
>   3. Interaction with the lib-ification effort. There is lots of work
>      going on in an effort to lib-ify much of the Git codebase done by
>      Google. I'm not sure how this would interact with that effort, but
>      we should make sure that one isn't a blocker for the other.
>
> I'm curious to hear what others think about this. I think that this
> would be an exciting and worthwhile direction for the project. Let's
> see!
>
> Thanks,
> Taylor
>
> [1]: https://rust-for-linux.com/
> [2]: https://lore.kernel.org/rust-for-linux/20210414184604.23473-1-ojeda@kernel.org/
> [3]: https://blog.rust-lang.org/2015/04/24/Rust-Once-Run-Everywhere.html#c-talking-to-rust
> [4]: https://github.com/Rust-GCC/gccrs
> [5]: https://github.com/rust-lang/rustc_codegen_gcc
>

Two good reference codebases out there:

Abstractions over libgit2
    Repo: https://github.com/rust-lang/git2-rs
    Docs: https://docs.rs/git2/latest/git2/

gix, a WIP reimplementation of git. This is far from complete but does
a lot of threading / async to apparently get quite fast.
    Repo: https://github.com/Byron/gitoxide
    Docs: https://docs.rs/gix/latest/gix/

If the git project does decide to go forward with this, there is
probably a lot of completed work that can be pulled from either of
those sources.
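
For context on the C-to-Rust FFI boundary mentioned in the quoted proposal, here is a minimal sketch of a Rust function with a C-compatible signature. The function name and its behavior are invented for illustration and are not taken from Git, git2-rs, or gitoxide.

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical helper: counts the non-empty '/'-separated components
/// of a NUL-terminated C string. Returns -1 for a null pointer or for
/// bytes that are not valid UTF-8.
///
/// A real export would also need a `no_mangle` attribute so the C
/// linker can find the symbol; it is left out to keep the sketch small.
///
/// # Safety
/// `path` must be null or point to a valid NUL-terminated string.
pub unsafe extern "C" fn demo_count_components(path: *const c_char) -> i32 {
    if path.is_null() {
        return -1;
    }
    // Borrow the C string without copying it.
    let s = match unsafe { CStr::from_ptr(path) }.to_str() {
        Ok(s) => s,
        Err(_) => return -1,
    };
    s.split('/').filter(|part| !part.is_empty()).count() as i32
}
```

On the C side this would be declared as `int demo_count_components(const char *path);`. The call itself is an ordinary function call, but every pointer crossing the boundary has to be validated by hand.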

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11 23:48   ` brian m. carlson
@ 2024-01-12  8:24     ` Sam James
  2024-01-12 14:46       ` Antoni Boucher
  0 siblings, 1 reply; 38+ messages in thread
From: Sam James @ 2024-01-12  8:24 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Sam James, me, git, John Paul Adrian Glaubitz, Helge Deller,
	John David Anglin, arsen, Antoni Boucher


"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> [[PGP Signed Part:Undecided]]
> On 2024-01-11 at 11:45:07, Sam James wrote:
>> Something I'm a bit concerned about is that right now, neither
>> rustc_codegen_gcc nor gccrs are ready for use here.
>> 
>> We've had trouble getting things wired up for rustc_codegen_gcc
>> - which is not to speak against their wonderful efforts - because
>> the Rust community hasn't yet figured out how to handle things which
>> pure rustc doesn't support yet. See
>> e.g. https://github.com/rust-lang/libc/pull/3032.
>
> Is this simply library support in the libc crate?  That's very easy to add.

[CC'd the rustc_codegen_gcc maintainer as well as some folks who have
tried using rustc_codegen_gcc for their distributions.]

Evidently not on the last point? ;)

Even just patching it in downstream isn't easy because you then have to
do it for many many packages. But after that PR stalling because of the
policy issue, there wasn't really anywhere to go, because of the
chicken-and-egg situation.

Let alone then, once the libc crate has it, going around and wiring up
it up in other crates.

The discussion on the PR seems clear that the intention is to not add
it until some policy is revised/formulated? I also don't want to have
to have that debate with every crate just because rustc doesn't support
it.

>
>> I think care should be taken in citing rustc_codegen_gcc and gccrs
>> as options for alternative platforms for now. They will hopefully
>> be great options in the future, but they aren't today, and they probably
>> won't be in the next 6 months at the least.
>
> What specifically is missing for rustc_codegen_gcc?  I know gccrs is not
> ready at the moment, but I was under the impression that
> rustc_codegen_gcc was at least usable.  I'm aware it requires some
> patches to GCC, but distros should be able to carry those.
>
> If rustc_codegen_gcc isn't viable, then I agree we should avoid making
> Rust mandatory, but I'd like to learn more.

It's in a general state of instability. There's still *very* active work
ongoing in libgccjit (by the rustc_codegen_gcc maintainer).

I'd say "you need to patch your GCC" is probably not a good state of
affairs for using something critical like git anyway, but even then,
I'm not aware of anyone having used it to build real-world common
applications using Rust for a non-rustc-supported platform, at least
not then using those builds day-to-day.

So, even if we were willing to chase the active flurry of libgccjit
patches (which is wonderful to see!), it's a significant moving
target. In Gentoo, we're probably better-placed than most people
to be able to do that, but it's still a lot of work and it doesn't
sound very robust for us to be doing for core infrastructure.

We have a lot of packages in Gentoo - partly actually stuff in the
Python ecosystem - where we're very excited to be able to use
rustc_codegen_gcc (or gccrs, whichever comes first in readiness, surely
rustc_codegen_gcc) for alt platforms, but it's just not there yet.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-12  8:24     ` Sam James
@ 2024-01-12 14:46       ` Antoni Boucher
  0 siblings, 0 replies; 38+ messages in thread
From: Antoni Boucher @ 2024-01-12 14:46 UTC (permalink / raw)
  To: Sam James, brian m. carlson
  Cc: me, git, John Paul Adrian Glaubitz, Helge Deller,
	John David Anglin, arsen

While usable, there are a few things missing in rustc_codegen_gcc:

 * Unwinding doesn't work correctly when compiling Rust code in release
mode.
 * Rustup distribution: might not be mandatory, but I guess it would be
very helpful to have an easy way to install rustc_codegen_gcc and to be
able to pin to a specific version.
 * Debug info: again might not be mandatory, but would be helpful.
 * Has not been tested on many platforms: SuperH, ARC and m68k [1] have
had a few tests, and someone is currently experimenting on AVR. So while
it's possible to use Rust on them, that doesn't mean everything works
(in particular, I know that changes will be needed to both the Rust spec
file and the standard library, or its tests, for m68k). Related to the
platform support, could you please send me a list of platforms where git
is officially supported?
 * Not sure if it would be needed, but the new inline asm syntax is not
supported on architectures not supported by rustc.
 * I also expect bad compilation in some cases.

> Is this simply library support in the libc crate?  That's very easy
> to add.

We might also need to update the object crate.

As for the progress, we plan to have most of the patches merged for
libgccjit 14, but one important one will be missing because it's not
ready (the one for try/catch that is necessary to support Rust panics).
I expect there will be far fewer patches for libgccjit 15: probably
try/catch and bug fixing for the most part.
We also plan to have rustup distribution in the coming months, so
that's something that will help for adoption.
Along with rustup distribution, we plan on making architectures
currently not supported by rustc usable more easily in the coming
months.

Recently, I built and ran the tests of a dozen of the most popular
crates and all of their tests passed [2]. And rustc_codegen_gcc was
already able to build the Rust compiler in March 2022 and while not
completely working, the resulting compiler could compile a "Hello,
world!" [3].

[1] https://github.com/rust-lang/rustc_codegen_gcc/wiki
[2]
https://blog.antoyo.xyz/rustc_codegen_gcc-progress-report-26#state_of_compiling_popular_crates
[3] https://blog.antoyo.xyz/rustc_codegen_gcc-progress-report-10

On Fri, 2024-01-12 at 08:24 +0000, Sam James wrote:
> 
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> > [[PGP Signed Part:Undecided]]
> > On 2024-01-11 at 11:45:07, Sam James wrote:
> > > Something I'm a bit concerned about is that right now, neither
> > > rustc_codegen_gcc nor gccrs are ready for use here.
> > > 
> > > We've had trouble getting things wired up for rustc_codegen_gcc
> > > - which is not to speak against their wonderful efforts - because
> > > the Rust community hasn't yet figured out how to handle things
> > > which
> > > pure rustc doesn't support yet. See
> > > e.g. https://github.com/rust-lang/libc/pull/3032.
> > 
> > Is this simply library support in the libc crate?  That's very easy
> > to add.
> 
> [CC'd the rustc_codegen_gcc maintainer as well as some folks who have
> tried using rustc_codegen_gcc for their distributions.]
> 
> Evidently not on the last point? ;)
> 
> Even just patching it in downstream isn't easy because you then have
> to
> do it for many many packages. But after that PR stalling because of
> the
> policy issue, there wasn't really anywhere to go, because of the
> chicken-and-egg situation.
> 
> Let alone then, once the libc crate has it, going around and wiring
> it up
> in other crates.
> 
> The discussion on the PR seems clear that the intention is to not add
> it until some policy is revised/formulated? I also don't want to have
> to have that debate with every crate just because rustc doesn't
> support
> it.
> 
> > 
> > > I think care should be taken in citing rustc_codegen_gcc and
> > > gccrs
> > > as options for alternative platforms for now. They will hopefully
> > > be great options in the future, but they aren't today, and they
> > > probably
> > > won't be in the next 6 months at the least.
> > 
> > What specifically is missing for rustc_codegen_gcc?  I know gccrs is
> > not
> > ready at the moment, but I was under the impression that
> > rustc_codegen_gcc was at least usable.  I'm aware it requires some
> > patches to GCC, but distros should be able to carry those.
> > 
> > If rustc_codegen_gcc isn't viable, then I agree we should avoid
> > making
> > Rust mandatory, but I'd like to learn more.
> 
> It's in a general state of instability. There's still *very* active
> work
> ongoing in libgccjit (by the rustc_codegen_gcc maintainer).
> 
> I'd say "you need to patch your GCC" is probably not a good state of
> affairs for using something critical like git anyway, but even then,
> I'm not aware of anyone having used it to build real-world common
> applications using Rust for a non-rustc-supported platform, at least
> not then using those builds day-to-day.
> 
> So, even if we were willing to chase the active flurry of libgccjit
> patches (which is wonderful to see!), it's a significant moving
> target. In Gentoo, we're probably better-placed than most people
> to be able to do that, but it's still a lot of work and it doesn't
> sound very robust for us to be doing for core infrastructure.
> 
> We have a lot of packages in Gentoo - partly actually stuff in the
> Python ecosystem - where we're very excited to be able to use
> rustc_codegen_gcc (or gccrs, whichever comes first in readiness, surely
> rustc_codegen_gcc) for alt platforms, but it's just not there yet.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-11 16:57       ` Elijah Newren
@ 2024-01-17 21:30         ` Dragan Simic
  2024-01-24  4:15           ` Elijah Newren
  0 siblings, 1 reply; 38+ messages in thread
From: Dragan Simic @ 2024-01-17 21:30 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Taylor Blau, git

On 2024-01-11 17:57, Elijah Newren wrote:
> Hi Dragan,

I apologize for my delayed response.

> On Wed, Jan 10, 2024 at 9:39 PM Dragan Simic <dsimic@manjaro.org> 
> wrote:
>> 
>> On 2024-01-11 01:33, Elijah Newren wrote:
>> > On Wed, Jan 10, 2024 at 1:57 PM Dragan Simic <dsimic@manjaro.org>
>> > wrote:
>> >>
>> >> Thus, Git should probably follow the same approach of not converting
>> >> the
>> >> already existing code
>> >
>> > I disagree with this.  I saw significant performance improvements
>> > through converting some existing Git code to Rust.  Granted, it was
>> > only a small amount of code, but the performance benefits I saw
>> > suggested we'd see more by also doing similar conversions elsewhere.
>> > (Note that I kept the old C code and then conditionally compiled
>> > either Rust or C versions of what I was converting.)
>> 
>> Well, it's also possible that improving the old C code could also 
>> result
>> in some performance improvements.  Thus, quite frankly, I don't see 
>> that
>> as a valid argument to rewrite some existing C code in Rust.
> 
> Yes, and I've made many performance improvements in the C code in git.
> Sometimes I make some of the code 5% or 20% faster.  Sometimes 1-3
> orders of magnitude faster.  Once over 60 orders of magnitude
> faster.[1]  Look around in git's history; I've done a fair amount of
> performance stuff.

Thank you very much for your work!

> And I'm specifically arguing that I feel limited in some of the
> performance work that can be done by remaining in C.  Part of my
> reason for interest in Rust is exactly because I think it can help us
> improve performance in ways that are far more difficult to achieve in
> C.  And this isn't just guesswork, I've done some trials with it.
> Further, I even took the time to document some of these reasons
> elsewhere in this thread[2].  Arguing that some performance
> improvements can be done in C is thus entirely missing the point.
> 
> If you want to dismiss the performance angle of argument for Rust, you
> should take the time to address the actual reasons raised for why it
> could make it easier to improve performance relative to continuing in
> C.
> 
> Also, as a heads up since you seem to be relatively new to the list:
> your position will probably carry more weight with others if you take
> the time to understand, acknowledge, and/or address counterpoints of
> the other party.  It is certainly fine to simply express some concerns
> without doing so (Randall and Patrick did a good job of this in this
> thread), but when you simply assert that the benefits others point out
> simply don't exist (e.g. your "Quite frankly, that would _only_
> complicate things and cause fragmentation." (emphasis added) from your
> first email in this thread[3], and which this latest email of yours
> somewhat looks like as well), others may well start applying a
> discount to any positions you state.  Granted, it's totally up to you,
> but I'm just giving a hint about how I think you might be able to be
> more persuasive.

I totally agree with your suggestions, and I'm thankful for the time it 
took you to write it all down.  I'll take your advice and refrain
from expressing my opinions in this thread.

> [1] A couple examples: 6a5fb966720 ("Change default merge backend from
> recursive to ort", 2021-08-04) and 8d92fb29270 ("dir: replace
> exponential algorithm with a linear one", 2020-04-01)
> [2] Footnote 6 of
> https://lore.kernel.org/git/CABPp-BFOmwV-xBtjvtenb6RFz9wx2VWVpTeho0k=D8wsCCVwqQ@mail.gmail.com/
> [3] 
> https://lore.kernel.org/git/b2651b38a4f7edaf1c5ffee72af00e46@manjaro.org/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project)
  2024-01-10 23:52         ` rsbecker
  2024-01-11  0:59           ` Elijah Newren
  2024-01-11  2:55           ` brian m. carlson
@ 2024-01-22 23:17           ` Emily Shaffer
  2024-01-23  0:11             ` rsbecker
                               ` (2 more replies)
  2 siblings, 3 replies; 38+ messages in thread
From: Emily Shaffer @ 2024-01-22 23:17 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: Taylor Blau, Junio C Hamano, Dragan Simic, Git List, Johannes Schindelin

On Wed, Jan 10, 2024 at 3:52 PM <rsbecker@nexbridge.com> wrote:
>
> On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
> >On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
> >> Just a brief concern: Rust is not broadly portable. Adding another
> >> dependency to git will remove many existing platforms from future releases.
> >> Please consider this carefully before going down this path.
> >
> >I was hoping to hear from you as one of the few (only?) folks who participate on
> >the list and represent HPE NonStop users.
> >
> >I'm curious which if any of the compiler frontends that I listed in my earlier email
> >would work for you.
>
> Unfortunately, none of the compiler frontends listed previously can be built for NonStop. These appear to all require gcc either directly or transitively, which cannot be ported to NonStop. I do not expect this to change any time soon - and is outside of my control anyway. An attempt was made to port Rust but it did not succeed primarily because of that dependency. Similarly, Golang is also not portable to NonStop because of architecture assumptions made by the Go team that cannot be satisfied on NonStop at this time. If some of the memory/pointer issues are the primary concern, c11 might be something acceptable with smart pointers. C17 will eventually be deployable, but is not available on most currently supported OS versions on the platform.

I hope y'all don't mind me hijacking this part of the thread ;)

But, Randall's remarks bring up something pretty compelling: I don't
think Git has a clearly defined platform support policy. As far as I
can tell, the support policy now is "if you run `make test` on it and
it breaks, and you let us know, we'll try to fix it" - without much in
the way of additional caveats. If I look in CodingGuidelines I see a
few "this doesn't work on platform X so don't do it" (like around %z
in printf), but nowhere do I see "how to know if your platform is
supported" or even "here are platforms we have heard Git works OK on".

That causes a lot of confusion for the project - threads like this one
(and presumably a similar one about C99 adoption) become a blend of
"is this change good for the project or not?" and "will this change
leave behind platform X?" that is difficult to pick apart.

Does it make sense for us to formalize a support policy? For example,
if we wanted to formalize the status quo, I could envision:

"""
Platform support: We make a best-effort attempt to solve any bugs
reported to the list, regardless of platform. To prevent breakages in
the first place, consider running Git's `make test` regularly on your
platform and reporting the results to git@vger.kernel.org; or, better
yet, consider adding your platform to the GitHub Actions CI
(configured in `.github/`).
"""

Or, if we wanted to be able to move very nimbly, we could imagine
something much more restrictive (note that I'm not endorsing it, just
illustrating):

"""
Platform support: Git is guaranteed to work well on Linux platforms
using a kernel version that is less than 1 year old. Support for all
other platforms is best-effort; when reporting a bug on another
platform, you may need to patch the issue and verify your fix
yourself.
"""

I suspect there's a happy medium in here somewhere - trying to fix (or
avoid) an issue on a platform which the average developer cannot run
tests on is not a recipe for a happy developer, and a general policy
of "patches welcome" for anything but latest Linux is not a recipe for
happy users.

I see a few axes we can play with:
 * which architectures/kernels/OS (do we care about more than the
usual suspects of Linux/Mac/Windows // x86/amd/arm //
POSIX-compliant?)
 * age of architectures/kernels (do we care to offer full support for
a 10 or 15 year old OS?)
 * new feature compatibility guarantees vs. core
functionality/security fix guarantees (which do we really define
"support" as?)
 * test provisioning (do we require a VM we can run CI on, or is a
report generated from a nightly build and mailed to the list OK?)
 * test/breakage timing (should the above tests run on every commit to
'next'? every merge to 'master'? every RC?)
 * who provides the support (is it the patch author's responsibility
to fix late-breaking platform support bugs? is it the reporter's
responsibility? and especially, how does this interplay with test
provisioning and frequency above?)

If we had clearer answers to these questions, it'd be much simpler to
determine whether experimentation with Rust is possible or useful.
Plus it would make developer lives easier, in general, to understand
how much compatibility support work they're potentially signing up for
when sending a change of any size.

 - Emily

* RE: Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project)
  2024-01-22 23:17           ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Emily Shaffer
@ 2024-01-23  0:11             ` rsbecker
  2024-01-23  0:57               ` Defining a platform support policy Junio C Hamano
  2024-01-23  0:31             ` Junio C Hamano
  2024-01-24  7:54             ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Elijah Newren
  2 siblings, 1 reply; 38+ messages in thread
From: rsbecker @ 2024-01-23  0:11 UTC (permalink / raw)
  To: 'Emily Shaffer'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic', 'Git List',
	'Johannes Schindelin'

On Monday, January 22, 2024 6:18 PM, Emily Shaffer wrote:
>To: Randall S. Becker <rsbecker@nexbridge.com>
>Cc: Taylor Blau <me@ttaylorr.com>; Junio C Hamano <gitster@pobox.com>; Dragan
>Simic <dsimic@manjaro.org>; Git List <git@vger.kernel.org>; Johannes Schindelin
><Johannes.Schindelin@gmx.de>
>Subject: Defining a platform support policy (Was: [DISCUSS] Introducing Rust into
>the Git project)
>
>On Wed, Jan 10, 2024 at 3:52 PM <rsbecker@nexbridge.com> wrote:
>>
>> On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
>> >On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
>> >> Just a brief concern: Rust is not broadly portable. Adding another
>> >> dependency to git will remove many existing platforms from future releases.
>> >> Please consider this carefully before going down this path.
>> >
>> >I was hoping to hear from you as one of the few (only?) folks who
>> >participate on the list and represent HPE NonStop users.
>> >
>> >I'm curious which if any of the compiler frontends that I listed in
>> >my earlier email would work for you.
>>
>> Unfortunately, none of the compiler frontends listed previously can be built for
>NonStop. These appear to all require gcc either directly or transitively, which cannot
>be ported to NonStop. I do not expect this to change any time soon - and is outside
>of my control anyway. An attempt was made to port Rust but it did not succeed
>primarily because of that dependency. Similarly, Golang is also not portable to
>NonStop because of architecture assumptions made by the Go team that cannot be
>satisfied on NonStop at this time. If some of the memory/pointer issues are the
>primary concern, c11 might be something acceptable with smart pointers. C17 will
>eventually be deployable, but is not available on most currently supported OS
>versions on the platform.
>
>I hope y'all don't mind me hijacking this part of the thread ;)

I'm happy you did this. The topic is crucial, if for no other reason than my ability to sleep at night. I am preserving Emily's comments without snipping, as these are important questions and comments.

>But, Randall's remarks bring up something pretty compelling: I don't think Git has a
>clearly defined platform support policy. As far as I can tell, the support policy now is
>"if you run `make test` on it and it breaks, and you let us know, we'll try to fix it" -
>without much in the way of additional caveats. If I look in CodingGuidelines I see a
>few "this doesn't work on platform X so don't do it" (like around %z in printf), but
>nowhere do I see "how to know if your platform is supported" or even "here are
>platforms we have heard Git works OK on".
>
>That causes a lot of confusion for the project - threads like this one (and presumably
>a similar one about C99 adoption) become a blend of "is this change good for the
>project or not?" and "will this change leave behind platform X?" that is difficult to
>pick apart.
>
>Does it make sense for us to formalize a support policy? For example, if we wanted
>to formalize the status quo, I could envision:
>
>"""
>Platform support: We make a best-effort attempt to solve any bugs reported to the
>list, regardless of platform. To prevent breakages in the first place, consider running
>Git's `make test` regularly on your platform and reporting the results to
>git@vger.kernel.org; or, better yet, consider adding your platform to the GitHub
>Actions CI (configured in `.github/`).
>"""
>
>Or, if we wanted to be able to move very nimbly, we could imagine something much
>more restrictive (note that I'm not endorsing it, just
>illustrating):
>
>"""
>Platform support: Git is guaranteed to work well on Linux platforms using a kernel
>version that is less than 1 year old. Support for all other platforms is best-effort;
>when reporting a bug on another platform, you may need to patch the issue and
>verify your fix yourself.
>"""
>
>I suspect there's a happy medium in here somewhere - trying to fix (or
>avoid) an issue on a platform which the average developer cannot run tests on is
>not a recipe for a happy developer, and a general policy of "patches welcome" for
>anything but latest Linux is not a recipe for happy users.
>
>I see a few axes we can play with:
> * which architectures/kernels/OS (do we care about more than the usual suspects
>of Linux/Mac/Windows // x86/amd/arm //
>POSIX-compliant?)
> * age of architectures/kernels (do we care to offer full support for a 10 or 15 year
>old OS?)
> * new feature compatibility guarantees vs. core functionality/security fix
>guarantees (which do we really define "support" as?)
> * test provisioning (do we require a VM we can run CI on, or is a report generated
>from a nightly build and mailed to the list OK?)
> * test/breakage timing (should the above tests run on every commit to 'next'?
>every merge to 'master'? every RC?)
> * who provides the support (is it the patch author's responsibility to fix late-
>breaking platform support bugs? is it the reporter's responsibility? and especially,
>how does this interplay with test provisioning and frequency above?)
>
>If we had clearer answers to these questions, it'd be much simpler to determine
>whether experimentation with Rust is possible or useful.
>Plus it would make developer lives easier, in general, to understand how much
>compatibility support work they're potentially signing up for when sending a change
>of any size.

I think we might want to add some considerations to the above list that go beyond what other projects use, OpenSSL as an example:

* Can support for exotic platforms be delegated to some "community" support concept? In NonStop's case, I currently do 99% of the verification that each release runs properly. If I am able to provide a fix, I will. We have been fortunate that most problems/solutions have been of general interest and impact, with my platforms being more of a "Canary in the Coalmine" situation where we just encounter it first because of edge conditions, but other platforms may be impacted. The problem here is how long a designated community support person(s) can keep supporting git and what happens when they (me) retire or get hit by a bus. Like all good NonStop people, I have a backup, so git does not need to worry about me specifically.

* What is the broad impact of dropping support for a platform that has a significant investment in git, where loss of support could have societal impact? There are platforms with 100 million to 1000 million lines of code managed by git today. This type of investment-related impact is specific to git compared to other Open-Source products. Leaving debit or credit card authorizers without a supported git would be, let's say, "bad".

* Could stakeholders be consulted before changing support levels? Yes, I get that commercial fee-based products hit this more than Open-Source. Looking at other products in the Open-Source space, there are fee-based support models that could be developed for long-term support (beyond the obvious LTS-type considerations - see OpenSSL's model for reference). A related question is: "If there is a bug detected in git, what version is the oldest supported git version to which a fix can be made?" 2.0.0? 1.8.0? (Some Linux distros' RPM or APT repositories are seriously guilty here.) My MacOS server (just a couple of years ago) came with 1.8.0. I can't answer that question if asked by a customer. An alternative model, which seems to be informally embraced by git, is "please upgrade to the latest - or a fix on the latest few releases". But this position puts pressure on the team to maintain platform compatibility for indefinite periods.

* What level of compatibility will be more appropriate to ensure git's reputation as the gold-standard version control platform? Without a board of directors, or at least an advisory board, this might not be answerable (or even decidable). That role has been taken up, by intent and/or because it has to be done, by our quite awesome committers.

These last two points put a serious amount of pressure for compatibility on both the customer and the git dev team to keep compatibility in the latest release with all platforms, especially the exotic platforms - mine included - although dropping those is not a good approach either. As an example (not intended as guidance), if we said something like git 2.40.0 is LTS until September 2026, and security fixes (and critical functional ones) will be done going back to that version, and anything older requires an extended fee-based support contract, I think it would make some organizations more comfortable with the support model. This is not a panacea and there are some obviously difficult concerns here - causing git's support model to vary from some Linux distro models from the earliest inception of both products.

I don't have a good answer to any of this that would satisfy everyone. I'm not sure there is one.
--Randall


* Re: Defining a platform support policy
  2024-01-22 23:17           ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Emily Shaffer
  2024-01-23  0:11             ` rsbecker
@ 2024-01-23  0:31             ` Junio C Hamano
  2024-01-24  7:54             ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Elijah Newren
  2 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2024-01-23  0:31 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Randall S. Becker, Taylor Blau, Dragan Simic, Git List,
	Johannes Schindelin

Emily Shaffer <nasamuffin@google.com> writes:

> But, Randall's remarks bring up something pretty compelling: I don't
> think Git has a clearly defined platform support policy. As far as I
> can tell, the support policy now is "if you run `make test` on it and
> it breaks, and you let us know, we'll try to fix it"

I doubt this part.  If there is somebody motivated enough among us
who has access to such a platform, then that person may try to fix
it and if the fix is not too ugly, I may accept such a patch as the
upstream maintainer.  So your "you let us know we'll try" does not
reflect reality at all.  The major platforms luckily have such
motivated somebody almost always available for them.  Niche ones,
perhaps not.

> ..., but nowhere do I see "how to know if your platform is
> supported" or even "here are platforms we have heard Git works OK on".

Yup.  Patches, with commitments to keep such lists up-to-date, are
very much welcome.

Thanks.

* Re: Defining a platform support policy
  2024-01-23  0:11             ` rsbecker
@ 2024-01-23  0:57               ` Junio C Hamano
  0 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2024-01-23  0:57 UTC (permalink / raw)
  To: rsbecker
  Cc: 'Emily Shaffer', 'Taylor Blau',
	'Dragan Simic', 'Git List',
	'Johannes Schindelin'

<rsbecker@nexbridge.com> writes:

> I think we might want to add some considerations to the above list
> that go beyond what other projects use, OpenSSL as an example:

[jc: if you want to have a meaningful discussion on this list,
please stick to a reasonable line width.  I'll rewrap your lines
below].

> * Can support for exotic platforms be delegated to some
> "community" support concept. In NonStop's case, I currently do 99%
> of the verification that each release runs properly. If I am able
> to provide a fix, I will. We have been fortunate that most
> problems/solutions have been of general interest and impact, with
> my platforms being more of a "Canary in the Coalmine" situation
> where we just encounter it first because of edge conditions, but
> other platforms may be impacted. The problem here is time of how
> long a designated community support person(s) can keep supporting
> git and what happens when they (me) retire or get hit by a
> bus. Like all good NonStop people, I have a backup, so git does
> not need to worry about me specifically.

There are platform packagers that deliver binary releases, and we do
not have to worry about them.  We _could_ have a tier of minority
platforms that we can treat pretty much the same as these packagers,
i.e. the "community supported version of Git for platform X" might
consist of many patches on top of what I release, and some patches
that are of acceptable quality may be given upstream, but there may
be hacks that are too ugly to live in my tree, which the
"community edition" may have to keep outside the upstream.  Even in
such a case, if they try to engage with this list, they will often
find somebody willing to help them polish such "ugly hacks" into
acceptable patches.

> * What is the broad impact of dropping support for a platform that
> has a significant investment in git, where loss of support could
> have societal impact. There are platforms with 100 million to 1000
> million lines of code managed by git today. This type of
> investment-related impact is specific to git compared to other
> Open-Source products. Leaving debit or credit card authorizers
> without a supported git would be, let's say, "bad".

Let's say we may want to start requiring a new enough version of a
library that is not yet ported to a minority platform.  Do we deeply
care?  It depends, but "investment-related impact" is an unlikely
cause for us to personally care.  But those $CORPS who will feel the
"investment-related impact" are welcome to hire quality developers
and to these employed developers, the "impact" might become an issue
they care more deeply about.

> * Could stakeholders be consulted before changing support levels?
> Yes, I get that commercial fee-based products hit this more than
> Open-Source. Looking at other products in the Open-Source space,
> there are fee-based support models that could be developed for
> long-term support (beyond the obvious LTS-type considerations -
> see OpenSSL's model for reference).

The stakeholders are already consulted, aren't they?  Every time we
make noises like "let's raise the minimum version of Perl we
require", we discuss it here.  They have to monitor this list, of
course, and if they lack people to do so, then they may have to
invest in it.

> A related question is: "If there is a bug detected in git, what
> version is the oldest supported git version to which a fix can be
> made?"

This is a good question.  The latest security-induced maintenance
release was Git 2.40.1 done in April 2023 and the fixes go back to
the v2.30 track, and Git 2.30.0 was done at the end of 2020.  This
window was unusually generous from our usual standard, IIRC, so I
would say roughly speaking 2 years is the maximum.

> ... But this position puts pressure on the team to maintain
> platform compatibility for indefinite periods.

Sure, but I think we should just say something like "18 months to 24
months"; if you want a backport of a fix to an older track, you can
do so yourself.

The story is probably the same if a minority platform lacks recent
enough dependencies (e.g. libraries) and stops linking correctly.
If you care deeply enough, you should be ready to invest
yourself in porting such dependencies.  We can help, but the primary
driving force for porting issues ought to be folks with stake in the
platform.  We as the project won't bend over backwards and keep
everybody else to an ancient version of the dependency if some
platforms cannot catch up with the time.



* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-17 21:30         ` Dragan Simic
@ 2024-01-24  4:15           ` Elijah Newren
  2024-01-24  5:14             ` Dragan Simic
  0 siblings, 1 reply; 38+ messages in thread
From: Elijah Newren @ 2024-01-24  4:15 UTC (permalink / raw)
  To: Dragan Simic; +Cc: Taylor Blau, git

Hi Dragan,

On Wed, Jan 17, 2024 at 1:30 PM Dragan Simic <dsimic@manjaro.org> wrote:
>
> On 2024-01-11 17:57, Elijah Newren wrote:
> > Hi Dragan,
>
> I apologize for my delayed response.

No worries; I'm often hit or miss on my responses these days as well.

> > On Wed, Jan 10, 2024 at 9:39 PM Dragan Simic <dsimic@manjaro.org>
> > wrote:
> >>
> >> On 2024-01-11 01:33, Elijah Newren wrote:
> >> > On Wed, Jan 10, 2024 at 1:57 PM Dragan Simic <dsimic@manjaro.org>
> >> > wrote:
> >> >>
> >> >> Thus, Git should probably follow the same approach of not converting
> >> >> the
> >> >> already existing code
> >> >
> >> > I disagree with this.  I saw significant performance improvements
> >> > through converting some existing Git code to Rust.  Granted, it was
> >> > only a small amount of code, but the performance benefits I saw
> >> > suggested we'd see more by also doing similar conversions elsewhere.
> >> > (Note that I kept the old C code and then conditionally compiled
> >> > either Rust or C versions of what I was converting.)
> >>
> >> Well, it's also possible that improving the old C code could also
> >> result
> >> in some performance improvements.  Thus, quite frankly, I don't see
> >> that
> >> as a valid argument to rewrite some existing C code in Rust.
> >
> > Yes, and I've made many performance improvements in the C code in git.
> > Sometimes I make some of the code 5% or 20% faster.  Sometimes 1-3
> > orders of magnitude faster.  Once over 60 orders of magnitude
> > faster.[1]  Look around in git's history; I've done a fair amount of
> > performance stuff.
>
> Thank you very much for your work!
>
> > And I'm specifically arguing that I feel limited in some of the
> > performance work that can be done by remaining in C.  Part of my
> > reason for interest in Rust is exactly because I think it can help us
> > improve performance in ways that are far more difficult to achieve in
> > C.  And this isn't just guesswork, I've done some trials with it.
> > Further, I even took the time to document some of these reasons
> > elsewhere in this thread[2].  Arguing that some performance
> > improvements can be done in C is thus entirely missing the point.
> >
> > If you want to dismiss the performance angle of argument for Rust, you
> > should take the time to address the actual reasons raised for why it
> > could make it easier to improve performance relative to continuing in
> > C.
> >
> > Also, as a heads up since you seem to be relatively new to the list:
> > your position will probably carry more weight with others if you take
> > the time to understand, acknowledge, and/or address counterpoints of
> > the other party.  It is certainly fine to simply express some concerns
> > without doing so (Randall and Patrick did a good job of this in this
> > thread), but when you simply assert that the benefits others point out
> > simply don't exist (e.g. your "Quite frankly, that would _only_
> > complicate things and cause fragmentation." (emphasis added) from your
> > first email in this thread[3], and which this latest email of yours
> > somewhat looks like as well), others may well start applying a
> > discount to any positions you state.  Granted, it's totally up to you,
> > but I'm just giving a hint about how I think you might be able to be
> > more persuasive.
>
> I totally agree with your suggestions, and I'm thankful for the time it
> took you to write it all down.  I'll take your advice

Great!

> and refrain myself
> from expressing my opinions in this thread.

...but that's not what my advice was.  My advice was that you'd be
more persuasive if you expressed your opinions differently.  Some
possible examples:

  * Stating that you are worried about the codebase becoming more
complicated or more fragmented (without dismissing the points Taylor
raised)
  * Arguing that you believe various points others raised aren't as
much of an advantage as they perceive, or even potentially aren't even
an advantage at all, not by mere assertion but by providing additional
details on the topic (statistics, anecdotes, war stories,
counter-examples, old commit messages, etc.) that back up your point
  * Stating that you don't understand why others think that advantages
they state are as significant as they pose, and asking for clarification.

I think there's potentially some good points behind your positions,
and I don't want to discourage them.  I want to encourage lively,
friendly debate so that we can have the best information possible when
decisions are made.

* Re: [DISCUSS] Introducing Rust into the Git project
  2024-01-24  4:15           ` Elijah Newren
@ 2024-01-24  5:14             ` Dragan Simic
  0 siblings, 0 replies; 38+ messages in thread
From: Dragan Simic @ 2024-01-24  5:14 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Taylor Blau, git

Hello Elijah,

On 2024-01-24 05:15, Elijah Newren wrote:
> On Wed, Jan 17, 2024 at 1:30 PM Dragan Simic <dsimic@manjaro.org> 
> wrote:
>> On 2024-01-11 17:57, Elijah Newren wrote:
>>> On Wed, Jan 10, 2024 at 9:39 PM Dragan Simic <dsimic@manjaro.org> 
>>> wrote:
>> and refrain myself
>> from expressing my opinions in this thread.
> 
> ...but that's not what my advice was.  My advice was that you'd be
> more persuasive if you expressed your opinions differently.  Some
> possible examples:
> 
>   * Stating that you are worried about the codebase becoming more
> complicated or more fragmented (without dismissing the points Taylor
> raised)
>   * Arguing that you believe various points others raised aren't as
> much of an advantage as they perceive, or even potentially aren't even
> an advantage at all, not by mere assertion but by providing additional
> details on the topic (statistics, anecdotes, war stories,
> counter-examples, old commit messages, etc.) that back up your point
>   * Stating that you don't understand why others think that advantages
> they state are as significant as they pose, and asking for clarification.
> 
> I think there's potentially some good points behind your positions,
> and I don't want to discourage them.  I want to encourage lively,
> friendly debate so that we can have the best information possible when
> decisions are made.

Oh, I once again totally agree!  I really love the way you expressed it,
which I'm once again thankful for.

I always support making improvements and major changes, introducing
new technologies, augmenting or even replacing old technologies, etc.
In the end, that's how progress is made, but such major changes also
need to be performed very carefully, in a controlled way that provides
a backup plan, and based on solid facts and past experiences.

In this specific case, please be aware that my health wasn't in great
shape, because I was (and still am) recovering from some nasty flu,
which has also effectively diminished my mental capacities.  That's
the primary reason why I provided only terse comments, without backing
them up with more specific details.  That's not the way I usually
operate, and I apologize for that.

* Re: Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project)
  2024-01-22 23:17           ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Emily Shaffer
  2024-01-23  0:11             ` rsbecker
  2024-01-23  0:31             ` Junio C Hamano
@ 2024-01-24  7:54             ` Elijah Newren
  2 siblings, 0 replies; 38+ messages in thread
From: Elijah Newren @ 2024-01-24  7:54 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Randall S. Becker, Taylor Blau, Junio C Hamano, Dragan Simic,
	Git List, Johannes Schindelin

On Mon, Jan 22, 2024 at 3:18 PM Emily Shaffer <nasamuffin@google.com> wrote:
>
> On Wed, Jan 10, 2024 at 3:52 PM <rsbecker@nexbridge.com> wrote:
> >
[...]
> > Unfortunately, none of the compiler frontends listed previously can be built for NonStop. These appear to all require gcc either directly or transitively, which cannot be ported to NonStop. I do not expect this to change any time soon - and is outside of my control anyway. An attempt was made to port Rust but it did not succeed primarily because of that dependency. Similarly, Golang is also not portable to NonStop because of architecture assumptions made by the Go team that cannot be satisfied on NonStop at this time. If some of the memory/pointer issues are the primary concern, c11 might be something acceptable with smart pointers. C17 will eventually be deployable, but is not available on most currently supported OS versions on the platform.
>
> I hope y'all don't mind me hijacking this part of the thread ;)

Of course not.  :-)

[...]
> Does it make sense for us to formalize a support policy?

Some hurdles that may need to be overcome if we want to do so:

* In many of the discussions I remember, a significant challenge was
that we don't even know which platforms Git is used on.  That's why
we sometimes agree to weather balloon patches
that attempt to use some new option, in a way that is really easy to
remove...and if no one complains for a long enough time, then we
presume all platforms support it and start adding hard dependencies on
it.
* We are often happy to try to fix issues on even obscure platforms if
we get a detailed enough description showing exactly what the problem
is
* However, when reports don't come with a complete diagnosis, we often
will tell people who are reporting issues that we don't have access to
such a platform and someone else will have to dig further.  This
happens more often for exotic platforms (AIX, NonStop, etc.) but also
happens with mainstream platforms (Mac, Windows, and I think I've even
seen it happen with Linux).
* Even when folks report that they can't help the reporter, the work
doesn't always go back to the reporter, because someone else on the
list may respond and dig in; that happens more for mainstream
platforms but can happen with the exotic platforms as well.
* How exactly can we even enforce continued platform support?  What's
the actual mechanism?  I think the only route available to us is
people who care and try to provide reports, testing, patches, new
tools (e.g. our CI runs and gitgitgadget providing reports across
several of the more common platforms, with lots of work to investigate
the occasional weird build issues and flakes so it continues to be
fairly reliable), but what happens if some of those developers start
caring less...and yet we still have an encoded policy that their
platforms are supported?
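
To make the "weather balloon" idea from the first bullet concrete, here
is a minimal sketch.  It assumes the feature being probed is C99
designated initializers; the struct and function names are
hypothetical and not taken from Git's tree.  The only point is that the
new construct lives in one small spot that is trivial to revert if some
platform's compiler complains.

```c
#include <string.h>

/*
 * Hypothetical weather balloon: use a C99 designated initializer in
 * exactly one place.  If a platform's compiler rejects it, reverting
 * means touching only this one initializer.
 */
struct balloon_opts {
	int verbose;
	const char *label;
	int depth;
};

/* The single spot that depends on the new language feature. */
static const struct balloon_opts balloon_defaults = {
	.label = "weather-balloon",
	.depth = 1,
	/* .verbose is omitted on purpose; C99 zero-initializes it. */
};

/* Copy the defaults so callers never touch the initializer itself. */
void balloon_opts_init(struct balloon_opts *opts)
{
	memcpy(opts, &balloon_defaults, sizeof(*opts));
}
```

If nobody reports a build failure for long enough, the same construct
can then be used freely elsewhere; if somebody does, the revert is a
few lines.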

I generally think we value portability fairly highly, but it clearly
has bounds...fuzzy and even unknown-by-us bounds.  I don't know how to
translate that into a policy, and I'm curious if trying to apply nice
sharp boundaries risks unreasonable expectations on either or both
sides.

Also...

[...]

> I see a few axes we can play with:
>  * which architectures/kernels/OS (do we care about more than the
> usual suspects of Linux/Mac/Windows // x86/amd/arm //
> POSIX-compliant?)
>  * age of architectures/kernels (do we care to offer full support for
> a 10 or 15 year old OS?)
>  * new feature compatibility guarantees vs. core
> functionality/security fix guarantees (which do we really define
> "support" as?)
>  * test provisioning (do we require a VM we can run CI on, or is a
> report generated from a nightly build and mailed to the list OK?)
>  * test/breakage timing (should the above tests run on every commit to
> 'next'? every merge to 'master'? every RC?)
>  * who provides the support (is it the patch author's responsibility
> to fix late-breaking platform support bugs? is it the reporter's
> responsibility? and especially, how does this interplay with test
> provisioning and frequency above?)

That's a great list of questions, but to me it does seem to lean
towards "whatever is supported is supported equally".  I don't know if
that was intended, or just the way I read it.  But if it was intended,
I'd say that while equal support may be an ideal, I suspect it is
pragmatically just too expensive, as evidenced by the many optional
features we already have, many (all?) of which have roots in platform
support or the lack thereof:

  * gitk (NO_TCLTK)
  * dumb http(s) transport (NO_EXPAT)
  * smart http(s) transport (NO_CURL)
  * perl regexes (USE_LIBPCRE)
  * translations (NO_GETTEXT)
  * charset conversions (NO_ICONV)
  * p4 support (NO_PYTHON, affected other scripts in the past too)
  * svn, send-email, gitweb support (NO_PERL, affected other stuff in
the past too)
  * fsmonitor (FSMONITOR_DAEMON_BACKEND)

Also, this list isn't just an "exotic" vs. "mainstream" platform
thing, since even Linux is "second class" in the final category[1].
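
Several of these knobs come down to the same pattern in C sources: the
optional code path is compiled out and a cheap fallback takes its
place.  A minimal sketch under that assumption, modeled loosely on how
a NO_GETTEXT build leaves messages untranslated; translate() and its
one-entry hard-coded "catalog" are hypothetical stand-ins, not Git's
actual code:

```c
#include <string.h>

#ifdef NO_GETTEXT
/* Feature compiled out: every message passes through unchanged. */
static const char *translate(const char *msgid)
{
	return msgid;
}
#else
/*
 * Feature compiled in.  A real build would consult a message catalog;
 * this hypothetical one-entry table stands in for that lookup.
 */
static const char *translate(const char *msgid)
{
	if (!strcmp(msgid, "hello"))
		return "bonjour";
	return msgid; /* unknown messages fall back to the original */
}
#endif
```

Callers use translate() the same way in both builds, which is what
makes the feature optional rather than a hard dependency.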

So, I think if we create a "supported platforms" policy, it should
address optional features as well (though perhaps as simply as "the
support policy only applies to non-optional parts of Git").

[1] https://lore.kernel.org/git/pull.1352.v5.git.git.1670882286.gitgitgadget@gmail.com/

> If we had clearer answers to these questions, it'd be much simpler to
> determine whether experimentation with Rust is possible or useful.

How so?

Thread overview: 38+ messages
2024-01-10 20:16 [DISCUSS] Introducing Rust into the Git project Taylor Blau
2024-01-10 21:57 ` Dragan Simic
2024-01-10 22:11   ` Junio C Hamano
2024-01-10 22:15     ` rsbecker
2024-01-10 22:26       ` Taylor Blau
2024-01-10 23:52         ` rsbecker
2024-01-11  0:59           ` Elijah Newren
2024-01-11  1:44             ` rsbecker
2024-01-11  2:21               ` Elijah Newren
2024-01-11  2:57                 ` rsbecker
2024-01-11  5:06                   ` Elijah Newren
2024-01-11  6:56                     ` Patrick Steinhardt
2024-01-11 13:07                     ` rsbecker
2024-01-11  2:55           ` brian m. carlson
2024-01-11  3:24             ` rsbecker
2024-01-11 20:07               ` Trevor Gross
2024-01-11 21:28                 ` rsbecker
2024-01-11 23:23                   ` Trevor Gross
2024-01-22 23:17           ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Emily Shaffer
2024-01-23  0:11             ` rsbecker
2024-01-23  0:57               ` Defining a platform support policy Junio C Hamano
2024-01-23  0:31             ` Junio C Hamano
2024-01-24  7:54             ` Defining a platform support policy (Was: [DISCUSS] Introducing Rust into the Git project) Elijah Newren
2024-01-10 23:40     ` [DISCUSS] Introducing Rust into the Git project brian m. carlson
2024-01-11  0:33   ` Elijah Newren
2024-01-11  5:39     ` Dragan Simic
2024-01-11 16:57       ` Elijah Newren
2024-01-17 21:30         ` Dragan Simic
2024-01-24  4:15           ` Elijah Newren
2024-01-24  5:14             ` Dragan Simic
2024-01-11  0:12 ` Elijah Newren
2024-01-11  5:33   ` Dragan Simic
2024-01-11  1:56 ` brian m. carlson
2024-01-11 11:45 ` Sam James
2024-01-11 23:48   ` brian m. carlson
2024-01-12  8:24     ` Sam James
2024-01-12 14:46       ` Antoni Boucher
2024-01-11 23:53 ` Trevor Gross
