From: Christian Couder <christian.couder@gmail.com>
To: Atharva Raykar <raykar.ath@gmail.com>
Cc: git <git@vger.kernel.org>,
	Shourya Shukla <shouryashukla.oo@gmail.com>,
	Shourya Shukla <periperidip@gmail.com>
Subject: Re: [GSoC][Draft Proposal] Finish converting git submodule to builtin
Date: Mon, 5 Apr 2021 18:02:48 +0200
Message-ID: <CAP8UFD2eXtW4e-Pm5N2GyXZXPpYaZBci7bs=yHGTaTaD=ZaKag@mail.gmail.com>
In-Reply-To: <E6E88000-9C18-4035-9A14-8B406617351A@gmail.com>

Hi,

On Sat, Apr 3, 2021 at 4:08 PM Atharva Raykar <raykar.ath@gmail.com> wrote:
>
> Hi all,
>
> Below is my draft of my GSoC proposal. I have noticed that Chinmoy has already
> submitted a proposal for the same idea before me, so would that be considered
> "taken"? (I don't think I can submit another proposal for the other idea either,
> because someone has already sent one for that as well)

Unfortunately, it looks like we will mentor only 2 students on the 2
projects listed on https://git.github.io/SoC-2021-Ideas/, so we might
have to make tough choices.

> Since I have already put my effort into this for a while, I thought I might as
> well send it, but I'll accept whatever the mentors say about the eligibility of
> this proposal.

Thanks for sending it anyway!

> Here is a prettier markdown version:
> https://gist.github.com/tfidfwastaken/0c6ca9ef2a452f110a416351541e0f19
>
>
> --8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<--
>
>                           ___________________
>
>                            GSOC GIT PROPOSAL
>
>                              Atharva Raykar
>                           ___________________
>
>
> Table of Contents
> _________________
>
> 1. Personal Details
> 2. Background
> 3. Me and Git
> .. 1. Current knowledge of git
> 4. The Project: Finish converting `git submodule' to builtin
> 5. Prior work
> 6. General implementation strategy
> 7. Timeline (using the format dd/mm)
> 8. Beyond GSoC
> 9. Blogging
> 10. Final Remarks: A little more about me
>
>
> 1 Personal Details
> ==================
>
>   Name        : Atharva Raykar
>   Major       : Computer Science and Engineering
>   Email       : raykar.ath@gmail.com
>   IRC nick    : atharvaraykar on #git and #git-devel
>   Address     : RB 103, Purva Riviera, Marathahalli, Bangalore
>   Postal Code : 560037
>   Time Zone   : IST (UTC+5:30)
>   GitHub      : http://github.com/tfidfwastaken
>
>
> 2 Background
> ============
>
>   I am Atharva Raykar, currently in my third year of studying Computer
>   Science and Engineering at PES University, Bangalore. I have always
>   enjoyed programming since a young age, but my deep appreciation for
>   good program design and creating the right abstractions came during my
>   exploration of the various rabbitholes of knowledge originating from
>   communities around the internet. I have personally enjoyed learning
>   about Functional Programming, Database Architecture and Operating
>   Systems, and my interests keep expanding as I explore more in this
>   field.
>
>   I owe my appreciation of this rich field to these communities, and I
>   always wanted to give back. With that goal, I restarted the [PES Open
>   Source] community in our campus, with the goal of creating spaces
>   where members could share knowledge, much in the same spirit as the
>   communities that kickstarted my journey in Computer Science. I learnt
>   a lot about collaborating in the open, maintainership, and reviewing
>   code. While I have made many small contributions to projects in the
>   past, I am hoping GSoC will help me make the leap to a larger and more
>   substantial contribution to one of my favourite projects that made it
>   all possible in my journey with Open Source.
>
>
> [PES Open Source] <https://pesos.github.io>
>
>
> 3 Me and Git
> ============
>
>   Here are the various forms of contributions that I have made to Git:
>
>   - [Microproject] userdiff: add support for Scheme
>     Status: In progress, patch v2 pending
>     List:
>     <https://public-inbox.org/git/20210327173938.59391-1-raykar.ath@gmail.com/>
>
>   - [Git Education] Conducted a workshop with attendance of hundreds of
>     students new to git, and increased the prevalence of git's usage
>     in my campus.
>     Photos: <https://photos.app.goo.gl/T7CPk1zkHdK7mx6v7> and
>     <https://photos.app.goo.gl/bzTgdHMttxDen6z9A>
>
>   I intend to continue helping people out on the mailing list and IRC
>   and tending to patches wherever possible in the meantime.

Nice!

> 3.1 Current knowledge of git

s/git/Git/

> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>   I use git almost daily in some form, and I am fairly comfortable with
>   it. I have already read and understood the chapters from the Git
>   Book about submodules along with the one on objects, references,
>   packfiles and the refspec.
>
>
> 4 The Project: Finish converting `git submodule' to builtin
> ===========================================================
>
>   Git has historically had many components implemented in the form of
>   shell scripts. This was less than ideal for several reasons:
>   - Portability: Non-POSIX systems like Windows don't play nice with
>     shell script commands like grep, cd and printf, to name a few, and
>     these commands have to be reimplemented for the system. There are
>     also POSIX to Windows path conversion issues.
>   - No direct access to plumbing: Shell commands do not have direct
>     access to the low level git API, and a separate shell is spawned
>     just to carry out their operations.
>   - Performance: Shell scripts tend to create a lot of child processes
>     which slows down the functioning of these commands, especially with
>     large repositories.
>   Over the years, many GSoC students have converted the shell versions
>   of these commands to C. Git `submodule' is the last of these to be
>   converted.
>
>
> 5 Prior work
> ============
>
>   I will be taking advantage of the knowledge that was gained in the
>   process of converting the previous scripts and avoiding all the
>   gotchas that may be present in the process. There may be a bunch of
>   useful helper functions in the previous patches that can be reused as
>   well (more investigation needed to determine what exactly is
>   reusable).
>
>   Currently the only other commands left to be completed for `submodule'
>   are `add' and `update'. Work for `add' has already been started by a
>   previous GSoCer, Shourya Shukla, and needs to be picked up from there.

Yeah, 'update' uses `git submodule--helper update-clone`, `git
submodule--helper update-module-mode` and other `git
submodule--helper` sub-commands, but is not fully ported.

>   Reference:
>   <https://github.com/gitgitgadget/git/issues/541#issuecomment-769245064>
>
>   I'll have these as my references when I am working on the project:
>   His blog about his progress:
>   <https://shouryashukla.blogspot.com/2020/08/the-final-report.html>
>   (more has been implemented since)
>   Shourya's latest patch for `submodule add':
>   <https://lore.kernel.org/git/20201007074538.25891-1-shouryashukla.oo@gmail.com/>
>
>   For the most part, the implementation looks fairly complete, but there
>   seems to be a segfault occurring, along with a few changes suggested
>   by the reviewers. It will be helpful to contact Shourya to fully
>   understand what needs to be done.
>
>   Prathamesh's previous conversion work:
>   <https://lore.kernel.org/git/20170724203454.13947-1-pc44800@gmail.com/#t>

It would be nice if, after finishing 'add' and 'update', you could
also completely get rid of git-submodule.sh and instead use `git
submodule--helper` as `git submodule`.

> 6 General implementation strategy
> =================================
>
>   The way to port the shell to C code for `submodule' will largely
>   remain the same. There already exists the builtin
>   `submodule--helper.c' which contains most of the previous commands'
>   ports. All that the shell script for `git-submodule.sh' is doing for
>   the previously completed ports is parsing the flags and then calling
>   the helper, which does all the business logic.
>
>   So I will be moving out all the business logic that the shell script
>   is performing to `submodule--helper.c'. Any reusable functionality
>   that is introduced during the port will be added to `submodule.c' in
>   the top level.

Ok.

>   For example: The general strategy for converting `cmd_update()' would
>   be to have a call to `submodule--helper' in the shell script to a
>   function which would resemble something like `module_update()' which
>   would perform the work being done by the shell script past the flags
>   being parsed and make the necessary calls to `update_clone()', and the
>   git interface in C for performing the merging, checkout and rebase
>   where necessary.

It would be nice if you could go into more details about what
`module_update()' would look like. Do you see steps that you could
take to not have to do everything related to `module_update()' in only
one patch?
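
To illustrate the kind of small step I mean, here is a rough,
self-contained sketch of one piece that could go in its own patch:
mapping the update mode string to an enum before any real work is
done. All names in it (`update_mode', `parse_update_mode()',
`update_step_name()') are hypothetical and only for illustration, not
existing Git code, and custom `!command' modes are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum for illustration; not taken from Git's sources. */
enum update_mode {
	UPDATE_CHECKOUT,
	UPDATE_REBASE,
	UPDATE_MERGE,
	UPDATE_NONE,
	UPDATE_INVALID
};

/* Map a mode string such as "rebase" to the enum above. */
static enum update_mode parse_update_mode(const char *value)
{
	assert(value != NULL);
	if (!strcmp(value, "checkout"))
		return UPDATE_CHECKOUT;
	if (!strcmp(value, "rebase"))
		return UPDATE_REBASE;
	if (!strcmp(value, "merge"))
		return UPDATE_MERGE;
	if (!strcmp(value, "none"))
		return UPDATE_NONE;
	return UPDATE_INVALID;
}

/*
 * Name the step that a later patch would implement for each mode.
 * Returns NULL when there is nothing to do or the mode is invalid.
 */
static const char *update_step_name(enum update_mode mode)
{
	switch (mode) {
	case UPDATE_CHECKOUT:
		return "checkout";
	case UPDATE_REBASE:
		return "rebase";
	case UPDATE_MERGE:
		return "merge";
	default:
		return NULL;
	}
}
```

Each mode could then be ported and tested in a separate follow-up
patch, which is usually easier to review than a single large one.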

>   After this process, the builtin is added to the commands array in
>   `submodule--helper.c'. And since these two functions are the last bit

It's not very clear here that by "these two functions" you reference
the 'add' and 'update' sub-commands.

>   of functionality left to convert in submodules, an extended goal can
>   be to get rid of the shell script altogether, and make the helper into
>   the actual builtin [1].

Nice that you are talking about this!
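
For readers not familiar with the "commands array" idea, the pattern
can be sketched roughly as below. This is a simplified illustration
under my own assumptions: the struct layout, the stub bodies and
`run_subcommand()' are hypothetical and are not copied from
`submodule--helper.c'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A sub-command handler receives the arguments after its own name. */
typedef int (*subcmd_fn)(int argc, const char **argv);

struct subcmd {
	const char *name;
	subcmd_fn fn;
};

/* Stub handlers; a real port would hold the business logic here. */
static int module_add(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 0;
}

static int module_update(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 0;
}

/* Porting a sub-command ends with adding one entry to this table. */
static const struct subcmd commands[] = {
	{ "add", module_add },
	{ "update", module_update },
};

/* Look up argv[0] in the table and run it; -1 means "unknown". */
static int run_subcommand(int argc, const char **argv)
{
	size_t i;

	assert(argv != NULL);
	if (argc < 1)
		return -1;
	for (i = 0; i < sizeof(commands) / sizeof(commands[0]); i++)
		if (!strcmp(argv[0], commands[i].name))
			return commands[i].fn(argc - 1, argv + 1);
	return -1;
}
```

With such a table in place, turning the helper into the actual builtin
is mostly a matter of moving the option parsing that the shell script
still does into C.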

>   [1]
>   <https://lore.kernel.org/git/nycvar.QRO.7.76.6.2011191327320.56@tvgsbejvaqbjf.bet/>
>
>
> 7 Timeline (using the format dd/mm)
> ===================================
>
>   Periods of limited availability (read: hectic chaos):
>   - From 13/04 to 20/04 I will be having project evaluations and lab
>     assessments for five of my courses.
>   - From 20/04 to 01/05 I have my in-semester exams.
>   - For a period of two weeks in the range of 08/05 to 29/05 I will be
>     having my end-semester exams.
>   My commitment: I will still have time during my finals to help people
>   out on the mailing list, get acquainted with the community and its
>   processes, and even review patches if I can. This is because we get
>   holidays between each exam, and my grades are good enough that I
>   can prioritise git over my studies ;-)

s/git/Git/

>   And on the safe side, I will still engage with the community from now
>   till 07/06 so that the community bonding period is not compromised in
>   any way.
>
>   Periods of abundant availability: After 29/05 all the way to the first
>   week of August, I will be having my summer break, so I can dedicate
>   myself to git full-time :-)
>
>   I would have also finished all my core courses, so even after that, I
>   will have enough time to give back to git past my GSoC period.

Ok.

Also: s/git/Git/

>   Phase 1: 07/06 to 14/06 -- Investigate and devise a strategy to port
>   the submodule functions
>   - This phase will be more diagrams in my notebook than code in my
>     editor -- I will go through all the methods used to port the other
>     submodule functions and see how to do the same for what is left.
>   - I will find the C equivalents of all the shell invocations in
>     `git-submodule.sh', and see what invocations have /no/ equivalent
>     and need to be created as helpers in C (Eg: What is the equivalent
>     to the `ensure-core-worktree' invocation in C?). For all the helpers
>     and new functionality that I do introduce, I will need to create the
>     testing strategy for the same.
>   - I will go through all the work done by Shourya in his patch, and try
>     to understand it properly. I will also see the mistakes that were
>     caught in all the reviews for previous submodule conversion patches
>     and try to learn from them before I jump to the code.
>   - Deliverable: I will create a checklist for all the work that needs
>     to be done with as much detail as I can with the help of inputs from
>     my mentor and all the knowledge I have gained in the process.
>
>   Phase 2: 14/06 to 28/06 -- Convert `add' to builtin in C
>   - I will work on completing `git submodule add'. One strategy would be
>     to reimplement the whole thing using what was learnt in Shourya's
>     attempt, but it is probably wiser to just take his patch and modify
>     it. I would know what to do by the time I reach this phase.
>   - I will also add tests for this functionality. I will also document
>     my changes when required. These would be unit tests for the helpers
>     introduced, and integration of `add' with the other commands.
>   - Deliverable: Completely port `add' to C!
>
>   Bonus Phase: If I am ahead of time -- Remove the need for a
>   `submodule--helper', and make it a proper C builtin.
>   - Once all the submodule functionality is ported, the shell script is
>     not really doing much more than parsing the arguments and passing it
>     to the helper. We won't need this anymore if it is implemented.

Ok, great!
