From: ZheNing Hu <adlternative@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Git List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Christian Couder" <christian.couder@gmail.com>,
	"Hariom verma" <hariom18599@gmail.com>,
	"Shourya Shukla" <periperidip@gmail.com>,
	"Оля Тележная" <olyatelezhnaya@gmail.com>
Subject: Re: GSoC Git Proposal Draft - ZheNing Hu
Date: Thu, 8 Apr 2021 21:29:38 +0800
Message-ID: <CAOLTT8QSxgMBLVk2dqt2b863z0oxYTcczF5FcPFtiKQh4p_j9w@mail.gmail.com>
In-Reply-To: <YG4H3wXI8pZT+zDI@coredump.intra.peff.net>

On Thu, Apr 8, 2021 at 3:28 AM, Jeff King <peff@peff.net> wrote:
>
> On Sat, Apr 03, 2021 at 10:27:39PM +0800, ZheNing Hu wrote:
>
> > >   - figure out which data will be needed for each item based on the
> > >     parsed format, and then do the minimum amount of work to get that
> > >     data (using "oid_object_info_extended()" helps here, because it
> > >     likewise tries to do as little work as possible to satisfy the
> > >     request, but there are many elements that it doesn't know about)
> > >
> >
> > I have indeed noticed that `oid_object_info_extended()`
> > can get information about the object which we actually want.
> > In `cat-file.c`, It has been used in `batch_object_write()`, and
> > `expanding_atom()` specify what data we need.
> > In `ref-filter.c`, It has been used in `get_object()`.
> > I am not sure what you mean about "many elements that it
> > doesn't know about", For the time being, `cat-file` can get 5
> > kind of objects info it need.
>
> I think there are things one might want to format that
> oid_object_info_extended() does not know about. For example, if you are
> asking about %(authorname), it can't provide that. But we want to do as
> little work as possible to satisfy the request. So for example, with the
> format "%(objectsize)", we'd prefer _not_ to load the contents of each
> object, and just ask oid_object_info_extended() for the size. But if we
> are asked for "%(authorname)", we know we'll have to read and parse the
> object contents.
>

OK, I understand now: `%(authorname)` has to grab its data from the object
content, so the content must be parsed. If we want cat-file to learn
`%(authorname)`, it will take extra work to extract that from the object.

> So this notion of "figure out the least amount of work" will have to be
> part of the format code (and ref-filter and the pretty.c formatters do
> make an attempt at this; I'm saying that a universal formatter will want
> to keep this behavior).
>

You're right. %(tree), %(parent), ... rely on commit object info, and
%(tagger), %(taggername), ... rely on tag object info. But for something
like %(objectsize) or %(objectname), we do not need to parse the content
of the objects. In future work we should keep avoiding the parsing of
info that the format does not depend on.
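
To make the idea concrete, here is a minimal sketch in Python (not git's
C code) of deciding up front whether a format needs the object contents
at all. The `ATOM_COST` table and the function names are hypothetical and
only cover the atoms mentioned in this thread.

```python
import re

# Hypothetical table, for illustration only (not git's real atom tables).
# "info":    answerable from object header information alone.
# "content": requires reading and parsing the object body.
ATOM_COST = {
    "objectname": "info",
    "objecttype": "info",
    "objectsize": "info",
    "tree": "content",
    "parent": "content",
    "authorname": "content",
    "taggername": "content",
}

ATOM_RE = re.compile(r"%\(([^)]+)\)")


def atoms_in(fmt):
    """Return the atom names used in fmt, dropping any ':option' suffix."""
    return [m.group(1).split(":", 1)[0] for m in ATOM_RE.finditer(fmt)]


def needs_content(fmt):
    """True if any atom in fmt requires parsing the object body.

    Raises KeyError for an atom that is missing from ATOM_COST.
    """
    return any(ATOM_COST[name] == "content" for name in atoms_in(fmt))
```

With this, a format such as "%(objectsize)" would report that no content
is needed, while "%(objectname) %(authorname)" would report that the
object has to be read and parsed.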

> > Maybe you think that `cat-file` can learn some features in
> > `ref-filter` to extend the function of `cat-file --batch`?
> > E.g. %(objectname:short)? I think I may have a better
> > understanding of the topic of this mini-project now.
> > We may not want to port the logic of cat-file,but to learn some
> > design in `ref-filter`, right?
>
> Yes, I think the goal is for all of the commands that allow format
> specifiers to support the same set (at least where it makes sense;
> obviously you cannot ask for %(refname) in cat-file).
>

The future new API may need to reject atoms such as %(refname) when the
calling command cannot supply them.
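
One possible shape for that check, as a Python sketch rather than a
proposal for the real API: the caller states whether it has a ref, and
ref-only atoms are rejected otherwise. `REF_ONLY_ATOMS`, `check_atoms()`
and the `has_ref` parameter are hypothetical names, and the set is
deliberately incomplete.

```python
# Hypothetical, incomplete set of atoms that only make sense with a ref.
REF_ONLY_ATOMS = {"refname", "upstream", "symref"}


def check_atoms(atom_names, has_ref):
    """Raise ValueError if a ref-only atom is requested without a ref."""
    if has_ref:
        return
    rejected = [name for name in atom_names if name in REF_ONLY_ATOMS]
    if rejected:
        raise ValueError("atoms not available here: " + ", ".join(rejected))
```

A cat-file-like caller would pass has_ref=False and get an error for
%(refname), while a for-each-ref-like caller would pass has_ref=True.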

> And IMHO the best way to do that is to write a new universal formatting
> API that takes the best parts from all of the existing ones. It _could_
> also be done by choosing ref-filter as the best implementation, slowly
> teaching it formats the other commands know (which is what Olga had
> started with), and then cleaning up any performance deficiencies. But I
> think that last part would actually be easier when starting from scratch
> (e.g., I think it would help to actually produce an abstract syntax tree
> of the parsed format, and then walk that tree to fill in the values).
>
> -Peff

This is the unification of "%an" and "%author" that you mentioned last time.
I think Olga and Hariom may have done something similar: calling into
`ref-filter` results in slower speed.

You also said we could refactor toward an abstract syntax tree. This is
a good idea, but it may be a big project. In particular, it requires some
background in compiler principles, and we may also need to deal with
each different atom carefully.
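
As a rough picture of "parse first, then walk", here is a small Python
sketch. It only builds a flat list of nodes, which is the simplest case
of a tree; nested constructs would need nodes with children. All names
(`Literal`, `Atom`, `parse_format()`, `render()`) are hypothetical, and
`render()` ignores the atom option for brevity.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Literal:
    text: str


@dataclass
class Atom:
    name: str
    option: Optional[str] = None


# Matches %(name) or %(name:option).
TOKEN_RE = re.compile(r"%\(([^):]+)(?::([^)]*))?\)")


def parse_format(fmt):
    """Split fmt into a list of Literal and Atom nodes, in order."""
    nodes, pos = [], 0
    for m in TOKEN_RE.finditer(fmt):
        if m.start() > pos:
            nodes.append(Literal(fmt[pos:m.start()]))
        nodes.append(Atom(m.group(1), m.group(2)))
        pos = m.end()
    if pos < len(fmt):
        nodes.append(Literal(fmt[pos:]))
    return nodes


def render(nodes, values):
    """Walk the nodes, substituting values[atom.name] for each Atom."""
    out = []
    for node in nodes:
        if isinstance(node, Literal):
            out.append(node.text)
        else:
            out.append(str(values[node.name]))
    return "".join(out)
```

The parse happens once per format string, so the list of atoms (and
therefore the data to fetch) is known before any object is looked at;
render() then runs once per object.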

Thanks.
--
ZheNing Hu
