* Feature request: --format=json
@ 2017-04-08 16:07 Fred .Flintstone
  2017-04-08 16:26 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 14+ messages in thread
From: Fred .Flintstone @ 2017-04-08 16:07 UTC (permalink / raw)
  To: git

$ git log --format=json
[{
    "commit": "64eabf050e315a4c7a11e0c05ca163be7cf9075e",
    "tree": "b1e977800f40bbf6de906b1fe4f2de4b4b14f0fd",
    "author": "Tux <tux@example.com> 1490981516 +0200",
    "committer": "Tux <tux@example.com> 1490981516 +0200",
    "message": "This is a test commit",
    "long_message": "This explains in more details the commit"
}]

This would make it easy to parse the output.


* Re: Feature request: --format=json
  2017-04-08 16:07 Feature request: --format=json Fred .Flintstone
@ 2017-04-08 16:26 ` Ævar Arnfjörð Bjarmason
  2017-04-17  0:38   ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-08 16:26 UTC (permalink / raw)
  To: Fred .Flintstone; +Cc: Git Mailing List

On Sat, Apr 8, 2017 at 6:07 PM, Fred .Flintstone <eldmannen@gmail.com> wrote:
> $ git log --format=json
> [{
>     "commit": "64eabf050e315a4c7a11e0c05ca163be7cf9075e",
>     "tree": "b1e977800f40bbf6de906b1fe4f2de4b4b14f0fd",
>     "author": "Tux <tux@example.com> 1490981516 +0200",
>     "committer": "Tux <tux@example.com> 1490981516 +0200",
>     "message": "This is a test commit",
>     "long_message": "This explains in more details the commit"
> }]
>
> This would make it easy to parse the output.

The git-log command isn't plumbing that's meant for machines, but the
git-for-each-ref command is what you're most likely looking for.

It doesn't have JSON output, but you can make e.g. --format emit
something even more easily parsable, e.g. a version of what you have
with each field delimited by a custom delimiter, and then split on
that.
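
A minimal Python sketch of the delimiter idea, for illustration only. The
format string, the field names and the choice of the ASCII unit/record
separator bytes are assumptions, not something proposed in this thread; the
input is assumed to come from a command that accepts such a --format string
(for example `git log --format=...`), and a commit message that itself
contains one of these bytes would confuse it.

```python
import json

FIELD_SEP = "\x1f"   # ASCII unit separator, written as %x1f in the format
RECORD_SEP = "\x1e"  # ASCII record separator, written as %x1e in the format

# Hypothetical format string matching FIELDS below.
LOG_FORMAT = "%H%x1f%T%x1f%an <%ae>%x1f%s%x1f%b%x1e"
FIELDS = ("commit", "tree", "author", "message", "long_message")


def parse_delimited(output):
    """Split delimiter-separated output into one dict per record."""
    records = []
    for chunk in output.split(RECORD_SEP):
        # Drop the newline printed between records and any trailing
        # newline of the last field.
        chunk = chunk.strip("\n")
        if not chunk:
            continue
        records.append(dict(zip(FIELDS, chunk.split(FIELD_SEP))))
    return records


def to_json(output):
    """Serialize the parsed records as a JSON array."""
    return json.dumps(parse_delimited(output), indent=4)
```

Because the JSON encoder does the quoting, the caller never has to escape
quotes or newlines in commit messages by hand.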

It does have --perl, --tcl etc. options to make it easy to quote the
fields, however there's no logic to manage the state machine JSON
would need to omit trailing commas, whereas emitting output for
languages like Perl where trailing commas don't matter is much easier.
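
To make the trailing-comma point concrete, here is a small Python sketch of
the bookkeeping a streaming JSON emitter needs. It is an illustration, not
code from Git; the function name is made up. The only state is whether an
element has already been written, so that the separator goes before every
element except the first.

```python
import json


def json_array_chunks(items):
    """Yield the text of a JSON array piece by piece, without a trailing comma."""
    yield "["
    first = True
    for item in items:
        # The separator precedes every element but the first, so nothing
        # needs to be known about whether another element will follow.
        yield "\n  " if first else ",\n  "
        yield json.dumps(item)
        first = False
    yield "\n]\n"
```

Joining the chunks (or writing them out one at a time) gives valid JSON for
zero, one or many elements.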


* Re: Feature request: --format=json
  2017-04-08 16:26 ` Ævar Arnfjörð Bjarmason
@ 2017-04-17  0:38   ` Junio C Hamano
  2017-04-17 12:44     ` Fred .Flintstone
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2017-04-17  0:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Fred .Flintstone, Git Mailing List

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Sat, Apr 8, 2017 at 6:07 PM, Fred .Flintstone <eldmannen@gmail.com> wrote:
>> $ git log --format=json
>> [{
>>     "commit": "64eabf050e315a4c7a11e0c05ca163be7cf9075e",
>>     "tree": "b1e977800f40bbf6de906b1fe4f2de4b4b14f0fd",
>>     "author": "Tux <tux@example.com> 1490981516 +0200",
>>     "committer": "Tux <tux@example.com> 1490981516 +0200",
>>     "message": "This is a test commit",
>>     "long_message": "This explains in more details the commit"
>> }]
>>
>> This would make it easy to parse the output.
>
> The git-log command isn't plumbing that's meant for machines, but the
> git-for-each-ref command is what you're most likely looking for.

They are apples and oranges.  log is about traversing the history.
"for-each-ref" is about listing the tips of refs.  It doesn't and it
shouldn't traverse the history.

The plumbing to use when you want to implement a "git log" lookalike
is "rev-list".


* Re: Feature request: --format=json
  2017-04-17  0:38   ` Junio C Hamano
@ 2017-04-17 12:44     ` Fred .Flintstone
  2017-04-17 12:59       ` Duy Nguyen
  2017-04-17 17:35       ` Sebastian Schuberth
  0 siblings, 2 replies; 14+ messages in thread
From: Fred .Flintstone @ 2017-04-17 12:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

So I did "git rev-list --all --pretty" and it looks like "git log".
Which outputs a human-readable format.

However, if I want something more suitable for machine parsing, is
there any way to get that output?

For example, maybe I want another date format such as ISO dates, or maybe a
serializable format like JSON or CSV or something. Maybe I want more
data than commit, author, date, subject and body?
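
On the date question: the author and committer fields shown earlier in the
thread carry a Unix timestamp and a UTC offset, so an ISO 8601 date can be
derived from them. A hedged Python sketch follows; the function name, the
regular expression and the returned keys are assumptions for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# "Name <email> <unix-timestamp> <+hhmm|-hhmm>"
IDENT_RE = re.compile(
    r"^(?P<name>.*) <(?P<email>[^>]*)> (?P<ts>\d+) (?P<tz>[+-]\d{4})$"
)


def parse_ident(ident):
    """Split an author/committer line into name, email and an ISO 8601 date."""
    m = IDENT_RE.match(ident)
    if m is None:
        raise ValueError("unrecognized ident line: %r" % ident)
    tz = m["tz"]
    sign = -1 if tz[0] == "-" else 1
    offset = sign * timedelta(hours=int(tz[1:3]), minutes=int(tz[3:5]))
    # Interpret the timestamp in the zone recorded with the commit.
    when = datetime.fromtimestamp(int(m["ts"]), timezone(offset))
    return {"name": m["name"], "email": m["email"], "date": when.isoformat()}
```

For the example line "Tux <tux@example.com> 1490981516 +0200" this gives the
date "2017-03-31T19:31:56+02:00".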


* Re: Feature request: --format=json
  2017-04-17 12:44     ` Fred .Flintstone
@ 2017-04-17 12:59       ` Duy Nguyen
  2017-04-17 13:09         ` Fred .Flintstone
  2017-04-17 17:35       ` Sebastian Schuberth
  1 sibling, 1 reply; 14+ messages in thread
From: Duy Nguyen @ 2017-04-17 12:59 UTC (permalink / raw)
  To: Fred .Flintstone
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Git Mailing List

On Mon, Apr 17, 2017 at 7:44 PM, Fred .Flintstone <eldmannen@gmail.com> wrote:
> So I did "git rev-list --all --pretty" and it looks like "git log".
> Which outputs a human-readable format.
>
> However, if I want something more suitable for machine parsing, is
> there any way to get that output?
>
> For example, maybe I want another date format such as ISO dates, or maybe a
> serializable format like JSON or CSV or something. Maybe I want more
> data than commit, author, date, subject and body?

"git cat-file commit <commit-id>" should give you a machine-readable
format of everything (it's the same thing that git-log parses and
shows you; not counting options like --diff, --stat...). <commit-id>
is from rev-list output (without --pretty, that's not for machine
processing). You probably can use "git cat-file --batch" too, just
pipe the whole rev-list output through it. You don't get to choose a
convenient format this way though.
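
As a rough illustration of how little parsing the `git cat-file commit
<commit-id>` output needs, here is a Python sketch. It assumes the text has
already been decoded to a string; the function name and the layout of the
returned dictionary are made up for this example.

```python
def parse_commit_object(raw):
    """Split raw commit text into header fields, subject and body."""
    # Headers end at the first empty line; everything after is the message.
    header_text, _, message = raw.partition("\n\n")
    headers = {}
    last_key = None
    for line in header_text.split("\n"):
        if line.startswith(" ") and last_key is not None:
            # Continuation of a multi-line header such as gpgsig.
            headers[last_key][-1] += "\n" + line[1:]
            continue
        key, _, value = line.partition(" ")
        # Keep a list per key: a merge commit has several "parent" lines.
        headers.setdefault(key, []).append(value)
        last_key = key
    subject, _, body = message.partition("\n\n")
    return {"headers": headers, "subject": subject.strip(), "body": body.strip()}
```

The result can be handed to `json.dumps` directly, since it only contains
dictionaries, lists and strings.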
-- 
Duy


* Re: Feature request: --format=json
  2017-04-17 12:59       ` Duy Nguyen
@ 2017-04-17 13:09         ` Fred .Flintstone
  2017-04-18  0:38           ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Fred .Flintstone @ 2017-04-17 13:09 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Git Mailing List

So I would either have to do:
git rev-list --all
Then iterate over each line and do git-cat-file commit <commit-id>.

Or do:
git rev-list --all | git cat-file --batch

If I do it in a batch, then it will be tricky to reliably parse since
I don't know when the message body ends and when the next commit
starts.

JSON output would have been very handy.


* Re: Feature request: --format=json
  2017-04-17 12:44     ` Fred .Flintstone
  2017-04-17 12:59       ` Duy Nguyen
@ 2017-04-17 17:35       ` Sebastian Schuberth
  1 sibling, 0 replies; 14+ messages in thread
From: Sebastian Schuberth @ 2017-04-17 17:35 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

On 2017-04-17 14:44, Fred .Flintstone wrote:

> However, if I want something more suitable for machine parsing, is
> there any way to get that output?

Instead of machine parsing, why not directly get what you want via
libgit2 [1] (or one of its language bindings), or jgit [2]?

[1] https://github.com/libgit2/libgit2
[2] https://github.com/eclipse/jgit

-- 
Sebastian Schuberth



* Re: Feature request: --format=json
  2017-04-17 13:09         ` Fred .Flintstone
@ 2017-04-18  0:38           ` Junio C Hamano
  2017-04-18  8:44             ` Fred .Flintstone
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2017-04-18  0:38 UTC (permalink / raw)
  To: Fred .Flintstone
  Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, Git Mailing List

"Fred .Flintstone" <eldmannen@gmail.com> writes:

> So I would either have to do:
> git rev-list --all
> Then iterate over each line and do git-cat-file commit <commit-id>.
>
> Or do:
> git rev-list --all | git cat-file --batch
>
> If I do it in a batch, then it will be tricky to reliably parse since
> I don't know when the message body ends and when the next commit
> starts.
>
> JSON output would have been very handy.

I am somewhat puzzled.  I thought that you were trying to come up
with a way to produce JSON output and people are trying to help you
by pointing out tools that you can use for that.


* Re: Feature request: --format=json
  2017-04-18  0:38           ` Junio C Hamano
@ 2017-04-18  8:44             ` Fred .Flintstone
  2017-04-18  9:39               ` Samuel Lijin
  2017-04-18 11:16               ` demerphq
  0 siblings, 2 replies; 14+ messages in thread
From: Fred .Flintstone @ 2017-04-18  8:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason, Git Mailing List

Well, the easiest way to work with that would be JSON.
So the best would be if Git could output the data I want in JSON format.
Then it would be easy for me to work with the data.

With git rev-list and git cat-file, it's not so easy to reliably parse
that output.


* Re: Feature request: --format=json
  2017-04-18  8:44             ` Fred .Flintstone
@ 2017-04-18  9:39               ` Samuel Lijin
  2017-04-18 10:08                 ` Fred .Flintstone
  2017-04-18 11:16               ` demerphq
  1 sibling, 1 reply; 14+ messages in thread
From: Samuel Lijin @ 2017-04-18  9:39 UTC (permalink / raw)
  To: Fred .Flintstone
  Cc: Junio C Hamano, Duy Nguyen,
	Ævar Arnfjörð Bjarmason, Git Mailing List

If for some reason your use case is so performance intensive that you
can't just `git cat-file commit` every entry in `git rev-list --all`
individually, then you can also pipe input into `git cat-file --batch`
and read output as you pipe input in, which will give you a very
simple mechanism for delimiting the cat-file output.
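
A different way to delimit the batch output, sketched in Python: each object
printed by `git cat-file --batch` is preceded by a header line of the form
"<object-name> <type> <size>" and followed by a newline, so a reader can take
exactly <size> bytes as the contents. The function names are made up for this
example, and the sketch assumes the default --batch output format.

```python
import io


def read_batch(stream):
    """Yield (object name, type, contents) from `git cat-file --batch` output.

    `stream` is a binary file object, e.g. the stdout pipe of the process.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # end of output
        parts = header.decode("ascii").split()
        if len(parts) == 2 and parts[1] == "missing":
            yield parts[0], "missing", b""
            continue
        name, obj_type, size = parts[0], parts[1], int(parts[2])
        contents = stream.read(size)  # exactly the object, however many lines
        stream.read(1)                # the newline that follows the contents
        yield name, obj_type, contents


def parse_batch_bytes(data):
    """Convenience wrapper for output that is already held in memory."""
    return list(read_batch(io.BytesIO(data)))
```

Since the size comes from the header, it does not matter what the commit
message contains; the end of the body never has to be guessed.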

In any case, as developers, it's rare to have our job done for us.
That's why we write code.

I'm sure people would be happy to help if you submitted patches to
support --format=json.


* Re: Feature request: --format=json
  2017-04-18  9:39               ` Samuel Lijin
@ 2017-04-18 10:08                 ` Fred .Flintstone
  0 siblings, 0 replies; 14+ messages in thread
From: Fred .Flintstone @ 2017-04-18 10:08 UTC (permalink / raw)
  To: Samuel Lijin
  Cc: Junio C Hamano, Duy Nguyen,
	Ævar Arnfjörð Bjarmason, Git Mailing List

But a repository or branch can have thousands of commits, so running
`git cat-file commit <commit>` for each of them may not be a wise idea.
But parsing `git cat-file --batch` is difficult, because there seems
to be no reliable way to discern where a commit starts and ends.

I don't code in C, though. A JSON formatter option would need a JSON library.
But maybe a discussion about JSON in Git should be raised, if other
people are interested in this?


* Re: Feature request: --format=json
  2017-04-18  8:44             ` Fred .Flintstone
  2017-04-18  9:39               ` Samuel Lijin
@ 2017-04-18 11:16               ` demerphq
       [not found]                 ` <CAH_OBidR8ewMO_B0HM2SU=B+uV=kRjpOKVMcvohEkZ1PSgT92w@mail.gmail.com>
  1 sibling, 1 reply; 14+ messages in thread
From: demerphq @ 2017-04-18 11:16 UTC (permalink / raw)
  To: Fred .Flintstone
  Cc: Junio C Hamano, Duy Nguyen,
	Ævar Arnfjörð Bjarmason, Git Mailing List

On 18 April 2017 at 10:44, Fred .Flintstone <eldmannen@gmail.com> wrote:
> Well, the easiest way to work with that would be JSON.
> So the best would be if Git could output the data I want in JSON format.
> Then it would be easy for me to work with the data.
>
> With git rev-list and git cat-file, it's not so easy to reliably parse
> that output.

Doesn't seem too hard to work with rev-list to me. As far as I can
tell, the following produces what you want. You need Perl installed,
obviously, and the JSON::PP module is required, but it should come
bundled with recent Perls.

git rev-list master --pretty=raw | perl -MJSON::PP=encode_json -ane '
  if (/^(\w+) (.*)/) {
    if ($1 eq "commit") { push @objs, $o if $o; $o = {}; }
    $o->{$1} = $2;
  } else { $o->{text} .= $_; }
  END {
    push @objs, $o if $o;
    for $o (@objs) {
      s/^    //mg, s/^\n// for $o->{text};
      ($o->{message}, $o->{long_message}) = split /\n\n/, delete $o->{text};
    }
    print JSON::PP->new->pretty->encode(\@objs); }'
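
For readers who prefer Python, a roughly equivalent sketch of the same
parsing of `git rev-list --pretty=raw` output. It is not a line-for-line
port: the function name is made up, and unlike the Perl above it keeps
everything after the first paragraph of the message in "long_message"
instead of only the second paragraph. As in the Perl, a repeated header such
as "parent" keeps only its last value.

```python
import json


def parse_rev_list_raw(text):
    """Turn `git rev-list --pretty=raw` output into a list of dicts."""
    commits = []
    current = None
    in_message = False
    for line in text.split("\n"):
        if line.startswith("commit "):
            # An unindented "commit" line starts a new record.
            current = {"commit": line[len("commit "):], "_lines": []}
            commits.append(current)
            in_message = False
        elif current is None:
            continue
        elif not in_message:
            if line == "":
                in_message = True  # blank line ends the header block
            elif not line.startswith(" "):
                key, _, value = line.partition(" ")
                current[key] = value
        else:
            # Message lines are indented by four spaces in this format.
            current["_lines"].append(line[4:] if line.startswith("    ") else line)
    for commit in commits:
        message = "\n".join(commit.pop("_lines")).strip("\n")
        subject, _, body = message.partition("\n\n")
        commit["message"] = subject
        commit["long_message"] = body.strip("\n")
    return commits


def to_json(text):
    """Pretty-print the parsed commits as a JSON array."""
    return json.dumps(parse_rev_list_raw(text), indent=3)
```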

You might consider an alternative approach than stating that working
with JSON is "the easiest", especially to people who clearly are
making do without it. :-)

A better argument might be that exposing data through a well defined
and widely used and simple data format would trivially expand the
range of projects that might interoperate with git or enhance the git
ecosystem. For instance you could argue that having clean JSON output
would make it easier to integrate into search engines and other
indexing tools that already know how to speak JSON. Maybe a regular
contributor on this list might agree with your arguments and make it
happen.

Until then you can parse rev-list like the rest of us. :-)

cheers,
Yves


* Re: Feature request: --format=json
       [not found]                 ` <CAH_OBidR8ewMO_B0HM2SU=B+uV=kRjpOKVMcvohEkZ1PSgT92w@mail.gmail.com>
@ 2017-04-24  9:28                   ` Duy Nguyen
  2017-04-24 11:04                     ` shawn wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Duy Nguyen @ 2017-04-24  9:28 UTC (permalink / raw)
  To: shawn wilson
  Cc: demerphq, Fred .Flintstone, Junio C Hamano, Git Mailing List,
	Ævar Arnfjörð Bjarmason

On Mon, Apr 24, 2017 at 3:33 PM, shawn wilson <ag4ve.us@gmail.com> wrote:
> Late to the party, but I too would also like json format output (mainly so I
> could pipe stuff to jq instead of looking at the man page for which %thing
> I'm looking for). That said, it's not at the PR level of want for me.
>
> OTOH, format=xml would be even more handy IMHO... Which I see has hit both
> SO and this ml in the past. Either way /some/ machine output would be a good
> thing :)

Personally I'd rather avoid linking to another library just for
json/xml formatting. libgit2 would be a great place to have
functionality like this and it looks like you don't even have to touch
C [1] to do that.

[1] https://gist.github.com/m1el/42472327b4be382b02eb
-- 
Duy


* Re: Feature request: --format=json
  2017-04-24  9:28                   ` Duy Nguyen
@ 2017-04-24 11:04                     ` shawn wilson
  0 siblings, 0 replies; 14+ messages in thread
From: shawn wilson @ 2017-04-24 11:04 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: demerphq, Fred .Flintstone, Junio C Hamano, Git Mailing List,
	Ævar Arnfjörð Bjarmason

On Mon, Apr 24, 2017 at 5:28 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Mon, Apr 24, 2017 at 3:33 PM, shawn wilson <ag4ve.us@gmail.com> wrote:
>> Late to the party, but I too would also like json format output (mainly so I
>> could pipe stuff to jq instead of looking at the man page for which %thing
>> I'm looking for). That said, it's not at the PR level of want for me.
>>
>> OTOH, format=xml would be even more handy IMHO... Which I see has hit both
>> SO and this ml in the past. Either way /some/ machine output would be a good
>> thing :)
>
> Personally I'd rather avoid linking to another library just for
> json/xml formatting. libgit2 would be a great place to have
> functionality like this and it looks like you don't even have to touch
> C [1] to do that.
>
> [1] https://gist.github.com/m1el/42472327b4be382b02eb


Heh, well, I guess if it's like that (simple), I don't really care :)
I was under the impression (no idea why) that it was limited to C or
java (and friends). Given this means I don't have to figure out
someone's json structure or refer to an xsd, I'll live w/ this.

