All of lore.kernel.org
 help / color / mirror / Atom feed
From: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCHv5 12/12] gitweb: gather more remote data
Date: Sat, 23 Oct 2010 18:17:51 +0200	[thread overview]
Message-ID: <AANLkTi=7gUAbdRKLCifPidZXdLa8fCENT0=iy5bKAeiA@mail.gmail.com> (raw)
In-Reply-To: <201009271747.39201.jnareb@gmail.com>

I'm awfully sorry for the delay in the reply to this email. I got
gobbled up by some real-world stuff just while finishing the review of
the last patch.

2010/9/27 Jakub Narebski <jnareb@gmail.com>:
> On Fri, 24 Sep 2010, Giuseppe Bilotta wrote:
>
> Isn't the most important and visible information the fact that both
> standalone 'remotes' view and 'remotes' section in the 'summary' page
> is now grouped by remotes?  This should be explicitely mentioned in the
> commit message.  It is not very visible IMVHO in what is written below.

You're right. I've rewritten the message by putting the grouping
layout in the subject and rephrasing the body of the commit to better
describe what happens both in remotes view and in limited summary
view.

>>
>> +# Return an array with: a hash of remote names mapping to the corresponding
>> +# remote heads, the value of the $limit parameter, and a boolean indicating
>> +# whether we managed to get all the remotes or not.
>
> Perhaps it would be better to provide an example, rather than trying to
> describe the output.

I'm really not sure about how it would help, but I'll try.

>> +# If $limit is specified and the number of heads found is higher, the head
>> +# list is empy. Additional filtering on the number of heads can be done when
>> +# displaying the remotes.
>> +sub git_get_remotes_data {
>> +     my ($limit, $wanted) = @_;
>
> I'm not sure if it wouldn't be better to not worry git_get_remotes_data
> about $limit, but do limiting on output.  The amount of work gitweb
> needs to do in both situation is, I guess, nearly the same.

I think that the real issue is that situations in which gathering all
data and then discarding much of it would be very rare, being that it
would require a very large number of remotes (not remote heads, just
remotes). But considering that even I, a rather casual user, have at
least one project where the number of remotes is in the 20s, I
wouldn't be surprised if there were cases of humongous remote lists.

>> +     my %remotes;
>> +     open my $fd, '-|' , git_cmd(), 'remote', '-v';
>> +     return unless $fd;
>> +     my $more = 1;
>> +     while (my $remote = <$fd> and $more) {
>
> Why not use 'last' instead of using $more variable (though I have not
> checked if it would really make code simpler and more readable).

Because we need to know whether we bail out early or not from the
loop. It might be appropriate to document this in the code.

>> +             chomp $remote;
>> +             $remote =~ s!\t(.*?)\s+\((\w+)\)$!!;
>> +             next if $wanted and not $remote eq $wanted;
>> +             my $url = $1;
>> +             my $key = $2;
>
> Minor nitpick: why not
>
>  +             my ($url, $key) = ($1, $2);

No reason. Can do.

>> +     if ($more) {
>> +             my @heads = map { "remotes/$_" } keys %remotes;
>> +             my @remoteheads = git_get_heads_list(undef, @heads);
>> +             foreach (keys %remotes) {
>> +                     my $remote = $_;
>> +                     $remotes{$remote}{'heads'} = [ grep {
>> +                             $_->{'name'} =~ s!^$remote/!!
>> +                     } @remoteheads ];
>> +             }
>> +     }
>> +     return (\%remotes, $limit, $more);
>
> Hmmmm... as it can be seen making this function do more work results
> in this ugly API.  If git_get_remotes_data didn't worry about limiting,
> we could return simply %remotes hash (or \%remotes hash reference).

If I read your comments correctly, your idea would be to have two
separate functions, one that gets the list of remotes (with their
respective URLs), and a separate one (called when intended) to also
gather the list of heads. The only case in which the latter would not
be called would be when in summary view more than the given number of
maximum remotes were returned from the previous function. Correct?

>> +# Display a single remote block
>> +sub git_remote_body {
>> +     my ($remote, $rdata, $limit, $head, $single) = @_;
>> +     my %rdata = %{$rdata};
>
> Why do you need this, instead of simply using $rdata->{'heads'} etc.?

Because I keep getting messed up with the refs syntax in Perl. Thanks
for the hint.

>> +     my $heads = $rdata{'heads'};
>
> Why not
>
>  +     my @heads = @{$rdata{'heads'}};
>
> Or
>
>  +     my @heads = @{$rdata->{'heads'}};
>
> without this strange '%rdata = %{$rdata};'

Well, the main use of heads was to be passed to git_heads_body so I
thought I would just leave it as a ref.

>> +     if (defined $fetch) {
>> +             if ($fetch eq $push) {
>> +                     $urls .= git_repo_url("URL", $fetch);
>> +             } else {
>> +                     $urls .= git_repo_url("Fetch URL", $fetch);
>> +                     $urls .= git_repo_url("Push URL", $push) if defined $push;
>> +             }
>> +     } elsif (defined $push) {
>> +             $urls .= git_repo_url("Push URL", $push);
>> +     } else {
>> +             $urls .= git_repo_url("", "No remote URL");
>> +     }
>
> Perhaps reverse order of conditions would result in less nested
> conditional... but perhaps not.

I tried. The problem is that we just have that many cases (fetch and
no push, push and no fetch, equal fetch and push, different fetch and
push). I could not think of a way to handle them all in a more
streamlined way.

>> +
>> +     $urls .= "</table>\n";
>> +
>> +     my ($maxheads, $dots);
>> +     if (defined $limit) {
>> +             $maxheads = $limit - 1;
>
> If git_get_remotes_data didn't do limiting, you could use
>
>  +             $maxheads = scalar @heads;
>
>> +             if ($#{$heads} > $maxheads) {
>> +                     $dots = $cgi->a({-href => href(action=>"remotes", hash=>$remote)}, "...");
>> +             }
>> +     }
>
> We can always check if we got more remotes than limit.
>

Actually, this is in reverse. get_remotes_data doesn't do limiting on
the number of heads, which is why $maxheads is $limit - 1 and the
number of heads is compared against it. git_remotes_data does limiting
on the number of _remotes_, and if the limit is hit, git_remote_body
would just not be called.

>> +
>> +     my $content = sub {
>> +             print $urls;
>> +             git_heads_body($heads, $head, 0, $maxheads, $dots);
>> +     };
>> +
>> +     if (defined $single and $single) {
>
> 'defined $single and $single' is the same as '$single', because
> undefined value is false-ish.

I was over-protecting against warnings about the use of undefined
symbols. Cleaned up.

>> +             $content->();
>> +     } else {
>> +             git_group("remotes", $remote, "remotes", $remote, $remote, $content);
>> +     }
>
> Hmmm... should git_remote_body be responsible for wrapping in subsection?
> Wouldn't it be better to have caller use
>
>  git_group("remotes", $remote, "remotes", $remote, $remote,
>            sub { git_remote_body($remote, $rdata, $limit, $head) });

Right.

> Sidenote: strange (but perhaps unavoidable) repetition we have here...

Even with the new syntax discussed for the previous patches this is
unavoidable, it seems.

>> +}
>> +
>> +# Display remote heads grouped by remote, unless there are too many
>> +# remotes ($have_all is false), in which case we only display the remote
>> +# names
>
> Wouldn't it be better to put displaying only remote names in a separate
> subroutine, which would make git_remotes_body extremely simple?

Sure.

>> +                     print "<td>" .
>> +                           $cgi->a({-href=> href(action=>'remotes', hash=>$remote),
>> +                                    -class=> "list name"},esc_html($remote)) . "</td>";
>> +                     print "<td class=\"link\">" .
>> +                           (defined $fetch ? $cgi->a({-href=> $fetch}, "fetch") : "fetch") .
>> +                           " | " .
>> +                           (defined $push ? $cgi->a({-href=> $push}, "push") : "push") .
>> +                           "</td>";
>
> I see that you don't worry here if $fetch == $push, only if they
> are defined.  But I think it might be all right... though the exact
> output might need some discussion.

I'm open to suggestions.

>>  sub git_search_grep_body {
>>       my ($commitlist, $from, $to, $extra) = @_;
>>       $from = 0 unless defined $from;
>> @@ -5164,7 +5302,7 @@ sub git_summary {
>>       # there are more ...
>>       my @taglist  = git_get_tags_list(16);
>>       my @headlist = git_get_heads_list(16, 'heads');
>> -     my @remotelist = $remote_heads ? git_get_heads_list(16, 'remotes') : ();
>> +     my @remotelist = $remote_heads ? git_get_remotes_data(16) : ();
>
> That's not @remotelist any longer, isn't it?

Right. I renamed it to @remotedata across the whole file.

>>       my @forklist;
>>       my $check_forks = gitweb_check_feature('forks');
>>
>> @@ -5244,9 +5382,7 @@ sub git_summary {
>>
>>       if (@remotelist) {
>>               git_print_header_div('remotes');
>> -             git_heads_body(\@remotelist, $head, 0, 15,
>> -                            $#remotelist <= 15 ? undef :
>> -                            $cgi->a({-href => href(action=>"remotes")}, "..."));
>> +             git_remotes_body(@remotelist, $head);
>
> Hmmm... (current API, with @remotelist (!) containing list of arguments
> to a subroutine).

It does make the call much cleaner though (think remotedata: it has
the data, and the information on whether the data is complete or not.
doesn't it make sense?)

>>       }
>>
>>       if (@forklist) {
>> @@ -5570,26 +5706,25 @@ sub git_remotes {
>>       my $head = git_get_head_hash($project);
>>       my $remote = $input_params{'hash'};
>>
>> +     my @remotelist = git_get_remotes_data(undef, $remote);
>> +     die_error(500, "Unable to get remote information") unless @remotelist;
>
> What if there are no remotes, and no remote-tracking branches?

Presently, we get 404 - No remotes found if there are no remotes. If
there are remotes without heads, the section will only contain the
URL.

The funny thing is that if there are no remotes there will be an empty
remotes section in summary view, so we probably to check for that too.

>> +
>> +     if (keys(%{$remotelist[0]}) == 0) {
>> +             die_error(404, defined $remote ?
>> +                     "Remote $remote not found" :
>> +                     "No remotes found");
>> +     }
>
> Ah, I see that it is handled here.  Sidenote: with proposed changed
> API we can distinguish error case from no-remotes case by returning
> undefined value versus returning empty hash / empty hash reference.

The question is: do we really want the different API? It does make a
few things cleaner, but it makes other things messier (I'm thinking
here about the calls to git_*_body; compare what we have for remotes
with what we have for tags or heads list in summary view ...)

-- 
Giuseppe "Oblomov" Bilotta

  reply	other threads:[~2010-10-23 16:18 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-24 16:02 [PATCHv5 00/12] gitweb: remote_heads feature Giuseppe Bilotta
2010-09-24 16:02 ` [PATCHv5 01/12] gitweb: introduce " Giuseppe Bilotta
2010-09-26 17:24   ` Jakub Narebski
2010-09-26 19:19     ` Ævar Arnfjörð Bjarmason
2010-09-26 19:47       ` David Ripton
2010-09-27  6:42         ` Giuseppe Bilotta
2010-09-24 16:02 ` [PATCHv5 02/12] gitweb: git_get_heads_list accepts an optional list of refs Giuseppe Bilotta
2010-09-26 17:27   ` Jakub Narebski
2010-09-24 16:02 ` [PATCHv5 03/12] gitweb: separate heads and remotes lists Giuseppe Bilotta
2010-09-26 17:39   ` Jakub Narebski
2010-09-24 16:02 ` [PATCHv5 04/12] gitweb: nagivation menu for tags, heads and remotes Giuseppe Bilotta
2010-09-26 17:52   ` Jakub Narebski
2010-09-27  6:48     ` Giuseppe Bilotta
2010-09-27  8:30       ` Jakub Narebski
2010-09-24 16:02 ` [PATCHv5 05/12] gitweb: use fullname as hash_base in heads link Giuseppe Bilotta
2010-09-26 17:57   ` Jakub Narebski
2010-09-24 16:02 ` [PATCHv5 06/12] gitweb: allow extra text after action in page header Giuseppe Bilotta
2010-09-26 18:11   ` Jakub Narebski
2010-09-27  6:56     ` Giuseppe Bilotta
2010-09-27  7:42       ` Jakub Narebski
2010-09-24 16:02 ` [PATCHv5 07/12] gitweb: remotes view for a single remote Giuseppe Bilotta
2010-09-26 20:55   ` Jakub Narebski
2010-09-27  7:11     ` Giuseppe Bilotta
2010-09-27  7:53       ` Jakub Narebski
2010-09-24 16:02 ` [PATCHv5 08/12] gitweb: auxiliary function to group data Giuseppe Bilotta
2010-09-26 21:47   ` Jakub Narebski
2010-09-27  7:26     ` Giuseppe Bilotta
2010-09-27  8:12       ` Jakub Narebski
2010-09-27 19:17         ` Giuseppe Bilotta
2010-09-24 16:02 ` [PATCHv5 09/12] gitweb: group styling Giuseppe Bilotta
2010-09-26 22:10   ` Jakub Narebski
2010-09-27  7:27     ` Giuseppe Bilotta
2010-09-24 16:02 ` [PATCHv5 10/12] gitweb: git_repo_url() routine Giuseppe Bilotta
2010-09-26 22:34   ` Jakub Narebski
2010-09-27  7:29     ` Giuseppe Bilotta
2010-09-24 16:02 ` [PATCHv5 11/12] gitweb: use git_repo_url() in summary Giuseppe Bilotta
2010-09-26 22:36   ` Jakub Narebski
2010-09-24 16:02 ` [PATCHv5 12/12] gitweb: gather more remote data Giuseppe Bilotta
2010-09-27 15:47   ` Jakub Narebski
2010-10-23 16:17     ` Giuseppe Bilotta [this message]
2010-09-26 18:18 ` [PATCHv5 00/12] gitweb: remote_heads feature Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='AANLkTi=7gUAbdRKLCifPidZXdLa8fCENT0=iy5bKAeiA@mail.gmail.com' \
    --to=giuseppe.bilotta@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.