* [PATCH 0/3] Gitweb caching v7 @ 2010-10-28 0:42 John 'Warthog9' Hawley 2010-10-28 0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley ` (4 more replies) 0 siblings, 5 replies; 43+ messages in thread From: John 'Warthog9' Hawley @ 2010-10-28 0:42 UTC (permalink / raw) To: git [-- Attachment #1: Type: text/plain, Size: 2493 bytes --] Evening everyone, This is the latest incarnation of gitweb w/ caching. Per the general consensus and requests from the recent Gittogether I'm re-submitting my patches. Biggest changes are bringing it back in line with Junio's tree, couple of locking fixes (specifically targeted at binary files), and at the nagging behest of Sverre I've cut the old patch down to a much more manageable (and in his words "reviewable") series. This still differs, slightly, from whats in production on kernel.org. It's missing the index page git:// link, and kernel.org is still running the broken out output buffering but I'm likely to flip over to this just so I'm not maintaining quite so much (and the changes are easier to track with upstream) v7: - Rework output system, now central STDOUT redirect - Various fixes to caching brought in from existing running system v6: - Never saw the light of day - Various testing, and reworks. v5: - Missed a couple of things that were in my local tree, and added them back in. - Split up the die_error and the version matching patch - Set version matching to be on by default - otherwise this really is code that will never get checked, or at best enabled by default by distributions - Added a minor code cleanup with respect to $site_header that was already in my tree - Applied against a more recent git tree vs. 1.6.6-rc2 - Removed breakout patch for now (did that in v4 actually) and will deal with that separately http://git.kernel.org/?p=git/warthog9/gitweb.git;a=shortlog;h=refs/heads/gitweb-ml-v5 v4: - major re-working of the caching layer to use file handle redirection instead of buffering output - other minor improvements http://git.kernel.org/?p=git/warthog9/gitweb.git;a=shortlog;h=refs/heads/gitweb-ml-v4 v3: - various minor re-works based on mailing list feedback, this series was not sent to the mailing list. v2: - Better breakout - You can actually disable the cache now - John 'Warthog9' Hawley John 'Warthog9' Hawley (3): gitweb: Add option to force version match gitweb: add output buffering and associated functions gitweb: File based caching layer (from git.kernel.org) gitweb/README | 4 + gitweb/cache.pm | 365 ++++++++++++++++++++++++++++++++++++++++++++++ gitweb/gitweb.perl | 147 ++++++++++++++++++- gitweb/static/gitweb.css | 6 + 4 files changed, 514 insertions(+), 8 deletions(-) create mode 100644 gitweb/cache.pm -- 1.7.2.3 ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 1/3] gitweb: Add option to force version match 2010-10-28 0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley @ 2010-10-28 0:42 ` John 'Warthog9' Hawley 2010-10-28 9:52 ` Jakub Narebski 2010-10-28 22:08 ` Junio C Hamano 2010-10-28 0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley ` (3 subsequent siblings) 4 siblings, 2 replies; 43+ messages in thread From: John 'Warthog9' Hawley @ 2010-10-28 0:42 UTC (permalink / raw) To: git [-- Attachment #1: Type: text/plain, Size: 752 bytes --] This adds $git_versions_must_match variable, which is set to true, checks that we are running on the same version of git that we shipped with, and if not throw '500 Internal Server Error' error. What is checked is the version of gitweb (embedded in building gitweb.cgi), against version of runtime git binary used. Gitweb can usually run with a mismatched git install. This is more here to give an obvious warning as to whats going on vs. silently failing. By default this feature is turned on. Signed-off-by: John 'Warthog9' Hawley <warthog9@kernel.org> Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- gitweb/README | 4 ++++ gitweb/gitweb.perl | 27 +++++++++++++++++++++++++++ 2 files changed, 31 insertions(+), 0 deletions(-) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-gitweb-Add-option-to-force-version-match.patch --] [-- Type: text/x-patch; name="0001-gitweb-Add-option-to-force-version-match.patch", Size: 2303 bytes --] diff --git a/gitweb/README b/gitweb/README index bf3664f..7ee8450 100644 --- a/gitweb/README +++ b/gitweb/README @@ -246,6 +246,10 @@ not include variables usually directly set during build): http://www.andre-simon.de due to assumptions about parameters and output). Useful if highlight is not installed on your webserver's PATH. [Default: highlight] + * $git_versions_must_match + If set, gitweb fails with 500 Internal Server Error if the version of gitweb + doesn't match version of git binary. The default is true. + Projects list file format ~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 8d7e4c5..215a4e9 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -232,6 +232,9 @@ our %avatar_size = ( 'double' => 32 ); +# If it is true, exit if gitweb version and git binary version don't match +our $git_versions_must_match = 1; + # Used to set the maximum load that we will still respond to gitweb queries. # If server load exceed this value then return "503 server busy" error. # If gitweb cannot determined server load, it is taken to be 0. @@ -649,6 +652,29 @@ sub check_loadavg { } } +sub check_versionmatch { + # Throw an error if git versions does not match, if $git_versions_must_match is true. + if ($git_versions_must_match && + $git_version ne $version) { + my $admin_contact = + defined $ENV{'SERVER_ADMIN'} ? ", $ENV{'SERVER_ADMIN'}," : ''; + my $err_msg = <<EOT; +<h1 align="center">*** Warning ***</h1> +<p> +This version of gitweb was compiled for <b>@{[esc_html($version)]}</b>, +however git version <b>@{[esc_html($git_version)]}</b> was found on server, +and administrator requested strict version checking. +</p> +<p> +Please contact the server administrator${admin_contact} to either configure +gitweb to allow mismatched versions, or update git or gitweb installation. +</p> +EOT + die_error(500, 'Internal server error', $err_msg); + } + +} + # ====================================================================== # input validation and dispatch @@ -1075,6 +1101,7 @@ sub run_request { evaluate_uri(); evaluate_gitweb_config(); check_loadavg(); + check_versionmatch(); # $projectroot and $projects_list might be set in gitweb config file $projects_list ||= $projectroot; ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH 1/3] gitweb: Add option to force version match 2010-10-28 0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley @ 2010-10-28 9:52 ` Jakub Narebski 2010-10-28 22:08 ` Junio C Hamano 1 sibling, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-10-28 9:52 UTC (permalink / raw) To: John 'Warthog9' Hawley; +Cc: git "John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes: > This adds $git_versions_must_match variable, which is set to true, > checks that we are running on the same version of git that we > shipped with, and if not throw '500 Internal Server Error' error. > What is checked is the version of gitweb (embedded in building > gitweb.cgi), against version of runtime git binary used. > > Gitweb can usually run with a mismatched git install. This is more > here to give an obvious warning as to whats going on vs. silently > failing. Were you bitten by mismatch between git version and gitweb version? Just how serious is that (hypothetical?) situation? Should it be warning, or erroring out? > > By default this feature is turned on. I think last time this patch stalled on discussion whether this feature should be turned on by default, where it can break some setups; or be turned off by default, where it is much less useful protecting against errors. > > Signed-off-by: John 'Warthog9' Hawley <warthog9@kernel.org> > Signed-off-by: Jakub Narebski <jnareb@gmail.com> > --- > gitweb/README | 4 ++++ > gitweb/gitweb.perl | 27 +++++++++++++++++++++++++++ > 2 files changed, 31 insertions(+), 0 deletions(-) For me, if $git_versions_must_match is to be on by default, I would prefer also update to t/gitweb-lib.sh, so hat I don't have to recompile git to test new version of gitweb... > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 8d7e4c5..215a4e9 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -232,6 +232,9 @@ our %avatar_size = ( > 'double' => 32 > ); > > +# If it is true, exit if gitweb version and git binary version don't match > +our $git_versions_must_match = 1; > + > # Used to set the maximum load that we will still respond to gitweb queries. > # If server load exceed this value then return "503 server busy" error. > # If gitweb cannot determined server load, it is taken to be 0. > @@ -649,6 +652,29 @@ sub check_loadavg { > } > } > > +sub check_versionmatch { > + # Throw an error if git versions does not match, if $git_versions_must_match is true. > + if ($git_versions_must_match && > + $git_version ne $version) { > + my $admin_contact = > + defined $ENV{'SERVER_ADMIN'} ? ", $ENV{'SERVER_ADMIN'}," : ''; > + my $err_msg = <<EOT; > +<h1 align="center">*** Warning ***</h1> > +<p> > +This version of gitweb was compiled for <b>@{[esc_html($version)]}</b>, > +however git version <b>@{[esc_html($git_version)]}</b> was found on server, > +and administrator requested strict version checking. > +</p> > +<p> > +Please contact the server administrator${admin_contact} to either configure > +gitweb to allow mismatched versions, or update git or gitweb installation. > +</p> > +EOT > + die_error(500, 'Internal server error', $err_msg); > + } > + > +} Nice. > + > # ====================================================================== > # input validation and dispatch > > @@ -1075,6 +1101,7 @@ sub run_request { > evaluate_uri(); > evaluate_gitweb_config(); > check_loadavg(); > + check_versionmatch(); Shouldn't it be before check_loadavg()? -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 1/3] gitweb: Add option to force version match 2010-10-28 0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley 2010-10-28 9:52 ` Jakub Narebski @ 2010-10-28 22:08 ` Junio C Hamano 1 sibling, 0 replies; 43+ messages in thread From: Junio C Hamano @ 2010-10-28 22:08 UTC (permalink / raw) To: John 'Warthog9' Hawley; +Cc: git I do not understand why this variable needs to be turned _on_ by default, but more importantly, isn't this more or less independent from what we discussed at GitTogether, which is to get your battle-tested caching layer that will tremendously help any nontrivial site merged earlier rather than later by reducing load average from 1000 to 3-or-4? ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 2/3] gitweb: add output buffering and associated functions 2010-10-28 0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley 2010-10-28 0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley @ 2010-10-28 0:42 ` John 'Warthog9' Hawley 2010-10-28 9:56 ` Jakub Narebski 2010-10-28 0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley ` (2 subsequent siblings) 4 siblings, 1 reply; 43+ messages in thread From: John 'Warthog9' Hawley @ 2010-10-28 0:42 UTC (permalink / raw) To: git [-- Attachment #1: Type: text/plain, Size: 556 bytes --] This adds output buffering for gitweb, mainly in preparation for caching support. This is a dramatic change to how caching was being done, mainly in passing around the variable manually and such. This centrally flips the entire STDOUT to a variable, which after the completion of the run, flips it back and does a print on the resulting data. This should save on the previous 10K line patch (or so) that adds more explicit output passing. --- gitweb/gitweb.perl | 29 +++++++++++++++++++++++++++++ 1 files changed, 29 insertions(+), 0 deletions(-) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0002-gitweb-add-output-buffering-and-associated-functions.patch --] [-- Type: text/x-patch; name="0002-gitweb-add-output-buffering-and-associated-functions.patch", Size: 1595 bytes --] diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 215a4e9..757ef46 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -30,6 +30,9 @@ BEGIN { our $version = "++GIT_VERSION++"; +# Output buffer variable +$output = ""; + our ($my_url, $my_uri, $base_url, $path_info, $home_link); sub evaluate_uri { our $cgi; @@ -1151,6 +1154,25 @@ sub evaluate_argv { ); } +sub change_output { + our $output; + + # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such + open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + + # Close STDOUT, so that it isn't being used anymore. + close STDOUT; + + # Trap STDOUT to the $output variable, which is what I was using in the original + # patch anyway. + open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var +} + +sub reset_output { + # This basically takes STDOUT_REAL and puts it back as STDOUTl + open(STDOUT,">&STDOUT_REAL"); +} + sub run { evaluate_argv(); evaluate_git_version(); @@ -1163,7 +1185,10 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; + change_output(); run_request(); + reset_output(); + print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3673,6 +3698,10 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # Reset the output so that we are actually going to STDOUT as opposed + # to buffering the output. + reset_output(); + git_header_html($http_responses{$status}, undef, %opts); print <<EOF; <div class="page_body"> ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH 2/3] gitweb: add output buffering and associated functions 2010-10-28 0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley @ 2010-10-28 9:56 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-10-28 9:56 UTC (permalink / raw) To: John 'Warthog9' Hawley; +Cc: git "John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes: > This adds output buffering for gitweb, mainly in preparation for > caching support. This is a dramatic change to how caching was being > done, mainly in passing around the variable manually and such. > > This centrally flips the entire STDOUT to a variable, which after the > completion of the run, flips it back and does a print on the resulting > data. > > This should save on the previous 10K line patch (or so) that adds more > explicit output passing. Signoff? > --- > gitweb/gitweb.perl | 29 +++++++++++++++++++++++++++++ > 1 files changed, 29 insertions(+), 0 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 215a4e9..757ef46 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -30,6 +30,9 @@ BEGIN { > > our $version = "++GIT_VERSION++"; > > +# Output buffer variable > +$output = ""; > + > our ($my_url, $my_uri, $base_url, $path_info, $home_link); > sub evaluate_uri { > our $cgi; > @@ -1151,6 +1154,25 @@ sub evaluate_argv { > ); > } > > +sub change_output { > + our $output; > + > + # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such > + open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; > + > + # Close STDOUT, so that it isn't being used anymore. > + close STDOUT; > + > + # Trap STDOUT to the $output variable, which is what I was using in the original > + # patch anyway. > + open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var > +} > + > +sub reset_output { > + # This basically takes STDOUT_REAL and puts it back as STDOUTl > + open(STDOUT,">&STDOUT_REAL"); > +} Nice solution. I'll steal it for GitwebCache::Capture::Redirect (or something like that, instead of relying on a bit cryptic select($fh)). > @@ -1163,7 +1185,10 @@ sub run { > $pre_dispatch_hook->() > if $pre_dispatch_hook; > > + change_output(); > run_request(); > + reset_output(); > + print $output; Doesn't this enable capture unconditionally? Wouldn't that screw-up the blame_data part of blame_interactive Ajax-y view? Wouldn't that decrease perceived responsiveness of gitweb? > > $post_dispatch_hook->() > if $post_dispatch_hook; > @@ -3673,6 +3698,10 @@ sub die_error { > 500 => '500 Internal Server Error', > 503 => '503 Service Unavailable', > ); > + # Reset the output so that we are actually going to STDOUT as opposed > + # to buffering the output. > + reset_output(); > + Good. > git_header_html($http_responses{$status}, undef, %opts); > print <<EOF; > <div class="page_body"> -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) 2010-10-28 0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley 2010-10-28 0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley 2010-10-28 0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley @ 2010-10-28 0:42 ` John 'Warthog9' Hawley 2010-10-28 16:10 ` Jakub Narebski 2010-10-28 22:29 ` [PATCH 0/3] Gitweb caching v7 Junio C Hamano 2010-10-29 22:25 ` Junio C Hamano 4 siblings, 1 reply; 43+ messages in thread From: John 'Warthog9' Hawley @ 2010-10-28 0:42 UTC (permalink / raw) To: git [-- Attachment #1: Type: text/plain, Size: 1695 bytes --] This is a relatively large patch that implements the file based caching layer that is quite similar to the one used on such large sites as kernel.org and soon git.fedoraproject.org. This provides a simple, and straight forward caching mechanism that scales dramatically better than Gitweb by itself. The caching layer basically buffers the output that Gitweb would normally return, and saves that output to a cache file on the local disk. When the file is requested it attempts to gain a shared lock on the cache file and cat it out to the client. Should an exclusive lock be on a file (it's being updated) the code has a choice to either update in the background and go ahead and show the stale page while update is being performed, or stall the client(s) until the page is generated. There are two forms of stalling involved here, background building and non-background building, both of which are discussed in the configuration page. There are still a few known "issues" with respect to this: - Code needs to be added to be "browser" aware so that clients like wget that are trying to get a binary blob don't obtain a "Generating..." page Caching is disabled by default with the $cache_enable variable, setting this to 1 will enable file based caching. It is expected that this will be extended to include additional types of caching (like memcached) in the future and should not be exclusively considered a binary value. --- gitweb/cache.pm | 365 ++++++++++++++++++++++++++++++++++++++++++++++ gitweb/gitweb.perl | 99 +++++++++++-- gitweb/static/gitweb.css | 6 + 3 files changed, 458 insertions(+), 12 deletions(-) create mode 100644 gitweb/cache.pm [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0003-gitweb-File-based-caching-layer-from-git.kernel.org.patch --] [-- Type: text/x-patch; name="0003-gitweb-File-based-caching-layer-from-git.kernel.org.patch", Size: 15811 bytes --] diff --git a/gitweb/cache.pm b/gitweb/cache.pm new file mode 100644 index 0000000..c86265c --- /dev/null +++ b/gitweb/cache.pm @@ -0,0 +1,365 @@ +# gitweb - simple web interface to track changes in git repositories +# +# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net> +# +# This program is licensed under the GPLv2 + +# +# Gitweb caching engine +# + +#use File::Path qw(make_path remove_tree); +use File::Path qw(mkpath rmtree); # Used for compatability reasons +use Digest::MD5 qw(md5 md5_hex md5_base64); +use Fcntl ':flock'; +use File::Copy; + +sub cache_fetch { + my ($action) = @_; + my $cacheTime = 0; + + # Deal with cache being disabled + if( $cache_enable == 0 ){ + undef $backgroundCache; + #$fullhashpath = \$nocachedata; + #$fullhashbinpath = \$nocachedatabin; + $fullhashpath = *STDOUT; + $fullhashbinpath = *STDOUT; + $actions{$action}->(); + return; + }elsif( $cache_enable == 1 ){ + #obviously we are using file based caching + + if(! -d $cachedir){ + print "*** Warning ***: Caching enabled but cache directory does not exsist. ($cachedir)\n"; + mkdir ("cache", 0665) || die "Cannot create cache dir - you will need to manually create"; + print "Cache directory created successfully\n"; + } + + our $full_url = "$my_url?". $ENV{'QUERY_STRING'}; + our $urlhash = md5_hex($full_url); + our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/"; + + eval { mkpath( $fullhashdir, 0, 0777 ) }; + if ($@) { + die_error(500, "Internal Server Error", "Could not create cache directory: $@"); + } + $fullhashpath = "$fullhashdir/". substr( $urlhash, 2 ); + $fullhashbinpath = "$fullhashpath.bin.wt"; + $fullhashbinpathfinal = "$fullhashpath.bin"; + } # done dealing with cache enabled / disabled + + if(! -e "$fullhashpath" ){ + if( ! $cache_enable || ! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + cacheUpdate($action,1); + }else{ + cacheWaitForUpdate($action); + } + }else{ + #if cache is out dated, update + #else displayCache(); + open(cacheFile, '<', "$fullhashpath"); + stat(cacheFile); + close(cacheFile); + my $stat_time = (stat(_))[9]; + my $stat_size = (stat(_))[7]; + + $cacheTime = get_loadavg() * 60; + if( $cacheTime > $maxCacheTime ){ + $cacheTime = $maxCacheTime; + } + if( $cacheTime < $minCacheTime ){ + $cacheTime = $minCacheTime; + } + if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){ + if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + #print "Running updater\n"; + cacheUpdate($action,1); + }else{ + #print "Waiting for update\n"; + cacheWaitForUpdate($action); + } + } else { + cacheDisplay($action); + } + + + } + + # + # If all of the caching failes - lets go ahead and press on without it and fall back to 'default' + # non-caching behavior. This is the softest of the failure conditions. + # + #$actions{$action}->(); +} + +sub cacheUpdate { + my ($action,$areForked) = @_; + my $lockingStatus; + my $fileData = ""; + + if($backgroundCache){ + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStatBG; + }else{ + open(cacheFile, '>:utf8', "$fullhashpath") if ($cache_enable > 0); + open(cacheFile, '>:utf8', \$fullhashpath) if ($cache_enable == 0); + my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStat; + } + #print "lock status: $lockStat\n"; + + + if ($cache_enable != 0 && ! $lockStatus ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath"); + my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB); + } + + # Trap all output from the action + change_output(); + + $actions{$action}->(); + + # Reset the outputs as we should be fine now + reset_output(); + + + if($backgroundCache){ + open(cacheFile, '>:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_EX); + + if (! $lockStat ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX); + + if (! $lockStatBIN ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + # Actually dump the output to the proper file handler + local $/ = undef; + $|++; + print cacheFile "$output"; + $|--; + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!"; + + flock(cacheFileBinFINAL,LOCK_UN); + close(cacheFileBinFINAL); + + flock(cacheFileBinWT,LOCK_UN); + close(cacheFileBinWT); + } + + flock(cacheFile,LOCK_UN); + close(cacheFile); + + if($backgroundCache){ + flock(cacheFileBG,LOCK_UN); + close(cacheFileBG); + } + + if ( $areForked ){ + exit(0); + } else { + return; + } +} + + +sub cacheWaitForUpdate { + my ($action) = @_; + my $x = 0; + my $max = 10; + my $lockStat = 0; + + if( $backgroundCache ){ + if( -e "$fullhashpath" ){ + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + stat(cacheFile); + close(cacheFile); + + if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){ + cacheDisplay($action); + return; + } + } + } + + if( + $action eq "atom" + || + $action eq "rss" + || + $action eq "opml" + ){ + do { + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + + if( $x != $max ){ + cacheDisplay($action); + } + return; + } + + $| = 1; + + print $::cgi->header( + -type=>'text/html', + -charset => 'utf-8', + -status=> 200, + -expires => 'now', + # HTTP/1.0 + -Pragma => 'no-cache', + # HTTP/1.1 + -Cache_Control => join( + ', ', + qw( + private + no-cache + no-store + must-revalidate + max-age=0 + pre-check=0 + post-check=0 + ) + ) + ); + + print <<EOF; +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd"> +<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> --> +<!-- git core binaries version $git_version --> +<head> +<meta http-equiv="content-type" content="$content_type; charset=utf-8"/> +<meta name="generator" content="gitweb/$version git/$git_version"/> +<meta name="robots" content="index, nofollow"/> +<meta http-equiv="refresh" content="0"/> +<title>$title</title> +</head> +<body> +EOF + + print "Generating.."; + do { + print "."; + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + print <<EOF; +</body> +</html> +EOF + return; +} + +sub cacheDisplay { + local $/ = undef; + $|++; + + my ($action) = @_; + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + + if ($cache_enable != 0 && ! $lockStat ){ + close(cacheFile); + cacheWaitForUpdate($action); + } + + if( + ( + $action eq "snapshot" + || + $action eq "blob_plain" + ) + && + $cache_enable > 0 + ){ + my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB); + if ($cache_enable != 0 && ! $lockStatBIN ){ + system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log"); + close(cacheFile); + close(cacheFileBin); + cacheWaitForUpdate($action); + } + + my $binfilesize = -s "$fullhashbinpathfinal"; + print "Content-Length: $binfilesize"; + } + while( <cacheFile> ){ + print $_; + } + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + binmode STDOUT, ':raw'; + print <cacheFileBin>; + binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close(cacheFileBin); + } + close(cacheFile); + $|--; +} diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 757ef46..eb53075 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -31,7 +31,7 @@ BEGIN { our $version = "++GIT_VERSION++"; # Output buffer variable -$output = ""; +our $output = ""; our ($my_url, $my_uri, $base_url, $path_info, $home_link); sub evaluate_uri { @@ -244,6 +244,57 @@ our $git_versions_must_match = 1; # Leave it undefined (or set to 'undef') to turn off load checking. our $maxload = 300; +# This enables/disables the caching layer in gitweb. This currently only supports the +# 'dumb' file based caching layer, primarily used on git.kernel.org. this is reasonably +# effective but it has the downside of requiring a huge amount of disk space if there +# are a number of repositories involved. It is not uncommon for git.kernel.org to have +# on the order of 80G - 120G accumulate over the course of a few months. It is recommended +# that the cache directory be periodically completely deleted, and this is safe to perform. +# Suggested mechanism +# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush +# Value is binary. 0 = disabled (default), 1 = enabled. +# +# Values of caching: +# 1 = 'dumb' file based caching used on git.kernel.org +our $cache_enable = 0; + +# Used to set the minimum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to be under this number of seconds we set the cache timeout +# to this minimum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $minCacheTime = 20; + +# Used to set the maximum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to exceed this number of seconds we set the cache timeout +# to this maximum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $maxCacheTime = 1200; + +# If you need to change the location of the caching directory, override this +# otherwise this will probably do fine for you +our $cachedir = 'cache'; + +# If this is set (to 1) cache will do it's best to always display something instead +# of making someone wait for the cache to update. This will launch the cacheUpdate +# into the background and it will lock a <file>.bg file and will only lock the +# actual cache file when it needs to write into it. In theory this will make +# gitweb seem more responsive at the price of possibly stale data. +our $backgroundCache = 1; + +# Used to set the maximum cache file life. If a cache files last modify time exceeds +# this value, it will assume that the data is just too old, and HAS to be regenerated +# instead of trying to display the existing cache data. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +# 18000 = 5 hours +our $maxCacheLife = 18000; + +# Used to enable or disable background forking of the gitweb caching. Mainly here for debugging purposes +our $cacheDoFork = 1; + +our $fullhashpath = *STDOUT; +our $fullhashbinpath = *STDOUT; +our $fullhashbinpathfinal = *STDOUT; + # configuration for 'highlight' (http://www.andre-simon.de/) # match by basename our %highlight_basename = ( @@ -500,6 +551,11 @@ our %feature = ( 'default' => [0]}, ); +# +# Includes +# +do 'cache.pm'; + sub gitweb_get_feature { my ($name) = @_; return unless exists $feature{$name}; @@ -1089,7 +1145,8 @@ sub dispatch { !$project) { die_error(400, "Project needed"); } - $actions{$action}->(); + #$actions{$action}->(); + cache_fetch($action); } sub reset_timer { @@ -1159,6 +1216,7 @@ sub change_output { # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + print STDOUT_REAL ""; # Close STDOUT, so that it isn't being used anymore. close STDOUT; @@ -1185,10 +1243,7 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; - change_output(); run_request(); - reset_output(); - print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3465,7 +3520,9 @@ sub git_header_html { # support xhtml+xml but choking when it gets what it asked for. if (defined $cgi->http('HTTP_ACCEPT') && $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ && - $cgi->Accept('application/xhtml+xml') != 0) { + $cgi->Accept('application/xhtml+xml') != 0 + && + $cache_enable == 0) { $content_type = 'application/xhtml+xml'; } else { $content_type = 'text/html'; @@ -3610,6 +3667,7 @@ sub git_footer_html { my $feed_class = 'rss_logo'; print "<div class=\"page_footer\">\n"; + print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n"; if (defined $project) { my $descr = git_get_project_description($project); if (defined $descr) { @@ -3698,6 +3756,11 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # The output handlers for die_error need to be reset to STDOUT + # so that half the message isn't being output to random and + # half to STDOUT as expected. This is mainly for the benefit + # of using git_header_html() and git_footer_html() since + # # Reset the output so that we are actually going to STDOUT as opposed # to buffering the output. reset_output(); @@ -5610,9 +5673,15 @@ sub git_blob_plain { ($sandbox ? 'attachment' : 'inline') . '; filename="' . $save_as . '"'); local $/ = undef; - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if( $cache_enable != 0){ + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } @@ -5897,9 +5966,15 @@ sub git_snapshot { open my $fd, "-|", $cmd or die_error(500, "Execute git-archive failed"); - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if( $cache_enable != 0){ + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css index 4132aab..972d32e 100644 --- a/gitweb/static/gitweb.css +++ b/gitweb/static/gitweb.css @@ -67,6 +67,12 @@ div.page_path { border-width: 0px 0px 1px; } +div.cachetime { + float: left; + margin-right: 10px; + color: #555555; +} + div.page_footer { height: 17px; padding: 4px 8px; ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) 2010-10-28 0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley @ 2010-10-28 16:10 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-10-28 16:10 UTC (permalink / raw) To: John 'Warthog9' Hawley; +Cc: git, Jakub Narebski I am not commenting on minor style issues, like using camelCase names of variables and subroutines, of 'if( ... )' instead of 'if (...)' etc. minor issues. "John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes: > This is a relatively large patch that implements the file based > caching layer that is quite similar to the one used on such large > sites as kernel.org and soon git.fedoraproject.org. This provides > a simple, and straight forward caching mechanism that scales > dramatically better than Gitweb by itself. "Quite similar" - what does it mean? In what way it is different from the one used on git.kernel.org? > > The caching layer basically buffers the output that Gitweb would > normally return, and saves that output to a cache file on the local > disk. O.K. This means caching HTTP response (caching output of CGI script), using "simple" file based caching engine, without any serialization. But this doesn't tell us how the cache expiration is handled, i.e. how do you know that cache entry is expired, and should be regenerated. > When the file is requested it attempts to gain a shared lock > on the cache file and cat it out to the client. Should an exclusive > lock be on a file (it's being updated) the code has a choice to either > update in the background and go ahead and show the stale page while > update is being performed, or stall the client(s) until the page > is generated. In my rewrite of earlier version of your gitweb caching patch it is responsibility of caching engine (GitwebCache::FileCacheWithLocking) to serialize (via locking) (re)generating cache entry. Gitweb just passes page key, function that returns captured output to cache instance; it passes callback for "staling" the client to cache constructor. It is not the "file" that is being requested: on the gitweb level it is a page / action / view that is being requested, on the level of caching negine it is cache entry that is requested... only at the low level of cache driver that a file holding a cached entry is requested. > > There are two forms of stalling involved here, background building > and non-background building, both of which are discussed in the > configuration page. > > There are still a few known "issues" with respect to this: > - Code needs to be added to be "browser" aware so > that clients like wget that are trying to get a > binary blob don't obtain a "Generating..." page Right. This is also missing from my rewrite. Sidenote: why "binary blob"? Most downloading utilities do not follow http-meta in-HTML redirects, do they? > > Caching is disabled by default with the $cache_enable variable, > setting this to 1 will enable file based caching. It is expected > that this will be extended to include additional types of caching > (like memcached) in the future and should not be exclusively > considered a binary value. Why not do the same as in my rewrite, namely $cache_enabled or $caching_enabled *boolean* value (not "binary"), and $cache variable to select caching engine (undef to use default)? > --- > gitweb/cache.pm | 365 ++++++++++++++++++++++++++++++++++++++++++++++ > gitweb/gitweb.perl | 99 +++++++++++-- > gitweb/static/gitweb.css | 6 + > 3 files changed, 458 insertions(+), 12 deletions(-) > create mode 100644 gitweb/cache.pm It would be nice to have at least some minimal testing for caching enabled gitweb. Hmmm... I should have split update to t9500-* test so you would be able to cherry-pick it. Also, it would be good idea to have 'make install-gitweb' to install 'cache.pm' alongside gitweb, especially that you do not do any error checking for "do 'cache.pm';" (see below). > > diff --git a/gitweb/cache.pm b/gitweb/cache.pm > new file mode 100644 > index 0000000..c86265c > --- /dev/null > +++ b/gitweb/cache.pm > @@ -0,0 +1,365 @@ > +# gitweb - simple web interface to track changes in git repositories > +# > +# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net> > +# > +# This program is licensed under the GPLv2 > + > +# > +# Gitweb caching engine > +# Wouldn't it be better to use real Perl module, i.e. start with 'package' line, like in my rewrite? > + > +#use File::Path qw(make_path remove_tree); Is above line really needed? > +use File::Path qw(mkpath rmtree); # Used for compatability reasons Because [stable] Perl earlier than 5.10.0 has pre-2.00 version of File::Path in core, which didn't have make_path and remove_tree. At least that is compatibility with pre-2.00 File::Path should be mentioned in this comment, if we are to mention it at all. BTW. do you need rmtree? > +use Digest::MD5 qw(md5 md5_hex md5_base64); Do you use *all* those subroutines? > +use Fcntl ':flock'; All right. > +use File::Copy; Why do you need this for? > + > +sub cache_fetch { > + my ($action) = @_; > + my $cacheTime = 0; > + > + # Deal with cache being disabled > + if( $cache_enable == 0 ){ Wouldn't it be better to use boolean 'if (!$caching_enabled) {' rather than explicitely test for numeric 0? Wouldn't it be better to have this conditional _outside_ cache_fetch? > + undef $backgroundCache; > + #$fullhashpath = \$nocachedata; > + #$fullhashbinpath = \$nocachedatabin; Leftover debugging? > + $fullhashpath = *STDOUT; > + $fullhashbinpath = *STDOUT; WTF is this for? Defensive programming? You shouldn't protect yourself against bugs in your program... > + $actions{$action}->(); > + return; > + }elsif( $cache_enable == 1 ){ > + #obviously we are using file based caching > + > + if(! -d $cachedir){ > + print "*** Warning ***: Caching enabled but cache directory does not exsist. ($cachedir)\n"; Why print this warning? Why not use 'warn' instead, so it would get in web server logs? Wouldn't printing this here screw up HTTP headers something fierce? I think it is a bug; rarely encountered but still a bug. > + mkdir ("cache", 0665) || die "Cannot create cache dir - you will need to manually create"; > + print "Cache directory created successfully\n"; > + } > + > + our $full_url = "$my_url?". $ENV{'QUERY_STRING'}; Why not use 'href(-replay => 1, -full => 1, -path_info => 0)'? > + our $urlhash = md5_hex($full_url); > + our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/"; Why 'our' and not 'my', i.e. why use global variables, and not local to function (lexical scoped to be more exact)? We use 'our' in gitweb *only* because otherwise mod_perl with ModPerl::Registry handler wouldn't work, see e.g. dde80d9 (gitweb: Fix mod_perl support., 2008-11-06) which fixes one issue of 'my' instead of 'our'. But global variables are bad idea... > + > + eval { mkpath( $fullhashdir, 0, 0777 ) }; > + if ($@) { > + die_error(500, "Internal Server Error", "Could not create cache directory: $@"); > + } Good... but wouldn't it leak sensitive information? > + $fullhashpath = "$fullhashdir/". substr( $urlhash, 2 ); > + $fullhashbinpath = "$fullhashpath.bin.wt"; > + $fullhashbinpathfinal = "$fullhashpath.bin"; First, why do you treat binary and non-binary case differently, i.e. you have $fullhashbinpathfinal (egh, thats a mouthfull ;-), but not $fullhashpathfinal? Second, do you really need to have distinct code for binary and non-binary case besides what gitweb already has with "binmode STDOUT, ':raw';" in those places few that needs it? If you are capturing to in-memory file (to scalar), and binmode acts on said in-memory file, then what gets saved is raw (bytes) data. Then you can print it raw to web browser, save it raw to cache file, and read it raw from cache file. > + } # done dealing with cache enabled / disabled > + > + if(! -e "$fullhashpath" ){ Why it is not within 'if ($caching_enabled)' branch of conditional? > + if( ! $cache_enable || ! $cacheDoFork || ! defined(my $childPid = fork()) ){ > + cacheUpdate($action,0); > + cacheDisplay($action); > + } elsif ( $childPid == 0 ){ > + #run the updater > + cacheUpdate($action,1); > + }else{ > + cacheWaitForUpdate($action); > + } WTF do you have forking *here*? If cache entry is not expired, you can just read it, without need for forking. It is only when you need to (re)generate cache that forking is needed. > + }else{ > + #if cache is out dated, update > + #else displayCache(); Lefotover statement. > + open(cacheFile, '<', "$fullhashpath"); > + stat(cacheFile); > + close(cacheFile); You don't need to open file to stat it, you can do + stat($fullhashpath); BTW. what happens if output is binary? > + my $stat_time = (stat(_))[9]; > + my $stat_size = (stat(_))[7]; > + > + $cacheTime = get_loadavg() * 60; > + if( $cacheTime > $maxCacheTime ){ > + $cacheTime = $maxCacheTime; > + } > + if( $cacheTime < $minCacheTime ){ > + $cacheTime = $minCacheTime; > + } All right. > + if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){ Some comment about '$stat_size == 0' test would be good idea, I think. > + if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){ > + cacheUpdate($action,0); > + cacheDisplay($action); > + } elsif ( $childPid == 0 ){ > + #run the updater > + #print "Running updater\n"; > + cacheUpdate($action,1); > + }else{ > + #print "Waiting for update\n"; > + cacheWaitForUpdate($action); > + } The above block of code is almost verbatim copy of code higher up. Not good. > + } else { > + cacheDisplay($action); > + } Wouldn't it be better to have those two branches of conditional reversed, so that else part is larger? Just a thought... > + > + > + } > + > + # > + # If all of the caching failes - lets go ahead and press on without it and fall back to 'default' > + # non-caching behavior. This is the softest of the failure conditions. This comment refers to... > + # > + #$actions{$action}->(); ... commented out code? Eh? > +} You don't use persistent web environment[1] for git.kernel.org, do you? With persistent web environments, where gitweb script runs for a long time, and doesn't exit after end of request, you really should "daemonize" (detach) background process which regenerates cache, at least when said process doesn't display web page to server (serve stale data + regenerate cache in background). Otherwise you would get lots of zombie processes... [1] Gitweb currently supports FastCGI (just rename it to gitweb.fcgi, but you would need FCGI perl module installed together with CGI::Fast), mod_perl via ModPerl::Registry handler, and you can use it with PSGI via gitweb.psgi wrapper like the one generated by 'git instaweb --httpd=plackup'. > + > +sub cacheUpdate { > + my ($action,$areForked) = @_; Strange API. > + my $lockingStatus; > + my $fileData = ""; > + > + if($backgroundCache){ > + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); > + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); > + > + $lockStatus = $lockStatBG; > + }else{ > + open(cacheFile, '>:utf8', "$fullhashpath") if ($cache_enable > 0); > + open(cacheFile, '>:utf8', \$fullhashpath) if ($cache_enable == 0); Why do you save to in-memory file if caching is disabled, instead of just printing to STDOUT? > + my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); > + > + $lockStatus = $lockStat; > + } Why do you need different names for lockfiles for background cache entry enabled and for it disabled? Why do you need $lockStat and $lockStatBG if what you do is just assign it to $lockStatus? Why do you use fileglobs (which are *global* variables!) and not lexical filehandles? > + #print "lock status: $lockStat\n"; Leftover debugging? BTW. didn't you have problems debugging it from withing gitweb code, and not as a separate caching engine which can be tested individually? > + > + > + if ($cache_enable != 0 && ! $lockStatus ){ > + if ( $areForked ){ > + exit(0); > + }else{ > + return; > + } > + } Strange code flow... > + > + if( > + $action eq "snapshot" > + || > + $action eq "blob_plain" > + ){ > + my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath"); > + my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB); > + } Why '>>', and not '>'. Why use ':utf8' on binary file? Shouldn't you use different file for locking than for saving cache output in? > + > + # Trap all output from the action > + change_output(); > + > + $actions{$action}->(); > + > + # Reset the outputs as we should be fine now > + reset_output(); All right. > + > + > + if($backgroundCache){ > + open(cacheFile, '>:utf8', "$fullhashpath"); > + $lockStat = flock(cacheFile,LOCK_EX); > + > + if (! $lockStat ){ > + if ( $areForked ){ > + exit(0); > + }else{ > + return; > + } > + } > + } > + > + if( > + $action eq "snapshot" > + || > + $action eq "blob_plain" > + ){ > + my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal"); > + $lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX); > + > + if (! $lockStatBIN ){ > + if ( $areForked ){ > + exit(0); > + }else{ > + return; > + } > + } > + } Didn't I see nearly exactly the same code? Copy'n'paste programming isn't good idea... > + > + # Actually dump the output to the proper file handler > + local $/ = undef; > + $|++; > + print cacheFile "$output"; > + $|--; Why stringify $output? Why not 'local $| = 1' too? > + if( > + $action eq "snapshot" > + || > + $action eq "blob_plain" > + ){ > + move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!"; Do you really need 'File::Copy::move' instead of simple 'rename' here? Both files are in the same directory, so there should be no problems with it. > + > + flock(cacheFileBinFINAL,LOCK_UN); > + close(cacheFileBinFINAL); > + > + flock(cacheFileBinWT,LOCK_UN); > + close(cacheFileBinWT); Closing lockfile unlocks it, there is no need for LOCK_UN; moreover, LOCK_Un is unreliable, as I have checked myself, and as I have read on http://perl.plover.com/yak/flock/ and also on StackOverflow http://stackoverflow.com/questions/34920/how-do-i-lock-a-file-in-perl Why do binary files needs _two_ locks for? > + } > + > + flock(cacheFile,LOCK_UN); > + close(cacheFile); > + > + if($backgroundCache){ > + flock(cacheFileBG,LOCK_UN); > + close(cacheFileBG); > + } Why do background cache generation needs _two_ (or _four_ in case of binary files!!!) locks? > + > + if ( $areForked ){ > + exit(0); > + } else { > + return; > + } Why can't a caller do that? > +} > + > + I think it would be better to describe how cacheWaitForUpdate() works. > +sub cacheWaitForUpdate { > + my ($action) = @_; > + my $x = 0; > + my $max = 10; > + my $lockStat = 0; > + > + if( $backgroundCache ){ > + if( -e "$fullhashpath" ){ > + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); > + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); Same af above: when caching is disabled, why do you capture gitweb output for? > + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); > + stat(cacheFile); > + close(cacheFile); Here you unlocked just acquired lock! Sidenote: using real locsk with flock() has the advantage over trick of creating *.lock files with O_CREAT in that if process dies (and all files it opened are closed) the lock is automatically unlocked. Therefore you don't have problems of dead lockfiles. > + > + if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){ > + cacheDisplay($action); > + return; > + } Is running time() a second time intended, or just a side effect of arrangement of code? > + } > + } > + > + if( > + $action eq "atom" > + || > + $action eq "rss" > + || > + $action eq "opml" > + ){ Sidenote: this is a bit ugly, and that is why in my rewrite I have commit d20e6a4 (gitweb: Introduce %actions_info, gathering information about actions, 2010-10-05). Also, in my rewrite I can simply 'return' instead of 'exit()' here, and you would get data from caching engine: no need for below code block (which is repeated in very similar way a bit later in same subroutine). > + do { > + sleep 2 if $x > 0; > + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); > + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); > + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); > + close(cacheFile); Here you unlock just acquired lock. Is it intended? Is it safe against other process acquiting writer's lock (LOCK_EX)? If it is, I probably should unlock it earlier and not have exit() do it... > + $x++; > + $combinedLockStat = $lockStat; > + } while ((! $combinedLockStat) && ($x < $max)); My rewrite instead tries to acquire LOCK_SH *without* LOCK_NB, i.e. in a blocking way... but with a timeout via $SIG{ALRM} and eval { ... }. > + > + if( $x != $max ){ > + cacheDisplay($action); > + } > + return; > + } > + > + $| = 1; I have autoflush turned on later. But by all thats's koly, please localize $| before changing its value, i.e. + local $| = 1; > + > + print $::cgi->header( Why $::cgi->header()? In my rewrite git_generating_data_html() subroutine is defined in gitweb.perl, and passed as callback to cache constructor instead. > + -type=>'text/html', > + -charset => 'utf-8', > + -status=> 200, > + -expires => 'now', That I have. > + # HTTP/1.0 > + -Pragma => 'no-cache', > + # HTTP/1.1 > + -Cache_Control => join( > + ', ', > + qw( > + private > + no-cache > + no-store > + must-revalidate > + max-age=0 > + pre-check=0 > + post-check=0 > + ) > + ) This I don't, and I probably should have. Good idea. > + ); > + > + print <<EOF; > +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd"> > +<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> --> > +<!-- git core binaries version $git_version --> > +<head> > +<meta http-equiv="content-type" content="$content_type; charset=utf-8"/> > +<meta name="generator" content="gitweb/$version git/$git_version"/> > +<meta name="robots" content="index, nofollow"/> You meant here "noindex, nofollow", isn't it? > +<meta http-equiv="refresh" content="0"/> > +<title>$title</title> > +</head> > +<body> > +EOF > + > + print "Generating.."; > + do { > + print "."; > + sleep 2 if $x > 0; > + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); > + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); > + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); > + close(cacheFile); > + $x++; > + $combinedLockStat = $lockStat; > + } while ((! $combinedLockStat) && ($x < $max)); A bit of code repetition here. > + print <<EOF; > +</body> > +</html> > +EOF > + return; > +} > + > +sub cacheDisplay { > + local $/ = undef; > + $|++; > + > + my ($action) = @_; > + open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); > + open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); > + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); > + > + if ($cache_enable != 0 && ! $lockStat ){ > + close(cacheFile); > + cacheWaitForUpdate($action); > + } > + > + if( > + ( > + $action eq "snapshot" > + || > + $action eq "blob_plain" > + ) > + && > + $cache_enable > 0 > + ){ > + my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal"); > + $lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB); > + if ($cache_enable != 0 && ! $lockStatBIN ){ > + system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log"); Errrrr... what? Using system("echo ...") instead of just writing to a file? Or better, using 'warn' and relying on CGI::Carp module to deliver warning to web server logs? > + close(cacheFile); Why not close 'cacheFile' earlier, or say, not opening it in first place if it is not necessary... oh, sorry, I see that you are using the same file for cache file for normal case and for lockfile. > + close(cacheFileBin); > + cacheWaitForUpdate($action); > + } > + > + my $binfilesize = -s "$fullhashbinpathfinal"; > + print "Content-Length: $binfilesize"; Good idea, bad solution. "Content-Length: " HTTP header can be only _after_ 'HTTP/1.1 200 OK' header, isn't it? > + } > + while( <cacheFile> ){ > + print $_; > + } Why not 'print <cacheFile>;'? besides with $/ undefined you operate in slurp / spew mode, and Perl now considers all file to be a single line. > + if( > + $action eq "snapshot" > + || > + $action eq "blob_plain" > + ){ > + binmode STDOUT, ':raw'; > + print <cacheFileBin>; > + binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi > + close(cacheFileBin); > + } > + close(cacheFile); > + $|--; Wouldn't it be better to use 'local $| = 1' instead of this '$|++ / $|--'? > +} > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 757ef46..eb53075 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -31,7 +31,7 @@ BEGIN { > our $version = "++GIT_VERSION++"; > > # Output buffer variable > -$output = ""; > +our $output = ""; This belongs to and should be squashed with earlier patch > > our ($my_url, $my_uri, $base_url, $path_info, $home_link); > sub evaluate_uri { > @@ -244,6 +244,57 @@ our $git_versions_must_match = 1; > # Leave it undefined (or set to 'undef') to turn off load checking. > our $maxload = 300; > > +# This enables/disables the caching layer in gitweb. This currently only supports the > +# 'dumb' file based caching layer, primarily used on git.kernel.org. this is reasonably > +# effective but it has the downside of requiring a huge amount of disk space if there > +# are a number of repositories involved. It is not uncommon for git.kernel.org to have > +# on the order of 80G - 120G accumulate over the course of a few months. It is recommended > +# that the cache directory be periodically completely deleted, and this is safe to perform. > +# Suggested mechanism > +# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush > +# Value is binary. 0 = disabled (default), 1 = enabled. > +# > +# Values of caching: > +# 1 = 'dumb' file based caching used on git.kernel.org > +our $cache_enable = 0; As I wrote earlier, having boolean $caching_enabled and $cache specifying caching engine to use (undef for default) allows to separate act of enabling cache from the cact of selecting (and configuring) cache. Isn't it? > + > +# Used to set the minimum cache timeout for the dynamic caching algorithm. Basically > +# if we calculate the cache to be under this number of seconds we set the cache timeout > +# to this minimum. > +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour > +our $minCacheTime = 20; > + > +# Used to set the maximum cache timeout for the dynamic caching algorithm. Basically > +# if we calculate the cache to exceed this number of seconds we set the cache timeout > +# to this maximum. > +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour > +our $maxCacheTime = 1200; > + > +# If you need to change the location of the caching directory, override this > +# otherwise this will probably do fine for you > +our $cachedir = 'cache'; If it is relative path, it is relative to location of gitweb.cgi script, isn't it? > + > +# If this is set (to 1) cache will do it's best to always display something instead > +# of making someone wait for the cache to update. This will launch the cacheUpdate > +# into the background and it will lock a <file>.bg file and will only lock the > +# actual cache file when it needs to write into it. In theory this will make > +# gitweb seem more responsive at the price of possibly stale data. > +our $backgroundCache = 1; Boolean. > + > +# Used to set the maximum cache file life. If a cache files last modify time exceeds > +# this value, it will assume that the data is just too old, and HAS to be regenerated > +# instead of trying to display the existing cache data. > +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour > +# 18000 = 5 hours > +our $maxCacheLife = 18000; > + > +# Used to enable or disable background forking of the gitweb caching. Mainly here for debugging purposes > +our $cacheDoFork = 1; > + > +our $fullhashpath = *STDOUT; > +our $fullhashbinpath = *STDOUT; > +our $fullhashbinpathfinal = *STDOUT; > + In my rewrite all cache parameters are in %cache_options hash, and passed as parameters to cache constructor, instead of littering global variables namespace... > # configuration for 'highlight' (http://www.andre-simon.de/) > # match by basename > our %highlight_basename = ( > @@ -500,6 +551,11 @@ our %feature = ( > 'default' => [0]}, > ); > > +# > +# Includes > +# > +do 'cache.pm'; Unsafe! If you have any error in 'cache.pm', you could get a very strange error reports. Also what if 'cache.pm' cannot be found because somebody forgot to copy it along with gitweb.cgi? See how we do it for $GITWEB_CONFIG: if (-e $GITWEB_CONFIG) { do $GITWEB_CONFIG; die $@ if $@; } See also documentation for 'do' (perldoc -f do): If "do" cannot read the file, it returns undef and sets $! to the error. If "do" can read the file but cannot compile it, it returns undef and sets an error message in $@. If the file is successfully compiled, "do" returns the value of the last expression evaluated. Note that inclusion of library modules is better done with the "use" and "require" operators, which also do automatic error checking and raise an exception if there's a problem. You might like to use "do" to read in a program configuration file. Manual error checking can be done this way: # read in config files: system first, then user for $file ("/share/prog/defaults.rc", "$ENV{HOME}/.someprogrc") { unless ($return = do $file) { warn "couldn't parse $file: $@" if $@; warn "couldn't do $file: $!" unless defined $return; warn "couldn't run $file" unless $return; } } In my rewrite I use simply "require", and I check it's output. > + > sub gitweb_get_feature { > my ($name) = @_; > return unless exists $feature{$name}; > @@ -1089,7 +1145,8 @@ sub dispatch { > !$project) { > die_error(400, "Project needed"); > } > - $actions{$action}->(); > + #$actions{$action}->(); > + cache_fetch($action); Wouldn't a better solution be - $actions{$action}->(); + if ($caching_enabled) { + cache_fetch($action); + } else { + $actions{$action}->(); + } > } > > sub reset_timer { > @@ -1159,6 +1216,7 @@ sub change_output { > > # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such > open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; > + print STDOUT_REAL ""; Why it is for? > > # Close STDOUT, so that it isn't being used anymore. > close STDOUT; > @@ -1185,10 +1243,7 @@ sub run { > $pre_dispatch_hook->() > if $pre_dispatch_hook; > > - change_output(); > run_request(); > - reset_output(); > - print $output; > > $post_dispatch_hook->() > if $post_dispatch_hook; Hmmm... > @@ -3465,7 +3520,9 @@ sub git_header_html { > # support xhtml+xml but choking when it gets what it asked for. > if (defined $cgi->http('HTTP_ACCEPT') && > $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ && > - $cgi->Accept('application/xhtml+xml') != 0) { > + $cgi->Accept('application/xhtml+xml') != 0 > + && > + $cache_enable == 0) { > $content_type = 'application/xhtml+xml'; > } else { > $content_type = 'text/html'; I also have something similar in my rewrite. > @@ -3610,6 +3667,7 @@ sub git_footer_html { > my $feed_class = 'rss_logo'; > > print "<div class=\"page_footer\">\n"; > + print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n"; > if (defined $project) { > my $descr = git_get_project_description($project); > if (defined $descr) { In my rewrite it replaces "time to generate page" info, though probably should be always there. > @@ -3698,6 +3756,11 @@ sub die_error { > 500 => '500 Internal Server Error', > 503 => '503 Service Unavailable', > ); > + # The output handlers for die_error need to be reset to STDOUT > + # so that half the message isn't being output to random and > + # half to STDOUT as expected. This is mainly for the benefit > + # of using git_header_html() and git_footer_html() since > + # > # Reset the output so that we are actually going to STDOUT as opposed > # to buffering the output. > reset_output(); > @@ -5610,9 +5673,15 @@ sub git_blob_plain { > ($sandbox ? 'attachment' : 'inline') > . '; filename="' . $save_as . '"'); > local $/ = undef; > - binmode STDOUT, ':raw'; > - print <$fd>; > - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi > + if( $cache_enable != 0){ > + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); > + }else{ > + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); > + } > + binmode BINOUT, ':raw'; > + print BINOUT <$fd>; > + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi > + close BINOUT; Why it is here? Why caching engine connot take care of that? > close $fd; > } > > @@ -5897,9 +5966,15 @@ sub git_snapshot { > > open my $fd, "-|", $cmd > or die_error(500, "Execute git-archive failed"); > - binmode STDOUT, ':raw'; > - print <$fd>; > - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi > + if( $cache_enable != 0){ > + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); > + }else{ > + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); > + } > + binmode BINOUT, ':raw'; > + print BINOUT <$fd>; > + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi > + close BINOUT; > close $fd; > } Same as above. I don't think that having to modify gitweb code beside enabling caching somewhere at the top shows a good design... > > diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css > index 4132aab..972d32e 100644 > --- a/gitweb/static/gitweb.css > +++ b/gitweb/static/gitweb.css > @@ -67,6 +67,12 @@ div.page_path { > border-width: 0px 0px 1px; > } > > +div.cachetime { > + float: left; > + margin-right: 10px; > + color: #555555; > +} O.K. > + > div.page_footer { > height: 17px; > padding: 4px 8px; P.S. Where is my review ;-)??? -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0/3] Gitweb caching v7 2010-10-28 0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley ` (2 preceding siblings ...) 2010-10-28 0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley @ 2010-10-28 22:29 ` Junio C Hamano 2010-10-29 22:25 ` Junio C Hamano 4 siblings, 0 replies; 43+ messages in thread From: Junio C Hamano @ 2010-10-28 22:29 UTC (permalink / raw) To: John 'Warthog9' Hawley; +Cc: git "John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes: > v7: > - Rework output system, now central STDOUT redirect Yeah, when I saw your print-to-variable-assignment that was one of the things that came to my mind. Sounds like a reasonable thing to do. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0/3] Gitweb caching v7 2010-10-28 0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley ` (3 preceding siblings ...) 2010-10-28 22:29 ` [PATCH 0/3] Gitweb caching v7 Junio C Hamano @ 2010-10-29 22:25 ` Junio C Hamano 2010-10-30 8:58 ` Jakub Narebski 4 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-10-29 22:25 UTC (permalink / raw) To: John 'Warthog9' Hawley; +Cc: git I am getting this in the gitweb.log: [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124. which seems to cause t9500 to fail. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0/3] Gitweb caching v7 2010-10-29 22:25 ` Junio C Hamano @ 2010-10-30 8:58 ` Jakub Narebski 2010-10-31 4:24 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-10-30 8:58 UTC (permalink / raw) To: Junio C Hamano; +Cc: John 'Warthog9' Hawley, git Junio C Hamano <gitster@pobox.com> writes: > I am getting this in the gitweb.log: > > [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine > &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124. > > which seems to cause t9500 to fail. This is caused by three issues (bugs) in v7 caching code. First is the reason why t9500 exhibits this bug. The gitweb caching v7 includes file with subroutines related to caching in the following way: do 'cache.pm'; Note that the relative path is used; for t9500 it is relative from somewhere witjin 't/', and not from 'gitweb/', so "do 'cache.pm';" doesn't find it. In my earlier rewrites I used do $cache_pm; and 't/gitweb-lib.sh' set $cache_pm appriopriately. Second is why this bug is bad. There is no error checking that "do 'cache.pm';" succeeded. It should be do $cache_pm; die $@ if $@; at least. Perhaps even better would be to simply turn off caching support if there is an error in 'cache.pm' (which probably should be called 'cache.pl', as it is not proper Perl module)... though on the other hand side it would could hide the fact that caching is not working. Third is why this matters. In v7 the cache_fetch() subroutine, defined in 'cache.pm', is run *unconditionally*, and has test if the caching is enabled *inside it*, instead of having gitweb (caller) use it only if caching is disabled. This coupled with the fact that 'make install-gitweb' would *not* install 'cache.pm' alongside gitweb.cgi means that anybody upgrading gitweb, whether he/she wants caching or not, would have gitweb stop working after upgrade... unless he/she knows that he/she has to copy 'cache.pm' file manually. On the other hand having test first if caching is enabled would make t9500 tests pass (because they do not even test minimally cache enabled / cache disabled cases, like in my rewrite), but would hide problem when caching is enabled and 'cache.pm' is not installed... (perhaps also in persistent environments; I don't know where pwd is then). -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0/3] Gitweb caching v7 2010-10-30 8:58 ` Jakub Narebski @ 2010-10-31 4:24 ` Junio C Hamano 2010-10-31 9:21 ` Jakub Narebski 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-10-31 4:24 UTC (permalink / raw) To: Jakub Narebski; +Cc: John 'Warthog9' Hawley, git Jakub Narebski <jnareb@gmail.com> writes: >> I am getting this in the gitweb.log: >> >> [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine >> &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124. >> >> which seems to cause t9500 to fail. > > This is caused by three issues (bugs) in v7 caching code. > > First is the reason why t9500 exhibits this bug. The gitweb caching > v7 includes file with subroutines related to caching in the following > way: > > do 'cache.pm'; > > Note that the relative path is used; for t9500 it is relative from > somewhere witjin 't/', and not from 'gitweb/', so "do 'cache.pm';" > doesn't find it. John, where should cache.pm go in the installed system? Does it typically go next to gitweb.perl? I think "do 'that-file'" honors path specified by the -I option, so I do not think "do $cache_pm" is necessary. My preference is to run gitweb tests with appropriate -I pointing at the cache.pm in the directory. > ... It should be > > do $cache_pm; > die $@ if $@; > > at least. Catching failure is a good thing to do. > Perhaps even better would be to simply turn off caching > support if there is an error in 'cache.pm' That can come later. Jakub, can we have an absolute minimum fix-up, so that we can give this wider exposure? I think there are only four issues: (1) exclude Ajax-y stuff from caching; (2) install cache.pm the same way gitweb.perl is installed via the Makefile; (3) running tests with appropriate -I so that cache.pm is found; and (4) die if 'cache.pm' cannot be "done"). I think the change in gitweb-cache v7 is small and safe enough that we should fast-track it to give usability to the real world sites. It may be a low-risk "obviously correct" approach that is quick-and-dirty, but that is exactly why this should be fast-tracked. It does not touch the logic or formatting in any way, just bypasses the page generation altogether when it can clearly do so when it can tell the output cannot possibly be incorrect (albeit sometimes it might be stale if in certain cases, e.g. it is relative to HEAD). I know you and others were aiming to split things up, but I think the amount of the effort that is needed for that line of work on top of the current codebase is not much different from what is needed on top of gitweb-cache v7. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0/3] Gitweb caching v7 2010-10-31 4:24 ` Junio C Hamano @ 2010-10-31 9:21 ` Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski ` (4 more replies) 0 siblings, 5 replies; 43+ messages in thread From: Jakub Narebski @ 2010-10-31 9:21 UTC (permalink / raw) To: Junio C Hamano; +Cc: John 'Warthog9' Hawley, git On Sun, 31 Oct 2010, Junio C Hamano wrote: > Jakub Narebski <jnareb@gmail.com> writes: > >>> I am getting this in the gitweb.log: >>> >>> [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine >>> &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124. >>> >>> which seems to cause t9500 to fail. >> >> This is caused by three issues (bugs) in v7 caching code. >> >> First is the reason why t9500 exhibits this bug. The gitweb caching >> v7 includes file with subroutines related to caching in the following >> way: >> >> do 'cache.pm'; >> >> Note that the relative path is used; for t9500 it is relative from >> somewhere witjin 't/', and not from 'gitweb/', so "do 'cache.pm';" >> doesn't find it. > > John, where should cache.pm go in the installed system? Does it typically > go next to gitweb.perl? > > I think "do 'that-file'" honors path specified by the -I option, so I do > not think "do $cache_pm" is necessary. My preference is to run gitweb > tests with appropriate -I pointing at the cache.pm in the directory. >From `perldoc -f do` do 'stat.pl'; is just like eval 'cat stat.pl'; except that it's more efficient and concise, keeps track of the current filename for error messages, searches the @INC libraries, and updates %INC if the file is found. ^^^^^^^^^^^^^^^^^^^^^^^^^^^ So I think it respects '-I<directory>' given to perl interpreter. On the other hand I don't know how it would work with mod_perl (uwing ModPerl::Registry handler), whether it wouldn't too require extra configuration. So I think a better solution would be to base 'Gitweb caching v7' plus necessary fixes and improvements on top of gitweb: Prepare for splitting gitweb This means that the directory that gitweb is in would be added to @INC via use lib __DIR__.'/lib'; The 'cache.pm' or 'cache.pl' file would be moved to 'gitweb/lib', but that is cosmetic change. > >> ... It should be >> >> do $cache_pm; >> die $@ if $@; >> >> at least. > > Catching failure is a good thing to do. Right. And this is only one-line addition. >> Perhaps even better would be to simply turn off caching >> support if there is an error in 'cache.pm' > > That can come later. > > Jakub, can we have an absolute minimum fix-up, so that we can give > this wider exposure? I think there are only > four issues: > > (1) exclude Ajax-y stuff from caching; Easy, but only if check whether to do capturing and caching is moved out of cache_fetch to caller, i.e. to gitweb script. See also comments below about what need and should be done. Another solution is to turn off Ajax-y stuff when caching is enabled. It can be done quite easily, just sprinkle some !$caching_enabled (see comment below about naming and semantic of this config variable) in appropriate place. > (2) install cache.pm the same way gitweb.perl is installed via > the Makefile; With "gitweb: Prepare for splitting gitweb" as introductory patch this would be just adding # gitweb output caching GITWEB_MODULES += cache.pm (or cache.pl - it is not a proper Perl package!) e.g. above GITWEB_REPLACE sed script in gitweb/Makefile. > (3) running tests with appropriate -I so that cache.pm is found; and No change to test necessary if we use "use lib __DIR__.'/lib';" in gitweb.perl I'd like to port from my rewrite change to t/t9500-gitweb-standalone-no-errors.sh adding minimal test to check if running gitweb with caching enabled doesn't generate any warnings, and (modified) change to the t/t9502-gitweb-standalone-parse-output.sh, adding minimal test that gitweb produces the same output with and without caching. > (4) die if 'cache.pm' cannot be "done". O.K. (4) is one liner. There is are also other issues (5) naming and semantic of gitweb config variables configuring caching; at least change $cache_enabled enum to $caching_enabled boolean (6) do not change anything in gitweb behavior if caching is disabled; move 'if ($caching_enabled)' test to gitweb.perl, and remove code from cache_fetch About (5): names and semantics of configuration variables are gitweb API, so we should at least try to keep backward compatibility with old names (see for example 'backward compatibility: legacy gitweb config support' section in %known_snapshot_format_aliases). There is no much problem with $minCacheTime, $maxCacheTime, $backgroundCache etc. (besides bleh, camelCase ;-) but I would prefer to have *boolean* $caching_enabled (true-ish value means cache is enabled, false-y value including undef means caching is disabled) to currently two-value *"enum"* $cache_enable called "binary" (sic!) in docs and comments. See also issue (6). About (6): I would prefer to move check if caching should be enabled higher, to gitweb.perl (to its own configure_caching() subroutine, rather than sprinkling it in top scope), and not even include 'cache.pm' if caching is disabled (default). But because we use 'do' and not 'require', we would have to check %INC to not include it twice... Anyway, whether we include 'cache.pm' conditionally or not, I'd prefer to use if ($caching_enabled) { cache_fetch($action); } else { $actions{$action}->(); } in dispatch() subroutine, and simply remove all code dealing with '$cache_enable == 0' case from 'cache.pm'. Remember, config variables are forever :-) > > I think the change in gitweb-cache v7 is small and safe enough that we > should fast-track it to give usability to the real world sites. It may be > a low-risk "obviously correct" approach that is quick-and-dirty, but that > is exactly why this should be fast-tracked. It does not touch the logic > or formatting in any way, just bypasses the page generation altogether > when it can clearly do so when it can tell the output cannot possibly be > incorrect (albeit sometimes it might be stale if in certain cases, e.g. it > is relative to HEAD). My rewrite also does the same. I have changed it to do the same STDOUT redirection that J.H. gitweb caching v7 does (I used select($fh) based capture) - work in progress, not yet pushed to my repository: git://repo.or.cz/git/jnareb-git.git http://repo.or.cz/w/git/jnareb-git.git (gitweb) > > I know you and others were aiming to split things up, but I think the > amount of the effort that is needed for that line of work on top of the > current codebase is not much different from what is needed on top of > gitweb-cache v7. Well, I'd have to add support for configuration variables used in this patch series, but with exception of $cache_enable -> $caching_enabled it wouldn't be much extra work. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCHv7.1 0/4] Gitweb caching v7.1 2010-10-31 9:21 ` Jakub Narebski @ 2010-11-01 10:24 ` Jakub Narebski 2010-11-12 23:35 ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski ` (3 subsequent siblings) 4 siblings, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin, Jakub Narebski This series is a bit fixed up and a tiny bit cleaned up version of "Gitweb caching v7" series from John 'Warthog9' Hawley. This series is based on top of 8ff76f4 (gitweb: Move call to evaluate_git_version after evaluate_gitweb_config, 2010-09-26) currently in 'pu'. This commit fixes a bug in gitweb which shows in running testsuite. The patch "gitweb: Add option to force version match" was removed from this series as it doesn't belong to it, c.f. http://article.gmane.org/gmane.comp.version-control.git/160236 [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb [PATCHv7.1 2/4] gitweb: add output buffering and associated functions [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski (2): gitweb: Prepare for splitting gitweb gitweb: Minimal testing of gitweb caching John 'Warthog9' Hawley (2): gitweb: add output buffering and associated functions gitweb: File based caching layer (from git.kernel.org) gitweb/Makefile | 20 ++- gitweb/gitweb.perl | 134 +++++++++++- gitweb/lib/cache.pl | 348 +++++++++++++++++++++++++++++ gitweb/static/gitweb.css | 6 + t/t9500-gitweb-standalone-no-errors.sh | 19 ++ t/t9502-gitweb-standalone-parse-output.sh | 39 ++++ 6 files changed, 555 insertions(+), 11 deletions(-) create mode 100644 gitweb/lib/cache.pl -- 1.7.3 ^ permalink raw reply [flat|nested] 43+ messages in thread
* [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 2010-11-01 10:24 ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski @ 2010-11-12 23:35 ` Jakub Narebski 2010-11-12 23:41 ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski ` (3 more replies) 0 siblings, 4 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-12 23:35 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin This series is a bit fixed up and a tiny bit cleaned up version of "Gitweb caching v7" series from John 'Warthog9' Hawley: http://thread.gmane.org/gmane.comp.version-control.git/160147 This series is based on top of 'next', because it contains 'jn/gitweb-test' branch. The difference from v7.2 is that it takes into account 'test-installed' target in gitweb/Makefile in first patch of its series, and that testing of caching support is slightly extended. Note that some of those tests fail currently, not because of error in gitweb caching code, but because I was not able to disable calling cacheWaitForUpdate(), which hinders testing. Those differences were also described in "Re: What's cooking in git.git (Nov 2010, #01; Tue, 9)": http://article.gmane.org/gmane.comp.version-control.git/161309 Table of contents: ================== [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb [PATCHv7.1 2/4] gitweb: add output buffering and associated functions [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Shortlog: ========= Jakub Narebski (2): gitweb: Prepare for splitting gitweb gitweb: Minimal testing of gitweb caching John 'Warthog9' Hawley (2): gitweb: add output buffering and associated functions gitweb: File based caching layer (from git.kernel.org) Diffstat: ========= gitweb/Makefile | 20 ++- gitweb/gitweb.perl | 134 +++++++++++- gitweb/lib/cache.pl | 348 +++++++++++++++++++++++++++++ gitweb/static/gitweb.css | 6 + t/gitweb-lib.sh | 15 ++ t/t9500-gitweb-standalone-no-errors.sh | 20 ++ t/t9501-gitweb-standalone-http-status.sh | 13 + t/t9502-gitweb-standalone-parse-output.sh | 33 +++ 8 files changed, 579 insertions(+), 10 deletions(-) create mode 100644 gitweb/lib/cache.pl -- 1.7.3 ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb 2010-11-12 23:35 ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski @ 2010-11-12 23:41 ` Jakub Narebski 2010-11-17 23:10 ` Junio C Hamano 2010-12-02 10:17 ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski 2010-11-12 23:44 ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski ` (2 subsequent siblings) 3 siblings, 2 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-12 23:41 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin Prepare gitweb for having been split into modules that are to be installed alongside gitweb in 'lib/' subdirectory, by adding use lib __DIR__.'/lib'; to gitweb.perl (to main gitweb script), and preparing for putting modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile. This preparatory work allows to add new module to gitweb by simply adding GITWEB_MODULES += <module> to gitweb/Makefile (assuming that the module is in 'gitweb/lib/' directory). While at it pass GITWEBLIBDIR in addition to GITWEB_TEST_INSTALLED to test instaleed version of gitweb and installed version of modules (for tests which check individual (sub)modules). Using __DIR__ from Dir::Self module (not in core, that's why currently gitweb includes excerpt of code from Dir::Self defining __DIR__) was chosen over using FindBin-based solution (in core since perl 5.00307, while gitweb itself requires at least perl 5.8.0) because FindBin uses BEGIN block, which is a problem under mod_perl and other persistent Perl environments (thought there are workarounds). At Pavan Kumar Sankara suggestion gitweb/Makefile uses install [OPTION]... SOURCE... DIRECTORY format (2nd format) with single SOURCE rather than install [OPTION]... SOURCE DEST format (1st format) because of security reasons (race conditions). Modern GNU install has `-T' / `--no-target-directory' option, but we cannot rely that the $(INSTALL) we are using supports this option. The install-modules target in gitweb/Makefile uses shell 'for' loop, instead of make's $(foreach) function, to avoid possible problem with generating a command line that exceeded the maximum argument list length. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- This version is rebased on top of 'next', which means that it takes 'jn/gitweb-test' into account, in particular 958a846 (gitweb/Makefile: Add 'test' and 'test-installed' targets, 2010-09-26). That means adding GITWEBLIBDIR for future testing of individual modules (it was not present in v7.1). gitweb/Makefile | 17 +++++++++++++++-- gitweb/gitweb.perl | 8 ++++++++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/gitweb/Makefile b/gitweb/Makefile index 0a6ac00..f9e32eb 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -57,6 +57,7 @@ PERL_PATH ?= /usr/bin/perl bindir_SQ = $(subst ','\'',$(bindir))#' gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#' gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#' +gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#' SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#' PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))#' DESTDIR_SQ = $(subst ','\'',$(DESTDIR))#' @@ -153,20 +154,32 @@ test: test-installed: GITWEB_TEST_INSTALLED='$(DESTDIR_SQ)$(gitwebdir_SQ)' \ + GITWEBLIBDIR='$(DESTDIR_SQ)$(gitweblibdir_SQ)' \ $(MAKE) -C ../t gitweb-test ### Installation rules -install: all +install: all install-modules $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' $(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' +install-modules: + install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \ + for dir in $$install_dirs; do \ + test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir' || \ + $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir'; \ + done + gitweb_modules="$(GITWEB_MODULES)" && \ + for mod in $$gitweb_modules; do \ + $(INSTALL) -m 644 lib/$$mod '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$(dirname $$mod)'; \ + done + ### Cleaning rules clean: $(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css GITWEB-BUILD-OPTIONS -.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE +.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 679f2da..cfa511c 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -10,6 +10,14 @@ use 5.008; use strict; use warnings; + +use File::Spec; +# __DIR__ is taken from Dir::Self __DIR__ fragment +sub __DIR__ () { + File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]); +} +use lib __DIR__ . '/lib'; + use CGI qw(:standard :escapeHTML -nosticky); use CGI::Util qw(unescape); use CGI::Carp qw(fatalsToBrowser set_message); -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb 2010-11-12 23:41 ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski @ 2010-11-17 23:10 ` Junio C Hamano 2010-11-18 13:37 ` Jakub Narebski 2010-12-02 10:17 ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski 1 sibling, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-11-17 23:10 UTC (permalink / raw) To: Jakub Narebski Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Jakub Narebski <jnareb@gmail.com> writes: > That means adding GITWEBLIBDIR for future testing of individual > modules (it was not present in v7.1). You confused me. "git grep" after test-applying this 4-patch series on top of next shows only one GITWEBLIBDIR, which means that the symbol is defined but nobody pays attention to it. > diff --git a/gitweb/Makefile b/gitweb/Makefile > index 0a6ac00..f9e32eb 100644 > --- a/gitweb/Makefile > +++ b/gitweb/Makefile > @@ -57,6 +57,7 @@ PERL_PATH ?= /usr/bin/perl > ... > -.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE > +.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE > In the earlier series, 478da52 (gitweb: Prepare for splitting gitweb, 2010-10-30) removed the trailing blank line from here but this does not do so anymore, which is a micronit regression. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb 2010-11-17 23:10 ` Junio C Hamano @ 2010-11-18 13:37 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-18 13:37 UTC (permalink / raw) To: Junio C Hamano Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin On Thu, 18 Nov 2010, Junio C Hamano wrote: > Jakub Narebski <jnareb@gmail.com> writes: > > > That means adding GITWEBLIBDIR for future testing of individual > > modules (it was not present in v7.1). > > You confused me. "git grep" after test-applying this 4-patch series on > top of next shows only one GITWEBLIBDIR, which means that the symbol is > defined but nobody pays attention to it. That's why I wrote "for _future_ testing"... but perhaps it is not worth introducing it now that it is not yet used by any test. > > diff --git a/gitweb/Makefile b/gitweb/Makefile > > index 0a6ac00..f9e32eb 100644 > > --- a/gitweb/Makefile > > +++ b/gitweb/Makefile > > @@ -57,6 +57,7 @@ PERL_PATH ?= /usr/bin/perl > > ... > > -.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE > > +.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE > > > > In the earlier series, 478da52 (gitweb: Prepare for splitting gitweb, > 2010-10-30) removed the trailing blank line from here but this does not do > so anymore, which is a micronit regression. I was not sure if such "while at it" style improvements are encouraged or not. Will remember it for v7.3. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb 2010-11-12 23:41 ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski 2010-11-17 23:10 ` Junio C Hamano @ 2010-12-02 10:17 ` Jakub Narebski 2010-12-02 17:37 ` Junio C Hamano 1 sibling, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-12-02 10:17 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin Prepare gitweb for having been split into modules that are to be installed alongside gitweb in 'lib/' subdirectory, by adding use lib __DIR__.'/lib'; to gitweb.perl (to main gitweb script), and preparing for putting modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile. This preparatory work allows to add new module to gitweb by simply adding GITWEB_MODULES += <module> to gitweb/Makefile (assuming that the module is in 'gitweb/lib/' directory). While at it pass GITWEBLIBDIR in addition to GITWEB_TEST_INSTALLED to allow testing installed version of gitweb and installed version of modules (for future tests which would check individual (sub)modules). Using __DIR__ from Dir::Self module (not in core, that's why currently gitweb includes excerpt of code from Dir::Self defining __DIR__) was chosen over using FindBin-based solution (in core since perl 5.00307, while gitweb itself requires at least perl 5.8.0) because FindBin uses BEGIN block, which is a problem under mod_perl and other persistent Perl environments (thought there are workarounds). At Pavan Kumar Sankara suggestion gitweb/Makefile uses install [OPTION]... SOURCE... DIRECTORY format (2nd format) with single SOURCE rather than install [OPTION]... SOURCE DEST format (1st format) because of security reasons (race conditions). Modern GNU install has `-T' / `--no-target-directory' option, but we cannot rely that the $(INSTALL) we are using supports this option. The install-modules target in gitweb/Makefile uses shell 'for' loop, instead of make's $(foreach) function, to avoid possible problem with generating a command line that exceeded the maximum argument list length. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- BUGFIX: shell variables should not be in single quotes I am very sorry about this bug... gitweb/Makefile | 18 +++++++++++++++--- gitweb/gitweb.perl | 8 ++++++++ 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/gitweb/Makefile b/gitweb/Makefile index 0a6ac00..ad5a282 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -57,6 +57,7 @@ PERL_PATH ?= /usr/bin/perl bindir_SQ = $(subst ','\'',$(bindir))#' gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#' gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#' +gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#' SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#' PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))#' DESTDIR_SQ = $(subst ','\'',$(DESTDIR))#' @@ -153,20 +154,31 @@ test: test-installed: GITWEB_TEST_INSTALLED='$(DESTDIR_SQ)$(gitwebdir_SQ)' \ + GITWEBLIBDIR='$(DESTDIR_SQ)$(gitweblibdir_SQ)' \ $(MAKE) -C ../t gitweb-test ### Installation rules -install: all +install: all install-modules $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' $(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' +install-modules: + install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \ + for dir in $$install_dirs; do \ + test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)'/"$$dir" || \ + $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)'/"$$dir"; \ + done + gitweb_modules="$(GITWEB_MODULES)" && \ + for mod in $$gitweb_modules; do \ + $(INSTALL) -m 644 "lib/$$mod" '$(DESTDIR_SQ)$(gitweblibdir_SQ)'/"$$(dirname $$mod)"; \ + done + ### Cleaning rules clean: $(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css GITWEB-BUILD-OPTIONS -.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE - +.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 679f2da..cfa511c 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -10,6 +10,14 @@ use 5.008; use strict; use warnings; + +use File::Spec; +# __DIR__ is taken from Dir::Self __DIR__ fragment +sub __DIR__ () { + File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]); +} +use lib __DIR__ . '/lib'; + use CGI qw(:standard :escapeHTML -nosticky); use CGI::Util qw(unescape); use CGI::Carp qw(fatalsToBrowser set_message); -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb 2010-12-02 10:17 ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski @ 2010-12-02 17:37 ` Junio C Hamano 2010-12-02 19:01 ` Jakub Narebski 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-12-02 17:37 UTC (permalink / raw) To: Jakub Narebski Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Hmm, how did you find the issue, and more importantly, how did I or other people who saw this patch so easily fail to notice it? FWIW, I do run "make install" from the toplevel as part of my pre-push test, so I _should_ have noticed it. Ah, I don't run the install step for a revision that does not pass its selftest, so I haven't run "make install" on 'pu' for some time. That may explain it. Anyway thanks for a fixup. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb 2010-12-02 17:37 ` Junio C Hamano @ 2010-12-02 19:01 ` Jakub Narebski 2010-12-02 19:21 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-12-02 19:01 UTC (permalink / raw) To: Junio C Hamano Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin On Thu, 2 Dec 2010, Junio C Hamano wrote: > Hmm, how did you find the issue, and more importantly, how did I or other > people who saw this patch so easily fail to notice it? When working on my total rewrite of J.H. gitweb caching series, available as 'gitweb/cache-kernel-pu' branch in git://repo.or.cz/git/jnareb-git.git and git://github.com/jnareb/git.git repositories I finally did a *clean* reinstal (i.e. remove whole directory, then run "make install-gitweb"). This branch uses the same "gitweb: Prepare for splitting gitweb" patch (cherry-picked back and forth). > FWIW, I do run "make install" from the toplevel as part of my pre-push > test, so I _should_ have noticed it. > > Ah, I don't run the install step for a revision that does not pass its > selftest, so I haven't run "make install" on 'pu' for some time. That may > explain it. Hmmm... I thought that "make install" doesn't install gitweb, but it does with "$(MAKE) -C gitweb install"... though I am not sure if "make all" builds gitweb (runs "make gitweb"). > Anyway thanks for a fixup. > Sorry for the bug. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb 2010-12-02 19:01 ` Jakub Narebski @ 2010-12-02 19:21 ` Junio C Hamano 2010-12-02 19:36 ` Jakub Narebski 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-12-02 19:21 UTC (permalink / raw) To: Jakub Narebski Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Jakub Narebski <jnareb@gmail.com> writes: >> Ah, I don't run the install step for a revision that does not pass its >> selftest, so I haven't run "make install" on 'pu' for some time. That may >> explain it. > > Hmmm... I thought that "make install" doesn't install gitweb, but it does > with "$(MAKE) -C gitweb install"... though I am not sure if "make all" > builds gitweb (runs "make gitweb"). I think it does, and it should if it doesn't. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb 2010-12-02 19:21 ` Junio C Hamano @ 2010-12-02 19:36 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-12-02 19:36 UTC (permalink / raw) To: Junio C Hamano Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Junio C Hamano wrote: > Jakub Narebski <jnareb@gmail.com> writes: > > > > Ah, I don't run the install step for a revision that does not pass its > > > selftest, so I haven't run "make install" on 'pu' for some time. That may > > > explain it. > > > > Hmmm... I thought that "make install" doesn't install gitweb, but it does > > with "$(MAKE) -C gitweb install"... though I am not sure if "make all" > > builds gitweb (runs "make gitweb"). > > I think it does, and it should if it doesn't. Anyway "install" target in gitweb/Makefile runs "all" target in gitweb/Makefile, so "make install" -> "make -C gitweb install" -> "make -C gitweb all" -> gitweb/gitweb.cgi is generated gitweb/gitweb.cgi is in $(OTHER_PROGRAMS), and we have all:: [...] $(OTHER_PROGRAMS) GIT-BUILD-OPTIONS Hmmm... shouldn't it be 'gitweb', not 'gitweb/gitweb.cgi'? Just wondering (we have gitweb/gitweb.cgi target in main Makefile, which just proxies to gitweb/Makefile). -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCHv7.1 2/4] gitweb: add output buffering and associated functions 2010-11-12 23:35 ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski 2010-11-12 23:41 ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski @ 2010-11-12 23:44 ` Jakub Narebski 2010-11-12 23:56 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski 2010-11-13 0:01 ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski 3 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-12 23:44 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin From: John 'Warthog9' Hawley <warthog9@eaglescrag.net> This adds output buffering for gitweb, mainly in preparation for caching support. This is a dramatic change to how caching was being done, mainly in passing around the variable manually and such. This centrally flips the entire STDOUT to a variable, which after the completion of the run, flips it back and does a print on the resulting data. This should save on the previous 10K line patch (or so) that adds more explicit output passing. [jn: modified reset_output to silence 'gitweb.perl: Name "main::STDOUT_REAL" used only once: possible typo at ../gitweb/gitweb.perl line 1130.' warning] Signed-off-by: Junio C Hamano <gitster@pobox.com> --- This is unchanged from previous version. Reminder: it uses open STDOUT, ">&", \*STDOUT_REAL; rather than open(STDOUT,">&STDOUT_REAL"); to silence spurious (invalid) warning gitweb.perl: Name "main::STDOUT_REAL" used only once: possible typo at ../gitweb/gitweb.perl line 1130. gitweb/gitweb.perl | 29 +++++++++++++++++++++++++++++ 1 files changed, 29 insertions(+), 0 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index cfa511c..cae0e34 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -39,6 +39,9 @@ BEGIN { our $version = "++GIT_VERSION++"; +# Output buffer variable +our $output = ""; + our ($my_url, $my_uri, $base_url, $path_info, $home_link); sub evaluate_uri { our $cgi; @@ -1134,6 +1137,25 @@ sub evaluate_argv { ); } +sub change_output { + our $output; + + # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such + open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + + # Close STDOUT, so that it isn't being used anymore. + close STDOUT; + + # Trap STDOUT to the $output variable, which is what I was using in the original + # patch anyway. + open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var +} + +sub reset_output { + # This basically takes STDOUT_REAL and puts it back as STDOUT + open STDOUT, ">&", \*STDOUT_REAL; +} + sub run { evaluate_argv(); @@ -1145,7 +1167,10 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; + change_output(); run_request(); + reset_output(); + print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3655,6 +3680,10 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # Reset the output so that we are actually going to STDOUT as opposed + # to buffering the output. + reset_output(); + git_header_html($http_responses{$status}, undef, %opts); print <<EOF; <div class="page_body"> -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) 2010-11-12 23:35 ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski 2010-11-12 23:41 ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski 2010-11-12 23:44 ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski @ 2010-11-12 23:56 ` Jakub Narebski 2010-11-28 11:22 ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski 2010-11-29 23:07 ` [PATCHv7.1 3/4] " demerphq 2010-11-13 0:01 ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski 3 siblings, 2 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-12 23:56 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin From: John 'Warthog9' Hawley <warthog9@eaglescrag.net> This is a relatively large patch that implements the file based caching layer that is quite similar to the one used on such large sites as kernel.org and soon git.fedoraproject.org. This provides a simple, and straight forward caching mechanism that scales dramatically better than Gitweb by itself. The caching layer basically buffers the output that Gitweb would normally return, and saves that output to a cache file on the local disk. When the file is requested it attempts to gain a shared lock on the cache file and cat it out to the client. Should an exclusive lock be on a file (it's being updated) the code has a choice to either update in the background and go ahead and show the stale page while update is being performed, or stall the client(s) until the page is generated. There are two forms of stalling involved here, background building and non-background building, both of which are discussed in the configuration page. There are still a few known "issues" with respect to this: - Code needs to be added to be "browser" aware so that clients like wget that are trying to get a binary blob don't obtain a "Generating..." page Caching is disabled by default. You can turn it on by setting $caching_enabled variable to true to enable file based caching. [jn: added error checking to loading 'cache.pl'; moved check for $caching_enabled outside out of cache_fetch, which required update to die_error()] Signed-off-by: Junio C Hamano <gitster@pobox.com> --- This patch is unchanged from previous version. Reminder: the main difference from J.H. patch was moving check for $caching_enabled out of cache_fetch, so that when caching is disabled then gitweb works (almost) the same as before this patch. Also, error checking for loading 'cache.pl' file. Interdiff in previous series. gitweb/Makefile | 3 + gitweb/gitweb.perl | 105 ++++++++++++-- gitweb/lib/cache.pl | 348 ++++++++++++++++++++++++++++++++++++++++++++++ gitweb/static/gitweb.css | 6 + 4 files changed, 450 insertions(+), 12 deletions(-) create mode 100644 gitweb/lib/cache.pl diff --git a/gitweb/Makefile b/gitweb/Makefile index f9e32eb..6ddd4f1 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -113,6 +113,9 @@ endif GITWEB_FILES += static/git-logo.png static/git-favicon.png +# Gitweb caching +GITWEB_MODULES += cache.pl + GITWEB_REPLACE = \ -e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \ -e 's|++GIT_BINDIR++|$(bindir)|g' \ diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index cae0e34..3c3ff08 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -250,6 +250,53 @@ our %avatar_size = ( # Leave it undefined (or set to 'undef') to turn off load checking. our $maxload = 300; +# This enables/disables the caching layer in gitweb. This currently only supports the +# 'dumb' file based caching layer, primarily used on git.kernel.org. this is reasonably +# effective but it has the downside of requiring a huge amount of disk space if there +# are a number of repositories involved. It is not uncommon for git.kernel.org to have +# on the order of 80G - 120G accumulate over the course of a few months. It is recommended +# that the cache directory be periodically completely deleted, and this is safe to perform. +# Suggested mechanism +# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush +our $caching_enabled = 0; + +# Used to set the minimum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to be under this number of seconds we set the cache timeout +# to this minimum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $minCacheTime = 20; + +# Used to set the maximum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to exceed this number of seconds we set the cache timeout +# to this maximum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $maxCacheTime = 1200; + +# If you need to change the location of the caching directory, override this +# otherwise this will probably do fine for you +our $cachedir = 'cache'; + +# If this is set (to 1) cache will do it's best to always display something instead +# of making someone wait for the cache to update. This will launch the cacheUpdate +# into the background and it will lock a <file>.bg file and will only lock the +# actual cache file when it needs to write into it. In theory this will make +# gitweb seem more responsive at the price of possibly stale data. +our $backgroundCache = 1; + +# Used to set the maximum cache file life. If a cache files last modify time exceeds +# this value, it will assume that the data is just too old, and HAS to be regenerated +# instead of trying to display the existing cache data. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +# 18000 = 5 hours +our $maxCacheLife = 18000; + +# Used to enable or disable background forking of the gitweb caching. Mainly here for debugging purposes +our $cacheDoFork = 1; + +our $fullhashpath = *STDOUT; +our $fullhashbinpath = *STDOUT; +our $fullhashbinpathfinal = *STDOUT; + # configuration for 'highlight' (http://www.andre-simon.de/) # match by basename our %highlight_basename = ( @@ -506,6 +553,15 @@ our %feature = ( 'default' => [0]}, ); +# +# Includes +# +if (!exists $INC{'cache.pl'}) { + my $return = do 'cache.pl'; + die $@ if $@; + die "Couldn't read 'cache.pl': $!" if (!defined $return); +} + sub gitweb_get_feature { my ($name) = @_; return unless exists $feature{$name}; @@ -734,6 +790,10 @@ our %actions = ( "project_list" => \&git_project_list, "project_index" => \&git_project_index, ); +sub is_cacheable { + my $action = shift; + return !($action eq 'blame_data' || $action eq 'blame_incremental'); +} # finally, we have the hash of allowed extra_options for the commands that # allow them @@ -1072,7 +1132,11 @@ sub dispatch { !$project) { die_error(400, "Project needed"); } - $actions{$action}->(); + if ($caching_enabled && is_cacheable($action)) { + cache_fetch($action); + } else { + $actions{$action}->(); + } } sub reset_timer { @@ -1142,6 +1206,7 @@ sub change_output { # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + print STDOUT_REAL ""; # Close STDOUT, so that it isn't being used anymore. close STDOUT; @@ -1167,10 +1232,7 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; - change_output(); run_request(); - reset_output(); - print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3447,7 +3509,8 @@ sub git_header_html { # support xhtml+xml but choking when it gets what it asked for. if (defined $cgi->http('HTTP_ACCEPT') && $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ && - $cgi->Accept('application/xhtml+xml') != 0) { + $cgi->Accept('application/xhtml+xml') != 0 && + !$caching_enabled) { $content_type = 'application/xhtml+xml'; } else { $content_type = 'text/html'; @@ -3592,6 +3655,7 @@ sub git_footer_html { my $feed_class = 'rss_logo'; print "<div class=\"page_footer\">\n"; + print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n"; if (defined $project) { my $descr = git_get_project_description($project); if (defined $descr) { @@ -3680,9 +3744,14 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # The output handlers for die_error need to be reset to STDOUT + # so that half the message isn't being output to random and + # half to STDOUT as expected. This is mainly for the benefit + # of using git_header_html() and git_footer_html() since + # # Reset the output so that we are actually going to STDOUT as opposed # to buffering the output. - reset_output(); + reset_output() if ($caching_enabled); git_header_html($http_responses{$status}, undef, %opts); print <<EOF; @@ -5592,9 +5661,15 @@ sub git_blob_plain { ($sandbox ? 'attachment' : 'inline') . '; filename="' . $save_as . '"'); local $/ = undef; - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } @@ -5879,9 +5954,15 @@ sub git_snapshot { open my $fd, "-|", $cmd or die_error(500, "Execute git-archive failed"); - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl new file mode 100644 index 0000000..dd14bfb --- /dev/null +++ b/gitweb/lib/cache.pl @@ -0,0 +1,348 @@ +# gitweb - simple web interface to track changes in git repositories +# +# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net> +# +# This program is licensed under the GPLv2 + +# +# Gitweb caching engine +# + +#use File::Path qw(make_path remove_tree); +use File::Path qw(mkpath rmtree); # Used for compatability reasons +use Digest::MD5 qw(md5 md5_hex md5_base64); +use Fcntl ':flock'; +use File::Copy; + +sub cache_fetch { + my ($action) = @_; + my $cacheTime = 0; + + if(! -d $cachedir){ + print "*** Warning ***: Caching enabled but cache directory does not exsist. ($cachedir)\n"; + mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create"; + print "Cache directory created successfully\n"; + } + + our $full_url = "$my_url?". $ENV{'QUERY_STRING'}; + our $urlhash = md5_hex($full_url); + our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/"; + + eval { mkpath( $fullhashdir, 0, 0777 ) }; + if ($@) { + die_error(500, "Internal Server Error", "Could not create cache directory: $@"); + } + $fullhashpath = "$fullhashdir/". substr( $urlhash, 2 ); + $fullhashbinpath = "$fullhashpath.bin.wt"; + $fullhashbinpathfinal = "$fullhashpath.bin"; + + if(! -e "$fullhashpath" ){ + if(! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + cacheUpdate($action,1); + }else{ + cacheWaitForUpdate($action); + } + }else{ + #if cache is out dated, update + #else displayCache(); + open(cacheFile, '<', "$fullhashpath"); + stat(cacheFile); + close(cacheFile); + my $stat_time = (stat(_))[9]; + my $stat_size = (stat(_))[7]; + + $cacheTime = get_loadavg() * 60; + if( $cacheTime > $maxCacheTime ){ + $cacheTime = $maxCacheTime; + } + if( $cacheTime < $minCacheTime ){ + $cacheTime = $minCacheTime; + } + if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){ + if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + #print "Running updater\n"; + cacheUpdate($action,1); + }else{ + #print "Waiting for update\n"; + cacheWaitForUpdate($action); + } + } else { + cacheDisplay($action); + } + + + } + + # + # If all of the caching failes - lets go ahead and press on without it and fall back to 'default' + # non-caching behavior. This is the softest of the failure conditions. + # + #$actions{$action}->(); +} + +sub cacheUpdate { + my ($action,$areForked) = @_; + my $lockingStatus; + my $fileData = ""; + + if($backgroundCache){ + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStatBG; + }else{ + open(cacheFile, '>:utf8', \$fullhashpath); + my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStat; + } + #print "lock status: $lockStat\n"; + + + if (! $lockStatus ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath"); + my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB); + } + + # Trap all output from the action + change_output(); + + $actions{$action}->(); + + # Reset the outputs as we should be fine now + reset_output(); + + + if($backgroundCache){ + open(cacheFile, '>:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_EX); + + if (! $lockStat ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX); + + if (! $lockStatBIN ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + # Actually dump the output to the proper file handler + local $/ = undef; + $|++; + print cacheFile "$output"; + $|--; + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!"; + + flock(cacheFileBinFINAL,LOCK_UN); + close(cacheFileBinFINAL); + + flock(cacheFileBinWT,LOCK_UN); + close(cacheFileBinWT); + } + + flock(cacheFile,LOCK_UN); + close(cacheFile); + + if($backgroundCache){ + flock(cacheFileBG,LOCK_UN); + close(cacheFileBG); + } + + if ( $areForked ){ + exit(0); + } else { + return; + } +} + + +sub cacheWaitForUpdate { + my ($action) = @_; + my $x = 0; + my $max = 10; + my $lockStat = 0; + + if( $backgroundCache ){ + if( -e "$fullhashpath" ){ + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + stat(cacheFile); + close(cacheFile); + + if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){ + cacheDisplay($action); + return; + } + } + } + + if( + $action eq "atom" + || + $action eq "rss" + || + $action eq "opml" + ){ + do { + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + + if( $x != $max ){ + cacheDisplay($action); + } + return; + } + + $| = 1; + + print $::cgi->header( + -type=>'text/html', + -charset => 'utf-8', + -status=> 200, + -expires => 'now', + # HTTP/1.0 + -Pragma => 'no-cache', + # HTTP/1.1 + -Cache_Control => join( + ', ', + qw( + private + no-cache + no-store + must-revalidate + max-age=0 + pre-check=0 + post-check=0 + ) + ) + ); + + print <<EOF; +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd"> +<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> --> +<!-- git core binaries version $git_version --> +<head> +<meta http-equiv="content-type" content="$content_type; charset=utf-8"/> +<meta name="generator" content="gitweb/$version git/$git_version"/> +<meta name="robots" content="index, nofollow"/> +<meta http-equiv="refresh" content="0"/> +<title>$title</title> +</head> +<body> +EOF + + print "Generating.."; + do { + print "."; + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + print <<EOF; +</body> +</html> +EOF + return; +} + +sub cacheDisplay { + local $/ = undef; + $|++; + + my ($action) = @_; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + + if (! $lockStat ){ + close(cacheFile); + cacheWaitForUpdate($action); + } + + if( + ( + $action eq "snapshot" + || + $action eq "blob_plain" + ) + ){ + my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB); + if (! $lockStatBIN ){ + system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log"); + close(cacheFile); + close(cacheFileBin); + cacheWaitForUpdate($action); + } + + my $binfilesize = -s "$fullhashbinpathfinal"; + print "Content-Length: $binfilesize"; + } + while( <cacheFile> ){ + print $_; + } + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + binmode STDOUT, ':raw'; + print <cacheFileBin>; + binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close(cacheFileBin); + } + close(cacheFile); + $|--; +} + +1; +__END__ diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css index 4132aab..972d32e 100644 --- a/gitweb/static/gitweb.css +++ b/gitweb/static/gitweb.css @@ -67,6 +67,12 @@ div.page_path { border-width: 0px 0px 1px; } +div.cachetime { + float: left; + margin-right: 10px; + color: #555555; +} + div.page_footer { height: 17px; padding: 4px 8px; -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org) 2010-11-12 23:56 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski @ 2010-11-28 11:22 ` Jakub Narebski 2010-11-28 11:31 ` Jakub Narebski 2010-11-29 23:07 ` [PATCHv7.1 3/4] " demerphq 1 sibling, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-11-28 11:22 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin From: John 'Warthog9' Hawley <warthog9@eaglescrag.net> This is a relatively large patch that implements the file based caching layer that is quite similar to the one used on such large sites as kernel.org and soon git.fedoraproject.org. This provides a simple, and straight forward caching mechanism that scales dramatically better than Gitweb by itself. The caching layer basically buffers the output that Gitweb would normally return, and saves that output to a cache file on the local disk. When the file is requested it attempts to gain a shared lock on the cache file and cat it out to the client. Should an exclusive lock be on a file (it's being updated) the code has a choice to either update in the background and go ahead and show the stale page while update is being performed, or stall the client(s) until the page is generated. There are two forms of stalling involved here, background building and non-background building, both of which are discussed in the configuration page. There are still a few known "issues" with respect to this: - Code needs to be added to be "browser" aware so that clients like wget that are trying to get a binary blob don't obtain a "Generating..." page Caching is disabled by default. You can turn it on by setting $caching_enabled variable to true to enable file based caching. [jn: added error checking to loading 'cache.pl'; moved check for $caching_enabled outside out of cache_fetch, which required update to die_error()] Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- On Sat, 13 Nov 2010, Jakub Narebski wrote: > +sub cacheUpdate { > + my ($action,$areForked) = @_; > + my $lockingStatus; > + my $fileData = ""; > + > + if($backgroundCache){ > + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); > + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); > + > + $lockStatus = $lockStatBG; > + }else{ > + open(cacheFile, '>:utf8', \$fullhashpath); ^^^^^^^^^^^^^^ I have made mistake with this line when moving $caching_enabled check out of cache_fetch to its caller. Reusing $fullhashpath variable as a *capture buffer* (it has nothing to do with path; it is not a filename no longer) wouldn't help there. > + my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); A note about original code: you can't flock in-memory files. > + > + $lockStatus = $lockStat; > + } > + #print "lock status: $lockStat\n"; [...] gitweb/Makefile | 3 + gitweb/gitweb.perl | 105 ++++++++++++-- gitweb/lib/cache.pl | 348 ++++++++++++++++++++++++++++++++++++++++++++++ gitweb/static/gitweb.css | 6 + 4 files changed, 450 insertions(+), 12 deletions(-) create mode 100644 gitweb/lib/cache.pl diff --git a/gitweb/Makefile b/gitweb/Makefile index f9e32eb..6ddd4f1 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -113,6 +113,9 @@ endif GITWEB_FILES += static/git-logo.png static/git-favicon.png +# Gitweb caching +GITWEB_MODULES += cache.pl + GITWEB_REPLACE = \ -e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \ -e 's|++GIT_BINDIR++|$(bindir)|g' \ diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index cae0e34..3c3ff08 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -250,6 +250,53 @@ our %avatar_size = ( # Leave it undefined (or set to 'undef') to turn off load checking. our $maxload = 300; +# This enables/disables the caching layer in gitweb. This currently only supports the +# 'dumb' file based caching layer, primarily used on git.kernel.org. this is reasonably +# effective but it has the downside of requiring a huge amount of disk space if there +# are a number of repositories involved. It is not uncommon for git.kernel.org to have +# on the order of 80G - 120G accumulate over the course of a few months. It is recommended +# that the cache directory be periodically completely deleted, and this is safe to perform. +# Suggested mechanism +# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush +our $caching_enabled = 0; + +# Used to set the minimum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to be under this number of seconds we set the cache timeout +# to this minimum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $minCacheTime = 20; + +# Used to set the maximum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to exceed this number of seconds we set the cache timeout +# to this maximum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $maxCacheTime = 1200; + +# If you need to change the location of the caching directory, override this +# otherwise this will probably do fine for you +our $cachedir = 'cache'; + +# If this is set (to 1) cache will do it's best to always display something instead +# of making someone wait for the cache to update. This will launch the cacheUpdate +# into the background and it will lock a <file>.bg file and will only lock the +# actual cache file when it needs to write into it. In theory this will make +# gitweb seem more responsive at the price of possibly stale data. +our $backgroundCache = 1; + +# Used to set the maximum cache file life. If a cache files last modify time exceeds +# this value, it will assume that the data is just too old, and HAS to be regenerated +# instead of trying to display the existing cache data. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +# 18000 = 5 hours +our $maxCacheLife = 18000; + +# Used to enable or disable background forking of the gitweb caching. Mainly here for debugging purposes +our $cacheDoFork = 1; + +our $fullhashpath = *STDOUT; +our $fullhashbinpath = *STDOUT; +our $fullhashbinpathfinal = *STDOUT; + # configuration for 'highlight' (http://www.andre-simon.de/) # match by basename our %highlight_basename = ( @@ -506,6 +553,15 @@ our %feature = ( 'default' => [0]}, ); +# +# Includes +# +if (!exists $INC{'cache.pl'}) { + my $return = do 'cache.pl'; + die $@ if $@; + die "Couldn't read 'cache.pl': $!" if (!defined $return); +} + sub gitweb_get_feature { my ($name) = @_; return unless exists $feature{$name}; @@ -734,6 +790,10 @@ our %actions = ( "project_list" => \&git_project_list, "project_index" => \&git_project_index, ); +sub is_cacheable { + my $action = shift; + return !($action eq 'blame_data' || $action eq 'blame_incremental'); +} # finally, we have the hash of allowed extra_options for the commands that # allow them @@ -1072,7 +1132,11 @@ sub dispatch { !$project) { die_error(400, "Project needed"); } - $actions{$action}->(); + if ($caching_enabled && is_cacheable($action)) { + cache_fetch($action); + } else { + $actions{$action}->(); + } } sub reset_timer { @@ -1142,6 +1206,7 @@ sub change_output { # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + print STDOUT_REAL ""; # Close STDOUT, so that it isn't being used anymore. close STDOUT; @@ -1167,10 +1232,7 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; - change_output(); run_request(); - reset_output(); - print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3447,7 +3509,8 @@ sub git_header_html { # support xhtml+xml but choking when it gets what it asked for. if (defined $cgi->http('HTTP_ACCEPT') && $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ && - $cgi->Accept('application/xhtml+xml') != 0) { + $cgi->Accept('application/xhtml+xml') != 0 && + !$caching_enabled) { $content_type = 'application/xhtml+xml'; } else { $content_type = 'text/html'; @@ -3592,6 +3655,7 @@ sub git_footer_html { my $feed_class = 'rss_logo'; print "<div class=\"page_footer\">\n"; + print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n"; if (defined $project) { my $descr = git_get_project_description($project); if (defined $descr) { @@ -3680,9 +3744,14 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # The output handlers for die_error need to be reset to STDOUT + # so that half the message isn't being output to random and + # half to STDOUT as expected. This is mainly for the benefit + # of using git_header_html() and git_footer_html() since + # # Reset the output so that we are actually going to STDOUT as opposed # to buffering the output. - reset_output(); + reset_output() if ($caching_enabled); git_header_html($http_responses{$status}, undef, %opts); print <<EOF; @@ -5592,9 +5661,15 @@ sub git_blob_plain { ($sandbox ? 'attachment' : 'inline') . '; filename="' . $save_as . '"'); local $/ = undef; - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } @@ -5879,9 +5954,15 @@ sub git_snapshot { open my $fd, "-|", $cmd or die_error(500, "Execute git-archive failed"); - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl new file mode 100644 index 0000000..dd14bfb --- /dev/null +++ b/gitweb/lib/cache.pl @@ -0,0 +1,348 @@ +# gitweb - simple web interface to track changes in git repositories +# +# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net> +# +# This program is licensed under the GPLv2 + +# +# Gitweb caching engine +# + +#use File::Path qw(make_path remove_tree); +use File::Path qw(mkpath rmtree); # Used for compatability reasons +use Digest::MD5 qw(md5 md5_hex md5_base64); +use Fcntl ':flock'; +use File::Copy; + +sub cache_fetch { + my ($action) = @_; + my $cacheTime = 0; + + if(! -d $cachedir){ + print "*** Warning ***: Caching enabled but cache directory does not exsist. ($cachedir)\n"; + mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create"; + print "Cache directory created successfully\n"; + } + + our $full_url = "$my_url?". $ENV{'QUERY_STRING'}; + our $urlhash = md5_hex($full_url); + our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/"; + + eval { mkpath( $fullhashdir, 0, 0777 ) }; + if ($@) { + die_error(500, "Internal Server Error", "Could not create cache directory: $@"); + } + $fullhashpath = "$fullhashdir/". substr( $urlhash, 2 ); + $fullhashbinpath = "$fullhashpath.bin.wt"; + $fullhashbinpathfinal = "$fullhashpath.bin"; + + if(! -e "$fullhashpath" ){ + if(! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + cacheUpdate($action,1); + }else{ + cacheWaitForUpdate($action); + } + }else{ + #if cache is out dated, update + #else displayCache(); + open(cacheFile, '<', "$fullhashpath"); + stat(cacheFile); + close(cacheFile); + my $stat_time = (stat(_))[9]; + my $stat_size = (stat(_))[7]; + + $cacheTime = get_loadavg() * 60; + if( $cacheTime > $maxCacheTime ){ + $cacheTime = $maxCacheTime; + } + if( $cacheTime < $minCacheTime ){ + $cacheTime = $minCacheTime; + } + if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){ + if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + #print "Running updater\n"; + cacheUpdate($action,1); + }else{ + #print "Waiting for update\n"; + cacheWaitForUpdate($action); + } + } else { + cacheDisplay($action); + } + + + } + + # + # If all of the caching failes - lets go ahead and press on without it and fall back to 'default' + # non-caching behavior. This is the softest of the failure conditions. + # + #$actions{$action}->(); +} + +sub cacheUpdate { + my ($action,$areForked) = @_; + my $lockingStatus; + my $fileData = ""; + + if($backgroundCache){ + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStatBG; + }else{ + open(cacheFile, '>:utf8', \$fullhashpath); + my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStat; + } + #print "lock status: $lockStat\n"; + + + if (! $lockStatus ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath"); + my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB); + } + + # Trap all output from the action + change_output(); + + $actions{$action}->(); + + # Reset the outputs as we should be fine now + reset_output(); + + + if($backgroundCache){ + open(cacheFile, '>:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_EX); + + if (! $lockStat ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX); + + if (! $lockStatBIN ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + # Actually dump the output to the proper file handler + local $/ = undef; + $|++; + print cacheFile "$output"; + $|--; + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!"; + + flock(cacheFileBinFINAL,LOCK_UN); + close(cacheFileBinFINAL); + + flock(cacheFileBinWT,LOCK_UN); + close(cacheFileBinWT); + } + + flock(cacheFile,LOCK_UN); + close(cacheFile); + + if($backgroundCache){ + flock(cacheFileBG,LOCK_UN); + close(cacheFileBG); + } + + if ( $areForked ){ + exit(0); + } else { + return; + } +} + + +sub cacheWaitForUpdate { + my ($action) = @_; + my $x = 0; + my $max = 10; + my $lockStat = 0; + + if( $backgroundCache ){ + if( -e "$fullhashpath" ){ + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + stat(cacheFile); + close(cacheFile); + + if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){ + cacheDisplay($action); + return; + } + } + } + + if( + $action eq "atom" + || + $action eq "rss" + || + $action eq "opml" + ){ + do { + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + + if( $x != $max ){ + cacheDisplay($action); + } + return; + } + + $| = 1; + + print $::cgi->header( + -type=>'text/html', + -charset => 'utf-8', + -status=> 200, + -expires => 'now', + # HTTP/1.0 + -Pragma => 'no-cache', + # HTTP/1.1 + -Cache_Control => join( + ', ', + qw( + private + no-cache + no-store + must-revalidate + max-age=0 + pre-check=0 + post-check=0 + ) + ) + ); + + print <<EOF; +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd"> +<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> --> +<!-- git core binaries version $git_version --> +<head> +<meta http-equiv="content-type" content="$content_type; charset=utf-8"/> +<meta name="generator" content="gitweb/$version git/$git_version"/> +<meta name="robots" content="index, nofollow"/> +<meta http-equiv="refresh" content="0"/> +<title>$title</title> +</head> +<body> +EOF + + print "Generating.."; + do { + print "."; + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + print <<EOF; +</body> +</html> +EOF + return; +} + +sub cacheDisplay { + local $/ = undef; + $|++; + + my ($action) = @_; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + + if (! $lockStat ){ + close(cacheFile); + cacheWaitForUpdate($action); + } + + if( + ( + $action eq "snapshot" + || + $action eq "blob_plain" + ) + ){ + my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB); + if (! $lockStatBIN ){ + system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log"); + close(cacheFile); + close(cacheFileBin); + cacheWaitForUpdate($action); + } + + my $binfilesize = -s "$fullhashbinpathfinal"; + print "Content-Length: $binfilesize"; + } + while( <cacheFile> ){ + print $_; + } + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + binmode STDOUT, ':raw'; + print <cacheFileBin>; + binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close(cacheFileBin); + } + close(cacheFile); + $|--; +} + +1; +__END__ diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css index 4132aab..972d32e 100644 --- a/gitweb/static/gitweb.css +++ b/gitweb/static/gitweb.css @@ -67,6 +67,12 @@ div.page_path { border-width: 0px 0px 1px; } +div.cachetime { + float: left; + margin-right: 10px; + color: #555555; +} + div.page_footer { height: 17px; padding: 4px 8px; -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org) 2010-11-28 11:22 ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski @ 2010-11-28 11:31 ` Jakub Narebski 2010-11-29 22:13 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-11-28 11:31 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin On Sun, 28 Nov 2010, Jakub Narebski wrote: > On Sat, 13 Nov 2010, Jakub Narebski wrote: > > > +sub cacheUpdate { > > + my ($action,$areForked) = @_; > > + my $lockingStatus; > > + my $fileData = ""; > > + > > + if($backgroundCache){ > > + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); > > + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); > > + > > + $lockStatus = $lockStatBG; > > + }else{ > > + open(cacheFile, '>:utf8', \$fullhashpath); > ^^^^^^^^^^^^^^ > > I have made mistake with this line when moving $caching_enabled check > out of cache_fetch to its caller. > > Reusing $fullhashpath variable as a *capture buffer* (it has nothing > to do with path; it is not a filename no longer) wouldn't help there. Errr... I meant that this abuse didn't help avoiding my mistake. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org) 2010-11-28 11:31 ` Jakub Narebski @ 2010-11-29 22:13 ` Junio C Hamano 2010-11-29 22:20 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-11-29 22:13 UTC (permalink / raw) To: Jakub Narebski Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Jakub Narebski <jnareb@gmail.com> writes: >> I have made mistake with this line when moving $caching_enabled check >> out of cache_fetch to its caller. >> >> Reusing $fullhashpath variable as a *capture buffer* (it has nothing >> to do with path; it is not a filename no longer) wouldn't help there. > > Errr... I meant that this abuse didn't help avoiding my mistake. Thanks. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org) 2010-11-29 22:13 ` Junio C Hamano @ 2010-11-29 22:20 ` Junio C Hamano 2010-11-29 23:09 ` [PATCHv7.1 3/4 (amend v2)] " Jakub Narebski 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-11-29 22:20 UTC (permalink / raw) To: Junio C Hamano Cc: Jakub Narebski, git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Junio C Hamano <gitster@pobox.com> writes: > Jakub Narebski <jnareb@gmail.com> writes: > >>> I have made mistake with this line when moving $caching_enabled check >>> out of cache_fetch to its caller. >>> >>> Reusing $fullhashpath variable as a *capture buffer* (it has nothing >>> to do with path; it is not a filename no longer) wouldn't help there. >> >> Errr... I meant that this abuse didn't help avoiding my mistake. > > Thanks. Wait a bit. This seems to match what I have already queued on 'pu', no? Am I hallucinating? ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCHv7.1 3/4 (amend v2)] gitweb: File based caching layer (from git.kernel.org) 2010-11-29 22:20 ` Junio C Hamano @ 2010-11-29 23:09 ` Jakub Narebski 2010-11-30 0:51 ` Junio C Hamano 0 siblings, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-11-29 23:09 UTC (permalink / raw) To: Junio C Hamano Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin On Mon, 29 Nov 2010, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: >> Jakub Narebski <jnareb@gmail.com> writes: >> >>>> I have made mistake with this line when moving $caching_enabled check >>>> out of cache_fetch to its caller. >>>> >>>> Reusing $fullhashpath variable as a *capture buffer* (it has nothing >>>> to do with path; it is not a filename no longer) wouldn't help there. >>> >>> Errr... I meant that this abuse didn't help avoiding my mistake. >> >> Thanks. > > Wait a bit. > > This seems to match what I have already queued on 'pu', no? Am I > hallucinating? Damn, it looks like my mailer included wrong (older) version of a file. I'm sorry. Below there is interdiff: diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl index dd14bfb..4360b3d 100644 --- a/gitweb/lib/cache.pl +++ b/gitweb/lib/cache.pl @@ -99,7 +99,7 @@ sub cacheUpdate { $lockStatus = $lockStatBG; }else{ - open(cacheFile, '>:utf8', \$fullhashpath); + open(cacheFile, '>:utf8', $fullhashpath); my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); $lockStatus = $lockStat; -- >8 -- From: John 'Warthog9' Hawley <warthog9@eaglescrag.net> This is a relatively large patch that implements the file based caching layer that is quite similar to the one used on such large sites as kernel.org and soon git.fedoraproject.org. This provides a simple, and straight forward caching mechanism that scales dramatically better than Gitweb by itself. The caching layer basically buffers the output that Gitweb would normally return, and saves that output to a cache file on the local disk. When the file is requested it attempts to gain a shared lock on the cache file and cat it out to the client. Should an exclusive lock be on a file (it's being updated) the code has a choice to either update in the background and go ahead and show the stale page while update is being performed, or stall the client(s) until the page is generated. There are two forms of stalling involved here, background building and non-background building, both of which are discussed in the configuration page. There are still a few known "issues" with respect to this: - Code needs to be added to be "browser" aware so that clients like wget that are trying to get a binary blob don't obtain a "Generating..." page Caching is disabled by default. You can turn it on by setting $caching_enabled variable to true to enable file based caching. [jn: added error checking to loading 'cache.pl'; moved check for $caching_enabled outside out of cache_fetch, which required update to die_error()] Signed-off-by: Junio C Hamano <gitster@pobox.com> --- gitweb/Makefile | 3 + gitweb/gitweb.perl | 105 ++++++++++++-- gitweb/lib/cache.pl | 348 ++++++++++++++++++++++++++++++++++++++++++++++ gitweb/static/gitweb.css | 6 + 4 files changed, 450 insertions(+), 12 deletions(-) create mode 100644 gitweb/lib/cache.pl diff --git a/gitweb/Makefile b/gitweb/Makefile index f9e32eb..6ddd4f1 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -113,6 +113,9 @@ endif GITWEB_FILES += static/git-logo.png static/git-favicon.png +# Gitweb caching +GITWEB_MODULES += cache.pl + GITWEB_REPLACE = \ -e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \ -e 's|++GIT_BINDIR++|$(bindir)|g' \ diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index cae0e34..3c3ff08 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -250,6 +250,53 @@ our %avatar_size = ( # Leave it undefined (or set to 'undef') to turn off load checking. our $maxload = 300; +# This enables/disables the caching layer in gitweb. This currently only supports the +# 'dumb' file based caching layer, primarily used on git.kernel.org. this is reasonably +# effective but it has the downside of requiring a huge amount of disk space if there +# are a number of repositories involved. It is not uncommon for git.kernel.org to have +# on the order of 80G - 120G accumulate over the course of a few months. It is recommended +# that the cache directory be periodically completely deleted, and this is safe to perform. +# Suggested mechanism +# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush +our $caching_enabled = 0; + +# Used to set the minimum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to be under this number of seconds we set the cache timeout +# to this minimum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $minCacheTime = 20; + +# Used to set the maximum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to exceed this number of seconds we set the cache timeout +# to this maximum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $maxCacheTime = 1200; + +# If you need to change the location of the caching directory, override this +# otherwise this will probably do fine for you +our $cachedir = 'cache'; + +# If this is set (to 1) cache will do it's best to always display something instead +# of making someone wait for the cache to update. This will launch the cacheUpdate +# into the background and it will lock a <file>.bg file and will only lock the +# actual cache file when it needs to write into it. In theory this will make +# gitweb seem more responsive at the price of possibly stale data. +our $backgroundCache = 1; + +# Used to set the maximum cache file life. If a cache files last modify time exceeds +# this value, it will assume that the data is just too old, and HAS to be regenerated +# instead of trying to display the existing cache data. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +# 18000 = 5 hours +our $maxCacheLife = 18000; + +# Used to enable or disable background forking of the gitweb caching. Mainly here for debugging purposes +our $cacheDoFork = 1; + +our $fullhashpath = *STDOUT; +our $fullhashbinpath = *STDOUT; +our $fullhashbinpathfinal = *STDOUT; + # configuration for 'highlight' (http://www.andre-simon.de/) # match by basename our %highlight_basename = ( @@ -506,6 +553,15 @@ our %feature = ( 'default' => [0]}, ); +# +# Includes +# +if (!exists $INC{'cache.pl'}) { + my $return = do 'cache.pl'; + die $@ if $@; + die "Couldn't read 'cache.pl': $!" if (!defined $return); +} + sub gitweb_get_feature { my ($name) = @_; return unless exists $feature{$name}; @@ -734,6 +790,10 @@ our %actions = ( "project_list" => \&git_project_list, "project_index" => \&git_project_index, ); +sub is_cacheable { + my $action = shift; + return !($action eq 'blame_data' || $action eq 'blame_incremental'); +} # finally, we have the hash of allowed extra_options for the commands that # allow them @@ -1072,7 +1132,11 @@ sub dispatch { !$project) { die_error(400, "Project needed"); } - $actions{$action}->(); + if ($caching_enabled && is_cacheable($action)) { + cache_fetch($action); + } else { + $actions{$action}->(); + } } sub reset_timer { @@ -1142,6 +1206,7 @@ sub change_output { # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + print STDOUT_REAL ""; # Close STDOUT, so that it isn't being used anymore. close STDOUT; @@ -1167,10 +1232,7 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; - change_output(); run_request(); - reset_output(); - print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3447,7 +3509,8 @@ sub git_header_html { # support xhtml+xml but choking when it gets what it asked for. if (defined $cgi->http('HTTP_ACCEPT') && $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ && - $cgi->Accept('application/xhtml+xml') != 0) { + $cgi->Accept('application/xhtml+xml') != 0 && + !$caching_enabled) { $content_type = 'application/xhtml+xml'; } else { $content_type = 'text/html'; @@ -3592,6 +3655,7 @@ sub git_footer_html { my $feed_class = 'rss_logo'; print "<div class=\"page_footer\">\n"; + print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n"; if (defined $project) { my $descr = git_get_project_description($project); if (defined $descr) { @@ -3680,9 +3744,14 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # The output handlers for die_error need to be reset to STDOUT + # so that half the message isn't being output to random and + # half to STDOUT as expected. This is mainly for the benefit + # of using git_header_html() and git_footer_html() since + # # Reset the output so that we are actually going to STDOUT as opposed # to buffering the output. - reset_output(); + reset_output() if ($caching_enabled); git_header_html($http_responses{$status}, undef, %opts); print <<EOF; @@ -5592,9 +5661,15 @@ sub git_blob_plain { ($sandbox ? 'attachment' : 'inline') . '; filename="' . $save_as . '"'); local $/ = undef; - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } @@ -5879,9 +5954,15 @@ sub git_snapshot { open my $fd, "-|", $cmd or die_error(500, "Execute git-archive failed"); - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl new file mode 100644 index 0000000..4360b3d --- /dev/null +++ b/gitweb/lib/cache.pl @@ -0,0 +1,348 @@ +# gitweb - simple web interface to track changes in git repositories +# +# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net> +# +# This program is licensed under the GPLv2 + +# +# Gitweb caching engine +# + +#use File::Path qw(make_path remove_tree); +use File::Path qw(mkpath rmtree); # Used for compatability reasons +use Digest::MD5 qw(md5 md5_hex md5_base64); +use Fcntl ':flock'; +use File::Copy; + +sub cache_fetch { + my ($action) = @_; + my $cacheTime = 0; + + if(! -d $cachedir){ + print "*** Warning ***: Caching enabled but cache directory does not exsist. ($cachedir)\n"; + mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create"; + print "Cache directory created successfully\n"; + } + + our $full_url = "$my_url?". $ENV{'QUERY_STRING'}; + our $urlhash = md5_hex($full_url); + our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/"; + + eval { mkpath( $fullhashdir, 0, 0777 ) }; + if ($@) { + die_error(500, "Internal Server Error", "Could not create cache directory: $@"); + } + $fullhashpath = "$fullhashdir/". substr( $urlhash, 2 ); + $fullhashbinpath = "$fullhashpath.bin.wt"; + $fullhashbinpathfinal = "$fullhashpath.bin"; + + if(! -e "$fullhashpath" ){ + if(! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + cacheUpdate($action,1); + }else{ + cacheWaitForUpdate($action); + } + }else{ + #if cache is out dated, update + #else displayCache(); + open(cacheFile, '<', "$fullhashpath"); + stat(cacheFile); + close(cacheFile); + my $stat_time = (stat(_))[9]; + my $stat_size = (stat(_))[7]; + + $cacheTime = get_loadavg() * 60; + if( $cacheTime > $maxCacheTime ){ + $cacheTime = $maxCacheTime; + } + if( $cacheTime < $minCacheTime ){ + $cacheTime = $minCacheTime; + } + if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){ + if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + #print "Running updater\n"; + cacheUpdate($action,1); + }else{ + #print "Waiting for update\n"; + cacheWaitForUpdate($action); + } + } else { + cacheDisplay($action); + } + + + } + + # + # If all of the caching failes - lets go ahead and press on without it and fall back to 'default' + # non-caching behavior. This is the softest of the failure conditions. + # + #$actions{$action}->(); +} + +sub cacheUpdate { + my ($action,$areForked) = @_; + my $lockingStatus; + my $fileData = ""; + + if($backgroundCache){ + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStatBG; + }else{ + open(cacheFile, '>:utf8', $fullhashpath); + my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStat; + } + #print "lock status: $lockStat\n"; + + + if (! $lockStatus ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath"); + my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB); + } + + # Trap all output from the action + change_output(); + + $actions{$action}->(); + + # Reset the outputs as we should be fine now + reset_output(); + + + if($backgroundCache){ + open(cacheFile, '>:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_EX); + + if (! $lockStat ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX); + + if (! $lockStatBIN ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + # Actually dump the output to the proper file handler + local $/ = undef; + $|++; + print cacheFile "$output"; + $|--; + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!"; + + flock(cacheFileBinFINAL,LOCK_UN); + close(cacheFileBinFINAL); + + flock(cacheFileBinWT,LOCK_UN); + close(cacheFileBinWT); + } + + flock(cacheFile,LOCK_UN); + close(cacheFile); + + if($backgroundCache){ + flock(cacheFileBG,LOCK_UN); + close(cacheFileBG); + } + + if ( $areForked ){ + exit(0); + } else { + return; + } +} + + +sub cacheWaitForUpdate { + my ($action) = @_; + my $x = 0; + my $max = 10; + my $lockStat = 0; + + if( $backgroundCache ){ + if( -e "$fullhashpath" ){ + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + stat(cacheFile); + close(cacheFile); + + if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){ + cacheDisplay($action); + return; + } + } + } + + if( + $action eq "atom" + || + $action eq "rss" + || + $action eq "opml" + ){ + do { + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + + if( $x != $max ){ + cacheDisplay($action); + } + return; + } + + $| = 1; + + print $::cgi->header( + -type=>'text/html', + -charset => 'utf-8', + -status=> 200, + -expires => 'now', + # HTTP/1.0 + -Pragma => 'no-cache', + # HTTP/1.1 + -Cache_Control => join( + ', ', + qw( + private + no-cache + no-store + must-revalidate + max-age=0 + pre-check=0 + post-check=0 + ) + ) + ); + + print <<EOF; +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd"> +<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> --> +<!-- git core binaries version $git_version --> +<head> +<meta http-equiv="content-type" content="$content_type; charset=utf-8"/> +<meta name="generator" content="gitweb/$version git/$git_version"/> +<meta name="robots" content="index, nofollow"/> +<meta http-equiv="refresh" content="0"/> +<title>$title</title> +</head> +<body> +EOF + + print "Generating.."; + do { + print "."; + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + print <<EOF; +</body> +</html> +EOF + return; +} + +sub cacheDisplay { + local $/ = undef; + $|++; + + my ($action) = @_; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + + if (! $lockStat ){ + close(cacheFile); + cacheWaitForUpdate($action); + } + + if( + ( + $action eq "snapshot" + || + $action eq "blob_plain" + ) + ){ + my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB); + if (! $lockStatBIN ){ + system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log"); + close(cacheFile); + close(cacheFileBin); + cacheWaitForUpdate($action); + } + + my $binfilesize = -s "$fullhashbinpathfinal"; + print "Content-Length: $binfilesize"; + } + while( <cacheFile> ){ + print $_; + } + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + binmode STDOUT, ':raw'; + print <cacheFileBin>; + binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close(cacheFileBin); + } + close(cacheFile); + $|--; +} + +1; +__END__ diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css index 4132aab..972d32e 100644 --- a/gitweb/static/gitweb.css +++ b/gitweb/static/gitweb.css @@ -67,6 +67,12 @@ div.page_path { border-width: 0px 0px 1px; } +div.cachetime { + float: left; + margin-right: 10px; + color: #555555; +} + div.page_footer { height: 17px; padding: 4px 8px; -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4 (amend v2)] gitweb: File based caching layer (from git.kernel.org) 2010-11-29 23:09 ` [PATCHv7.1 3/4 (amend v2)] " Jakub Narebski @ 2010-11-30 0:51 ` Junio C Hamano 2010-11-30 10:21 ` Jakub Narebski 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-11-30 0:51 UTC (permalink / raw) To: Jakub Narebski Cc: Junio C Hamano, git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Jakub Narebski <jnareb@gmail.com> writes: > On Mon, 29 Nov 2010, Junio C Hamano wrote: > >> Wait a bit. >> >> This seems to match what I have already queued on 'pu', no? Am I >> hallucinating? > > Damn, it looks like my mailer included wrong (older) version of a file. > I'm sorry. Below there is interdiff: Thanks. What I had was what I pulled from you. I take it that you want me to rebase the two tip commits on the branch? ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4 (amend v2)] gitweb: File based caching layer (from git.kernel.org) 2010-11-30 0:51 ` Junio C Hamano @ 2010-11-30 10:21 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-30 10:21 UTC (permalink / raw) To: Junio C Hamano Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin On Tue, 30 Nov 2010, Junio C Hamano wrote: > Jakub Narebski <jnareb@gmail.com> writes: >> On Mon, 29 Nov 2010, Junio C Hamano wrote: >> >>> Wait a bit. >>> >>> This seems to match what I have already queued on 'pu', no? Am I >>> hallucinating? >> >> Damn, it looks like my mailer included wrong (older) version of a file. >> I'm sorry. Below there is interdiff: > > Thanks. What I had was what I pulled from you. I take it that you want > me to rebase the two tip commits on the branch? Yes, please. Though it still doesn't pass new tests in t9501 and t9502; the problem is more with setting up tests rather than with caching code, though. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) 2010-11-12 23:56 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski 2010-11-28 11:22 ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski @ 2010-11-29 23:07 ` demerphq 2010-11-29 23:26 ` demerphq 1 sibling, 1 reply; 43+ messages in thread From: demerphq @ 2010-11-29 23:07 UTC (permalink / raw) To: Jakub Narebski Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin 2010/11/13 Jakub Narebski <jnareb@gmail.com>: > - binmode STDOUT, ':raw'; > - print <$fd>; > - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi > + if ($caching_enabled) { > + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); > + }else{ > + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); > + } > + binmode BINOUT, ':raw'; > + print BINOUT <$fd>; > + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi > + close BINOUT; Why do you use dynamically scoped file handles here as opposed to lexically scoped ones? And why do you change the output discipline on BINOUT immediately before closing the file, and after you print data to it? Doing so sortof makes sense when the filedhandle is STDOUT, but not when it is BINOUT. cheers, Yves -- perl -Mre=debug -e "/just|another|perl|hacker/" ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) 2010-11-29 23:07 ` [PATCHv7.1 3/4] " demerphq @ 2010-11-29 23:26 ` demerphq 2010-11-29 23:54 ` Jakub Narebski 0 siblings, 1 reply; 43+ messages in thread From: demerphq @ 2010-11-29 23:26 UTC (permalink / raw) To: Jakub Narebski Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin On 30 November 2010 00:07, demerphq <demerphq@gmail.com> wrote: > 2010/11/13 Jakub Narebski <jnareb@gmail.com>: >> - binmode STDOUT, ':raw'; >> - print <$fd>; >> - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi >> + if ($caching_enabled) { >> + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); >> + }else{ >> + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); >> + } >> + binmode BINOUT, ':raw'; >> + print BINOUT <$fd>; >> + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi >> + close BINOUT; > > Why do you use dynamically scoped file handles here as opposed to > lexically scoped ones? > > And why do you change the output discipline on BINOUT immediately > before closing the file, and after you print data to it? > > Doing so sortof makes sense when the filedhandle is STDOUT, but not > when it is BINOUT. Also in this code: 2010/11/28 Jakub Narebski <jnareb@gmail.com>: > +# > +# Includes > +# > +if (!exists $INC{'cache.pl'}) { > + my $return = do 'cache.pl'; > + die $@ if $@; > + die "Couldn't read 'cache.pl': $!" if (!defined $return); > +} Why is that preferred to: require 'cache.pl'; And why is this thing even a .pl file? Why isnt it called lib/GitWeb/Cache.pm or something like that? -- perl -Mre=debug -e "/just|another|perl|hacker/" ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) 2010-11-29 23:26 ` demerphq @ 2010-11-29 23:54 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-29 23:54 UTC (permalink / raw) To: demerphq Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin On Tue, 30 Nov 2010, demerphq wrote: > On 30 November 2010 00:07, demerphq <demerphq@gmail.com> wrote: > > 2010/11/13 Jakub Narebski <jnareb@gmail.com>: > > Also in this code: > > 2010/11/28 Jakub Narebski <jnareb@gmail.com>: > > +# > > +# Includes > > +# > > +if (!exists $INC{'cache.pl'}) { > > + my $return = do 'cache.pl'; > > + die $@ if $@; > > + die "Couldn't read 'cache.pl': $!" if (!defined $return); > > +} > > Why is that preferred to: > > require 'cache.pl'; > > And why is this thing even a .pl file? Why isnt it called > lib/GitWeb/Cache.pm or something like that? Because it is not a Perl module; in particular 'cache.pl' uses global variables from gitweb.perl (like $my_url, or $cachedir, or %action) and subroutines from gitweb.perl (like change_output() and reset_output()). That is why it needs to be injected via do, rather than included in its owne namespace with package/require. P.S. This is not my code, this is patch by J.H. (John Hawley); I did only *minimal* fixups. P.P.S. My rewrite can be found in 'gitweb/cache-kernel-pu' branch in my repository (links are to web interface) http://repo.or.cz/w/git/jnareb-git.git https://github.com/jnareb/git Sent in http://thread.gmane.org/gmane.comp.version-control.git/158313 -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching 2010-11-12 23:35 ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski ` (2 preceding siblings ...) 2010-11-12 23:56 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski @ 2010-11-13 0:01 ` Jakub Narebski 2010-11-17 22:37 ` Junio C Hamano 3 siblings, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-11-13 0:01 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin Add basic tests of caching support to t9500-gitweb-standalone-no-errors test: set $caching_enabled to true and check for errors for first time run (generating cache) and second time run (retrieving from cache) for a single view - summary view for a project. Check also that request for non-existent object (which results in die_error() codepath to be called) doesn't produce errors. Check in t9501-gitweb-standalone-http-status that request for non-existent object produces correct output (HTTP headers and HTML output) also when caching is enabled. Check in the t9502-gitweb-standalone-parse-output test that gitweb produces the same output with and without caching, for first and second run, with binary (raw) or plain text (utf8) output. The common routine that enables cache, gitweb_enable_caching, is defined in t/gitweb-lib.sh Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- As I wrote in the cover letter, because of cacheWaitForUpdate() generating extra output some of those newly introduced tests fail, some intermittently. John, could you take a look at it and check if the problem is: in tests, in my simplification, or in caching code... t/gitweb-lib.sh | 15 +++++++++++++ t/t9500-gitweb-standalone-no-errors.sh | 20 +++++++++++++++++ t/t9501-gitweb-standalone-http-status.sh | 13 +++++++++++ t/t9502-gitweb-standalone-parse-output.sh | 33 +++++++++++++++++++++++++++++ 4 files changed, 81 insertions(+), 0 deletions(-) diff --git a/t/gitweb-lib.sh b/t/gitweb-lib.sh index b9bb95f..16ce811 100644 --- a/t/gitweb-lib.sh +++ b/t/gitweb-lib.sh @@ -52,6 +52,21 @@ EOF export SCRIPT_NAME } +gitweb_enable_caching () { + test_expect_success 'enable caching' ' + cat >>gitweb_config.perl <<-\EOF && + our $caching_enabled = 1; + our $minCacheTime = 60*60*24*7*30; # very long expiration time for tests (a month) + our $maxCacheTime = 60*60*24*7*30*365; # upper bound for dynamic (adaptive) caching + our $cachedir = "cache"; # for testsuite to clear the right thing + # required, because otherwise some tests might intermittently not pass + our $backgroundCache = 0; # should turn off cacheWaitForUpdate() / "Generating..." + #our $cacheDoFork = 0; + EOF + rm -rf cache/ + ' +} + gitweb_run () { GATEWAY_INTERFACE='CGI/1.1' HTTP_ACCEPT='*/*' diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh index 21cd286..86c1b50 100755 --- a/t/t9500-gitweb-standalone-no-errors.sh +++ b/t/t9500-gitweb-standalone-no-errors.sh @@ -677,4 +677,24 @@ test_expect_success HIGHLIGHT \ gitweb_run "p=.git;a=blob;f=test.sh"' test_debug 'cat gitweb.log' +# ---------------------------------------------------------------------- +# caching + +gitweb_enable_caching + +test_expect_success \ + 'caching enabled (project summary, first run, generating cache)' \ + 'gitweb_run "p=.git;a=summary"' +test_debug 'cat gitweb.log' + +test_expect_success \ + 'caching enabled (project summary, second run, cached version)' \ + 'gitweb_run "p=.git;a=summary"' +test_debug 'cat gitweb.log' + +test_expect_success \ + 'caching enabled (non-existent commit, non-cache error page)' \ + 'gitweb_run "p=.git;a=commit;h=non-existent"' +test_debug 'cat gitweb.log' + test_done diff --git a/t/t9501-gitweb-standalone-http-status.sh b/t/t9501-gitweb-standalone-http-status.sh index 2487da1..5b1df3f 100755 --- a/t/t9501-gitweb-standalone-http-status.sh +++ b/t/t9501-gitweb-standalone-http-status.sh @@ -134,5 +134,18 @@ cat >>gitweb_config.perl <<\EOF our $maxload = undef; EOF +# ---------------------------------------------------------------------- +# output caching + +gitweb_enable_caching + +test_expect_success 'caching enabled (non-existent commit, 404 error)' ' + gitweb_run "p=.git;a=commit;h=non-existent" && + grep "Status: 404 Not Found" gitweb.headers && + grep "404 - Unknown commit object" gitweb.body +' +test_debug 'echo "log" && cat gitweb.log' +test_debug 'echo "headers" && cat gitweb.headers' +test_debug 'echo "body" && cat gitweb.body' test_done diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh index dd83890..bc8eb01 100755 --- a/t/t9502-gitweb-standalone-parse-output.sh +++ b/t/t9502-gitweb-standalone-parse-output.sh @@ -112,4 +112,37 @@ test_expect_success 'snapshot: hierarchical branch name (xx/test)' ' ' test_debug 'cat gitweb.headers' + +# ---------------------------------------------------------------------- +# whether gitweb with caching enabled produces the same output + +test_expect_success 'setup for caching tests (utf8 patch, binary file)' ' + . "$TEST_DIRECTORY"/t3901-utf8.txt && + cp "$TEST_DIRECTORY"/test9200a.png image.png && + git add image.png && + git commit -F "$TEST_DIRECTORY"/t3900/1-UTF-8.txt && + gitweb_run "p=.git;a=patch" && + mv gitweb.body no_cache.txt && + gitweb_run "p=.git;a=blob_plain;f=image.png" && + mv gitweb.body no_cache.png +' + +gitweb_enable_caching + +for desc in 'generating cache' 'cached version'; do + test_expect_success "caching enabled, plain text (utf8) output, $desc" ' + gitweb_run "p=.git;a=patch" && + mv gitweb.body cache.txt && + test_cmp no_cache.txt cache.txt + ' +done + +for desc in 'generating cache' 'cached version'; do + test_expect_success "caching enabled, binary output (raw), $desc" ' + gitweb_run "p=.git;a=blob_plain;f=image.png" && + mv gitweb.body cache.png && + cmp no_cache.png cache.png + ' +done + test_done -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching 2010-11-13 0:01 ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski @ 2010-11-17 22:37 ` Junio C Hamano 2010-11-17 23:12 ` Jakub Narebski 0 siblings, 1 reply; 43+ messages in thread From: Junio C Hamano @ 2010-11-17 22:37 UTC (permalink / raw) To: Jakub Narebski Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Jakub Narebski <jnareb@gmail.com> writes: > +gitweb_enable_caching () { > + test_expect_success 'enable caching' ' > + cat >>gitweb_config.perl <<-\EOF && > + our $caching_enabled = 1; > + our $minCacheTime = 60*60*24*7*30; # very long expiration time for tests (a month) Does your month have 210 days in it? > + our $maxCacheTime = 60*60*24*7*30*365; # upper bound for dynamic (adaptive) caching Likewise, is this 210 years? Does this bust 32-bit time_t somewhere? > + our $cachedir = "cache"; # for testsuite to clear the right thing > + # required, because otherwise some tests might intermittently not pass > + our $backgroundCache = 0; # should turn off cacheWaitForUpdate() / "Generating..." > + #our $cacheDoFork = 0; leftover.... ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching 2010-11-17 22:37 ` Junio C Hamano @ 2010-11-17 23:12 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-17 23:12 UTC (permalink / raw) To: Junio C Hamano Cc: git, John 'Warthog9' Hawley, John 'Warthog9' Hawley, Petr Baudis, admin Dnia środa 17. listopada 2010 23:37, Junio C Hamano napisał: > Jakub Narebski <jnareb@gmail.com> writes: > > > +gitweb_enable_caching () { > > + test_expect_success 'enable caching' ' > > + cat >>gitweb_config.perl <<-\EOF && > > + our $caching_enabled = 1; > > + our $minCacheTime = 60*60*24*7*30; # very long expiration time for tests (a month) > > Does your month have 210 days in it? > > > + our $maxCacheTime = 60*60*24*7*30*365; # upper bound for dynamic (adaptive) caching > > Likewise, is this 210 years? Does this bust 32-bit time_t somewhere? Ooops. Still, fixing this doesn't fix tests. I think I'd have to make another extra commit, adding configure knob to turn off cacheWaitForUpdate()... Or perhaps I did something wrong when simplifying, but I have checked diff between full and simplified version (in comments in v7.1 series), and it doesn't look like it... -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb 2010-10-31 9:21 ` Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski @ 2010-11-01 10:24 ` Jakub Narebski 2010-11-01 18:50 ` [PATCHv7.1b " Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski ` (2 subsequent siblings) 4 siblings, 1 reply; 43+ messages in thread From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin, Jakub Narebski Prepare gitweb for having been split into modules that are to be installed alongside gitweb in 'lib/' subdirectory, by adding use lib __DIR__.'/lib'; to gitweb.perl (to main gitweb script), and preparing for putting modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile. This preparatory work allows to add new module to gitweb by simply adding GITWEB_MODULES += <module> to gitweb/Makefile (assuming that the module is in 'gitweb/lib/' directory). Using __DIR__ from Dir::Self module (not in core, that's why currently gitweb includes excerpt of code from Dir::Self defining __DIR__) was chosen over using FindBin-based solution (in core since perl 5.00307, while gitweb itself requires at least perl 5.8.0) because FindBin uses BEGIN block, which is a problem under mod_perl and other persistent Perl environments (thought there are workarounds). At Pavan Kumar Sankara suggestion gitweb/Makefile uses install [OPTION]... SOURCE... DIRECTORY format (2nd format) with single SOURCE rather than install [OPTION]... SOURCE DEST format (1st format) because of security reasons (race conditions). Modern GNU install has `-T' / `--no-target-directory' option, but we cannot rely that the $(INSTALL) we are using supports this option. The install-modules target in gitweb/Makefile uses shell 'for' loop, instead of make's $(foreach) function, to avoid possible problem with generating a command line that exceeded the maximum argument list length. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- This patch would conflict with the patch 958a846 (gitweb/Makefile: Add 'test' and 'test-installed' targets, 2010-09-26), which is in 'pu'. Resolving those conflicts is easy, though non-trivial. Please tell me if I should send this series rebased on top of 'jn/gitweb-test' (which includes 958a846) instead. This patch introduces infrastructure which would be required later for splitting gitweb. gitweb/Makefile | 17 ++++++++++++++--- gitweb/gitweb.perl | 8 ++++++++ 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/gitweb/Makefile b/gitweb/Makefile index 2fb7c2d..6fa7625 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -55,6 +55,7 @@ PERL_PATH ?= /usr/bin/perl bindir_SQ = $(subst ','\'',$(bindir))#' gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#' gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#' +gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#' SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#' PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))#' DESTDIR_SQ = $(subst ','\'',$(DESTDIR))#' @@ -145,16 +146,26 @@ gitweb.cgi: gitweb.perl GITWEB-BUILD-OPTIONS ### Installation rules -install: all +install: all install-modules $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' $(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' +install-modules: + install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \ + for dir in $$install_dirs; do \ + test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir' || \ + $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir'; \ + done + gitweb_modules="$(GITWEB_MODULES)" && \ + for mod in $$gitweb_modules; do \ + $(INSTALL) -m 644 lib/$$mod '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$(dirname $$mod)'; \ + done + ### Cleaning rules clean: $(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css GITWEB-BUILD-OPTIONS -.PHONY: all clean install .FORCE-GIT-VERSION-FILE FORCE - +.PHONY: all clean install install-modules .FORCE-GIT-VERSION-FILE FORCE diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index d521b4c..e4c08ba 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -9,6 +9,14 @@ use strict; use warnings; + +use File::Spec; +# __DIR__ is taken from Dir::Self __DIR__ fragment +sub __DIR__ () { + File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]); +} +use lib __DIR__ . '/lib'; + use CGI qw(:standard :escapeHTML -nosticky); use CGI::Util qw(unescape); use CGI::Carp qw(fatalsToBrowser set_message); -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCHv7.1b 1/4] gitweb: Prepare for splitting gitweb 2010-11-01 10:24 ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski @ 2010-11-01 18:50 ` Jakub Narebski 0 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-01 18:50 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin Prepare gitweb for having been split into modules that are to be installed alongside gitweb in 'lib/' subdirectory, by adding use lib __DIR__.'/lib'; to gitweb.perl (to main gitweb script), and preparing for putting modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile. This preparatory work allows to add new module to gitweb by simply adding GITWEB_MODULES += <module> to gitweb/Makefile (assuming that the module is in 'gitweb/lib/' directory). While at it pass GITWEBLIBDIR in addition to GITWEB_TEST_INSTALLED to test instaleed version of gitweb and installed version of modules (for tests which check individual (sub)modules). Using __DIR__ from Dir::Self module (not in core, that's why currently gitweb includes excerpt of code from Dir::Self defining __DIR__) was chosen over using FindBin-based solution (in core since perl 5.00307, while gitweb itself requires at least perl 5.8.0) because FindBin uses BEGIN block, which is a problem under mod_perl and other persistent Perl environments (thought there are workarounds). At Pavan Kumar Sankara suggestion gitweb/Makefile uses install [OPTION]... SOURCE... DIRECTORY format (2nd format) with single SOURCE rather than install [OPTION]... SOURCE DEST format (1st format) because of security reasons (race conditions). Modern GNU install has `-T' / `--no-target-directory' option, but we cannot rely that the $(INSTALL) we are using supports this option. The install-modules target in gitweb/Makefile uses shell 'for' loop, instead of make's $(foreach) function, to avoid possible problem with generating a command line that exceeded the maximum argument list length. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- This version requires and applies cleanly on top of patch 958a846 (gitweb/Makefile: Add 'test' and 'test-installed' targets, 2010-09-26) in 'pu'. gitweb/Makefile | 17 +++++++++++++++-- gitweb/gitweb.perl | 8 ++++++++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/gitweb/Makefile b/gitweb/Makefile index df908a1..c2d72e4 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -56,6 +56,7 @@ PERL_PATH ?= /usr/bin/perl bindir_SQ = $(subst ','\'',$(bindir))#' gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#' gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#' +gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#' SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#' PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))#' DESTDIR_SQ = $(subst ','\'',$(DESTDIR))#' @@ -151,20 +152,32 @@ test: test-installed: GITWEB_TEST_INSTALLED='$(DESTDIR_SQ)$(gitwebdir_SQ)' \ + GITWEBLIBDIR='$(DESTDIR_SQ)$(gitweblibdir_SQ)' \ $(MAKE) -C ../t gitweb-test ### Installation rules -install: all +install: all install-modules $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)' $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' $(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)' +install-modules: + install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \ + for dir in $$install_dirs; do \ + test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir' || \ + $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir'; \ + done + gitweb_modules="$(GITWEB_MODULES)" && \ + for mod in $$gitweb_modules; do \ + $(INSTALL) -m 644 lib/$$mod '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$(dirname $$mod)'; \ + done + ### Cleaning rules clean: $(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css GITWEB-BUILD-OPTIONS -.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE +.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index d521b4c..e4c08ba 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -9,6 +9,14 @@ use strict; use warnings; + +use File::Spec; +# __DIR__ is taken from Dir::Self __DIR__ fragment +sub __DIR__ () { + File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]); +} +use lib __DIR__ . '/lib'; + use CGI qw(:standard :escapeHTML -nosticky); use CGI::Util qw(unescape); use CGI::Carp qw(fatalsToBrowser set_message); -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCHv7.1 2/4] gitweb: add output buffering and associated functions 2010-10-31 9:21 ` Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski @ 2010-11-01 10:24 ` Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski 4 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin, Jakub Narebski From: John 'Warthog9' Hawley <warthog9@eaglescrag.net> This adds output buffering for gitweb, mainly in preparation for caching support. This is a dramatic change to how caching was being done, mainly in passing around the variable manually and such. This centrally flips the entire STDOUT to a variable, which after the completion of the run, flips it back and does a print on the resulting data. This should save on the previous 10K line patch (or so) that adds more explicit output passing. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- Commit message unchanged. I have modified reset_output to silence the following warning from t9500: gitweb.perl: Name "main::STDOUT_REAL" used only once: possible typo at ../gitweb/gitweb.perl line 1130. Here is the interdiff for gitweb.perl from 2e5c355 (J.H. patch in 'pu'): diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index b2b0a23..91e274f 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -32,7 +39,7 @@ BEGIN { our $version = "++GIT_VERSION++"; # Output buffer variable -$output = ""; +our $output = ""; our ($my_url, $my_uri, $base_url, $path_info, $home_link); sub evaluate_uri { @@ -1143,13 +1138,12 @@ sub change_output { } sub reset_output { - # This basically takes STDOUT_REAL and puts it back as STDOUTl - open(STDOUT,">&STDOUT_REAL"); + # This basically takes STDOUT_REAL and puts it back as STDOUT + open STDOUT, ">&", \*STDOUT_REAL; } gitweb/gitweb.perl | 29 +++++++++++++++++++++++++++++ 1 files changed, 29 insertions(+), 0 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index e4c08ba..91e274f 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -38,6 +38,9 @@ BEGIN { our $version = "++GIT_VERSION++"; +# Output buffer variable +our $output = ""; + our ($my_url, $my_uri, $base_url, $path_info, $home_link); sub evaluate_uri { our $cgi; @@ -1120,6 +1123,25 @@ sub evaluate_argv { ); } +sub change_output { + our $output; + + # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such + open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + + # Close STDOUT, so that it isn't being used anymore. + close STDOUT; + + # Trap STDOUT to the $output variable, which is what I was using in the original + # patch anyway. + open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var +} + +sub reset_output { + # This basically takes STDOUT_REAL and puts it back as STDOUT + open STDOUT, ">&", \*STDOUT_REAL; +} + sub run { evaluate_argv(); @@ -1131,7 +1153,10 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; + change_output(); run_request(); + reset_output(); + print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3640,6 +3665,10 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # Reset the output so that we are actually going to STDOUT as opposed + # to buffering the output. + reset_output(); + git_header_html($http_responses{$status}, undef, %opts); print <<EOF; <div class="page_body"> -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) 2010-10-31 9:21 ` Jakub Narebski ` (2 preceding siblings ...) 2010-11-01 10:24 ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski @ 2010-11-01 10:24 ` Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski 4 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin, Jakub Narebski From: John 'Warthog9' Hawley <warthog9@eaglescrag.net> This is a relatively large patch that implements the file based caching layer that is quite similar to the one used on such large sites as kernel.org and soon git.fedoraproject.org. This provides a simple, and straight forward caching mechanism that scales dramatically better than Gitweb by itself. The caching layer basically buffers the output that Gitweb would normally return, and saves that output to a cache file on the local disk. When the file is requested it attempts to gain a shared lock on the cache file and cat it out to the client. Should an exclusive lock be on a file (it's being updated) the code has a choice to either update in the background and go ahead and show the stale page while update is being performed, or stall the client(s) until the page is generated. There are two forms of stalling involved here, background building and non-background building, both of which are discussed in the configuration page. There are still a few known "issues" with respect to this: - Code needs to be added to be "browser" aware so that clients like wget that are trying to get a binary blob don't obtain a "Generating..." page Caching is disabled by default. You can turn it on by setting $caching_enabled variable to true to enable file based caching. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- This is *minimal* fixup of J.H. patch, addressing only major concerns of Junio C Hamano (JH) and me (JN). JH> Jakub, can we have an absolute minimum fix-up, so that we can give JH> this wider exposure? I think there are only JH> four issues: JH> JH> (1) exclude Ajax-y stuff from caching Done, via "if ($caching_enabled && is_cacheable($action) { ... }". Alternate solution would be to degrade Ajax-y actions to their non-Ajax-y versions (i.e. 'blame_incremental' to just 'blame') when caching is enabled. JH> (2) install cache.pm the same way gitweb.perl is installed via JH> the Makefile It is now 'gitweb/lib/cache.pl' (*.pl and not *.pm because it is not a proper Perl module: it lacks 'package Foo' for one), and is installed via 'install-modules' target in gitweb/Makefile, which is prereq to 'install' therein. So 'make install-gitweb' would install cache.pl alongside gitweb. JH> (3) running tests with appropriate -I so that cache.pm is found; Thanks to "use lib __DIR__.'/lib';" it is not necessary. The 'use lib' solution has the advantage that it would work for mod_perl too. JH> (4) die if 'cache.pm' cannot be "done". # # Includes # if (!exists $INC{'cache.pl'}) { my $return = do 'cache.pl'; die $@ if $@; die "Couldn't read 'cache.pl': $!" if (!defined $return); } Gitweb now includes 'cache.pl' only once (which is important in persistent environments that wrap gitweb, like mod_perl's ModPerl::Registry, or Plack's Plack::App::WrapCGI which is used for --httpd=plackup in git-instaweb), checks if there were any errors parsing it, and checks if there were any errors finding it (e.g. locating it). JN> (5) naming and semantic of gitweb config variables configuring caching; JN> at least change $cache_enabled enum to $caching_enabled boolean Done. $cache_enable is not more. JN> (6) do not change anything in gitweb behavior if caching is disabled; JN> move 'if ($caching_enabled)' test to gitweb.perl, and remove code JN> from cache_fetch This simplified 'cache.pl' by removing all conditionals depending on state of $cache_enable from it (see interdiff below). This required adding check if caching is enabled before running reset_output() in die_error() subroutine. For tests to not hang I had to change permissions of cache directory from 0665 ('rw-rw-r-x') to 0755 ('rwxr-xr-x'). I think that was a bug. Interdiff (with '--ignore-all-space'): diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 8f748a3..abaeec6 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -250,11 +251,7 @@ our $maxload = 300; # that the cache directory be periodically completely deleted, and this is safe to perform. # Suggested mechanism # mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush -# Value is binary. 0 = disabled (default), 1 = enabled. -# -# Values of caching: -# 1 = 'dumb' file based caching used on git.kernel.org -our $cache_enable = 0; +our $caching_enabled = 0; # Used to set the minimum cache timeout for the dynamic caching algorithm. Basically # if we calculate the cache to be under this number of seconds we set the cache timeout @@ -552,7 +549,11 @@ our %feature = ( # # Includes # -do 'cache.pm'; +if (!exists $INC{'cache.pl'}) { + my $return = do 'cache.pl'; + die $@ if $@; + die "Couldn't read 'cache.pl': $!" if (!defined $return); +} sub gitweb_get_feature { my ($name) = @_; @@ -782,6 +783,10 @@ our %actions = ( "project_list" => \&git_project_list, "project_index" => \&git_project_index, ); +sub is_cacheable { + my $action = shift; + return !($action eq 'blame_data' || $action eq 'blame_incremental'); +} # finally, we have the hash of allowed extra_options for the commands that # allow them @@ -1120,8 +1118,11 @@ sub dispatch { !$project) { die_error(400, "Project needed"); } - #$actions{$action}->(); + if ($caching_enabled && is_cacheable($action)) { cache_fetch($action); + } else { + $actions{$action}->(); + } } sub reset_timer { @@ -3494,9 +3494,8 @@ sub git_header_html { # support xhtml+xml but choking when it gets what it asked for. if (defined $cgi->http('HTTP_ACCEPT') && $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ && - $cgi->Accept('application/xhtml+xml') != 0 - && - $cache_enable == 0) { + $cgi->Accept('application/xhtml+xml') != 0 && + !$caching_enabled) { $content_type = 'application/xhtml+xml'; } else { $content_type = 'text/html'; @@ -3737,7 +3736,7 @@ sub die_error { # # Reset the output so that we are actually going to STDOUT as opposed # to buffering the output. - reset_output(); + reset_output() if ($caching_enabled); git_header_html($http_responses{$status}, undef, %opts); print <<EOF; @@ -5647,7 +5646,7 @@ sub git_blob_plain { ($sandbox ? 'attachment' : 'inline') . '; filename="' . $save_as . '"'); local $/ = undef; - if( $cache_enable != 0){ + if ($caching_enabled) { open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); }else{ open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); @@ -5940,7 +5939,7 @@ sub git_snapshot { open my $fd, "-|", $cmd or die_error(500, "Execute git-archive failed"); - if( $cache_enable != 0){ + if ($caching_enabled) { open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); }else{ open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); diff --git a/gitweb/cache.pm b/gitweb/lib/cache.pl similarity index 75% rename from gitweb/cache.pm rename to gitweb/lib/cache.pl index c86265c..dd14bfb 100644 --- a/gitweb/cache.pm +++ b/gitweb/lib/cache.pl @@ -18,21 +18,9 @@ sub cache_fetch { my ($action) = @_; my $cacheTime = 0; - # Deal with cache being disabled - if( $cache_enable == 0 ){ - undef $backgroundCache; - #$fullhashpath = \$nocachedata; - #$fullhashbinpath = \$nocachedatabin; - $fullhashpath = *STDOUT; - $fullhashbinpath = *STDOUT; - $actions{$action}->(); - return; - }elsif( $cache_enable == 1 ){ - #obviously we are using file based caching - if(! -d $cachedir){ print "*** Warning ***: Caching enabled but cache directory does not exsist. ($cachedir)\n"; - mkdir ("cache", 0665) || die "Cannot create cache dir - you will need to manually create"; + mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create"; print "Cache directory created successfully\n"; } @@ -47,10 +35,9 @@ sub cache_fetch { $fullhashpath = "$fullhashdir/". substr( $urlhash, 2 ); $fullhashbinpath = "$fullhashpath.bin.wt"; $fullhashbinpathfinal = "$fullhashpath.bin"; - } # done dealing with cache enabled / disabled if(! -e "$fullhashpath" ){ - if( ! $cache_enable || ! $cacheDoFork || ! defined(my $childPid = fork()) ){ + if(! $cacheDoFork || ! defined(my $childPid = fork()) ){ cacheUpdate($action,0); cacheDisplay($action); } elsif ( $childPid == 0 ){ @@ -112,8 +99,7 @@ sub cacheUpdate { $lockStatus = $lockStatBG; }else{ - open(cacheFile, '>:utf8', "$fullhashpath") if ($cache_enable > 0); - open(cacheFile, '>:utf8', \$fullhashpath) if ($cache_enable == 0); + open(cacheFile, '>:utf8', \$fullhashpath); my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); $lockStatus = $lockStat; @@ -121,7 +107,7 @@ sub cacheUpdate { #print "lock status: $lockStat\n"; - if ($cache_enable != 0 && ! $lockStatus ){ + if (! $lockStatus ){ if ( $areForked ){ exit(0); }else{ @@ -220,8 +206,7 @@ sub cacheWaitForUpdate { if( $backgroundCache ){ if( -e "$fullhashpath" ){ - open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); - open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + open(cacheFile, '<:utf8', "$fullhashpath"); $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); stat(cacheFile); close(cacheFile); @@ -242,8 +227,7 @@ sub cacheWaitForUpdate { ){ do { sleep 2 if $x > 0; - open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); - open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + open(cacheFile, '<:utf8', "$fullhashpath"); $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); close(cacheFile); $x++; @@ -298,8 +282,7 @@ EOF do { print "."; sleep 2 if $x > 0; - open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); - open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + open(cacheFile, '<:utf8', "$fullhashpath"); $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); close(cacheFile); $x++; @@ -317,11 +300,10 @@ sub cacheDisplay { $|++; my ($action) = @_; - open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0); - open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0); + open(cacheFile, '<:utf8', "$fullhashpath"); $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); - if ($cache_enable != 0 && ! $lockStat ){ + if (! $lockStat ){ close(cacheFile); cacheWaitForUpdate($action); } @@ -332,12 +314,10 @@ sub cacheDisplay { || $action eq "blob_plain" ) - && - $cache_enable > 0 ){ my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal"); $lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB); - if ($cache_enable != 0 && ! $lockStatBIN ){ + if (! $lockStatBIN ){ system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log"); close(cacheFile); close(cacheFileBin); @@ -363,3 +343,6 @@ sub cacheDisplay { close(cacheFile); $|--; } + +1; gitweb/Makefile | 3 + gitweb/gitweb.perl | 105 ++++++++++++-- gitweb/lib/cache.pl | 348 ++++++++++++++++++++++++++++++++++++++++++++++ gitweb/static/gitweb.css | 6 + 4 files changed, 450 insertions(+), 12 deletions(-) create mode 100644 gitweb/lib/cache.pl diff --git a/gitweb/Makefile b/gitweb/Makefile index 6fa7625..315753e 100644 --- a/gitweb/Makefile +++ b/gitweb/Makefile @@ -111,6 +111,9 @@ endif GITWEB_FILES += static/git-logo.png static/git-favicon.png +# Gitweb caching +GITWEB_MODULES += cache.pl + GITWEB_REPLACE = \ -e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \ -e 's|++GIT_BINDIR++|$(bindir)|g' \ diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 91e274f..abaeec6 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -243,6 +243,53 @@ our %avatar_size = ( # Leave it undefined (or set to 'undef') to turn off load checking. our $maxload = 300; +# This enables/disables the caching layer in gitweb. This currently only supports the +# 'dumb' file based caching layer, primarily used on git.kernel.org. this is reasonably +# effective but it has the downside of requiring a huge amount of disk space if there +# are a number of repositories involved. It is not uncommon for git.kernel.org to have +# on the order of 80G - 120G accumulate over the course of a few months. It is recommended +# that the cache directory be periodically completely deleted, and this is safe to perform. +# Suggested mechanism +# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush +our $caching_enabled = 0; + +# Used to set the minimum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to be under this number of seconds we set the cache timeout +# to this minimum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $minCacheTime = 20; + +# Used to set the maximum cache timeout for the dynamic caching algorithm. Basically +# if we calculate the cache to exceed this number of seconds we set the cache timeout +# to this maximum. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +our $maxCacheTime = 1200; + +# If you need to change the location of the caching directory, override this +# otherwise this will probably do fine for you +our $cachedir = 'cache'; + +# If this is set (to 1) cache will do it's best to always display something instead +# of making someone wait for the cache to update. This will launch the cacheUpdate +# into the background and it will lock a <file>.bg file and will only lock the +# actual cache file when it needs to write into it. In theory this will make +# gitweb seem more responsive at the price of possibly stale data. +our $backgroundCache = 1; + +# Used to set the maximum cache file life. If a cache files last modify time exceeds +# this value, it will assume that the data is just too old, and HAS to be regenerated +# instead of trying to display the existing cache data. +# Value is in seconds. 1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour +# 18000 = 5 hours +our $maxCacheLife = 18000; + +# Used to enable or disable background forking of the gitweb caching. Mainly here for debugging purposes +our $cacheDoFork = 1; + +our $fullhashpath = *STDOUT; +our $fullhashbinpath = *STDOUT; +our $fullhashbinpathfinal = *STDOUT; + # configuration for 'highlight' (http://www.andre-simon.de/) # match by basename our %highlight_basename = ( @@ -499,6 +546,15 @@ our %feature = ( 'default' => [0]}, ); +# +# Includes +# +if (!exists $INC{'cache.pl'}) { + my $return = do 'cache.pl'; + die $@ if $@; + die "Couldn't read 'cache.pl': $!" if (!defined $return); +} + sub gitweb_get_feature { my ($name) = @_; return unless exists $feature{$name}; @@ -727,6 +783,10 @@ our %actions = ( "project_list" => \&git_project_list, "project_index" => \&git_project_index, ); +sub is_cacheable { + my $action = shift; + return !($action eq 'blame_data' || $action eq 'blame_incremental'); +} # finally, we have the hash of allowed extra_options for the commands that # allow them @@ -1058,7 +1118,11 @@ sub dispatch { !$project) { die_error(400, "Project needed"); } - $actions{$action}->(); + if ($caching_enabled && is_cacheable($action)) { + cache_fetch($action); + } else { + $actions{$action}->(); + } } sub reset_timer { @@ -1128,6 +1192,7 @@ sub change_output { # Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n"; + print STDOUT_REAL ""; # Close STDOUT, so that it isn't being used anymore. close STDOUT; @@ -1153,10 +1218,7 @@ sub run { $pre_dispatch_hook->() if $pre_dispatch_hook; - change_output(); run_request(); - reset_output(); - print $output; $post_dispatch_hook->() if $post_dispatch_hook; @@ -3432,7 +3494,8 @@ sub git_header_html { # support xhtml+xml but choking when it gets what it asked for. if (defined $cgi->http('HTTP_ACCEPT') && $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ && - $cgi->Accept('application/xhtml+xml') != 0) { + $cgi->Accept('application/xhtml+xml') != 0 && + !$caching_enabled) { $content_type = 'application/xhtml+xml'; } else { $content_type = 'text/html'; @@ -3577,6 +3640,7 @@ sub git_footer_html { my $feed_class = 'rss_logo'; print "<div class=\"page_footer\">\n"; + print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n"; if (defined $project) { my $descr = git_get_project_description($project); if (defined $descr) { @@ -3665,9 +3729,14 @@ sub die_error { 500 => '500 Internal Server Error', 503 => '503 Service Unavailable', ); + # The output handlers for die_error need to be reset to STDOUT + # so that half the message isn't being output to random and + # half to STDOUT as expected. This is mainly for the benefit + # of using git_header_html() and git_footer_html() since + # # Reset the output so that we are actually going to STDOUT as opposed # to buffering the output. - reset_output(); + reset_output() if ($caching_enabled); git_header_html($http_responses{$status}, undef, %opts); print <<EOF; @@ -5577,9 +5646,15 @@ sub git_blob_plain { ($sandbox ? 'attachment' : 'inline') . '; filename="' . $save_as . '"'); local $/ = undef; - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } @@ -5864,9 +5939,15 @@ sub git_snapshot { open my $fd, "-|", $cmd or die_error(500, "Execute git-archive failed"); - binmode STDOUT, ':raw'; - print <$fd>; - binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + if ($caching_enabled) { + open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file"); + }else{ + open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file"); + } + binmode BINOUT, ':raw'; + print BINOUT <$fd>; + binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close BINOUT; close $fd; } diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl new file mode 100644 index 0000000..dd14bfb --- /dev/null +++ b/gitweb/lib/cache.pl @@ -0,0 +1,347 @@ +# gitweb - simple web interface to track changes in git repositories +# +# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net> +# +# This program is licensed under the GPLv2 + +# +# Gitweb caching engine +# + +#use File::Path qw(make_path remove_tree); +use File::Path qw(mkpath rmtree); # Used for compatability reasons +use Digest::MD5 qw(md5 md5_hex md5_base64); +use Fcntl ':flock'; +use File::Copy; + +sub cache_fetch { + my ($action) = @_; + my $cacheTime = 0; + + if(! -d $cachedir){ + print "*** Warning ***: Caching enabled but cache directory does not exsist. ($cachedir)\n"; + mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create"; + print "Cache directory created successfully\n"; + } + + our $full_url = "$my_url?". $ENV{'QUERY_STRING'}; + our $urlhash = md5_hex($full_url); + our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/"; + + eval { mkpath( $fullhashdir, 0, 0777 ) }; + if ($@) { + die_error(500, "Internal Server Error", "Could not create cache directory: $@"); + } + $fullhashpath = "$fullhashdir/". substr( $urlhash, 2 ); + $fullhashbinpath = "$fullhashpath.bin.wt"; + $fullhashbinpathfinal = "$fullhashpath.bin"; + + if(! -e "$fullhashpath" ){ + if(! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + cacheUpdate($action,1); + }else{ + cacheWaitForUpdate($action); + } + }else{ + #if cache is out dated, update + #else displayCache(); + open(cacheFile, '<', "$fullhashpath"); + stat(cacheFile); + close(cacheFile); + my $stat_time = (stat(_))[9]; + my $stat_size = (stat(_))[7]; + + $cacheTime = get_loadavg() * 60; + if( $cacheTime > $maxCacheTime ){ + $cacheTime = $maxCacheTime; + } + if( $cacheTime < $minCacheTime ){ + $cacheTime = $minCacheTime; + } + if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){ + if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){ + cacheUpdate($action,0); + cacheDisplay($action); + } elsif ( $childPid == 0 ){ + #run the updater + #print "Running updater\n"; + cacheUpdate($action,1); + }else{ + #print "Waiting for update\n"; + cacheWaitForUpdate($action); + } + } else { + cacheDisplay($action); + } + + + } + + # + # If all of the caching failes - lets go ahead and press on without it and fall back to 'default' + # non-caching behavior. This is the softest of the failure conditions. + # + #$actions{$action}->(); +} + +sub cacheUpdate { + my ($action,$areForked) = @_; + my $lockingStatus; + my $fileData = ""; + + if($backgroundCache){ + open(cacheFileBG, '>:utf8', "$fullhashpath.bg"); + my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStatBG; + }else{ + open(cacheFile, '>:utf8', \$fullhashpath); + my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB); + + $lockStatus = $lockStat; + } + #print "lock status: $lockStat\n"; + + + if (! $lockStatus ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath"); + my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB); + } + + # Trap all output from the action + change_output(); + + $actions{$action}->(); + + # Reset the outputs as we should be fine now + reset_output(); + + + if($backgroundCache){ + open(cacheFile, '>:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_EX); + + if (! $lockStat ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX); + + if (! $lockStatBIN ){ + if ( $areForked ){ + exit(0); + }else{ + return; + } + } + } + + # Actually dump the output to the proper file handler + local $/ = undef; + $|++; + print cacheFile "$output"; + $|--; + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!"; + + flock(cacheFileBinFINAL,LOCK_UN); + close(cacheFileBinFINAL); + + flock(cacheFileBinWT,LOCK_UN); + close(cacheFileBinWT); + } + + flock(cacheFile,LOCK_UN); + close(cacheFile); + + if($backgroundCache){ + flock(cacheFileBG,LOCK_UN); + close(cacheFileBG); + } + + if ( $areForked ){ + exit(0); + } else { + return; + } +} + + +sub cacheWaitForUpdate { + my ($action) = @_; + my $x = 0; + my $max = 10; + my $lockStat = 0; + + if( $backgroundCache ){ + if( -e "$fullhashpath" ){ + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + stat(cacheFile); + close(cacheFile); + + if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){ + cacheDisplay($action); + return; + } + } + } + + if( + $action eq "atom" + || + $action eq "rss" + || + $action eq "opml" + ){ + do { + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + + if( $x != $max ){ + cacheDisplay($action); + } + return; + } + + $| = 1; + + print $::cgi->header( + -type=>'text/html', + -charset => 'utf-8', + -status=> 200, + -expires => 'now', + # HTTP/1.0 + -Pragma => 'no-cache', + # HTTP/1.1 + -Cache_Control => join( + ', ', + qw( + private + no-cache + no-store + must-revalidate + max-age=0 + pre-check=0 + post-check=0 + ) + ) + ); + + print <<EOF; +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd"> +<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> --> +<!-- git core binaries version $git_version --> +<head> +<meta http-equiv="content-type" content="$content_type; charset=utf-8"/> +<meta name="generator" content="gitweb/$version git/$git_version"/> +<meta name="robots" content="index, nofollow"/> +<meta http-equiv="refresh" content="0"/> +<title>$title</title> +</head> +<body> +EOF + + print "Generating.."; + do { + print "."; + sleep 2 if $x > 0; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + close(cacheFile); + $x++; + $combinedLockStat = $lockStat; + } while ((! $combinedLockStat) && ($x < $max)); + print <<EOF; +</body> +</html> +EOF + return; +} + +sub cacheDisplay { + local $/ = undef; + $|++; + + my ($action) = @_; + open(cacheFile, '<:utf8', "$fullhashpath"); + $lockStat = flock(cacheFile,LOCK_SH|LOCK_NB); + + if (! $lockStat ){ + close(cacheFile); + cacheWaitForUpdate($action); + } + + if( + ( + $action eq "snapshot" + || + $action eq "blob_plain" + ) + ){ + my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal"); + $lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB); + if (! $lockStatBIN ){ + system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log"); + close(cacheFile); + close(cacheFileBin); + cacheWaitForUpdate($action); + } + + my $binfilesize = -s "$fullhashbinpathfinal"; + print "Content-Length: $binfilesize"; + } + while( <cacheFile> ){ + print $_; + } + if( + $action eq "snapshot" + || + $action eq "blob_plain" + ){ + binmode STDOUT, ':raw'; + print <cacheFileBin>; + binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi + close(cacheFileBin); + } + close(cacheFile); + $|--; +} + +1; diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css index 4132aab..972d32e 100644 --- a/gitweb/static/gitweb.css +++ b/gitweb/static/gitweb.css @@ -67,6 +67,12 @@ div.page_path { border-width: 0px 0px 1px; } +div.cachetime { + float: left; + margin-right: 10px; + color: #555555; +} + div.page_footer { height: 17px; padding: 4px 8px; -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching 2010-10-31 9:21 ` Jakub Narebski ` (3 preceding siblings ...) 2010-11-01 10:24 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski @ 2010-11-01 10:24 ` Jakub Narebski 4 siblings, 0 replies; 43+ messages in thread From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw) To: git Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis, admin, Jakub Narebski Add basic tests of caching support to t9500-gitweb-standalone-no-errors test: set $caching_enabled to true and check for errors for first time run (generating cache) and second time run (retrieving from cache) for a single view - summary view for a project. Check in the t9502-gitweb-standalone-parse-output test that gitweb produces the same output with and without caching, for first and second run, with binary or text output. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- All tests, including those added here, passes. t/t9500-gitweb-standalone-no-errors.sh | 19 ++++++++++++++ t/t9502-gitweb-standalone-parse-output.sh | 39 +++++++++++++++++++++++++++++ 2 files changed, 58 insertions(+), 0 deletions(-) diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh index 4f2b9b0..0ad5fc8 100755 --- a/t/t9500-gitweb-standalone-no-errors.sh +++ b/t/t9500-gitweb-standalone-no-errors.sh @@ -676,4 +676,23 @@ test_expect_success HIGHLIGHT \ gitweb_run "p=.git;a=blob;f=test.sh"' test_debug 'cat gitweb.log' +# ---------------------------------------------------------------------- +# caching + +cat >>gitweb_config.perl <<\EOF +$caching_enabled = 1; +$cachedir = 'cache'; # to clear right thing +EOF +rm -rf cache + +test_expect_success \ + 'caching enabled (project summary, first run, generating cache)' \ + 'gitweb_run "p=.git;a=summary"' +test_debug 'cat gitweb.log' + +test_expect_success \ + 'caching enabled (project summary, second run, cached version)' \ + 'gitweb_run "p=.git;a=summary"' +test_debug 'cat gitweb.log' + test_done diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh index dd83890..f3d0cf0 100755 --- a/t/t9502-gitweb-standalone-parse-output.sh +++ b/t/t9502-gitweb-standalone-parse-output.sh @@ -112,4 +112,43 @@ test_expect_success 'snapshot: hierarchical branch name (xx/test)' ' ' test_debug 'cat gitweb.headers' + +# ---------------------------------------------------------------------- +# whether gitweb with caching enabled produces the same output + +test_expect_success 'setup for caching tests (utf8 commit, binary file)' ' + . "$TEST_DIRECTORY"/t3901-utf8.txt && + cp "$TEST_DIRECTORY"/test9200a.png image.png && + git add image.png && + git commit -F "$TEST_DIRECTORY"/t3900/1-UTF-8.txt && + gitweb_run "p=.git;a=patch" && + mv gitweb.body no_cache.html && + gitweb_run "p=.git;a=blob_plain;f=image.png" && + mv gitweb.body no_cache.png +' + +test_expect_success 'enable caching' ' + cat >>gitweb_config.perl <<-\EOF && + $caching_enabled = 1; + $cachedir = "cache"; # to clear right thing + EOF + rm -rf cache +' + +for desc in 'generating cache' 'cached version'; do + test_expect_success "caching enabled, HTML output, $desc" ' + gitweb_run "p=.git;a=patch" && + mv gitweb.body cache.html && + test_cmp no_cache.html cache.html + ' +done + +for desc in 'generating cache' 'cached version'; do + test_expect_success "caching enabled, binary output, $desc" ' + gitweb_run "p=.git;a=blob_plain;f=image.png" && + mv gitweb.body cache.png && + cmp no_cache.png cache.png + ' +done + test_done -- 1.7.3 ^ permalink raw reply related [flat|nested] 43+ messages in thread
end of thread, other threads:[~2010-12-02 19:37 UTC | newest] Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-10-28 0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley 2010-10-28 0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley 2010-10-28 9:52 ` Jakub Narebski 2010-10-28 22:08 ` Junio C Hamano 2010-10-28 0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley 2010-10-28 9:56 ` Jakub Narebski 2010-10-28 0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley 2010-10-28 16:10 ` Jakub Narebski 2010-10-28 22:29 ` [PATCH 0/3] Gitweb caching v7 Junio C Hamano 2010-10-29 22:25 ` Junio C Hamano 2010-10-30 8:58 ` Jakub Narebski 2010-10-31 4:24 ` Junio C Hamano 2010-10-31 9:21 ` Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski 2010-11-12 23:35 ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski 2010-11-12 23:41 ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski 2010-11-17 23:10 ` Junio C Hamano 2010-11-18 13:37 ` Jakub Narebski 2010-12-02 10:17 ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski 2010-12-02 17:37 ` Junio C Hamano 2010-12-02 19:01 ` Jakub Narebski 2010-12-02 19:21 ` Junio C Hamano 2010-12-02 19:36 ` Jakub Narebski 2010-11-12 23:44 ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski 2010-11-12 23:56 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski 2010-11-28 11:22 ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski 2010-11-28 11:31 ` Jakub Narebski 2010-11-29 22:13 ` Junio C Hamano 2010-11-29 22:20 ` Junio C Hamano 2010-11-29 23:09 ` [PATCHv7.1 3/4 (amend v2)] " Jakub Narebski 2010-11-30 0:51 ` Junio C Hamano 2010-11-30 10:21 ` Jakub Narebski 2010-11-29 23:07 ` [PATCHv7.1 3/4] " demerphq 2010-11-29 23:26 ` demerphq 2010-11-29 23:54 ` Jakub Narebski 2010-11-13 0:01 ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski 2010-11-17 22:37 ` Junio C Hamano 2010-11-17 23:12 ` Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski 2010-11-01 18:50 ` [PATCHv7.1b " Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski 2010-11-01 10:24 ` [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.