All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Gitweb caching v7
@ 2010-10-28  0:42 John 'Warthog9' Hawley
  2010-10-28  0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley
                   ` (4 more replies)
  0 siblings, 5 replies; 43+ messages in thread
From: John 'Warthog9' Hawley @ 2010-10-28  0:42 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2493 bytes --]


Evening everyone,
 
This is the latest incarnation of gitweb w/ caching.  Per the general
consensus and requests from the recent Gittogether I'm re-submitting 
my patches.

Biggest changes are bringing it back in line with Junio's tree, couple
of locking fixes (specifically targeted at binary files), and at the 
nagging behest of Sverre I've cut the old patch down to a much more
manageable (and in his words "reviewable") series.

This still differs, slightly, from whats in production on kernel.org.
It's missing the index page git:// link, and kernel.org is still running
the broken out output buffering but I'm likely to flip over to this just
so I'm not maintaining quite so much (and the changes are easier to track
with upstream)

v7:
	- Rework output system, now central STDOUT redirect
	- Various fixes to caching brought in from existing
	  running system

v6:
	- Never saw the light of day
	- Various testing, and reworks.

v5:
	- Missed a couple of things that were in my local tree, and
	  added them back in.
	- Split up the die_error and the version matching patch
	- Set version matching to be on by default - otherwise this
	  really is code that will never get checked, or at best
	  enabled by default by distributions
	- Added a minor code cleanup with respect to $site_header
	  that was already in my tree
	- Applied against a more recent git tree vs. 1.6.6-rc2
	- Removed breakout patch for now (did that in v4 actually)
	  and will deal with that separately 

	http://git.kernel.org/?p=git/warthog9/gitweb.git;a=shortlog;h=refs/heads/gitweb-ml-v5

v4:
	- major re-working of the caching layer to use file handle
	  redirection instead of buffering output
	- other minor improvements

	http://git.kernel.org/?p=git/warthog9/gitweb.git;a=shortlog;h=refs/heads/gitweb-ml-v4
v3:
	- various minor re-works based on mailing list feedback,
	  this series was not sent to the mailing list.
v2:
	- Better breakout
	- You can actually disable the cache now

- John 'Warthog9' Hawley 


John 'Warthog9' Hawley (3):
  gitweb: Add option to force version match
  gitweb: add output buffering and associated functions
  gitweb: File based caching layer (from git.kernel.org)

 gitweb/README            |    4 +
 gitweb/cache.pm          |  365 ++++++++++++++++++++++++++++++++++++++++++++++
 gitweb/gitweb.perl       |  147 ++++++++++++++++++-
 gitweb/static/gitweb.css |    6 +
 4 files changed, 514 insertions(+), 8 deletions(-)
 create mode 100644 gitweb/cache.pm

-- 
1.7.2.3

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/3] gitweb: Add option to force version match
  2010-10-28  0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley
@ 2010-10-28  0:42 ` John 'Warthog9' Hawley
  2010-10-28  9:52   ` Jakub Narebski
  2010-10-28 22:08   ` Junio C Hamano
  2010-10-28  0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 43+ messages in thread
From: John 'Warthog9' Hawley @ 2010-10-28  0:42 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]


This adds $git_versions_must_match variable, which is set to true,
checks that we are running on the same version of git that we
shipped with, and if not throw '500 Internal Server Error' error.
What is checked is the version of gitweb (embedded in building
gitweb.cgi), against version of runtime git binary used.

Gitweb can usually run with a mismatched git install.  This is more
here to give an obvious warning as to whats going on vs. silently
failing.

By default this feature is turned on.

Signed-off-by: John 'Warthog9' Hawley <warthog9@kernel.org>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
 gitweb/README      |    4 ++++
 gitweb/gitweb.perl |   27 +++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 0 deletions(-)


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-gitweb-Add-option-to-force-version-match.patch --]
[-- Type: text/x-patch; name="0001-gitweb-Add-option-to-force-version-match.patch", Size: 2303 bytes --]

diff --git a/gitweb/README b/gitweb/README
index bf3664f..7ee8450 100644
--- a/gitweb/README
+++ b/gitweb/README
@@ -246,6 +246,10 @@ not include variables usually directly set during build):
    http://www.andre-simon.de due to assumptions about parameters and output).
    Useful if highlight is not installed on your webserver's PATH.
    [Default: highlight]
+ * $git_versions_must_match
+   If set, gitweb fails with 500 Internal Server Error if the version of gitweb
+   doesn't match version of git binary.  The default is true.
+
 
 Projects list file format
 ~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 8d7e4c5..215a4e9 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -232,6 +232,9 @@ our %avatar_size = (
 	'double'  => 32
 );
 
+# If it is true, exit if gitweb version and git binary version don't match
+our $git_versions_must_match = 1;
+
 # Used to set the maximum load that we will still respond to gitweb queries.
 # If server load exceed this value then return "503 server busy" error.
 # If gitweb cannot determined server load, it is taken to be 0.
@@ -649,6 +652,29 @@ sub check_loadavg {
 	}
 }
 
+sub check_versionmatch {
+	# Throw an error if git versions does not match, if $git_versions_must_match is true.
+	if ($git_versions_must_match &&
+	    $git_version ne $version) {
+		my $admin_contact =
+			defined $ENV{'SERVER_ADMIN'} ? ", $ENV{'SERVER_ADMIN'}," : '';
+		my $err_msg = <<EOT;
+<h1 align="center">*** Warning ***</h1>
+<p>
+This version of gitweb was compiled for <b>@{[esc_html($version)]}</b>,
+however git version <b>@{[esc_html($git_version)]}</b> was found on server,
+and administrator requested strict version checking.
+</p>
+<p>
+Please contact the server administrator${admin_contact} to either configure
+gitweb to allow mismatched versions, or update git or gitweb installation.
+</p>
+EOT
+		die_error(500, 'Internal server error', $err_msg);
+	}
+
+}
+
 # ======================================================================
 # input validation and dispatch
 
@@ -1075,6 +1101,7 @@ sub run_request {
 	evaluate_uri();
 	evaluate_gitweb_config();
 	check_loadavg();
+	check_versionmatch();
 
 	# $projectroot and $projects_list might be set in gitweb config file
 	$projects_list ||= $projectroot;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 2/3] gitweb: add output buffering and associated functions
  2010-10-28  0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley
  2010-10-28  0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley
@ 2010-10-28  0:42 ` John 'Warthog9' Hawley
  2010-10-28  9:56   ` Jakub Narebski
  2010-10-28  0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 43+ messages in thread
From: John 'Warthog9' Hawley @ 2010-10-28  0:42 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 556 bytes --]


This adds output buffering for gitweb, mainly in preparation for
caching support.  This is a dramatic change to how caching was being
done, mainly in passing around the variable manually and such.

This centrally flips the entire STDOUT to a variable, which after the
completion of the run, flips it back and does a print on the resulting
data.

This should save on the previous 10K line patch (or so) that adds more
explicit output passing.
---
 gitweb/gitweb.perl |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0002-gitweb-add-output-buffering-and-associated-functions.patch --]
[-- Type: text/x-patch; name="0002-gitweb-add-output-buffering-and-associated-functions.patch", Size: 1595 bytes --]

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 215a4e9..757ef46 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -30,6 +30,9 @@ BEGIN {
 
 our $version = "++GIT_VERSION++";
 
+# Output buffer variable
+$output = "";
+
 our ($my_url, $my_uri, $base_url, $path_info, $home_link);
 sub evaluate_uri {
 	our $cgi;
@@ -1151,6 +1154,25 @@ sub evaluate_argv {
 	);
 }
 
+sub change_output {
+	our $output;
+
+	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
+	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+
+	# Close STDOUT, so that it isn't being used anymore.
+	close STDOUT;
+
+	# Trap STDOUT to the $output variable, which is what I was using in the original
+	# patch anyway.
+	open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var
+}
+
+sub reset_output {
+	# This basically takes STDOUT_REAL and puts it back as STDOUTl
+	open(STDOUT,">&STDOUT_REAL");
+}
+
 sub run {
 	evaluate_argv();
 	evaluate_git_version();
@@ -1163,7 +1185,10 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
+		change_output();
 		run_request();
+		reset_output();
+		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3673,6 +3698,10 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# Reset the output so that we are actually going to STDOUT as opposed
+	# to buffering the output.
+	reset_output();
+
 	git_header_html($http_responses{$status}, undef, %opts);
 	print <<EOF;
 <div class="page_body">

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org)
  2010-10-28  0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley
  2010-10-28  0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley
  2010-10-28  0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley
@ 2010-10-28  0:42 ` John 'Warthog9' Hawley
  2010-10-28 16:10   ` Jakub Narebski
  2010-10-28 22:29 ` [PATCH 0/3] Gitweb caching v7 Junio C Hamano
  2010-10-29 22:25 ` Junio C Hamano
  4 siblings, 1 reply; 43+ messages in thread
From: John 'Warthog9' Hawley @ 2010-10-28  0:42 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1695 bytes --]


This is a relatively large patch that implements the file based
caching layer that is quite similar to the one used  on such large
sites as kernel.org and soon git.fedoraproject.org.  This provides
a simple, and straight forward caching mechanism that scales
dramatically better than Gitweb by itself.

The caching layer basically buffers the output that Gitweb would
normally return, and saves that output to a cache file on the local
disk.  When the file is requested it attempts to gain a shared lock
on the cache file and cat it out to the client.  Should an exclusive
lock be on a file (it's being updated) the code has a choice to either
update in the background and go ahead and show the stale page while
update is being performed, or stall the client(s) until the page
is generated.

There are two forms of stalling involved here, background building
and non-background building, both of which are discussed in the
configuration page.

There are still a few known "issues" with respect to this:
- Code needs to be added to be "browser" aware so
  that clients like wget that are trying to get a
  binary blob don't obtain a "Generating..." page

Caching is disabled by default with the $cache_enable variable,
setting this to 1 will enable file based caching.  It is expected
that this will be extended to include additional types of caching
(like memcached) in the future and should not be exclusively
considered a binary value.
---
 gitweb/cache.pm          |  365 ++++++++++++++++++++++++++++++++++++++++++++++
 gitweb/gitweb.perl       |   99 +++++++++++--
 gitweb/static/gitweb.css |    6 +
 3 files changed, 458 insertions(+), 12 deletions(-)
 create mode 100644 gitweb/cache.pm


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0003-gitweb-File-based-caching-layer-from-git.kernel.org.patch --]
[-- Type: text/x-patch; name="0003-gitweb-File-based-caching-layer-from-git.kernel.org.patch", Size: 15811 bytes --]

diff --git a/gitweb/cache.pm b/gitweb/cache.pm
new file mode 100644
index 0000000..c86265c
--- /dev/null
+++ b/gitweb/cache.pm
@@ -0,0 +1,365 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net>
+#
+# This program is licensed under the GPLv2
+
+#
+# Gitweb caching engine
+#
+
+#use File::Path qw(make_path remove_tree);
+use File::Path qw(mkpath rmtree); # Used for compatability reasons
+use Digest::MD5 qw(md5 md5_hex md5_base64);
+use Fcntl ':flock';
+use File::Copy;
+
+sub cache_fetch {
+	my ($action) = @_;
+	my $cacheTime = 0;
+
+	# Deal with cache being disabled
+	if( $cache_enable == 0 ){
+		undef $backgroundCache;
+		#$fullhashpath = \$nocachedata;
+		#$fullhashbinpath = \$nocachedatabin;
+		$fullhashpath = *STDOUT;
+		$fullhashbinpath = *STDOUT;
+		$actions{$action}->();
+		return;
+	}elsif( $cache_enable == 1 ){
+		#obviously we are using file based caching
+
+		if(! -d $cachedir){
+			print "*** Warning ***: Caching enabled but cache directory does not exsist.  ($cachedir)\n";
+			mkdir ("cache", 0665) || die "Cannot create cache dir - you will need to manually create";
+			print "Cache directory created successfully\n";
+		}
+
+		our $full_url = "$my_url?". $ENV{'QUERY_STRING'};
+		our $urlhash = md5_hex($full_url);
+		our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/";
+
+		eval { mkpath( $fullhashdir, 0, 0777 ) };
+		if ($@) {
+			die_error(500, "Internal Server Error", "Could not create cache directory: $@");
+		}
+		$fullhashpath = "$fullhashdir/". substr( $urlhash, 2 );
+		$fullhashbinpath = "$fullhashpath.bin.wt";
+		$fullhashbinpathfinal = "$fullhashpath.bin";
+	} # done dealing with cache enabled / disabled
+
+	if(! -e "$fullhashpath" ){
+		if( ! $cache_enable || ! $cacheDoFork || ! defined(my $childPid = fork()) ){
+			cacheUpdate($action,0);
+			cacheDisplay($action);
+		} elsif ( $childPid == 0 ){
+			#run the updater
+			cacheUpdate($action,1);
+		}else{
+			cacheWaitForUpdate($action);
+		}
+	}else{
+		#if cache is out dated, update
+		#else displayCache();
+		open(cacheFile, '<', "$fullhashpath");
+		stat(cacheFile);
+		close(cacheFile);
+		my $stat_time = (stat(_))[9];
+		my $stat_size = (stat(_))[7];
+
+		$cacheTime = get_loadavg() * 60;
+		if( $cacheTime > $maxCacheTime ){
+			$cacheTime = $maxCacheTime;
+		}
+		if( $cacheTime < $minCacheTime ){
+			$cacheTime = $minCacheTime;
+		}
+		if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){
+			if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){
+				cacheUpdate($action,0);
+				cacheDisplay($action);
+			} elsif ( $childPid == 0 ){
+				#run the updater
+				#print "Running updater\n";
+				cacheUpdate($action,1);
+			}else{
+				#print "Waiting for update\n";
+				cacheWaitForUpdate($action);
+			}
+		} else {
+			cacheDisplay($action);
+		}
+
+
+	}
+
+	#
+	# If all of the caching failes - lets go ahead and press on without it and fall back to 'default'
+	# non-caching behavior.  This is the softest of the failure conditions.
+	#
+	#$actions{$action}->();
+}
+
+sub cacheUpdate {
+	my ($action,$areForked) = @_;
+	my $lockingStatus;
+	my $fileData = "";
+
+	if($backgroundCache){
+		open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
+		my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStatBG;
+	}else{
+		open(cacheFile, '>:utf8', "$fullhashpath") if ($cache_enable > 0);
+		open(cacheFile, '>:utf8', \$fullhashpath) if ($cache_enable == 0);
+		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStat;
+	}
+	#print "lock status: $lockStat\n";
+
+
+	if ($cache_enable != 0 && ! $lockStatus ){
+		if ( $areForked ){
+			exit(0);
+		}else{
+			return;
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath");
+		my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB);
+	}
+
+	# Trap all output from the action
+	change_output();
+
+	$actions{$action}->();
+
+	# Reset the outputs as we should be fine now
+	reset_output();
+
+
+	if($backgroundCache){
+		open(cacheFile, '>:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_EX);
+
+		if (! $lockStat ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX);
+
+		if (! $lockStatBIN ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	# Actually dump the output to the proper file handler
+	local $/ = undef;
+	$|++;
+	print cacheFile "$output";
+	$|--;
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!";
+
+		flock(cacheFileBinFINAL,LOCK_UN);
+		close(cacheFileBinFINAL);
+
+		flock(cacheFileBinWT,LOCK_UN);
+		close(cacheFileBinWT);
+	}
+
+	flock(cacheFile,LOCK_UN);
+	close(cacheFile);
+
+	if($backgroundCache){
+		flock(cacheFileBG,LOCK_UN);
+		close(cacheFileBG);
+	}
+
+	if ( $areForked ){
+		exit(0);
+	} else {
+		return;
+	}
+}
+
+
+sub cacheWaitForUpdate {
+	my ($action) = @_;
+	my $x = 0;
+	my $max = 10;
+	my $lockStat = 0;
+
+	if( $backgroundCache ){
+		if( -e "$fullhashpath" ){
+			open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
+			open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			stat(cacheFile);
+			close(cacheFile);
+
+			if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){
+				cacheDisplay($action);
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "atom"
+		||
+		$action eq "rss"
+		||
+		$action eq "opml"
+	){
+		do {
+			sleep 2 if $x > 0;
+			open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
+			open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			close(cacheFile);
+			$x++;
+			$combinedLockStat = $lockStat;
+		} while ((! $combinedLockStat) && ($x < $max));
+
+		if( $x != $max ){
+			cacheDisplay($action);
+		}
+		return;
+	}
+
+	$| = 1;
+
+	print $::cgi->header(
+				-type=>'text/html',
+				-charset => 'utf-8',
+				-status=> 200,
+				-expires => 'now',
+				# HTTP/1.0
+				-Pragma => 'no-cache',
+				# HTTP/1.1
+				-Cache_Control => join(
+							', ',
+							qw(
+								private
+								no-cache
+								no-store
+								must-revalidate
+								max-age=0
+								pre-check=0
+								post-check=0
+							)
+						)
+				);
+
+	print <<EOF;
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd">
+<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> -->
+<!-- git core binaries version $git_version -->
+<head>
+<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
+<meta name="generator" content="gitweb/$version git/$git_version"/>
+<meta name="robots" content="index, nofollow"/>
+<meta http-equiv="refresh" content="0"/>
+<title>$title</title>
+</head>
+<body>
+EOF
+
+	print "Generating..";
+	do {
+		print ".";
+		sleep 2 if $x > 0;
+		open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
+		open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
+		$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+		close(cacheFile);
+		$x++;
+		$combinedLockStat = $lockStat;
+	} while ((! $combinedLockStat) && ($x < $max));
+	print <<EOF;
+</body>
+</html>
+EOF
+	return;
+}
+
+sub cacheDisplay {
+	local $/ = undef;
+	$|++;
+
+	my ($action) = @_;
+	open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
+	open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
+	$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+
+	if ($cache_enable != 0 && ! $lockStat ){
+		close(cacheFile);
+		cacheWaitForUpdate($action);
+	}
+
+	if(
+		(
+			$action eq "snapshot"
+			||
+			$action eq "blob_plain"
+		)
+		&&
+		$cache_enable > 0
+	){
+		my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB);
+		if ($cache_enable != 0 && ! $lockStatBIN ){
+			system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log");
+			close(cacheFile);
+			close(cacheFileBin);
+			cacheWaitForUpdate($action);
+		}
+
+		my $binfilesize = -s "$fullhashbinpathfinal";
+		print "Content-Length: $binfilesize";
+	}
+	while( <cacheFile> ){
+		print $_;
+	}
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		binmode STDOUT, ':raw';
+		print <cacheFileBin>;
+		binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+		close(cacheFileBin);
+	}
+	close(cacheFile);
+	$|--;
+}
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 757ef46..eb53075 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -31,7 +31,7 @@ BEGIN {
 our $version = "++GIT_VERSION++";
 
 # Output buffer variable
-$output = "";
+our $output = "";
 
 our ($my_url, $my_uri, $base_url, $path_info, $home_link);
 sub evaluate_uri {
@@ -244,6 +244,57 @@ our $git_versions_must_match = 1;
 # Leave it undefined (or set to 'undef') to turn off load checking.
 our $maxload = 300;
 
+# This enables/disables the caching layer in gitweb.  This currently only supports the
+# 'dumb' file based caching layer, primarily used on git.kernel.org.  this is reasonably
+# effective but it has the downside of requiring a huge amount of disk space if there
+# are a number of repositories involved.  It is not uncommon for git.kernel.org to have
+# on the order of 80G - 120G accumulate over the course of a few months.  It is recommended
+# that the cache directory be periodically completely deleted, and this is safe to perform.
+# Suggested mechanism
+# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush
+# Value is binary. 0 = disabled (default), 1 = enabled.
+#
+# Values of caching:
+# 	1 = 'dumb' file based caching used on git.kernel.org
+our $cache_enable = 0;
+
+# Used to set the minimum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to be under this number of seconds we set the cache timeout
+# to this minimum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $minCacheTime = 20;
+
+# Used to set the maximum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to exceed this number of seconds we set the cache timeout
+# to this maximum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $maxCacheTime = 1200;
+
+# If you need to change the location of the caching directory, override this
+# otherwise this will probably do fine for you
+our $cachedir = 'cache';
+
+# If this is set (to 1) cache will do it's best to always display something instead
+# of making someone wait for the cache to update.  This will launch the cacheUpdate
+# into the background and it will lock a <file>.bg file and will only lock the
+# actual cache file when it needs to write into it.  In theory this will make
+# gitweb seem more responsive at the price of possibly stale data.
+our $backgroundCache = 1;
+
+# Used to set the maximum cache file life.  If a cache files last modify time exceeds
+# this value, it will assume that the data is just too old, and HAS to be regenerated
+# instead of trying to display the existing cache data.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+# 18000 = 5 hours
+our $maxCacheLife = 18000;
+
+# Used to enable or disable background forking of the gitweb caching.  Mainly here for debugging purposes
+our $cacheDoFork = 1;
+
+our $fullhashpath = *STDOUT;
+our $fullhashbinpath = *STDOUT;
+our $fullhashbinpathfinal = *STDOUT;
+
 # configuration for 'highlight' (http://www.andre-simon.de/)
 # match by basename
 our %highlight_basename = (
@@ -500,6 +551,11 @@ our %feature = (
 		'default' => [0]},
 );
 
+#
+# Includes
+#
+do 'cache.pm';
+
 sub gitweb_get_feature {
 	my ($name) = @_;
 	return unless exists $feature{$name};
@@ -1089,7 +1145,8 @@ sub dispatch {
 	    !$project) {
 		die_error(400, "Project needed");
 	}
-	$actions{$action}->();
+	#$actions{$action}->();
+	cache_fetch($action);
 }
 
 sub reset_timer {
@@ -1159,6 +1216,7 @@ sub change_output {
 
 	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
 	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+	print STDOUT_REAL "";
 
 	# Close STDOUT, so that it isn't being used anymore.
 	close STDOUT;
@@ -1185,10 +1243,7 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
-		change_output();
 		run_request();
-		reset_output();
-		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3465,7 +3520,9 @@ sub git_header_html {
 	# support xhtml+xml but choking when it gets what it asked for.
 	if (defined $cgi->http('HTTP_ACCEPT') &&
 	    $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
-	    $cgi->Accept('application/xhtml+xml') != 0) {
+	    $cgi->Accept('application/xhtml+xml') != 0
+	    &&
+	    $cache_enable == 0) {
 		$content_type = 'application/xhtml+xml';
 	} else {
 		$content_type = 'text/html';
@@ -3610,6 +3667,7 @@ sub git_footer_html {
 	my $feed_class = 'rss_logo';
 
 	print "<div class=\"page_footer\">\n";
+	print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n";
 	if (defined $project) {
 		my $descr = git_get_project_description($project);
 		if (defined $descr) {
@@ -3698,6 +3756,11 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# The output handlers for die_error need to be reset to STDOUT
+	# so that half the message isn't being output to random and
+	# half to STDOUT as expected.  This is mainly for the benefit
+	# of using git_header_html() and git_footer_html() since
+	#
 	# Reset the output so that we are actually going to STDOUT as opposed
 	# to buffering the output.
 	reset_output();
@@ -5610,9 +5673,15 @@ sub git_blob_plain {
 			($sandbox ? 'attachment' : 'inline')
 			. '; filename="' . $save_as . '"');
 	local $/ = undef;
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if( $cache_enable != 0){
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
@@ -5897,9 +5966,15 @@ sub git_snapshot {
 
 	open my $fd, "-|", $cmd
 		or die_error(500, "Execute git-archive failed");
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if( $cache_enable != 0){
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
index 4132aab..972d32e 100644
--- a/gitweb/static/gitweb.css
+++ b/gitweb/static/gitweb.css
@@ -67,6 +67,12 @@ div.page_path {
 	border-width: 0px 0px 1px;
 }
 
+div.cachetime {
+	float: left;
+	margin-right: 10px;
+	color: #555555;
+}
+
 div.page_footer {
 	height: 17px;
 	padding: 4px 8px;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] gitweb: Add option to force version match
  2010-10-28  0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley
@ 2010-10-28  9:52   ` Jakub Narebski
  2010-10-28 22:08   ` Junio C Hamano
  1 sibling, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-10-28  9:52 UTC (permalink / raw)
  To: John 'Warthog9' Hawley; +Cc: git

"John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes:

> This adds $git_versions_must_match variable, which is set to true,
> checks that we are running on the same version of git that we
> shipped with, and if not throw '500 Internal Server Error' error.
> What is checked is the version of gitweb (embedded in building
> gitweb.cgi), against version of runtime git binary used.
> 
> Gitweb can usually run with a mismatched git install.  This is more
> here to give an obvious warning as to whats going on vs. silently
> failing.

Were you bitten by mismatch between git version and gitweb version?
Just how serious is that (hypothetical?) situation?

Should it be warning, or erroring out?

> 
> By default this feature is turned on.

I think last time this patch stalled on discussion whether this
feature should be turned on by default, where it can break some
setups; or be turned off by default, where it is much less useful
protecting against errors.

> 
> Signed-off-by: John 'Warthog9' Hawley <warthog9@kernel.org>
> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
> ---
>  gitweb/README      |    4 ++++
>  gitweb/gitweb.perl |   27 +++++++++++++++++++++++++++
>  2 files changed, 31 insertions(+), 0 deletions(-)

For me, if $git_versions_must_match is to be on by default, I would
prefer also update to t/gitweb-lib.sh, so hat I don't have to
recompile git to test new version of gitweb...  

> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 8d7e4c5..215a4e9 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -232,6 +232,9 @@ our %avatar_size = (
>  	'double'  => 32
>  );
>  
> +# If it is true, exit if gitweb version and git binary version don't match
> +our $git_versions_must_match = 1;
> +
>  # Used to set the maximum load that we will still respond to gitweb queries.
>  # If server load exceed this value then return "503 server busy" error.
>  # If gitweb cannot determined server load, it is taken to be 0.
> @@ -649,6 +652,29 @@ sub check_loadavg {
>  	}
>  }
>  
> +sub check_versionmatch {
> +	# Throw an error if git versions does not match, if $git_versions_must_match is true.
> +	if ($git_versions_must_match &&
> +	    $git_version ne $version) {
> +		my $admin_contact =
> +			defined $ENV{'SERVER_ADMIN'} ? ", $ENV{'SERVER_ADMIN'}," : '';
> +		my $err_msg = <<EOT;
> +<h1 align="center">*** Warning ***</h1>
> +<p>
> +This version of gitweb was compiled for <b>@{[esc_html($version)]}</b>,
> +however git version <b>@{[esc_html($git_version)]}</b> was found on server,
> +and administrator requested strict version checking.
> +</p>
> +<p>
> +Please contact the server administrator${admin_contact} to either configure
> +gitweb to allow mismatched versions, or update git or gitweb installation.
> +</p>
> +EOT
> +		die_error(500, 'Internal server error', $err_msg);
> +	}
> +
> +}

Nice.

> +
>  # ======================================================================
>  # input validation and dispatch
>  
> @@ -1075,6 +1101,7 @@ sub run_request {
>  	evaluate_uri();
>  	evaluate_gitweb_config();
>  	check_loadavg();
> +	check_versionmatch();

Shouldn't it be before check_loadavg()?

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/3] gitweb: add output buffering and associated functions
  2010-10-28  0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley
@ 2010-10-28  9:56   ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-10-28  9:56 UTC (permalink / raw)
  To: John 'Warthog9' Hawley; +Cc: git

"John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes:

> This adds output buffering for gitweb, mainly in preparation for
> caching support.  This is a dramatic change to how caching was being
> done, mainly in passing around the variable manually and such.
> 
> This centrally flips the entire STDOUT to a variable, which after the
> completion of the run, flips it back and does a print on the resulting
> data.
> 
> This should save on the previous 10K line patch (or so) that adds more
> explicit output passing.

Signoff?

> ---
>  gitweb/gitweb.perl |   29 +++++++++++++++++++++++++++++
>  1 files changed, 29 insertions(+), 0 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 215a4e9..757ef46 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -30,6 +30,9 @@ BEGIN {
>  
>  our $version = "++GIT_VERSION++";
>  
> +# Output buffer variable
> +$output = "";
> +
>  our ($my_url, $my_uri, $base_url, $path_info, $home_link);
>  sub evaluate_uri {
>  	our $cgi;
> @@ -1151,6 +1154,25 @@ sub evaluate_argv {
>  	);
>  }
>  
> +sub change_output {
> +	our $output;
> +
> +	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
> +	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
> +
> +	# Close STDOUT, so that it isn't being used anymore.
> +	close STDOUT;
> +
> +	# Trap STDOUT to the $output variable, which is what I was using in the original
> +	# patch anyway.
> +	open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var
> +}
> +
> +sub reset_output {
> +	# This basically takes STDOUT_REAL and puts it back as STDOUTl
> +	open(STDOUT,">&STDOUT_REAL");
> +}

Nice solution.  I'll steal it for GitwebCache::Capture::Redirect (or
something like that, instead of relying on a bit cryptic select($fh)).

> @@ -1163,7 +1185,10 @@ sub run {
>  		$pre_dispatch_hook->()
>  			if $pre_dispatch_hook;
>  
> +		change_output();
>  		run_request();
> +		reset_output();
> +		print $output;

Doesn't this enable capture unconditionally?  

Wouldn't that screw-up the blame_data part of blame_interactive Ajax-y
view?  Wouldn't that decrease perceived responsiveness of gitweb?

>  
>  		$post_dispatch_hook->()
>  			if $post_dispatch_hook;
> @@ -3673,6 +3698,10 @@ sub die_error {
>  		500 => '500 Internal Server Error',
>  		503 => '503 Service Unavailable',
>  	);
> +	# Reset the output so that we are actually going to STDOUT as opposed
> +	# to buffering the output.
> +	reset_output();
> +

Good.

>  	git_header_html($http_responses{$status}, undef, %opts);
>  	print <<EOF;
>  <div class="page_body">

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org)
  2010-10-28  0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley
@ 2010-10-28 16:10   ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-10-28 16:10 UTC (permalink / raw)
  To: John 'Warthog9' Hawley; +Cc: git, Jakub Narebski

I am not commenting on minor style issues, like using camelCase names
of variables and subroutines, of 'if( ... )' instead of 'if (...)'
etc. minor issues.


"John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes:

> This is a relatively large patch that implements the file based
> caching layer that is quite similar to the one used  on such large
> sites as kernel.org and soon git.fedoraproject.org.  This provides
> a simple, and straight forward caching mechanism that scales
> dramatically better than Gitweb by itself.

"Quite similar" - what does it mean?  In what way it is different from
the one used on git.kernel.org?

> 
> The caching layer basically buffers the output that Gitweb would
> normally return, and saves that output to a cache file on the local
> disk.

O.K.  This means caching HTTP response (caching output of CGI script),
using "simple" file based caching engine, without any serialization.

But this doesn't tell us how the cache expiration is handled, i.e. how
do you know that cache entry is expired, and should be regenerated.


>  When the file is requested it attempts to gain a shared lock
> on the cache file and cat it out to the client.  Should an exclusive
> lock be on a file (it's being updated) the code has a choice to either
> update in the background and go ahead and show the stale page while
> update is being performed, or stall the client(s) until the page
> is generated.

In my rewrite of earlier version of your gitweb caching patch it is
responsibility of caching engine (GitwebCache::FileCacheWithLocking)
to serialize (via locking) (re)generating cache entry.  Gitweb just
passes page key, function that returns captured output to cache
instance; it passes callback for "staling" the client to cache
constructor.

It is not the "file" that is being requested: on the gitweb level it
is a page / action / view that is being requested, on the level of
caching negine it is cache entry that is requested... only at the low
level of cache driver that a file holding a cached entry is requested.

> 
> There are two forms of stalling involved here, background building
> and non-background building, both of which are discussed in the
> configuration page.
> 
> There are still a few known "issues" with respect to this:
> - Code needs to be added to be "browser" aware so
>   that clients like wget that are trying to get a
>   binary blob don't obtain a "Generating..." page

Right.  This is also missing from my rewrite.

Sidenote: why "binary blob"?  Most downloading utilities do not follow
http-meta in-HTML redirects, do they?

> 
> Caching is disabled by default with the $cache_enable variable,
> setting this to 1 will enable file based caching.  It is expected
> that this will be extended to include additional types of caching
> (like memcached) in the future and should not be exclusively
> considered a binary value.

Why not do the same as in my rewrite, namely $cache_enabled or
$caching_enabled *boolean* value (not "binary"), and $cache variable
to select caching engine (undef to use default)?

> ---
>  gitweb/cache.pm          |  365 ++++++++++++++++++++++++++++++++++++++++++++++
>  gitweb/gitweb.perl       |   99 +++++++++++--
>  gitweb/static/gitweb.css |    6 +
>  3 files changed, 458 insertions(+), 12 deletions(-)
>  create mode 100644 gitweb/cache.pm

It would be nice to have at least some minimal testing for caching
enabled gitweb.  Hmmm... I should have split update to t9500-* test
so you would be able to cherry-pick it.

Also, it would be good idea to have 'make install-gitweb' to install
'cache.pm' alongside gitweb, especially that you do not do any error
checking for "do 'cache.pm';" (see below).

> 
> diff --git a/gitweb/cache.pm b/gitweb/cache.pm
> new file mode 100644
> index 0000000..c86265c
> --- /dev/null
> +++ b/gitweb/cache.pm
> @@ -0,0 +1,365 @@
> +# gitweb - simple web interface to track changes in git repositories
> +#
> +# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net>
> +#
> +# This program is licensed under the GPLv2
> +
> +#
> +# Gitweb caching engine
> +#

Wouldn't it be better to use real Perl module, i.e. start with
'package' line, like in my rewrite?

> +
> +#use File::Path qw(make_path remove_tree);

Is above line really needed?

> +use File::Path qw(mkpath rmtree); # Used for compatability reasons

Because [stable] Perl earlier than 5.10.0 has pre-2.00 version of
File::Path in core, which didn't have make_path and remove_tree.

At least that is compatibility with pre-2.00 File::Path should be
mentioned in this comment, if we are to mention it at all.

BTW. do you need rmtree?

> +use Digest::MD5 qw(md5 md5_hex md5_base64);

Do you use *all* those subroutines?

> +use Fcntl ':flock';

All right.

> +use File::Copy;

Why do you need this for?

> +
> +sub cache_fetch {
> +	my ($action) = @_;
> +	my $cacheTime = 0;
> +
> +	# Deal with cache being disabled
> +	if( $cache_enable == 0 ){

Wouldn't it be better to use boolean 'if (!$caching_enabled) {' rather
than explicitely test for numeric 0?

Wouldn't it be better to have this conditional _outside_ cache_fetch?

> +		undef $backgroundCache;
> +		#$fullhashpath = \$nocachedata;
> +		#$fullhashbinpath = \$nocachedatabin;

Leftover debugging?

> +		$fullhashpath = *STDOUT;
> +		$fullhashbinpath = *STDOUT;

WTF is this for?  Defensive programming?  You shouldn't protect
yourself against bugs in your program...

> +		$actions{$action}->();
> +		return;
> +	}elsif( $cache_enable == 1 ){
> +		#obviously we are using file based caching
> +
> +		if(! -d $cachedir){
> +			print "*** Warning ***: Caching enabled but cache directory does not exsist.  ($cachedir)\n";

Why print this warning?  Why not use 'warn' instead, so it would get
in web server logs?

Wouldn't printing this here screw up HTTP headers something fierce?
I think it is a bug; rarely encountered but still a bug.

> +			mkdir ("cache", 0665) || die "Cannot create cache dir - you will need to manually create";
> +			print "Cache directory created successfully\n";
> +		}
> +
> +		our $full_url = "$my_url?". $ENV{'QUERY_STRING'};

Why not use 'href(-replay => 1, -full => 1, -path_info => 0)'?

> +		our $urlhash = md5_hex($full_url);
> +		our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/";

Why 'our' and not 'my', i.e. why use global variables, and not local
to function (lexical scoped to be more exact)?

We use 'our' in gitweb *only* because otherwise mod_perl with
ModPerl::Registry handler wouldn't work, see e.g. dde80d9 (gitweb: Fix
mod_perl support., 2008-11-06) which fixes one issue of 'my' instead
of 'our'.  But global variables are bad idea...

> +
> +		eval { mkpath( $fullhashdir, 0, 0777 ) };
> +		if ($@) {
> +			die_error(500, "Internal Server Error", "Could not create cache directory: $@");
> +		}

Good... but wouldn't it leak sensitive information?

> +		$fullhashpath = "$fullhashdir/". substr( $urlhash, 2 );
> +		$fullhashbinpath = "$fullhashpath.bin.wt";
> +		$fullhashbinpathfinal = "$fullhashpath.bin";

First, why do you treat binary and non-binary case differently,
i.e. you have $fullhashbinpathfinal (egh, thats a mouthfull ;-), but
not $fullhashpathfinal?

Second, do you really need to have distinct code for binary and
non-binary case besides what gitweb already has with 
"binmode STDOUT, ':raw';" in those places few that needs it?

If you are capturing to in-memory file (to scalar), and binmode acts
on said in-memory file, then what gets saved is raw (bytes) data.
Then you can print it raw to web browser, save it raw to cache file,
and read it raw from cache file.

> +	} # done dealing with cache enabled / disabled
> +
> +	if(! -e "$fullhashpath" ){

Why it is not within 'if ($caching_enabled)' branch of conditional?

> +		if( ! $cache_enable || ! $cacheDoFork || ! defined(my $childPid = fork()) ){
> +			cacheUpdate($action,0);
> +			cacheDisplay($action);
> +		} elsif ( $childPid == 0 ){
> +			#run the updater
> +			cacheUpdate($action,1);
> +		}else{
> +			cacheWaitForUpdate($action);
> +		}

WTF do you have forking *here*?  If cache entry is not expired, you
can just read it, without need for forking.  It is only when you need
to (re)generate cache that forking is needed.

> +	}else{
> +		#if cache is out dated, update
> +		#else displayCache();

Lefotover statement.

> +		open(cacheFile, '<', "$fullhashpath");
> +		stat(cacheFile);
> +		close(cacheFile);

You don't need to open file to stat it, you can do

  +		stat($fullhashpath);

BTW. what happens if output is binary?

> +		my $stat_time = (stat(_))[9];
> +		my $stat_size = (stat(_))[7];
> +


> +		$cacheTime = get_loadavg() * 60;
> +		if( $cacheTime > $maxCacheTime ){
> +			$cacheTime = $maxCacheTime;
> +		}
> +		if( $cacheTime < $minCacheTime ){
> +			$cacheTime = $minCacheTime;
> +		}

All right.

> +		if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){

Some comment about '$stat_size == 0' test would be good idea, I think.

> +			if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){
> +				cacheUpdate($action,0);
> +				cacheDisplay($action);
> +			} elsif ( $childPid == 0 ){
> +				#run the updater
> +				#print "Running updater\n";
> +				cacheUpdate($action,1);
> +			}else{
> +				#print "Waiting for update\n";
> +				cacheWaitForUpdate($action);
> +			}

The above block of code is almost verbatim copy of code higher up.
Not good.

> +		} else {
> +			cacheDisplay($action);
> +		}

Wouldn't it be better to have those two branches of conditional
reversed, so that else part is larger?  Just a thought...

> +
> +
> +	}
> +
> +	#
> +	# If all of the caching failes - lets go ahead and press on without it and fall back to 'default'
> +	# non-caching behavior.  This is the softest of the failure conditions.

This comment refers to...

> +	#
> +	#$actions{$action}->();

... commented out code?  Eh?

> +}

You don't use persistent web environment[1] for git.kernel.org, do
you?

With persistent web environments, where gitweb script runs for a long
time, and doesn't exit after end of request, you really should
"daemonize" (detach) background process which regenerates cache, at
least when said process doesn't display web page to server (serve
stale data + regenerate cache in background).  Otherwise you would get
lots of zombie processes...

[1] Gitweb currently supports FastCGI (just rename it to gitweb.fcgi,
but you would need FCGI perl module installed together with
CGI::Fast), mod_perl via ModPerl::Registry handler, and you can use
it with PSGI via gitweb.psgi wrapper like the one generated by 
'git instaweb --httpd=plackup'.

> +
> +sub cacheUpdate {
> +	my ($action,$areForked) = @_;

Strange API.

> +	my $lockingStatus;
> +	my $fileData = "";
> +
> +	if($backgroundCache){
> +		open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
> +		my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
> +
> +		$lockStatus = $lockStatBG;
> +	}else{
> +		open(cacheFile, '>:utf8', "$fullhashpath") if ($cache_enable > 0);
> +		open(cacheFile, '>:utf8', \$fullhashpath) if ($cache_enable == 0);

Why do you save to in-memory file if caching is disabled, instead of
just printing to STDOUT?

> +		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
> +
> +		$lockStatus = $lockStat;
> +	}

Why do you need different names for lockfiles for background cache
entry enabled and for it disabled?  Why do you need $lockStat and
$lockStatBG if what you do is just assign it to $lockStatus?

Why do you use fileglobs (which are *global* variables!) and not
lexical filehandles?

> +	#print "lock status: $lockStat\n";

Leftover debugging?  

BTW. didn't you have problems debugging it from withing gitweb code,
and not as a separate caching engine which can be tested individually?

> +
> +
> +	if ($cache_enable != 0 && ! $lockStatus ){
> +		if ( $areForked ){
> +			exit(0);
> +		}else{
> +			return;
> +		}
> +	}

Strange code flow...

> +
> +	if(
> +		$action eq "snapshot"
> +		||
> +		$action eq "blob_plain"
> +	){
> +		my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath");
> +		my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB);
> +	}

Why '>>', and not '>'.  Why use ':utf8' on binary file?  Shouldn't you
use different file for locking than for saving cache output in?

> +
> +	# Trap all output from the action
> +	change_output();
> +
> +	$actions{$action}->();
> +
> +	# Reset the outputs as we should be fine now
> +	reset_output();

All right.

> +
> +
> +	if($backgroundCache){
> +		open(cacheFile, '>:utf8', "$fullhashpath");
> +		$lockStat = flock(cacheFile,LOCK_EX);
> +
> +		if (! $lockStat ){
> +			if ( $areForked ){
> +				exit(0);
> +			}else{
> +				return;
> +			}
> +		}
> +	}
> +
> +	if(
> +		$action eq "snapshot"
> +		||
> +		$action eq "blob_plain"
> +	){
> +		my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal");
> +		$lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX);
> +
> +		if (! $lockStatBIN ){
> +			if ( $areForked ){
> +				exit(0);
> +			}else{
> +				return;
> +			}
> +		}
> +	}

Didn't I see nearly exactly the same code?  Copy'n'paste programming
isn't good idea...

> +
> +	# Actually dump the output to the proper file handler
> +	local $/ = undef;
> +	$|++;
> +	print cacheFile "$output";
> +	$|--;

Why stringify $output?  Why not 'local $| = 1' too?

> +	if(
> +		$action eq "snapshot"
> +		||
> +		$action eq "blob_plain"
> +	){
> +		move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!";

Do you really need 'File::Copy::move' instead of simple 'rename' here?
Both files are in the same directory, so there should be no problems
with it.

> +
> +		flock(cacheFileBinFINAL,LOCK_UN);
> +		close(cacheFileBinFINAL);
> +
> +		flock(cacheFileBinWT,LOCK_UN);
> +		close(cacheFileBinWT);

Closing lockfile unlocks it, there is no need for LOCK_UN; moreover,
LOCK_Un is unreliable, as I have checked myself, and as I have read
on http://perl.plover.com/yak/flock/ and also on StackOverflow
http://stackoverflow.com/questions/34920/how-do-i-lock-a-file-in-perl
  

Why do binary files needs _two_ locks for?

> +	}
> +
> +	flock(cacheFile,LOCK_UN);
> +	close(cacheFile);
> +
> +	if($backgroundCache){
> +		flock(cacheFileBG,LOCK_UN);
> +		close(cacheFileBG);
> +	}

Why do background cache generation needs _two_ (or _four_ in case of
binary files!!!) locks?

> +
> +	if ( $areForked ){
> +		exit(0);
> +	} else {
> +		return;
> +	}

Why can't a caller do that?

> +}
> +
> +

I think it would be better to describe how cacheWaitForUpdate() works.

> +sub cacheWaitForUpdate {
> +	my ($action) = @_;
> +	my $x = 0;
> +	my $max = 10;
> +	my $lockStat = 0;
> +
> +	if( $backgroundCache ){
> +		if( -e "$fullhashpath" ){
> +			open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
> +			open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);

Same af above: when caching is disabled, why do you capture gitweb
output for?

> +			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
> +			stat(cacheFile);
> +			close(cacheFile);

Here you unlocked just acquired lock!

Sidenote: using real locsk with flock() has the advantage over trick
of creating *.lock files with O_CREAT in that if process dies (and all
files it opened are closed) the lock is automatically unlocked.
Therefore you don't have problems of dead lockfiles.

> +
> +			if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){
> +				cacheDisplay($action);
> +				return;
> +			}

Is running time() a second time intended, or just a side effect of
arrangement of code?

> +		}
> +	}
> +
> +	if(
> +		$action eq "atom"
> +		||
> +		$action eq "rss"
> +		||
> +		$action eq "opml"
> +	){

Sidenote: this is a bit ugly, and that is why in my rewrite I have
commit d20e6a4 (gitweb: Introduce %actions_info, gathering information
about actions, 2010-10-05).

Also, in my rewrite I can simply 'return' instead of 'exit()' here,
and you would get data from caching engine: no need for below code
block (which is repeated in very similar way a bit later in same
subroutine).

> +		do {
> +			sleep 2 if $x > 0;
> +			open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
> +			open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
> +			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
> +			close(cacheFile);

Here you unlock just acquired lock.  Is it intended?  

Is it safe against other process acquiting writer's lock (LOCK_EX)?
If it is, I probably should unlock it earlier and not have exit() do
it...

> +			$x++;
> +			$combinedLockStat = $lockStat;
> +		} while ((! $combinedLockStat) && ($x < $max));

My rewrite instead tries to acquire LOCK_SH *without* LOCK_NB, i.e. in
a blocking way... but with a timeout via $SIG{ALRM} and eval { ... }.

> +
> +		if( $x != $max ){
> +			cacheDisplay($action);
> +		}
> +		return;
> +	}
> +
> +	$| = 1;

I have autoflush turned on later.  But by all thats's koly, please
localize $| before changing its value, i.e.

  +	local $| = 1;


> +
> +	print $::cgi->header(

Why $::cgi->header()?

In my rewrite git_generating_data_html() subroutine is defined in
gitweb.perl, and passed as callback to cache constructor instead.

> +				-type=>'text/html',
> +				-charset => 'utf-8',
> +				-status=> 200,
> +				-expires => 'now',

That I have.

> +				# HTTP/1.0
> +				-Pragma => 'no-cache',
> +				# HTTP/1.1
> +				-Cache_Control => join(
> +							', ',
> +							qw(
> +								private
> +								no-cache
> +								no-store
> +								must-revalidate
> +								max-age=0
> +								pre-check=0
> +								post-check=0
> +							)
> +						)

This I don't, and I probably should have.  Good idea.

> +				);
> +
> +	print <<EOF;
> +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd">
> +<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> -->
> +<!-- git core binaries version $git_version -->
> +<head>
> +<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
> +<meta name="generator" content="gitweb/$version git/$git_version"/>
> +<meta name="robots" content="index, nofollow"/>

You meant here "noindex, nofollow", isn't it?

> +<meta http-equiv="refresh" content="0"/>
> +<title>$title</title>
> +</head>
> +<body>
> +EOF
> +
> +	print "Generating..";
> +	do {
> +		print ".";
> +		sleep 2 if $x > 0;
> +		open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
> +		open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
> +		$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
> +		close(cacheFile);
> +		$x++;
> +		$combinedLockStat = $lockStat;
> +	} while ((! $combinedLockStat) && ($x < $max));

A bit of code repetition here.

> +	print <<EOF;
> +</body>
> +</html>
> +EOF
> +	return;
> +}
> +
> +sub cacheDisplay {
> +	local $/ = undef;
> +	$|++;
> +
> +	my ($action) = @_;
> +	open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
> +	open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
> +	$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
> +
> +	if ($cache_enable != 0 && ! $lockStat ){
> +		close(cacheFile);
> +		cacheWaitForUpdate($action);
> +	}
> +
> +	if(
> +		(
> +			$action eq "snapshot"
> +			||
> +			$action eq "blob_plain"
> +		)
> +		&&
> +		$cache_enable > 0
> +	){
> +		my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal");
> +		$lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB);
> +		if ($cache_enable != 0 && ! $lockStatBIN ){
> +			system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log");

Errrrr... what? Using system("echo ...") instead of just writing to a
file?  Or better, using 'warn' and relying on CGI::Carp module to
deliver warning to web server logs?

> +			close(cacheFile);

Why not close 'cacheFile' earlier, or say, not opening it in first
place if it is not necessary... oh, sorry, I see that you are using
the same file for cache file for normal case and for lockfile.

> +			close(cacheFileBin);
> +			cacheWaitForUpdate($action);
> +		}
> +

> +		my $binfilesize = -s "$fullhashbinpathfinal";
> +		print "Content-Length: $binfilesize";

Good idea, bad solution.  "Content-Length: " HTTP header can be only _after_ 
'HTTP/1.1 200 OK' header, isn't it?

> +	}
> +	while( <cacheFile> ){
> +		print $_;
> +	}

Why not 'print <cacheFile>;'?  besides with $/ undefined you operate
in slurp / spew mode, and Perl now considers all file to be a single
line.

> +	if(
> +		$action eq "snapshot"
> +		||
> +		$action eq "blob_plain"
> +	){
> +		binmode STDOUT, ':raw';
> +		print <cacheFileBin>;
> +		binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
> +		close(cacheFileBin);
> +	}
> +	close(cacheFile);
> +	$|--;

Wouldn't it be better to use 'local $| = 1' instead of this 
'$|++ / $|--'?

> +}
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 757ef46..eb53075 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -31,7 +31,7 @@ BEGIN {
>  our $version = "++GIT_VERSION++";
>  
>  # Output buffer variable
> -$output = "";
> +our $output = "";

This belongs to and should be squashed with earlier patch

>  
>  our ($my_url, $my_uri, $base_url, $path_info, $home_link);
>  sub evaluate_uri {
> @@ -244,6 +244,57 @@ our $git_versions_must_match = 1;
>  # Leave it undefined (or set to 'undef') to turn off load checking.
>  our $maxload = 300;
>  
> +# This enables/disables the caching layer in gitweb.  This currently only supports the
> +# 'dumb' file based caching layer, primarily used on git.kernel.org.  this is reasonably
> +# effective but it has the downside of requiring a huge amount of disk space if there
> +# are a number of repositories involved.  It is not uncommon for git.kernel.org to have
> +# on the order of 80G - 120G accumulate over the course of a few months.  It is recommended
> +# that the cache directory be periodically completely deleted, and this is safe to perform.
> +# Suggested mechanism
> +# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush
> +# Value is binary. 0 = disabled (default), 1 = enabled.
> +#
> +# Values of caching:
> +# 	1 = 'dumb' file based caching used on git.kernel.org
> +our $cache_enable = 0;

As I wrote earlier, having boolean $caching_enabled and $cache
specifying caching engine to use (undef for default) allows to
separate act of enabling cache from the cact of selecting (and
configuring) cache.  Isn't it?

> +
> +# Used to set the minimum cache timeout for the dynamic caching algorithm.  Basically
> +# if we calculate the cache to be under this number of seconds we set the cache timeout
> +# to this minimum.
> +# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
> +our $minCacheTime = 20;
> +
> +# Used to set the maximum cache timeout for the dynamic caching algorithm.  Basically
> +# if we calculate the cache to exceed this number of seconds we set the cache timeout
> +# to this maximum.
> +# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
> +our $maxCacheTime = 1200;
> +
> +# If you need to change the location of the caching directory, override this
> +# otherwise this will probably do fine for you
> +our $cachedir = 'cache';

If it is relative path, it is relative to location of gitweb.cgi
script, isn't it?

> +
> +# If this is set (to 1) cache will do it's best to always display something instead
> +# of making someone wait for the cache to update.  This will launch the cacheUpdate
> +# into the background and it will lock a <file>.bg file and will only lock the
> +# actual cache file when it needs to write into it.  In theory this will make
> +# gitweb seem more responsive at the price of possibly stale data.
> +our $backgroundCache = 1;

Boolean.

> +
> +# Used to set the maximum cache file life.  If a cache files last modify time exceeds
> +# this value, it will assume that the data is just too old, and HAS to be regenerated
> +# instead of trying to display the existing cache data.
> +# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
> +# 18000 = 5 hours
> +our $maxCacheLife = 18000;
> +
> +# Used to enable or disable background forking of the gitweb caching.  Mainly here for debugging purposes
> +our $cacheDoFork = 1;
> +
> +our $fullhashpath = *STDOUT;
> +our $fullhashbinpath = *STDOUT;
> +our $fullhashbinpathfinal = *STDOUT;
> +

In my rewrite all cache parameters are in %cache_options hash, and
passed as parameters to cache constructor, instead of littering global
variables namespace...

>  # configuration for 'highlight' (http://www.andre-simon.de/)
>  # match by basename
>  our %highlight_basename = (
> @@ -500,6 +551,11 @@ our %feature = (
>  		'default' => [0]},
>  );
>  
> +#
> +# Includes
> +#
> +do 'cache.pm';

Unsafe!  If you have any error in 'cache.pm', you could get a very
strange error reports.

Also what if 'cache.pm' cannot be found because somebody forgot to
copy it along with gitweb.cgi?


See how we do it for $GITWEB_CONFIG:

        if (-e $GITWEB_CONFIG) {
                do $GITWEB_CONFIG;
                die $@ if $@;
        }

See also documentation for 'do' (perldoc -f do):

       If "do" cannot read the file, it returns undef and sets $! to the error.
       If "do" can read the file but cannot compile it, it returns undef and sets
       an error message in $@.   If the file is successfully compiled, "do"
       returns the value of the last expression evaluated.

       Note that inclusion of library modules is better done with the "use" and
       "require" operators, which also do automatic error checking and raise an
       exception if there's a problem.

       You might like to use "do" to read in a program configuration file.  
       Manual error checking can be done this way:

       # read in config files: system first, then user
       for $file ("/share/prog/defaults.rc",
                  "$ENV{HOME}/.someprogrc")
       {
             unless ($return = do $file) {
                 warn "couldn't parse $file: $@" if $@;
                 warn "couldn't do $file: $!"    unless defined $return;
                 warn "couldn't run $file"       unless $return;
             }
       }

In my rewrite I use simply "require", and I check it's output.

> +
>  sub gitweb_get_feature {
>  	my ($name) = @_;
>  	return unless exists $feature{$name};
> @@ -1089,7 +1145,8 @@ sub dispatch {
>  	    !$project) {
>  		die_error(400, "Project needed");
>  	}
> -	$actions{$action}->();
> +	#$actions{$action}->();
> +	cache_fetch($action);

Wouldn't a better solution be

  -	$actions{$action}->();
  +	if ($caching_enabled) {
  +		cache_fetch($action);
  +	} else {
  +		$actions{$action}->();
  +	}

>  }
>  
>  sub reset_timer {
> @@ -1159,6 +1216,7 @@ sub change_output {
>  
>  	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
>  	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
> +	print STDOUT_REAL "";

Why it is for?

>  
>  	# Close STDOUT, so that it isn't being used anymore.
>  	close STDOUT;
> @@ -1185,10 +1243,7 @@ sub run {
>  		$pre_dispatch_hook->()
>  			if $pre_dispatch_hook;
>  
> -		change_output();
>  		run_request();
> -		reset_output();
> -		print $output;
>  
>  		$post_dispatch_hook->()
>  			if $post_dispatch_hook;

Hmmm...

> @@ -3465,7 +3520,9 @@ sub git_header_html {
>  	# support xhtml+xml but choking when it gets what it asked for.
>  	if (defined $cgi->http('HTTP_ACCEPT') &&
>  	    $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
> -	    $cgi->Accept('application/xhtml+xml') != 0) {
> +	    $cgi->Accept('application/xhtml+xml') != 0
> +	    &&
> +	    $cache_enable == 0) {
>  		$content_type = 'application/xhtml+xml';
>  	} else {
>  		$content_type = 'text/html';

I also have something similar in my rewrite.

> @@ -3610,6 +3667,7 @@ sub git_footer_html {
>  	my $feed_class = 'rss_logo';
>  
>  	print "<div class=\"page_footer\">\n";
> +	print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n";
>  	if (defined $project) {
>  		my $descr = git_get_project_description($project);
>  		if (defined $descr) {

In my rewrite it replaces "time to generate page" info, though
probably should be always there.

> @@ -3698,6 +3756,11 @@ sub die_error {
>  		500 => '500 Internal Server Error',
>  		503 => '503 Service Unavailable',
>  	);
> +	# The output handlers for die_error need to be reset to STDOUT
> +	# so that half the message isn't being output to random and
> +	# half to STDOUT as expected.  This is mainly for the benefit
> +	# of using git_header_html() and git_footer_html() since
> +	#
>  	# Reset the output so that we are actually going to STDOUT as opposed
>  	# to buffering the output.
>  	reset_output();
> @@ -5610,9 +5673,15 @@ sub git_blob_plain {
>  			($sandbox ? 'attachment' : 'inline')
>  			. '; filename="' . $save_as . '"');
>  	local $/ = undef;
> -	binmode STDOUT, ':raw';
> -	print <$fd>;
> -	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
> +	if( $cache_enable != 0){
> +		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
> +	}else{
> +		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
> +	}
> +	binmode BINOUT, ':raw';
> +	print BINOUT <$fd>;
> +	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
> +	close BINOUT;

Why it is here?  Why caching engine connot take care of that?

>  	close $fd;
>  }
>  
> @@ -5897,9 +5966,15 @@ sub git_snapshot {
>  
>  	open my $fd, "-|", $cmd
>  		or die_error(500, "Execute git-archive failed");
> -	binmode STDOUT, ':raw';
> -	print <$fd>;
> -	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
> +	if( $cache_enable != 0){
> +		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
> +	}else{
> +		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
> +	}
> +	binmode BINOUT, ':raw';
> +	print BINOUT <$fd>;
> +	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
> +	close BINOUT;
>  	close $fd;
>  }

Same as above.  I don't think that having to modify gitweb code beside
enabling caching somewhere at the top shows a good design...

>  
> diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
> index 4132aab..972d32e 100644
> --- a/gitweb/static/gitweb.css
> +++ b/gitweb/static/gitweb.css
> @@ -67,6 +67,12 @@ div.page_path {
>  	border-width: 0px 0px 1px;
>  }
>  
> +div.cachetime {
> +	float: left;
> +	margin-right: 10px;
> +	color: #555555;
> +}

O.K.

> +
>  div.page_footer {
>  	height: 17px;
>  	padding: 4px 8px;

P.S. Where is my review ;-)???

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] gitweb: Add option to force version match
  2010-10-28  0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley
  2010-10-28  9:52   ` Jakub Narebski
@ 2010-10-28 22:08   ` Junio C Hamano
  1 sibling, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2010-10-28 22:08 UTC (permalink / raw)
  To: John 'Warthog9' Hawley; +Cc: git

I do not understand why this variable needs to be turned _on_ by default,
but more importantly, isn't this more or less independent from what we
discussed at GitTogether, which is to get your battle-tested caching layer
that will tremendously help any nontrivial site merged earlier rather than
later by reducing load average from 1000 to 3-or-4?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] Gitweb caching v7
  2010-10-28  0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley
                   ` (2 preceding siblings ...)
  2010-10-28  0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley
@ 2010-10-28 22:29 ` Junio C Hamano
  2010-10-29 22:25 ` Junio C Hamano
  4 siblings, 0 replies; 43+ messages in thread
From: Junio C Hamano @ 2010-10-28 22:29 UTC (permalink / raw)
  To: John 'Warthog9' Hawley; +Cc: git

"John 'Warthog9' Hawley" <warthog9@eaglescrag.net> writes:

> v7:
> 	- Rework output system, now central STDOUT redirect

Yeah, when I saw your print-to-variable-assignment that was one of the
things that came to my mind.  Sounds like a reasonable thing to do.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] Gitweb caching v7
  2010-10-28  0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley
                   ` (3 preceding siblings ...)
  2010-10-28 22:29 ` [PATCH 0/3] Gitweb caching v7 Junio C Hamano
@ 2010-10-29 22:25 ` Junio C Hamano
  2010-10-30  8:58   ` Jakub Narebski
  4 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-10-29 22:25 UTC (permalink / raw)
  To: John 'Warthog9' Hawley; +Cc: git

I am getting this in the gitweb.log:

    [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124.

which seems to cause t9500 to fail.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] Gitweb caching v7
  2010-10-29 22:25 ` Junio C Hamano
@ 2010-10-30  8:58   ` Jakub Narebski
  2010-10-31  4:24     ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-10-30  8:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John 'Warthog9' Hawley, git

Junio C Hamano <gitster@pobox.com> writes:

> I am getting this in the gitweb.log:
> 
>     [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine 
>     &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124.
> 
> which seems to cause t9500 to fail.

This is caused by three issues (bugs) in v7 caching code.

First is the reason why t9500 exhibits this bug.  The gitweb caching
v7 includes file with subroutines related to caching in the following
way:

  do 'cache.pm';

Note that the relative path is used; for t9500 it is relative from
somewhere witjin 't/', and not from 'gitweb/', so "do 'cache.pm';"
doesn't find it.

In my earlier rewrites I used

  do $cache_pm;

and 't/gitweb-lib.sh' set $cache_pm appriopriately.


Second is why this bug is bad.  There is no error checking that 
"do 'cache.pm';" succeeded.  It should be

  do $cache_pm;
  die $@ if $@;

at least.  Perhaps even better would be to simply turn off caching
support if there is an error in 'cache.pm' (which probably should be
called 'cache.pl', as it is not proper Perl module)... though on the
other hand side it would could hide the fact that caching is not
working.


Third is why this matters.  In v7 the cache_fetch() subroutine,
defined in 'cache.pm', is run *unconditionally*, and has test if the
caching is enabled *inside it*, instead of having gitweb (caller) use
it only if caching is disabled.

This coupled with the fact that 'make install-gitweb' would *not*
install 'cache.pm' alongside gitweb.cgi means that anybody upgrading
gitweb, whether he/she wants caching or not, would have gitweb stop
working after upgrade... unless he/she knows that he/she has to copy
'cache.pm' file manually.

On the other hand having test first if caching is enabled would make
t9500 tests pass (because they do not even test minimally cache
enabled / cache disabled cases, like in my rewrite), but would hide
problem when caching is enabled and 'cache.pm' is not installed...
(perhaps also in persistent environments; I don't know where pwd is
then).

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] Gitweb caching v7
  2010-10-30  8:58   ` Jakub Narebski
@ 2010-10-31  4:24     ` Junio C Hamano
  2010-10-31  9:21       ` Jakub Narebski
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-10-31  4:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: John 'Warthog9' Hawley, git

Jakub Narebski <jnareb@gmail.com> writes:

>> I am getting this in the gitweb.log:
>> 
>>     [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine 
>>     &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124.
>> 
>> which seems to cause t9500 to fail.
>
> This is caused by three issues (bugs) in v7 caching code.
>
> First is the reason why t9500 exhibits this bug.  The gitweb caching
> v7 includes file with subroutines related to caching in the following
> way:
>
>   do 'cache.pm';
>
> Note that the relative path is used; for t9500 it is relative from
> somewhere witjin 't/', and not from 'gitweb/', so "do 'cache.pm';"
> doesn't find it.

John, where should cache.pm go in the installed system?  Does it typically
go next to gitweb.perl?

I think "do 'that-file'" honors path specified by the -I option, so I do
not think "do $cache_pm" is necessary.  My preference is to run gitweb
tests with appropriate -I pointing at the cache.pm in the directory.

> ...  It should be
>
>   do $cache_pm;
>   die $@ if $@;
>
> at least.

Catching failure is a good thing to do.

> Perhaps even better would be to simply turn off caching
> support if there is an error in 'cache.pm'

That can come later.

Jakub, can we have an absolute minimum fix-up, so that we can give
this wider exposure?  I think there are only
four issues:

 (1) exclude Ajax-y stuff from caching;
 (2) install cache.pm the same way gitweb.perl is installed via
     the Makefile;
 (3) running tests with appropriate -I so that cache.pm is found; and
 (4) die if 'cache.pm' cannot be "done").

I think the change in gitweb-cache v7 is small and safe enough that we
should fast-track it to give usability to the real world sites.  It may be
a low-risk "obviously correct" approach that is quick-and-dirty, but that
is exactly why this should be fast-tracked.  It does not touch the logic
or formatting in any way, just bypasses the page generation altogether
when it can clearly do so when it can tell the output cannot possibly be
incorrect (albeit sometimes it might be stale if in certain cases, e.g. it
is relative to HEAD).

I know you and others were aiming to split things up, but I think the
amount of the effort that is needed for that line of work on top of the
current codebase is not much different from what is needed on top of
gitweb-cache v7.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] Gitweb caching v7
  2010-10-31  4:24     ` Junio C Hamano
@ 2010-10-31  9:21       ` Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski
                           ` (4 more replies)
  0 siblings, 5 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-10-31  9:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: John 'Warthog9' Hawley, git

On Sun, 31 Oct 2010, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> I am getting this in the gitweb.log:
>>> 
>>>     [Fri Oct 29 22:21:12 2010] gitweb.perl: Undefined subroutine 
>>>     &main::cache_fetch called at .../t/../gitweb/gitweb.perl line 1124.
>>> 
>>> which seems to cause t9500 to fail.
>>
>> This is caused by three issues (bugs) in v7 caching code.
>>
>> First is the reason why t9500 exhibits this bug.  The gitweb caching
>> v7 includes file with subroutines related to caching in the following
>> way:
>>
>>   do 'cache.pm';
>>
>> Note that the relative path is used; for t9500 it is relative from
>> somewhere witjin 't/', and not from 'gitweb/', so "do 'cache.pm';"
>> doesn't find it.
> 
> John, where should cache.pm go in the installed system?  Does it typically
> go next to gitweb.perl?
> 
> I think "do 'that-file'" honors path specified by the -I option, so I do
> not think "do $cache_pm" is necessary.  My preference is to run gitweb
> tests with appropriate -I pointing at the cache.pm in the directory.

>From `perldoc -f do`

      do 'stat.pl';

   is just like

      eval 'cat stat.pl';

   except that it's more efficient and concise, keeps track of the current
   filename for error messages, searches the @INC libraries, and updates %INC
   if the file is found.        ^^^^^^^^^^^^^^^^^^^^^^^^^^^

So I think it respects '-I<directory>' given to perl interpreter.

On the other hand I don't know how it would work with mod_perl (uwing
ModPerl::Registry handler), whether it wouldn't too require extra
configuration.

So I think a better solution would be to base 'Gitweb caching v7' plus
necessary fixes and improvements on top of 

  gitweb: Prepare for splitting gitweb

This means that the directory that gitweb is in would be added to @INC
via

  use lib __DIR__.'/lib';

The 'cache.pm' or 'cache.pl' file would be moved to 'gitweb/lib', but
that is cosmetic change.

> 
>> ...  It should be
>>
>>   do $cache_pm;
>>   die $@ if $@;
>>
>> at least.
> 
> Catching failure is a good thing to do.

Right.  And this is only one-line addition.
 
>> Perhaps even better would be to simply turn off caching
>> support if there is an error in 'cache.pm'
> 
> That can come later.
> 
> Jakub, can we have an absolute minimum fix-up, so that we can give
> this wider exposure?  I think there are only
> four issues:
> 
>  (1) exclude Ajax-y stuff from caching;

Easy, but only if check whether to do capturing and caching is moved
out of cache_fetch to caller, i.e. to gitweb script.  See also comments
below about what need and should be done.

Another solution is to turn off Ajax-y stuff when caching is enabled.
It can be done quite easily, just sprinkle some !$caching_enabled
(see comment below about naming and semantic of this config variable)
in appropriate place.

>  (2) install cache.pm the same way gitweb.perl is installed via
>      the Makefile;

With "gitweb: Prepare for splitting gitweb" as introductory patch this
would be just adding

  # gitweb output caching
  GITWEB_MODULES += cache.pm

(or cache.pl - it is not a proper Perl package!) e.g. above GITWEB_REPLACE
sed script in gitweb/Makefile.

>  (3) running tests with appropriate -I so that cache.pm is found; and

No change to test necessary if we use "use lib __DIR__.'/lib';" in
gitweb.perl

I'd like to port from my rewrite change to t/t9500-gitweb-standalone-no-errors.sh
adding minimal test to check if running gitweb with caching enabled
doesn't generate any warnings, and (modified) change to the
t/t9502-gitweb-standalone-parse-output.sh, adding minimal test that
gitweb produces the same output with and without caching.

>  (4) die if 'cache.pm' cannot be "done".

O.K. (4) is one liner.

There is are also other issues

   (5) naming and semantic of gitweb config variables configuring caching;
       at least change $cache_enabled enum to $caching_enabled boolean
   (6) do not change anything in gitweb behavior if caching is disabled;
       move 'if ($caching_enabled)' test to gitweb.perl, and remove code
       from cache_fetch

About (5): names and semantics of configuration variables are gitweb API,
so we should at least try to keep backward compatibility with old names
(see for example 'backward compatibility: legacy gitweb config support'
section in %known_snapshot_format_aliases).  

There is no much problem with $minCacheTime, $maxCacheTime, $backgroundCache
etc. (besides bleh, camelCase ;-) but I would prefer to have *boolean*
$caching_enabled (true-ish value means cache is enabled, false-y value
including undef means caching is disabled) to currently two-value *"enum"*
$cache_enable called "binary" (sic!) in docs and comments.  See also
issue (6).


About (6): I would prefer to move check if caching should be enabled
higher, to gitweb.perl (to its own configure_caching() subroutine, rather
than sprinkling it in top scope), and not even include 'cache.pm'
if caching is disabled (default).  But because we use 'do' and not
'require', we would have to check %INC to not include it twice...

Anyway, whether we include 'cache.pm' conditionally or not, I'd prefer
to use

  if ($caching_enabled) {
  	cache_fetch($action);
  } else {
  	$actions{$action}->();
  }

in dispatch() subroutine, and simply remove all code dealing with
'$cache_enable == 0' case from 'cache.pm'.

Remember, config variables are forever :-)

> 
> I think the change in gitweb-cache v7 is small and safe enough that we
> should fast-track it to give usability to the real world sites.  It may be
> a low-risk "obviously correct" approach that is quick-and-dirty, but that
> is exactly why this should be fast-tracked.  It does not touch the logic
> or formatting in any way, just bypasses the page generation altogether
> when it can clearly do so when it can tell the output cannot possibly be
> incorrect (albeit sometimes it might be stale if in certain cases, e.g. it
> is relative to HEAD).

My rewrite also does the same.  I have changed it to do the same STDOUT
redirection that J.H. gitweb caching v7 does (I used select($fh) based
capture) - work in progress, not yet pushed to my repository:

  git://repo.or.cz/git/jnareb-git.git
  http://repo.or.cz/w/git/jnareb-git.git (gitweb)

> 
> I know you and others were aiming to split things up, but I think the
> amount of the effort that is needed for that line of work on top of the
> current codebase is not much different from what is needed on top of
> gitweb-cache v7.

Well, I'd have to add support for configuration variables used in this
patch series, but with exception of $cache_enable -> $caching_enabled
it wouldn't be much extra work.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCHv7.1 0/4] Gitweb caching v7.1
  2010-10-31  9:21       ` Jakub Narebski
@ 2010-11-01 10:24         ` Jakub Narebski
  2010-11-12 23:35           ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin, Jakub Narebski

This series is a bit fixed up and a tiny bit cleaned up version of
"Gitweb caching v7" series from John 'Warthog9' Hawley.

This series is based on top of 8ff76f4 (gitweb: Move call to
evaluate_git_version after evaluate_gitweb_config, 2010-09-26)
currently in 'pu'.  This commit fixes a bug in gitweb which shows in
running testsuite.

The patch "gitweb: Add option to force version match" was removed from
this series as it doesn't belong to it, c.f. 
  http://article.gmane.org/gmane.comp.version-control.git/160236

 [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb
 [PATCHv7.1 2/4] gitweb: add output buffering and associated functions
 [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org)
 [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching

Jakub Narebski (2):
  gitweb: Prepare for splitting gitweb
  gitweb: Minimal testing of gitweb caching

John 'Warthog9' Hawley (2):
  gitweb: add output buffering and associated functions
  gitweb: File based caching layer (from git.kernel.org)

 gitweb/Makefile                           |   20 ++-
 gitweb/gitweb.perl                        |  134 +++++++++++-
 gitweb/lib/cache.pl                       |  348 +++++++++++++++++++++++++++++
 gitweb/static/gitweb.css                  |    6 +
 t/t9500-gitweb-standalone-no-errors.sh    |   19 ++
 t/t9502-gitweb-standalone-parse-output.sh |   39 ++++
 6 files changed, 555 insertions(+), 11 deletions(-)
 create mode 100644 gitweb/lib/cache.pl

-- 
1.7.3

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb
  2010-10-31  9:21       ` Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski
@ 2010-11-01 10:24         ` Jakub Narebski
  2010-11-01 18:50           ` [PATCHv7.1b " Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
                           ` (2 subsequent siblings)
  4 siblings, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin, Jakub Narebski

Prepare gitweb for having been split into modules that are to be
installed alongside gitweb in 'lib/' subdirectory, by adding

  use lib __DIR__.'/lib';

to gitweb.perl (to main gitweb script), and preparing for putting
modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile.

This preparatory work allows to add new module to gitweb by simply
adding

  GITWEB_MODULES += <module>

to gitweb/Makefile (assuming that the module is in 'gitweb/lib/'
directory).

Using __DIR__ from Dir::Self module (not in core, that's why currently
gitweb includes excerpt of code from Dir::Self defining __DIR__) was
chosen over using FindBin-based solution (in core since perl 5.00307,
while gitweb itself requires at least perl 5.8.0) because FindBin uses
BEGIN block, which is a problem under mod_perl and other persistent
Perl environments (thought there are workarounds).

At Pavan Kumar Sankara suggestion gitweb/Makefile uses

  install [OPTION]... SOURCE... DIRECTORY

format (2nd format) with single SOURCE rather than

  install [OPTION]... SOURCE DEST

format (1st format) because of security reasons (race conditions).
Modern GNU install has `-T' / `--no-target-directory' option, but we
cannot rely that the $(INSTALL) we are using supports this option.

The install-modules target in gitweb/Makefile uses shell 'for' loop,
instead of make's $(foreach) function, to avoid possible problem with
generating a command line that exceeded the maximum argument list
length.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This patch would conflict with the patch 958a846 (gitweb/Makefile: Add
'test' and 'test-installed' targets, 2010-09-26), which is in 'pu'.
Resolving those conflicts is easy, though non-trivial.

Please tell me if I should send this series rebased on top of 
'jn/gitweb-test' (which includes 958a846) instead.

This patch introduces infrastructure which would be required later for
splitting gitweb.

 gitweb/Makefile    |   17 ++++++++++++++---
 gitweb/gitweb.perl |    8 ++++++++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/gitweb/Makefile b/gitweb/Makefile
index 2fb7c2d..6fa7625 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -55,6 +55,7 @@ PERL_PATH  ?= /usr/bin/perl
 bindir_SQ = $(subst ','\'',$(bindir))#'
 gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#'
 gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#'
+gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#'
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#'
 PERL_PATH_SQ  = $(subst ','\'',$(PERL_PATH))#'
 DESTDIR_SQ    = $(subst ','\'',$(DESTDIR))#'
@@ -145,16 +146,26 @@ gitweb.cgi: gitweb.perl GITWEB-BUILD-OPTIONS
 
 ### Installation rules
 
-install: all
+install: all install-modules
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 	$(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 
+install-modules:
+	install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \
+	for dir in $$install_dirs; do \
+		test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir' || \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir'; \
+	done
+	gitweb_modules="$(GITWEB_MODULES)" && \
+	for mod in $$gitweb_modules; do \
+		$(INSTALL) -m 644 lib/$$mod '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$(dirname $$mod)'; \
+	done
+
 ### Cleaning rules
 
 clean:
 	$(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css GITWEB-BUILD-OPTIONS
 
-.PHONY: all clean install .FORCE-GIT-VERSION-FILE FORCE
-
+.PHONY: all clean install install-modules .FORCE-GIT-VERSION-FILE FORCE
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index d521b4c..e4c08ba 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -9,6 +9,14 @@
 
 use strict;
 use warnings;
+
+use File::Spec;
+# __DIR__ is taken from Dir::Self __DIR__ fragment
+sub __DIR__ () {
+	File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]);
+}
+use lib __DIR__ . '/lib';
+
 use CGI qw(:standard :escapeHTML -nosticky);
 use CGI::Util qw(unescape);
 use CGI::Carp qw(fatalsToBrowser set_message);
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCHv7.1 2/4] gitweb: add output buffering and associated functions
  2010-10-31  9:21       ` Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
@ 2010-11-01 10:24         ` Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
  4 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin, Jakub Narebski

From: John 'Warthog9' Hawley <warthog9@eaglescrag.net>

This adds output buffering for gitweb, mainly in preparation for
caching support.  This is a dramatic change to how caching was being
done, mainly in passing around the variable manually and such.

This centrally flips the entire STDOUT to a variable, which after the
completion of the run, flips it back and does a print on the resulting
data.

This should save on the previous 10K line patch (or so) that adds more
explicit output passing.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Commit message unchanged.

I have modified reset_output to silence the following warning from t9500:

  gitweb.perl: Name "main::STDOUT_REAL" used only once: possible typo
  at ../gitweb/gitweb.perl line 1130.

Here is the interdiff for gitweb.perl from 2e5c355 (J.H. patch in 'pu'):

  diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
  index b2b0a23..91e274f 100755
  --- a/gitweb/gitweb.perl
  +++ b/gitweb/gitweb.perl
  @@ -32,7 +39,7 @@ BEGIN {
   our $version = "++GIT_VERSION++";
   
   # Output buffer variable
  -$output = "";
  +our $output = "";
   
   our ($my_url, $my_uri, $base_url, $path_info, $home_link);
   sub evaluate_uri {
  @@ -1143,13 +1138,12 @@ sub change_output {
   }
   
   sub reset_output {
  -	# This basically takes STDOUT_REAL and puts it back as STDOUTl
  -	open(STDOUT,">&STDOUT_REAL");
  +	# This basically takes STDOUT_REAL and puts it back as STDOUT
  +	open STDOUT, ">&", \*STDOUT_REAL;
   }
   

 gitweb/gitweb.perl |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index e4c08ba..91e274f 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -38,6 +38,9 @@ BEGIN {
 
 our $version = "++GIT_VERSION++";
 
+# Output buffer variable
+our $output = "";
+
 our ($my_url, $my_uri, $base_url, $path_info, $home_link);
 sub evaluate_uri {
 	our $cgi;
@@ -1120,6 +1123,25 @@ sub evaluate_argv {
 	);
 }
 
+sub change_output {
+	our $output;
+
+	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
+	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+
+	# Close STDOUT, so that it isn't being used anymore.
+	close STDOUT;
+
+	# Trap STDOUT to the $output variable, which is what I was using in the original
+	# patch anyway.
+	open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var
+}
+
+sub reset_output {
+	# This basically takes STDOUT_REAL and puts it back as STDOUT
+	open STDOUT, ">&", \*STDOUT_REAL;
+}
+
 sub run {
 	evaluate_argv();
 
@@ -1131,7 +1153,10 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
+		change_output();
 		run_request();
+		reset_output();
+		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3640,6 +3665,10 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# Reset the output so that we are actually going to STDOUT as opposed
+	# to buffering the output.
+	reset_output();
+
 	git_header_html($http_responses{$status}, undef, %opts);
 	print <<EOF;
 <div class="page_body">
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org)
  2010-10-31  9:21       ` Jakub Narebski
                           ` (2 preceding siblings ...)
  2010-11-01 10:24         ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
@ 2010-11-01 10:24         ` Jakub Narebski
  2010-11-01 10:24         ` [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
  4 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin, Jakub Narebski

From: John 'Warthog9' Hawley <warthog9@eaglescrag.net>

This is a relatively large patch that implements the file based
caching layer that is quite similar to the one used  on such large
sites as kernel.org and soon git.fedoraproject.org.  This provides
a simple, and straight forward caching mechanism that scales
dramatically better than Gitweb by itself.

The caching layer basically buffers the output that Gitweb would
normally return, and saves that output to a cache file on the local
disk.  When the file is requested it attempts to gain a shared lock
on the cache file and cat it out to the client.  Should an exclusive
lock be on a file (it's being updated) the code has a choice to either
update in the background and go ahead and show the stale page while
update is being performed, or stall the client(s) until the page
is generated.

There are two forms of stalling involved here, background building
and non-background building, both of which are discussed in the
configuration page.

There are still a few known "issues" with respect to this:
- Code needs to be added to be "browser" aware so
  that clients like wget that are trying to get a
  binary blob don't obtain a "Generating..." page

Caching is disabled by default.  You can turn it on by setting
$caching_enabled variable to true to enable file based caching.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This is *minimal* fixup of J.H. patch, addressing only major concerns
of Junio C Hamano (JH) and me (JN).

JH> Jakub, can we have an absolute minimum fix-up, so that we can give
JH> this wider exposure?  I think there are only
JH> four issues:
JH>
JH>  (1) exclude Ajax-y stuff from caching

Done, via "if ($caching_enabled && is_cacheable($action) { ... }".

Alternate solution would be to degrade Ajax-y actions to their
non-Ajax-y versions (i.e. 'blame_incremental' to just 'blame') when
caching is enabled.

JH>  (2) install cache.pm the same way gitweb.perl is installed via
JH>      the Makefile

It is now 'gitweb/lib/cache.pl' (*.pl and not *.pm because it is not a
proper Perl module: it lacks 'package Foo' for one), and is installed
via 'install-modules' target in gitweb/Makefile, which is prereq to
'install' therein.  So 'make install-gitweb' would install cache.pl
alongside gitweb.

JH>  (3) running tests with appropriate -I so that cache.pm is found;

Thanks to "use lib __DIR__.'/lib';" it is not necessary.  The 
'use lib' solution has the advantage that it would work for mod_perl
too.

JH>  (4) die if 'cache.pm' cannot be "done".

  #
  # Includes
  #
  if (!exists $INC{'cache.pl'}) {
  	my $return = do 'cache.pl';
  	die $@ if $@;
  	die "Couldn't read 'cache.pl': $!" if (!defined $return);
  }

Gitweb now includes 'cache.pl' only once (which is important in
persistent environments that wrap gitweb, like mod_perl's
ModPerl::Registry, or Plack's Plack::App::WrapCGI which is used for
--httpd=plackup in git-instaweb), checks if there were any errors
parsing it, and checks if there were any errors finding it
(e.g. locating it).

JN>  (5) naming and semantic of gitweb config variables configuring caching;
JN>      at least change $cache_enabled enum to $caching_enabled boolean

Done.  $cache_enable is not more.

JN>  (6) do not change anything in gitweb behavior if caching is disabled;
JN>      move 'if ($caching_enabled)' test to gitweb.perl, and remove code
JN>      from cache_fetch

This simplified 'cache.pl' by removing all conditionals depending on
state of $cache_enable from it (see interdiff below).  This required
adding check if caching is enabled before running reset_output() in
die_error() subroutine.


For tests to not hang I had to change permissions of cache directory
from 0665 ('rw-rw-r-x') to 0755 ('rwxr-xr-x').  I think that was a
bug.

Interdiff (with '--ignore-all-space'):

  diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
  index 8f748a3..abaeec6 100755
  --- a/gitweb/gitweb.perl
  +++ b/gitweb/gitweb.perl
  @@ -250,11 +251,7 @@ our $maxload = 300;
   # that the cache directory be periodically completely deleted, and this is safe to perform.
   # Suggested mechanism
   # mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush
  -# Value is binary. 0 = disabled (default), 1 = enabled.
  -#
  -# Values of caching:
  -# 	1 = 'dumb' file based caching used on git.kernel.org
  -our $cache_enable = 0;
  +our $caching_enabled = 0;
   
   # Used to set the minimum cache timeout for the dynamic caching algorithm.  Basically
   # if we calculate the cache to be under this number of seconds we set the cache timeout
  @@ -552,7 +549,11 @@ our %feature = (
   #
   # Includes
   #
  -do 'cache.pm';
  +if (!exists $INC{'cache.pl'}) {
  +	my $return = do 'cache.pl';
  +	die $@ if $@;
  +	die "Couldn't read 'cache.pl': $!" if (!defined $return);
  +}
   
   sub gitweb_get_feature {
   	my ($name) = @_;
  @@ -782,6 +783,10 @@ our %actions = (
   	"project_list" => \&git_project_list,
   	"project_index" => \&git_project_index,
   );
  +sub is_cacheable {
  +	my $action = shift;
  +	return !($action eq 'blame_data' || $action eq 'blame_incremental');
  +}
   
   # finally, we have the hash of allowed extra_options for the commands that
   # allow them
  @@ -1120,8 +1118,11 @@ sub dispatch {
   	    !$project) {
   		die_error(400, "Project needed");
   	}
  -	#$actions{$action}->();
  +	if ($caching_enabled && is_cacheable($action)) {
   	cache_fetch($action);
  +	} else {
  +		$actions{$action}->();
  +	}
   }
   
   sub reset_timer {
  @@ -3494,9 +3494,8 @@ sub git_header_html {
   	# support xhtml+xml but choking when it gets what it asked for.
   	if (defined $cgi->http('HTTP_ACCEPT') &&
   	    $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
  -	    $cgi->Accept('application/xhtml+xml') != 0
  -	    &&
  -	    $cache_enable == 0) {
  +	    $cgi->Accept('application/xhtml+xml') != 0 &&
  +	    !$caching_enabled) {
   		$content_type = 'application/xhtml+xml';
   	} else {
   		$content_type = 'text/html';
  @@ -3737,7 +3736,7 @@ sub die_error {
   	#
   	# Reset the output so that we are actually going to STDOUT as opposed
   	# to buffering the output.
  -	reset_output();
  +	reset_output() if ($caching_enabled);
   
   	git_header_html($http_responses{$status}, undef, %opts);
   	print <<EOF;
  @@ -5647,7 +5646,7 @@ sub git_blob_plain {
   			($sandbox ? 'attachment' : 'inline')
   			. '; filename="' . $save_as . '"');
   	local $/ = undef;
  -	if( $cache_enable != 0){
  +	if ($caching_enabled) {
   		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
   	}else{
   		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
  @@ -5940,7 +5939,7 @@ sub git_snapshot {
   
   	open my $fd, "-|", $cmd
   		or die_error(500, "Execute git-archive failed");
  -	if( $cache_enable != 0){
  +	if ($caching_enabled) {
   		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
   	}else{
   		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
  diff --git a/gitweb/cache.pm b/gitweb/lib/cache.pl
  similarity index 75%
  rename from gitweb/cache.pm
  rename to gitweb/lib/cache.pl
  index c86265c..dd14bfb 100644
  --- a/gitweb/cache.pm
  +++ b/gitweb/lib/cache.pl
  @@ -18,21 +18,9 @@ sub cache_fetch {
   	my ($action) = @_;
   	my $cacheTime = 0;
   
  -	# Deal with cache being disabled
  -	if( $cache_enable == 0 ){
  -		undef $backgroundCache;
  -		#$fullhashpath = \$nocachedata;
  -		#$fullhashbinpath = \$nocachedatabin;
  -		$fullhashpath = *STDOUT;
  -		$fullhashbinpath = *STDOUT;
  -		$actions{$action}->();
  -		return;
  -	}elsif( $cache_enable == 1 ){
  -		#obviously we are using file based caching
  -
   		if(! -d $cachedir){
   			print "*** Warning ***: Caching enabled but cache directory does not exsist.  ($cachedir)\n";
  -			mkdir ("cache", 0665) || die "Cannot create cache dir - you will need to manually create";
  +		mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create";
   			print "Cache directory created successfully\n";
   		}
   
  @@ -47,10 +35,9 @@ sub cache_fetch {
   		$fullhashpath = "$fullhashdir/". substr( $urlhash, 2 );
   		$fullhashbinpath = "$fullhashpath.bin.wt";
   		$fullhashbinpathfinal = "$fullhashpath.bin";
  -	} # done dealing with cache enabled / disabled
   
   	if(! -e "$fullhashpath" ){
  -		if( ! $cache_enable || ! $cacheDoFork || ! defined(my $childPid = fork()) ){
  +		if(! $cacheDoFork || ! defined(my $childPid = fork()) ){
   			cacheUpdate($action,0);
   			cacheDisplay($action);
   		} elsif ( $childPid == 0 ){
  @@ -112,8 +99,7 @@ sub cacheUpdate {
   
   		$lockStatus = $lockStatBG;
   	}else{
  -		open(cacheFile, '>:utf8', "$fullhashpath") if ($cache_enable > 0);
  -		open(cacheFile, '>:utf8', \$fullhashpath) if ($cache_enable == 0);
  +		open(cacheFile, '>:utf8', \$fullhashpath);
   		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
   
   		$lockStatus = $lockStat;
  @@ -121,7 +107,7 @@ sub cacheUpdate {
   	#print "lock status: $lockStat\n";
   
   
  -	if ($cache_enable != 0 && ! $lockStatus ){
  +	if (! $lockStatus ){
   		if ( $areForked ){
   			exit(0);
   		}else{
  @@ -220,8 +206,7 @@ sub cacheWaitForUpdate {
   
   	if( $backgroundCache ){
   		if( -e "$fullhashpath" ){
  -			open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
  -			open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
  +			open(cacheFile, '<:utf8', "$fullhashpath");
   			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
   			stat(cacheFile);
   			close(cacheFile);
  @@ -242,8 +227,7 @@ sub cacheWaitForUpdate {
   	){
   		do {
   			sleep 2 if $x > 0;
  -			open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
  -			open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
  +			open(cacheFile, '<:utf8', "$fullhashpath");
   			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
   			close(cacheFile);
   			$x++;
  @@ -298,8 +282,7 @@ EOF
   	do {
   		print ".";
   		sleep 2 if $x > 0;
  -		open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
  -		open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
  +		open(cacheFile, '<:utf8', "$fullhashpath");
   		$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
   		close(cacheFile);
   		$x++;
  @@ -317,11 +300,10 @@ sub cacheDisplay {
   	$|++;
   
   	my ($action) = @_;
  -	open(cacheFile, '<:utf8', "$fullhashpath") if ($cache_enable > 0);
  -	open(cacheFile, '<:utf8', \$fullhashpath) if ($cache_enable == 0);
  +	open(cacheFile, '<:utf8', "$fullhashpath");
   	$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
   
  -	if ($cache_enable != 0 && ! $lockStat ){
  +	if (! $lockStat ){
   		close(cacheFile);
   		cacheWaitForUpdate($action);
   	}
  @@ -332,12 +314,10 @@ sub cacheDisplay {
   			||
   			$action eq "blob_plain"
   		)
  -		&&
  -		$cache_enable > 0
   	){
   		my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal");
   		$lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB);
  -		if ($cache_enable != 0 && ! $lockStatBIN ){
  +		if (! $lockStatBIN ){
   			system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log");
   			close(cacheFile);
   			close(cacheFileBin);
  @@ -363,3 +343,6 @@ sub cacheDisplay {
   	close(cacheFile);
   	$|--;
   }
  +
  +1;

 gitweb/Makefile          |    3 +
 gitweb/gitweb.perl       |  105 ++++++++++++--
 gitweb/lib/cache.pl      |  348 ++++++++++++++++++++++++++++++++++++++++++++++
 gitweb/static/gitweb.css |    6 +
 4 files changed, 450 insertions(+), 12 deletions(-)
 create mode 100644 gitweb/lib/cache.pl

diff --git a/gitweb/Makefile b/gitweb/Makefile
index 6fa7625..315753e 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -111,6 +111,9 @@ endif
 
 GITWEB_FILES += static/git-logo.png static/git-favicon.png
 
+# Gitweb caching
+GITWEB_MODULES += cache.pl
+
 GITWEB_REPLACE = \
 	-e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \
 	-e 's|++GIT_BINDIR++|$(bindir)|g' \
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 91e274f..abaeec6 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -243,6 +243,53 @@ our %avatar_size = (
 # Leave it undefined (or set to 'undef') to turn off load checking.
 our $maxload = 300;
 
+# This enables/disables the caching layer in gitweb.  This currently only supports the
+# 'dumb' file based caching layer, primarily used on git.kernel.org.  this is reasonably
+# effective but it has the downside of requiring a huge amount of disk space if there
+# are a number of repositories involved.  It is not uncommon for git.kernel.org to have
+# on the order of 80G - 120G accumulate over the course of a few months.  It is recommended
+# that the cache directory be periodically completely deleted, and this is safe to perform.
+# Suggested mechanism
+# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush
+our $caching_enabled = 0;
+
+# Used to set the minimum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to be under this number of seconds we set the cache timeout
+# to this minimum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $minCacheTime = 20;
+
+# Used to set the maximum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to exceed this number of seconds we set the cache timeout
+# to this maximum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $maxCacheTime = 1200;
+
+# If you need to change the location of the caching directory, override this
+# otherwise this will probably do fine for you
+our $cachedir = 'cache';
+
+# If this is set (to 1) cache will do it's best to always display something instead
+# of making someone wait for the cache to update.  This will launch the cacheUpdate
+# into the background and it will lock a <file>.bg file and will only lock the
+# actual cache file when it needs to write into it.  In theory this will make
+# gitweb seem more responsive at the price of possibly stale data.
+our $backgroundCache = 1;
+
+# Used to set the maximum cache file life.  If a cache files last modify time exceeds
+# this value, it will assume that the data is just too old, and HAS to be regenerated
+# instead of trying to display the existing cache data.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+# 18000 = 5 hours
+our $maxCacheLife = 18000;
+
+# Used to enable or disable background forking of the gitweb caching.  Mainly here for debugging purposes
+our $cacheDoFork = 1;
+
+our $fullhashpath = *STDOUT;
+our $fullhashbinpath = *STDOUT;
+our $fullhashbinpathfinal = *STDOUT;
+
 # configuration for 'highlight' (http://www.andre-simon.de/)
 # match by basename
 our %highlight_basename = (
@@ -499,6 +546,15 @@ our %feature = (
 		'default' => [0]},
 );
 
+#
+# Includes
+#
+if (!exists $INC{'cache.pl'}) {
+	my $return = do 'cache.pl';
+	die $@ if $@;
+	die "Couldn't read 'cache.pl': $!" if (!defined $return);
+}
+
 sub gitweb_get_feature {
 	my ($name) = @_;
 	return unless exists $feature{$name};
@@ -727,6 +783,10 @@ our %actions = (
 	"project_list" => \&git_project_list,
 	"project_index" => \&git_project_index,
 );
+sub is_cacheable {
+	my $action = shift;
+	return !($action eq 'blame_data' || $action eq 'blame_incremental');
+}
 
 # finally, we have the hash of allowed extra_options for the commands that
 # allow them
@@ -1058,7 +1118,11 @@ sub dispatch {
 	    !$project) {
 		die_error(400, "Project needed");
 	}
-	$actions{$action}->();
+	if ($caching_enabled && is_cacheable($action)) {
+		cache_fetch($action);
+	} else {
+		$actions{$action}->();
+	}
 }
 
 sub reset_timer {
@@ -1128,6 +1192,7 @@ sub change_output {
 
 	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
 	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+	print STDOUT_REAL "";
 
 	# Close STDOUT, so that it isn't being used anymore.
 	close STDOUT;
@@ -1153,10 +1218,7 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
-		change_output();
 		run_request();
-		reset_output();
-		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3432,7 +3494,8 @@ sub git_header_html {
 	# support xhtml+xml but choking when it gets what it asked for.
 	if (defined $cgi->http('HTTP_ACCEPT') &&
 	    $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
-	    $cgi->Accept('application/xhtml+xml') != 0) {
+	    $cgi->Accept('application/xhtml+xml') != 0 &&
+	    !$caching_enabled) {
 		$content_type = 'application/xhtml+xml';
 	} else {
 		$content_type = 'text/html';
@@ -3577,6 +3640,7 @@ sub git_footer_html {
 	my $feed_class = 'rss_logo';
 
 	print "<div class=\"page_footer\">\n";
+	print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n";
 	if (defined $project) {
 		my $descr = git_get_project_description($project);
 		if (defined $descr) {
@@ -3665,9 +3729,14 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# The output handlers for die_error need to be reset to STDOUT
+	# so that half the message isn't being output to random and
+	# half to STDOUT as expected.  This is mainly for the benefit
+	# of using git_header_html() and git_footer_html() since
+	#
 	# Reset the output so that we are actually going to STDOUT as opposed
 	# to buffering the output.
-	reset_output();
+	reset_output() if ($caching_enabled);
 
 	git_header_html($http_responses{$status}, undef, %opts);
 	print <<EOF;
@@ -5577,9 +5646,15 @@ sub git_blob_plain {
 			($sandbox ? 'attachment' : 'inline')
 			. '; filename="' . $save_as . '"');
 	local $/ = undef;
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
@@ -5864,9 +5939,15 @@ sub git_snapshot {
 
 	open my $fd, "-|", $cmd
 		or die_error(500, "Execute git-archive failed");
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl
new file mode 100644
index 0000000..dd14bfb
--- /dev/null
+++ b/gitweb/lib/cache.pl
@@ -0,0 +1,347 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net>
+#
+# This program is licensed under the GPLv2
+
+#
+# Gitweb caching engine
+#
+
+#use File::Path qw(make_path remove_tree);
+use File::Path qw(mkpath rmtree); # Used for compatability reasons
+use Digest::MD5 qw(md5 md5_hex md5_base64);
+use Fcntl ':flock';
+use File::Copy;
+
+sub cache_fetch {
+	my ($action) = @_;
+	my $cacheTime = 0;
+
+	if(! -d $cachedir){
+		print "*** Warning ***: Caching enabled but cache directory does not exsist.  ($cachedir)\n";
+		mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create";
+		print "Cache directory created successfully\n";
+	}
+
+	our $full_url = "$my_url?". $ENV{'QUERY_STRING'};
+	our $urlhash = md5_hex($full_url);
+	our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/";
+
+	eval { mkpath( $fullhashdir, 0, 0777 ) };
+	if ($@) {
+		die_error(500, "Internal Server Error", "Could not create cache directory: $@");
+	}
+	$fullhashpath = "$fullhashdir/". substr( $urlhash, 2 );
+	$fullhashbinpath = "$fullhashpath.bin.wt";
+	$fullhashbinpathfinal = "$fullhashpath.bin";
+
+	if(! -e "$fullhashpath" ){
+		if(! $cacheDoFork || ! defined(my $childPid = fork()) ){
+			cacheUpdate($action,0);
+			cacheDisplay($action);
+		} elsif ( $childPid == 0 ){
+			#run the updater
+			cacheUpdate($action,1);
+		}else{
+			cacheWaitForUpdate($action);
+		}
+	}else{
+		#if cache is out dated, update
+		#else displayCache();
+		open(cacheFile, '<', "$fullhashpath");
+		stat(cacheFile);
+		close(cacheFile);
+		my $stat_time = (stat(_))[9];
+		my $stat_size = (stat(_))[7];
+
+		$cacheTime = get_loadavg() * 60;
+		if( $cacheTime > $maxCacheTime ){
+			$cacheTime = $maxCacheTime;
+		}
+		if( $cacheTime < $minCacheTime ){
+			$cacheTime = $minCacheTime;
+		}
+		if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){
+			if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){
+				cacheUpdate($action,0);
+				cacheDisplay($action);
+			} elsif ( $childPid == 0 ){
+				#run the updater
+				#print "Running updater\n";
+				cacheUpdate($action,1);
+			}else{
+				#print "Waiting for update\n";
+				cacheWaitForUpdate($action);
+			}
+		} else {
+			cacheDisplay($action);
+		}
+
+
+	}
+
+	#
+	# If all of the caching failes - lets go ahead and press on without it and fall back to 'default'
+	# non-caching behavior.  This is the softest of the failure conditions.
+	#
+	#$actions{$action}->();
+}
+
+sub cacheUpdate {
+	my ($action,$areForked) = @_;
+	my $lockingStatus;
+	my $fileData = "";
+
+	if($backgroundCache){
+		open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
+		my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStatBG;
+	}else{
+		open(cacheFile, '>:utf8', \$fullhashpath);
+		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStat;
+	}
+	#print "lock status: $lockStat\n";
+
+
+	if (! $lockStatus ){
+		if ( $areForked ){
+			exit(0);
+		}else{
+			return;
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath");
+		my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB);
+	}
+
+	# Trap all output from the action
+	change_output();
+
+	$actions{$action}->();
+
+	# Reset the outputs as we should be fine now
+	reset_output();
+
+
+	if($backgroundCache){
+		open(cacheFile, '>:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_EX);
+
+		if (! $lockStat ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX);
+
+		if (! $lockStatBIN ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	# Actually dump the output to the proper file handler
+	local $/ = undef;
+	$|++;
+	print cacheFile "$output";
+	$|--;
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!";
+
+		flock(cacheFileBinFINAL,LOCK_UN);
+		close(cacheFileBinFINAL);
+
+		flock(cacheFileBinWT,LOCK_UN);
+		close(cacheFileBinWT);
+	}
+
+	flock(cacheFile,LOCK_UN);
+	close(cacheFile);
+
+	if($backgroundCache){
+		flock(cacheFileBG,LOCK_UN);
+		close(cacheFileBG);
+	}
+
+	if ( $areForked ){
+		exit(0);
+	} else {
+		return;
+	}
+}
+
+
+sub cacheWaitForUpdate {
+	my ($action) = @_;
+	my $x = 0;
+	my $max = 10;
+	my $lockStat = 0;
+
+	if( $backgroundCache ){
+		if( -e "$fullhashpath" ){
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			stat(cacheFile);
+			close(cacheFile);
+
+			if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){
+				cacheDisplay($action);
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "atom"
+		||
+		$action eq "rss"
+		||
+		$action eq "opml"
+	){
+		do {
+			sleep 2 if $x > 0;
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			close(cacheFile);
+			$x++;
+			$combinedLockStat = $lockStat;
+		} while ((! $combinedLockStat) && ($x < $max));
+
+		if( $x != $max ){
+			cacheDisplay($action);
+		}
+		return;
+	}
+
+	$| = 1;
+
+	print $::cgi->header(
+				-type=>'text/html',
+				-charset => 'utf-8',
+				-status=> 200,
+				-expires => 'now',
+				# HTTP/1.0
+				-Pragma => 'no-cache',
+				# HTTP/1.1
+				-Cache_Control => join(
+							', ',
+							qw(
+								private
+								no-cache
+								no-store
+								must-revalidate
+								max-age=0
+								pre-check=0
+								post-check=0
+							)
+						)
+				);
+
+	print <<EOF;
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd">
+<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> -->
+<!-- git core binaries version $git_version -->
+<head>
+<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
+<meta name="generator" content="gitweb/$version git/$git_version"/>
+<meta name="robots" content="index, nofollow"/>
+<meta http-equiv="refresh" content="0"/>
+<title>$title</title>
+</head>
+<body>
+EOF
+
+	print "Generating..";
+	do {
+		print ".";
+		sleep 2 if $x > 0;
+		open(cacheFile, '<:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+		close(cacheFile);
+		$x++;
+		$combinedLockStat = $lockStat;
+	} while ((! $combinedLockStat) && ($x < $max));
+	print <<EOF;
+</body>
+</html>
+EOF
+	return;
+}
+
+sub cacheDisplay {
+	local $/ = undef;
+	$|++;
+
+	my ($action) = @_;
+	open(cacheFile, '<:utf8', "$fullhashpath");
+	$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+
+	if (! $lockStat ){
+		close(cacheFile);
+		cacheWaitForUpdate($action);
+	}
+
+	if(
+		(
+			$action eq "snapshot"
+			||
+			$action eq "blob_plain"
+		)
+	){
+		my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB);
+		if (! $lockStatBIN ){
+			system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log");
+			close(cacheFile);
+			close(cacheFileBin);
+			cacheWaitForUpdate($action);
+		}
+
+		my $binfilesize = -s "$fullhashbinpathfinal";
+		print "Content-Length: $binfilesize";
+	}
+	while( <cacheFile> ){
+		print $_;
+	}
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		binmode STDOUT, ':raw';
+		print <cacheFileBin>;
+		binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+		close(cacheFileBin);
+	}
+	close(cacheFile);
+	$|--;
+}
+
+1;
diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
index 4132aab..972d32e 100644
--- a/gitweb/static/gitweb.css
+++ b/gitweb/static/gitweb.css
@@ -67,6 +67,12 @@ div.page_path {
 	border-width: 0px 0px 1px;
 }
 
+div.cachetime {
+	float: left;
+	margin-right: 10px;
+	color: #555555;
+}
+
 div.page_footer {
 	height: 17px;
 	padding: 4px 8px;
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching
  2010-10-31  9:21       ` Jakub Narebski
                           ` (3 preceding siblings ...)
  2010-11-01 10:24         ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
@ 2010-11-01 10:24         ` Jakub Narebski
  4 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-01 10:24 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin, Jakub Narebski

Add basic tests of caching support to t9500-gitweb-standalone-no-errors
test: set $caching_enabled to true and check for errors for first time
run (generating cache) and second time run (retrieving from cache) for a
single view - summary view for a project.  Check in the
t9502-gitweb-standalone-parse-output test that gitweb produces the same
output with and without caching, for first and second run, with binary
or text output.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
All tests, including those added here, passes.

 t/t9500-gitweb-standalone-no-errors.sh    |   19 ++++++++++++++
 t/t9502-gitweb-standalone-parse-output.sh |   39 +++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 4f2b9b0..0ad5fc8 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -676,4 +676,23 @@ test_expect_success HIGHLIGHT \
 	 gitweb_run "p=.git;a=blob;f=test.sh"'
 test_debug 'cat gitweb.log'
 
+# ----------------------------------------------------------------------
+# caching
+
+cat >>gitweb_config.perl <<\EOF
+$caching_enabled = 1;
+$cachedir = 'cache'; # to clear right thing
+EOF
+rm -rf cache
+
+test_expect_success \
+	'caching enabled (project summary, first run, generating cache)' \
+	'gitweb_run "p=.git;a=summary"'
+test_debug 'cat gitweb.log'
+
+test_expect_success \
+	'caching enabled (project summary, second run, cached version)' \
+	'gitweb_run "p=.git;a=summary"'
+test_debug 'cat gitweb.log'
+
 test_done
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index dd83890..f3d0cf0 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -112,4 +112,43 @@ test_expect_success 'snapshot: hierarchical branch name (xx/test)' '
 '
 test_debug 'cat gitweb.headers'
 
+
+# ----------------------------------------------------------------------
+# whether gitweb with caching enabled produces the same output
+
+test_expect_success 'setup for caching tests (utf8 commit, binary file)' '
+	. "$TEST_DIRECTORY"/t3901-utf8.txt &&
+	cp "$TEST_DIRECTORY"/test9200a.png image.png &&
+	git add image.png &&
+	git commit -F "$TEST_DIRECTORY"/t3900/1-UTF-8.txt &&
+	gitweb_run "p=.git;a=patch" &&
+	mv gitweb.body no_cache.html &&
+	gitweb_run "p=.git;a=blob_plain;f=image.png" &&
+	mv gitweb.body no_cache.png
+'
+
+test_expect_success 'enable caching' '
+	cat >>gitweb_config.perl <<-\EOF &&
+	$caching_enabled = 1;
+	$cachedir = "cache"; # to clear right thing
+	EOF
+	rm -rf cache
+'
+
+for desc in 'generating cache' 'cached version'; do
+	test_expect_success "caching enabled, HTML output, $desc" '
+		gitweb_run "p=.git;a=patch" &&
+		mv gitweb.body cache.html &&
+		test_cmp no_cache.html cache.html
+	'
+done
+
+for desc in 'generating cache' 'cached version'; do
+	test_expect_success "caching enabled, binary output, $desc" '
+		gitweb_run "p=.git;a=blob_plain;f=image.png" &&
+		mv gitweb.body cache.png &&
+		cmp no_cache.png cache.png
+	'
+done
+
 test_done
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCHv7.1b 1/4] gitweb: Prepare for splitting gitweb
  2010-11-01 10:24         ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
@ 2010-11-01 18:50           ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-01 18:50 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

Prepare gitweb for having been split into modules that are to be
installed alongside gitweb in 'lib/' subdirectory, by adding

  use lib __DIR__.'/lib';

to gitweb.perl (to main gitweb script), and preparing for putting
modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile.

This preparatory work allows to add new module to gitweb by simply
adding

  GITWEB_MODULES += <module>

to gitweb/Makefile (assuming that the module is in 'gitweb/lib/'
directory).

While at it pass GITWEBLIBDIR in addition to GITWEB_TEST_INSTALLED
to test instaleed version of gitweb and installed version of modules
(for tests which check individual (sub)modules).


Using __DIR__ from Dir::Self module (not in core, that's why currently
gitweb includes excerpt of code from Dir::Self defining __DIR__) was
chosen over using FindBin-based solution (in core since perl 5.00307,
while gitweb itself requires at least perl 5.8.0) because FindBin uses
BEGIN block, which is a problem under mod_perl and other persistent
Perl environments (thought there are workarounds).

At Pavan Kumar Sankara suggestion gitweb/Makefile uses

  install [OPTION]... SOURCE... DIRECTORY

format (2nd format) with single SOURCE rather than

  install [OPTION]... SOURCE DEST

format (1st format) because of security reasons (race conditions).
Modern GNU install has `-T' / `--no-target-directory' option, but we
cannot rely that the $(INSTALL) we are using supports this option.

The install-modules target in gitweb/Makefile uses shell 'for' loop,
instead of make's $(foreach) function, to avoid possible problem with
generating a command line that exceeded the maximum argument list
length.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This version requires and applies cleanly on top of patch 958a846 
(gitweb/Makefile: Add 'test' and 'test-installed' targets, 2010-09-26)
in 'pu'.

 gitweb/Makefile    |   17 +++++++++++++++--
 gitweb/gitweb.perl |    8 ++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/gitweb/Makefile b/gitweb/Makefile
index df908a1..c2d72e4 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -56,6 +56,7 @@ PERL_PATH  ?= /usr/bin/perl
 bindir_SQ = $(subst ','\'',$(bindir))#'
 gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#'
 gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#'
+gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#'
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#'
 PERL_PATH_SQ  = $(subst ','\'',$(PERL_PATH))#'
 DESTDIR_SQ    = $(subst ','\'',$(DESTDIR))#'
@@ -151,20 +152,32 @@ test:
 
 test-installed:
 	GITWEB_TEST_INSTALLED='$(DESTDIR_SQ)$(gitwebdir_SQ)' \
+	GITWEBLIBDIR='$(DESTDIR_SQ)$(gitweblibdir_SQ)' \
 		$(MAKE) -C ../t gitweb-test
 
 ### Installation rules
 
-install: all
+install: all install-modules
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 	$(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 
+install-modules:
+	install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \
+	for dir in $$install_dirs; do \
+		test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir' || \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir'; \
+	done
+	gitweb_modules="$(GITWEB_MODULES)" && \
+	for mod in $$gitweb_modules; do \
+		$(INSTALL) -m 644 lib/$$mod 
'$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$(dirname $$mod)'; \
+	done
+
 ### Cleaning rules
 
 clean:
 	$(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css 
GITWEB-BUILD-OPTIONS
 
-.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE 
FORCE
+.PHONY: all clean install install-modules test 
test-installed .FORCE-GIT-VERSION-FILE FORCE
 
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index d521b4c..e4c08ba 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -9,6 +9,14 @@
 
 use strict;
 use warnings;
+
+use File::Spec;
+# __DIR__ is taken from Dir::Self __DIR__ fragment
+sub __DIR__ () {
+	File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]);
+}
+use lib __DIR__ . '/lib';
+
 use CGI qw(:standard :escapeHTML -nosticky);
 use CGI::Util qw(unescape);
 use CGI::Carp qw(fatalsToBrowser set_message);
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC/PATCHv7.2 0/4] Gitweb caching v7.2
  2010-11-01 10:24         ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski
@ 2010-11-12 23:35           ` Jakub Narebski
  2010-11-12 23:41             ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
                               ` (3 more replies)
  0 siblings, 4 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-12 23:35 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

This series is a bit fixed up and a tiny bit cleaned up version of
"Gitweb caching v7" series from John 'Warthog9' Hawley:
  http://thread.gmane.org/gmane.comp.version-control.git/160147

This series is based on top of 'next', because it contains 
'jn/gitweb-test' branch.

The difference from v7.2 is that it takes into account 'test-installed'
target in gitweb/Makefile in first patch of its series, and that testing
of caching support is slightly extended.  Note that some of those tests
fail currently, not because of error in gitweb caching code, but because
I was not able to disable calling cacheWaitForUpdate(), which hinders
testing.

Those differences were also described in "Re: What's cooking in git.git
(Nov 2010, #01; Tue, 9)":
  http://article.gmane.org/gmane.comp.version-control.git/161309


Table of contents:
==================
 [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb
 [PATCHv7.1 2/4] gitweb: add output buffering and associated functions
 [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org)
 [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching

Shortlog:
=========
Jakub Narebski (2):
  gitweb: Prepare for splitting gitweb
  gitweb: Minimal testing of gitweb caching

John 'Warthog9' Hawley (2):
  gitweb: add output buffering and associated functions
  gitweb: File based caching layer (from git.kernel.org)

Diffstat:
=========
 gitweb/Makefile                           |   20 ++-
 gitweb/gitweb.perl                        |  134 +++++++++++-
 gitweb/lib/cache.pl                       |  348 +++++++++++++++++++++++++++++
 gitweb/static/gitweb.css                  |    6 +
 t/gitweb-lib.sh                           |   15 ++
 t/t9500-gitweb-standalone-no-errors.sh    |   20 ++
 t/t9501-gitweb-standalone-http-status.sh  |   13 +
 t/t9502-gitweb-standalone-parse-output.sh |   33 +++
 8 files changed, 579 insertions(+), 10 deletions(-)
 create mode 100644 gitweb/lib/cache.pl

-- 
1.7.3

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb
  2010-11-12 23:35           ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski
@ 2010-11-12 23:41             ` Jakub Narebski
  2010-11-17 23:10               ` Junio C Hamano
  2010-12-02 10:17               ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski
  2010-11-12 23:44             ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-12 23:41 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

Prepare gitweb for having been split into modules that are to be
installed alongside gitweb in 'lib/' subdirectory, by adding

  use lib __DIR__.'/lib';

to gitweb.perl (to main gitweb script), and preparing for putting
modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile.

This preparatory work allows to add new module to gitweb by simply
adding

  GITWEB_MODULES += <module>

to gitweb/Makefile (assuming that the module is in 'gitweb/lib/'
directory).

While at it pass GITWEBLIBDIR in addition to GITWEB_TEST_INSTALLED
to test instaleed version of gitweb and installed version of modules
(for tests which check individual (sub)modules).


Using __DIR__ from Dir::Self module (not in core, that's why currently
gitweb includes excerpt of code from Dir::Self defining __DIR__) was
chosen over using FindBin-based solution (in core since perl 5.00307,
while gitweb itself requires at least perl 5.8.0) because FindBin uses
BEGIN block, which is a problem under mod_perl and other persistent
Perl environments (thought there are workarounds).

At Pavan Kumar Sankara suggestion gitweb/Makefile uses

  install [OPTION]... SOURCE... DIRECTORY

format (2nd format) with single SOURCE rather than

  install [OPTION]... SOURCE DEST

format (1st format) because of security reasons (race conditions).
Modern GNU install has `-T' / `--no-target-directory' option, but we
cannot rely that the $(INSTALL) we are using supports this option.

The install-modules target in gitweb/Makefile uses shell 'for' loop,
instead of make's $(foreach) function, to avoid possible problem with
generating a command line that exceeded the maximum argument list
length.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This version is rebased on top of 'next', which means that it takes
'jn/gitweb-test' into account, in particular 958a846 (gitweb/Makefile:
Add 'test' and 'test-installed' targets, 2010-09-26).

That means adding GITWEBLIBDIR for future testing of individual
modules (it was not present in v7.1).

 gitweb/Makefile    |   17 +++++++++++++++--
 gitweb/gitweb.perl |    8 ++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/gitweb/Makefile b/gitweb/Makefile
index 0a6ac00..f9e32eb 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -57,6 +57,7 @@ PERL_PATH  ?= /usr/bin/perl
 bindir_SQ = $(subst ','\'',$(bindir))#'
 gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#'
 gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#'
+gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#'
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#'
 PERL_PATH_SQ  = $(subst ','\'',$(PERL_PATH))#'
 DESTDIR_SQ    = $(subst ','\'',$(DESTDIR))#'
@@ -153,20 +154,32 @@ test:
 
 test-installed:
 	GITWEB_TEST_INSTALLED='$(DESTDIR_SQ)$(gitwebdir_SQ)' \
+	GITWEBLIBDIR='$(DESTDIR_SQ)$(gitweblibdir_SQ)' \
 		$(MAKE) -C ../t gitweb-test
 
 ### Installation rules
 
-install: all
+install: all install-modules
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 	$(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 
+install-modules:
+	install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \
+	for dir in $$install_dirs; do \
+		test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir' || \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$dir'; \
+	done
+	gitweb_modules="$(GITWEB_MODULES)" && \
+	for mod in $$gitweb_modules; do \
+		$(INSTALL) -m 644 lib/$$mod '$(DESTDIR_SQ)$(gitweblibdir_SQ)/$$(dirname $$mod)'; \
+	done
+
 ### Cleaning rules
 
 clean:
 	$(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css GITWEB-BUILD-OPTIONS
 
-.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE
+.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE
 
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 679f2da..cfa511c 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -10,6 +10,14 @@
 use 5.008;
 use strict;
 use warnings;
+
+use File::Spec;
+# __DIR__ is taken from Dir::Self __DIR__ fragment
+sub __DIR__ () {
+	File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]);
+}
+use lib __DIR__ . '/lib';
+
 use CGI qw(:standard :escapeHTML -nosticky);
 use CGI::Util qw(unescape);
 use CGI::Carp qw(fatalsToBrowser set_message);
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCHv7.1 2/4] gitweb: add output buffering and associated functions
  2010-11-12 23:35           ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski
  2010-11-12 23:41             ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
@ 2010-11-12 23:44             ` Jakub Narebski
  2010-11-12 23:56             ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
  2010-11-13  0:01             ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
  3 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-12 23:44 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

From: John 'Warthog9' Hawley <warthog9@eaglescrag.net>

This adds output buffering for gitweb, mainly in preparation for
caching support.  This is a dramatic change to how caching was being
done, mainly in passing around the variable manually and such.

This centrally flips the entire STDOUT to a variable, which after the
completion of the run, flips it back and does a print on the resulting
data.

This should save on the previous 10K line patch (or so) that adds more
explicit output passing.

[jn: modified reset_output to silence 'gitweb.perl: Name "main::STDOUT_REAL"
 used only once: possible typo at ../gitweb/gitweb.perl line 1130.' warning]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
This is unchanged from previous version.


Reminder: it uses

  open STDOUT, ">&", \*STDOUT_REAL;

rather than

  open(STDOUT,">&STDOUT_REAL");

to silence spurious (invalid) warning

  gitweb.perl: Name "main::STDOUT_REAL" used only once: possible typo
  at ../gitweb/gitweb.perl line 1130.


 gitweb/gitweb.perl |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index cfa511c..cae0e34 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -39,6 +39,9 @@ BEGIN {
 
 our $version = "++GIT_VERSION++";
 
+# Output buffer variable
+our $output = "";
+
 our ($my_url, $my_uri, $base_url, $path_info, $home_link);
 sub evaluate_uri {
 	our $cgi;
@@ -1134,6 +1137,25 @@ sub evaluate_argv {
 	);
 }
 
+sub change_output {
+	our $output;
+
+	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
+	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+
+	# Close STDOUT, so that it isn't being used anymore.
+	close STDOUT;
+
+	# Trap STDOUT to the $output variable, which is what I was using in the original
+	# patch anyway.
+	open(STDOUT,">", \$output) || die "Unable to open STDOUT: $!"; #open STDOUT handle to use $var
+}
+
+sub reset_output {
+	# This basically takes STDOUT_REAL and puts it back as STDOUT
+	open STDOUT, ">&", \*STDOUT_REAL;
+}
+
 sub run {
 	evaluate_argv();
 
@@ -1145,7 +1167,10 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
+		change_output();
 		run_request();
+		reset_output();
+		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3655,6 +3680,10 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# Reset the output so that we are actually going to STDOUT as opposed
+	# to buffering the output.
+	reset_output();
+
 	git_header_html($http_responses{$status}, undef, %opts);
 	print <<EOF;
 <div class="page_body">
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org)
  2010-11-12 23:35           ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski
  2010-11-12 23:41             ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
  2010-11-12 23:44             ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
@ 2010-11-12 23:56             ` Jakub Narebski
  2010-11-28 11:22               ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski
  2010-11-29 23:07               ` [PATCHv7.1 3/4] " demerphq
  2010-11-13  0:01             ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
  3 siblings, 2 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-12 23:56 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

From: John 'Warthog9' Hawley <warthog9@eaglescrag.net>

This is a relatively large patch that implements the file based
caching layer that is quite similar to the one used  on such large
sites as kernel.org and soon git.fedoraproject.org.  This provides
a simple, and straight forward caching mechanism that scales
dramatically better than Gitweb by itself.

The caching layer basically buffers the output that Gitweb would
normally return, and saves that output to a cache file on the local
disk.  When the file is requested it attempts to gain a shared lock
on the cache file and cat it out to the client.  Should an exclusive
lock be on a file (it's being updated) the code has a choice to either
update in the background and go ahead and show the stale page while
update is being performed, or stall the client(s) until the page
is generated.

There are two forms of stalling involved here, background building
and non-background building, both of which are discussed in the
configuration page.

There are still a few known "issues" with respect to this:
- Code needs to be added to be "browser" aware so
  that clients like wget that are trying to get a
  binary blob don't obtain a "Generating..." page

Caching is disabled by default.  You can turn it on by setting
$caching_enabled variable to true to enable file based caching.

[jn: added error checking to loading 'cache.pl'; moved check
 for $caching_enabled outside out of cache_fetch, which required
 update to die_error()]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
This patch is unchanged from previous version.

Reminder: the main difference from J.H. patch was moving check for
$caching_enabled out of cache_fetch, so that when caching is disabled
then gitweb works (almost) the same as before this patch.

Also, error checking for loading 'cache.pl' file.

Interdiff in previous series.

 gitweb/Makefile          |    3 +
 gitweb/gitweb.perl       |  105 ++++++++++++--
 gitweb/lib/cache.pl      |  348 ++++++++++++++++++++++++++++++++++++++++++++++
 gitweb/static/gitweb.css |    6 +
 4 files changed, 450 insertions(+), 12 deletions(-)
 create mode 100644 gitweb/lib/cache.pl

diff --git a/gitweb/Makefile b/gitweb/Makefile
index f9e32eb..6ddd4f1 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -113,6 +113,9 @@ endif
 
 GITWEB_FILES += static/git-logo.png static/git-favicon.png
 
+# Gitweb caching
+GITWEB_MODULES += cache.pl
+
 GITWEB_REPLACE = \
 	-e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \
 	-e 's|++GIT_BINDIR++|$(bindir)|g' \
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index cae0e34..3c3ff08 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -250,6 +250,53 @@ our %avatar_size = (
 # Leave it undefined (or set to 'undef') to turn off load checking.
 our $maxload = 300;
 
+# This enables/disables the caching layer in gitweb.  This currently only supports the
+# 'dumb' file based caching layer, primarily used on git.kernel.org.  this is reasonably
+# effective but it has the downside of requiring a huge amount of disk space if there
+# are a number of repositories involved.  It is not uncommon for git.kernel.org to have
+# on the order of 80G - 120G accumulate over the course of a few months.  It is recommended
+# that the cache directory be periodically completely deleted, and this is safe to perform.
+# Suggested mechanism
+# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush
+our $caching_enabled = 0;
+
+# Used to set the minimum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to be under this number of seconds we set the cache timeout
+# to this minimum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $minCacheTime = 20;
+
+# Used to set the maximum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to exceed this number of seconds we set the cache timeout
+# to this maximum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $maxCacheTime = 1200;
+
+# If you need to change the location of the caching directory, override this
+# otherwise this will probably do fine for you
+our $cachedir = 'cache';
+
+# If this is set (to 1) cache will do it's best to always display something instead
+# of making someone wait for the cache to update.  This will launch the cacheUpdate
+# into the background and it will lock a <file>.bg file and will only lock the
+# actual cache file when it needs to write into it.  In theory this will make
+# gitweb seem more responsive at the price of possibly stale data.
+our $backgroundCache = 1;
+
+# Used to set the maximum cache file life.  If a cache files last modify time exceeds
+# this value, it will assume that the data is just too old, and HAS to be regenerated
+# instead of trying to display the existing cache data.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+# 18000 = 5 hours
+our $maxCacheLife = 18000;
+
+# Used to enable or disable background forking of the gitweb caching.  Mainly here for debugging purposes
+our $cacheDoFork = 1;
+
+our $fullhashpath = *STDOUT;
+our $fullhashbinpath = *STDOUT;
+our $fullhashbinpathfinal = *STDOUT;
+
 # configuration for 'highlight' (http://www.andre-simon.de/)
 # match by basename
 our %highlight_basename = (
@@ -506,6 +553,15 @@ our %feature = (
 		'default' => [0]},
 );
 
+#
+# Includes
+#
+if (!exists $INC{'cache.pl'}) {
+	my $return = do 'cache.pl';
+	die $@ if $@;
+	die "Couldn't read 'cache.pl': $!" if (!defined $return);
+}
+
 sub gitweb_get_feature {
 	my ($name) = @_;
 	return unless exists $feature{$name};
@@ -734,6 +790,10 @@ our %actions = (
 	"project_list" => \&git_project_list,
 	"project_index" => \&git_project_index,
 );
+sub is_cacheable {
+	my $action = shift;
+	return !($action eq 'blame_data' || $action eq 'blame_incremental');
+}
 
 # finally, we have the hash of allowed extra_options for the commands that
 # allow them
@@ -1072,7 +1132,11 @@ sub dispatch {
 	    !$project) {
 		die_error(400, "Project needed");
 	}
-	$actions{$action}->();
+	if ($caching_enabled && is_cacheable($action)) {
+		cache_fetch($action);
+	} else {
+		$actions{$action}->();
+	}
 }
 
 sub reset_timer {
@@ -1142,6 +1206,7 @@ sub change_output {
 
 	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
 	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+	print STDOUT_REAL "";
 
 	# Close STDOUT, so that it isn't being used anymore.
 	close STDOUT;
@@ -1167,10 +1232,7 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
-		change_output();
 		run_request();
-		reset_output();
-		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3447,7 +3509,8 @@ sub git_header_html {
 	# support xhtml+xml but choking when it gets what it asked for.
 	if (defined $cgi->http('HTTP_ACCEPT') &&
 	    $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
-	    $cgi->Accept('application/xhtml+xml') != 0) {
+	    $cgi->Accept('application/xhtml+xml') != 0 &&
+	    !$caching_enabled) {
 		$content_type = 'application/xhtml+xml';
 	} else {
 		$content_type = 'text/html';
@@ -3592,6 +3655,7 @@ sub git_footer_html {
 	my $feed_class = 'rss_logo';
 
 	print "<div class=\"page_footer\">\n";
+	print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n";
 	if (defined $project) {
 		my $descr = git_get_project_description($project);
 		if (defined $descr) {
@@ -3680,9 +3744,14 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# The output handlers for die_error need to be reset to STDOUT
+	# so that half the message isn't being output to random and
+	# half to STDOUT as expected.  This is mainly for the benefit
+	# of using git_header_html() and git_footer_html() since
+	#
 	# Reset the output so that we are actually going to STDOUT as opposed
 	# to buffering the output.
-	reset_output();
+	reset_output() if ($caching_enabled);
 
 	git_header_html($http_responses{$status}, undef, %opts);
 	print <<EOF;
@@ -5592,9 +5661,15 @@ sub git_blob_plain {
 			($sandbox ? 'attachment' : 'inline')
 			. '; filename="' . $save_as . '"');
 	local $/ = undef;
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
@@ -5879,9 +5954,15 @@ sub git_snapshot {
 
 	open my $fd, "-|", $cmd
 		or die_error(500, "Execute git-archive failed");
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl
new file mode 100644
index 0000000..dd14bfb
--- /dev/null
+++ b/gitweb/lib/cache.pl
@@ -0,0 +1,348 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net>
+#
+# This program is licensed under the GPLv2
+
+#
+# Gitweb caching engine
+#
+
+#use File::Path qw(make_path remove_tree);
+use File::Path qw(mkpath rmtree); # Used for compatability reasons
+use Digest::MD5 qw(md5 md5_hex md5_base64);
+use Fcntl ':flock';
+use File::Copy;
+
+sub cache_fetch {
+	my ($action) = @_;
+	my $cacheTime = 0;
+
+	if(! -d $cachedir){
+		print "*** Warning ***: Caching enabled but cache directory does not exsist.  ($cachedir)\n";
+		mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create";
+		print "Cache directory created successfully\n";
+	}
+
+	our $full_url = "$my_url?". $ENV{'QUERY_STRING'};
+	our $urlhash = md5_hex($full_url);
+	our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/";
+
+	eval { mkpath( $fullhashdir, 0, 0777 ) };
+	if ($@) {
+		die_error(500, "Internal Server Error", "Could not create cache directory: $@");
+	}
+	$fullhashpath = "$fullhashdir/". substr( $urlhash, 2 );
+	$fullhashbinpath = "$fullhashpath.bin.wt";
+	$fullhashbinpathfinal = "$fullhashpath.bin";
+
+	if(! -e "$fullhashpath" ){
+		if(! $cacheDoFork || ! defined(my $childPid = fork()) ){
+			cacheUpdate($action,0);
+			cacheDisplay($action);
+		} elsif ( $childPid == 0 ){
+			#run the updater
+			cacheUpdate($action,1);
+		}else{
+			cacheWaitForUpdate($action);
+		}
+	}else{
+		#if cache is out dated, update
+		#else displayCache();
+		open(cacheFile, '<', "$fullhashpath");
+		stat(cacheFile);
+		close(cacheFile);
+		my $stat_time = (stat(_))[9];
+		my $stat_size = (stat(_))[7];
+
+		$cacheTime = get_loadavg() * 60;
+		if( $cacheTime > $maxCacheTime ){
+			$cacheTime = $maxCacheTime;
+		}
+		if( $cacheTime < $minCacheTime ){
+			$cacheTime = $minCacheTime;
+		}
+		if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){
+			if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){
+				cacheUpdate($action,0);
+				cacheDisplay($action);
+			} elsif ( $childPid == 0 ){
+				#run the updater
+				#print "Running updater\n";
+				cacheUpdate($action,1);
+			}else{
+				#print "Waiting for update\n";
+				cacheWaitForUpdate($action);
+			}
+		} else {
+			cacheDisplay($action);
+		}
+
+
+	}
+
+	#
+	# If all of the caching failes - lets go ahead and press on without it and fall back to 'default'
+	# non-caching behavior.  This is the softest of the failure conditions.
+	#
+	#$actions{$action}->();
+}
+
+sub cacheUpdate {
+	my ($action,$areForked) = @_;
+	my $lockingStatus;
+	my $fileData = "";
+
+	if($backgroundCache){
+		open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
+		my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStatBG;
+	}else{
+		open(cacheFile, '>:utf8', \$fullhashpath);
+		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStat;
+	}
+	#print "lock status: $lockStat\n";
+
+
+	if (! $lockStatus ){
+		if ( $areForked ){
+			exit(0);
+		}else{
+			return;
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath");
+		my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB);
+	}
+
+	# Trap all output from the action
+	change_output();
+
+	$actions{$action}->();
+
+	# Reset the outputs as we should be fine now
+	reset_output();
+
+
+	if($backgroundCache){
+		open(cacheFile, '>:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_EX);
+
+		if (! $lockStat ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX);
+
+		if (! $lockStatBIN ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	# Actually dump the output to the proper file handler
+	local $/ = undef;
+	$|++;
+	print cacheFile "$output";
+	$|--;
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!";
+
+		flock(cacheFileBinFINAL,LOCK_UN);
+		close(cacheFileBinFINAL);
+
+		flock(cacheFileBinWT,LOCK_UN);
+		close(cacheFileBinWT);
+	}
+
+	flock(cacheFile,LOCK_UN);
+	close(cacheFile);
+
+	if($backgroundCache){
+		flock(cacheFileBG,LOCK_UN);
+		close(cacheFileBG);
+	}
+
+	if ( $areForked ){
+		exit(0);
+	} else {
+		return;
+	}
+}
+
+
+sub cacheWaitForUpdate {
+	my ($action) = @_;
+	my $x = 0;
+	my $max = 10;
+	my $lockStat = 0;
+
+	if( $backgroundCache ){
+		if( -e "$fullhashpath" ){
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			stat(cacheFile);
+			close(cacheFile);
+
+			if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){
+				cacheDisplay($action);
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "atom"
+		||
+		$action eq "rss"
+		||
+		$action eq "opml"
+	){
+		do {
+			sleep 2 if $x > 0;
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			close(cacheFile);
+			$x++;
+			$combinedLockStat = $lockStat;
+		} while ((! $combinedLockStat) && ($x < $max));
+
+		if( $x != $max ){
+			cacheDisplay($action);
+		}
+		return;
+	}
+
+	$| = 1;
+
+	print $::cgi->header(
+				-type=>'text/html',
+				-charset => 'utf-8',
+				-status=> 200,
+				-expires => 'now',
+				# HTTP/1.0
+				-Pragma => 'no-cache',
+				# HTTP/1.1
+				-Cache_Control => join(
+							', ',
+							qw(
+								private
+								no-cache
+								no-store
+								must-revalidate
+								max-age=0
+								pre-check=0
+								post-check=0
+							)
+						)
+				);
+
+	print <<EOF;
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd">
+<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> -->
+<!-- git core binaries version $git_version -->
+<head>
+<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
+<meta name="generator" content="gitweb/$version git/$git_version"/>
+<meta name="robots" content="index, nofollow"/>
+<meta http-equiv="refresh" content="0"/>
+<title>$title</title>
+</head>
+<body>
+EOF
+
+	print "Generating..";
+	do {
+		print ".";
+		sleep 2 if $x > 0;
+		open(cacheFile, '<:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+		close(cacheFile);
+		$x++;
+		$combinedLockStat = $lockStat;
+	} while ((! $combinedLockStat) && ($x < $max));
+	print <<EOF;
+</body>
+</html>
+EOF
+	return;
+}
+
+sub cacheDisplay {
+	local $/ = undef;
+	$|++;
+
+	my ($action) = @_;
+	open(cacheFile, '<:utf8', "$fullhashpath");
+	$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+
+	if (! $lockStat ){
+		close(cacheFile);
+		cacheWaitForUpdate($action);
+	}
+
+	if(
+		(
+			$action eq "snapshot"
+			||
+			$action eq "blob_plain"
+		)
+	){
+		my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB);
+		if (! $lockStatBIN ){
+			system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log");
+			close(cacheFile);
+			close(cacheFileBin);
+			cacheWaitForUpdate($action);
+		}
+
+		my $binfilesize = -s "$fullhashbinpathfinal";
+		print "Content-Length: $binfilesize";
+	}
+	while( <cacheFile> ){
+		print $_;
+	}
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		binmode STDOUT, ':raw';
+		print <cacheFileBin>;
+		binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+		close(cacheFileBin);
+	}
+	close(cacheFile);
+	$|--;
+}
+
+1;
+__END__
diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
index 4132aab..972d32e 100644
--- a/gitweb/static/gitweb.css
+++ b/gitweb/static/gitweb.css
@@ -67,6 +67,12 @@ div.page_path {
 	border-width: 0px 0px 1px;
 }
 
+div.cachetime {
+	float: left;
+	margin-right: 10px;
+	color: #555555;
+}
+
 div.page_footer {
 	height: 17px;
 	padding: 4px 8px;
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching
  2010-11-12 23:35           ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski
                               ` (2 preceding siblings ...)
  2010-11-12 23:56             ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
@ 2010-11-13  0:01             ` Jakub Narebski
  2010-11-17 22:37               ` Junio C Hamano
  3 siblings, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-11-13  0:01 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

Add basic tests of caching support to t9500-gitweb-standalone-no-errors
test: set $caching_enabled to true and check for errors for first time
run (generating cache) and second time run (retrieving from cache) for a
single view - summary view for a project.  Check also that request for
non-existent object (which results in die_error() codepath to be called)
doesn't produce errors.

Check in t9501-gitweb-standalone-http-status that request for
non-existent object produces correct output (HTTP headers and HTML
output) also when caching is enabled.

Check in the t9502-gitweb-standalone-parse-output test that gitweb
produces the same output with and without caching, for first and
second run, with binary (raw) or plain text (utf8) output.

The common routine that enables cache, gitweb_enable_caching, is
defined in t/gitweb-lib.sh

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
As I wrote in the cover letter, because of cacheWaitForUpdate() generating
extra output some of those newly introduced tests fail, some
intermittently.

John, could you take a look at it and check if the problem is: in tests,
in my simplification, or in caching code...

 t/gitweb-lib.sh                           |   15 +++++++++++++
 t/t9500-gitweb-standalone-no-errors.sh    |   20 +++++++++++++++++
 t/t9501-gitweb-standalone-http-status.sh  |   13 +++++++++++
 t/t9502-gitweb-standalone-parse-output.sh |   33 +++++++++++++++++++++++++++++
 4 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/t/gitweb-lib.sh b/t/gitweb-lib.sh
index b9bb95f..16ce811 100644
--- a/t/gitweb-lib.sh
+++ b/t/gitweb-lib.sh
@@ -52,6 +52,21 @@ EOF
 	export SCRIPT_NAME
 }
 
+gitweb_enable_caching () {
+	test_expect_success 'enable caching' '
+		cat >>gitweb_config.perl <<-\EOF &&
+		our $caching_enabled = 1;
+		our $minCacheTime = 60*60*24*7*30;     # very long expiration time for tests (a month)
+		our $maxCacheTime = 60*60*24*7*30*365; # upper bound for dynamic (adaptive) caching
+		our $cachedir = "cache";               # for testsuite to clear the right thing
+		# required, because otherwise some tests might intermittently not pass
+		our $backgroundCache = 0; # should turn off cacheWaitForUpdate() / "Generating..."
+		#our $cacheDoFork = 0;
+		EOF
+		rm -rf cache/
+	'
+}
+
 gitweb_run () {
 	GATEWAY_INTERFACE='CGI/1.1'
 	HTTP_ACCEPT='*/*'
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 21cd286..86c1b50 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -677,4 +677,24 @@ test_expect_success HIGHLIGHT \
 	 gitweb_run "p=.git;a=blob;f=test.sh"'
 test_debug 'cat gitweb.log'
 
+# ----------------------------------------------------------------------
+# caching
+
+gitweb_enable_caching
+
+test_expect_success \
+	'caching enabled (project summary, first run, generating cache)' \
+	'gitweb_run "p=.git;a=summary"'
+test_debug 'cat gitweb.log'
+
+test_expect_success \
+	'caching enabled (project summary, second run, cached version)' \
+	'gitweb_run "p=.git;a=summary"'
+test_debug 'cat gitweb.log'
+
+test_expect_success \
+	'caching enabled (non-existent commit, non-cache error page)' \
+	'gitweb_run "p=.git;a=commit;h=non-existent"'
+test_debug 'cat gitweb.log'
+
 test_done
diff --git a/t/t9501-gitweb-standalone-http-status.sh b/t/t9501-gitweb-standalone-http-status.sh
index 2487da1..5b1df3f 100755
--- a/t/t9501-gitweb-standalone-http-status.sh
+++ b/t/t9501-gitweb-standalone-http-status.sh
@@ -134,5 +134,18 @@ cat >>gitweb_config.perl <<\EOF
 our $maxload = undef;
 EOF
 
+# ----------------------------------------------------------------------
+# output caching
+
+gitweb_enable_caching
+
+test_expect_success 'caching enabled (non-existent commit, 404 error)' '
+	gitweb_run "p=.git;a=commit;h=non-existent" &&
+	grep "Status: 404 Not Found" gitweb.headers &&
+	grep "404 - Unknown commit object" gitweb.body
+'
+test_debug 'echo "log"     && cat gitweb.log'
+test_debug 'echo "headers" && cat gitweb.headers'
+test_debug 'echo "body"    && cat gitweb.body'
 
 test_done
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index dd83890..bc8eb01 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -112,4 +112,37 @@ test_expect_success 'snapshot: hierarchical branch name (xx/test)' '
 '
 test_debug 'cat gitweb.headers'
 
+
+# ----------------------------------------------------------------------
+# whether gitweb with caching enabled produces the same output
+
+test_expect_success 'setup for caching tests (utf8 patch, binary file)' '
+	. "$TEST_DIRECTORY"/t3901-utf8.txt &&
+	cp "$TEST_DIRECTORY"/test9200a.png image.png &&
+	git add image.png &&
+	git commit -F "$TEST_DIRECTORY"/t3900/1-UTF-8.txt &&
+	gitweb_run "p=.git;a=patch" &&
+	mv gitweb.body no_cache.txt &&
+	gitweb_run "p=.git;a=blob_plain;f=image.png" &&
+	mv gitweb.body no_cache.png
+'
+
+gitweb_enable_caching
+
+for desc in 'generating cache' 'cached version'; do
+	test_expect_success "caching enabled, plain text (utf8) output, $desc" '
+		gitweb_run "p=.git;a=patch" &&
+		mv gitweb.body cache.txt &&
+		test_cmp no_cache.txt cache.txt
+	'
+done
+
+for desc in 'generating cache' 'cached version'; do
+	test_expect_success "caching enabled, binary output (raw), $desc" '
+		gitweb_run "p=.git;a=blob_plain;f=image.png" &&
+		mv gitweb.body cache.png &&
+		cmp no_cache.png cache.png
+	'
+done
+
 test_done
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching
  2010-11-13  0:01             ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
@ 2010-11-17 22:37               ` Junio C Hamano
  2010-11-17 23:12                 ` Jakub Narebski
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-11-17 22:37 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Jakub Narebski <jnareb@gmail.com> writes:

> +gitweb_enable_caching () {
> +	test_expect_success 'enable caching' '
> +		cat >>gitweb_config.perl <<-\EOF &&
> +		our $caching_enabled = 1;
> +		our $minCacheTime = 60*60*24*7*30;     # very long expiration time for tests (a month)

Does your month have 210 days in it?

> +		our $maxCacheTime = 60*60*24*7*30*365; # upper bound for dynamic (adaptive) caching

Likewise, is this 210 years?  Does this bust 32-bit time_t somewhere?

> +		our $cachedir = "cache";               # for testsuite to clear the right thing
> +		# required, because otherwise some tests might intermittently not pass
> +		our $backgroundCache = 0; # should turn off cacheWaitForUpdate() / "Generating..."
> +		#our $cacheDoFork = 0;

leftover....

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb
  2010-11-12 23:41             ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
@ 2010-11-17 23:10               ` Junio C Hamano
  2010-11-18 13:37                 ` Jakub Narebski
  2010-12-02 10:17               ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski
  1 sibling, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-11-17 23:10 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Jakub Narebski <jnareb@gmail.com> writes:

> That means adding GITWEBLIBDIR for future testing of individual
> modules (it was not present in v7.1).

You confused me.  "git grep" after test-applying this 4-patch series on
top of next shows only one GITWEBLIBDIR, which means that the symbol is
defined but nobody pays attention to it.

> diff --git a/gitweb/Makefile b/gitweb/Makefile
> index 0a6ac00..f9e32eb 100644
> --- a/gitweb/Makefile
> +++ b/gitweb/Makefile
> @@ -57,6 +57,7 @@ PERL_PATH  ?= /usr/bin/perl
> ...
> -.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE
> +.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE
>  

In the earlier series, 478da52 (gitweb: Prepare for splitting gitweb,
2010-10-30) removed the trailing blank line from here but this does not do
so anymore, which is a micronit regression.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching
  2010-11-17 22:37               ` Junio C Hamano
@ 2010-11-17 23:12                 ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-17 23:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Dnia środa 17. listopada 2010 23:37, Junio C Hamano napisał:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > +gitweb_enable_caching () {
> > +	test_expect_success 'enable caching' '
> > +		cat >>gitweb_config.perl <<-\EOF &&
> > +		our $caching_enabled = 1;
> > +		our $minCacheTime = 60*60*24*7*30;     # very long expiration time for tests (a month)
> 
> Does your month have 210 days in it?
> 
> > +		our $maxCacheTime = 60*60*24*7*30*365; # upper bound for dynamic (adaptive) caching
> 
> Likewise, is this 210 years?  Does this bust 32-bit time_t somewhere?

Ooops.

Still, fixing this doesn't fix tests.  I think I'd have to make another
extra commit, adding configure knob to turn off cacheWaitForUpdate()...


Or perhaps I did something wrong when simplifying, but I have checked
diff between full and simplified version (in comments in v7.1 series),
and it doesn't look like it...

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb
  2010-11-17 23:10               ` Junio C Hamano
@ 2010-11-18 13:37                 ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-18 13:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

On Thu, 18 Nov 2010, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > That means adding GITWEBLIBDIR for future testing of individual
> > modules (it was not present in v7.1).
> 
> You confused me.  "git grep" after test-applying this 4-patch series on
> top of next shows only one GITWEBLIBDIR, which means that the symbol is
> defined but nobody pays attention to it.

That's why I wrote "for _future_ testing"... but perhaps it is not worth
introducing it now that it is not yet used by any test.

> > diff --git a/gitweb/Makefile b/gitweb/Makefile
> > index 0a6ac00..f9e32eb 100644
> > --- a/gitweb/Makefile
> > +++ b/gitweb/Makefile
> > @@ -57,6 +57,7 @@ PERL_PATH  ?= /usr/bin/perl
> > ...
> > -.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE
> > +.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE
> >  
> 
> In the earlier series, 478da52 (gitweb: Prepare for splitting gitweb,
> 2010-10-30) removed the trailing blank line from here but this does not do
> so anymore, which is a micronit regression.

I was not sure if such "while at it" style improvements are encouraged
or not.  Will remember it for v7.3.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org)
  2010-11-12 23:56             ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
@ 2010-11-28 11:22               ` Jakub Narebski
  2010-11-28 11:31                 ` Jakub Narebski
  2010-11-29 23:07               ` [PATCHv7.1 3/4] " demerphq
  1 sibling, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-11-28 11:22 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

From: John 'Warthog9' Hawley <warthog9@eaglescrag.net>

This is a relatively large patch that implements the file based
caching layer that is quite similar to the one used  on such large
sites as kernel.org and soon git.fedoraproject.org.  This provides
a simple, and straight forward caching mechanism that scales
dramatically better than Gitweb by itself.

The caching layer basically buffers the output that Gitweb would
normally return, and saves that output to a cache file on the local
disk.  When the file is requested it attempts to gain a shared lock
on the cache file and cat it out to the client.  Should an exclusive
lock be on a file (it's being updated) the code has a choice to either
update in the background and go ahead and show the stale page while
update is being performed, or stall the client(s) until the page
is generated.

There are two forms of stalling involved here, background building
and non-background building, both of which are discussed in the
configuration page.

There are still a few known "issues" with respect to this:
- Code needs to be added to be "browser" aware so
  that clients like wget that are trying to get a
  binary blob don't obtain a "Generating..." page

Caching is disabled by default.  You can turn it on by setting
$caching_enabled variable to true to enable file based caching.

[jn: added error checking to loading 'cache.pl'; moved check
 for $caching_enabled outside out of cache_fetch, which required
 update to die_error()]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---

On Sat, 13 Nov 2010, Jakub Narebski wrote:

> +sub cacheUpdate {
> +       my ($action,$areForked) = @_;
> +       my $lockingStatus;
> +       my $fileData = "";
> +
> +       if($backgroundCache){
> +               open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
> +               my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
> +
> +               $lockStatus = $lockStatBG;
> +       }else{
> +               open(cacheFile, '>:utf8', \$fullhashpath);
                                            ^^^^^^^^^^^^^^

I have made mistake with this line when moving $caching_enabled check
out of cache_fetch to its caller.

Reusing $fullhashpath variable as a *capture buffer* (it has nothing
to do with path; it is not a filename no longer) wouldn't help there.

> +               my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);

A note about original code: you can't flock in-memory files.

> +
> +               $lockStatus = $lockStat;
> +       }
> +       #print "lock status: $lockStat\n";
[...]

 gitweb/Makefile          |    3 +
 gitweb/gitweb.perl       |  105 ++++++++++++--
 gitweb/lib/cache.pl      |  348 ++++++++++++++++++++++++++++++++++++++++++++++
 gitweb/static/gitweb.css |    6 +
 4 files changed, 450 insertions(+), 12 deletions(-)
 create mode 100644 gitweb/lib/cache.pl

diff --git a/gitweb/Makefile b/gitweb/Makefile
index f9e32eb..6ddd4f1 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -113,6 +113,9 @@ endif
 
 GITWEB_FILES += static/git-logo.png static/git-favicon.png
 
+# Gitweb caching
+GITWEB_MODULES += cache.pl
+
 GITWEB_REPLACE = \
 	-e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \
 	-e 's|++GIT_BINDIR++|$(bindir)|g' \
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index cae0e34..3c3ff08 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -250,6 +250,53 @@ our %avatar_size = (
 # Leave it undefined (or set to 'undef') to turn off load checking.
 our $maxload = 300;
 
+# This enables/disables the caching layer in gitweb.  This currently only supports the
+# 'dumb' file based caching layer, primarily used on git.kernel.org.  this is reasonably
+# effective but it has the downside of requiring a huge amount of disk space if there
+# are a number of repositories involved.  It is not uncommon for git.kernel.org to have
+# on the order of 80G - 120G accumulate over the course of a few months.  It is recommended
+# that the cache directory be periodically completely deleted, and this is safe to perform.
+# Suggested mechanism
+# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush
+our $caching_enabled = 0;
+
+# Used to set the minimum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to be under this number of seconds we set the cache timeout
+# to this minimum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $minCacheTime = 20;
+
+# Used to set the maximum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to exceed this number of seconds we set the cache timeout
+# to this maximum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $maxCacheTime = 1200;
+
+# If you need to change the location of the caching directory, override this
+# otherwise this will probably do fine for you
+our $cachedir = 'cache';
+
+# If this is set (to 1) cache will do it's best to always display something instead
+# of making someone wait for the cache to update.  This will launch the cacheUpdate
+# into the background and it will lock a <file>.bg file and will only lock the
+# actual cache file when it needs to write into it.  In theory this will make
+# gitweb seem more responsive at the price of possibly stale data.
+our $backgroundCache = 1;
+
+# Used to set the maximum cache file life.  If a cache files last modify time exceeds
+# this value, it will assume that the data is just too old, and HAS to be regenerated
+# instead of trying to display the existing cache data.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+# 18000 = 5 hours
+our $maxCacheLife = 18000;
+
+# Used to enable or disable background forking of the gitweb caching.  Mainly here for debugging purposes
+our $cacheDoFork = 1;
+
+our $fullhashpath = *STDOUT;
+our $fullhashbinpath = *STDOUT;
+our $fullhashbinpathfinal = *STDOUT;
+
 # configuration for 'highlight' (http://www.andre-simon.de/)
 # match by basename
 our %highlight_basename = (
@@ -506,6 +553,15 @@ our %feature = (
 		'default' => [0]},
 );
 
+#
+# Includes
+#
+if (!exists $INC{'cache.pl'}) {
+	my $return = do 'cache.pl';
+	die $@ if $@;
+	die "Couldn't read 'cache.pl': $!" if (!defined $return);
+}
+
 sub gitweb_get_feature {
 	my ($name) = @_;
 	return unless exists $feature{$name};
@@ -734,6 +790,10 @@ our %actions = (
 	"project_list" => \&git_project_list,
 	"project_index" => \&git_project_index,
 );
+sub is_cacheable {
+	my $action = shift;
+	return !($action eq 'blame_data' || $action eq 'blame_incremental');
+}
 
 # finally, we have the hash of allowed extra_options for the commands that
 # allow them
@@ -1072,7 +1132,11 @@ sub dispatch {
 	    !$project) {
 		die_error(400, "Project needed");
 	}
-	$actions{$action}->();
+	if ($caching_enabled && is_cacheable($action)) {
+		cache_fetch($action);
+	} else {
+		$actions{$action}->();
+	}
 }
 
 sub reset_timer {
@@ -1142,6 +1206,7 @@ sub change_output {
 
 	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
 	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+	print STDOUT_REAL "";
 
 	# Close STDOUT, so that it isn't being used anymore.
 	close STDOUT;
@@ -1167,10 +1232,7 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
-		change_output();
 		run_request();
-		reset_output();
-		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3447,7 +3509,8 @@ sub git_header_html {
 	# support xhtml+xml but choking when it gets what it asked for.
 	if (defined $cgi->http('HTTP_ACCEPT') &&
 	    $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
-	    $cgi->Accept('application/xhtml+xml') != 0) {
+	    $cgi->Accept('application/xhtml+xml') != 0 &&
+	    !$caching_enabled) {
 		$content_type = 'application/xhtml+xml';
 	} else {
 		$content_type = 'text/html';
@@ -3592,6 +3655,7 @@ sub git_footer_html {
 	my $feed_class = 'rss_logo';
 
 	print "<div class=\"page_footer\">\n";
+	print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n";
 	if (defined $project) {
 		my $descr = git_get_project_description($project);
 		if (defined $descr) {
@@ -3680,9 +3744,14 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# The output handlers for die_error need to be reset to STDOUT
+	# so that half the message isn't being output to random and
+	# half to STDOUT as expected.  This is mainly for the benefit
+	# of using git_header_html() and git_footer_html() since
+	#
 	# Reset the output so that we are actually going to STDOUT as opposed
 	# to buffering the output.
-	reset_output();
+	reset_output() if ($caching_enabled);
 
 	git_header_html($http_responses{$status}, undef, %opts);
 	print <<EOF;
@@ -5592,9 +5661,15 @@ sub git_blob_plain {
 			($sandbox ? 'attachment' : 'inline')
 			. '; filename="' . $save_as . '"');
 	local $/ = undef;
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
@@ -5879,9 +5954,15 @@ sub git_snapshot {
 
 	open my $fd, "-|", $cmd
 		or die_error(500, "Execute git-archive failed");
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl
new file mode 100644
index 0000000..dd14bfb
--- /dev/null
+++ b/gitweb/lib/cache.pl
@@ -0,0 +1,348 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net>
+#
+# This program is licensed under the GPLv2
+
+#
+# Gitweb caching engine
+#
+
+#use File::Path qw(make_path remove_tree);
+use File::Path qw(mkpath rmtree); # Used for compatability reasons
+use Digest::MD5 qw(md5 md5_hex md5_base64);
+use Fcntl ':flock';
+use File::Copy;
+
+sub cache_fetch {
+	my ($action) = @_;
+	my $cacheTime = 0;
+
+	if(! -d $cachedir){
+		print "*** Warning ***: Caching enabled but cache directory does not exsist.  ($cachedir)\n";
+		mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create";
+		print "Cache directory created successfully\n";
+	}
+
+	our $full_url = "$my_url?". $ENV{'QUERY_STRING'};
+	our $urlhash = md5_hex($full_url);
+	our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/";
+
+	eval { mkpath( $fullhashdir, 0, 0777 ) };
+	if ($@) {
+		die_error(500, "Internal Server Error", "Could not create cache directory: $@");
+	}
+	$fullhashpath = "$fullhashdir/". substr( $urlhash, 2 );
+	$fullhashbinpath = "$fullhashpath.bin.wt";
+	$fullhashbinpathfinal = "$fullhashpath.bin";
+
+	if(! -e "$fullhashpath" ){
+		if(! $cacheDoFork || ! defined(my $childPid = fork()) ){
+			cacheUpdate($action,0);
+			cacheDisplay($action);
+		} elsif ( $childPid == 0 ){
+			#run the updater
+			cacheUpdate($action,1);
+		}else{
+			cacheWaitForUpdate($action);
+		}
+	}else{
+		#if cache is out dated, update
+		#else displayCache();
+		open(cacheFile, '<', "$fullhashpath");
+		stat(cacheFile);
+		close(cacheFile);
+		my $stat_time = (stat(_))[9];
+		my $stat_size = (stat(_))[7];
+
+		$cacheTime = get_loadavg() * 60;
+		if( $cacheTime > $maxCacheTime ){
+			$cacheTime = $maxCacheTime;
+		}
+		if( $cacheTime < $minCacheTime ){
+			$cacheTime = $minCacheTime;
+		}
+		if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){
+			if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){
+				cacheUpdate($action,0);
+				cacheDisplay($action);
+			} elsif ( $childPid == 0 ){
+				#run the updater
+				#print "Running updater\n";
+				cacheUpdate($action,1);
+			}else{
+				#print "Waiting for update\n";
+				cacheWaitForUpdate($action);
+			}
+		} else {
+			cacheDisplay($action);
+		}
+
+
+	}
+
+	#
+	# If all of the caching failes - lets go ahead and press on without it and fall back to 'default'
+	# non-caching behavior.  This is the softest of the failure conditions.
+	#
+	#$actions{$action}->();
+}
+
+sub cacheUpdate {
+	my ($action,$areForked) = @_;
+	my $lockingStatus;
+	my $fileData = "";
+
+	if($backgroundCache){
+		open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
+		my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStatBG;
+	}else{
+		open(cacheFile, '>:utf8', \$fullhashpath);
+		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStat;
+	}
+	#print "lock status: $lockStat\n";
+
+
+	if (! $lockStatus ){
+		if ( $areForked ){
+			exit(0);
+		}else{
+			return;
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath");
+		my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB);
+	}
+
+	# Trap all output from the action
+	change_output();
+
+	$actions{$action}->();
+
+	# Reset the outputs as we should be fine now
+	reset_output();
+
+
+	if($backgroundCache){
+		open(cacheFile, '>:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_EX);
+
+		if (! $lockStat ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX);
+
+		if (! $lockStatBIN ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	# Actually dump the output to the proper file handler
+	local $/ = undef;
+	$|++;
+	print cacheFile "$output";
+	$|--;
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!";
+
+		flock(cacheFileBinFINAL,LOCK_UN);
+		close(cacheFileBinFINAL);
+
+		flock(cacheFileBinWT,LOCK_UN);
+		close(cacheFileBinWT);
+	}
+
+	flock(cacheFile,LOCK_UN);
+	close(cacheFile);
+
+	if($backgroundCache){
+		flock(cacheFileBG,LOCK_UN);
+		close(cacheFileBG);
+	}
+
+	if ( $areForked ){
+		exit(0);
+	} else {
+		return;
+	}
+}
+
+
+sub cacheWaitForUpdate {
+	my ($action) = @_;
+	my $x = 0;
+	my $max = 10;
+	my $lockStat = 0;
+
+	if( $backgroundCache ){
+		if( -e "$fullhashpath" ){
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			stat(cacheFile);
+			close(cacheFile);
+
+			if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){
+				cacheDisplay($action);
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "atom"
+		||
+		$action eq "rss"
+		||
+		$action eq "opml"
+	){
+		do {
+			sleep 2 if $x > 0;
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			close(cacheFile);
+			$x++;
+			$combinedLockStat = $lockStat;
+		} while ((! $combinedLockStat) && ($x < $max));
+
+		if( $x != $max ){
+			cacheDisplay($action);
+		}
+		return;
+	}
+
+	$| = 1;
+
+	print $::cgi->header(
+				-type=>'text/html',
+				-charset => 'utf-8',
+				-status=> 200,
+				-expires => 'now',
+				# HTTP/1.0
+				-Pragma => 'no-cache',
+				# HTTP/1.1
+				-Cache_Control => join(
+							', ',
+							qw(
+								private
+								no-cache
+								no-store
+								must-revalidate
+								max-age=0
+								pre-check=0
+								post-check=0
+							)
+						)
+				);
+
+	print <<EOF;
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd">
+<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> -->
+<!-- git core binaries version $git_version -->
+<head>
+<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
+<meta name="generator" content="gitweb/$version git/$git_version"/>
+<meta name="robots" content="index, nofollow"/>
+<meta http-equiv="refresh" content="0"/>
+<title>$title</title>
+</head>
+<body>
+EOF
+
+	print "Generating..";
+	do {
+		print ".";
+		sleep 2 if $x > 0;
+		open(cacheFile, '<:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+		close(cacheFile);
+		$x++;
+		$combinedLockStat = $lockStat;
+	} while ((! $combinedLockStat) && ($x < $max));
+	print <<EOF;
+</body>
+</html>
+EOF
+	return;
+}
+
+sub cacheDisplay {
+	local $/ = undef;
+	$|++;
+
+	my ($action) = @_;
+	open(cacheFile, '<:utf8', "$fullhashpath");
+	$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+
+	if (! $lockStat ){
+		close(cacheFile);
+		cacheWaitForUpdate($action);
+	}
+
+	if(
+		(
+			$action eq "snapshot"
+			||
+			$action eq "blob_plain"
+		)
+	){
+		my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB);
+		if (! $lockStatBIN ){
+			system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log");
+			close(cacheFile);
+			close(cacheFileBin);
+			cacheWaitForUpdate($action);
+		}
+
+		my $binfilesize = -s "$fullhashbinpathfinal";
+		print "Content-Length: $binfilesize";
+	}
+	while( <cacheFile> ){
+		print $_;
+	}
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		binmode STDOUT, ':raw';
+		print <cacheFileBin>;
+		binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+		close(cacheFileBin);
+	}
+	close(cacheFile);
+	$|--;
+}
+
+1;
+__END__
diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
index 4132aab..972d32e 100644
--- a/gitweb/static/gitweb.css
+++ b/gitweb/static/gitweb.css
@@ -67,6 +67,12 @@ div.page_path {
 	border-width: 0px 0px 1px;
 }
 
+div.cachetime {
+	float: left;
+	margin-right: 10px;
+	color: #555555;
+}
+
 div.page_footer {
 	height: 17px;
 	padding: 4px 8px;
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org)
  2010-11-28 11:22               ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski
@ 2010-11-28 11:31                 ` Jakub Narebski
  2010-11-29 22:13                   ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-11-28 11:31 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

On Sun, 28 Nov 2010, Jakub Narebski wrote:
> On Sat, 13 Nov 2010, Jakub Narebski wrote:
> 
> > +sub cacheUpdate {
> > +       my ($action,$areForked) = @_;
> > +       my $lockingStatus;
> > +       my $fileData = "";
> > +
> > +       if($backgroundCache){
> > +               open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
> > +               my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
> > +
> > +               $lockStatus = $lockStatBG;
> > +       }else{
> > +               open(cacheFile, '>:utf8', \$fullhashpath);
>                                             ^^^^^^^^^^^^^^
> 
> I have made mistake with this line when moving $caching_enabled check
> out of cache_fetch to its caller.
> 
> Reusing $fullhashpath variable as a *capture buffer* (it has nothing
> to do with path; it is not a filename no longer) wouldn't help there.

Errr... I meant that this abuse didn't help avoiding my mistake.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org)
  2010-11-28 11:31                 ` Jakub Narebski
@ 2010-11-29 22:13                   ` Junio C Hamano
  2010-11-29 22:20                     ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-11-29 22:13 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Jakub Narebski <jnareb@gmail.com> writes:

>> I have made mistake with this line when moving $caching_enabled check
>> out of cache_fetch to its caller.
>> 
>> Reusing $fullhashpath variable as a *capture buffer* (it has nothing
>> to do with path; it is not a filename no longer) wouldn't help there.
>
> Errr... I meant that this abuse didn't help avoiding my mistake.

Thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4 (amend)] gitweb: File based caching layer (from git.kernel.org)
  2010-11-29 22:13                   ` Junio C Hamano
@ 2010-11-29 22:20                     ` Junio C Hamano
  2010-11-29 23:09                       ` [PATCHv7.1 3/4 (amend v2)] " Jakub Narebski
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-11-29 22:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jakub Narebski, git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Junio C Hamano <gitster@pobox.com> writes:

> Jakub Narebski <jnareb@gmail.com> writes:
>
>>> I have made mistake with this line when moving $caching_enabled check
>>> out of cache_fetch to its caller.
>>> 
>>> Reusing $fullhashpath variable as a *capture buffer* (it has nothing
>>> to do with path; it is not a filename no longer) wouldn't help there.
>>
>> Errr... I meant that this abuse didn't help avoiding my mistake.
>
> Thanks.

Wait a bit.

This seems to match what I have already queued on 'pu', no?  Am I
hallucinating?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org)
  2010-11-12 23:56             ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
  2010-11-28 11:22               ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski
@ 2010-11-29 23:07               ` demerphq
  2010-11-29 23:26                 ` demerphq
  1 sibling, 1 reply; 43+ messages in thread
From: demerphq @ 2010-11-29 23:07 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis,
	admin

2010/11/13 Jakub Narebski <jnareb@gmail.com>:
> -       binmode STDOUT, ':raw';
> -       print <$fd>;
> -       binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
> +       if ($caching_enabled) {
> +               open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
> +       }else{
> +               open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
> +       }
> +       binmode BINOUT, ':raw';
> +       print BINOUT <$fd>;
> +       binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
> +       close BINOUT;

Why do you use dynamically scoped file handles here as opposed to
lexically scoped ones?

And why do you change the output discipline on BINOUT immediately
before closing the file, and after you print data to it?

Doing so sortof makes sense when the filedhandle is STDOUT, but not
when it is BINOUT.

cheers,
Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCHv7.1 3/4 (amend v2)] gitweb: File based caching layer (from git.kernel.org)
  2010-11-29 22:20                     ` Junio C Hamano
@ 2010-11-29 23:09                       ` Jakub Narebski
  2010-11-30  0:51                         ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-11-29 23:09 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

On Mon, 29 Nov 2010, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>> Jakub Narebski <jnareb@gmail.com> writes:
>>
>>>> I have made mistake with this line when moving $caching_enabled check
>>>> out of cache_fetch to its caller.
>>>> 
>>>> Reusing $fullhashpath variable as a *capture buffer* (it has nothing
>>>> to do with path; it is not a filename no longer) wouldn't help there.
>>>
>>> Errr... I meant that this abuse didn't help avoiding my mistake.
>>
>> Thanks.
> 
> Wait a bit.
> 
> This seems to match what I have already queued on 'pu', no?  Am I
> hallucinating?
 
Damn, it looks like my mailer included wrong (older) version of a file.
I'm sorry.  Below there is interdiff:

  diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl
  index dd14bfb..4360b3d 100644
  --- a/gitweb/lib/cache.pl
  +++ b/gitweb/lib/cache.pl
  @@ -99,7 +99,7 @@ sub cacheUpdate {
   
   		$lockStatus = $lockStatBG;
   	}else{
  -		open(cacheFile, '>:utf8', \$fullhashpath);
  +		open(cacheFile, '>:utf8', $fullhashpath);
   		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
   
   		$lockStatus = $lockStat;

-- >8 --
From: John 'Warthog9' Hawley <warthog9@eaglescrag.net>

This is a relatively large patch that implements the file based
caching layer that is quite similar to the one used  on such large
sites as kernel.org and soon git.fedoraproject.org.  This provides
a simple, and straight forward caching mechanism that scales
dramatically better than Gitweb by itself.

The caching layer basically buffers the output that Gitweb would
normally return, and saves that output to a cache file on the local
disk.  When the file is requested it attempts to gain a shared lock
on the cache file and cat it out to the client.  Should an exclusive
lock be on a file (it's being updated) the code has a choice to either
update in the background and go ahead and show the stale page while
update is being performed, or stall the client(s) until the page
is generated.

There are two forms of stalling involved here, background building
and non-background building, both of which are discussed in the
configuration page.

There are still a few known "issues" with respect to this:
- Code needs to be added to be "browser" aware so
  that clients like wget that are trying to get a
  binary blob don't obtain a "Generating..." page

Caching is disabled by default.  You can turn it on by setting
$caching_enabled variable to true to enable file based caching.

[jn: added error checking to loading 'cache.pl'; moved check
 for $caching_enabled outside out of cache_fetch, which required
 update to die_error()]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 gitweb/Makefile          |    3 +
 gitweb/gitweb.perl       |  105 ++++++++++++--
 gitweb/lib/cache.pl      |  348 ++++++++++++++++++++++++++++++++++++++++++++++
 gitweb/static/gitweb.css |    6 +
 4 files changed, 450 insertions(+), 12 deletions(-)
 create mode 100644 gitweb/lib/cache.pl

diff --git a/gitweb/Makefile b/gitweb/Makefile
index f9e32eb..6ddd4f1 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -113,6 +113,9 @@ endif
 
 GITWEB_FILES += static/git-logo.png static/git-favicon.png
 
+# Gitweb caching
+GITWEB_MODULES += cache.pl
+
 GITWEB_REPLACE = \
 	-e 's|++GIT_VERSION++|$(GIT_VERSION)|g' \
 	-e 's|++GIT_BINDIR++|$(bindir)|g' \
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index cae0e34..3c3ff08 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -250,6 +250,53 @@ our %avatar_size = (
 # Leave it undefined (or set to 'undef') to turn off load checking.
 our $maxload = 300;
 
+# This enables/disables the caching layer in gitweb.  This currently only supports the
+# 'dumb' file based caching layer, primarily used on git.kernel.org.  this is reasonably
+# effective but it has the downside of requiring a huge amount of disk space if there
+# are a number of repositories involved.  It is not uncommon for git.kernel.org to have
+# on the order of 80G - 120G accumulate over the course of a few months.  It is recommended
+# that the cache directory be periodically completely deleted, and this is safe to perform.
+# Suggested mechanism
+# mv $cacheidr $cachedir.flush;mkdir $cachedir;rm -rf $cachedir.flush
+our $caching_enabled = 0;
+
+# Used to set the minimum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to be under this number of seconds we set the cache timeout
+# to this minimum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $minCacheTime = 20;
+
+# Used to set the maximum cache timeout for the dynamic caching algorithm.  Basically
+# if we calculate the cache to exceed this number of seconds we set the cache timeout
+# to this maximum.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+our $maxCacheTime = 1200;
+
+# If you need to change the location of the caching directory, override this
+# otherwise this will probably do fine for you
+our $cachedir = 'cache';
+
+# If this is set (to 1) cache will do it's best to always display something instead
+# of making someone wait for the cache to update.  This will launch the cacheUpdate
+# into the background and it will lock a <file>.bg file and will only lock the
+# actual cache file when it needs to write into it.  In theory this will make
+# gitweb seem more responsive at the price of possibly stale data.
+our $backgroundCache = 1;
+
+# Used to set the maximum cache file life.  If a cache files last modify time exceeds
+# this value, it will assume that the data is just too old, and HAS to be regenerated
+# instead of trying to display the existing cache data.
+# Value is in seconds.  1 = 1 seconds, 60 = 1 minute, 600 = 10 minutes, 3600 = 1 hour
+# 18000 = 5 hours
+our $maxCacheLife = 18000;
+
+# Used to enable or disable background forking of the gitweb caching.  Mainly here for debugging purposes
+our $cacheDoFork = 1;
+
+our $fullhashpath = *STDOUT;
+our $fullhashbinpath = *STDOUT;
+our $fullhashbinpathfinal = *STDOUT;
+
 # configuration for 'highlight' (http://www.andre-simon.de/)
 # match by basename
 our %highlight_basename = (
@@ -506,6 +553,15 @@ our %feature = (
 		'default' => [0]},
 );
 
+#
+# Includes
+#
+if (!exists $INC{'cache.pl'}) {
+	my $return = do 'cache.pl';
+	die $@ if $@;
+	die "Couldn't read 'cache.pl': $!" if (!defined $return);
+}
+
 sub gitweb_get_feature {
 	my ($name) = @_;
 	return unless exists $feature{$name};
@@ -734,6 +790,10 @@ our %actions = (
 	"project_list" => \&git_project_list,
 	"project_index" => \&git_project_index,
 );
+sub is_cacheable {
+	my $action = shift;
+	return !($action eq 'blame_data' || $action eq 'blame_incremental');
+}
 
 # finally, we have the hash of allowed extra_options for the commands that
 # allow them
@@ -1072,7 +1132,11 @@ sub dispatch {
 	    !$project) {
 		die_error(400, "Project needed");
 	}
-	$actions{$action}->();
+	if ($caching_enabled && is_cacheable($action)) {
+		cache_fetch($action);
+	} else {
+		$actions{$action}->();
+	}
 }
 
 sub reset_timer {
@@ -1142,6 +1206,7 @@ sub change_output {
 
 	# Trap the 'proper' STDOUT to STDOUT_REAL for things like error messages and such
 	open(STDOUT_REAL,">&STDOUT") or die "Unable to capture STDOUT $!\n";
+	print STDOUT_REAL "";
 
 	# Close STDOUT, so that it isn't being used anymore.
 	close STDOUT;
@@ -1167,10 +1232,7 @@ sub run {
 		$pre_dispatch_hook->()
 			if $pre_dispatch_hook;
 
-		change_output();
 		run_request();
-		reset_output();
-		print $output;
 
 		$post_dispatch_hook->()
 			if $post_dispatch_hook;
@@ -3447,7 +3509,8 @@ sub git_header_html {
 	# support xhtml+xml but choking when it gets what it asked for.
 	if (defined $cgi->http('HTTP_ACCEPT') &&
 	    $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
-	    $cgi->Accept('application/xhtml+xml') != 0) {
+	    $cgi->Accept('application/xhtml+xml') != 0 &&
+	    !$caching_enabled) {
 		$content_type = 'application/xhtml+xml';
 	} else {
 		$content_type = 'text/html';
@@ -3592,6 +3655,7 @@ sub git_footer_html {
 	my $feed_class = 'rss_logo';
 
 	print "<div class=\"page_footer\">\n";
+	print "<div class=\"cachetime\">Cache Last Updated: ". gmtime( time ) ." GMT</div>\n";
 	if (defined $project) {
 		my $descr = git_get_project_description($project);
 		if (defined $descr) {
@@ -3680,9 +3744,14 @@ sub die_error {
 		500 => '500 Internal Server Error',
 		503 => '503 Service Unavailable',
 	);
+	# The output handlers for die_error need to be reset to STDOUT
+	# so that half the message isn't being output to random and
+	# half to STDOUT as expected.  This is mainly for the benefit
+	# of using git_header_html() and git_footer_html() since
+	#
 	# Reset the output so that we are actually going to STDOUT as opposed
 	# to buffering the output.
-	reset_output();
+	reset_output() if ($caching_enabled);
 
 	git_header_html($http_responses{$status}, undef, %opts);
 	print <<EOF;
@@ -5592,9 +5661,15 @@ sub git_blob_plain {
 			($sandbox ? 'attachment' : 'inline')
 			. '; filename="' . $save_as . '"');
 	local $/ = undef;
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
@@ -5879,9 +5954,15 @@ sub git_snapshot {
 
 	open my $fd, "-|", $cmd
 		or die_error(500, "Execute git-archive failed");
-	binmode STDOUT, ':raw';
-	print <$fd>;
-	binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	if ($caching_enabled) {
+		open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}else{
+		open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
+	}
+	binmode BINOUT, ':raw';
+	print BINOUT <$fd>;
+	binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+	close BINOUT;
 	close $fd;
 }
 
diff --git a/gitweb/lib/cache.pl b/gitweb/lib/cache.pl
new file mode 100644
index 0000000..4360b3d
--- /dev/null
+++ b/gitweb/lib/cache.pl
@@ -0,0 +1,348 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2006, John 'Warthog9' Hawley <warthog19@eaglescrag.net>
+#
+# This program is licensed under the GPLv2
+
+#
+# Gitweb caching engine
+#
+
+#use File::Path qw(make_path remove_tree);
+use File::Path qw(mkpath rmtree); # Used for compatability reasons
+use Digest::MD5 qw(md5 md5_hex md5_base64);
+use Fcntl ':flock';
+use File::Copy;
+
+sub cache_fetch {
+	my ($action) = @_;
+	my $cacheTime = 0;
+
+	if(! -d $cachedir){
+		print "*** Warning ***: Caching enabled but cache directory does not exsist.  ($cachedir)\n";
+		mkdir ("cache", 0755) || die "Cannot create cache dir - you will need to manually create";
+		print "Cache directory created successfully\n";
+	}
+
+	our $full_url = "$my_url?". $ENV{'QUERY_STRING'};
+	our $urlhash = md5_hex($full_url);
+	our $fullhashdir = "$cachedir/". substr( $urlhash, 0, 2) ."/";
+
+	eval { mkpath( $fullhashdir, 0, 0777 ) };
+	if ($@) {
+		die_error(500, "Internal Server Error", "Could not create cache directory: $@");
+	}
+	$fullhashpath = "$fullhashdir/". substr( $urlhash, 2 );
+	$fullhashbinpath = "$fullhashpath.bin.wt";
+	$fullhashbinpathfinal = "$fullhashpath.bin";
+
+	if(! -e "$fullhashpath" ){
+		if(! $cacheDoFork || ! defined(my $childPid = fork()) ){
+			cacheUpdate($action,0);
+			cacheDisplay($action);
+		} elsif ( $childPid == 0 ){
+			#run the updater
+			cacheUpdate($action,1);
+		}else{
+			cacheWaitForUpdate($action);
+		}
+	}else{
+		#if cache is out dated, update
+		#else displayCache();
+		open(cacheFile, '<', "$fullhashpath");
+		stat(cacheFile);
+		close(cacheFile);
+		my $stat_time = (stat(_))[9];
+		my $stat_size = (stat(_))[7];
+
+		$cacheTime = get_loadavg() * 60;
+		if( $cacheTime > $maxCacheTime ){
+			$cacheTime = $maxCacheTime;
+		}
+		if( $cacheTime < $minCacheTime ){
+			$cacheTime = $minCacheTime;
+		}
+		if( $stat_time < (time - $cacheTime) || $stat_size == 0 ){
+			if( ! $cacheDoFork || ! defined(my $childPid = fork()) ){
+				cacheUpdate($action,0);
+				cacheDisplay($action);
+			} elsif ( $childPid == 0 ){
+				#run the updater
+				#print "Running updater\n";
+				cacheUpdate($action,1);
+			}else{
+				#print "Waiting for update\n";
+				cacheWaitForUpdate($action);
+			}
+		} else {
+			cacheDisplay($action);
+		}
+
+
+	}
+
+	#
+	# If all of the caching failes - lets go ahead and press on without it and fall back to 'default'
+	# non-caching behavior.  This is the softest of the failure conditions.
+	#
+	#$actions{$action}->();
+}
+
+sub cacheUpdate {
+	my ($action,$areForked) = @_;
+	my $lockingStatus;
+	my $fileData = "";
+
+	if($backgroundCache){
+		open(cacheFileBG, '>:utf8', "$fullhashpath.bg");
+		my $lockStatBG = flock(cacheFileBG,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStatBG;
+	}else{
+		open(cacheFile, '>:utf8', $fullhashpath);
+		my $lockStat = flock(cacheFile,LOCK_EX|LOCK_NB);
+
+		$lockStatus = $lockStat;
+	}
+	#print "lock status: $lockStat\n";
+
+
+	if (! $lockStatus ){
+		if ( $areForked ){
+			exit(0);
+		}else{
+			return;
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinWT, '>>:utf8', "$fullhashbinpath");
+		my $lockStatBin = flock(cacheFileBinWT,LOCK_EX|LOCK_NB);
+	}
+
+	# Trap all output from the action
+	change_output();
+
+	$actions{$action}->();
+
+	# Reset the outputs as we should be fine now
+	reset_output();
+
+
+	if($backgroundCache){
+		open(cacheFile, '>:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_EX);
+
+		if (! $lockStat ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		my $openstat = open(cacheFileBinFINAL, '>:utf8', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBinFINAL,LOCK_EX);
+
+		if (! $lockStatBIN ){
+			if ( $areForked ){
+				exit(0);
+			}else{
+				return;
+			}
+		}
+	}
+
+	# Actually dump the output to the proper file handler
+	local $/ = undef;
+	$|++;
+	print cacheFile "$output";
+	$|--;
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		move("$fullhashbinpath", "$fullhashbinpathfinal") or die "Binary Cache file could not be updated: $!";
+
+		flock(cacheFileBinFINAL,LOCK_UN);
+		close(cacheFileBinFINAL);
+
+		flock(cacheFileBinWT,LOCK_UN);
+		close(cacheFileBinWT);
+	}
+
+	flock(cacheFile,LOCK_UN);
+	close(cacheFile);
+
+	if($backgroundCache){
+		flock(cacheFileBG,LOCK_UN);
+		close(cacheFileBG);
+	}
+
+	if ( $areForked ){
+		exit(0);
+	} else {
+		return;
+	}
+}
+
+
+sub cacheWaitForUpdate {
+	my ($action) = @_;
+	my $x = 0;
+	my $max = 10;
+	my $lockStat = 0;
+
+	if( $backgroundCache ){
+		if( -e "$fullhashpath" ){
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			stat(cacheFile);
+			close(cacheFile);
+
+			if( $lockStat && ( (stat(_))[9] > (time - $maxCacheLife) ) ){
+				cacheDisplay($action);
+				return;
+			}
+		}
+	}
+
+	if(
+		$action eq "atom"
+		||
+		$action eq "rss"
+		||
+		$action eq "opml"
+	){
+		do {
+			sleep 2 if $x > 0;
+			open(cacheFile, '<:utf8', "$fullhashpath");
+			$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+			close(cacheFile);
+			$x++;
+			$combinedLockStat = $lockStat;
+		} while ((! $combinedLockStat) && ($x < $max));
+
+		if( $x != $max ){
+			cacheDisplay($action);
+		}
+		return;
+	}
+
+	$| = 1;
+
+	print $::cgi->header(
+				-type=>'text/html',
+				-charset => 'utf-8',
+				-status=> 200,
+				-expires => 'now',
+				# HTTP/1.0
+				-Pragma => 'no-cache',
+				# HTTP/1.1
+				-Cache_Control => join(
+							', ',
+							qw(
+								private
+								no-cache
+								no-store
+								must-revalidate
+								max-age=0
+								pre-check=0
+								post-check=0
+							)
+						)
+				);
+
+	print <<EOF;
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www/w3.porg/TR/html4/strict.dtd">
+<!-- git web w/caching interface version $version, (C) 2006-2010, John 'Warthog9' Hawley <warthog9\@kernel.org> -->
+<!-- git core binaries version $git_version -->
+<head>
+<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
+<meta name="generator" content="gitweb/$version git/$git_version"/>
+<meta name="robots" content="index, nofollow"/>
+<meta http-equiv="refresh" content="0"/>
+<title>$title</title>
+</head>
+<body>
+EOF
+
+	print "Generating..";
+	do {
+		print ".";
+		sleep 2 if $x > 0;
+		open(cacheFile, '<:utf8', "$fullhashpath");
+		$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+		close(cacheFile);
+		$x++;
+		$combinedLockStat = $lockStat;
+	} while ((! $combinedLockStat) && ($x < $max));
+	print <<EOF;
+</body>
+</html>
+EOF
+	return;
+}
+
+sub cacheDisplay {
+	local $/ = undef;
+	$|++;
+
+	my ($action) = @_;
+	open(cacheFile, '<:utf8', "$fullhashpath");
+	$lockStat = flock(cacheFile,LOCK_SH|LOCK_NB);
+
+	if (! $lockStat ){
+		close(cacheFile);
+		cacheWaitForUpdate($action);
+	}
+
+	if(
+		(
+			$action eq "snapshot"
+			||
+			$action eq "blob_plain"
+		)
+	){
+		my $openstat = open(cacheFileBin, '<', "$fullhashbinpathfinal");
+		$lockStatBIN = flock(cacheFileBin,LOCK_SH|LOCK_NB);
+		if (! $lockStatBIN ){
+			system ("echo 'cacheDisplay - bailing due to binary lock failure' >> /tmp/gitweb.log");
+			close(cacheFile);
+			close(cacheFileBin);
+			cacheWaitForUpdate($action);
+		}
+
+		my $binfilesize = -s "$fullhashbinpathfinal";
+		print "Content-Length: $binfilesize";
+	}
+	while( <cacheFile> ){
+		print $_;
+	}
+	if(
+		$action eq "snapshot"
+		||
+		$action eq "blob_plain"
+	){
+		binmode STDOUT, ':raw';
+		print <cacheFileBin>;
+		binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
+		close(cacheFileBin);
+	}
+	close(cacheFile);
+	$|--;
+}
+
+1;
+__END__
diff --git a/gitweb/static/gitweb.css b/gitweb/static/gitweb.css
index 4132aab..972d32e 100644
--- a/gitweb/static/gitweb.css
+++ b/gitweb/static/gitweb.css
@@ -67,6 +67,12 @@ div.page_path {
 	border-width: 0px 0px 1px;
 }
 
+div.cachetime {
+	float: left;
+	margin-right: 10px;
+	color: #555555;
+}
+
 div.page_footer {
 	height: 17px;
 	padding: 4px 8px;
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org)
  2010-11-29 23:07               ` [PATCHv7.1 3/4] " demerphq
@ 2010-11-29 23:26                 ` demerphq
  2010-11-29 23:54                   ` Jakub Narebski
  0 siblings, 1 reply; 43+ messages in thread
From: demerphq @ 2010-11-29 23:26 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis,
	admin

On 30 November 2010 00:07, demerphq <demerphq@gmail.com> wrote:
> 2010/11/13 Jakub Narebski <jnareb@gmail.com>:
>> -       binmode STDOUT, ':raw';
>> -       print <$fd>;
>> -       binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
>> +       if ($caching_enabled) {
>> +               open BINOUT, '>', $fullhashbinpath or die_error(500, "Could not open bin dump file");
>> +       }else{
>> +               open BINOUT, '>', \$fullhashbinpath or die_error(500, "Could not open bin dump file");
>> +       }
>> +       binmode BINOUT, ':raw';
>> +       print BINOUT <$fd>;
>> +       binmode BINOUT, ':utf8'; # as set at the beginning of gitweb.cgi
>> +       close BINOUT;
>
> Why do you use dynamically scoped file handles here as opposed to
> lexically scoped ones?
>
> And why do you change the output discipline on BINOUT immediately
> before closing the file, and after you print data to it?
>
> Doing so sortof makes sense when the filedhandle is STDOUT, but not
> when it is BINOUT.

Also in this code:

2010/11/28 Jakub Narebski <jnareb@gmail.com>:
> +#
> +# Includes
> +#
> +if (!exists $INC{'cache.pl'}) {
> +       my $return = do 'cache.pl';
> +       die $@ if $@;
> +       die "Couldn't read 'cache.pl': $!" if (!defined $return);
> +}

Why is that preferred to:

   require 'cache.pl';

And why is this thing even a .pl file? Why isnt it called
lib/GitWeb/Cache.pm or something like that?


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org)
  2010-11-29 23:26                 ` demerphq
@ 2010-11-29 23:54                   ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-29 23:54 UTC (permalink / raw)
  To: demerphq
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Junio C Hamano, Petr Baudis,
	admin

On Tue, 30 Nov 2010, demerphq wrote:
> On 30 November 2010 00:07, demerphq <demerphq@gmail.com> wrote:
> > 2010/11/13 Jakub Narebski <jnareb@gmail.com>:

> 
> Also in this code:
> 
> 2010/11/28 Jakub Narebski <jnareb@gmail.com>:
> > +#
> > +# Includes
> > +#
> > +if (!exists $INC{'cache.pl'}) {
> > +       my $return = do 'cache.pl';
> > +       die $@ if $@;
> > +       die "Couldn't read 'cache.pl': $!" if (!defined $return);
> > +}
> 
> Why is that preferred to:
> 
>    require 'cache.pl';
> 
> And why is this thing even a .pl file? Why isnt it called
> lib/GitWeb/Cache.pm or something like that?

Because it is not a Perl module; in particular 'cache.pl' uses global 
variables from gitweb.perl (like $my_url, or $cachedir, or %action)
and subroutines from gitweb.perl (like change_output() and 
reset_output()).  That is why it needs to be injected via do, rather
than included in its owne namespace with package/require.


P.S. This is not my code, this is patch by J.H. (John Hawley); I did
only *minimal* fixups.

P.P.S. My rewrite can be found in 'gitweb/cache-kernel-pu' branch in
my repository (links are to web interface)
  http://repo.or.cz/w/git/jnareb-git.git
  https://github.com/jnareb/git
Sent in http://thread.gmane.org/gmane.comp.version-control.git/158313

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4 (amend v2)] gitweb: File based caching layer (from git.kernel.org)
  2010-11-29 23:09                       ` [PATCHv7.1 3/4 (amend v2)] " Jakub Narebski
@ 2010-11-30  0:51                         ` Junio C Hamano
  2010-11-30 10:21                           ` Jakub Narebski
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-11-30  0:51 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Junio C Hamano, git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Jakub Narebski <jnareb@gmail.com> writes:

> On Mon, 29 Nov 2010, Junio C Hamano wrote:
>
>> Wait a bit.
>> 
>> This seems to match what I have already queued on 'pu', no?  Am I
>> hallucinating?
>  
> Damn, it looks like my mailer included wrong (older) version of a file.
> I'm sorry.  Below there is interdiff:

Thanks.  What I had was what I pulled from you.  I take it that you want
me to rebase the two tip commits on the branch?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.1 3/4 (amend v2)] gitweb: File based caching layer (from git.kernel.org)
  2010-11-30  0:51                         ` Junio C Hamano
@ 2010-11-30 10:21                           ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-11-30 10:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

On Tue, 30 Nov 2010, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>> On Mon, 29 Nov 2010, Junio C Hamano wrote:
>>
>>> Wait a bit.
>>> 
>>> This seems to match what I have already queued on 'pu', no?  Am I
>>> hallucinating?
>>  
>> Damn, it looks like my mailer included wrong (older) version of a file.
>> I'm sorry.  Below there is interdiff:
> 
> Thanks.  What I had was what I pulled from you.  I take it that you want
> me to rebase the two tip commits on the branch?

Yes, please.

Though it still doesn't pass new tests in t9501 and t9502; the problem
is more with setting up tests rather than with caching code, though.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb
  2010-11-12 23:41             ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
  2010-11-17 23:10               ` Junio C Hamano
@ 2010-12-02 10:17               ` Jakub Narebski
  2010-12-02 17:37                 ` Junio C Hamano
  1 sibling, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-12-02 10:17 UTC (permalink / raw)
  To: git
  Cc: John 'Warthog9' Hawley, John 'Warthog9' Hawley,
	Junio C Hamano, Petr Baudis, admin

Prepare gitweb for having been split into modules that are to be
installed alongside gitweb in 'lib/' subdirectory, by adding

  use lib __DIR__.'/lib';

to gitweb.perl (to main gitweb script), and preparing for putting
modules (relative path) in $(GITWEB_MODULES) in gitweb/Makefile.

This preparatory work allows to add new module to gitweb by simply
adding

  GITWEB_MODULES += <module>

to gitweb/Makefile (assuming that the module is in 'gitweb/lib/'
directory).

While at it pass GITWEBLIBDIR in addition to GITWEB_TEST_INSTALLED to
allow testing installed version of gitweb and installed version of
modules (for future tests which would check individual (sub)modules).


Using __DIR__ from Dir::Self module (not in core, that's why currently
gitweb includes excerpt of code from Dir::Self defining __DIR__) was
chosen over using FindBin-based solution (in core since perl 5.00307,
while gitweb itself requires at least perl 5.8.0) because FindBin uses
BEGIN block, which is a problem under mod_perl and other persistent
Perl environments (thought there are workarounds).

At Pavan Kumar Sankara suggestion gitweb/Makefile uses

  install [OPTION]... SOURCE... DIRECTORY

format (2nd format) with single SOURCE rather than

  install [OPTION]... SOURCE DEST

format (1st format) because of security reasons (race conditions).
Modern GNU install has `-T' / `--no-target-directory' option, but we
cannot rely that the $(INSTALL) we are using supports this option.

The install-modules target in gitweb/Makefile uses shell 'for' loop,
instead of make's $(foreach) function, to avoid possible problem with
generating a command line that exceeded the maximum argument list
length.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
BUGFIX: shell variables should not be in single quotes

I am very sorry about this bug...

 gitweb/Makefile    |   18 +++++++++++++++---
 gitweb/gitweb.perl |    8 ++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/gitweb/Makefile b/gitweb/Makefile
index 0a6ac00..ad5a282 100644
--- a/gitweb/Makefile
+++ b/gitweb/Makefile
@@ -57,6 +57,7 @@ PERL_PATH  ?= /usr/bin/perl
 bindir_SQ = $(subst ','\'',$(bindir))#'
 gitwebdir_SQ = $(subst ','\'',$(gitwebdir))#'
 gitwebstaticdir_SQ = $(subst ','\'',$(gitwebdir)/static)#'
+gitweblibdir_SQ = $(subst ','\'',$(gitwebdir)/lib)#'
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))#'
 PERL_PATH_SQ  = $(subst ','\'',$(PERL_PATH))#'
 DESTDIR_SQ    = $(subst ','\'',$(DESTDIR))#'
@@ -153,20 +154,31 @@ test:
 
 test-installed:
 	GITWEB_TEST_INSTALLED='$(DESTDIR_SQ)$(gitwebdir_SQ)' \
+	GITWEBLIBDIR='$(DESTDIR_SQ)$(gitweblibdir_SQ)' \
 		$(MAKE) -C ../t gitweb-test
 
 ### Installation rules
 
-install: all
+install: all install-modules
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -m 755 $(GITWEB_PROGRAMS) '$(DESTDIR_SQ)$(gitwebdir_SQ)'
 	$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 	$(INSTALL) -m 644 $(GITWEB_FILES) '$(DESTDIR_SQ)$(gitwebstaticdir_SQ)'
 
+install-modules:
+	install_dirs="$(sort $(dir $(GITWEB_MODULES)))" && \
+	for dir in $$install_dirs; do \
+		test -d '$(DESTDIR_SQ)$(gitweblibdir_SQ)'/"$$dir" || \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(gitweblibdir_SQ)'/"$$dir"; \
+	done
+	gitweb_modules="$(GITWEB_MODULES)" && \
+	for mod in $$gitweb_modules; do \
+		$(INSTALL) -m 644 "lib/$$mod" '$(DESTDIR_SQ)$(gitweblibdir_SQ)'/"$$(dirname $$mod)"; \
+	done
+
 ### Cleaning rules
 
 clean:
 	$(RM) gitweb.cgi static/gitweb.min.js static/gitweb.min.css GITWEB-BUILD-OPTIONS
 
-.PHONY: all clean install test test-installed .FORCE-GIT-VERSION-FILE FORCE
-
+.PHONY: all clean install install-modules test test-installed .FORCE-GIT-VERSION-FILE FORCE
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 679f2da..cfa511c 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -10,6 +10,14 @@
 use 5.008;
 use strict;
 use warnings;
+
+use File::Spec;
+# __DIR__ is taken from Dir::Self __DIR__ fragment
+sub __DIR__ () {
+	File::Spec->rel2abs(join '', (File::Spec->splitpath(__FILE__))[0, 1]);
+}
+use lib __DIR__ . '/lib';
+
 use CGI qw(:standard :escapeHTML -nosticky);
 use CGI::Util qw(unescape);
 use CGI::Carp qw(fatalsToBrowser set_message);
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb
  2010-12-02 10:17               ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski
@ 2010-12-02 17:37                 ` Junio C Hamano
  2010-12-02 19:01                   ` Jakub Narebski
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-12-02 17:37 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Hmm, how did you find the issue, and more importantly, how did I or other
people who saw this patch so easily fail to notice it?

FWIW, I do run "make install" from the toplevel as part of my pre-push
test, so I _should_ have noticed it.

Ah, I don't run the install step for a revision that does not pass its
selftest, so I haven't run "make install" on 'pu' for some time.  That may
explain it.

Anyway thanks for a fixup.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb
  2010-12-02 17:37                 ` Junio C Hamano
@ 2010-12-02 19:01                   ` Jakub Narebski
  2010-12-02 19:21                     ` Junio C Hamano
  0 siblings, 1 reply; 43+ messages in thread
From: Jakub Narebski @ 2010-12-02 19:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

On Thu, 2 Dec 2010, Junio C Hamano wrote:

> Hmm, how did you find the issue, and more importantly, how did I or other
> people who saw this patch so easily fail to notice it?

When working on my total rewrite of J.H. gitweb caching series, available
as 'gitweb/cache-kernel-pu' branch in git://repo.or.cz/git/jnareb-git.git
and git://github.com/jnareb/git.git repositories I finally did a *clean*
reinstal (i.e. remove whole directory, then run "make install-gitweb").
This branch uses the same "gitweb: Prepare for splitting gitweb" patch
(cherry-picked back and forth).
 
> FWIW, I do run "make install" from the toplevel as part of my pre-push
> test, so I _should_ have noticed it.
> 
> Ah, I don't run the install step for a revision that does not pass its
> selftest, so I haven't run "make install" on 'pu' for some time.  That may
> explain it.

Hmmm... I thought that "make install" doesn't install gitweb, but it does
with "$(MAKE) -C gitweb install"... though I am not sure if "make all"
builds gitweb (runs "make gitweb").
 
> Anyway thanks for a fixup.
> 

Sorry for the bug.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb
  2010-12-02 19:01                   ` Jakub Narebski
@ 2010-12-02 19:21                     ` Junio C Hamano
  2010-12-02 19:36                       ` Jakub Narebski
  0 siblings, 1 reply; 43+ messages in thread
From: Junio C Hamano @ 2010-12-02 19:21 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Jakub Narebski <jnareb@gmail.com> writes:

>> Ah, I don't run the install step for a revision that does not pass its
>> selftest, so I haven't run "make install" on 'pu' for some time.  That may
>> explain it.
>
> Hmmm... I thought that "make install" doesn't install gitweb, but it does
> with "$(MAKE) -C gitweb install"... though I am not sure if "make all"
> builds gitweb (runs "make gitweb").

I think it does, and it should if it doesn't.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCHv7.3 1/4 (bugfix)] gitweb: Prepare for splitting gitweb
  2010-12-02 19:21                     ` Junio C Hamano
@ 2010-12-02 19:36                       ` Jakub Narebski
  0 siblings, 0 replies; 43+ messages in thread
From: Jakub Narebski @ 2010-12-02 19:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, John 'Warthog9' Hawley,
	John 'Warthog9' Hawley, Petr Baudis, admin

Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > > Ah, I don't run the install step for a revision that does not pass its
> > > selftest, so I haven't run "make install" on 'pu' for some time.  That may
> > > explain it.
> >
> > Hmmm... I thought that "make install" doesn't install gitweb, but it does
> > with "$(MAKE) -C gitweb install"... though I am not sure if "make all"
> > builds gitweb (runs "make gitweb").
> 
> I think it does, and it should if it doesn't.

Anyway "install" target in gitweb/Makefile runs "all" target in
gitweb/Makefile, so 
   "make install" -> "make -C gitweb install" -> "make -C gitweb all"
   -> gitweb/gitweb.cgi is generated

gitweb/gitweb.cgi is in $(OTHER_PROGRAMS), and we have

  all:: [...] $(OTHER_PROGRAMS) GIT-BUILD-OPTIONS


Hmmm... shouldn't it be 'gitweb', not 'gitweb/gitweb.cgi'?  Just wondering
(we have gitweb/gitweb.cgi target in main Makefile, which just proxies
to gitweb/Makefile).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2010-12-02 19:37 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-28  0:42 [PATCH 0/3] Gitweb caching v7 John 'Warthog9' Hawley
2010-10-28  0:42 ` [PATCH 1/3] gitweb: Add option to force version match John 'Warthog9' Hawley
2010-10-28  9:52   ` Jakub Narebski
2010-10-28 22:08   ` Junio C Hamano
2010-10-28  0:42 ` [PATCH 2/3] gitweb: add output buffering and associated functions John 'Warthog9' Hawley
2010-10-28  9:56   ` Jakub Narebski
2010-10-28  0:42 ` [PATCH 3/3] gitweb: File based caching layer (from git.kernel.org) John 'Warthog9' Hawley
2010-10-28 16:10   ` Jakub Narebski
2010-10-28 22:29 ` [PATCH 0/3] Gitweb caching v7 Junio C Hamano
2010-10-29 22:25 ` Junio C Hamano
2010-10-30  8:58   ` Jakub Narebski
2010-10-31  4:24     ` Junio C Hamano
2010-10-31  9:21       ` Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 0/4] Gitweb caching v7.1 Jakub Narebski
2010-11-12 23:35           ` [RFC/PATCHv7.2 0/4] Gitweb caching v7.2 Jakub Narebski
2010-11-12 23:41             ` [PATCHv7.2 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
2010-11-17 23:10               ` Junio C Hamano
2010-11-18 13:37                 ` Jakub Narebski
2010-12-02 10:17               ` [PATCHv7.3 1/4 (bugfix)] " Jakub Narebski
2010-12-02 17:37                 ` Junio C Hamano
2010-12-02 19:01                   ` Jakub Narebski
2010-12-02 19:21                     ` Junio C Hamano
2010-12-02 19:36                       ` Jakub Narebski
2010-11-12 23:44             ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
2010-11-12 23:56             ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
2010-11-28 11:22               ` [PATCHv7.1 3/4 (amend)] " Jakub Narebski
2010-11-28 11:31                 ` Jakub Narebski
2010-11-29 22:13                   ` Junio C Hamano
2010-11-29 22:20                     ` Junio C Hamano
2010-11-29 23:09                       ` [PATCHv7.1 3/4 (amend v2)] " Jakub Narebski
2010-11-30  0:51                         ` Junio C Hamano
2010-11-30 10:21                           ` Jakub Narebski
2010-11-29 23:07               ` [PATCHv7.1 3/4] " demerphq
2010-11-29 23:26                 ` demerphq
2010-11-29 23:54                   ` Jakub Narebski
2010-11-13  0:01             ` [PATCHv7.2 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski
2010-11-17 22:37               ` Junio C Hamano
2010-11-17 23:12                 ` Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 1/4] gitweb: Prepare for splitting gitweb Jakub Narebski
2010-11-01 18:50           ` [PATCHv7.1b " Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 2/4] gitweb: add output buffering and associated functions Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 3/4] gitweb: File based caching layer (from git.kernel.org) Jakub Narebski
2010-11-01 10:24         ` [PATCHv7.1 4/4] gitweb: Minimal testing of gitweb caching Jakub Narebski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.