* [PATCH 0/2] 2GB fixes for Windows
From: Johannes Schindelin @ 2009-03-05 16:05 UTC
  To: git, gitster; +Cc: Johannes Sixt

On Windows, we can actually have files larger than 2 gigabytes, just not 
with off_t; we have to use off64_t instead.

This came up as issue 194 on the msysGit tracker. Since I first brushed 
people off by saying that it was not an msysGit issue, I started working 
on this myself by way of apology.

The first patch adds a test for cloning repositories larger than 2 
gigabytes. It is disabled by default, since it is quite expensive (in both 
time and space) and since it must fail when the underlying filesystem 
does not allow files larger than 2 gigabytes.

The second patch convinces msysGit (AKA the MinGW port of Git) to make use 
of the 64-bit file offsets.

Johannes Schindelin (2):
  Add an (optional, since expensive) test for >2g clones
  MinGW: 64-bit file offsets

 compat/mingw.c       |    8 +++++---
 compat/mingw.h       |    5 ++++-
 t/t5705-clone-2gb.sh |   45 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 4 deletions(-)
 create mode 100755 t/t5705-clone-2gb.sh


* [PATCH 1/2] Add an (optional, since expensive) test for >2g clones
From: Johannes Schindelin @ 2009-03-05 16:05 UTC
  To: git, gitster; +Cc: Johannes Sixt

Set GIT_TEST_CLONE_2GB=t to run the test; it is skipped by default.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5705-clone-2gb.sh |   45 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 45 insertions(+), 0 deletions(-)
 create mode 100755 t/t5705-clone-2gb.sh

diff --git a/t/t5705-clone-2gb.sh b/t/t5705-clone-2gb.sh
new file mode 100755
index 0000000..dd4a8ec
--- /dev/null
+++ b/t/t5705-clone-2gb.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+test_description='Test cloning a repository larger than 2 gigabyte'
+. ./test-lib.sh
+
+test -z "$GIT_TEST_CLONE_2GB" &&
+say "Skipping expensive 2GB clone test; enable it with GIT_TEST_CLONE_2GB=t" &&
+test_done &&
+exit
+
+test_expect_success 'setup' '
+
+	git config pack.compression 0 &&
+	git config pack.depth 0 &&
+	blobsize=$((20*1024*1024))
+	blobcount=$((2*1024*1024*1024/$blobsize+1))
+	i=1
+	(while test $i -le $blobcount
+	 do
+		printf "Generating blob $i/$blobcount\r" >&2 &&
+		printf "blob\nmark :$i\ndata $blobsize\n" &&
+		#test-genrandom $i $blobsize &&
+		printf "%-${blobsize}s" $i &&
+		echo "M 100644 :$i $i" >> commit
+		i=$(($i+1)) ||
+		echo $? > exit-status
+	 done &&
+	 echo "commit refs/heads/master" &&
+	 echo "author A U Thor <author@email.com> 123456789 +0000" &&
+	 echo "committer C O Mitter <committer@email.com> 123456789 +0000" &&
+	 echo "data 5" &&
+	 echo ">2gb" &&
+	 cat commit) |
+	git fast-import &&
+	test ! -f exit-status
+
+'
+
+test_expect_success 'clone' '
+
+	git clone --bare --no-hardlinks . clone
+
+'
+
+test_done
-- 
1.6.2.rc1.493.g27ad8


* [PATCH 2/2] MinGW: 64-bit file offsets
From: Johannes Schindelin @ 2009-03-05 16:05 UTC
  To: git, gitster; +Cc: Johannes Sixt, Sickboy

The type 'off_t' should be used everywhere so that the bit-depth of that
type can be adjusted in the standard C library, and you just need to
recompile your program to benefit from the extended precision.

However, the MS runtime library does not do it that way.

This patch reroutes off_t to off64_t and provides the other necessary
changes so that clones larger than 2 gigabytes finally work on Windows
(provided you are on a file system that allows files larger than 2 GB).

Initial patch by Sickboy <sb@dev-heaven.net>.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/mingw.c |    8 +++++---
 compat/mingw.h |    5 ++++-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 3dbe6a7..27bcf3f 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -46,7 +46,8 @@ static int do_lstat(const char *file_name, struct stat *buf)
 		buf->st_uid = 0;
 		buf->st_nlink = 1;
 		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes);
-		buf->st_size = fdata.nFileSizeLow; /* Can't use nFileSizeHigh, since it's not a stat64 */
+		buf->st_size = fdata.nFileSizeLow |
+			(((off_t)fdata.nFileSizeHigh)<<32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
 		buf->st_atime = filetime_to_time_t(&(fdata.ftLastAccessTime));
 		buf->st_mtime = filetime_to_time_t(&(fdata.ftLastWriteTime));
@@ -101,7 +102,7 @@ int mingw_fstat(int fd, struct stat *buf)
 	}
 	/* direct non-file handles to MS's fstat() */
 	if (GetFileType(fh) != FILE_TYPE_DISK)
-		return fstat(fd, buf);
+		return _fstati64(fd, buf);
 
 	if (GetFileInformationByHandle(fh, &fdata)) {
 		buf->st_ino = 0;
@@ -109,7 +110,8 @@ int mingw_fstat(int fd, struct stat *buf)
 		buf->st_uid = 0;
 		buf->st_nlink = 1;
 		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes);
-		buf->st_size = fdata.nFileSizeLow; /* Can't use nFileSizeHigh, since it's not a stat64 */
+		buf->st_size = fdata.nFileSizeLow |
+			(((off_t)fdata.nFileSizeHigh)<<32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
 		buf->st_atime = filetime_to_time_t(&(fdata.ftLastAccessTime));
 		buf->st_mtime = filetime_to_time_t(&(fdata.ftLastWriteTime));
diff --git a/compat/mingw.h b/compat/mingw.h
index a255898..cb9c4d4 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -163,11 +163,14 @@ int mingw_rename(const char*, const char*);
 /* Use mingw_lstat() instead of lstat()/stat() and
  * mingw_fstat() instead of fstat() on Windows.
  */
+#define off_t off64_t
+#define stat _stati64
+#define lseek _lseeki64
 int mingw_lstat(const char *file_name, struct stat *buf);
 int mingw_fstat(int fd, struct stat *buf);
 #define fstat mingw_fstat
 #define lstat mingw_lstat
-#define stat(x,y) mingw_lstat(x,y)
+#define stat64(x,y) mingw_lstat(x,y)
 
 int mingw_utime(const char *file_name, const struct utimbuf *times);
 #define utime mingw_utime
-- 
1.6.2.rc1.493.g27ad8
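
Editorial aside: the size computation in the patch above can be illustrated
with plain shell arithmetic. This is only a minimal sketch, not code from
the patch. The helper names combine_size and low_only are hypothetical, and
the sketch assumes a shell whose $(( )) arithmetic is at least 64 bits wide
(as in bash or dash on common 64-bit platforms). It shows that keeping only
the low 32 bits, as the old "st_size = fdata.nFileSizeLow" assignment did,
misreports a 5 GiB file as 1 GiB, while OR-ing in the high half shifted
left by 32 recovers the full size.

```shell
#!/bin/sh
# Hypothetical helpers mirroring the arithmetic in the patch; not part of Git.

# Combine the two 32-bit halves ($1 = high, $2 = low), as the patched code
# does with nFileSizeLow | ((off_t)nFileSizeHigh << 32).
combine_size() {
	echo $(( ($1 << 32) | $2 ))
}

# What the old code effectively reported: only the low 32 bits.
# 4294967295 is 2^32 - 1, i.e. a mask of 32 one-bits.
low_only() {
	echo $(( $1 & 4294967295 ))
}

size=$((5 * 1024 * 1024 * 1024))   # 5 GiB = 5368709120 bytes
high=$(( size >> 32 ))             # 1
low=$(( size & 4294967295 ))       # 1073741824

echo "low half only: $(low_only "$size")"
echo "both halves:   $(combine_size "$high" "$low")"
```

The first line printed is 1073741824 (1 GiB) and the second is 5368709120,
which is why the cast to the 64-bit off_t has to happen before the shift.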


* Re: [PATCH 1/2] Add an (optional, since expensive) test for >2g clones
From: Jeff King @ 2009-03-05 16:41 UTC
  To: Johannes Schindelin; +Cc: git, gitster, Johannes Sixt

On Thu, Mar 05, 2009 at 05:05:09PM +0100, Johannes Schindelin wrote:

> +	(while test $i -le $blobcount
> +	 do
> +		printf "Generating blob $i/$blobcount\r" >&2 &&
> +		printf "blob\nmark :$i\ndata $blobsize\n" &&
> +		#test-genrandom $i $blobsize &&
> +		printf "%-${blobsize}s" $i &&

Leftover cruft using genrandom? (I'm guessing you tried random at first
to avoid compression, but I think your pack.compression=0 technique is
more sensible).

-Peff


* Re: [PATCH 1/2] Add an (optional, since expensive) test for >2g clones
From: Johannes Schindelin @ 2009-03-05 16:58 UTC
  To: Jeff King; +Cc: git, gitster, Johannes Sixt

Hi,

On Thu, 5 Mar 2009, Jeff King wrote:

> On Thu, Mar 05, 2009 at 05:05:09PM +0100, Johannes Schindelin wrote:
> 
> > +	(while test $i -le $blobcount
> > +	 do
> > +		printf "Generating blob $i/$blobcount\r" >&2 &&
> > +		printf "blob\nmark :$i\ndata $blobsize\n" &&
> > +		#test-genrandom $i $blobsize &&
> > +		printf "%-${blobsize}s" $i &&
> 
> Leftover cruft using genrandom? (I'm guessing you tried random at first
> to avoid compression, but I think your pack.compression=0 technique is
> more sensible).

Actually, I left it in on purpose.  Yes, it happens to work right now, as 
the packs are built with 0 compression and with 0 deltification.

However, there might be a day when we cannot guarantee anymore that a 
single number padded with spaces to a certain width really makes the pack 
grow by that many bytes.  Then we would need something like test-genrandom 
(which is substantially slower, though).

Interesting side note: it took me quite a while to figure out that both 
pack.compression and pack.depth need to be set to 0.  In hindsight, it is 
obvious, but that does not make up for the time I lost to Windows' 
slowness.

Ciao,
Dscho
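
Editorial aside: the padding trick discussed here can be tried at a small
scale. The following is a minimal sketch, assuming a POSIX printf and wc;
the helper names padded_blob and blob_bytes and the 16-byte width are made
up for illustration (the test itself uses 20 MiB blobs). It shows that the
output size is fixed by the field width rather than by the number, while
each number still yields different content and therefore a distinct object.

```shell
#!/bin/sh
# Hypothetical helper: emit the decimal number $1 left-justified and padded
# with spaces to exactly $2 bytes, like printf "%-${blobsize}s" $i in the
# test script.
padded_blob() {
	printf "%-${2}s" "$1"
}

# Count the bytes produced; $(( )) strips any whitespace wc may emit.
blob_bytes() {
	echo $(( $(padded_blob "$1" "$2" | wc -c) ))
}

blob_bytes 7 16
blob_bytes 103 16
```

Both calls print 16. With pack.compression and pack.depth set to 0, as
described above, that fixed size is what makes the pack grow predictably.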


* Re: [PATCH 1/2] Add an (optional, since expensive) test for >2g clones
From: Jeff King @ 2009-03-05 18:50 UTC
  To: Johannes Schindelin; +Cc: git, gitster, Johannes Sixt

On Thu, Mar 05, 2009 at 05:58:23PM +0100, Johannes Schindelin wrote:

> > Leftover cruft using genrandom? (I'm guessing you tried random at first
> > to avoid compression, but I think your pack.compression=0 technique is
> > more sensible).
> 
> Actually, I left it in on purpose.  Yes, it happens to work right now, as 
> the packs are built with 0 compression and with 0 deltification.

Fair enough. A comment in the commit message or in the code might have
made that more clear, though...

-Peff


* Re: [PATCH 2/2] MinGW: 64-bit file offsets
From: Johannes Sixt @ 2009-03-05 20:53 UTC
  To: Johannes Schindelin; +Cc: git, gitster, Sickboy

On Thursday, 5 March 2009, Johannes Schindelin wrote:
> The type 'off_t' should be used everywhere so that the bit-depth of that
> type can be adjusted in the standard C library, and you just need to
> recompile your program to benefit from the extended precision.
>
> However, the MS runtime library does not do it that way.
>
> This patch reroutes off_t to off64_t and provides the other necessary
> changes so that clones larger than 2 gigabytes finally work on Windows
> (provided you are on a file system that allows files larger than 2 GB).
>
> Initial patch by Sickboy <sb@dev-heaven.net>.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

Acked-by: Johannes Sixt <j6t@kdbg.org>

-- Hannes


* Re: [PATCH 1/2] Add an (optional, since expensive) test for >2g clones
From: Junio C Hamano @ 2009-03-05 23:08 UTC
  To: Johannes Schindelin; +Cc: git, gitster, Johannes Sixt

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> +test_expect_success 'setup' '
> +
> +	git config pack.compression 0 &&
> +	git config pack.depth 0 &&
> +	blobsize=$((20*1024*1024))
> +	blobcount=$((2*1024*1024*1024/$blobsize+1))
> +	i=1

What happened to the && chain?
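
Editorial aside, for readers unfamiliar with the convention behind this
question: when commands in a test body are separated by newlines instead
of &&, the exit status of an earlier command is discarded and only the
last command decides the result. In the quoted lines the unchained
commands are simple assignments that are unlikely to fail, but the habit
of chaining everything keeps real failures from slipping through. A
minimal sketch with hypothetical stand-in functions (not part of Git's
test suite):

```shell
#!/bin/sh
# Hypothetical stand-ins for a sequence of test steps.

# Broken chain: the failing step is followed by a newline, not &&,
# so its exit status is discarded and the last command decides.
broken_chain() {
	true &&
	false
	true
}

# Intact chain: the first failure short-circuits the rest.
intact_chain() {
	true &&
	false &&
	true
}

broken_chain && echo "broken chain: failure went unnoticed"
intact_chain || echo "intact chain: failure reported"
```

Both messages are printed: the first function returns success despite the
failing step, the second returns failure.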


* [PATCH v2 1/2] Add an (optional, since expensive) test for >2gb clones
From: Johannes Schindelin @ 2009-03-06  9:48 UTC
  To: Junio C Hamano; +Cc: git, Jeff King, Johannes Sixt


Set GIT_TEST_CLONE_2GB=t to run the test; it is skipped by default.

The test works by constructing a repository larger than 2gb, and then
cloning it.

The repository is forced larger than 2gb by setting compression and
delta depth to zero, and then adding just enough unique objects of
a given size.

The objects consist of a running decimal number in ASCII, padded by
spaces.  Should that break in the future, e.g. when pack v4 becomes
default, there is a commented-out call to test-genrandom which can be
substituted, but that uses more cycles than the current method.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	Changes since v1: fixed && chain, better commit message.

 t/t5705-clone-2gb.sh |   45 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 45 insertions(+), 0 deletions(-)
 create mode 100755 t/t5705-clone-2gb.sh

diff --git a/t/t5705-clone-2gb.sh b/t/t5705-clone-2gb.sh
new file mode 100755
index 0000000..9f52154
--- /dev/null
+++ b/t/t5705-clone-2gb.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+test_description='Test cloning a repository larger than 2 gigabyte'
+. ./test-lib.sh
+
+test -z "$GIT_TEST_CLONE_2GB" &&
+say "Skipping expensive 2GB clone test; enable it with GIT_TEST_CLONE_2GB=t" &&
+test_done &&
+exit
+
+test_expect_success 'setup' '
+
+	git config pack.compression 0 &&
+	git config pack.depth 0 &&
+	blobsize=$((20*1024*1024)) &&
+	blobcount=$((2*1024*1024*1024/$blobsize+1)) &&
+	i=1 &&
+	(while test $i -le $blobcount
+	 do
+		printf "Generating blob $i/$blobcount\r" >&2 &&
+		printf "blob\nmark :$i\ndata $blobsize\n" &&
+		#test-genrandom $i $blobsize &&
+		printf "%-${blobsize}s" $i &&
+		echo "M 100644 :$i $i" >> commit &&
+		i=$(($i+1)) ||
+		echo $? > exit-status
+	 done &&
+	 echo "commit refs/heads/master" &&
+	 echo "author A U Thor <author@email.com> 123456789 +0000" &&
+	 echo "committer C O Mitter <committer@email.com> 123456789 +0000" &&
+	 echo "data 5" &&
+	 echo ">2gb" &&
+	 cat commit) |
+	git fast-import &&
+	test ! -f exit-status
+
+'
+
+test_expect_success 'clone' '
+
+	git clone --bare --no-hardlinks . clone
+
+'
+
+test_done
-- 
1.6.2.240.g23c7
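
Editorial aside: the sizing described in the commit message above can be
checked in isolation. This is a minimal sketch that reuses the variable
names blobsize and blobcount from the test script; total and limit are
names introduced here for illustration, and the sketch assumes 64-bit
shell arithmetic. It only accounts for the raw blob payload, not for any
per-object overhead in the pack.

```shell
#!/bin/sh
# Same arithmetic as the 'setup' test; assumes 64-bit shell arithmetic.
blobsize=$((20*1024*1024))                   # 20 MiB per blob
blobcount=$((2*1024*1024*1024/blobsize+1))   # 102 + 1 = 103 blobs
total=$((blobsize*blobcount))                # raw payload in bytes
limit=$((2*1024*1024*1024))                  # 2 GiB, one past the signed 32-bit maximum

echo "blobs: $blobcount, payload: $total bytes"
test "$total" -gt "$limit" && echo "payload exceeds 2 GiB"
```

The integer division rounds 102.4 down to 102, so the "+1" is what pushes
the payload (2160066560 bytes) past the 2 GiB mark.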


* [PATCH v2 2/2] MinGW: 64-bit file offsets
From: Johannes Schindelin @ 2009-03-06  9:49 UTC
  To: Junio C Hamano; +Cc: git, Jeff King, Johannes Sixt


The type 'off_t' should be used everywhere so that the bit-depth of that
type can be adjusted in the standard C library, and you just need to
recompile your program to benefit from the extended precision.

However, the MS runtime library does not do it that way.

This patch reroutes off_t to off64_t and provides the other necessary
changes so that clones larger than 2 gigabytes finally work on Windows
(provided you are on a file system that allows files larger than 2 GB).

Initial patch by Sickboy <sb@dev-heaven.net>.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
---

	Unchanged since v1 (so I took the liberty of adding Hannes' ACK).

 compat/mingw.c |    8 +++++---
 compat/mingw.h |    5 ++++-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 3dbe6a7..27bcf3f 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -46,7 +46,8 @@ static int do_lstat(const char *file_name, struct stat *buf)
 		buf->st_uid = 0;
 		buf->st_nlink = 1;
 		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes);
-		buf->st_size = fdata.nFileSizeLow; /* Can't use nFileSizeHigh, since it's not a stat64 */
+		buf->st_size = fdata.nFileSizeLow |
+			(((off_t)fdata.nFileSizeHigh)<<32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
 		buf->st_atime = filetime_to_time_t(&(fdata.ftLastAccessTime));
 		buf->st_mtime = filetime_to_time_t(&(fdata.ftLastWriteTime));
@@ -101,7 +102,7 @@ int mingw_fstat(int fd, struct stat *buf)
 	}
 	/* direct non-file handles to MS's fstat() */
 	if (GetFileType(fh) != FILE_TYPE_DISK)
-		return fstat(fd, buf);
+		return _fstati64(fd, buf);
 
 	if (GetFileInformationByHandle(fh, &fdata)) {
 		buf->st_ino = 0;
@@ -109,7 +110,8 @@ int mingw_fstat(int fd, struct stat *buf)
 		buf->st_uid = 0;
 		buf->st_nlink = 1;
 		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes);
-		buf->st_size = fdata.nFileSizeLow; /* Can't use nFileSizeHigh, since it's not a stat64 */
+		buf->st_size = fdata.nFileSizeLow |
+			(((off_t)fdata.nFileSizeHigh)<<32);
 		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
 		buf->st_atime = filetime_to_time_t(&(fdata.ftLastAccessTime));
 		buf->st_mtime = filetime_to_time_t(&(fdata.ftLastWriteTime));
diff --git a/compat/mingw.h b/compat/mingw.h
index a255898..cb9c4d4 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -163,11 +163,14 @@ int mingw_rename(const char*, const char*);
 /* Use mingw_lstat() instead of lstat()/stat() and
  * mingw_fstat() instead of fstat() on Windows.
  */
+#define off_t off64_t
+#define stat _stati64
+#define lseek _lseeki64
 int mingw_lstat(const char *file_name, struct stat *buf);
 int mingw_fstat(int fd, struct stat *buf);
 #define fstat mingw_fstat
 #define lstat mingw_lstat
-#define stat(x,y) mingw_lstat(x,y)
+#define stat64(x,y) mingw_lstat(x,y)
 
 int mingw_utime(const char *file_name, const struct utimbuf *times);
 #define utime mingw_utime
-- 
1.6.2.240.g23c7
