* Git as data archive
From: Andreas Kalz @ 2019-12-06 18:54 UTC
To: git

Hello,

I am using Git as an archive and for versioning, including photos. Apart from
performance issues, I wanted to ask whether there are hard limits and
configurable limits (and how to configure them) for the maximum single-file
size and the maximum size of the .git archive on a 64-bit Windows system.

Thanks in advance for your answer.

All the best,
Andreas
* Re: Git as data archive
From: Philip Oakley @ 2019-12-07 16:54 UTC
To: Andreas Kalz, git

Hi Andreas,

On 06/12/2019 18:54, Andreas Kalz wrote:
> Hello,
> I am using git as archive and versioning also for photos. Apart from
> performance issues, I wanted to ask if there are hard limits and
> configurable limits (how to configure?) for maximum single file size and
> maximum .git archive size (Windows 64 Bit system)?
> Thanks in advance for your answer.
> All the best,
> Andreas

In Git, the file size is currently limited to the size of `long` rather than
`size_t`. Hence on Git for Windows, where `long` is 32 bits wide, the size
limit is ~4GiB.

Any change will be a big change, as it ripples through many places in the
code base and, for some, will feel 'wrong'. I did some work [1-4], building
on that of many others, that was almost there, but...

The alternative is git-lfs, which I don't personally use (see [4]).

Philip

[1] https://github.com/git-for-windows/git/pull/2179
[2] https://github.com/gitgitgadget/git/pull/115
[3] https://github.com/git-for-windows/git/issues/1063
[4] https://github.com/git-lfs/git-lfs/issues/2434
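To make the `long` vs `size_t` point concrete, here is a minimal Python sketch of the arithmetic. It assumes the size is held in an unsigned integer of the width of C `long`; the helper names are hypothetical, and the wraparound shown is only what a 32-bit unsigned value does in general, not a statement about how Git itself behaves on an oversized file.

```python
import ctypes


def max_object_size(long_bits: int) -> int:
    """Largest value an unsigned integer of the given width can hold."""
    return (1 << long_bits) - 1


def truncated_size(size: int, long_bits: int) -> int:
    """Value that remains if `size` is stored in an unsigned integer
    of the given width (illustration only, hypothetical helper)."""
    return size & max_object_size(long_bits)


def platform_long_bits() -> int:
    """Width in bits of C `long` on the machine running this script."""
    return ctypes.sizeof(ctypes.c_long) * 8


if __name__ == "__main__":
    bits = platform_long_bits()
    print(f"C long on this platform: {bits} bits")
    print(f"largest size that fits:  {max_object_size(bits)} bytes")
    # 9.5 GB taken as 9.5 * 10**9 bytes (an assumption about the unit)
    print(f"9.5 GB in a 32-bit long: {truncated_size(9_500_000_000, 32)} bytes")
```

On 64-bit Windows the first line should report 32 bits, while 64-bit Linux and macOS report 64, which is why the limit in this thread is specific to Windows.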
* Re: Git as data archive
From: Thomas Braun @ 2019-12-07 18:04 UTC
To: Philip Oakley, Andreas Kalz, git

On 07.12.2019 17:54, Philip Oakley wrote:
> On Git the file size is currently limited to size of `long`, rather than
> `size_t`. Hence on Git-for Windows the size limit is 32bit ~4GiB
>
> Any change will be a big change as it ripples through many places in the
> code base and, for some, will feel 'wrong'. I did some work [1-4] on top
> of those of many others that was almost there, but...

Adding to what Philip said: on Windows, the size of exported archives
(git archive) is currently also limited to 4GB. The reason is again the
long vs size_t issue (which is not present on Linux, though).

So if you can switch to Linux or even Mac OS X, these issues are gone.

The files in .git (I guess only the packfiles are of interest here) do not
have the long vs size_t issue. So packfiles can be larger than 4GB on 64-bit
Windows (with 64-bit Git, of course).

But depending on how large the biggest files are, it might be worth
tweaking some of the settings so that the created packfiles are
readable on all platforms. I once created a repo on Linux which could
not be checked out on Windows, and that is a bit annoying.

So the questions are: how large is each file? And what repository size do
you expect? Are we talking about 20MB files and a 10GB repository? Or a
factor of 100 more? And are you just adding files, or are you modifying the
added files? Depending on the file sizes, it might then also be
beneficial to tweak the delta compression settings and/or the big file
threshold limits.

Thomas
* Re: Git as data archive
From: Andreas Kalz @ 2019-12-08 18:44 UTC
To: Thomas Braun
Cc: Philip Oakley, git

Hi,

thanks to you both.

@Thomas: are you the Thomas Braun who studied at FH Regensburg?

Well, currently the .git repository is 715GB and the maximum file size is
9.5GB, but I did not get any error messages because of that, even though the
performance is quite low. The biggest pack* file is 24GB. Some files are
modified, but most are not.

My question came up because I did not find any documentation about the limits
of Git, only a lot of entries about GitHub and forum users discussing old
Git bugs. I read about git-lfs and also that it does not work very stably,
which is why I have not used it yet.

How can the delta compression settings and/or the big file threshold limits
be modified?

Thanks in advance.

All the best,
Andreas
* Re: Git as data archive
From: Thomas Braun @ 2019-12-09 1:18 UTC
To: Andreas Kalz
Cc: Philip Oakley, git

On 08.12.2019 19:44, Andreas Kalz wrote:

Hi Andreas,

> @Thomas: are you Thomas Braun who studied at FH Regensburg?

Nope, sorry.

> Well, currently the .git repository is 715GB and the maximum file size
> is 9.5GB, but I did not get error messages due to that even if the
> performance is quite low. The biggest pack* file is 24GB. There are some
> files which are modified, but most are not modified.

Okay, that is kind of large. How did you add the 9.5GB file? AFAIK this
could not have been done on Windows.

Do you push that to a remote repository as well?

> My question came up as I did not find a documentation about limits of
> git, only a lot of entries about github and forum users who are
> discussing about old bugs of git. I read about git-lfs and also that it
> is not working very stable, due to that I did not use it yet.

Although I'm not using git-lfs myself, from what I know it works well.
But it does have the same limitation as stock Git for Windows, as Philip
pointed out already.

> How can the delta compression settings and/or the big filethreshold
> limits be modified?

These are plain git config settings. Have a look at [1]. The attributes
are explained in [2-3]. Basically, you can set in .gitattributes

    *.bin -delta -diff

which would tell Git that files with the suffix bin should not be delta
compressed and are always treated as binary.

You could also play around with turning compression off completely via
core.compression or pack.compression.

Hope that helps,
Thomas

PS: If you have the resources to help fix that long-standing bug in Git
for Windows, there is an open PR [4] which has a WIP version. But beware:
you need good C skills and better-than-average Git skills, or a
Santa-Claus-style bag of monetary resources.

[1]: https://git-scm.com/docs/git-config#Documentation/git-config.txt-corebigFileThreshold
[2]: https://git-scm.com/docs/gitattributes#_code_delta_code
[3]: https://git-scm.com/docs/gitattributes#_marking_files_as_binary
[4]: https://github.com/git-for-windows/git/pull/2179
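The settings Thomas mentions can be spelled out as follows. This is only a sketch: the Python helpers are hypothetical, and the suffixes and the values `100m` and `0` are assumptions chosen for illustration, not recommendations from the thread. The script just prints the `.gitattributes` lines and the `git config` invocations; it does not run Git.

```python
def gitattributes_lines(suffixes):
    """Build .gitattributes lines that disable delta compression and
    textual diffs for the given file suffixes (attributes are separated
    by whitespace, not commas)."""
    return [f"*.{suffix} -delta -diff" for suffix in suffixes]


def config_commands(big_file_threshold, pack_compression):
    """Build `git config` argument lists for the two settings named in
    the thread. The values passed in are the caller's assumptions."""
    return [
        ["git", "config", "core.bigFileThreshold", big_file_threshold],
        ["git", "config", "pack.compression", str(pack_compression)],
    ]


if __name__ == "__main__":
    # Example suffixes are assumptions about what a photo/video archive holds
    for line in gitattributes_lines(["mp4", "jpg"]):
        print(line)
    # 100m and 0 are illustrative values only
    for argv in config_commands("100m", 0):
        print(" ".join(argv))
```

Files larger than `core.bigFileThreshold` are stored without attempting delta compression, so lowering it has a similar effect to the `-delta` attribute but is keyed on size rather than on file name.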
* Re: Git as data archive
From: Andreas Kalz @ 2019-12-09 16:39 UTC
To: Thomas Braun
Cc: Philip Oakley, git

Hi Thomas,

I committed it only to a local repository (git add . / git commit -m"...").
But I never tested restoring the big files from the archive :( and then I
stumbled upon the bug description. Now I tried it out and something bad
happened:

    E:\bilder_git>git checkout -- Hochzeitsmesse.mp4
    error: bad object header
    fatal: packed object 5c1403a85829c1c9e03bf04ac814d65bb72b617f (stored in .git/objects/pack/pack-00246783dc8e6b7365220e75563b5cecfa358e11.pack) is corrupt

During add / commit there was no problem, but now this is not a good thing...

My C skills are not bad; I worked for about 10 years in embedded SW
development. But currently my time is limited, as I have a 3-week-old
baby :)

All the best,
Andreas
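Since add and commit gave no warning in this case, one way to see which files are at risk is to scan the working tree for anything at or above the ~4GiB limit discussed earlier in the thread. The following Python sketch does that; `find_oversized` and the default limit of 2**32 bytes are assumptions made for illustration, and the script only reports file sizes, it does not check or repair the packfiles.

```python
import os

FOUR_GIB = 1 << 32  # first size that no longer fits in a 32-bit unsigned value


def find_oversized(root, limit=FOUR_GIB):
    """Yield (path, size) for regular files under `root` whose size is
    at least `limit` bytes. The .git directory itself is skipped."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file, skip it
            if size >= limit:
                yield path, size


if __name__ == "__main__":
    for path, size in find_oversized("."):
        print(f"{size:>15} {path}")
```

Run from the top of the working tree, it lists the files that should be restored from another copy or test-checked before the repository is relied on as the only archive.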
End of thread: 2019-12-09 16:39 UTC

Thread overview: 6+ messages
2019-12-06 18:54 Git as data archive Andreas Kalz
2019-12-07 16:54 ` Philip Oakley
2019-12-07 18:04   ` Thomas Braun
2019-12-08 18:44     ` Andreas Kalz
2019-12-09  1:18       ` Thomas Braun
2019-12-09 16:39         ` Andreas Kalz