From: Neal Kreitzinger <nkreitzinger@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Bo Chen <chen@chenirvine.org>,
	Sergio <sergio.callegari@gmail.com>,
	git@vger.kernel.org
Subject: Re: GSoC - Some questions on the idea of
Date: Sat, 31 Mar 2012 10:19:54 -0500
Message-ID: <4F77209A.8050607@gmail.com>
In-Reply-To: <20120330203430.GB20376@sigill.intra.peff.net>

On 3/30/2012 3:34 PM, Jeff King wrote:
> On Fri, Mar 30, 2012 at 03:51:20PM -0400, Bo Chen wrote:
>
>> The sub-problems of "delta for large file" problem.
>>
>> 1 large file
>>
> Note that there are other problem areas with big files that can be
> worked on, too. For example, some people want to store 100 gigabytes
> in a repository.

I take it that you have in mind a 100G set of files comprised entirely
of big-files that cannot be logically separated into smaller submodules?

My understanding is that a main strategy for "big files" is to separate
your big-files logically into their own submodule(s) to keep them from
bogging down the not-big-file repo(s).

Is one of the goals of big-file-support to make submodule strategizing 
unconcerned about big-file groupings and only concerned about 
logical-file groupings?  Big-file groupings are not necessarily logical 
file groupings, but perhaps a technical file grouping subset of a 
logical file grouping that is necessitated by big-file performance 
considerations.  IOW, is the goal of big-file-support to make big-files 
"just work" so that users don't have to think about graphics files, 
binaries, etc, and just treat them like everything else?  Obviously, a 
100G database file will always be a 'big-file' for the foreseeable 
future, but a 0.5G graphics file is not a "big file" generally speaking 
(as opposed to git-speaking).

> Because git is distributed, that means 100G in the repo database,
> and 100G in the working directory, for a total of 200G.

I take it that you are implying that the 100G object-store size is due
to the notion that binary files cannot-be/are-not compressed well?
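
To make the compression question concrete, here is a minimal sketch,
assuming random bytes are a fair stand-in for already-compressed media
and repeated source lines are a fair stand-in for text. Git compresses
objects with zlib; the sample data and the compression_ratio helper are
illustrative assumptions, not anything from this thread.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Highly repetitive text, standing in for typical source code (assumption).
text_like = b"int main(void) { return 0; }\n" * 4096

# Random bytes, standing in for already-compressed or encrypted binary
# content (assumption); zlib cannot find redundancy in such data.
binary_like = os.urandom(len(text_like))

text_ratio = compression_ratio(text_like)
binary_ratio = compression_ratio(binary_like)

print(f"text-like ratio:   {text_ratio:.3f}")
print(f"binary-like ratio: {binary_ratio:.3f}")
```

Not every binary file behaves like the random sample: an uncompressed
database dump, for instance, may still compress well, so the sketch only
covers the worst case.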

> People in this situation may want to be able to store part of the
> repository database in a network-accessible location, trading some
> of the convenience of being fully distributed for the space savings.
> So another project could be designing a network-based alternate
> object storage system.
>
I take it you are implying a local area network with users' git repos on
workstations?

With regard to "network-based alternate objects" that are in fact on the
internet, they would need to first be cloned onto the local area network.
Or are you imagining this would work for internet "network-based
alternate objects"?

In some setups, users log in to a Linux server and keep all their repos
there.  The "alternate objects" do not need to be network-based in that
case.  They are "local", but local does not mean 20 people cloning the
alternate objects to their workstations.  It means one copy of the
alternate objects, and twenty repos referencing that one copy.
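
The space arithmetic behind this can be sketched as follows. This is a
rough model under stated assumptions: every repo has a full checkout,
each repo's own object store is negligible when it borrows from the
shared copy, and the footprint_gb helper and its parameters are
hypothetical names used only for illustration.

```python
def footprint_gb(object_store_gb: float, worktree_gb: float,
                 repos: int, shared_alternate: bool) -> float:
    """Estimate total disk use in GB for `repos` repositories.

    With shared_alternate=True the object store is counted once, because
    every repo references the same copy; otherwise each repo carries its
    own full copy. Working trees are always counted per repo.
    """
    if repos < 1:
        raise ValueError("repos must be at least 1")
    object_copies = 1 if shared_alternate else repos
    return object_store_gb * object_copies + worktree_gb * repos


# One user, no sharing: the 100G + 100G = 200G case quoted above.
single = footprint_gb(100, 100, repos=1, shared_alternate=False)

# Twenty repos on one server, each with its own object store.
separate = footprint_gb(100, 100, repos=20, shared_alternate=False)

# Twenty repos referencing one shared copy of the objects.
shared = footprint_gb(100, 100, repos=20, shared_alternate=True)

print(single, separate, shared)
```

In git itself, this kind of sharing is recorded in a repo's
objects/info/alternates file, which "git clone --shared" or
"git clone --reference" sets up.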

v/r,
neal
