* Using git for code deployment on webservers?
From: Ingo Oeser @ 2009-06-15 23:11 UTC
To: git; +Cc: Ingo Oeser

[please CC me, as I'm not subscribed]

Hi there,

I am trying to use git in a quite unusual way.

I have a bunch of servers (hundreds), which get regular pulls of web developer code.
The code consists of images, flash files, scripting language files, you name it.
An exported repo (just the files, no SCM metadata) contains up to 4GB of files.

Now I want to distribute the changes the developers made in a tree-like structure:

main server --> slave_1 --> webserver_0815
            |-> slave_2 --> webserver_2342
                        |-> webserver_4711

But with the following constraints:
- Store as little as possible on the webservers.
  One selected revision/tag is enough.
- Transfer as little data as possible.
  Cancel out addition and deletion on the fly.
- Nearly atomic update of file tree (easy to implement outside git)

Nice to have:
- Instead of copying the files to their proper names,
  hardlink them to their git objects.

At the moment I always get more data than I need and have to store
the repository AND the checked out data.

I couldn't find a way so far to get around this. Is this possible?
Any ideas are welcome.

Many Thanks in Advance!

Best Regards

Ingo Oeser

* Re: Using git for code deployment on webservers?
From: Allan Wind @ 2009-06-16 7:13 UTC
To: git; +Cc: ioe-git

On 2009-06-16T01:11:47, Ingo Oeser wrote:
> - Transfer as little data as possible.
>   Cancel out addition and deletion on the fly.

I use `git diff` with the post-receive hook to distribute changes
to my web server. The diff carries the previous content when you
delete a file; in my case these were large MPEG files, which
defeated the purpose somewhat.

If you do not mind having a full repository on the web servers,
then pushing changes might work better. This appears to be what
you are doing now though.

If I had to scale this I would probably build a master image
(either locally or remotely) and use rsync to distribute the
content instead of git.

> - Nearly atomic update of file tree (easy to implement outside git)

stow can be handy for this.

/Allan
--
Allan Wind
Life Integrity, LLC
http://lifeintegrity.com

* Re: Using git for code deployment on webservers?
From: Ingo Oeser @ 2009-06-17 17:42 UTC
To: git, ioe-git

Hi Allan,

On Tuesday 16 June 2009, Allan Wind wrote:
> If you do not mind having a full repository on the web servers,
> then pushing changes might work better. This appears to be what
> you are doing now though.

No, at the moment we have built our own version of a
content-addressable filesystem and are distributing changes to it.
We have symlinks to the real file names.

I just thought that git could do something similar with its core,
before trying to solve a solved problem :-)

> If I had to scale this I would probably build a master image
> (either locally or remotely) and use rsync to distribute the
> content instead of git.

We do something similar at the moment. De-duplication is important,
because web people copy lots of image and flash data around when
doing things.

> > - Nearly atomic update of file tree (easy to implement outside git)
>
> stow can be handy for this.

Ah! Will have a look. Many Thanks!

Best Regards

Ingo Oeser

* Re: Using git for code deployment on webservers?
From: Thomas Koch @ 2009-06-16 8:01 UTC
To: Ingo Oeser; +Cc: git

Would it help to share a read-only Git object store among all
webservers via NFS?

Best regards,

Thomas Koch, http://www.koch.ro

* Re: Using git for code deployment on webservers?
From: Ingo Oeser @ 2009-06-17 17:27 UTC
To: thomas; +Cc: git

Hi Thomas,

On Tuesday 16 June 2009, Thomas Koch wrote:
> Would it help, to share a read only GIT object store among all webservers via
> NFS?

NFS on hundreds of web servers has severe scaling problems.
That is by design and is solved by alternative file systems or soon pNFS.

We tried such a setup already.

Best Regards

Ingo Oeser

* Re: Using git for code deployment on webservers?
From: Daniel Barkalow @ 2009-06-16 17:49 UTC
To: Ingo Oeser; +Cc: git

On Tue, 16 Jun 2009, Ingo Oeser wrote:

> [...]
> At the moment I always get more data than I need and have to store
> the repository AND the checked out data.

You should be able to have the slave repositories store tags for tree
objects (instead of commit objects), and have the webservers fetch those.
You'll still have the object database, but it will only contain stuff
that's been deployed to that webserver, not intermediate versions or
historical versions.

You'll still have to store both the repo and the checked out data
(but git normally stores the content delta-compressed against each
other in one big file, so there really aren't files to hard link to).

Of course, the other possibility is to check out versions on the slaves,
and rsync that to the webservers, which is probably the optimal method if
you're not in a situation where you benefit from anything git does in
transit.

	-Daniel
*This .sig left intentionally blank*

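The tree-tag idea from this message can be tried out locally. The following is only a sketch under stated assumptions: the directory names `slave` and `web`, the tag name `deploy-current`, and the file `index.html` are made up for illustration, and both "servers" are just two repositories in a temporary directory. It tags the tree of the latest commit, fetches only that tag into a fresh repository, and checks that the commit itself did not come along.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Hypothetical "slave" repository holding two revisions of the site.
git init -q "$work/slave"
cd "$work/slave"
git config user.email deploy@example.invalid
git config user.name deploy
echo v1 > index.html; git add index.html; git commit -q -m v1
echo v2 > index.html; git add index.html; git commit -q -m v2
commit=$(git rev-parse HEAD)

# Lightweight tag pointing at the tree of HEAD, not at the commit.
git tag deploy-current 'HEAD^{tree}'

# Hypothetical "webserver" repository: fetch only that one tag.
git init -q "$work/web"
cd "$work/web"
git fetch -q "$work/slave" refs/tags/deploy-current:refs/tags/deploy-current

# Load the tree into the index and write the files to the working directory.
git read-tree deploy-current
git checkout-index -a -f

tag_type=$(git cat-file -t deploy-current)
content=$(cat index.html)

# The commit that produced the tree should be absent on the webserver side.
if git cat-file -e "$commit" 2>/dev/null; then has_commit=yes; else has_commit=no; fi

echo "tag type: $tag_type, content: $content, commit present: $has_commit"
```

The webserver repository still carries an object database next to the checked-out files, as the message says, but it holds only what the tagged tree needs.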
* Re: Using git for code deployment on webservers?
From: Ingo Oeser @ 2009-06-17 17:23 UTC
To: Daniel Barkalow; +Cc: Ingo Oeser, git

Hi Daniel,

On Tuesday 16 June 2009, Daniel Barkalow wrote:
> You should be able to have the slave repositories store tags for tree
> objects (instead of commit objects), and have the webservers fetch those.
> You'll still have the object database, but it will only contain stuff
> that's been deployed to that webserver, not intermediate versions or
> historical versions.

Ah, that sounds like a great solution. I'll try that.

> You'll still have to store both the repo and the checked out data
> (but git stores the content delta-compressed against each
> other in one big file, normally, so there really aren't files to hard link
> to.

Ok. That was under the assumption that the core of git is basically a
content-addressable file system. But that seems to be history :-)

> Of course, the other possibility is to check out versions on the slaves,
> and rsync that to the webservers, which is probably the optimal method if
> you're not in a situation where you benefit from anything git does in
> transit.

I would benefit from noticing local changes. But simple rsync is what is being tried now.
The problem is that we get no de-duplication from rsync, which git could do.

Many thanks for your suggestions!

Best Regards

Ingo Oeser

* Re: Using git for code deployment on webservers?
From: Daniel Barkalow @ 2009-06-17 19:26 UTC
To: Ingo Oeser; +Cc: git

On Wed, 17 Jun 2009, Ingo Oeser wrote:

> Hi Daniel,
>
> On Tuesday 16 June 2009, Daniel Barkalow wrote:
> > You should be able to have the slave repositories store tags for tree
> > objects (instead of commit objects), and have the webservers fetch those.
> > You'll still have the object database, but it will only contain stuff
> > that's been deployed to that webserver, not intermediate versions or
> > historical versions.
>
> Ah, that sound like a great solution. I'll try that.
>
> > You'll still have to store both the repo and the checked out data
> > (but git stores the content delta-compressed against each
> > other in one big file, normally, so there really aren't files to hard link
> > to.
>
> Ok. That was under the assumption, that the core of git is basically a
> content addressable file system. But that seems to be history :-)

It is (based on) a content-addressable file system, but it's not a host
file system. It's a file system in the sense that you can put octet
sequences into it and look them up by their names, but you can't mount
it from the kernel and link to it. It's like a tar file, although it's more
limited in that it doesn't provide a "list" operation. There's no
fundamental reason there couldn't be a kernel driver (or, more likely,
FUSE helper) which could mount it, but that's not the normal method.

> > Of course, the other possibility is to check out versions on the slaves,
> > and rsync that to the webservers, which is probably the optimal method if
> > you're not in a situation where you benefit from anything git does in
> > transit.
>
> I would benefit from noticing local changes. But simple rsync is what is tried now.
> Problem is, we get no de-duplication from rsync, which git could do.

In that case, fetching trees is probably the right thing; that should give
you a point-to-point de-duplication without any history (although you may
also turn up git bugs, since this isn't how git is normally used).

	-Daniel
*This .sig left intentionally blank*

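The "put octet sequences into it and look them up by their names" description, and the de-duplication that follows from it, can be illustrated with git's plumbing commands. This is a minimal sketch in a throwaway repository; the sample content is arbitrary. `git hash-object -w` stores a blob and prints its name, and `git cat-file` retrieves it again.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Store the same bytes twice; the object name depends only on the content,
# so the second write yields the same name and no second copy.
id1=$(printf 'hello\n' | git hash-object -w --stdin)
id2=$(printf 'hello\n' | git hash-object -w --stdin)

# Look the content up again by its name.
out=$(git cat-file -p "$id1")
type=$(git cat-file -t "$id1")

echo "$id1 ($type): $out"
```

Identical files copied around by developers therefore occupy one blob in the object store, whatever their paths in the tree.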
* Re: Using git for code deployment on webservers?
From: Alex Riesen @ 2009-06-17 20:26 UTC
To: Ingo Oeser; +Cc: Daniel Barkalow, git

2009/6/17 Daniel Barkalow <barkalow@iabervon.org>:
> On Wed, 17 Jun 2009, Ingo Oeser wrote:
>> > Of course, the other possibility is to check out versions on the slaves,
>> > and rsync that to the webservers, which is probably the optimal method if
>> > you're not in a situation where you benefit from anything git does in
>> > transit.
>>
>> I would benefit from noticing local changes. But simple rsync is what is tried now.
>> Problem is, we get no de-duplication from rsync, which git could do.
>
> In that case, fetching trees is probably the right thing; that should give
> you a point-to-point de-duplication without any history (although you may
> also turn up git bugs, since this isn't how git is normally used).

Or, you can just keep a namespace for each server in the intermediate
repositories, which records the version the server has and the version
it should have. Then you can use git diff-tree to find out which files
have to be transferred. You won't be able to record changes on the servers,
though.

* Re: Using git for code deployment on webservers?
From: Alex Riesen @ 2009-06-17 20:33 UTC
To: Ingo Oeser; +Cc: Daniel Barkalow, git

2009/6/17 Alex Riesen <raa.lkml@gmail.com>:
> Or, you can just keep a namespace for each server in the intermediate

I mean a namespace of branches:

  refs/heads/webserver_1/master (current)
  refs/heads/webserver_1/next (to be updated to)

> repositories, which records the version the server has and the version
> it should have. Then you can use git diff-tree to find you which files
> have to be transferred. You wont be able to record changes on the servers,
> though.

Something like this, run in a work tree checked out at webserver_1/next:

  # -r recurses into subdirectories, --name-only prints one path per line
  git diff-tree -r --name-only --diff-filter=AM webserver_1/master webserver_1/next |
  while read -r f; do scp "$f" webserver_1:"$f" || break; done

  # ssh -n keeps ssh from swallowing the rest of the path list on stdin
  git diff-tree -r --name-only --diff-filter=D webserver_1/master webserver_1/next |
  while read -r f; do ssh -n webserver_1 rm -f "$f" || break; done

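The per-server branch idea can be checked as a dry run before any files are copied. The sketch below is illustrative only: it builds a throwaway repository, and the file names are made up. It uses the branch names `webserver_1/master` and `webserver_1/next` from the message, and passes `-r --name-only` to `git diff-tree` so that the output is one path per line, including paths in subdirectories.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email deploy@example.invalid
git config user.name deploy

# State currently deployed on the (hypothetical) webserver_1.
mkdir img
echo page > index.html
echo old > img/old.png
git add .
git commit -q -m deployed
git branch webserver_1/master

# State the server should be updated to: one file modified, one added,
# one deleted.
echo page2 > index.html
echo new > img/new.png
git rm -q img/old.png
git add .
git commit -q -m wanted
git branch webserver_1/next

# Paths that would have to be copied (added or modified) and deleted.
to_copy=$(git diff-tree -r --name-only --diff-filter=AM webserver_1/master webserver_1/next)
to_delete=$(git diff-tree -r --name-only --diff-filter=D webserver_1/master webserver_1/next)

printf 'copy:\n%s\ndelete:\n%s\n' "$to_copy" "$to_delete"
```

Once the transfer has succeeded, moving `webserver_1/master` to `webserver_1/next` would record the new deployed state; that step is left out here.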