* About GIT Internals @ 2022-05-25 16:10 Aman 2022-05-25 16:49 ` Emily Shaffer ` (3 more replies) 0 siblings, 4 replies; 17+ messages in thread From: Aman @ 2022-05-25 16:10 UTC (permalink / raw) To: git Hello there, I have recently been reading The Architecture for Open Source Applications book - and read the chapters dedicated to GIT internals. And if I am being completely honest, I didn't understand most of it. Could someone please assist - in sharing some resources - which I could go through, to better understand GIT software internals. (I am a high school student, and really want to learn more about how all the great software and hardware around us work - which so many of us take for granted) Regards, ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: About GIT Internals 2022-05-25 16:10 About GIT Internals Aman @ 2022-05-25 16:49 ` Emily Shaffer 2022-05-25 21:14 ` Erik Cervin Edin ` (2 subsequent siblings) 3 siblings, 0 replies; 17+ messages in thread From: Emily Shaffer @ 2022-05-25 16:49 UTC (permalink / raw) To: Aman; +Cc: Git List On Wed, May 25, 2022 at 9:11 AM Aman <amanmatreja@gmail.com> wrote: > > Hello there, > > I have recently been reading The Architecture for Open Source > Applications book - and read the chapters dedicated to GIT internals. > And if I am being completely honest, I didn't understand most of it. > > Could someone please assist - in sharing some resources - which I > could go through, to better understand GIT software internals. I am really excited you asked! This puts you firmly on the road to being the person who can help unstick all your friends when they get into Git messes later on. ;) https://docs.google.com/presentation/d/1IQCRPHEIX-qKo7QFxsD3V62yhyGA9_5YsYXFOiBpgkk/edit?usp=sharing <- This is a really great intro to the internals which I love. I pretty much always recommend it as the place to start for someone curious about learning how Git works. https://www.youtube.com/watch?v=5Gq3KVvcfDk <- This covers much of the same territory but has a nice video to go through it, in case it's easier for you to learn that way instead of reading slides. If you have additional questions about the technical design of Git following one or both of those presentations above, I think you could get far starting with Git's own design documentation: https://github.com/git/git/tree/master/Documentation/technical From there I think the list will be the best place for specific followup questions you might have. Happy learning! - Emily ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: About GIT Internals 2022-05-25 16:10 About GIT Internals Aman 2022-05-25 16:49 ` Emily Shaffer @ 2022-05-25 21:14 ` Erik Cervin Edin 2022-05-25 23:34 ` git-vger 2022-05-26 12:45 ` Konstantin Khomoutov 3 siblings, 0 replies; 17+ messages in thread From: Erik Cervin Edin @ 2022-05-25 21:14 UTC (permalink / raw) To: Aman; +Cc: git On Wed, May 25, 2022 at 10:14 PM Aman <amanmatreja@gmail.com> wrote: > > And if I am being completely honest, I didn't understand most of it. You are not alone, there are many that struggle with understanding how git works internally. > (I am a high school student, and really want to learn more about how > all the great software and hardware around us work - which so many of > us take for granted) Perhaps not a good resource, depending on your familiarity with computer science but https://eagain.net/articles/git-for-computer-scientists/ is an article that is often recommended. I think for me, the hardest part of understanding Git was the difficulty conceptualizing it. But at its core Git is very simple. You can think of it as a folder of files that you can "save" (commit) whenever you want. Each time you "save" (commit), all files and folders are "copied" to another folder (the local repository). That means that if you ever want to look at a previous version of a file, it's there. For simplicity's sake you can think of this as being unchangeable. Once a file is saved it's saved forever. Just having a messy pile of every single version of a file is not useful, so the rest of git consists of making this manageable. For example by remembering who saved it, when and why (by making them write a message when they save). The main thing however is that Git orders saves. This order is not necessarily one version after another, sorted by when they were saved. Instead, order is manually controlled by saving files in different places (branches). In its simplest form, a branch is several saves, one after another. 
Because of how Git orders saves, I can work on files, save them and give them to you. You can keep working on those files and make your own saves. But I don't have to wait for you to send your work back to me. I can keep working on the same files and making my own saves. When you're done you can put your saves in a "shared folder" (a remote repository). Later, when I'm done, I can get your saves and Git can help me figure out which parts of the files you changed that I didn't, and combine both of our work into new files (merging). This is a bit of an oversimplification, and Git allows users to do more advanced things, but the gist is basically this. ^ permalink raw reply [flat|nested] 17+ messages in thread
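The "saves" and "branches" picture in the message above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Git's actual storage format: the names `objects`, `branches`, `save` and `history` are made up for this sketch, and each save is identified by hashing a JSON record rather than by Git's real commit encoding. What it does show is the shape of the idea: every save remembers its parent, and a branch is just a name pointing at the latest save.

```python
import hashlib
import json

objects = {}   # save id -> record of that save (toy stand-in for the local repository)
branches = {}  # branch name -> id of the latest save on that branch


def save(branch, files, author, message):
    """Record a snapshot of all files, linked to the previous save on this branch."""
    record = {
        "files": files,
        "author": author,
        "message": message,
        "parent": branches.get(branch),  # None for the very first save
    }
    save_id = hashlib.sha1(json.dumps(record, sort_keys=True).encode("utf-8")).hexdigest()
    objects[save_id] = record
    branches[branch] = save_id  # the branch now points at the new save
    return save_id


def history(branch):
    """Walk from the latest save back to the first one, collecting messages."""
    messages = []
    current = branches.get(branch)
    while current is not None:
        messages.append(objects[current]["message"])
        current = objects[current]["parent"]
    return messages


first = save("main", {"notes.txt": "draft"}, "me", "start notes")
branches["experiment"] = first  # a new branch starts at an existing save
save("main", {"notes.txt": "draft, revised"}, "me", "revise notes")
save("experiment", {"notes.txt": "draft", "idea.txt": "try this"}, "you", "add idea")

print(history("main"))        # ['revise notes', 'start notes']
print(history("experiment"))  # ['add idea', 'start notes']
```

Both branches share the first save and then diverge, which is the situation merging later has to reconcile.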
* Re: About GIT Internals 2022-05-25 16:10 About GIT Internals Aman 2022-05-25 16:49 ` Emily Shaffer 2022-05-25 21:14 ` Erik Cervin Edin @ 2022-05-25 23:34 ` git-vger 2022-05-26 8:47 ` Philip Oakley 2022-05-26 12:45 ` Konstantin Khomoutov 3 siblings, 1 reply; 17+ messages in thread From: git-vger @ 2022-05-25 23:34 UTC (permalink / raw) To: Aman; +Cc: git Hi Aman, responses inline below. On Wed, May 25, 2022 at 09:40:42PM +0530, Aman wrote: > Could someone please assist - in sharing some resources - which I > could go through, to better understand GIT software internals. There is an excellent free book at https://git-scm.com/book/en/v2 . Chapter 10 is about git internals. It is important to realize that, unlike many other version control systems, git works effectively on files locally on your computer, without any server or other shared resources to manage. Also, one good way to learn may be to form a question that you want to answer first. "How do I ...." or "what happens when I ....". Since git works locally, it is possible to create a git repo, look at the files contained in the .git directory, take action with git, and then look at the files again. Many people use git from the command line. If you are not familiar with the command line, you may be interested in learning more about it. Mozilla, the makers of the Firefox web browser, have a wiki page to familiarize yourself with the command line here: https://developer.mozilla.org/en-US/docs/Learn/Tools_and_testing/Understanding_client-side_tools/Command_line Happy Explorations! Eldon ^ permalink raw reply [flat|nested] 17+ messages in thread
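As one way to follow the "look at the files in the .git directory" suggestion above, here is a minimal sketch of how a "loose" object file is laid out in a SHA-1 repository: the bytes `<type> <size>\0<content>` are zlib-compressed and stored under `objects/<first two hex digits>/<remaining 38>`. The helper names `write_loose_object` and `read_loose_object` are hypothetical, and the sketch writes into a temporary directory that merely stands in for `.git`, so it runs without Git installed. A real repository may also keep objects in packfiles, which this sketch does not read.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(git_dir, obj_type, content):
    """Store content in the loose-object layout and return its object id."""
    store = obj_type.encode("ascii") + b" " + str(len(content)).encode("ascii") + b"\0" + content
    object_id = hashlib.sha1(store).hexdigest()
    path = os.path.join(git_dir, "objects", object_id[:2], object_id[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(store))
    return object_id


def read_loose_object(git_dir, object_id):
    """Return (type, content) for a loose object, checking the recorded size."""
    path = os.path.join(git_dir, "objects", object_id[:2], object_id[2:])
    with open(path, "rb") as f:
        store = zlib.decompress(f.read())
    header, _, content = store.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), content


demo_dir = tempfile.mkdtemp()  # stand-in for a real .git directory
oid = write_loose_object(demo_dir, "blob", b"test content\n")
print(oid)                               # d670460b4b4aece5915caf5c68d12f560a9fe3e4
print(read_loose_object(demo_dir, oid))  # ('blob', b'test content\n')
```

In a real repository, pointing `read_loose_object` at the `.git` directory and an id that still exists as a loose file should give the same kind of result; `git cat-file -p <id>` is the built-in way to do this.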
* Re: About GIT Internals 2022-05-25 23:34 ` git-vger @ 2022-05-26 8:47 ` Philip Oakley [not found] ` <CACMKQb3exv13sYN5uEP_AG-JYu1rmVj4HDxjdw8_Y-+maJPwGg@mail.gmail.com> 0 siblings, 1 reply; 17+ messages in thread From: Philip Oakley @ 2022-05-26 8:47 UTC (permalink / raw) To: git-vger, Aman; +Cc: git On 26/05/2022 00:34, git-vger@eldondev.com wrote: > Hi Aman, responses inline below. > > On Wed, May 25, 2022 at 09:40:42PM +0530, Aman wrote: >> Could someone please assist - in sharing some resources - which I >> could go through, to better understand GIT software internals. > There is an excellent free book at https://git-scm.com/book/en/v2 . > > Chapter 10 is about git internals. It is important to realize that, > unlike many other version control systems, git works effectively on > files locally on your computer, without any server or other shared > resources to manage. Also, one good way to learn may be to form a > question that you want to answer first. "How do I ...." or "what happens > when I ....". Since git works locally, it is possible to create a git > repo, look at the files contained in the .git directory, take action > with git, and then look at the files again. > > Another Git feature, compared to older version control systems, is that it flips the 'control' aspect on its head. (who controls what you can store?) It does this by using the hash (sha1, or sha256) values as a way of users _checking_ that they have the right copy of a file or commit, rather than needing special permissions to access (write/read) some alleged 'master' copy (in the sense of a unique artefact) of the particular version. Maintainers now check and authorise particular versions much more easily. Hence Git _Distributes Control_ - you no longer need permission to keep versioned copies of your work. This was, in my mind, a core element of its success. 
There is other stuff about how Git splits the (file) content from its meta-data, so if say 10 files contain the same licence text, then it only holds one copy of that text, with its own unique hash. It then has a hierarchy (pyramid) of hashes of the meta-data to build up a whole project's hash (the top level 'tree'), and the same hierarchy technique is repeated for the project's history of commits. If you have a copy of the repository with the latest (same) hash then you have a perfect copy, indistinguishable from the 'original'! Older versioning systems did not have those guarantees; many were derived from systems for versioning engineering and architectural drawings such as those that were used for the RMS Titanic or Empire State Building. Philip PS it's worth checking out the distinction between having a hash (a magic id) of some text, and encrypting (a magic translation of) some text. ^ permalink raw reply [flat|nested] 17+ messages in thread
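The "one copy per content" and "pyramid of hashes" points above can be sketched in a few lines. For SHA-1 repositories, an object id is the SHA-1 of `<type> <size>\0<body>`, so identical file contents always get the same blob id, and a tree's id is computed from the names and blob ids it lists. The helper names `object_id`, `blob_id` and `tree_id` are made up for this sketch, and `tree_id` is deliberately simplified: it assumes a flat directory of ordinary files (mode `100644`) with no subdirectories.

```python
import hashlib


def object_id(obj_type, body):
    # Git hashes "<type> <size>" + NUL + body (SHA-1 repositories).
    header = obj_type + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content):
    """Id of a file's content; the file name is not part of it."""
    return object_id(b"blob", content)


def tree_id(files):
    """Id of a flat directory given {name: content} (simplified: regular files only)."""
    body = b""
    for name in sorted(files):
        # Each entry: "<mode> <name>" + NUL + the 20 raw bytes of the blob id.
        body += b"100644 " + name.encode("utf-8") + b"\0" + bytes.fromhex(blob_id(files[name]))
    return object_id(b"tree", body)


licence = b"hello world\n"
project = {"LICENCE": licence, "COPYING": licence}

# Two files, one distinct content, hence one blob id to store.
print(len({blob_id(content) for content in project.values()}))  # 1
print(blob_id(licence))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(tree_id({}))       # 4b825dc642cb6eb9a060e54bf8d69288fbee4904

# Changing any file changes its blob id, and therefore the tree id above it.
changed = dict(project, COPYING=b"hello world!\n")
print(tree_id(project) == tree_id(changed))  # False
```

Commits extend the same idea one level up: a commit's body names a tree id and its parent commit ids, so matching commit ids imply matching content and history.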
[parent not found: <CACMKQb3exv13sYN5uEP_AG-JYu1rmVj4HDxjdw8_Y-+maJPwGg@mail.gmail.com>]
* Re: About GIT Internals [not found] ` <CACMKQb3exv13sYN5uEP_AG-JYu1rmVj4HDxjdw8_Y-+maJPwGg@mail.gmail.com> @ 2022-05-27 14:40 ` Philip Oakley [not found] ` <C4B1A93D-800F-4C49-93D5-86FE58B1DDCA@hxcore.ol> 2022-05-30 9:49 ` Kerry, Richard 0 siblings, 2 replies; 17+ messages in thread From: Philip Oakley @ 2022-05-27 14:40 UTC (permalink / raw) To: Aman; +Cc: Git List, git-vger Hi Aman, We try to keep all the cc's so everyone can gain from the learning! Comments in-line. On 26/05/2022 15:17, Aman wrote: > Hey Phillip. > > Thanks a lot for your email, and for sharing the book! This is great. That was Eldon, thank you.. (https://lore.kernel.org/git/Yo68+kjAeP6tnduW@invalid/) There is also the Git Magic 'book' from Stanford, with Ch8 covering the internals http://www-cs-students.stanford.edu/~blynn/gitmagic/ch08.html > > Just a follow up questions- if you don't mind: > > 1. I haven't had the experience of working with other (perhaps even > older) version control systems, like subversion. So when refering to > the "control" aspect, The "control" aspect came from whoever was the 'manager' who limited access to the version system (i.e. acting like a museum curator) and decided if your masterpiece was worthy of inclusion as a significant example of your craft, whether that was an engineering drawing or some software code. > you mean because with hashes we can verify If you have a look at https://www.makeuk.org/insights/blogs/how-to-read-engineering-drawings-a-simple-guide , the part about the Title Block has a drawing (DWG) number (EEF-001-AM) that is used to reference it. While it feels nice, the reference is rather arbitrary (could someone else use that number? what's the next in the sequence? what happens when we reach EEF-999-AM? etc.). So the computer hash (40 digits of 0-9a-f for sha1!) solves all those problems: it is unique for practical purposes (>40 card shuffle level), depends only on the content, and computers like it. Yay.
(Computers are great at perfect replication, so cost of manufacture tends to zero! Cost of design wanders in the other direction;-) > the > integrity of the files (like code) in git - there is no need for > having a central authority to guarantee that's it's the right content > files (which is great)? And it means managers no longer worry about _your_ working copy - computers have digital storage space to spare. That wasn't the case when it was on paper, and we didn't have photocopies - have a look at 'blue prints' https://en.wikipedia.org/wiki/Blueprint (see the invention date!, I still remember the smell from the late 1970s) ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <C4B1A93D-800F-4C49-93D5-86FE58B1DDCA@hxcore.ol>]
* Re: About GIT Internals [not found] ` <C4B1A93D-800F-4C49-93D5-86FE58B1DDCA@hxcore.ol> @ 2022-05-27 15:14 ` Philip Oakley 0 siblings, 0 replies; 17+ messages in thread From: Philip Oakley @ 2022-05-27 15:14 UTC (permalink / raw) To: Aman; +Cc: Git List On 27/05/2022 16:01, Aman wrote: > > Hey, thank you again. > > I am finding this mailing list format of talking a bit confusing, sorry. > No problem, blame Microsoft for following the business ($$$) way of doing stuff. The 'plain text, in-line replies, with trimming of unrelated items' style helps the lurkers, and folks who come to the discussion later. Main point is that we convert each point of interest into its own discussion, rather than it being a big challenge-response style between legal negotiators - it's not a win/lose discussion ;-) > Would there be any way address everyone on the mailing list – like in > the future – to continue this conversation about git internals? > Key method is to locate the "reply All" option in your mail app. That makes sure every one in the discussion is copied, and the mailing lists as well for all the 'lurkers';-) The mailing list archive uses both the message titles and any in-reply-to hidden headers (see if your mailer has a 'show source' option to see those interesting bits) to organise the list archive. > > I found out the mailing list archive (so confusing)– and saw these > personal replies don’t get added in the thread. I would appreciate if > you give some advice, thank you ^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: About GIT Internals 2022-05-27 14:40 ` Philip Oakley [not found] ` <C4B1A93D-800F-4C49-93D5-86FE58B1DDCA@hxcore.ol> @ 2022-05-30 9:49 ` Kerry, Richard 2022-05-30 11:53 ` Konstantin Khomoutov 1 sibling, 1 reply; 17+ messages in thread From: Kerry, Richard @ 2022-05-30 9:49 UTC (permalink / raw) To: Philip Oakley, Aman; +Cc: Git List, git-vger > -----Original Message----- > From: Philip Oakley <philipoakley@iee.email> > Sent: 27 May 2022 15:40 > To: Aman <amanmatreja@gmail.com> > Cc: Git List <git@vger.kernel.org>; git-vger@eldondev.com > Subject: Re: About GIT Internals > > > Just a follow up questions- if you don't mind: > > > > 1. I haven't had the experience of working with other (perhaps even > > older) version control systems, like subversion. So when refering to > > the "control" aspect, > > The "control" aspect was from whoever was the 'manager' that limited > access to the version system (i.e. acting like a museum curator), and deciding > if your masterpiece was worthy of inclusion as a significant example of your > craft, whether that was an engineering drawing or some software code. I'm not sure I get that idea. I worked using server-based Version Control systems from the mid 80s until about 5 years ago when the team moved from Subversion to Git. There was never a "curator" who controlled what went into VC. You did your work, developed files, and committed when you thought it necessary. When a build was to be done there would then be some consideration of what from VC would go into the build. That is all still there nowadays using a distributed system (ie Git). Those doing Open source work might operate a bit differently, as there is of necessity distribution of control of what gets into a release. But those of us who are developing proprietary software are still going through the same sort of release process. 
And that's even if there isn't actually a separate person actively manipulating the contents of a release, it's just up to you to do what's necessary (actually there are others involved in dividing what will be in, but in our case they don't actively manipulate a repository). > >>> Chapter 10 is about git internals. It is important to realize that, > >>> unlike many other version control systems, git works effectively on > >>> files locally on your computer, without any server or other shared > >>> resources to manage. Also, one good way to learn may be to form a > >>> question that you want to answer first. "How do I ...." or "what > >>> happens when I ....". Since git works locally, it is possible to > >>> create a git repo, look at the files contained in the .git > >>> directory, take action with git, and then look at the files again. > >>> > >>> > >> Another Git feature, compared to older version control systems, is > >> that it flips the 'control' aspect on its head. (who controls what > >> you can > >> store?) Again, I don't really recognize that. You store what you want, probably with some sort of arrangement with the others on the team. The important bit is determining what will go into the release. Ie in choosing what, from everything that is stored, will be released. > >> Hence Git _Distributes Control_ - you no longer need permission to > >> keep versioned copies of your work. This was, in my mind, a core > >> element of its success. Maybe you do. If you're working with others there will probably be "permission" in some sense involved. I can store what I like locally, but then I miss out on some protection of my work, against a technical fault locally that might cause a loss of the whole repository. If there is a remote server then I am probably only allowed to store company work to the company server. A lot of this discussion seems to be more about the differences between the nature of Git and its client-server rivals. 
I thought the original query was about how its internals worked, which would seem to be a slightly different question. Regards, Richard. (Not old enough to remember the smell of blue prints, but old enough to know of the term) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: About GIT Internals 2022-05-30 9:49 ` Kerry, Richard @ 2022-05-30 11:53 ` Konstantin Khomoutov 2022-05-30 13:50 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 17+ messages in thread From: Konstantin Khomoutov @ 2022-05-30 11:53 UTC (permalink / raw) To: Kerry, Richard; +Cc: Philip Oakley, Aman, Git List, git-vger On Mon, May 30, 2022 at 09:49:57AM +0000, Kerry, Richard wrote: [...] > > > 1. I haven't had the experience of working with other (perhaps even > > > older) version control systems, like subversion. So when refering to > > > the "control" aspect, > > > > The "control" aspect was from whoever was the 'manager' that limited > > access to the version system (i.e. acting like a museum curator), and deciding > > if your masterpiece was worthy of inclusion as a significant example of your > > craft, whether that was an engineering drawing or some software code. > > I'm not sure I get that idea. I worked using server-based Version Control > systems from the mid 80s until about 5 years ago when the team moved from > Subversion to Git. There was never a "curator" who controlled what went > into VC. You did your work, developed files, and committed when you thought > it necessary. When a build was to be done there would then be some > consideration of what from VC would go into the build. That is all still > there nowadays using a distributed system (ie Git). Those doing Open source > work might operate a bit differently, as there is of necessity distribution > of control of what gets into a release. But those of us who are developing > proprietary software are still going through the same sort of release > process. And that's even if there isn't actually a separate person actively > manipulating the contents of a release, it's just up to you to do what's > necessary (actually there are others involved in dividing what will be in, > but in our case they don't actively manipulate a repository). 
I think the "inversion of control" brought in by DVCS-es is about a slightly different set of things. I would say it is connected to F/OSS and the way most projects were hosted before DVCS-es took over: usually each project had a single repository (say, on Sourceforge or elsewhere), and it was "truly central" in the sense that if anyone were to decide to work on that project, they would need to contact whoever was in charge of that project and ask them to set up permissions allowing commits - maybe not to "the trunk", but commit access was required anyway because in a centralized VCS commits are made on the server side. (Of course, there were projects where you could mail your patchset to a maintainer, but maintaining such a patchset was not convenient: you would either need to host your own fully private VCS or use a tool like Quilt [1]. Also note that certain high-profile projects such as Linux and Git use mailing lists for submission and review of patch series; this workflow coexists with the concept of DVCS just fine.) This approach has been effectively reversed by what was a killer feature of GitHub (I honestly am not sure whether GitHub was the first to implement it but it was, and arguably is, the most popular): a network of "forks". If a project is hosted using a DVCS, anyone is free to clone it and push their work _elsewhere._ This point is crucial: you do not need to ask the project maintainers to publish your modifications. GitHub pushed this concept quite far: creating a fork and pushing your work there is actually a device to create a pull request - a request to incorporate your changes into the original project. 
While this approach has obvious upsides, it also has possible downsides; one of the more visible is that when an original project becomes dormant for some reason, its users might have a hard time understanding which one of the competing forks to switch to, and there are cases when multiple competing forks implement different features and bugfixes in parallel. One of the guys behind Subversion expressed his concerns about this back when Git was in its relative infancy [2]. 1. https://en.wikipedia.org/wiki/Quilt_(software) 2. http://blog.red-bean.com/sussman/?p=20 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: About GIT Internals 2022-05-30 11:53 ` Konstantin Khomoutov @ 2022-05-30 13:50 ` Ævar Arnfjörð Bjarmason 2022-06-03 12:18 ` Aman 0 siblings, 1 reply; 17+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-05-30 13:50 UTC (permalink / raw) To: Konstantin Khomoutov Cc: Kerry, Richard, Philip Oakley, Aman, Git List, git-vger On Mon, May 30 2022, Konstantin Khomoutov wrote: > On Mon, May 30, 2022 at 09:49:57AM +0000, Kerry, Richard wrote: > > [...] >> > > 1. I haven't had the experience of working with other (perhaps even >> > > older) version control systems, like subversion. So when refering to >> > > the "control" aspect, >> > >> > The "control" aspect was from whoever was the 'manager' that limited >> > access to the version system (i.e. acting like a museum curator), and deciding >> > if your masterpiece was worthy of inclusion as a significant example of your >> > craft, whether that was an engineering drawing or some software code. >> >> I'm not sure I get that idea. I worked using server-based Version Control >> systems from the mid 80s until about 5 years ago when the team moved from >> Subversion to Git. There was never a "curator" who controlled what went >> into VC. You did your work, developed files, and committed when you thought >> it necessary. When a build was to be done there would then be some >> consideration of what from VC would go into the build. That is all still >> there nowadays using a distributed system (ie Git). Those doing Open source >> work might operate a bit differently, as there is of necessity distribution >> of control of what gets into a release. But those of us who are developing >> proprietary software are still going through the same sort of release >> process. 
And that's even if there isn't actually a separate person actively >> manipulating the contents of a release, it's just up to you to do what's >> necessary (actually there are others involved in deciding what will be in, >> but in our case they don't actively manipulate a repository). > > I think the "inversion of control" brought in by DVCS-es is about a slightly > different set of things. Re the "I'm not sure I get that idea" from Richard: I think his point stands that some of the stories we carry around about VCS vs. DVCS in free/open source software were more particular to how things were done in those online communities, and not really about the implicit constraints of centralized VCS per se. Partly those two mix: it was quite common for free software projects not to have any public VCS (usually CVS) access at all. Some did, but it was enough of a hassle to set up, and not part of your "normal" workflow (as opposed to setting up a hosted Git repository, which everyone uses), that many just didn't do it. > I would say it is connected to F/OSS and the way most projects were > hosted before DVCS-es took over: usually each project had a single repository > (say, on Sourceforge or elsewhere), and it was "truly central" in the sense > that if anyone were to decide to work on that project, they would need to > contact whoever was in charge of that project and ask them to set up > permissions allowing commits - maybe not to "the trunk", but anyway the > commit access was required because in centralized VCS commits are made on the > server side. We may have tried this in different eras, but from what I recall it was a crapshoot whether there was any public VCS access at all. Some projects were quite good about it, and SourceForge managed to push that to more of them early on by making anonymous CVS access something you could get by default. But a lot of projects simply didn't have it at all; you'll still find some of them today, i.e. 
various bits of "infrastructure" code that the maintainers are (presumably) still manually managing with zip snapshots and manually applied patches. > (Of course, there were projects where you could mail your patchset to a > maintainer, but maintaining such a patchset was not convenient: you would either > need to host your own fully private VCS or use a tool like Quilt [1]. > Also note that certain high-profile projects such as Linux and Git use mailing > lists for submission and review of patch series; this workflow coexists with > the concept of DVCS just fine.) I'd add though that this isn't really "co-existing" with DVCS so much as using patches on a ML as an indirect transport protocol for "git push". I.e. if you contributed to some similar projects "back in the day" you could expect to effectively send your patches into a black hole until the next release: the maintainer would apply them locally, and you wouldn't be able to pull them back down via the DVCS. Perhaps there would be development releases, but those could be weeks or even months apart, and a "real" release might be once every 1-2 years. Whereas both Junio and Linus (and other Linux maintainers) publish their version of the patches they do integrate fairly quickly. > [...] it also has possible > downsides; one of the more visible ones is that when an original project becomes > dormant for some reason, its users might have a hard time understanding which > of the competing forks to switch to, and there are cases where multiple > competing forks implement different features and bugfixes in parallel. > One of the people behind Subversion expressed his concerns about this back > when Git was in its relative infancy [2]. > > 1. https://en.wikipedia.org/wiki/Quilt_(software) > 2. 
http://blog.red-bean.com/sussman/?p=20 It's interesting that this aspect of what proponents of centralized VCS were fearful of when it came to DVCS turned out to be the exact opposite: "Notice what this user is now able to do: he wants to crawl off into a cave, work for weeks on a complex feature by himself, then present it as a polished result to the main codebase. And this is exactly the sort of behavior that I think is bad for open source communities." I.e. lowering the cost to publish early and often has had the effect that people are less likely to "crawl off into a cave" and work on something for a long time without syncing up with other parallel development.
* Re: About GIT Internals 2022-05-30 13:50 ` Ævar Arnfjörð Bjarmason @ 2022-06-03 12:18 ` Aman 2022-06-03 15:23 ` Konstantin Khomoutov 2022-06-03 15:25 ` Emily Shaffer 0 siblings, 2 replies; 17+ messages in thread From: Aman @ 2022-06-03 12:18 UTC (permalink / raw) To: Git List Cc: Konstantin Khomoutov, Kerry, Richard, Philip Oakley, git-vger, Ævar Arnfjörð Bjarmason Hello everyone. Last week I sent an email here asking for a list of resources so I could better understand the workings and design of Git. I really appreciate everyone who shared links and advice. I have been reading about Git for some time now, and have looked at almost all of those resources plus some others. I think I can say I now have a decent conceptual understanding of how Git works internally. (I also now understand the chapter about Git in the book I am reading, Architecture of Open Source Applications: Volume 2, which I didn't understand at all before and which was the reason I started this thread.) There must definitely still be a lot of details and subtle things I don't understand yet (by the way: branches are nothing but pointers to commits, wow!). Now, continuing this discussion and turning to the implementation and engineering side of things, I wanted to ask another question and get some advice. Though I may understand the internal design and high-level implementation of Git, I really want to know how it is implemented and how it was made, which means reading the SOURCE CODE. 1. I don't know how absurd of a quest this is; please enlighten me. 2. How do I do it? Where do I start? It's such a BIG repository, and I am guessing it's not going to be easy. 3. Would someone advise, perhaps, having a look at an older version of the source code rather than the latest one, for some reason? Again, I would really appreciate it if someone could give their thoughts on this. 
Thank you, Regards, Aman
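As an aside on the remark above that branches are nothing but pointers to commits: in a typical repository a branch can be stored as a small text file under "refs/heads/" whose content is a commit ID. The following is a minimal sketch of reading such a file. The function names and the made-up commit ID are illustrative only, and the sketch assumes the branch is a "loose" ref; Git may also keep refs in a "packed-refs" file, which is ignored here.

```python
import os
import tempfile


def read_loose_branch(git_dir, branch):
    """Return the commit ID stored in refs/heads/<branch>, or None if no such file.

    Assumption: the branch is a "loose" ref (one small file per branch).
    Refs kept in "packed-refs" are deliberately not handled in this sketch.
    """
    ref_path = os.path.join(git_dir, "refs", "heads", *branch.split("/"))
    try:
        with open(ref_path, "r", encoding="ascii") as fh:
            return fh.read().strip()
    except FileNotFoundError:
        return None


def demo():
    # Build a throwaway directory that mimics the relevant part of ".git",
    # so the example runs without Git being installed.
    with tempfile.TemporaryDirectory() as tmp:
        heads = os.path.join(tmp, "refs", "heads")
        os.makedirs(heads)
        fake_commit_id = "0123456789abcdef0123456789abcdef01234567"  # made up
        with open(os.path.join(heads, "main"), "w", encoding="ascii") as fh:
            fh.write(fake_commit_id + "\n")
        return read_loose_branch(tmp, "main"), read_loose_branch(tmp, "missing")


if __name__ == "__main__":
    print(demo())
```

In a real repository, "git rev-parse <branch>" is the reliable way to resolve a branch, since it also handles packed refs.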
* Re: About GIT Internals 2022-06-03 12:18 ` Aman @ 2022-06-03 15:23 ` Konstantin Khomoutov 2022-06-04 15:24 ` Aman 2022-06-03 15:25 ` Emily Shaffer 1 sibling, 1 reply; 17+ messages in thread From: Konstantin Khomoutov @ 2022-06-03 15:23 UTC (permalink / raw) To: Aman Cc: Git List, Konstantin Khomoutov, Kerry, Richard, Philip Oakley, git-vger, Ævar Arnfjörð Bjarmason On Fri, Jun 03, 2022 at 05:48:14PM +0530, Aman wrote: [...] > Though I may understand the internal design and high-level > implementation of GIT, I really want to know how it's implemented and > was made, which means reading the SOURCE CODE. > > 1. I don't know how absurd of a quest this is, please enlighten me. > 2. How do I do it? Where do I start? It's such a BIG repository - and > I am not guessing it's going to be easy. > 3. Would someone advise, perhaps, to have a look at an older version > of the source code? rather than the latest one, for some reason. Well, it depends on what you mean when talking about the two designs you mentioned. I mean, there's the design of the approach to managing data, and there's the design of the software package (which Git is). If you also understand the latter - that is, you understand that Git is an assortment of CLI tools organized into two layers called "plumbing" and "porcelain" - then you should have no difficulty starting to read the code: basically, locate the source code of the entry-point Git binary (which is, well, "git", or "git.exe" on Windows) and start reading it. You'll find that it parses its command-line arguments and calls out to other modules (built-in command functions or, in some cases, separate executables) which are parts of the Git software package to do the heavy lifting. You then read the source code of the modules of interest, and so on and so on. I'm not sure there could be any other "guide" to reading the source code. If you're not familiar with the design of Git-as-a-software-package, it's probably time to clone the Git repository and explore the contents of the directory named "Documentation" there. 
* Re: About GIT Internals 2022-06-03 15:23 ` Konstantin Khomoutov @ 2022-06-04 15:24 ` Aman 2022-06-06 11:52 ` Konstantin Khomoutov 0 siblings, 1 reply; 17+ messages in thread From: Aman @ 2022-06-04 15:24 UTC (permalink / raw) To: Konstantin Khomoutov Cc: Git List, Kerry, Richard, Philip Oakley, git-vger, Ævar Arnfjörð Bjarmason On Fri, Jun 3, 2022, at 8:53 PM Konstantin Khomoutov <kostix@bswap.ru> wrote: > Well, depends on what you mean when talking about the two mentioned designs. > I mean, there's the design of the approach to manage data and there's the > design of the software package (which Git is). That's a good perspective on the distinction between the designs. I am not yet familiar with the design of Git as a software package, and I am guessing most people learning about Git internals won't be either. > If you do also understand the latter - that is, understanding that Git is an > assortment of CLI tools combined into two layers called "plumbing" and > "porcelain", - then you should have no difficulty starting to read the code: > basically locate the source code of the entry point Git binary (which is, > well, "git", or "git.exe" on Windows) and start reading it. How do I do that? What do you mean by the "entry point" of the git binary?
* Re: About GIT Internals 2022-06-04 15:24 ` Aman @ 2022-06-06 11:52 ` Konstantin Khomoutov 0 siblings, 0 replies; 17+ messages in thread From: Konstantin Khomoutov @ 2022-06-06 11:52 UTC (permalink / raw) To: Aman Cc: Konstantin Khomoutov, Git List, Kerry, Richard, Philip Oakley, git-vger, Ævar Arnfjörð Bjarmason On Sat, Jun 04, 2022 at 08:54:10PM +0530, Aman wrote: [...] > > If you do also understand the latter - that is, understanding that Git is an > > assortment of CLI tools combined into two layers called "plumbing" and > > "porcelain", - then you should have no difficulty starting to read the code: > > basically locate the source code of the entry point Git binary (which is, > > well, "git", or "git.exe" on Windows) and start reading it. (I have reversed the order of your questions below so that my comments follow logically one after another.) > What do you mean by the "entry point" of the git binary? Well, porcelain Git commands (those meant to be used by users to carry out their day-to-day tasks) are all implemented as subcommands of a single executable image file called "git" on all supported platforms (except Windows, where it's called "git.exe"): for instance, you run "git init" to initialize a repository, and your OS looks up the executable image file named "git" somewhere in the list of directories containing such files (that list is usually held in the environment variable named "PATH"), executes it and passes it a single command-line argument - "init". The rest of the commands work the same way. Therefore, that binary named "git" is an entry point of the Git software package: the execution of most Git commands starts there (not *all* Git commands, but let's not touch this yet). > How do I do that? Well, basically that's out of the scope of this list, but let's try... Git is a complex software package mostly written in C (and POSIX shell). 
Like many F/OSS projects written in C, it has a top-level Makefile, a file meant to be processed by GNU Make; this file contains a set of rules for generating files from other files (compiling C source code into object files and linking those into libraries and executable image files is exactly this - generating files from other files). So usually you start by reading the Makefile to find where the binary file of interest is generated, and from which source files. The problem is that Git's Makefile is *complex.* So let's save you some headache and cut straight to the point: of most interest to you are two files, git.c and common-main.c. The former is exactly what implements that top-level entry point program, "git", while the latter implements the function called "main", which is the entry point of any C program meant to be runnable standalone (as opposed to becoming a library); the object file generated when compiling common-main.c is linked with all the other compiled code implementing Git commands, and its main() calls cmd_main(), which is supposed to be implemented in the code of those commands. The rest is basically just the usual C stuff - source files and header files. If you're not familiar with these basics, then, I'm afraid, Git may not be the best project to dive into. In any case, I find the idea proposed by Junio elsewhere in this thread to be very smart: it should be quite enlightening to read the "early" Git code to get accustomed to its overall architecture before moving on to its present - much more complicated - implementation, which nevertheless still maintains the same architecture.
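To make the PATH lookup described in this message a little more concrete, here is a minimal sketch of how a program named "git" can be located in a PATH-style list of directories. It is written in Python for brevity and is only an approximation of what a shell or the operating system does; the function name find_on_path and the fake "git" stub created in the demo are illustrative and not part of Git.

```python
import os
import tempfile


def find_on_path(name, path_value):
    """Return the first executable file called `name` in a PATH-style string, or None."""
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # this sketch simply skips empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def demo():
    # Two throwaway directories: the first is empty, the second holds a fake "git".
    with tempfile.TemporaryDirectory() as empty_dir, \
            tempfile.TemporaryDirectory() as bin_dir:
        fake_git = os.path.join(bin_dir, "git")
        with open(fake_git, "w") as fh:
            fh.write("#!/bin/sh\n")
        os.chmod(fake_git, 0o755)  # mark the stub as executable
        path_value = os.pathsep.join([empty_dir, bin_dir])
        found = find_on_path("git", path_value)
        return found == fake_git, find_on_path("no-such-tool", path_value)


if __name__ == "__main__":
    print(demo())
```

For real use, Python's standard library already offers shutil.which("git"), which performs this search against the current PATH and also handles platform details such as the ".exe" suffix on Windows.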
* Re: About GIT Internals 2022-06-03 12:18 ` Aman 2022-06-03 15:23 ` Konstantin Khomoutov @ 2022-06-03 15:25 ` Emily Shaffer 2022-06-03 17:15 ` Junio C Hamano 1 sibling, 1 reply; 17+ messages in thread From: Emily Shaffer @ 2022-06-03 15:25 UTC (permalink / raw) To: Aman Cc: Git List, Konstantin Khomoutov, Kerry, Richard, Philip Oakley, git-vger, Ævar Arnfjörð Bjarmason On Fri, Jun 3, 2022 at 5:21 AM Aman <amanmatreja@gmail.com> wrote: > > Hello everyone. I sent out an email here last week, asking for a list > of resources, so I could better understand the workings and design of > git. I really appreciate everyone, who gave the links and their > advice. > > I have been reading about GIT for some time now, and have looked at > almost all of the resources plus some others. I think I could say, I > now have a decent conceptual understanding of how GIT works > internally. > > (Also, I understood the chapter about git I read in the book I am > reading, Architecture of Open Source Applications: Volume 2, which I > didn't understand at all, the reason I started this thread). Although > there must definitely be a lot of details and subtle things I may not > understand yet (like branches are nothing but pointers to commits, > wow! btw) > > Now, continuing this discussion, and talking about the implementation > and engineering side of things, I wanted to ask another question and > hence wanted some advice. > > Though I may understand the internal design and high-level > implementation of GIT, I really want to know how it's implemented and > was made, which means reading the SOURCE CODE. > > 1. I don't know how absurd of a quest this is, please enlighten me. It's a lot :) But I don't think that should discourage you. > 2. How do I do it? Where do I start? It's such a BIG repository - and > I am not guessing it's going to be easy. 
I would actually start with "Documentation/MyFirstContribution.txt" and "Documentation/MyFirstObjectWalk.txt" - but I am biased towards those documents. ;) The other subtle hint I would give is that the entry point for almost every command is a function called "cmd_cmdname()", so for example "git status" is at "cmd_status()", usually somewhere in 'builtin/'. > 3. Would someone advise, perhaps, to have a look at an older version > of the source code? rather than the latest one, for some reason. Some other piece of the developer documentation (maybe "SubmittingPatches"?) suggests that you start from the initial commit and understand that part first. I personally don't find this exercise very useful anymore as Git has grown quite a lot since then (and is even primarily in a different language, although we still have some bash scripts here and there). > Again, I would really appreciate it if someone could give their > thoughts on this. In your journeys, also watch out for some commonly used libraries, like calls from "run-command.h" or "parse-options.h", to help you understand how we make stuff work more or less consistently across the codebase, or libraries like "strbuf.h" and "string-list.h" to understand some of the things that we do to make working with C a little less fraught.
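The "cmd_cmdname()" naming hint above can be pictured as a lookup table from subcommand names to functions. The sketch below shows that general pattern in Python; it is not Git's actual code (which is C), and the table, the messages and the two toy commands are all hypothetical stand-ins.

```python
import sys


def cmd_init(args):
    # Toy stand-in for a built-in command; a real one would do actual work.
    print("would initialize a repository; extra args:", args)
    return 0


def cmd_status(args):
    print("would show the working tree status; extra args:", args)
    return 0


# Hypothetical table mapping each subcommand name to its entry-point function.
COMMANDS = {"init": cmd_init, "status": cmd_status}


def cmd_main(argv):
    """Dispatch argv (without the program name) to the matching cmd_<name>()."""
    if not argv:
        print("usage: demo <command> [<args>]", file=sys.stderr)
        return 1
    name, rest = argv[0], argv[1:]
    handler = COMMANDS.get(name)
    if handler is None:
        print(f"demo: '{name}' is not a demo command", file=sys.stderr)
        return 1
    return handler(rest)


if __name__ == "__main__":
    print("exit status:", cmd_main(["status", "--short"]))
```

With a pattern like this, finding the code behind a subcommand means looking up its name in the table and jumping to the function it points at, which is why searching the source tree for "cmd_status" is a quick way to find where "git status" starts.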
* Re: About GIT Internals 2022-06-03 15:25 ` Emily Shaffer @ 2022-06-03 17:15 ` Junio C Hamano 0 siblings, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2022-06-03 17:15 UTC (permalink / raw) To: Emily Shaffer Cc: Aman, Git List, Konstantin Khomoutov, Kerry, Richard, Philip Oakley, git-vger, Ævar Arnfjörð Bjarmason Emily Shaffer <emilyshaffer@google.com> writes: >> 3. Would someone advise, perhaps, to have a look at an older version >> of the source code? rather than the latest one, for some reason. For those who want to learn from source files, I would recommend reading all the files in the very initial commit, cover to cover. e83c5163 (Initial revision of "git", the information manager from hell, 2005-04-07) With only 1244 lines spread across 11 files, it is a short read that can be completed in a single sitting by those who are reasonably fluent in C. It does not have any frills, but the basic data structures to express the important concepts are already there.
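One of the concepts a reader is likely to meet when studying those data structures is content-addressed storage: an object is named by a hash of its content. As a rough illustration, the sketch below computes the name present-day Git gives to a file's content (a "blob") and the path where such an object would sit as a loose file. It assumes a repository using the SHA-1 object format; the function names are illustrative, and the details of the 2005 code may differ from what is shown.

```python
import hashlib
import os


def blob_object_id(content: bytes) -> str:
    """Return the SHA-1 object ID present-day Git assigns to a blob with this content.

    The hash covers a short header ("blob", the size in bytes, a NUL byte)
    followed by the content itself.
    """
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(git_dir: str, object_id: str) -> str:
    """Return where a loose object with this ID lives: objects/<first 2 hex>/<rest>."""
    return os.path.join(git_dir, "objects", object_id[:2], object_id[2:])


if __name__ == "__main__":
    oid = blob_object_id(b"hello world\n")
    print(oid)
    print(loose_object_path(".git", oid))
```

The result can be compared with "git hash-object <file>" on a file with the same content; for empty content both give e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.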
* Re: About GIT Internals 2022-05-25 16:10 About GIT Internals Aman ` (2 preceding siblings ...) 2022-05-25 23:34 ` git-vger @ 2022-05-26 12:45 ` Konstantin Khomoutov 3 siblings, 0 replies; 17+ messages in thread From: Konstantin Khomoutov @ 2022-05-26 12:45 UTC (permalink / raw) To: Aman; +Cc: git In addition to what others have said, I would recommend starting with "The Git Parable" [1] - which is an ideal gentle, non-technical introduction to the concept of distributed version control systems - and then reading "Git from the Bottom Up" [2] and "Git for Computer Scientists", which has already been mentioned. 1. https://tom.preston-werner.com/2009/05/19/the-git-parable.html 2. https://jwiegley.github.io/git-from-the-bottom-up/