* [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug)
@ 2018-03-29 15:18 Johannes Schindelin
  2018-03-29 15:18 ` [PATCH 1/9] git_config_set: fix off-by-two Johannes Schindelin
                   ` (12 more replies)
  0 siblings, 13 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-03-29 15:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
	Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

This patch series started out as a single patch trying to figure out what
it takes to fix that annoying bug that has been reported several times over
the years, where `git config --unset` would leave empty sections behind,
and `git config --add` would not reuse them.

Little did I know that this would turn not only into a full patch to fix this
issue, but into a full-blown series of nine patches.

The first patch is somewhat of a "while at it" bug fix that I first thought
would be a lot more critical than it actually is: It really only affects config
files that start with a section followed immediately (i.e. without a newline)
by a one-letter boolean setting (i.e. without a `= <value>` part). So while it
is a real bug fix, I doubt anybody ever got bitten by it. Nonetheless, I would
be comfortable with this patch going into v2.17.0, even at this late stage. The
final verdict is Junio's, of course.

The next swath of patches add some tests, and adjust one test about which
I already complained at length yesterday, so I will spare you the same ordeal
today. These fixes are pretty straight-forward, and I always try to keep my
added tests as concise as possible, so please tell me if you find a way to make
them smaller (without giving up readability and debuggability).

Finally, the interesting part, where I do two things, essentially (with
preparatory steps for each thing):

1. I add the ability for `git config --unset/--unset-all` to detect that it
   can remove a section that has just become empty (see below for some more
   discussion of what I consider "become empty"), and

2. I add the ability for `git config [--add] key value` to re-use empty
   sections.

Note that the --unset/--unset-all part is the hairy one, and I would love it if
people could concentrate on wrapping their heads around that function, and
obviously tell me how I can change it to make it more readable (or even point
out incorrect behavior).

Now, to the really important part: why does this patch series not conflict with
my very early statements that we cannot simply remove empty sections because we
may end up with stale comments?

Well, the patch in question takes pains to determine *whether* there are any
comments surrounding, or included in, the section. If any are found: previous
behavior. Under the assumption that the user edited the file, we keep it as
intact as possible (see below for some argument against this). If no comments
are found, and let's face it, this is probably *the* common case, as few people
edit their config files by hand these days (neither should they because it is
too easy to end up with an unparseable one), the now-empty section *is* removed.

So what is the argument against this extra care to detect comments? Well, if
you have something like this:

	[section]
		; Here we comment about the variable called snarf
		snarf = froop

and we run `git config --unset section.snarf`, we end up with this config:

	[section]
		; Here we comment about the variable called snarf

which obviously does not make sense. However, that is already established
behavior for quite a few years, and I do not even try to think of a way how
this could be solved.

Johannes Schindelin (9):
  git_config_set: fix off-by-two
  t1300: rename it to reflect that `repo-config` was deprecated
  t1300: avoid relying on a bug
  t1300: remove unreasonable expectation from TODO
  t1300: `--unset-all` can leave an empty section behind (bug)
  git_config_set: simplify the way the section name is remembered
  git config --unset: remove empty sections (in normal situations)
  git_config_set: use do_config_from_file() directly
  git_config_set: reuse empty sections

 config.c                                    | 234 +++++++++++++++++++++++++---
 t/{t1300-repo-config.sh => t1300-config.sh} |  36 ++++-
 2 files changed, 250 insertions(+), 20 deletions(-)
 rename t/{t1300-repo-config.sh => t1300-config.sh} (98%)

base-commit: 03df4959472e7d4b5117bb72ac86e1e2bcf21723
Published-As: https://github.com/dscho/git/releases/tag/empty-config-section-v1
Fetch-It-Via: git fetch https://github.com/dscho/git empty-config-section-v1
-- 
2.16.2.windows.1.26.g2cc3565eb4b

^ permalink raw reply	[flat|nested] 103+ messages in thread
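The comment check described in the cover letter above could be pictured
roughly as follows. This is only a minimal sketch under stated assumptions,
not the code from the series: the names has_comment_line() and
may_remove_section() are hypothetical, only whole-line comments starting
with ';' or '#' are recognized, and comments trailing a value on the same
line are deliberately ignored.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not taken from config.c): returns 1 if any line in
 * text[0..len) starts, after optional blanks, with a comment character
 * (';' or '#'), and 0 otherwise. Trailing comments are not detected.
 */
int has_comment_line(const char *text, size_t len)
{
	size_t i = 0;

	while (i < len) {
		/* skip leading blanks on this line */
		while (i < len && (text[i] == ' ' || text[i] == '\t'))
			i++;
		if (i < len && (text[i] == ';' || text[i] == '#'))
			return 1;
		/* advance to the end of this line */
		while (i < len && text[i] != '\n')
			i++;
		if (i < len)
			i++;	/* step over the newline */
	}
	return 0;
}

/*
 * Hypothetical decision: a section that has just become empty may be
 * dropped only if the text belonging to it contains no comment line.
 */
int may_remove_section(const char *section_text, size_t len)
{
	return !has_comment_line(section_text, len);
}
```

With the [section] example from the cover letter, the remaining comment
line makes may_remove_section() return 0, so the section would be kept; for
a bare "[section]" header it returns 1.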
* [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin
@ 2018-03-29 15:18 ` Johannes Schindelin
  2018-03-29 18:15   ` Stefan Beller
  2018-03-29 15:18 ` [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 103+ messages in thread
From: Johannes Schindelin @ 2018-03-29 15:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
	Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

Currently, we are slightly overzealous when removing an entry from a
config file of this form:

	[abc]a
	[xyz]
		key = value

When calling `git config --unset abc.a` on this file, it leaves this
(invalid) config behind:

	[
	[xyz]
		key = value

The reason is that we try to search for the beginning of the line (or
for the end of the preceding section header on the same line) that
defines abc.a, but as an optimization, we subtract 2 from the offset
pointing just after the definition before we call
find_beginning_of_line(). That function, however, *also* performs that
optimization and promptly fails to find the section header correctly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.c b/config.c
index b0c20e6cb8a..5cc049aaef0 100644
--- a/config.c
+++ b/config.c
@@ -2632,7 +2632,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 			} else
 				copy_end = find_beginning_of_line(
 					contents, contents_sz,
-					store.offset[i]-2, &new_line);
+					store.offset[i], &new_line);
 
 			if (copy_end > 0 && contents[copy_end-1] != '\n')
 				new_line = 1;
-- 
2.16.2.windows.1.26.g2cc3565eb4b

^ permalink raw reply related	[flat|nested] 103+ messages in thread
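The double adjustment described in the commit message above can be
illustrated with a simplified model. This is a sketch, not the function in
config.c: scan_back() is a hypothetical stand-in that, like the function the
message describes, already starts two bytes before the offset it is given,
and it ignores details such as line continuations. It also assumes, for
illustration only, that "the offset pointing just after the definition"
means the byte after the entry's terminating newline.

```c
#include <stddef.h>

/*
 * Simplified, hypothetical model of a backward scan for the start of an
 * entry. It walks back from end - 2 until it hits a newline or the start
 * of the buffer, remembering the leftmost '=' and ']' it passes. If a ']'
 * precedes any '=' on that line, the entry shares the line with a section
 * header, and everything up to and including the ']' is kept.
 * Precondition: 2 <= end <= size.
 */
size_t scan_back(const char *buf, size_t size, size_t end, int *found_bracket)
{
	size_t equal = size, bracket = size;	/* "not seen" markers */
	size_t pos = end - 2;			/* the built-in adjustment */

	for (; pos > 0 && buf[pos] != '\n'; pos--) {
		if (buf[pos] == '=')
			equal = pos;
		else if (buf[pos] == ']')
			bracket = pos;
	}
	if (bracket < equal) {
		*found_bracket = 1;
		return bracket + 1;	/* keep the header, drop the entry */
	}
	*found_bracket = 0;
	return pos + 1;
}
```

In this model, for the buffer "[abc]a\n[xyz]\n..." the entry ends at offset
7. Passing 7 returns 5, so "[abc]" is kept. Passing 7 - 2, as the caller
did before the patch, starts the scan at offset 3, never sees the ']' and
returns 1, so only "[" is kept, which matches the broken output shown in
the commit message.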
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-29 15:18 ` [PATCH 1/9] git_config_set: fix off-by-two Johannes Schindelin
@ 2018-03-29 18:15   ` Stefan Beller
  2018-03-29 19:41     ` Jeff King
  0 siblings, 1 reply; 103+ messages in thread
From: Stefan Beller @ 2018-03-29 18:15 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
	Ævar Arnfjörð Bjarmason, Jason Frey, Philip Oakley

On Thu, Mar 29, 2018 at 8:18 AM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> Currently, we are slightly overzealous when removing an entry from a
> config file of this form:
>
>         [abc]a
>         [xyz]
>                 key = value
>
> When calling `git config --unset abc.a` on this file, it leaves this
> (invalid) config behind:
>
>         [
>         [xyz]
>                 key = value
>
> The reason is that we try to search for the beginning of the line (or
> for the end of the preceding section header on the same line) that
> defines abc.a, but as an optimization, we subtract 2 from the offset
> pointing just after the definition before we call
> find_beginning_of_line(). That function, however, *also* performs that
> optimization and promptly fails to find the section header correctly.

This commit message would be more convincing if we had it in test form.

    [abc]a

is not written by Git, but would be written from an outside tool or person
and we barely cope with it?

Thanks,
Stefan

>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  config.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/config.c b/config.c
> index b0c20e6cb8a..5cc049aaef0 100644
> --- a/config.c
> +++ b/config.c
> @@ -2632,7 +2632,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
>                         } else
>                                 copy_end = find_beginning_of_line(
>                                         contents, contents_sz,
> -                                       store.offset[i]-2, &new_line);
> +                                       store.offset[i], &new_line);
>
>                         if (copy_end > 0 && contents[copy_end-1] != '\n')
>                                 new_line = 1;
> --
> 2.16.2.windows.1.26.g2cc3565eb4b
>
>

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-29 18:15   ` Stefan Beller
@ 2018-03-29 19:41     ` Jeff King
  2018-03-30 12:32       ` Johannes Schindelin
  0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2018-03-29 19:41 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin, git, Junio C Hamano, Thomas Rast, Phil Haack,
	Ævar Arnfjörð Bjarmason, Jason Frey, Philip Oakley

On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:

> > When calling `git config --unset abc.a` on this file, it leaves this
> > (invalid) config behind:
> >
> >         [
> >         [xyz]
> >                 key = value
> >
> > The reason is that we try to search for the beginning of the line (or
> > for the end of the preceding section header on the same line) that
> > defines abc.a, but as an optimization, we subtract 2 from the offset
> > pointing just after the definition before we call
> > find_beginning_of_line(). That function, however, *also* performs that
> > optimization and promptly fails to find the section header correctly.
>
> This commit message would be more convincing if we had it in test form.

I agree a test might be nice. But I don't find the commit message
unconvincing at all. It explains pretty clearly why the bug occurs, and
you can verify it by looking at find_beginning_of_line.

>     [abc]a
>
> is not written by Git, but would be written from an outside tool or person
> and we barely cope with it?

Yes, I don't think git would ever write onto the same line. But clearly
we should handle anything that's syntactically valid.

-Peff

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-29 19:41     ` Jeff King
@ 2018-03-30 12:32       ` Johannes Schindelin
  2018-03-30 14:15         ` Ævar Arnfjörð Bjarmason
                           ` (2 more replies)
  0 siblings, 3 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-03-30 12:32 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack,
	Ævar Arnfjörð Bjarmason, Jason Frey, Philip Oakley

Hi,

On Thu, 29 Mar 2018, Jeff King wrote:

> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
>
> > > When calling `git config --unset abc.a` on this file, it leaves this
> > > (invalid) config behind:
> > >
> > >         [
> > >         [xyz]
> > >                 key = value
> > >
> > > The reason is that we try to search for the beginning of the line (or
> > > for the end of the preceding section header on the same line) that
> > > defines abc.a, but as an optimization, we subtract 2 from the offset
> > > pointing just after the definition before we call
> > > find_beginning_of_line(). That function, however, *also* performs that
> > > optimization and promptly fails to find the section header correctly.
> >
> > This commit message would be more convincing if we had it in test form.
>
> I agree a test might be nice. But I don't find the commit message
> unconvincing at all. It explains pretty clearly why the bug occurs, and
> you can verify it by looking at find_beginning_of_line.
>
> >     [abc]a
> >
> > is not written by Git, but would be written from an outside tool or person
> > and we barely cope with it?
>
> Yes, I don't think git would ever write onto the same line. But clearly
> we should handle anything that's syntactically valid.

I was tempted to add the test case, because it is easy to test it.

But I then decided *not* to add it. Why? Testing is a balance between "can
do" and "need to do".

Can you imagine that I did *not* run the entire test suite before
submitting this patch series, because it takes an incredible *90 minutes*
to run *on a fast Windows machine*?

Seriously, this is hurting me.

I do not complain about this due to some mental illness forcing me to do
it. I complain about this so often *because it slows me down*, you gentle
people. And you don't seem to care, at least the test suite gets
noticeably worse by the month. I frankly do not know what to do about this,
as you keep adding and adding and it gets less and less feasible for me to
run the full test suite. I seem to be totally unable to get through to you
with the message that this is a real problem with a real need to get
fixed.

So with this in mind, I do not want to add a test case for a concocted
example that won't affect anybody except users who *want* to trigger this
bug.

I hope you agree,
Dscho

P.S.: Of course I ran the entire test suite. Not on Windows, but in a
Linux VM, because Linux is what Git is fine-tuned for, most obviously so.
An alien digging up ancient Earth history in the far future might be
tempted to assume that Git was developed to develop Linux which was
developed to develop Git, and then ask herself why humans bothered at all.

I actually ran the entire test suite on Linux on every single patch, via
`git rebase -x "make -j15 DEVELOPER=1 test" @{u}`, as I usually do before
submitting a patch series. And it *did* find an obscure bug in an earlier
iteration, where t5512-ls-remote.sh demonstrated that looking at only one
entry at a time is not enough: `git config --unset-all
uploadpack.hiderefs` *also* needs to remove the now-empty section, because
we might end up with the empty sections in the wrong order, and the order
of [transfer] and [uploadpack] *matters* if the transfer.hiderefs setting
is negative and the uploadpack.hiderefs setting is positive, as is the
case in 'overrides work between mixed transfer/upload-pack hideRefs'.

(Side-note: this looks like a pretty obvious design bug to me, as there is
*no tooling* to switch around the order of these settings. Even worse: if
somebody gets instructions to add those settings, and there is already a
[transfer] section in the config: you're out of luck! You will have to
*know* that the order matters, *and add a second [transfer] section
manually*!)

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-30 12:32       ` Johannes Schindelin
@ 2018-03-30 14:15         ` Ævar Arnfjörð Bjarmason
  2018-03-30 16:24           ` Junio C Hamano
  2018-03-30 16:36         ` Duy Nguyen
  2018-03-30 18:45         ` A potential approach to making tests faster on Windows Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 103+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-03-30 14:15 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast,
	Phil Haack, Jason Frey, Philip Oakley

On Fri, Mar 30 2018, Johannes Schindelin wrote:

> On Thu, 29 Mar 2018, Jeff King wrote:
>
>> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
>>
>> > > When calling `git config --unset abc.a` on this file, it leaves this
>> > > (invalid) config behind:
>> > >
>> > >         [
>> > >         [xyz]
>> > >                 key = value
>> > >
>> > > The reason is that we try to search for the beginning of the line (or
>> > > for the end of the preceding section header on the same line) that
>> > > defines abc.a, but as an optimization, we subtract 2 from the offset
>> > > pointing just after the definition before we call
>> > > find_beginning_of_line(). That function, however, *also* performs that
>> > > optimization and promptly fails to find the section header correctly.
>> >
>> > This commit message would be more convincing if we had it in test form.
>>
>> I agree a test might be nice. But I don't find the commit message
>> unconvincing at all. It explains pretty clearly why the bug occurs, and
>> you can verify it by looking at find_beginning_of_line.
>>
>> >     [abc]a
>> >
>> > is not written by Git, but would be written from an outside tool or person
>> > and we barely cope with it?
>>
>> Yes, I don't think git would ever write onto the same line. But clearly
>> we should handle anything that's syntactically valid.
>
> I was tempted to add the test case, because it is easy to test it.
>
> But I then decided *not* to add it. Why? Testing is a balance between "can
> do" and "need to do".
>
> Can you imagine that I did *not* run the entire test suite before
> submitting this patch series, because it takes an incredible *90 minutes*
> to run *on a fast Windows machine*?

I think if it's worth fixing it's worth testing for, a future change to
the config code could easily introduce a regression for this, and
particularly in this type of code obscure edge cases like this can point
to bugs elsewhere.

We have the EXPENSIVE_ON_WINDOWS prerequisite already in master from an
earlier series of mine, maybe we could use that here, or add some other
prereq like OVERLY_EXHAUSTIVE which by default could depend on
EXPENSIVE_ON_WINDOWS, i.e. we'd have a set of overly pedantic tests that
we skip on Windows by default, as there's no reason to suspect they're
platform-dependent, but we'd like to know if they regress.

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-30 14:15         ` Ævar Arnfjörð Bjarmason
@ 2018-03-30 16:24           ` Junio C Hamano
  2018-03-30 18:44             ` Johannes Schindelin
  0 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2018-03-30 16:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Jeff King, Stefan Beller, git, Thomas Rast,
	Phil Haack, Jason Frey, Philip Oakley

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I think if it's worth fixing it's worth testing for, a future change to
> the config code could easily introduce a regression for this, and
> particularly in this type of code obscure edge cases like this can point
> to bugs elsewhere.

Yup.  "The port to my favourite platform is too slow, and everybody
should learn to live with thin test coverage" would not be a good
strategy in the longer run.

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-30 16:24           ` Junio C Hamano
@ 2018-03-30 18:44             ` Johannes Schindelin
  2018-03-30 19:00               ` Junio C Hamano
  0 siblings, 1 reply; 103+ messages in thread
From: Johannes Schindelin @ 2018-03-30 18:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, Stefan Beller, git,
	Thomas Rast, Phil Haack, Jason Frey, Philip Oakley

[-- Attachment #1: Type: text/plain, Size: 859 bytes --]

Hi Junio,

On Fri, 30 Mar 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> > I think if it's worth fixing it's worth testing for, a future change to
> > the config code could easily introduce a regression for this, and
> > particularly in this type of code obscure edge cases like this can point
> > to bugs elsewhere.
>
> Yup.  "The port to my favourite platform is too slow, and everybody
> should learn to live with thin test coverage" would not be a good
> strategy in the longer run.

What would be a *really* good strategy is: "Oh, there is a problem! Let's
acknowledge it and try to come up with a solution rather than a
work-around".

EXPENSIVE_ON_WINDOWS is a symptom. Not a solution.

And you are actively hurting my ability to contribute, I hope you are
aware of that.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-30 18:44             ` Johannes Schindelin
@ 2018-03-30 19:00               ` Junio C Hamano
  2018-04-03  9:31                 ` Johannes Schindelin
  0 siblings, 1 reply; 103+ messages in thread
From: Junio C Hamano @ 2018-03-30 19:00 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, Stefan Beller, git,
	Thomas Rast, Phil Haack, Jason Frey, Philip Oakley

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> What would be a *really* good strategy is: "Oh, there is a problem! Let's
> acknowledge it and try to come up with a solution rather than a
> work-around".
>
> EXPENSIVE_ON_WINDOWS is a symptom. Not a solution.

Yes, it is a workaround.  Making shell faster on windows would of
course be one possible solution to make t/t*.sh scripts go faster
;-)  Or update parts of t/t*.sh so that the equivalent test coverage
can be kept while running making them go faster on Windows.

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-30 19:00               ` Junio C Hamano
@ 2018-04-03  9:31                 ` Johannes Schindelin
  2018-04-03 15:29                   ` Duy Nguyen
  2018-04-08 23:12                   ` Junio C Hamano
  0 siblings, 2 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-03 9:31 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, Stefan Beller, git,
	Thomas Rast, Phil Haack, Jason Frey, Philip Oakley

Hi Junio,

On Fri, 30 Mar 2018, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > What would be a *really* good strategy is: "Oh, there is a problem! Let's
> > acknowledge it and try to come up with a solution rather than a
> > work-around".
> >
> > EXPENSIVE_ON_WINDOWS is a symptom. Not a solution.
>
> Yes, it is a workaround.  Making shell faster on windows would of
> course be one possible solution to make t/t*.sh scripts go faster
> ;-)  Or update parts of t/t*.sh so that the equivalent test coverage
> can be kept while running making them go faster on Windows.

What makes you think that I did not try my hardest for around 812 hours in
total so far to make the shell faster?

Ciao,
Dscho

P.S.: I do not have the actual number of hours I spent on both MSYS2's
runtime and BusyBox and Git to find *some* way to make it faster, as my
time-keeping is organized in a different way that makes it hard to query
the overall number. But I can state with confidence that it is easily in
the 200-300 hour range, if not beyond that.

It is very frustrating to spend that much time with only little gains here
and there (and BusyBox-w32 is simply not robust enough yet, apart from
also not showing a significant improvement in performance).

Please do not make this experience even more frustrating. Thanks.

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-04-03  9:31                 ` Johannes Schindelin
@ 2018-04-03 15:29                   ` Duy Nguyen
  2018-04-03 15:47                     ` Johannes Schindelin
  2018-04-08 23:12                   ` Junio C Hamano
  1 sibling, 1 reply; 103+ messages in thread
From: Duy Nguyen @ 2018-04-03 15:29 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Jeff King,
	Stefan Beller, git, Thomas Rast, Phil Haack, Jason Frey,
	Philip Oakley

On Tue, Apr 3, 2018 at 11:31 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> It is very frustrating to spend that much time with only little gains here
> and there (and BusyBox-w32 is simply not robust enough yet, apart from
> also not showing a significant improvement in performance).

You still use busybox-w32? It's amazing that people still use it after
the linux subsystem comes. busybox has a lot of commands built in
(i.e. no new processes) and unless rmyorston did something more, the
"fork" in ash shell should be as cheap as it could be: it simply
serializes data and sends to the new process.

If performance does not improve, I guess the process creation cost
dominates. There's not much we could do except moving away from the
zillion processes test framework: either something C-based or another
scripting language (ok I don't want to bring this up again)
-- 
Duy

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-04-03 15:29                   ` Duy Nguyen
@ 2018-04-03 15:47                     ` Johannes Schindelin
  0 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-03 15:47 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Jeff King,
	Stefan Beller, git, Thomas Rast, Phil Haack, Jason Frey,
	Philip Oakley

Hi Duy,

On Tue, 3 Apr 2018, Duy Nguyen wrote:

> On Tue, Apr 3, 2018 at 11:31 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > It is very frustrating to spend that much time with only little gains
> > here and there (and BusyBox-w32 is simply not robust enough yet, apart
> > from also not showing a significant improvement in performance).
>
> You still use busybox-w32?

Yes.

> It's amazing that people still use it after the linux subsystem comes.

I use WSL myself. But you need to realize that WSL is only available on
Windows 10 (many users still use Windows 7), and it is a little tricky to
get to work in Docker containers, I heard, so I did not even try.

Also, many Windows users are unfamiliar with Linux, and forcing them to
learn and install a Linux distribution on their machine when all they want
is to use Git is a bit... much.

> busybox has a lot of commands built in (i.e. no new processes) and
> unless rmyorston did something more, the "fork" in ash shell should be
> as cheap as it could be: it simply serializes data and sends to the new
> process.

Yes, I had the pleasure of reading that code. It might surprise you, but I
had to come up with quite a bit of patches to make the test suite pass.
And it does not really pass, as I randomly get hangs...

> If performance does not improve, I guess the process creation cost
> dominates. There's not much we could do except moving away from the
> zillion processes test framework: either something C-based or another
> scripting language (ok I don't want to bring this up again)

There is no need to guess. I now have .pdb files, and once I have a good
example of a shell script construct that is particularly slow, and once I
find some time to work on it, I will dig into the bottlenecks.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-04-03  9:31                 ` Johannes Schindelin
  2018-04-03 15:29                   ` Duy Nguyen
@ 2018-04-08 23:12                   ` Junio C Hamano
  1 sibling, 0 replies; 103+ messages in thread
From: Junio C Hamano @ 2018-04-08 23:12 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, Stefan Beller, git,
	Thomas Rast, Phil Haack, Jason Frey, Philip Oakley

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Yes, it is a workaround.  Making shell faster on windows would of
>> course be one possible solution to make t/t*.sh scripts go faster
>> ;-)  Or update parts of t/t*.sh so that the equivalent test coverage
>> can be kept while running making them go faster on Windows.
>
> What makes you think that I did not try my hardest for around 812 hours in
> total so far to make the shell faster?

Nowhere in these four lines I ever said that I think you did not
work hard to solve the performance issues you have.

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-30 12:32       ` Johannes Schindelin
  2018-03-30 14:15         ` Ævar Arnfjörð Bjarmason
@ 2018-03-30 16:36         ` Duy Nguyen
  2018-03-30 18:53           ` Johannes Schindelin
  2018-03-30 18:45         ` A potential approach to making tests faster on Windows Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 103+ messages in thread
From: Duy Nguyen @ 2018-03-30 16:36 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast,
	Phil Haack, Ævar Arnfjörð Bjarmason, Jason Frey,
	Philip Oakley

On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 29 Mar 2018, Jeff King wrote:
>
>> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
>>
>> > > When calling `git config --unset abc.a` on this file, it leaves this
>> > > (invalid) config behind:
>> > >
>> > >         [
>> > >         [xyz]
>> > >                 key = value
>> > >
>> > > The reason is that we try to search for the beginning of the line (or
>> > > for the end of the preceding section header on the same line) that
>> > > defines abc.a, but as an optimization, we subtract 2 from the offset
>> > > pointing just after the definition before we call
>> > > find_beginning_of_line(). That function, however, *also* performs that
>> > > optimization and promptly fails to find the section header correctly.
>> >
>> > This commit message would be more convincing if we had it in test form.
>>
>> I agree a test might be nice. But I don't find the commit message
>> unconvincing at all. It explains pretty clearly why the bug occurs, and
>> you can verify it by looking at find_beginning_of_line.
>>
>> >     [abc]a
>> >
>> > is not written by Git, but would be written from an outside tool or person
>> > and we barely cope with it?
>>
>> Yes, I don't think git would ever write onto the same line. But clearly
>> we should handle anything that's syntactically valid.
>
> I was tempted to add the test case, because it is easy to test it.
>
> But I then decided *not* to add it. Why? Testing is a balance between "can
> do" and "need to do".
>
> Can you imagine that I did *not* run the entire test suite before
> submitting this patch series, because it takes an incredible *90 minutes*
> to run *on a fast Windows machine*?

What's wrong with firing up a new worktree, run the test suite there
and go back to do something else so you won't waste time just waiting
for test results and submit? Sure there is a mental overhead for
switching tasks, but at 90 minutes, I think it's worth doing.
-- 
Duy

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two
  2018-03-30 16:36         ` Duy Nguyen
@ 2018-03-30 18:53           ` Johannes Schindelin
  2018-03-30 19:16             ` Duy Nguyen
  0 siblings, 1 reply; 103+ messages in thread
From: Johannes Schindelin @ 2018-03-30 18:53 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast,
	Phil Haack, Ævar Arnfjörð Bjarmason, Jason Frey,
	Philip Oakley

Hi Duy,

On Fri, 30 Mar 2018, Duy Nguyen wrote:

> On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > On Thu, 29 Mar 2018, Jeff King wrote:
> >
> >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote:
> >>
> >> > > When calling `git config --unset abc.a` on this file, it leaves this
> >> > > (invalid) config behind:
> >> > >
> >> > >         [
> >> > >         [xyz]
> >> > >                 key = value
> >> > >
> >> > > The reason is that we try to search for the beginning of the line (or
> >> > > for the end of the preceding section header on the same line) that
> >> > > defines abc.a, but as an optimization, we subtract 2 from the offset
> >> > > pointing just after the definition before we call
> >> > > find_beginning_of_line(). That function, however, *also* performs that
> >> > > optimization and promptly fails to find the section header correctly.
> >> >
> >> > This commit message would be more convincing if we had it in test form.
> >>
> >> I agree a test might be nice. But I don't find the commit message
> >> unconvincing at all. It explains pretty clearly why the bug occurs, and
> >> you can verify it by looking at find_beginning_of_line.
> >>
> >> >     [abc]a
> >> >
> >> > is not written by Git, but would be written from an outside tool or person
> >> > and we barely cope with it?
> >>
> >> Yes, I don't think git would ever write onto the same line. But clearly
> >> we should handle anything that's syntactically valid.
> >
> > I was tempted to add the test case, because it is easy to test it.
> >
> > But I then decided *not* to add it. Why? Testing is a balance between "can
> > do" and "need to do".
> >
> > Can you imagine that I did *not* run the entire test suite before
> > submitting this patch series, because it takes an incredible *90 minutes*
> > to run *on a fast Windows machine*?
>
> What's wrong with firing up a new worktree, run the test suite there
> and go back to do something else so you won't waste time just waiting
> for test results and submit? Sure there is a mental overhead for
> switching tasks, but at 90 minutes, I think it's worth doing.

Of course it is worth doing. That's why I often test the end result on
Windows (waiting those 90 minutes, but I do not fire up a new worktree, I
use my cloud privilege and let Azure/Visual Studio Team Services do the
work for me, without slowing down my laptop).

What I would love to do, however, would be to test all intermediate
patches, too, as that often shows a problem with my frequent reorderings
via interactive rebases.

And 90 minutes times 9 is... 13 hours and 30 minutes. That's a really long
time.

I think the best course of action would be to incrementally do away with
the shell scripted test framework, in the way you outlined earlier this
year. This would *also* buy us a wealth of other benefits, such as better
control over the parallelization, resource usage, etc.

It would also finally make it easier to introduce something like "smart
testing" where code coverage could be computed (this works only for C
code, of course, not for the many scripted parts of core Git), and a diff
could be inspected to discover which tests *really* need to be run,
skipping the tests that would only touch unchanged code.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH 1/9] git_config_set: fix off-by-two 2018-03-30 18:53 ` Johannes Schindelin @ 2018-03-30 19:16 ` Duy Nguyen 0 siblings, 0 replies; 103+ messages in thread From: Duy Nguyen @ 2018-03-30 19:16 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Jason Frey, Philip Oakley On Fri, Mar 30, 2018 at 8:53 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > I think the best course of action would be to incrementally do away with > the shell scripted test framework, in the way you outlined earlier this > year. This would *also* buy us a wealth of other benefits, such as better > control over the parallelization, resource usage, etc. If you have not noticed, I'm a bit busy with all sorts of stuff and probably won't continue that work. And since it affects you the most, you probably have the best motive to tackle it ;-) I don't think complaining about slow test suite helps. And avoiding adding more tests because of that definitely does not help. > It would also finally make it easier to introduce something like "smart > testing" where code coverage could be computed (this works only for C > code, of course, not for the many scripted parts of core Git), and a diff > could be inspected to discover which tests *really* need to be run, > skipping the tests that would only touch unchanged code. -- Duy ^ permalink raw reply [flat|nested] 103+ messages in thread
* A potential approach to making tests faster on Windows 2018-03-30 12:32 ` Johannes Schindelin 2018-03-30 14:15 ` Ævar Arnfjörð Bjarmason 2018-03-30 16:36 ` Duy Nguyen @ 2018-03-30 18:45 ` Ævar Arnfjörð Bjarmason 2018-03-30 18:58 ` Junio C Hamano ` (2 more replies) 2 siblings, 3 replies; 103+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-03-30 18:45 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen On Fri, Mar 30 2018, Johannes Schindelin wrote [expressing frustrations about Windows test suite slowness]: I've wondered for a while whether it wouldn't be a viable approach to make something like an interpreter for our test suite to get around this problem, i.e. much of it's very repetitive and just using a few shell functions we've defined, what if we had C equivalents of those? Duy had a WIP patch set a while ago to add C test suite support, but I thought what if we turn that inside-out, and instead have a shell interpreter that knows about the likes of test_cmp, and executes them directly? 
Here's proof of concept as a patch to the dash shell:

    u dash (debian/master=) $ git diff
    diff --git a/src/builtins.def.in b/src/builtins.def.in
    index 4441fe4..b214a17 100644
    --- a/src/builtins.def.in
    +++ b/src/builtins.def.in
    @@ -92,3 +92,4 @@ ulimitcmd ulimit
     #endif
     testcmd test [
     killcmd -u kill
    +testcmpcmd test_cmp
    diff --git a/src/jobs.c b/src/jobs.c
    index c2c2332..905563f 100644
    --- a/src/jobs.c
    +++ b/src/jobs.c
    @@ -1502,3 +1502,12 @@ getstatus(struct job *job) {
     jobno(job), job->nprocs, status, retval));
     return retval;
     }
    +
    +#include <stdio.h>
    +int
    +testcmpcmd(argc, argv)
    + int argc;
    + char **argv;
    +{
    + fprintf(stderr, "Got %d arguments\n", argc);
    +}

I just added that to jobs.c because it was easiest, then test_cmp
becomes a builtin:

    u dash (debian/master=) $ src/dash -c 'type test_cmp'
    test_cmp is a shell builtin
    u dash (debian/master=) $ src/dash -c 'echo foo && test_cmp 1 2 3'
    foo
    Got 4 arguments

I.e. it's really easy to add new built in commands to the dash shell
(and probably other shells, but dash is really small & fast).

We could carry some patch like that to dash, and also patch it so
test-lib.sh could know that that was our own custom shell, and we'd then
skip defining functions like test_cmp, and instead use that new builtin.

Similarly, it could then be linked to our own binaries, and the
test-tool would be a builtin that would appropriately dispatch, and we
could even eventually make "git" a shell builtin.

I don't have time or interest to work on this now, but thought it was
interesting to share. This assumes that something in shellscript like:

    while echo foo; do echo bar; done

Is no slower on Windows than *nix, since it's purely using built-ins, as
opposed to something that would shell out.

^ permalink raw reply [flat|nested] 103+ messages in thread
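[Editorial sketch] The proof-of-concept builtin in the patch above only prints its argument count. As a rough illustration of what a working body might look like, the C below compares two files byte for byte and returns a shell-style exit status. This is an assumption made for illustration and is not part of the patch, of dash, or of Git: `files_identical` is a hypothetical helper name, the exit codes (0 same, 1 different, 2 error) are a choice made here, and unlike a diff-based comparison it prints nothing about where the files differ.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from dash or Git): compare two files byte for
 * byte. Returns 0 if they are identical, 1 if they differ, and 2 if either
 * file cannot be opened or read.
 */
int files_identical(const char *path_a, const char *path_b)
{
	FILE *a = fopen(path_a, "rb");
	FILE *b = fopen(path_b, "rb");
	char buf_a[4096], buf_b[4096];
	int result = 0;

	if (!a || !b) {
		result = 2;
		goto done;
	}
	for (;;) {
		size_t n_a = fread(buf_a, 1, sizeof(buf_a), a);
		size_t n_b = fread(buf_b, 1, sizeof(buf_b), b);

		/* Different chunk sizes or contents mean the files differ. */
		if (n_a != n_b || memcmp(buf_a, buf_b, n_a)) {
			result = 1;
			break;
		}
		/* Both streams hit EOF at the same point: identical. */
		if (n_a == 0)
			break;
	}
	if (ferror(a) || ferror(b))
		result = 2;
done:
	if (a)
		fclose(a);
	if (b)
		fclose(b);
	return result;
}

/* Same (argc, argv) calling convention as the testcmpcmd() stub above. */
int testcmpcmd(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: test_cmp <expected> <actual>\n");
		return 2;
	}
	return files_identical(argv[1], argv[2]);
}
```

With a body along these lines, `test_cmp expect actual` inside the patched shell would compare the files without forking a separate process, which is the effect the proposal is after. Whether that saves measurable time on Windows is exactly what the thread leaves open.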
* Re: A potential approach to making tests faster on Windows 2018-03-30 18:45 ` A potential approach to making tests faster on Windows Ævar Arnfjörð Bjarmason @ 2018-03-30 18:58 ` Junio C Hamano 2018-03-30 19:16 ` Jeff King 2018-04-03 11:43 ` Johannes Schindelin 2 siblings, 0 replies; 103+ messages in thread From: Junio C Hamano @ 2018-03-30 18:58 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Johannes Schindelin, Jeff King, Stefan Beller, git, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > On Fri, Mar 30 2018, Johannes Schindelin wrote [expressing frustrations > about Windows test suite slowness]: > > I've wondered for a while whether it wouldn't be a viable approach to > make something like an interpreter for our test suite to get around this > problem, i.e. much of it's very repetitive and just using a few shell > functions we've defined, what if we had C equivalents of those? > ... > > I don't have time or interest to work on this now, but thought it was > interesting to share. This assumes that something in shellscript like: > > while echo foo; do echo bar; done > > Is no slower on Windows than *nix, since it's purely using built-ins, as > opposed to something that would shell out. That's interesting; it certainly is appreciated to be constructive to find a usable solution. ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-03-30 18:45 ` A potential approach to making tests faster on Windows Ævar Arnfjörð Bjarmason 2018-03-30 18:58 ` Junio C Hamano @ 2018-03-30 19:16 ` Jeff King 2018-04-03 9:49 ` Johannes Schindelin 2018-04-03 11:43 ` Johannes Schindelin 2 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-30 19:16 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Johannes Schindelin, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen On Fri, Mar 30, 2018 at 08:45:45PM +0200, Ævar Arnfjörð Bjarmason wrote: > I've wondered for a while whether it wouldn't be a viable approach to > make something like an interpreter for our test suite to get around this > problem, i.e. much of it's very repetitive and just using a few shell > functions we've defined, what if we had C equivalents of those? I've had a similar thought, though I wonder how far we could get with just shell. I even tried it out with test_cmp: https://public-inbox.org/git/20161020215647.5no7effvutwep2xt@sigill.intra.peff.net/ But Johannes Sixt pointed out that they already do this (see mingw_test_cmp in test-lib-functions). I also tried to explore a few numbers about process invocations to see if running shell commands is the problem: https://public-inbox.org/git/20161020123111.qnbsainul2g54z4z@sigill.intra.peff.net/ There was some discussion there about whether the problem is programs being exec'd, or if it's forks due to subshells. And if it is programs being exec'd, whether it's shell programs or if it is simply that we exec Git a huge number of times. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-03-30 19:16 ` Jeff King @ 2018-04-03 9:49 ` Johannes Schindelin 2018-04-03 11:28 ` Ævar Arnfjörð Bjarmason 2018-04-03 21:36 ` Eric Sunshine 0 siblings, 2 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 9:49 UTC (permalink / raw) To: Jeff King Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen [-- Attachment #1: Type: text/plain, Size: 3321 bytes --] Hi Peff, On Fri, 30 Mar 2018, Jeff King wrote: > On Fri, Mar 30, 2018 at 08:45:45PM +0200, Ævar Arnfjörð Bjarmason wrote: > > > I've wondered for a while whether it wouldn't be a viable approach to > > make something like an interpreter for our test suite to get around > > this problem, i.e. much of it's very repetitive and just using a few > > shell functions we've defined, what if we had C equivalents of those? > > I've had a similar thought, though I wonder how far we could get with > just shell. I even tried it out with test_cmp: > > https://public-inbox.org/git/20161020215647.5no7effvutwep2xt@sigill.intra.peff.net/ > > But Johannes Sixt pointed out that they already do this (see > mingw_test_cmp in test-lib-functions). Right. Additionally, I noticed that that simple loop in shell is *also* very slow on Windows (at least in the MSYS2 Bash we use in Git for Windows). Under the assumption that it is the Bash with the loop that uses too much POSIX emulation to make it fast, I re-implemented mingw_test_cmp in pure C: https://github.com/git-for-windows/git/commit/8a96ef63a0083ba02305dfeef6ff92c31b4fd7c3 Unfortunately, it did not produce any noticeable speed improvement, so I did not even finish the conversion (when the cmp fails, it does not show you any helpful diff yet). 
> I also tried to explore a few numbers about process invocations to see > if running shell commands is the problem: > > https://public-inbox.org/git/20161020123111.qnbsainul2g54z4z@sigill.intra.peff.net/ This mail was still in my inbox, in want of me saying something about this. My main evidence that shell scripts on macOS are slower than on Linux was the difference of the improvement incurred by moving more things from git-rebase--interactive.sh into sequencer.c: Linux saw an improvement only of about 3x, while macOS saw an improvement of 4x, IIRC. If I don't remember the absolute numbers correctly, at least I vividly remember the qualitative difference: It was noticeable. > There was some discussion there about whether the problem is programs > being exec'd, or if it's forks due to subshells. And if it is programs > being exec'd, whether it's shell programs or if it is simply that we > exec Git a huge number of times. One large problem there is that it is really hard to analyze performance over such a heterogeneous code base: part C, part Perl, part Unix shell (and of course, when you say Unix shell, you imply dozens of separate tools that *also* need to be performance-profiled). I have very good profiling tools for C, I saw some built-in performance profiling for Perl, but there is no good performance profiling for Unix shell scripting: I doubt that the inventors of shell scripting had speed-critical production code in mind when they came up with the idea. I did invest dozens of hours earlier this year trying to obtain debug symbols in .pdb format (ready for Visual Studio's really envy-inducing performance profiler) also for the MSYS2 runtime and Bash, so that I could analyze what makes things so awfully slow in Git's test suite. The only problem is that I also have to do other things in my day-job, so that project waits patiently until I have some time to come back to that project. Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-04-03 9:49 ` Johannes Schindelin @ 2018-04-03 11:28 ` Ævar Arnfjörð Bjarmason 2018-04-03 15:55 ` Johannes Schindelin 2018-04-03 21:36 ` Eric Sunshine 1 sibling, 1 reply; 103+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-04-03 11:28 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen On Tue, Apr 03 2018, Johannes Schindelin wrote: > Hi Peff, > > On Fri, 30 Mar 2018, Jeff King wrote: > >> On Fri, Mar 30, 2018 at 08:45:45PM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> > I've wondered for a while whether it wouldn't be a viable approach to >> > make something like an interpreter for our test suite to get around >> > this problem, i.e. much of it's very repetitive and just using a few >> > shell functions we've defined, what if we had C equivalents of those? >> >> I've had a similar thought, though I wonder how far we could get with >> just shell. I even tried it out with test_cmp: >> >> https://public-inbox.org/git/20161020215647.5no7effvutwep2xt@sigill.intra.peff.net/ >> >> But Johannes Sixt pointed out that they already do this (see >> mingw_test_cmp in test-lib-functions). > > Right. > > Additionally, I noticed that that simple loop in shell is *also* very slow on > Windows (at least in the MSYS2 Bash we use in Git for Windows). > > Under the assumption that it is the Bash with the loop that uses too much > POSIX emulation to make it fast, I re-implemented mingw_test_cmp in pure > C: > https://github.com/git-for-windows/git/commit/8a96ef63a0083ba02305dfeef6ff92c31b4fd7c3 > > Unfortunately, it did not produce any noticeable speed improvement, so I > did not even finish the conversion (when the cmp fails, it does not show > you any helpful diff yet). 
I don't know the details of Windows, but it sounds like you're trying to performance test two things that are going to suck for different reasons. On one hand the pure-*.sh comparison would be slower than just diff on *nix, because it's not C, so you'll get that slowness, but gain in not having to fork another process. On the other hand the C implementation is going to be really fast, but it's going to take you a long time to get it started on Windows. Which is why I think it would be really interesting to see the third approach I suggested, i.e. hack the shell to make the test_cmp a builtin and test that. Then you won't fork, but will get the advantage of your fast C codepath. Also, even if test_cmp is much faster, Peff's results over at https://public-inbox.org/git/20161020123111.qnbsainul2g54z4z@sigill.intra.peff.net/ suggest that you may not notice anyway. Aside from the points raised there about the bin wrappers it seems the easiest wins are having a builtin version of "rm" and "cat". Are you able to compile dash on Windows with some modification of the patch I sent upthread? If not it doesn't seem too hard to do the same trick for bash, see: git grep '\balias\b' -- builtins Once you have bash.git checked out. I.e. you add a bit of Makefile boilerplate and you should be able to get a new builtin. >> I also tried to explore a few numbers about process invocations to see >> if running shell commands is the problem: >> >> https://public-inbox.org/git/20161020123111.qnbsainul2g54z4z@sigill.intra.peff.net/ > > This mail was still in my inbox, in want of me saying something about > this. > > My main evidence that shell scripts on macOS are slower than on Linux was > the difference of the improvement incurred by moving more things from > git-rebase--interactive.sh into sequencer.c: Linux saw an improvement only > of about 3x, while macOS saw an improvement of 4x, IIRC. 
If I don't > remember the absolute numbers correctly, at least I vividly remember the > qualitative difference: It was noticeable. > >> There was some discussion there about whether the problem is programs >> being exec'd, or if it's forks due to subshells. And if it is programs >> being exec'd, whether it's shell programs or if it is simply that we >> exec Git a huge number of times. > > One large problem there is that it is really hard to analyze performance > over such a heterogenous code base: part C, part Perl, part Unix shell > (and of course, when you say Unix shell, you imply dozens of separate > tools that *also* need to be performance-profiled). I have very good > profiling tools for C, I saw some built-in performance profiling for Perl, > but there is no good performance profiling for Unix shell scripting: I > doubt that the inventors of shell scripting had speed-critical production > code in mind when they came up with the idea. > > I did invest dozens of hours earlier this year trying to obtain debug > symbols in .pdb format (ready for Visual Studio's really envy-inducing > performance profiler) also for the MSYS2 runtime and Bash, so that I could > analyze what makes things so awfully slow in Git's test suite. > > The only problem is that I also have to do other things in my day-job, so > that project waits patiently until I have some time to come back to that > project. > > Ciao, > Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-04-03 11:28 ` Ævar Arnfjörð Bjarmason @ 2018-04-03 15:55 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 15:55 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen [-- Attachment #1: Type: text/plain, Size: 1376 bytes --] Hi Ævar, On Tue, 3 Apr 2018, Ævar Arnfjörð Bjarmason wrote: > [...] I think it would be really interesting to see the third > approach I suggested, i.e. hack the shell to make the test_cmp a builtin > and test that. Then you won't fork, but will get the advantage of your > fast C codepath. That should be relatively equivalent to running in BusyBox-w32's ash. BusyBox-w32 is a pure-Win32 version of BusyBox (i.e. it does not use any POSIX emulation layer, not Cygwin nor MSYS2). I did not notice any Earth-shaking performance improvement when running a test with BusyBox-w32's ash. It was a couple of percent, maybe even 20% faster, but nowhere near the orders of magnitude I had been expecting. > Also, even if test_cmp is much faster, Peff's results over at > https://public-inbox.org/git/20161020123111.qnbsainul2g54z4z@sigill.intra.peff.net/ > suggest that you may not notice anyway. Aside from the points raised > there about the bin wrappers it seems the easiest wins are having a > builtin version of "rm" and "cat". In BusyBox-w32, `rm` and `cat` *are* built-ins. > Are you able to compile dash on Windows with some modification of the > patch I sent upthread? In theory, yes. In practice, I lack the time (and I do not expect this to have any performance benefit over using BusyBox-w32 to run the test suite). Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-04-03 9:49 ` Johannes Schindelin 2018-04-03 11:28 ` Ævar Arnfjörð Bjarmason @ 2018-04-03 21:36 ` Eric Sunshine 1 sibling, 0 replies; 103+ messages in thread From: Eric Sunshine @ 2018-04-03 21:36 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen On Tue, Apr 3, 2018 at 5:49 AM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > My main evidence that shell scripts on macOS are slower than on Linux was > the difference of the improvement incurred by moving more things from > git-rebase--interactive.sh into sequencer.c: Linux saw an improvement only > of about 3x, while macOS saw an improvement of 4x, IIRC. If I don't > remember the absolute numbers correctly, at least I vividly remember the > qualitative difference: It was noticeable. MacOS is _slow_, much, much slower than, say, Linux. Several years ago, when I had this machine configured for multi-boot, I ran MacOS and Linux on bare metal. Back then, using ram disk for the "trash" directories, and disabling Spotlight indexing on MacOS to avoid it eating CPU and causing I/O contention, the Git test suite would run to completion on Linux in slightly over 1 minute. On MacOS, it would take over 10 minutes; 10 times slower. These days, the Git test suite takes 15 minutes to run on the same hardware (with same conditions: ram disk and Spotlight disabled), which is painfully slow, thus I rarely do it. Unfortunately, I don't have Linux installed on bare metal anymore, so I can't make a proper comparison, but I do run Linux in a virtual machine under MacOS and, even though it's running within a virtualized environment, Linux is still much faster than MacOS, taking 4:25 (slow, but not to the point of outright pain).
That the test suite runs so much faster on Linux (bare metal or virtualized) than MacOS on this machine, I have attributed (or understood as being due) to poor HFS+ filesystem performance. It's even worse when Spotlight interferes. Presumably, the new, recently released, Mac filesystem has improved performance, but it's restricted to SSDs, whereas this machine has a physical drive, thus I can't test it. ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-03-30 18:45 ` A potential approach to making tests faster on Windows Ævar Arnfjörð Bjarmason 2018-03-30 18:58 ` Junio C Hamano 2018-03-30 19:16 ` Jeff King @ 2018-04-03 11:43 ` Johannes Schindelin 2018-04-03 13:27 ` Jeff King 2 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 11:43 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jeff King, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen [-- Attachment #1: Type: text/plain, Size: 4397 bytes --] Hi Ævar, On Fri, 30 Mar 2018, Ævar Arnfjörð Bjarmason wrote: > On Fri, Mar 30 2018, Johannes Schindelin wrote [expressing frustrations > about Windows test suite slowness]: To be precise (and I think it is important to be precise here): it is not the Windows test suite about which I talked, it is Git's test suite, as run on Windows. It might sound like a small difference, but it is not: the fault really lies with Git because it wants to be a portable software. > I've wondered for a while whether it wouldn't be a viable approach to > make something like an interpreter for our test suite to get around this > problem, i.e. much of it's very repetitive and just using a few shell > functions we've defined, what if we had C equivalents of those? There has even been an attempt to do this by Linus Torvalds himself: https://public-inbox.org/git/Pine.LNX.4.64.0602232229340.3771@g5.osdl.org/ It has not really gone anywhere... To be honest, I had a different idea (because I do not really want to maintain yet another piece of software): BusyBox. The source code is clean enough, and it should, in theory, allow us to go really fast. > Duy had a WIP patch set a while ago to add C test suite support, but I > thought what if we turn that inside-out, and instead have a shell > interpreter that knows about the likes of test_cmp, and executes them > directly? 
The problem, of course, is: if you add Git-test-suite-specific stuff to any Unix shell, you are going to have to maintain this fork, and all of a sudden it has become a lot harder to develop Git, and to port it. Quite frankly, I would rather go with Duy's original approach, or a variation thereof, as snuck into the wildmatch discussion here: https://public-inbox.org/git/20180110090724.GA2893@ash/ > Here's proof of concept as a patch to the dash shell: > > u dash (debian/master=) $ git diff > diff --git a/src/builtins.def.in b/src/builtins.def.in > index 4441fe4..b214a17 100644 > --- a/src/builtins.def.in > +++ b/src/builtins.def.in > @@ -92,3 +92,4 @@ ulimitcmd ulimit > #endif > testcmd test [ > killcmd -u kill > +testcmpcmd test_cmp > diff --git a/src/jobs.c b/src/jobs.c > index c2c2332..905563f 100644 > --- a/src/jobs.c > +++ b/src/jobs.c > @@ -1502,3 +1502,12 @@ getstatus(struct job *job) { > jobno(job), job->nprocs, status, retval)); > return retval; > } > + > +#include <stdio.h> > +int > +testcmpcmd(argc, argv) > + int argc; > + char **argv; > +{ > + fprintf(stderr, "Got %d arguments\n", argc); > +} > > I just added that to jobs.c because it was easiest, then test_cmp > becomes a builtin: > > u dash (debian/master=) $ src/dash -c 'type test_cmp' > test_cmp is a shell builtin > u dash (debian/master=) $ src/dash -c 'echo foo && test_cmp 1 2 3' > foo > Got 4 arguments > > I.e. it's really easy to add new built in commands to the dash shell > (and probably other shells, but dash is really small & fast). > > We could carry some patch like that to dash, and also patch it so > test-lib.sh could know that that was our own custom shell, and we'd then > skip defining functions like test_cmp, and instead use that new builtin. Or even use the output of `type test_cmp` as a tell-tale. > Similarly, it could then be linked to our own binaries, and the > test-tool would be a builtin that would appropriately dispatch, and we > could even eventually make "git" a shell builtin. 
> > I don't have time or interest to work on this now, but thought it was > interesting to share. This assumes that something in shellscript like: > > while echo foo; do echo bar; done > > Is no slower on Windows than *nix, since it's purely using built-ins, as > opposed to something that would shell out. It is still interpreting stuff. And it still goes through the POSIX emulation layer. I did see reports on the Git for Windows bug tracker that gave me the impression that such loops in Unix shell scripts may not, in fact, be as performant in MSYS2's Bash as you would like to believe: https://github.com/git-for-windows/git/issues/1533#issuecomment-372025449 Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-04-03 11:43 ` Johannes Schindelin @ 2018-04-03 13:27 ` Jeff King 2018-04-03 16:00 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-04-03 13:27 UTC (permalink / raw) To: Johannes Schindelin Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen

On Tue, Apr 03, 2018 at 01:43:10PM +0200, Johannes Schindelin wrote:

> > I don't have time or interest to work on this now, but thought it was
> > interesting to share. This assumes that something in shellscript like:
> >
> > while echo foo; do echo bar; done
> >
> > Is no slower on Windows than *nix, since it's purely using built-ins, as
> > opposed to something that would shell out.
>
> It is still interpreting stuff. And it still goes through the POSIX
> emulation layer.
>
> I did see reports on the Git for Windows bug tracker that gave me the
> impression that such loops in Unix shell scripts may not, in fact, be as
> performant in MSYS2's Bash as you would like to believe:
>
> https://github.com/git-for-windows/git/issues/1533#issuecomment-372025449

The main problem with `read` loops in shell is that the shell makes one
read() syscall per character. It has to, because doing otherwise is
user-visible in cases where the descriptor may get passed to a different
process.

There's unfortunately no portable way to say "please just read this
quickly, I promise nobody else is going to read the descriptor". And nor
do I know of any shell which is smart enough to know that it's going to
consume to EOF anyway (as you would for something like "cmd | while
read").

If you know you have bash, you can use "-N" to get a more efficient
read:

  $ echo foo | strace -e read bash -c 'read foo'
  [...]
  read(0, "f", 1) = 1
  read(0, "o", 1) = 1
  read(0, "o", 1) = 1
  read(0, "\n", 1) = 1

  $ echo foo | strace -e read bash -c 'read -N 10 foo'
  [...]
  read(0, "foo\n", 10) = 4
  read(0, "", 6) = 0

but then you have another problem: how to split the resulting buffer
into lines in shell. ;)

But if we're at the point of creating custom C builtins for
busybox/dash/etc, you should be able to create a primitive for "read
this using buffered stdio, other processes be damned, and return one
line at a time".

-Peff

^ permalink raw reply [flat|nested] 103+ messages in thread
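[Editorial sketch] The "read this using buffered stdio ... and return one line at a time" primitive described above could look roughly like the following in C. This is a minimal sketch under stated assumptions, not code from dash, BusyBox, or Git: `read_line_buffered` is a hypothetical name, and how it would be wired into a shell as a builtin is left out entirely. It relies only on standard `fgets()`, which reads through the stream's buffer instead of issuing one read() per character.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical primitive: read one line from a buffered stdio stream into
 * `line` (capacity `size` bytes), stripping the trailing newline.
 * Returns 1 if a line was read and 0 at end of input.
 *
 * stdio fills its buffer in large chunks, so data past the returned line
 * may already have been consumed from the underlying descriptor and will
 * not be seen by another process sharing it. That is the trade-off the
 * message above describes. Lines longer than `size - 1` bytes come back
 * in several pieces.
 */
int read_line_buffered(FILE *in, char *line, size_t size)
{
	size_t len;

	if (!fgets(line, (int)size, in))
		return 0;
	len = strlen(line);
	if (len && line[len - 1] == '\n')
		line[len - 1] = '\0';
	return 1;
}
```

A caller would loop with `while (read_line_buffered(stdin, buf, sizeof(buf)))`, which is the C counterpart of a `cmd | while read line` loop, minus the guarantee that unread input stays available to other readers of the same descriptor.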
* Re: A potential approach to making tests faster on Windows 2018-04-03 13:27 ` Jeff King @ 2018-04-03 16:00 ` Johannes Schindelin 2018-04-06 21:40 ` Jeff King 0 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:00 UTC (permalink / raw) To: Jeff King Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen Hi Peff, On Tue, 3 Apr 2018, Jeff King wrote: > On Tue, Apr 03, 2018 at 01:43:10PM +0200, Johannes Schindelin wrote: > > > > I don't have time or interest to work on this now, but thought it was > > > interesting to share. This assumes that something in shellscript like: > > > > > > while echo foo; do echo bar; done > > > > > > Is no slower on Windows than *nix, since it's purely using built-ins, as > > > opposed to something that would shell out. > > > > It is still interpreting stuff. And it still goes through the POSIX > > emulation layer. > > > > I did see reports on the Git for Windows bug tracker that gave me the > > impression that such loops in Unix shell scripts may not, in fact, be as > > performant in MSYS2's Bash as you would like to believe: > > > > https://github.com/git-for-windows/git/issues/1533#issuecomment-372025449 > > The main problem with `read` loops in shell is that the shell makes one > read() syscall per character. It has to, because doing otherwise is > user-visible in cases where the descriptor may get passed to a different > process. Thank you for the explanation. Makes tons of sense now. > There's unfortunately no portable way to say "please just read this > quickly, I promise nobody else is going to read the descriptor". And nor > do I know of any shell which is smart enough to know that it's going to > consume to EOF anyway (as you would for something like "cmd | while > read"). > > If you know you have bash, you can use "-N" to get a more efficient > read: > > $ echo foo | strace -e read bash -c 'read foo' > [...] 
> read(0, "f", 1) = 1 > read(0, "o", 1) = 1 > read(0, "o", 1) = 1 > read(0, "\n", 1) = 1 > > $ echo foo | strace -e read bash -c 'read -N 10 foo' > [...] > read(0, "foo\n", 10) = 4 > read(0, "", 6) = 0 > > but then you have another problem: how to split the resulting buffer > into lines in shell. ;) True. > But if we're at the point of creating custom C builtins for > busybox/dash/etc, you should be able to create a primitive for "read > this using buffered stdio, other processes be damned, and return one > line at a time". Well, you know, I do not think that papering over the root cause will make anything better. And the root cause is that we use a test framework written in Unix shell. I will have to set aside some time to dig into the bottlenecks there and figure out what parts I can safely convert into "test builtins", i.e. into the test-tool Duy introduced, to avoid having shell scripts do the heavy-lifting. Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-04-03 16:00 ` Johannes Schindelin @ 2018-04-06 21:40 ` Jeff King 2018-04-06 21:57 ` Stefan Beller 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-04-06 21:40 UTC (permalink / raw) To: Johannes Schindelin Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen On Tue, Apr 03, 2018 at 06:00:05PM +0200, Johannes Schindelin wrote: > > But if we're at the point of creating custom C builtins for > > busybox/dash/etc, you should be able to create a primitive for "read > > this using buffered stdio, other processes be damned, and return one > > line at a time". > > Well, you know, I do not think that papering over the root cause will make > anything better. And the root cause is that we use a test framework > written in Unix shell. I'm not entirely convinced of this. My earlier numbers show that we spend a lot of time actually running Git. But that's not because we're written in shell, but because the stable interface to Git is running individual processes. So we can unit-test wildmatch or similar in a single C program, but I think we inherently need to run "git init" a lot of times. Now I think there's reason to doubt some of my numbers. I was counting exec's, and non-exec forks due to subshells, etc, may be important. So I claim only that I remain unconvinced that we are certain of the root cause. At any rate, I would be happy to see more study into this. If we can create a measurable speedup for an existing script, that might give us a blueprint for speeding up the whole suite. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: A potential approach to making tests faster on Windows 2018-04-06 21:40 ` Jeff King @ 2018-04-06 21:57 ` Stefan Beller 0 siblings, 0 replies; 103+ messages in thread From: Stefan Beller @ 2018-04-06 21:57 UTC (permalink / raw) To: Jeff King Cc: Johannes Schindelin, Ævar Arnfjörð Bjarmason, git, Junio C Hamano, Thomas Rast, Phil Haack, Jason Frey, Philip Oakley, Duy Nguyen On Fri, Apr 6, 2018 at 2:40 PM, Jeff King <peff@peff.net> wrote: > On Tue, Apr 03, 2018 at 06:00:05PM +0200, Johannes Schindelin wrote: > >> > But if we're at the point of creating custom C builtins for >> > busybox/dash/etc, you should be able to create a primitive for "read >> > this using buffered stdio, other processes be damned, and return one >> > line at a time". >> >> Well, you know, I do not think that papering over the root cause will make >> anything better. And the root cause is that we use a test framework >> written in Unix shell. > > I'm not entirely convinced of this. My earlier numbers show that we > spend a lot of time actually running Git. But that's not because we're > written in shell, but because the stable interface to Git is running > individual processes. > > So we can unit-test wildmatch or similar in a single C program, but I > think we inherently need to run "git init" a lot of times. > > Now I think there's reason to doubt some of my numbers. I was counting > exec's, and non-exec forks due to subshells, etc, may be important. So I > claim only that I remain unconvinced that we are certain of the root > cause. > > At any rate, I would be happy to see more study into this. If we can > create a measurable speedup for an existing script, that might give us a > blueprint for speeding up the whole suite. The setup of each test is finicky, as we'd do different setups for each test as we'd test different things. 
I once wondered if we'd want to have a "ready made" directory that contains repositories in various states that we can copy for each test and only need minimal adjustments instead of writing the setup from scratch in each script. ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin 2018-03-29 15:18 ` [PATCH 1/9] git_config_set: fix off-by-two Johannes Schindelin @ 2018-03-29 15:18 ` Johannes Schindelin 2018-03-29 19:42 ` Jeff King 2018-03-29 15:18 ` [PATCH 3/9] t1300: avoid relying on a bug Johannes Schindelin ` (10 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:18 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/{t1300-repo-config.sh => t1300-config.sh} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename t/{t1300-repo-config.sh => t1300-config.sh} (100%) diff --git a/t/t1300-repo-config.sh b/t/t1300-config.sh similarity index 100% rename from t/t1300-repo-config.sh rename to t/t1300-config.sh -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated 2018-03-29 15:18 ` [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin @ 2018-03-29 19:42 ` Jeff King 2018-03-30 12:37 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-29 19:42 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 05:18:40PM +0200, Johannes Schindelin wrote: > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > --- > t/{t1300-repo-config.sh => t1300-config.sh} | 0 > 1 file changed, 0 insertions(+), 0 deletions(-) > rename t/{t1300-repo-config.sh => t1300-config.sh} (100%) This has only been bugging me for oh, about 10 years. Thanks. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated 2018-03-29 19:42 ` Jeff King @ 2018-03-30 12:37 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 12:37 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:40PM +0200, Johannes Schindelin wrote: > > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > > --- > > t/{t1300-repo-config.sh => t1300-config.sh} | 0 > > 1 file changed, 0 insertions(+), 0 deletions(-) > > rename t/{t1300-repo-config.sh => t1300-config.sh} (100%) > > This has only been bugging me for oh, about 10 years. Yep. We should have done that right after moving the builtins' code to builtins/. Which reminds me that we *still* do not have a lib/ where all the source code for libgit.a lives. And then maybe standalone/ for the source code of the non-builtin tools. And... this would make for a fine micro-project next year, I guess. Or in ten. Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH 3/9] t1300: avoid relying on a bug 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin 2018-03-29 15:18 ` [PATCH 1/9] git_config_set: fix off-by-two Johannes Schindelin 2018-03-29 15:18 ` [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin @ 2018-03-29 15:18 ` Johannes Schindelin 2018-03-29 19:43 ` Jeff King 2018-03-29 15:18 ` [PATCH 4/9] t1300: remove unreasonable expectation from TODO Johannes Schindelin ` (9 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:18 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley The test case 'unset with cont. lines' relied on a bug that is about to be fixed: it tests *explicitly* that removing the last entry from a config section leaves an *empty* section behind. Let's fix this test case not to rely on that behavior, simply by preventing the section from becoming empty. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 4f8e6f5fde3..1ece7bad05f 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -108,6 +108,7 @@ bar = foo [beta] baz = multiple \ lines +foo = bar EOF test_expect_success 'unset with cont. lines' ' @@ -118,6 +119,7 @@ cat > expect <<\EOF [alpha] bar = foo [beta] +foo = bar EOF test_expect_success 'unset with cont. lines is correct' 'test_cmp expect .git/config' -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH 3/9] t1300: avoid relying on a bug 2018-03-29 15:18 ` [PATCH 3/9] t1300: avoid relying on a bug Johannes Schindelin @ 2018-03-29 19:43 ` Jeff King 2018-03-30 12:38 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-29 19:43 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 05:18:45PM +0200, Johannes Schindelin wrote: > The test case 'unset with cont. lines' relied on a bug that is about to > be fixed: it tests *explicitly* that removing the last entry from a > config section leaves an *empty* section behind. > > Let's fix this test case not to rely on that behavior, simply by > preventing the section from becoming empty. Seems like a good solution. I don't think we care in particular about testing a multi-line value at the end of the file. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 3/9] t1300: avoid relying on a bug 2018-03-29 19:43 ` Jeff King @ 2018-03-30 12:38 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 12:38 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:45PM +0200, Johannes Schindelin wrote: > > > The test case 'unset with cont. lines' relied on a bug that is about to > > be fixed: it tests *explicitly* that removing the last entry from a > > config section leaves an *empty* section behind. > > > > Let's fix this test case not to rely on that behavior, simply by > > preventing the section from becoming empty. > > Seems like a good solution. I don't think we care in particular about > testing a multi-line value at the end of the file. ... and if we did, we should have documented that. Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH 4/9] t1300: remove unreasonable expectation from TODO 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (2 preceding siblings ...) 2018-03-29 15:18 ` [PATCH 3/9] t1300: avoid relying on a bug Johannes Schindelin @ 2018-03-29 15:18 ` Johannes Schindelin 2018-03-29 19:52 ` Jeff King 2018-03-29 15:18 ` [PATCH 5/9] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin ` (8 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:18 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley In https://public-inbox.org/git/7vvc8alzat.fsf@alter.siamese.dyndns.org/ a reasonable patch was made quite a bit less so by changing a test case demonstrating a bug to a test case that demonstrates that we ask for too much: the test case 'unsetting the last key in a section removes header' now expects a future bug fix to be able to determine whether a free-form comment above a section header refers to said section or not. Rather than shooting for the stars (and not even getting off the ground), let's start shooting for something obtainable and be reasonably confident that we *can* get it. 
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 1ece7bad05f..3ad3df0c83e 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1413,7 +1413,7 @@ test_expect_success 'urlmatch with wildcard' ' ' # good section hygiene -test_expect_failure 'unsetting the last key in a section removes header' ' +test_expect_failure '--unset last key removes section (except if commented)' ' cat >.git/config <<-\EOF && # some generic comment on the configuration file itself # a comment specific to this "section" section. @@ -1427,6 +1427,25 @@ test_expect_failure 'unsetting the last key in a section removes header' ' cat >expect <<-\EOF && # some generic comment on the configuration file itself + # a comment specific to this "section" section. + [section] + # some intervening lines + # that should also be dropped + + # please be careful when you update the above variable + EOF + + git config --unset section.key && + test_cmp expect .git/config && + + cat >.git/config <<-\EOF && + [section] + key = value + [next-section] + EOF + + cat >expect <<-\EOF && + [next-section] EOF git config --unset section.key && -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH 4/9] t1300: remove unreasonable expectation from TODO 2018-03-29 15:18 ` [PATCH 4/9] t1300: remove unreasonable expectation from TODO Johannes Schindelin @ 2018-03-29 19:52 ` Jeff King 2018-03-29 20:45 ` Junio C Hamano 2018-03-30 12:42 ` Johannes Schindelin 0 siblings, 2 replies; 103+ messages in thread From: Jeff King @ 2018-03-29 19:52 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 05:18:50PM +0200, Johannes Schindelin wrote: > In https://public-inbox.org/git/7vvc8alzat.fsf@alter.siamese.dyndns.org/ > a reasonable patch was made quite a bit less so by changing a test case > demonstrating a bug to a test case that demonstrates that we ask for too > much: the test case 'unsetting the last key in a section removes header' > now expects a future bug fix to be able to determine whether a free-form > comment above a section header refers to said section or not. > > Rather than shooting for the stars (and not even getting off the > ground), let's start shooting for something obtainable and be reasonably > confident that we *can* get it. As I said before, I'm fine with turning this test into something more realistic. An obvious question is whether we should preserve the original unrealistic parts by splitting it: the realistic parts into one expect_failure (that we'd switch to expect_success by the end of this series), and then an unrealistic one to serve as a documentation of the ideal, with a comment explaining why it's unrealistic. I doubt the "unrealistic" half would be serving much purpose though, so I'm OK to see it get eliminated here. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 4/9] t1300: remove unreasonable expectation from TODO 2018-03-29 19:52 ` Jeff King @ 2018-03-29 20:45 ` Junio C Hamano 2018-03-30 12:42 ` Johannes Schindelin 1 sibling, 0 replies; 103+ messages in thread From: Junio C Hamano @ 2018-03-29 20:45 UTC (permalink / raw) To: Jeff King Cc: Johannes Schindelin, git, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Jeff King <peff@peff.net> writes: > An obvious question is whether we should preserve the original > unrealistic parts by splitting it: the realistic parts into one > expect_failure (that we'd switch to expect_success by the end of this > series), and then an unrealistic one to serve as a documentation of the > ideal, with a comment explaining why it's unrealistic. > > I doubt the "unrealistic" half would be serving much purpose though, so > I'm OK to see it get eliminated here. Likewise. The series looks good so far. ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 4/9] t1300: remove unreasonable expectation from TODO 2018-03-29 19:52 ` Jeff King 2018-03-29 20:45 ` Junio C Hamano @ 2018-03-30 12:42 ` Johannes Schindelin 1 sibling, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 12:42 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:50PM +0200, Johannes Schindelin wrote: > > > In https://public-inbox.org/git/7vvc8alzat.fsf@alter.siamese.dyndns.org/ > > a reasonable patch was made quite a bit less so by changing a test case > > demonstrating a bug to a test case that demonstrates that we ask for too > > much: the test case 'unsetting the last key in a section removes header' > > now expects a future bug fix to be able to determine whether a free-form > > comment above a section header refers to said section or not. > > > > Rather than shooting for the stars (and not even getting off the > > ground), let's start shooting for something obtainable and be reasonably > > confident that we *can* get it. > > As I said before, I'm fine with turning this test into something more > realistic. Good. Of course, I worked hard to come up with a patch series, i.e. I put in some effort to placate anybody who would be offended by my accompanying rant. > An obvious question is whether we should preserve the original > unrealistic parts by splitting it: the realistic parts into one > expect_failure (that we'd switch to expect_success by the end of this > series), and then an unrealistic one to serve as a documentation of the > ideal, with a comment explaining why it's unrealistic. As stated before, I think it would be a mistake to mark up this unrealistic example with `test_expect_failure`. We do, after all, suggest occasionally to grep for that when somebody asks what they could work on. 
And you do not want to set somebody like that up for failure by pointing them to such a "bug". However, I did keep the example to demonstrate the expectation that sections with surrounding comments are kept. That was very much intended. And the reason I did not change the unrealistic example? So that it is easier to review in our patch-based review process, where I try to avoid hunks that might distract from the intent of the change. Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH 5/9] t1300: `--unset-all` can leave an empty section behind (bug) 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (3 preceding siblings ...) 2018-03-29 15:18 ` [PATCH 4/9] t1300: remove unreasonable expectation from TODO Johannes Schindelin @ 2018-03-29 15:18 ` Johannes Schindelin 2018-03-29 19:54 ` Jeff King 2018-03-29 15:18 ` [PATCH 6/9] git_config_set: simplify the way the section name is remembered Johannes Schindelin ` (7 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:18 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley We already have a test demonstrating that removing the last entry from a config section fails to remove the section header of the now-empty section. The same can happen, of course, if we remove the last entries in one fell swoop. This is *also* a bug, and should be fixed at the same time. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 3ad3df0c83e..ff79a213567 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1452,6 +1452,17 @@ test_expect_failure '--unset last key removes section (except if commented)' ' test_cmp expect .git/config ' +test_expect_failure '--unset-all removes section if empty & uncommented' ' + cat >.git/config <<-\EOF && + [section] + key = value1 + key = value2 + EOF + + git config --unset-all section.key && + test_line_count = 0 .git/config +' + test_expect_failure 'adding a key into an empty section reuses header' ' cat >.git/config <<-\EOF && [section] -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH 5/9] t1300: `--unset-all` can leave an empty section behind (bug) 2018-03-29 15:18 ` [PATCH 5/9] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin @ 2018-03-29 19:54 ` Jeff King 0 siblings, 0 replies; 103+ messages in thread From: Jeff King @ 2018-03-29 19:54 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 05:18:53PM +0200, Johannes Schindelin wrote: > We already have a test demonstrating that removing the last entry from a > config section fails to remove the section header of the now-empty > section. > > The same can happen, of course, if we remove the last entries in one fell > swoop. This is *also* a bug, and should be fixed at the same time. Yep, makes sense, and the diff is obviously correct. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH 6/9] git_config_set: simplify the way the section name is remembered 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (4 preceding siblings ...) 2018-03-29 15:18 ` [PATCH 5/9] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin @ 2018-03-29 15:18 ` Johannes Schindelin 2018-03-29 15:19 ` [PATCH 7/9] git config --unset: remove empty sections (in normal situations) Johannes Schindelin ` (6 subsequent siblings) 12 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:18 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley This not only reduces the number of lines, but also opens the door for reusing the section name later (which the upcoming patch to remove now-empty sections will do). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/config.c b/config.c index 5cc049aaef0..d35dffa50de 100644 --- a/config.c +++ b/config.c @@ -2486,12 +2486,14 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, struct lock_file lock = LOCK_INIT; char *filename_buf = NULL; char *contents = NULL; + char *section_name = NULL; size_t contents_sz; /* parse-key returns negative; flip the sign to feed exit(3) */ - ret = 0 - git_config_parse_key(key, &store.key, &store.baselen); + ret = 0 - git_config_parse_key(key, &section_name, &store.baselen); if (ret) goto out_free; + store.key = section_name; store.multi_replace = multi_replace; @@ -2505,7 +2507,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, fd = hold_lock_file_for_update(&lock, config_filename, 0); if (fd < 0) { error_errno("could not lock config file %s", config_filename); - free(store.key); ret = CONFIG_NO_LOCK; goto out_free; } @@ -2515,8 +2516,6 @@ 
int git_config_set_multivar_in_file_gently(const char *config_filename, */ in_fd = open(config_filename, O_RDONLY); if ( in_fd < 0 ) { - free(store.key); - if ( ENOENT != errno ) { error_errno("opening %s", config_filename); ret = CONFIG_INVALID_FILE; /* same as "invalid config file" */ @@ -2571,7 +2570,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, */ if (git_config_from_file(store_aux, config_filename, NULL)) { error("invalid config file %s", config_filename); - free(store.key); if (store.value_regex != NULL && store.value_regex != CONFIG_REGEX_NONE) { regfree(store.value_regex); @@ -2581,7 +2579,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, goto out_free; } - free(store.key); if (store.value_regex != NULL && store.value_regex != CONFIG_REGEX_NONE) { regfree(store.value_regex); @@ -2682,6 +2679,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, out_free: rollback_lock_file(&lock); + free(section_name); free(filename_buf); if (contents) munmap(contents, contents_sz); -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH 7/9] git config --unset: remove empty sections (in normal situations) 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (5 preceding siblings ...) 2018-03-29 15:18 ` [PATCH 6/9] git_config_set: simplify the way the section name is remembered Johannes Schindelin @ 2018-03-29 15:19 ` Johannes Schindelin 2018-03-29 21:32 ` Jeff King 2018-03-29 15:19 ` [PATCH 8/9] git_config_set: use do_config_from_file() directly Johannes Schindelin ` (5 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:19 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley The original reasoning for not removing section headers upon removal of the last entry went like this: the user could have added comments about the section, or about the entries therein, and if there were other comments there, we would not know whether we should remove them. In particular, a concocted example was presented that looked like this (and was added to t1300): # some generic comment on the configuration file itself # a comment specific to this "section" section. [section] # some intervening lines # that should also be dropped key = value # please be careful when you update the above variable The ideal thing for `git config --unset section.key` in this case would be to leave only the first line behind, because all the other comments are now obsolete. However, this is unfeasible, short of adding a complete Natural Language Processing module to Git, which seems not only a lot of work, but a totally unreasonable feature (for little benefit to most users). Now, the real kicker about this problem is: most users do not edit their config files at all! In their use case, the config looks like this instead: [section] key = value ... 
and it is totally obvious what should happen if the entry is removed. Let's generalize this observation to this conservative strategy: if we are removing the last entry from a section, and there are no comments inside that section nor surrounding it, then remove the entire section. Otherwise behave as before: leave the now-empty section (including those comments, even the one about the now-deleted entry). We have to be careful, though, to handle also the case where there are *multiple* entries that are removed: any subset of them might be the last entries of their respective sections. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 181 +++++++++++++++++++++++++++++++++++++++++++++++++++++- t/t1300-config.sh | 4 +- 2 files changed, 182 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index d35dffa50de..503aef4b318 100644 --- a/config.c +++ b/config.c @@ -2429,6 +2429,177 @@ static ssize_t find_beginning_of_line(const char *contents, size_t size, return offset; } +/* + * This function determines whether the offset is in a line that starts with a + * comment character. + * + * Note: it does *not* report when a regular line (section header, config + * setting) *ends* in a comment. + */ +static int is_in_comment_line(const char *contents, size_t offset) +{ + int comment = 0; + + while (offset > 0) + switch (contents[--offset]) { + case ';': + case '#': + comment = 1; + break; + case '\n': + break; + case ' ': + case '\t': + continue; + default: + comment = 0; + } + + return comment; +} + +/* + * If we are about to unset the last key(s) in a section, and if there are + * no comments surrounding (or included in) the section, we will want to + * extend begin/end to remove the entire section. + * + * Note: the parameter `i_ptr` points to the index into the store.offset + * array, reflecting the end offset of the respective entry to be deleted. 
+ * This index may be incremented if a section has more than one entry (which + * all are to be removed). + */ +static void maybe_remove_section(const char *contents, size_t size, + const char *section_name, + size_t section_name_len, + size_t *begin, int *i_ptr, int *new_line) +{ + size_t begin2, end2; + int seen_section = 0, dummy, i = *i_ptr; + + /* + * First, make sure that this is the last key in the section, and that + * there are no comments that are possibly about the current section. + */ +next_entry: + for (end2 = store.offset[i]; end2 < size; end2++) { + switch (contents[end2]) { + case ' ': + case '\t': + case '\n': + continue; + case '\r': + if (++end2 < size && contents[end2] == '\n') + continue; + break; + case '[': + /* If the section name is repeated, continue */ + if (end2 + 1 + section_name_len < size && + contents[end2 + section_name_len] == ']' && + !memcmp(contents + end2 + 1, section_name, + section_name_len)) { + end2 += section_name_len; + continue; + } + goto look_before; + case ';': + case '#': + /* There is a comment, cannot remove this section */ + return; + default: + /* There are other keys in that section */ + break; + } + + /* + * Uh oh... we found something else in this section. But do + * we want to remove this, too? + */ + if (++i >= store.seen) + return; + + begin2 = find_beginning_of_line(contents, size, store.offset[i], + &dummy); + if (begin2 > end2) + return; + + /* Looks like we want to remove the next one, too... */ + goto next_entry; + } + +look_before: + /* + * Now, ensure that this is the first key, and that there are no + * comments before the entry nor before the section header. 
+ */ + for (begin2 = *begin; begin2 > 0; ) + switch (contents[begin2 - 1]) { + case ' ': + case '\t': + begin2--; + continue; + case '\n': + if (--begin2 > 0 && contents[begin2 - 1] == '\r') + begin2--; + continue; + case ']': + if (begin2 > section_name_len + 1 && + contents[begin2 - section_name_len - 2] == '[' && + !memcmp(contents + begin2 - section_name_len - 1, + section_name, section_name_len)) { + begin2 -= section_name_len + 2; + seen_section = 1; + continue; + } + + /* + * It looks like a section header, but it could be a + * comment instead... + */ + if (is_in_comment_line(contents, begin2)) + return; + + /* + * We encountered the previous section header: This + * really was the only entry, so remove the entire + * section. + */ + if (contents[begin2] != '\n') { + begin2--; + *new_line = 1; + } + + store.offset[i] = end2; + *begin = begin2; + *i_ptr = i; + return; + default: + /* + * Any other character means it is either a comment or + * a config setting; if it is a comment, we do not want + * to remove this section. If it is a config setting, + * we only want to remove this section if this is + * already the next section. + */ + if (seen_section && + !is_in_comment_line(contents, begin2)) { + if (contents[begin2] != '\n') { + begin2--; + *new_line = 1; + } + + store.offset[i] = end2; + *begin = begin2; + *i_ptr = i; + } + return; + } + + /* This section extends to the beginning of the file. 
*/ + store.offset[i] = end2; + *begin = begin2; + *i_ptr = i; +} + int git_config_set_in_file_gently(const char *config_filename, const char *key, const char *value) { @@ -2626,10 +2797,18 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, store.offset[i] = copy_end = contents_sz; } else if (store.state != KEY_SEEN) { copy_end = store.offset[i]; - } else + } else { copy_end = find_beginning_of_line( contents, contents_sz, store.offset[i], &new_line); + if (!value) + maybe_remove_section(contents, + contents_sz, + section_name, + store.baselen, + &copy_end, &i, + &new_line); + } if (copy_end > 0 && contents[copy_end-1] != '\n') new_line = 1; diff --git a/t/t1300-config.sh b/t/t1300-config.sh index ff79a213567..ecbcc9cf3d0 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1413,7 +1413,7 @@ test_expect_success 'urlmatch with wildcard' ' ' # good section hygiene -test_expect_failure '--unset last key removes section (except if commented)' ' +test_expect_success '--unset last key removes section (except if commented)' ' cat >.git/config <<-\EOF && # some generic comment on the configuration file itself # a comment specific to this "section" section. @@ -1452,7 +1452,7 @@ test_expect_failure '--unset last key removes section (except if commented)' ' test_cmp expect .git/config ' -test_expect_failure '--unset-all removes section if empty & uncommented' ' +test_expect_success '--unset-all removes section if empty & uncommented' ' cat >.git/config <<-\EOF && [section] key = value1 -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH 7/9] git config --unset: remove empty sections (in normal situations) 2018-03-29 15:19 ` [PATCH 7/9] git config --unset: remove empty sections (in normal situations) Johannes Schindelin @ 2018-03-29 21:32 ` Jeff King 2018-03-30 13:00 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-29 21:32 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 05:19:00PM +0200, Johannes Schindelin wrote: > Let's generalize this observation to this conservative strategy: if we > are removing the last entry from a section, and there are no comments > inside that section nor surrounding it, then remove the entire section. > Otherwise behave as before: leave the now-empty section (including those > comments, even the one about the now-deleted entry). Yep, as I said earlier, this makes a ton of sense to me. > +/* > + * This function determines whether the offset is in a line that starts with a > + * comment character. > + * > + * Note: it does *not* report when a regular line (section header, config > + * setting) *ends* in a comment. > + */ > +static int is_in_comment_line(const char *contents, size_t offset) > +{ > + int comment = 0; > + > + while (offset > 0) > + switch (contents[--offset]) { > + case ';': > + case '#': > + comment = 1; > + break; > + case '\n': > + break; > + case ' ': > + case '\t': > + continue; > + default: > + comment = 0; > + } > + > + return comment; > +} This doesn't pay any attention to quoting, so I wondered if it would get fooled by a line like: key = "this content has a # comment in it" or even: [section "this section has a # comment in it"] but those don't count because the line doesn't _start_ with the comment character. Could we design one that does? 
This isn't valid: [section] key = multiline \ # with comment But I think this is: [section] key = "multiline \ # with comment" So let's see if we can fool it: -- >8 -- cat >file <<-\EOF [one] key = "multiline \ # with comment" [two] key = true EOF # should produce "multiline # with comment" ./git config --file=file one.key # this should ideally remove the section ./git config --file=file --unset two.key cat file -- 8< -- That seems to work as expected. I'm not 100% sure why, though, since I thought we'd hit the "seen_section && !is_in_comment_line" bit of the look_before loop. Running it through gdb, I'm not convinced that is_in_comment_line is working correctly, though. Shouldn't it stop when it sees the newline, and return "comment"? There's a "break" there, but it doesn't break out of the loop due to the switch statement. So we'll _always_ walk back to the beginning of file. So I suspect your test passes because it does: # this is the start of the file [section] key = true but: [anotherSection] key = true # a comment not at the start [section] key = true does the wrong thing, and removes [section]. If we fix that bug like this: diff --git a/config.c b/config.c index b04c40f76b..3b2c7e9387 100644 --- a/config.c +++ b/config.c @@ -2461,7 +2461,7 @@ static int is_in_comment_line(const char *contents, size_t offset) comment = 1; break; case '\n': - break; + return comment; case ' ': case '\t': continue; then it keeps "[section]" correctly. But now if we go back to our funny multiline example, it does the wrong thing (it keeps [two], even though that's not _really_ a comment). To be honest, I could live with that as an open bug. It's a pretty ridiculous situation, and the worst case is that we err on the side of caution and don't remove the section. And I think it would be hard to fix. 
We could look for the continuation backslash when we find the newline, but that gets fooled by: # a comment \ # with a pointless backslash You can't just notice the quote and say "oh, I'm in a quoted section" because that gets fooled by: # a pointless "quote To know whether that quote is valid or not, you have to find the other quote. But doing that backwards is hard (if not impossible). > +static void maybe_remove_section(const char *contents, size_t size, > + const char *section_name, > + size_t section_name_len, > + size_t *begin, int *i_ptr, int *new_line) > +{ > + size_t begin2, end2; > + int seen_section = 0, dummy, i = *i_ptr; > + > + /* > + * First, make sure that this is the last key in the section, and that > + * there are no comments that are possibly about the current section. > + */ > +next_entry: > + for (end2 = store.offset[i]; end2 < size; end2++) { > + switch (contents[end2]) { > + case ' ': > + case '\t': > + case '\n': > + continue; > + case '\r': > + if (++end2 < size && contents[end2] == '\n') > + continue; > + break; > + case '[': > + /* If the section name is repeated, continue */ > + if (end2 + 1 + section_name_len < size && > + contents[end2 + section_name_len] == ']' && > + !memcmp(contents + end2 + 1, section_name, > + section_name_len)) { > + end2 += section_name_len; > + continue; > + } > + goto look_before; > + case ';': > + case '#': > + /* There is a comment, cannot remove this section */ > + return; > + default: > + /* There are other keys in that section */ > + break; > + } OK, this all makes sense. We're scanning forward to find the next '[', without finding any keys or comments. We don't have to worry about quoting because we'd quit as soon as we see a key anyway. I like the special-case for finding our same section name, since that would help clean up cruft from existing versions of Git. It looks like there may be an off-by-one, though. 
Should it be checking: contents[end2 + 1 + section_name_len] == ']' to skip over the opening '['? In a simple example: [foo] bar = true [foo] we don't seem to remove the second section header. It works with the patch below: diff --git a/config.c b/config.c index b04c40f76b..48dcb52840 100644 --- a/config.c +++ b/config.c @@ -2508,10 +2508,10 @@ static void maybe_remove_section(const char *contents, size_t size, case '[': /* If the section name is repeated, continue */ if (end2 + 1 + section_name_len < size && - contents[end2 + section_name_len] == ']' && + contents[end2 + 1 + section_name_len] == ']' && !memcmp(contents + end2 + 1, section_name, section_name_len)) { - end2 += section_name_len; + end2 += section_name_len + 1; continue; } goto look_before; Unfortunately I think this whole thing breaks down with subsections. If we try this: [foo "subsection"] bar = true [foo "subsection"] then the section_name variable contains "foo.subsection", which we can't textually match. And we end up failing to remove either section (the latter one because of this loop, and the former because of the same problem in the look_before loop). > + /* > + * Uh oh... we found something else in this section. But do > + * we want to remove this, too? > + */ > + if (++i >= store.seen) > + return; > + > + begin2 = find_beginning_of_line(contents, size, store.offset[i], > + &dummy); > + if (begin2 > end2) > + return; > + > + /* Looks like we want to remove the next one, too... */ > + goto next_entry; > + } OK, makes sense. > +look_before: > + /* > + * Now, ensure that this is the first key, and that there are no > + * comments before the entry nor before the section header. 
> + */ > + for (begin2 = *begin; begin2 > 0; ) > + switch (contents[begin2 - 1]) { > + case ' ': > + case '\t': > + begin2--; > + continue; > + case '\n': > + if (--begin2 > 0 && contents[begin2 - 1] == '\r') > + begin2--; > + continue; > + case ']': > + if (begin2 > section_name_len + 1 && > + contents[begin2 - section_name_len - 2] == '[' && > + !memcmp(contents + begin2 - section_name_len - 1, > + section_name, section_name_len)) { > + begin2 -= section_name_len + 2; > + seen_section = 1; > + continue; > + } OK, this is the backwards mirror image of the earlier part. Which makes sense. And this handles the reverse case for the doubled section name: [foo] [foo] bar = true because we'd hit this section-name check twice, and just set "seen_section = 1" both times. So that works (modulo the subsection parsing thing). As far as quoting goes, we're now coming from the back of each line. And I don't think we strictly require double-quotes around string values. So imagine this: [one] foo = this has [brackets] bar = this does not When deleting one.bar, we'd erroneously think that closing bracket is the prior section header. I _think_ it behaves correctly, though, because we then say "well, delete everything back to that bracket character". Which happens to be the correct thing to do anyway. But let's get more devious. What about this: [one] foo = fake section [one] bar = whatever If I unset one.bar with your patch, I end up with the truncated: [one] foo = fake sectio Yikes. This is obviously a ridiculous example, but the failure case is pretty nasty. Again, the tricky thing here is that we're parsing backwards. We don't know what's syntactically relevant and what isn't.
> + */ > + if (is_in_comment_line(contents, begin2)) > + return; This would get fooled if we allowed line continuation in subsection names, like: [one "subsection\ # with newline"] key = true but it looks like our parser doesn't allow that (aside from it being slightly insane, of course). Good. > + /* > + * We encountered the previous section header: This > + * really was the only entry, so remove the entire > + * section. > + */ > + if (contents[begin2] != '\n') { > + begin2--; > + *new_line = 1; > + } > + > + store.offset[i] = end2; > + *begin = begin2; > + *i_ptr = i; > + return; OK, makes sense. > + default: > + /* > + * Any other character means it is either a comment or > + * a config setting; if it is a comment, we do not want > + * to remove this section. If it is a config setting, > + * we only want to remove this section if this is > + * already the next section. > + */ > + if (seen_section && > + !is_in_comment_line(contents, begin2)) { > + if (contents[begin2] != '\n') { > + begin2--; > + *new_line = 1; > + } > + > + store.offset[i] = end2; > + *begin = begin2; > + *i_ptr = i; > + } > + return; > + } Here's where we get fooled by is_in_comment_line() that I showed at the beginning. We don't have to worry about other quoting, because any key (quoted or not) would cause us to abort, since it's in the section. > + /* This section extends to the beginning of the file. */ > + store.offset[i] = end2; > + *begin = begin2; > + *i_ptr = i; > +} Right, makes sense. Ok, phew. That was a tough read. So here's what I see: 1. Minor bug in is_in_comment_line(), patch above. 2. Minor bug in matching section names, patch above. 3. Matching subsection names doesn't work. I think this should be fixable with a helper function which can match '[one "two"]' when given "one.two". 4. Backwards parsing causes is_in_comment_line to trigger more than it should. I can live with that because the trigger is arcane, and the error behavior is pretty harmless. 5. 
Backwards parsing can find a bogus section. Also arcane, but the error behavior is pretty scary. (4) and (5) are the ones that I don't see a way to fix, given the current way in which we do the config-writing (i.e., running it through the regular read-parser and then trying to "patch up" the found locations). I think that's also what's contributing to the code being hard to read, since you end up doing quite a lot of manual re-parsing. I think the sane way to do this would be to parse the whole thing into a tree (that includes things like comments and whitespace), and then we could much more easily manipulate that tree, without dealing with the parsing (forwards _and_ backwards). But that's a pretty big change from the current code. It also potentially means duplicating the parsing logic, unless we teach the regular reader to do the tree-parse, and then pick out the config from that. That's likely much slower than the existing parser (since we'd allocate a bunch of tree nodes instead of just dumping strings to the callbacks). But these days we cache the parsed config anyway, so I'm not sure if a slight slowdown would actually matter that much. I guess the holy grail would be a parser which reports _all_ syntactic events (section names, keys, comments, whitespace, etc) as a stream without storing anything. And then the normal reader could just discard the non-key events, and the writer here could build the tree from those events. -Peff ^ permalink raw reply related [flat|nested] 103+ messages in thread
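For item (3) in the list above, the helper could look roughly like the following. This is only a sketch, not code from the series: the name header_matches_section() and its signature are made up for illustration. It assumes the dotted name keeps the subsection's original case (so the section part is compared case-insensitively and the subsection exactly), and that a backslash inside the quoted subsection escapes the next character.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch series): return 1 if the
 * header text starting at hdr[0] == '[' names the section given in
 * dotted form, so that "one.two" matches `[one "two"]` and "one"
 * matches `[one]`.  `len` is the number of readable bytes at `hdr`.
 */
static int header_matches_section(const char *hdr, size_t len,
				  const char *name)
{
	size_t i = 1;

	if (len < 2 || hdr[0] != '[')
		return 0;

	/* Section part: case-insensitive, ends at whitespace or ']'. */
	for (; i < len && hdr[i] != ']' && !isspace((unsigned char)hdr[i]);
	     i++, name++)
		if (tolower((unsigned char)hdr[i]) !=
		    tolower((unsigned char)*name))
			return 0;

	/* Plain `[section]`: the dotted name must be used up as well. */
	if (i < len && hdr[i] == ']')
		return *name == '\0';

	/* Otherwise expect whitespace, then a double-quoted subsection. */
	while (i < len && isspace((unsigned char)hdr[i]))
		i++;
	if (i >= len || hdr[i++] != '"' || *name++ != '.')
		return 0;

	/* Subsection part: exact match; a backslash escapes one char. */
	for (; i < len && hdr[i] != '"'; i++, name++) {
		if (hdr[i] == '\\' && ++i >= len)
			return 0;
		if (!*name || hdr[i] != *name)
			return 0;
	}

	/* Need the closing quote, the closing bracket, and no leftovers. */
	return i + 1 < len && hdr[i] == '"' && hdr[i + 1] == ']' &&
	       *name == '\0';
}
```

Note that this only helps when scanning forward from a known '['. It does nothing for points (4) and (5), where the trouble is not knowing whether a given ']' belongs to a header in the first place.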
* Re: [PATCH 7/9] git config --unset: remove empty sections (in normal situations) 2018-03-29 21:32 ` Jeff King @ 2018-03-30 13:00 ` Johannes Schindelin 2018-03-30 13:09 ` Jeff King 0 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 13:00 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:19:00PM +0200, Johannes Schindelin wrote: > > > Let's generalize this observation to this conservative strategy: if we > > are removing the last entry from a section, and there are no comments > > inside that section nor surrounding it, then remove the entire section. > > Otherwise behave as before: leave the now-empty section (including those > > comments, even the one about the now-deleted entry). > > Yep, as I said earlier, this makes a ton of sense to me. > > [... thorough review ...] Thank you for taking the time (and figuring out my off-by-ones, am I not the king of those?). Your in-depth analysis of the backtracking approach also makes sense, in particular the awful bug that looks very, very similar to what 1/9 fixes elsewhere. I'll take some time to go over your comments in detail, but there is one suggestion that I think I'll want to pursue first: > I guess the holy grail would be a parser which reports _all_ syntactic > events (section names, keys, comments, whitespace, etc) as a stream > without storing anything. And then the normal reader could just discard > the non-key events, and the writer here could build the tree from those > events. I already changed the do_config_from_file()/do_config_from() code path to allow for handing back section headers. 
And I *think* that approach should be easily extended to allow for an optional callback for these syntactic events (and we do not need more than that, as the parsed "tree" really is a list: there is nothing nested about ini files, so we really only have a linear list of blocks (event type, offset range)). I'll think about this a little bit, and hopefully come back with v2 in a while that uses that approach. Thank you so much for that suggestion, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 7/9] git config --unset: remove empty sections (in normal situations) 2018-03-30 13:00 ` Johannes Schindelin @ 2018-03-30 13:09 ` Jeff King 0 siblings, 0 replies; 103+ messages in thread From: Jeff King @ 2018-03-30 13:09 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Fri, Mar 30, 2018 at 03:00:06PM +0200, Johannes Schindelin wrote: > > I guess the holy grail would be a parser which reports _all_ syntactic > > events (section names, keys, comments, whitespace, etc) as a stream > > without storing anything. And then the normal reader could just discard > > the non-key events, and the writer here could build the tree from those > > events. > > I already changed the do_config_from_file()/do_config_from() code path to > allow for handing back section headers. And I *think* that approach should > be easily extended to allow for an optional callback for these syntactic > events (and we do not need more than that, as the parsed "tree" really is > a list: there is nothing nested about ini files, so we really only have a > linear list of blocks (event type, offset range)). True. I was thinking we'd want sections with keys, whitespace, and comments under them. But even that does not really make sense. As this patch series shows, comments do not "belong" to a section, and the file really needs to be considered as a stream. So yeah, if we can parse it into a sequence of events in one forward-pass and then manipulate that sequence, I think it should be sufficient (and _way_ more readable than the current code, even before the bits you are trying to fix here). > I'll think about this a little bit, and hopefully come back with v2 in a > while that uses that approach. > > Thank you so much for that suggestion, Great. Thanks for working on this. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
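To make the "linear list of blocks (event type, offset range)" idea concrete, here is a rough standalone sketch of a single forward pass. None of this is git's parser: scan_events(), count_events(), struct event and the EV_* names are invented for illustration, and the syntax handling is deliberately simplified (quotes, backslash escapes and line continuation in values, comment characters outside quotes), with no error reporting.

```c
#include <stddef.h>

enum event_type { EV_WHITESPACE, EV_COMMENT, EV_SECTION, EV_ENTRY };

struct event {
	enum event_type type;
	size_t begin, end;	/* half-open byte range [begin, end) */
};

static int is_blank(char c)
{
	return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Hypothetical forward scanner: split `buf` into a flat list of events.
 * Returns the number of events, or (size_t)-1 if `max` is too small.
 */
static size_t scan_events(const char *buf, size_t size,
			  struct event *ev, size_t max)
{
	size_t pos = 0, n = 0;

	while (pos < size) {
		size_t begin = pos;
		enum event_type type;
		int quoted = 0;

		if (is_blank(buf[pos])) {
			type = EV_WHITESPACE;
			while (pos < size && is_blank(buf[pos]))
				pos++;
		} else if (buf[pos] == '#' || buf[pos] == ';') {
			type = EV_COMMENT;	/* runs to end of line */
			while (pos < size && buf[pos] != '\n')
				pos++;
		} else if (buf[pos] == '[') {
			type = EV_SECTION;	/* up to the unquoted ']' */
			for (pos++; pos < size; pos++) {
				if (buf[pos] == '\\' && quoted && pos + 1 < size)
					pos++;
				else if (buf[pos] == '"')
					quoted = !quoted;
				else if (buf[pos] == ']' && !quoted) {
					pos++;
					break;
				}
			}
		} else {
			type = EV_ENTRY;	/* key plus optional value */
			for (; pos < size; pos++) {
				if (buf[pos] == '\\' && pos + 1 < size)
					pos++;	/* escape or continuation */
				else if (buf[pos] == '"')
					quoted = !quoted;
				else if (buf[pos] == '\n')
					break;
				else if (!quoted &&
					 (buf[pos] == '#' || buf[pos] == ';'))
					break;	/* trailing comment starts */
			}
		}

		if (n == max)
			return (size_t)-1;
		ev[n].type = type;
		ev[n].begin = begin;
		ev[n].end = pos;
		n++;
	}
	return n;
}

/* Count the events of one type, e.g. to check for comments. */
static size_t count_events(const struct event *ev, size_t n,
			   enum event_type type)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (ev[i].type == type)
			count++;
	return count;
}
```

Because every byte is classified while moving forward, the "fake section [one]" value and the quoted multi-line value with a '#' in it both end up inside an entry event, so neither can be mistaken for a header or a comment. Deciding whether a section is removable then becomes a walk over the neighboring events rather than a re-parse of raw bytes.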
* [PATCH 8/9] git_config_set: use do_config_from_file() directly 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (6 preceding siblings ...) 2018-03-29 15:19 ` [PATCH 7/9] git config --unset: remove empty sections (in normal situations) Johannes Schindelin @ 2018-03-29 15:19 ` Johannes Schindelin 2018-03-29 21:38 ` Jeff King 2018-03-29 15:19 ` [PATCH 9/9] git_config_set: reuse empty sections Johannes Schindelin ` (4 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:19 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Technically, it is the git_config_set_multivar_in_file_gently() function that we modify here (but the oneline would get too long if we were that precise). This change prepares the git_config_set machinery to allow reusing empty sections, by using the file-local function do_config_from_file() directly (whose signature can then be changed without any effect outside of config.c). An incidental benefit is that we avoid a level of indirection, and we also avoid calling flockfile()/funlockfile() when we already know that we are not operating on stdin/stdout here. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/config.c b/config.c index 503aef4b318..eb1e0d335fc 100644 --- a/config.c +++ b/config.c @@ -2706,6 +2706,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, struct stat st; size_t copy_begin, copy_end; int i, new_line = 0; + FILE *f; if (value_regex == NULL) store.value_regex = NULL; @@ -2739,7 +2740,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, * As a side effect, we make sure to transform only a valid * existing config file. 
*/ - if (git_config_from_file(store_aux, config_filename, NULL)) { + f = fopen_or_warn(config_filename, "r"); + if (!f || do_config_from_file(store_aux, CONFIG_ORIGIN_FILE, + config_filename, config_filename, + f, NULL)) { error("invalid config file %s", config_filename); if (store.value_regex != NULL && store.value_regex != CONFIG_REGEX_NONE) { @@ -2747,8 +2751,11 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, free(store.value_regex); } ret = CONFIG_INVALID_FILE; + if (f) + fclose(f); goto out_free; - } + } else + fclose(f); if (store.value_regex != NULL && store.value_regex != CONFIG_REGEX_NONE) { -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly 2018-03-29 15:19 ` [PATCH 8/9] git_config_set: use do_config_from_file() directly Johannes Schindelin @ 2018-03-29 21:38 ` Jeff King 2018-03-30 13:02 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-29 21:38 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 05:19:04PM +0200, Johannes Schindelin wrote: > Technically, it is the git_config_set_multivar_in_file_gently() > function that we modify here (but the oneline would get too long if we > were that precise). > > This change prepares the git_config_set machinery to allow reusing empty > sections, by using the file-local function do_config_from_file() > directly (whose signature can then be changed without any effect outside > of config.c). > > An incidental benefit is that we avoid a level of indirection, and we > also avoid calling flockfile()/funlockfile() when we already know that > we are not operating on stdin/stdout here. I'm not sure I understand that last paragraph. What does flockfile() have to do with stdin/stdout? The point of those calls is that we're locking the FILE handle, so that it's safe for the lower-level config code to run getc_unlocked(), which is faster. So without those, we're calling getc_unlocked() without holding the lock. I think it probably works in practice because we know that we're single-threaded, but it seems a bit sketchy. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly 2018-03-29 21:38 ` Jeff King @ 2018-03-30 13:02 ` Johannes Schindelin 2018-03-30 13:14 ` Jeff King 0 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 13:02 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:19:04PM +0200, Johannes Schindelin wrote: > > > Technically, it is the git_config_set_multivar_in_file_gently() > > function that we modify here (but the oneline would get too long if we > > were that precise). > > > > This change prepares the git_config_set machinery to allow reusing empty > > sections, by using the file-local function do_config_from_file() > > directly (whose signature can then be changed without any effect outside > > of config.c). > > > > An incidental benefit is that we avoid a level of indirection, and we > > also avoid calling flockfile()/funlockfile() when we already know that > > we are not operating on stdin/stdout here. > > I'm not sure I understand that last paragraph. What does flockfile() have > to do with stdin/stdout? > > The point of those calls is that we're locking the FILE handle, so that > it's safe for the lower-level config code to run getc_unlocked(), which > is faster. > > So without those, we're calling getc_unlocked() without holding the > lock. I think it probably works in practice because we know that we're > single-threaded, but it seems a bit sketchy. Oops. I misunderstood the purpose of flockfile(), then. I thought it was only about multiple users of stdin/stdout. Will have a look whether flockfile()/funlockfile() can be moved into do_config_from_file() instead. Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly 2018-03-30 13:02 ` Johannes Schindelin @ 2018-03-30 13:14 ` Jeff King 2018-03-30 14:01 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-30 13:14 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Fri, Mar 30, 2018 at 03:02:00PM +0200, Johannes Schindelin wrote: > > I'm not sure I understand that last paragraph. What does flockfile() have > > to do with stdin/stdout? > > > > The point of those calls is that we're locking the FILE handle, so that > > it's safe for the lower-level config code to run getc_unlocked(), which > > is faster. > > > > So without those, we're calling getc_unlocked() without holding the > > lock. I think it probably works in practice because we know that we're > > single-threaded, but it seems a bit sketchy. > > Oops. I misunderstood the purpose of flockfile(), then. I thought it was > only about multiple users of stdin/stdout. > > Will have a look whether flockfile()/funlockfile() can be moved into > do_config_from_file() instead. In a sense stdin/stdout are much more susceptible to this because they're global variables, and any thread may touch them. For the config code, we open our own handle that we don't expose elsewhere. So probably it would be fine just to use the unlocked variants even without locking. But IMHO it's good practice to always flockfile() before using the unlocked variants. My reading of POSIX is that it's OK to use the unlocked variants without holding the lock (if you know there won't be contention), but if it's not hard to err on the side of safety, I'd prefer it. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly 2018-03-30 13:14 ` Jeff King @ 2018-03-30 14:01 ` Johannes Schindelin 2018-03-30 14:08 ` Jeff King 0 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 14:01 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Fri, 30 Mar 2018, Jeff King wrote: > On Fri, Mar 30, 2018 at 03:02:00PM +0200, Johannes Schindelin wrote: > > > > I'm not sure I understand that last paragraph. What does flockfile() have > > > to do with stdin/stdout? > > > > > > The point of those calls is that we're locking the FILE handle, so that > > > it's safe for the lower-level config code to run getc_unlocked(), which > > > is faster. > > > > > > So without those, we're calling getc_unlocked() without holding the > > > lock. I think it probably works in practice because we know that we're > > > single-threaded, but it seems a bit sketchy. > > > > Oops. I misunderstood the purpose of flockfile(), then. I thought it was > > only about multiple users of stdin/stdout. > > > > Will have a look whether flockfile()/funlockfile() can be moved into > > do_config_from_file() instead. > > In a sense stdin/stdout are much more susceptible to this because > they're global variables, and any thread may touch them. For the config > code, we open our own handle that we don't expose elsewhere. So probably > it would be fine just to use the unlocked variants even without locking. > > But IMHO it's good practice to always flockfile() before using the > unlocked variants. My reading of POSIX is that it's OK to use the > unlocked variants without holding the lock (if you know there won't be > contention), but if it's not hard to err on the side of safety, I'd > prefer it. You know what is *really* funny? 
-- snip -- static int git_config_from_stdin(config_fn_t fn, void *data) { return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data, 0); } int git_config_from_file(config_fn_t fn, const char *filename, void *data) { int ret = -1; FILE *f; f = fopen_or_warn(filename, "r"); if (f) { flockfile(f); ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data, 0); funlockfile(f); fclose(f); } return ret; } -- snap -- So the _stdin variant *goes out of its way not to flockfile()*... But I guess all this will become moot when I start handing down the config options. It does mean that I have to change the signatures in header files, oh well ;-) But then I can drop this here patch and we can stop musing about flockfile() ;-) Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly 2018-03-30 14:01 ` Johannes Schindelin @ 2018-03-30 14:08 ` Jeff King 2018-03-30 19:04 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-30 14:08 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Fri, Mar 30, 2018 at 04:01:56PM +0200, Johannes Schindelin wrote: > You know what is *really* funny? > > -- snip -- > static int git_config_from_stdin(config_fn_t fn, void *data) > { > return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data, 0); > } > > int git_config_from_file(config_fn_t fn, const char *filename, void *data) > { > int ret = -1; > FILE *f; > > f = fopen_or_warn(filename, "r"); > if (f) { > flockfile(f); > ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data, 0); > funlockfile(f); > fclose(f); > } > return ret; > } > -- snap -- > > So the _stdin variant *goes out of its way not to flockfile()*... *facepalm* That's probably my fault, since git_config_from_stdin() existed already when I did the flockfile stuff. Probably the flockfile should go into do_config_from_file(), where we specify to use the unlocked variants. > But I guess all this will become moot when I start handing down the config > options. It does mean that I have to change the signatures in header > files, oh well ;-) > > But then I can drop this here patch and we can stop musing about > flockfile() ;-) Yeah, I'll wait to see how your refactor turns out. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly 2018-03-30 14:08 ` Jeff King @ 2018-03-30 19:04 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 19:04 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Fri, 30 Mar 2018, Jeff King wrote: > On Fri, Mar 30, 2018 at 04:01:56PM +0200, Johannes Schindelin wrote: > > > You know what is *really* funny? > > > > -- snip -- > > static int git_config_from_stdin(config_fn_t fn, void *data) > > { > > return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data, 0); > > } > > > > int git_config_from_file(config_fn_t fn, const char *filename, void *data) > > { > > int ret = -1; > > FILE *f; > > > > f = fopen_or_warn(filename, "r"); > > if (f) { > > flockfile(f); > > ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data, 0); > > funlockfile(f); > > fclose(f); > > } > > return ret; > > } > > -- snap -- > > > > So the _stdin variant *goes out of its way not to flockfile()*... > > *facepalm* That's probably my fault, since git_config_from_stdin() > existed already when I did the flockfile stuff. > > Probably the flockfile should go into do_config_from_file(), where we > specify to use the unlocked variants. Ah, that makes sense now! I am glad I could also help ;-) > > But I guess all this will become moot when I start handing down the config > > options. It does mean that I have to change the signatures in header > > files, oh well ;-) > > > > But then I can drop this here patch and we can stop musing about > > flockfile() ;-) > > Yeah, I'll wait to see how your refactor turns out. I don't think I'll touch too much in that part of the code. My changes should not cause merge conflicts with a patch moving the flockfile()/funlockfile() calls to do_config_from_file(). 
Ciao, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
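For readers following the flockfile() discussion, the locking discipline being argued for can be shown in isolation. The sketch below is plain POSIX stdio, not git code: count_newlines_locked() is a hypothetical stand-in for a reader that uses getc_unlocked() internally, and it takes the FILE lock itself, which is the shape proposed above for do_config_from_file().

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a parser loop: take the FILE lock once,
 * read with the cheaper unlocked variant, and release the lock before
 * returning.  Callers then do not need to remember to lock, whether
 * they pass stdin or a handle they opened themselves.
 */
static long count_newlines_locked(FILE *f)
{
	long lines = 0;
	int c;

	flockfile(f);
	while ((c = getc_unlocked(f)) != EOF)
		if (c == '\n')
			lines++;
	funlockfile(f);

	return lines;
}
```

Keeping the lock and the unlocked reads in the same function means no caller can end up in the situation noticed above, where one entry point locks and another does not.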
* [PATCH 9/9] git_config_set: reuse empty sections 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (7 preceding siblings ...) 2018-03-29 15:19 ` [PATCH 8/9] git_config_set: use do_config_from_file() directly Johannes Schindelin @ 2018-03-29 15:19 ` Johannes Schindelin 2018-03-29 21:50 ` Jeff King 2018-03-29 17:58 ` [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Stefan Beller ` (3 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Johannes Schindelin @ 2018-03-29 15:19 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley It can happen quite easily that the last setting in a config section is removed, and to avoid confusion when there are comments in the config about that section, we keep a lone section header, i.e. an empty section. The code to add new entries in the config tries to be cute by reusing the parsing code that is used to retrieve config settings, but that poses the problem that the latter use case does *not* care about empty sections, therefore even the former use case won't see them. Fix this by introducing a mode where the parser also reports empty sections (with a trailing '.' as tell-tale), and then using that when adding new config entries.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.c          | 32 +++++++++++++++++++++++---------
 t/t1300-config.sh |  2 +-
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/config.c b/config.c
index eb1e0d335fc..b04c40f76bc 100644
--- a/config.c
+++ b/config.c
@@ -653,13 +653,15 @@ static int get_base_var(struct strbuf *name)
 	}
 }
 
-static int git_parse_source(config_fn_t fn, void *data)
+static int git_parse_source(config_fn_t fn, void *data,
+			    int include_section_headers)
 {
 	int comment = 0;
 	int baselen = 0;
 	struct strbuf *var = &cf->var;
 	int error_return = 0;
 	char *error_msg = NULL;
+	int saw_section_header = 0;
 
 	/* U+FEFF Byte Order Mark in UTF8 */
 	const char *bomptr = utf8_bom;
@@ -685,6 +687,16 @@ static int git_parse_source(config_fn_t fn, void *data)
 			if (cf->eof)
 				return 0;
 			comment = 0;
+			if (saw_section_header) {
+				if (include_section_headers) {
+					cf->linenr--;
+					error_return = fn(var->buf, NULL, data);
+					if (error_return < 0)
+						break;
+					cf->linenr++;
+				}
+				saw_section_header = 0;
+			}
 			continue;
 		}
 		if (comment || isspace(c))
@@ -700,6 +712,7 @@ static int git_parse_source(config_fn_t fn, void *data)
 				break;
 			strbuf_addch(var, '.');
 			baselen = var->len;
+			saw_section_header = 1;
 			continue;
 		}
 		if (!isalpha(c))
@@ -1398,7 +1411,8 @@ int git_default_config(const char *var, const char *value, void *dummy)
  * fgetc, ungetc, ftell of top need to be initialized before calling
  * this function.
  */
-static int do_config_from(struct config_source *top, config_fn_t fn, void *data)
+static int do_config_from(struct config_source *top, config_fn_t fn, void *data,
+			  int include_section_headers)
 {
 	int ret;
 
@@ -1410,7 +1424,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data)
 	strbuf_init(&top->var, 1024);
 	cf = top;
 
-	ret = git_parse_source(fn, data);
+	ret = git_parse_source(fn, data, include_section_headers);
 
 	/* pop config-file parsing state stack */
 	strbuf_release(&top->value);
@@ -1423,7 +1437,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data)
 static int do_config_from_file(config_fn_t fn,
 		const enum config_origin_type origin_type,
 		const char *name, const char *path, FILE *f,
-		void *data)
+		void *data, int include_section_headers)
 {
 	struct config_source top;
 
@@ -1436,12 +1450,12 @@ static int do_config_from_file(config_fn_t fn,
 	top.do_ungetc = config_file_ungetc;
 	top.do_ftell = config_file_ftell;
 
-	return do_config_from(&top, fn, data);
+	return do_config_from(&top, fn, data, include_section_headers);
 }
 
 static int git_config_from_stdin(config_fn_t fn, void *data)
 {
-	return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data);
+	return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data, 0);
 }
 
 int git_config_from_file(config_fn_t fn, const char *filename, void *data)
@@ -1452,7 +1466,7 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data)
 	f = fopen_or_warn(filename, "r");
 	if (f) {
 		flockfile(f);
-		ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data);
+		ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data, 0);
 		funlockfile(f);
 		fclose(f);
 	}
@@ -1475,7 +1489,7 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ
 	top.do_ungetc = config_buf_ungetc;
 	top.do_ftell = config_buf_ftell;
 
-	return do_config_from(&top, fn, data);
+	return do_config_from(&top, fn, data, 0);
 }
 
 int git_config_from_blob_oid(config_fn_t fn,
@@ -2743,7 +2757,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 		f = fopen_or_warn(config_filename, "r");
 		if (!f || do_config_from_file(store_aux, CONFIG_ORIGIN_FILE,
 					      config_filename, config_filename,
-					      f, NULL)) {
+					      f, NULL, 1)) {
 			error("invalid config file %s", config_filename);
 			if (store.value_regex != NULL &&
 			    store.value_regex != CONFIG_REGEX_NONE) {
diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index ecbcc9cf3d0..867397ae930 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1463,7 +1463,7 @@ test_expect_success '--unset-all removes section if empty & uncommented' '
 	test_line_count = 0 .git/config
 '
 
-test_expect_failure 'adding a key into an empty section reuses header' '
+test_expect_success 'adding a key into an empty section reuses header' '
 	cat >.git/config <<-\EOF &&
 	[section]
 	EOF
-- 
2.16.2.windows.1.26.g2cc3565eb4b
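
The trailing-'.' convention from the commit message can be illustrated with a small callback. This is only a sketch under stated assumptions: `struct find_section`, `is_section_header_report()` and `find_section_cb()` are hypothetical names that do not exist in git. Only the callback shape (key, value, data) follows the `fn(var->buf, NULL, data)` call in the patch, and the comparison assumes the key has already been normalized by the parser.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback data: the section we hope to reuse, and whether we saw it */
struct find_section {
	const char *section;	/* e.g. "section", without the trailing '.' */
	int seen;
};

/* A key that ends in '.' is a bare section-header report, not a real setting */
static int is_section_header_report(const char *key)
{
	size_t len = strlen(key);

	return len > 1 && key[len - 1] == '.';
}

/* Callback in the style of config_fn_t: return 0 to continue, negative to abort */
static int find_section_cb(const char *key, const char *value, void *data)
{
	struct find_section *fs = data;
	size_t len = strlen(fs->section);

	(void)value;	/* header reports carry a NULL value */

	if (is_section_header_report(key) &&
	    strlen(key) == len + 1 && !strncmp(key, fs->section, len))
		fs->seen = 1;
	return 0;
}
```

A caller that wants to add `section.key` would run the parser with header reporting enabled and, if `seen` ends up set, write the new entry under the existing header instead of emitting a second `[section]` line.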
* Re: [PATCH 9/9] git_config_set: reuse empty sections
  2018-03-29 15:19 ` [PATCH 9/9] git_config_set: reuse empty sections Johannes Schindelin
@ 2018-03-29 21:50   ` Jeff King
  2018-03-30 13:15     ` Johannes Schindelin
  0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2018-03-29 21:50 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

On Thu, Mar 29, 2018 at 05:19:09PM +0200, Johannes Schindelin wrote:

> It can happen quite easily that the last setting in a config section is
> removed, and to avoid confusion when there are comments in the config
> about that section, we keep a lone section header, i.e. an empty
> section.
>
> The code to add new entries in the config tries to be cute by reusing
> the parsing code that is used to retrieve config settings, but that
> poses the problem that the latter use case does *not* care about empty
> sections, therefore even the former use case won't see them.
>
> Fix this by introducing a mode where the parser also reports empty
> sections (with a trailing '.' as tell-tale), and then using that when
> adding new config entries.

Heh, so it seems we are partway to the "event-stream" suggestion I made
earlier. I agree this is the right way to approach this problem.

I wondered if we allow keys to end in ".", but it seems that we don't.

> diff --git a/config.c b/config.c
> index eb1e0d335fc..b04c40f76bc 100644
> --- a/config.c
> +++ b/config.c
> @@ -653,13 +653,15 @@ static int get_base_var(struct strbuf *name)
>  	}
>  }
>
> -static int git_parse_source(config_fn_t fn, void *data)
> +static int git_parse_source(config_fn_t fn, void *data,
> +			    int include_section_headers)

We already have a "struct config_options", but we do a terrible job of
passing it around (since it only impacts the include stuff right now,
and that all gets handled at a very outer level).

Rather than plumb this one int through everywhere, should we add it to
that struct and plumb the struct through?

-Peff
* Re: [PATCH 9/9] git_config_set: reuse empty sections
  2018-03-29 21:50   ` Jeff King
@ 2018-03-30 13:15     ` Johannes Schindelin
  0 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-03-30 13:15 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

Hi Peff,

On Thu, 29 Mar 2018, Jeff King wrote:

> On Thu, Mar 29, 2018 at 05:19:09PM +0200, Johannes Schindelin wrote:
>
> > It can happen quite easily that the last setting in a config section is
> > removed, and to avoid confusion when there are comments in the config
> > about that section, we keep a lone section header, i.e. an empty
> > section.
> >
> > The code to add new entries in the config tries to be cute by reusing
> > the parsing code that is used to retrieve config settings, but that
> > poses the problem that the latter use case does *not* care about empty
> > sections, therefore even the former use case won't see them.
> >
> > Fix this by introducing a mode where the parser also reports empty
> > sections (with a trailing '.' as tell-tale), and then using that when
> > adding new config entries.
>
> Heh, so it seems we are partway to the "event-stream" suggestion I made
> earlier. I agree this is the right way to approach this problem.
>
> I wondered if we allow keys to end in ".", but it seems that we don't.
>
> > diff --git a/config.c b/config.c
> > index eb1e0d335fc..b04c40f76bc 100644
> > --- a/config.c
> > +++ b/config.c
> > @@ -653,13 +653,15 @@ static int get_base_var(struct strbuf *name)
> >  	}
> >  }
> >
> > -static int git_parse_source(config_fn_t fn, void *data)
> > +static int git_parse_source(config_fn_t fn, void *data,
> > +			    int include_section_headers)
>
> We already have a "struct config_options", but we do a terrible job of
> passing it around (since it only impacts the include stuff right now,
> and that all gets handled at a very outer level).
>
> Rather than plumb this one int through everywhere, should we add it to
> that struct and plumb the struct through?

Yesss!

Again, thank you so much for this really valuable review. This is even
better than what I hoped for.

Ciao,
Dscho
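
The difference between threading one int through every layer and threading an options struct can be sketched as follows. Everything here is hypothetical: `struct parse_opts_sketch` is a stand-in and not git's real `struct config_options`, and the three functions only mimic the call chain's layering. The NULL-means-defaults convention is an assumption chosen so that callers which need nothing special can simply pass NULL.

```c
#include <stddef.h>

/* Hypothetical stand-in for an options struct; new knobs become new fields */
struct parse_opts_sketch {
	int include_section_headers;
};

/* Innermost layer: the only place that inspects the flag; NULL means defaults */
static int parse_source_sketch(const struct parse_opts_sketch *opts)
{
	return opts && opts->include_section_headers;
}

/* Intermediate layers pass the pointer along without looking inside it */
static int config_from_sketch(const struct parse_opts_sketch *opts)
{
	return parse_source_sketch(opts);
}

static int config_from_file_sketch(const struct parse_opts_sketch *opts)
{
	return config_from_sketch(opts);
}
```

With this shape, adding a second option later touches the struct and the innermost function only, whereas a plain int parameter has to be added to every signature in between.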
* Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (8 preceding siblings ...) 2018-03-29 15:19 ` [PATCH 9/9] git_config_set: reuse empty sections Johannes Schindelin @ 2018-03-29 17:58 ` Stefan Beller 2018-03-30 12:14 ` Johannes Schindelin 2018-03-29 19:39 ` Jeff King ` (2 subsequent siblings) 12 siblings, 1 reply; 103+ messages in thread From: Stefan Beller @ 2018-03-29 17:58 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 8:18 AM, Johannes Schindelin <johannes.schindelin@gmx.de> wrote: > So what is the argument against this extra care to detect comments? Well, if > you have something like this: > > [section] > ; Here we comment about the variable called snarf > snarf = froop > > and we run `git config --unset section.snarf`, we end up with this config: > > [section] > ; Here we comment about the variable called snarf > > which obviously does not make sense. However, that is already established > behavior for quite a few years, and I do not even try to think of a way how > this could be solved. By commenting out the key/value pair instead of deleting it. It's called --unset, not --delete ;) Now onto reviewing the patches. Stefan ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) 2018-03-29 17:58 ` [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Stefan Beller @ 2018-03-30 12:14 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 12:14 UTC (permalink / raw) To: Stefan Beller Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Jason Frey, Philip Oakley Hi Stefan, On Thu, 29 Mar 2018, Stefan Beller wrote: > On Thu, Mar 29, 2018 at 8:18 AM, Johannes Schindelin > <johannes.schindelin@gmx.de> wrote: > > > So what is the argument against this extra care to detect comments? Well, if > > you have something like this: > > > > [section] > > ; Here we comment about the variable called snarf > > snarf = froop > > > > and we run `git config --unset section.snarf`, we end up with this config: > > > > [section] > > ; Here we comment about the variable called snarf > > > > which obviously does not make sense. However, that is already established > > behavior for quite a few years, and I do not even try to think of a way how > > this could be solved. > > By commenting out the key/value pair instead of deleting it. > It's called --unset, not --delete ;) That would open the door to new bug reports when a user starts with this concocted config: [section] # This is a comment about the `key` setting key = value and then does this: git config --unset section.key git config section.key value git config --unset section.key git config section.key value git config --unset section.key git config section.key value and then ends up with a config like this: [section] # This is a comment about the `key` setting ;key = value ;key = value ;key = value key = value And note that the comment might be about `value` instead, so reusing a commented-out `key` setting won't fly, either. 

I *did* give this problem a couple of minutes of thought before writing
my assessment that is quoted above ;-)

Ciao,
Dscho
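
The accumulation described above can be reproduced with a toy model. This is a sketch of the *hypothetical* "comment out instead of delete" behavior that was suggested, not of anything git does; `struct toy_config` and all function names are invented for illustration, and the model only tracks the lines below a single section header.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINES 16
#define LINE_LEN 64

/* Toy config: just the lines below one section header */
struct toy_config {
	char lines[MAX_LINES][LINE_LEN];
	int nr;
};

/* Hypothetical --unset that comments out live `key = ...` lines instead of deleting them */
static void unset_by_commenting(struct toy_config *c, const char *key)
{
	size_t klen = strlen(key);
	int i;

	for (i = 0; i < c->nr; i++) {
		char tmp[LINE_LEN];

		if (strncmp(c->lines[i], key, klen) || c->lines[i][klen] != ' ')
			continue;	/* not a live entry for this key */
		snprintf(tmp, sizeof(tmp), ";%s", c->lines[i]);
		strcpy(c->lines[i], tmp);
	}
}

/* Setting a value appends a fresh line; it does not revive a commented-out one */
static void set_value(struct toy_config *c, const char *key, const char *value)
{
	if (c->nr < MAX_LINES) {
		snprintf(c->lines[c->nr], LINE_LEN, "%s = %s", key, value);
		c->nr++;
	}
}

/* Count the lines that start with `prefix` */
static int count_lines_with_prefix(const struct toy_config *c, const char *prefix)
{
	size_t plen = strlen(prefix);
	int i, n = 0;

	for (i = 0; i < c->nr; i++)
		if (!strncmp(c->lines[i], prefix, plen))
			n++;
	return n;
}
```

Starting from one `key = value` line and running three unset/set cycles leaves three `;key = value` lines plus one live entry in this model, which matches the config shown in the message above.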
* Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) 2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (9 preceding siblings ...) 2018-03-29 17:58 ` [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Stefan Beller @ 2018-03-29 19:39 ` Jeff King 2018-03-30 12:35 ` Johannes Schindelin 2018-03-30 14:17 ` Ævar Arnfjörð Bjarmason 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin 12 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-03-29 19:39 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Thu, Mar 29, 2018 at 05:18:30PM +0200, Johannes Schindelin wrote: > Little did I know that this would turn not only into a full patch to fix this > issue, but into a full-blown series of nine patches. It's amazing how often that happens. :) > The first patch is somewhat of a "while at it" bug fix that I first thought > would be a lot more critical than it actually is: It really only affects config > files that start with a section followed immediately (i.e. without a newline) > by a one-letter boolean setting (i.e. without a `= <value>` part). So while it > is a real bug fix, I doubt anybody ever got bitten by it. That makes me wonder if somebody could craft a malicious config to do something bad. But I don't think so. Config is trusted already, and it looks like this bug is both hard to trigger and doesn't result in any kind of memory funniness, just a bogus output. > Now, to the really important part: why does this patch series not conflict with > my very early statements that we cannot simply remove empty sections because we > may end up with stale comments? > > Well, the patch in question takes pains to determine *iff* there are any > comments surrounding, or included in, the section. 
If any are found: previous > behavior. Under the assumption that the user edited the file, we keep it as > intact as possible (see below for some argument against this). If no comments > are found, and let's face it, this is probably *the* common case, as few people > edit their config files by hand these days (neither should they because it is > too easy to end up with an unparseable one), the now-empty section *is* > removed. I'm not against people editing their config files by hand. But I think what you propose here makes a lot of sense, because it works as long as you don't intermingle hand- and auto-editing in the same section (and it even works if you do intermingle, as long as you don't use comments, which are probably even more rare). So it seems like quite a sensible compromise, and I think should make most people happy. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) 2018-03-29 19:39 ` Jeff King @ 2018-03-30 12:35 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-03-30 12:35 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:30PM +0200, Johannes Schindelin wrote: > > > The first patch is somewhat of a "while at it" bug fix that I first > > thought would be a lot more critical than it actually is: It really > > only affects config files that start with a section followed > > immediately (i.e. without a newline) by a one-letter boolean setting > > (i.e. without a `= <value>` part). So while it is a real bug fix, I > > doubt anybody ever got bitten by it. > > That makes me wonder if somebody could craft a malicious config to do > something bad. I thought about that, and could not think of anything other than social engineering vectors. Even in that case, the error message is instructive enough that the user should be able to fix the config without consulting StackOverflow. > > Now, to the really important part: why does this patch series not > > conflict with my very early statements that we cannot simply remove > > empty sections because we may end up with stale comments? > > > > Well, the patch in question takes pains to determine *iff* there are > > any comments surrounding, or included in, the section. If any are > > found: previous behavior. Under the assumption that the user edited > > the file, we keep it as intact as possible (see below for some > > argument against this). 
If no comments are found, and let's face it, > > this is probably *the* common case, as few people edit their config > > files by hand these days (neither should they because it is too easy > > to end up with an unparseable one), the now-empty section *is* > > removed. > > I'm not against people editing their config files by hand. But I think > what you propose here makes a lot of sense, because it works as long as > you don't intermingle hand- and auto-editing in the same section (and it > even works if you do intermingle, as long as you don't use comments, > which are probably even more rare). > > So it seems like quite a sensible compromise, and I think should make > most people happy. Thanks for confirming my line of thinking, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
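
The compromise discussed here (drop a now-empty section only when no comment is attached to it) can be modelled on a list of parse events. This is a simplified sketch, not the code from the series: the `enum ev` values and `section_is_removable()` are hypothetical, and treating "surrounding" as "directly before the header, ignoring whitespace" is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical event kinds, loosely modelled on a config parser's output */
enum ev { EV_SECTION, EV_ENTRY, EV_COMMENT, EV_WHITESPACE, EV_EOF };

/*
 * Return 1 if the section whose header is events[hdr] may be dropped:
 * it contains no entries and no comments, and no comment sits directly
 * before its header.
 */
static int section_is_removable(const enum ev *events, size_t n, size_t hdr)
{
	size_t i;

	if (hdr >= n || events[hdr] != EV_SECTION)
		return 0;

	/* Look backwards past whitespace: a preceding comment may describe this section */
	for (i = hdr; i > 0; i--) {
		if (events[i - 1] == EV_WHITESPACE)
			continue;
		if (events[i - 1] == EV_COMMENT)
			return 0;
		break;
	}

	/* Look forwards until the next section header or the end of the file */
	for (i = hdr + 1; i < n; i++) {
		if (events[i] == EV_SECTION || events[i] == EV_EOF)
			break;
		if (events[i] != EV_WHITESPACE)
			return 0;	/* an entry or a comment: keep the section */
	}
	return 1;
}
```

Under this model a hand-edited file with comments keeps the previous behavior (the header stays), while a file that was only ever touched by `git config` loses the empty header.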
* Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug)
  2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin
                   ` (10 preceding siblings ...)
  2018-03-29 19:39 ` Jeff King
@ 2018-03-30 14:17 ` Ævar Arnfjörð Bjarmason
  2018-03-30 18:46   ` Johannes Schindelin
  2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin
  12 siblings, 1 reply; 103+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-03-30 14:17 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Stefan Beller, Jason Frey, Philip Oakley

On Thu, Mar 29 2018, Johannes Schindelin wrote:

> Nonetheless, I would be comfortable with this patch going into v2.17.0, even at
> this late stage. The final verdict is Junio's, of course.

Thanks a lot for working on this. I'm keen to stress test this, but
won't have time in the next few days, and in any case think that the
parts that change functionality should wait until after 2.17 (but e.g.
the test renaming would be fine for a cherry-pick).
* Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug)
  2018-03-30 14:17 ` Ævar Arnfjörð Bjarmason
@ 2018-03-30 18:46   ` Johannes Schindelin
  0 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-03-30 18:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Stefan Beller, Jason Frey, Philip Oakley

Hi Ævar,

On Fri, 30 Mar 2018, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Mar 29 2018, Johannes Schindelin wrote:
>
> > Nonetheless, I would be comfortable with this patch going into
> > v2.17.0, even at this late stage. The final verdict is Junio's, of
> > course.
>
> Thanks a lot for working on this. I'm keen to stress test this, but
> won't have time in the next few days, and in any case think that the
> parts that change functionality should wait until after 2.17 (but e.g.
> the test renaming would be fine for a cherry-pick).

Obviously this was never meant to get into v2.17.0 (apart maybe from 1/9,
which, however, is contested over its added test case, a test I added
under the assumption that somebody other than me would dare to touch
those parts of the code).

Ciao,
Dscho
* [PATCH v2 00/15] Assorted fixes for `git config` (including the "empty sections" bug)
  2018-03-29 15:18 [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin
                   ` (11 preceding siblings ...)
  2018-03-30 14:17 ` Ævar Arnfjörð Bjarmason
@ 2018-04-03 16:27 ` Johannes Schindelin
  2018-04-03 16:28   ` [PATCH v2 01/15] git_config_set: fix off-by-two Johannes Schindelin
                     ` (16 more replies)
  12 siblings, 17 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-03 16:27 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

This patch series originally only tried to help fix that annoying bug that has
been reported several times over the years, where `git config --unset` would
leave empty sections behind, and `git config --add` would not reuse them.

The first patch is somewhat of a "while at it" bug fix that I first thought
would be a lot more critical than it actually is: It really only affects config
files that start with a section followed immediately (i.e. without a newline)
by a one-letter boolean setting (i.e. without a `= <value>` part). So while it
is a real bug fix, I doubt anybody ever got bitten by it.

The next swath of patches add and fix some tests, while also fixing the bug
where --replace-all would sometimes insert extra line breaks.

These fixes are pretty straightforward, and I always try to keep my added tests
as concise as possible, so please tell me if you find a way to make them
smaller (without giving up readability and debuggability).

Then, I introduce a couple of building blocks: a "config parser event stream",
i.e. an optional callback that can be used to report events such as "comment",
"white-space", etc., together with the corresponding extents in the config file.

Finally, the interesting part, where I do two things, essentially (with
preparatory steps for each thing):

1. I add the ability for `git config --unset/--unset-all` to detect that it
   can remove a section that has just become empty (see below for some more
   discussion of what I consider "become empty"), and

2. I add the ability for `git config [--add] key value` to re-use empty
   sections.

I am very, very grateful for the time Peff spent on reviewing the previous
iteration, and hope that he realizes just how much the elegance of the
event-stream-based version is due to his excellent review.

To reiterate: why does this patch series not conflict with my very early
statements that we cannot simply remove empty sections because we may end up
with stale comments?

Well, the patch in question takes pains to determine *iff* there are any
comments surrounding, or included in, the section. If any are found: previous
behavior. Under the assumption that the user edited the file, we keep it as
intact as possible (see below for some argument against this). If no comments
are found, and let's face it, this is probably *the* common case, as few people
edit their config files by hand these days (neither should they because it is
too easy to end up with an unparseable one), the now-empty section *is*
removed.

So what is the argument against this extra care to detect comments? Well, if
you have something like this:

	[section]
		; Here we comment about the variable called snarf
		snarf = froop

and we run `git config --unset section.snarf`, we end up with this config:

	[section]
		; Here we comment about the variable called snarf

which obviously does not make sense. However, that is already established
behavior for quite a few years, and I do not even try to think of a way how
this could be solved.

Changes since v1:

- a new feature was introduced where the config parser can be asked to
  report all "events" (such as a section header or a comment) via a
  callback function.

- the patches to reuse empty sections, and to remove sections that just
  became empty (without any surrounding comments) were rewritten to make
  use of that config parser event stream (incidentally fixing a couple of
  problems with the backtracking version which were pointed out by Peff).

- to make those changes easier to review, they have been split up into
  several tiny logical steps: the file-local `store` was replaced with
  callback data, some fields were renamed for consistency, the state
  machine when parsing the config was replaced by easier-to-understand
  flags, etc.

- while poring over the code, I managed to find another obscure bug: under
  certain circumstances, --replace-all could produce extra new-lines. This
  is now fixed as part of the preparatory patches.

Johannes Schindelin (15):
  git_config_set: fix off-by-two
  t1300: rename it to reflect that `repo-config` was deprecated
  t1300: demonstrate that --replace-all can "invent" newlines
  config --replace-all: avoid extra line breaks
  t1300: avoid relying on a bug
  t1300: remove unreasonable expectation from TODO
  t1300: `--unset-all` can leave an empty section behind (bug)
  config: introduce an optional event stream while parsing
  config: avoid using the global variable `store`
  config_set_store: rename some fields for consistency
  git_config_set: do not use a state machine
  git_config_set: make use of the config parser's event stream
  git config --unset: remove empty sections (in the common case)
  git_config_set: reuse empty sections
  TODOs

 config.c                                    | 449 ++++++++++++++++++++--------
 config.h                                    |  25 ++
 t/{t1300-repo-config.sh => t1300-config.sh} |  57 +++-
 3 files changed, 396 insertions(+), 135 deletions(-)
 rename t/{t1300-repo-config.sh => t1300-config.sh} (97%)

base-commit: 468165c1d8a442994a825f3684528361727cd8c0
Published-As: https://github.com/dscho/git/releases/tag/empty-config-section-v2
Fetch-It-Via: git fetch https://github.com/dscho/git empty-config-section-v2

Interdiff vs v1 (sorry for the size, it is essentially a
rewrite of more than half of the previous iteration): diff --git a/config.c b/config.c index b04c40f76bc..ee7ea24123d 100644 --- a/config.c +++ b/config.c @@ -653,21 +653,65 @@ static int get_base_var(struct strbuf *name) } } +struct parse_event_data { + enum config_event_t previous_type; + size_t previous_offset; + const struct config_options *opts; +}; + +static inline int do_event(enum config_event_t type, + struct parse_event_data *data) +{ + size_t offset; + + if (!data->opts || !data->opts->event_fn) + return 0; + + if (type == CONFIG_EVENT_WHITESPACE && + data->previous_type == type) + return 0; + + offset = cf->do_ftell(cf); + /* + * At EOF, the parser always "inserts" an extra '\n', therefore + * the end offset of the event is the current file position, otherwise + * we will already have advanced to the next event. + */ + if (type != CONFIG_EVENT_EOF) + offset--; + + if (data->previous_type != CONFIG_EVENT_EOF && + data->opts->event_fn(data->previous_type, data->previous_offset, + offset, data->opts->event_fn_data) < 0) + return -1; + + data->previous_type = type; + data->previous_offset = offset; + + return 0; +} + static int git_parse_source(config_fn_t fn, void *data, - int include_section_headers) + const struct config_options *opts) { int comment = 0; int baselen = 0; struct strbuf *var = &cf->var; int error_return = 0; char *error_msg = NULL; - int saw_section_header = 0; /* U+FEFF Byte Order Mark in UTF8 */ const char *bomptr = utf8_bom; + /* For the parser event callback */ + struct parse_event_data event_data = { + CONFIG_EVENT_EOF, 0, opts + }; + for (;;) { - int c = get_next_char(); + int c; + + c = get_next_char(); if (bomptr && *bomptr) { /* We are at the file beginning; skip UTF8-encoded BOM * if present. 
Sane editors won't put this in on their @@ -684,39 +728,47 @@ static int git_parse_source(config_fn_t fn, void *data, } } if (c == '\n') { - if (cf->eof) + if (cf->eof) { + if (do_event(CONFIG_EVENT_EOF, &event_data) < 0) + return -1; return 0; - comment = 0; - if (saw_section_header) { - if (include_section_headers) { - cf->linenr--; - error_return = fn(var->buf, NULL, data); - if (error_return < 0) - break; - cf->linenr++; - } - saw_section_header = 0; } + if (do_event(CONFIG_EVENT_WHITESPACE, &event_data) < 0) + return -1; + comment = 0; continue; } - if (comment || isspace(c)) + if (comment) continue; + if (isspace(c)) { + if (do_event(CONFIG_EVENT_WHITESPACE, &event_data) < 0) + return -1; + continue; + } if (c == '#' || c == ';') { + if (do_event(CONFIG_EVENT_COMMENT, &event_data) < 0) + return -1; comment = 1; continue; } if (c == '[') { + if (do_event(CONFIG_EVENT_SECTION, &event_data) < 0) + return -1; + /* Reset prior to determining a new stem */ strbuf_reset(var); if (get_base_var(var) < 0 || var->len < 1) break; strbuf_addch(var, '.'); baselen = var->len; - saw_section_header = 1; continue; } if (!isalpha(c)) break; + + if (do_event(CONFIG_EVENT_ENTRY, &event_data) < 0) + return -1; + /* * Truncate the var name back to the section header * stem prior to grabbing the suffix part of the name @@ -728,6 +780,9 @@ static int git_parse_source(config_fn_t fn, void *data, break; } + if (do_event(CONFIG_EVENT_ERROR, &event_data) < 0) + return -1; + switch (cf->origin_type) { case CONFIG_ORIGIN_BLOB: error_msg = xstrfmt(_("bad config line %d in blob %s"), @@ -1412,7 +1467,7 @@ int git_default_config(const char *var, const char *value, void *dummy) * this function. 
*/ static int do_config_from(struct config_source *top, config_fn_t fn, void *data, - int include_section_headers) + const struct config_options *opts) { int ret; @@ -1424,7 +1479,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data, strbuf_init(&top->var, 1024); cf = top; - ret = git_parse_source(fn, data, include_section_headers); + ret = git_parse_source(fn, data, opts); /* pop config-file parsing state stack */ strbuf_release(&top->value); @@ -1437,7 +1492,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data, static int do_config_from_file(config_fn_t fn, const enum config_origin_type origin_type, const char *name, const char *path, FILE *f, - void *data, int include_section_headers) + void *data, const struct config_options *opts) { struct config_source top; @@ -1450,15 +1505,18 @@ static int do_config_from_file(config_fn_t fn, top.do_ungetc = config_file_ungetc; top.do_ftell = config_file_ftell; - return do_config_from(&top, fn, data, include_section_headers); + return do_config_from(&top, fn, data, opts); } static int git_config_from_stdin(config_fn_t fn, void *data) { - return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data, 0); + return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, + data, NULL); } -int git_config_from_file(config_fn_t fn, const char *filename, void *data) +int git_config_from_file_with_options(config_fn_t fn, const char *filename, + void *data, + const struct config_options *opts) { int ret = -1; FILE *f; @@ -1466,13 +1524,19 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) f = fopen_or_warn(filename, "r"); if (f) { flockfile(f); - ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data, 0); + ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, + filename, f, data, opts); funlockfile(f); fclose(f); } return ret; } +int git_config_from_file(config_fn_t fn, const char *filename, void 
*data) +{ + return git_config_from_file_with_options(fn, filename, data, NULL); +} + int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_type, const char *name, const char *buf, size_t len, void *data) { @@ -1489,7 +1553,7 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ top.do_ungetc = config_buf_ungetc; top.do_ftell = config_buf_ftell; - return do_config_from(&top, fn, data, 0); + return do_config_from(&top, fn, data, NULL); } int git_config_from_blob_oid(config_fn_t fn, @@ -2233,96 +2297,98 @@ void git_die_config(const char *key, const char *err, ...) * Find all the stuff for git_config_set() below. */ -static struct { +struct config_set_store { int baselen; char *key; int do_not_match; regex_t *value_regex; int multi_replace; - size_t *offset; - unsigned int offset_alloc; - enum { START, SECTION_SEEN, SECTION_END_SEEN, KEY_SEEN } state; - unsigned int seen; -} store; + struct { + size_t begin, end; + enum config_event_t type; + int is_keys_section; + } *parsed; + unsigned int parsed_nr, parsed_alloc, *seen, seen_nr, seen_alloc; + unsigned int key_seen:1, section_seen:1, is_keys_section:1; +}; -static int matches(const char *key, const char *value) +static int matches(const char *key, const char *value, + const struct config_set_store *store) { - if (strcmp(key, store.key)) + if (strcmp(key, store->key)) return 0; /* not ours */ - if (!store.value_regex) + if (!store->value_regex) return 1; /* always matches */ - if (store.value_regex == CONFIG_REGEX_NONE) + if (store->value_regex == CONFIG_REGEX_NONE) return 0; /* never matches */ - return store.do_not_match ^ - (value && !regexec(store.value_regex, value, 0, NULL, 0)); + return store->do_not_match ^ + (value && !regexec(store->value_regex, value, 0, NULL, 0)); +} + +static int store_aux_event(enum config_event_t type, + size_t begin, size_t end, void *data) +{ + struct config_set_store *store = data; + + ALLOC_GROW(store->parsed, store->parsed_nr + 1, 
store->parsed_alloc); + store->parsed[store->parsed_nr].begin = begin; + store->parsed[store->parsed_nr].end = end; + store->parsed[store->parsed_nr].type = type; + + if (type == CONFIG_EVENT_SECTION) { + if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.') + BUG("Invalid section name '%s'", cf->var.buf); + + /* Is this the section we were looking for? */ + store->is_keys_section = + store->parsed[store->parsed_nr].is_keys_section = + cf->var.len - 1 == store->baselen && + !strncasecmp(cf->var.buf, store->key, store->baselen); + if (store->is_keys_section) { + store->section_seen = 1; + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = store->parsed_nr; + } + } + + store->parsed_nr++; + + return 0; } static int store_aux(const char *key, const char *value, void *cb) { - const char *ep; - size_t section_len; + struct config_set_store *store = cb; - switch (store.state) { - case KEY_SEEN: - if (matches(key, value)) { - if (store.seen == 1 && store.multi_replace == 0) { + if (store->key_seen) { + if (matches(key, value, store)) { + if (store->seen_nr == 1 && store->multi_replace == 0) { warning(_("%s has multiple values"), key); } - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - store.seen++; + store->seen[store->seen_nr] = store->parsed_nr; + store->seen_nr++; } - break; - case SECTION_SEEN: + } else if (store->is_keys_section) { /* - * What we are looking for is in store.key (both - * section and var), and its section part is baselen - * long. We found key (again, both section and var). - * We would want to know if this key is in the same - * section as what we are looking for. We already - * know we are in the same section as what should - * hold store.key. + * Do not increment matches yet: this may not be a match, but we + * are in the desired section. 
*/ - ep = strrchr(key, '.'); - section_len = ep - key; - - if ((section_len != store.baselen) || - memcmp(key, store.key, section_len+1)) { - store.state = SECTION_END_SEEN; - break; - } + ALLOC_GROW(store->seen, store->seen_nr + 1, store->seen_alloc); + store->seen[store->seen_nr] = store->parsed_nr; + store->section_seen = 1; - /* - * Do not increment matches: this is no match, but we - * just made sure we are in the desired section. - */ - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - /* fallthru */ - case SECTION_END_SEEN: - case START: - if (matches(key, value)) { - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - store.state = KEY_SEEN; - store.seen++; - } else { - if (strrchr(key, '.') - key == store.baselen && - !strncmp(key, store.key, store.baselen)) { - store.state = SECTION_SEEN; - ALLOC_GROW(store.offset, - store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - } + if (matches(key, value, store)) { + store->seen_nr++; + store->key_seen = 1; } } + return 0; } @@ -2334,31 +2400,33 @@ static int write_error(const char *filename) return 4; } -static struct strbuf store_create_section(const char *key) +static struct strbuf store_create_section(const char *key, + const struct config_set_store *store) { const char *dot; int i; struct strbuf sb = STRBUF_INIT; - dot = memchr(key, '.', store.baselen); + dot = memchr(key, '.', store->baselen); if (dot) { strbuf_addf(&sb, "[%.*s \"", (int)(dot - key), key); - for (i = dot - key + 1; i < store.baselen; i++) { + for (i = dot - key + 1; i < store->baselen; i++) { if (key[i] == '"' || key[i] == '\\') strbuf_addch(&sb, '\\'); strbuf_addch(&sb, key[i]); } strbuf_addstr(&sb, "\"]\n"); } else { - strbuf_addf(&sb, "[%.*s]\n", store.baselen, key); + strbuf_addf(&sb, "[%.*s]\n", store->baselen, key); } return sb; } -static ssize_t write_section(int fd, const char 
*key) +static ssize_t write_section(int fd, const char *key, + const struct config_set_store *store) { - struct strbuf sb = store_create_section(key); + struct strbuf sb = store_create_section(key, store); ssize_t ret; ret = write_in_full(fd, sb.buf, sb.len); @@ -2367,11 +2435,12 @@ static ssize_t write_section(int fd, const char *key) return ret; } -static ssize_t write_pair(int fd, const char *key, const char *value) +static ssize_t write_pair(int fd, const char *key, const char *value, + const struct config_set_store *store) { int i; ssize_t ret; - int length = strlen(key + store.baselen + 1); + int length = strlen(key + store->baselen + 1); const char *quote = ""; struct strbuf sb = STRBUF_INIT; @@ -2391,7 +2460,7 @@ static ssize_t write_pair(int fd, const char *key, const char *value) quote = "\""; strbuf_addf(&sb, "\t%.*s = %s", - length, key + store.baselen + 1, quote); + length, key + store->baselen + 1, quote); for (i = 0; value[i]; i++) switch (value[i]) { @@ -2417,201 +2486,85 @@ static ssize_t write_pair(int fd, const char *key, const char *value) return ret; } -static ssize_t find_beginning_of_line(const char *contents, size_t size, - size_t offset_, int *found_bracket) -{ - size_t equal_offset = size, bracket_offset = size; - ssize_t offset; - -contline: - for (offset = offset_-2; offset > 0 - && contents[offset] != '\n'; offset--) - switch (contents[offset]) { - case '=': equal_offset = offset; break; - case ']': bracket_offset = offset; break; - } - if (offset > 0 && contents[offset-1] == '\\') { - offset_ = offset; - goto contline; - } - if (bracket_offset < equal_offset) { - *found_bracket = 1; - offset = bracket_offset+1; - } else - offset++; - - return offset; -} - -/* - * This function determines whether the offset is in a line that starts with a - * comment character. - * - * Note: it does *not* report when a regular line (section header, config - * setting) *ends* in a comment. 
- */ -static int is_in_comment_line(const char *contents, size_t offset) -{ - int comment = 0; - - while (offset > 0) - switch (contents[--offset]) { - case ';': - case '#': - comment = 1; - break; - case '\n': - break; - case ' ': - case '\t': - continue; - default: - comment = 0; - } - - return comment; -} - /* * If we are about to unset the last key(s) in a section, and if there are * no comments surrounding (or included in) the section, we will want to * extend begin/end to remove the entire section. * - * Note: the parameter `i_ptr` points to the index into the store.offset - * array, reflecting the end offset of the respective entry to be deleted. - * This index may be incremented if a section has more than one entry (which - * all are to be removed). + * Note: the parameter `seen_ptr` points to the index into the store.seen + * array. * This index may be incremented if a section has more than one + * entry (which all are to be removed). */ -static void maybe_remove_section(const char *contents, size_t size, - const char *section_name, - size_t section_name_len, - size_t *begin, int *i_ptr, int *new_line) +static void maybe_remove_section(struct config_set_store *store, + const char *contents, + size_t *begin_offset, size_t *end_offset, + int *seen_ptr) { - size_t begin2, end2; - int seen_section = 0, dummy, i = *i_ptr; + size_t begin; + int i, seen, section_seen = 0; /* - * First, make sure that this is the last key in the section, and that - * there are no comments that are possibly about the current section. + * First, ensure that this is the first key, and that there are no + * comments before the entry nor before the section header. 
*/ -next_entry: - for (end2 = store.offset[i]; end2 < size; end2++) { - switch (contents[end2]) { - case ' ': - case '\t': - case '\n': - continue; - case '\r': - if (++end2 < size && contents[end2] == '\n') - continue; - break; - case '[': - /* If the section name is repeated, continue */ - if (end2 + 1 + section_name_len < size && - contents[end2 + section_name_len] == ']' && - !memcmp(contents + end2 + 1, section_name, - section_name_len)) { - end2 += section_name_len; - continue; - } - goto look_before; - case ';': - case '#': - /* There is a comment, cannot remove this section */ + seen = *seen_ptr; + for (i = store->seen[seen]; i > 0; i--) { + enum config_event_t type = store->parsed[i - 1].type; + + if (type == CONFIG_EVENT_COMMENT) + /* There is a comment before this entry or section */ return; - default: - /* There are other keys in that section */ + if (type == CONFIG_EVENT_ENTRY) { + if (!section_seen) + /* This is not the section's first entry. */ + return; + /* We encountered no comment before the section. */ break; } - - /* - * Uh oh... we found something else in this section. But do - * we want to remove this, too? - */ - if (++i >= store.seen) - return; - - begin2 = find_beginning_of_line(contents, size, store.offset[i], - &dummy); - if (begin2 > end2) - return; - - /* Looks like we want to remove the next one, too... */ - goto next_entry; + if (type == CONFIG_EVENT_SECTION) { + if (!store->parsed[i - 1].is_keys_section) + break; + section_seen = 1; + } } + begin = store->parsed[i].begin; -look_before: /* - * Now, ensure that this is the first key, and that there are no - * comments before the entry nor before the section header. + * Next, make sure that we are removing he last key(s) in the section, + * and that there are no comments that are possibly about the current + * section. 
*/ - for (begin2 = *begin; begin2 > 0; ) - switch (contents[begin2 - 1]) { - case ' ': - case '\t': - begin2--; - continue; - case '\n': - if (--begin2 > 0 && contents[begin2 - 1] == '\r') - begin2--; - continue; - case ']': - if (begin2 > section_name_len + 1 && - contents[begin2 - section_name_len - 2] == '[' && - !memcmp(contents + begin2 - section_name_len - 1, - section_name, section_name_len)) { - begin2 -= section_name_len + 2; - seen_section = 1; - continue; - } - - /* - * It looks like a section header, but it could be a - * comment instead... - */ - if (is_in_comment_line(contents, begin2)) - return; - - /* - * We encountered the previous section header: This - * really was the only entry, so remove the entire - * section. - */ - if (contents[begin2] != '\n') { - begin2--; - *new_line = 1; - } + for (i = store->seen[seen] + 1; i < store->parsed_nr; i++) { + enum config_event_t type = store->parsed[i].type; - store.offset[i] = end2; - *begin = begin2; - *i_ptr = i; + if (type == CONFIG_EVENT_COMMENT) return; - default: - /* - * Any other character means it is either a comment or - * a config setting; if it is a comment, we do not want - * to remove this section. If it is a config setting, - * we only want to remove this section if this is - * already the next section. - */ - if (seen_section && - !is_in_comment_line(contents, begin2)) { - if (contents[begin2] != '\n') { - begin2--; - *new_line = 1; - } - - store.offset[i] = end2; - *begin = begin2; - *i_ptr = i; - } + if (type == CONFIG_EVENT_SECTION) { + if (store->parsed[i].is_keys_section) + continue; + break; + } + if (type == CONFIG_EVENT_ENTRY) { + if (++seen < store->seen_nr && + i == store->seen[seen]) + /* We want to remove this entry, too */ + continue; + /* There is another entry in this section. */ return; } + } - /* This section extends to the beginning of the file. 
*/ - store.offset[i] = end2; - *begin = begin2; - *i_ptr = i; + /* + * We are really removing the last entry/entries from this section, and + * there are no enclosed or surrounding comments. Remove the entire, + * now-empty section. + */ + *seen_ptr = seen; + *begin_offset = begin; + if (i < store->parsed_nr) + *end_offset = store->parsed[i].begin; + else + *end_offset = store->parsed[store->parsed_nr - 1].end; } int git_config_set_in_file_gently(const char *config_filename, @@ -2671,14 +2624,15 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, struct lock_file lock = LOCK_INIT; char *filename_buf = NULL; char *contents = NULL; - char *section_name = NULL; size_t contents_sz; + struct config_set_store store; + + memset(&store, 0, sizeof(store)); /* parse-key returns negative; flip the sign to feed exit(3) */ - ret = 0 - git_config_parse_key(key, &section_name, &store.baselen); + ret = 0 - git_config_parse_key(key, &store.key, &store.baselen); if (ret) goto out_free; - store.key = section_name; store.multi_replace = multi_replace; @@ -2692,6 +2646,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, fd = hold_lock_file_for_update(&lock, config_filename, 0); if (fd < 0) { error_errno("could not lock config file %s", config_filename); + free(store.key); ret = CONFIG_NO_LOCK; goto out_free; } @@ -2701,6 +2656,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, */ in_fd = open(config_filename, O_RDONLY); if ( in_fd < 0 ) { + free(store.key); + if ( ENOENT != errno ) { error_errno("opening %s", config_filename); ret = CONFIG_INVALID_FILE; /* same as "invalid config file" */ @@ -2713,14 +2670,14 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } store.key = (char *)key; - if (write_section(fd, key) < 0 || - write_pair(fd, key, value) < 0) + if (write_section(fd, key, &store) < 0 || + write_pair(fd, key, value, &store) < 0) goto write_err_out; } else { struct stat st; size_t
copy_begin, copy_end; int i, new_line = 0; - FILE *f; + struct config_options opts; if (value_regex == NULL) store.value_regex = NULL; @@ -2743,34 +2700,36 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } } - ALLOC_GROW(store.offset, 1, store.offset_alloc); - store.offset[0] = 0; - store.state = START; - store.seen = 0; + ALLOC_GROW(store.parsed, 1, store.parsed_alloc); + store.parsed[0].end = 0; + + memset(&opts, 0, sizeof(opts)); + opts.event_fn = store_aux_event; + opts.event_fn_data = &store; /* - * After this, store.offset will contain the *end* offset - * of the last match, or remain at 0 if no match was found. + * After this, store.parsed will contain offsets of all the + * parsed elements, and store.seen will contain a list of + * matches, as indices into store.parsed. + * * As a side effect, we make sure to transform only a valid * existing config file. */ - f = fopen_or_warn(config_filename, "r"); - if (!f || do_config_from_file(store_aux, CONFIG_ORIGIN_FILE, - config_filename, config_filename, - f, NULL, 1)) { + if (git_config_from_file_with_options(store_aux, + config_filename, + &store, &opts)) { error("invalid config file %s", config_filename); + free(store.key); if (store.value_regex != NULL && store.value_regex != CONFIG_REGEX_NONE) { regfree(store.value_regex); free(store.value_regex); } ret = CONFIG_INVALID_FILE; - if (f) - fclose(f); goto out_free; - } else - fclose(f); + } + free(store.key); if (store.value_regex != NULL && store.value_regex != CONFIG_REGEX_NONE) { regfree(store.value_regex); @@ -2778,8 +2737,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } /* if nothing to unset, or too many matches, error out */ - if ((store.seen == 0 && value == NULL) || - (store.seen > 1 && multi_replace == 0)) { + if ((store.seen_nr == 0 && value == NULL) || + (store.seen_nr > 1 && multi_replace == 0)) { ret = CONFIG_NOTHING_SET; goto out_free; } @@ -2810,25 +2769,48 @@ int 
git_config_set_multivar_in_file_gently(const char *config_filename, goto out_free; } - if (store.seen == 0) - store.seen = 1; + if (store.seen_nr == 0) { + if (!store.seen_alloc) { + /* Did not see key nor section */ + ALLOC_GROW(store.seen, 1, store.seen_alloc); + store.seen[0] = store.parsed_nr + - !!store.parsed_nr; + } + store.seen_nr = 1; + } - for (i = 0, copy_begin = 0; i < store.seen; i++) { - if (store.offset[i] == 0) { - store.offset[i] = copy_end = contents_sz; - } else if (store.state != KEY_SEEN) { - copy_end = store.offset[i]; + for (i = 0, copy_begin = 0; i < store.seen_nr; i++) { + size_t replace_end; + int j = store.seen[i]; + + new_line = 0; + if (!store.key_seen) { + copy_end = store.parsed[j].end; + /* include '\n' when copying section header */ + if (copy_end > 0 && copy_end < contents_sz && + contents[copy_end - 1] != '\n' && + contents[copy_end] == '\n') + copy_end++; + replace_end = copy_end; } else { - copy_end = find_beginning_of_line( - contents, contents_sz, - store.offset[i], &new_line); + replace_end = store.parsed[j].end; + copy_end = store.parsed[j].begin; if (!value) - maybe_remove_section(contents, - contents_sz, - section_name, - store.baselen, - &copy_end, &i, - &new_line); + maybe_remove_section(&store, contents, + &copy_end, + &replace_end, &i); + /* + * Swallow preceding white-space on the same + * line.
+ */ + while (copy_end > 0 ) { + char c = contents[copy_end - 1]; + + if (isspace(c) && c != '\n') + copy_end--; + else + break; + } } if (copy_end > 0 && contents[copy_end-1] != '\n') @@ -2843,16 +2825,16 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, write_str_in_full(fd, "\n") < 0) goto write_err_out; } - copy_begin = store.offset[i]; + copy_begin = replace_end; } /* write the pair (value == NULL means unset) */ if (value != NULL) { - if (store.state == START) { - if (write_section(fd, key) < 0) + if (!store.section_seen) { + if (write_section(fd, key, &store) < 0) goto write_err_out; } - if (write_pair(fd, key, value) < 0) + if (write_pair(fd, key, value, &store) < 0) goto write_err_out; } @@ -2879,7 +2861,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, out_free: rollback_lock_file(&lock); - free(section_name); free(filename_buf); if (contents) munmap(contents, contents_sz); @@ -2977,7 +2958,8 @@ static int section_name_is_ok(const char *name) /* if new_name == NULL, the section is removed instead */ static int git_config_copy_or_rename_section_in_file(const char *config_filename, - const char *old_name, const char *new_name, int copy) + const char *old_name, + const char *new_name, int copy) { int ret = 0, remove = 0; char *filename_buf = NULL; @@ -2987,6 +2969,9 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename FILE *config_file = NULL; struct stat st; struct strbuf copystr = STRBUF_INIT; + struct config_set_store store; + + memset(&store, 0, sizeof(store)); if (new_name && !section_name_is_ok(new_name)) { ret = error("invalid section name: %s", new_name); @@ -3056,7 +3041,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename } store.baselen = strlen(new_name); if (!copy) { - if (write_section(out_fd, new_name) < 0) { + if (write_section(out_fd, new_name, &store) < 0) { ret = write_error(get_lock_file_path(&lock)); goto out; } @@ -3077,7 
+3062,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename output[0] = '\t'; } } else { - copystr = store_create_section(new_name); + copystr = store_create_section(new_name, &store); } } remove = 0; diff --git a/config.h b/config.h index ef70a9cac1e..5a2394daae2 100644 --- a/config.h +++ b/config.h @@ -28,15 +28,40 @@ enum config_origin_type { CONFIG_ORIGIN_CMDLINE }; +enum config_event_t { + CONFIG_EVENT_SECTION, + CONFIG_EVENT_ENTRY, + CONFIG_EVENT_WHITESPACE, + CONFIG_EVENT_COMMENT, + CONFIG_EVENT_EOF, + CONFIG_EVENT_ERROR +}; + +/* + * The parser event function (if not NULL) is called with the event type and + * the begin/end offsets of the parsed elements. + * + * Note: for CONFIG_EVENT_ENTRY (i.e. config variables), the trailing newline + * character is considered part of the element. + */ +typedef int (*config_parser_event_fn_t)(enum config_event_t type, + size_t begin_offset, size_t end_offset, + void *event_fn_data); + struct config_options { unsigned int respect_includes : 1; const char *commondir; const char *git_dir; + config_parser_event_fn_t event_fn; + void *event_fn_data; }; typedef int (*config_fn_t)(const char *, const char *, void *); extern int git_default_config(const char *, const char *, void *); extern int git_config_from_file(config_fn_t fn, const char *, void *); +extern int git_config_from_file_with_options(config_fn_t fn, const char *, + void *, + const struct config_options *); extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type, const char *name, const char *buf, size_t len, void *data); extern int git_config_from_blob_oid(config_fn_t fn, const char *name, diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 867397ae930..6d0e13020d1 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1643,4 +1643,25 @@ test_expect_success '--local requires a repo' ' test_expect_code 128 nongit git config --local foo.bar ' +test_expect_success '--replace-all does not invent newlines' 
' + q_to_tab >.git/config <<-\EOF && + [abc]key + QkeepSection + [xyz] + Qkey = 1 + [abc] + Qkey = a + EOF + q_to_tab >expect <<-\EOF && + [abc] + QkeepSection + [xyz] + Qkey = 1 + [abc] + Qkey = b + EOF + git config --replace-all abc.key b && + test_cmp .git/config expect +' + test_done -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v2 01/15] git_config_set: fix off-by-two 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 02/15] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin ` (15 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Currently, we are slightly overzealous when removing an entry from a config file of this form: [abc]a [xyz] key = value When calling `git config --unset abc.a` on this file, it leaves this (invalid) config behind: [ [xyz] key = value The reason is that we try to search for the beginning of the line (or for the end of the preceding section header on the same line) that defines abc.a, but as an optimization, we subtract 2 from the offset pointing just after the definition before we call find_beginning_of_line(). That function, however, *also* performs that optimization and promptly fails to find the section header correctly. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/config.c b/config.c index b0c20e6cb8a..5cc049aaef0 100644 --- a/config.c +++ b/config.c @@ -2632,7 +2632,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } else copy_end = find_beginning_of_line( contents, contents_sz, - store.offset[i]-2, &new_line); + store.offset[i], &new_line); if (copy_end > 0 && contents[copy_end-1] != '\n') new_line = 1; -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 02/15] t1300: rename it to reflect that `repo-config` was deprecated 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 01/15] git_config_set: fix off-by-two Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 03/15] t1300: demonstrate that --replace-all can "invent" newlines Johannes Schindelin ` (14 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/{t1300-repo-config.sh => t1300-config.sh} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename t/{t1300-repo-config.sh => t1300-config.sh} (100%) diff --git a/t/t1300-repo-config.sh b/t/t1300-config.sh similarity index 100% rename from t/t1300-repo-config.sh rename to t/t1300-config.sh -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v2 03/15] t1300: demonstrate that --replace-all can "invent" newlines 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 01/15] git_config_set: fix off-by-two Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 02/15] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 04/15] config --replace-all: avoid extra line breaks Johannes Schindelin ` (13 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 4f8e6f5fde3..cc417687e8d 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1611,4 +1611,25 @@ test_expect_success '--local requires a repo' ' test_expect_code 128 nongit git config --local foo.bar ' +test_expect_failure '--replace-all does not invent newlines' ' + q_to_tab >.git/config <<-\EOF && + [abc]key + QkeepSection + [xyz] + Qkey = 1 + [abc] + Qkey = a + EOF + q_to_tab >expect <<-\EOF && + [abc] + QkeepSection + [xyz] + Qkey = 1 + [abc] + Qkey = b + EOF + git config --replace-all abc.key b && + test_cmp .git/config expect +' + test_done -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 04/15] config --replace-all: avoid extra line breaks 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (2 preceding siblings ...) 2018-04-03 16:28 ` [PATCH v2 03/15] t1300: demonstrate that --replace-all can "invent" newlines Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 05/15] t1300: avoid relying on a bug Johannes Schindelin ` (12 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley When replacing multiple config entries at once, we did not re-set the flag that indicates whether we need to insert a new-line before the new entry. As a consequence, an extra new-line was inserted under certain circumstances. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 1 + t/t1300-config.sh | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/config.c b/config.c index 5cc049aaef0..f10f8c6f52f 100644 --- a/config.c +++ b/config.c @@ -2625,6 +2625,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, store.seen = 1; for (i = 0, copy_begin = 0; i < store.seen; i++) { + new_line = 0; if (store.offset[i] == 0) { store.offset[i] = copy_end = contents_sz; } else if (store.state != KEY_SEEN) { diff --git a/t/t1300-config.sh b/t/t1300-config.sh index cc417687e8d..aed12be492f 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1611,7 +1611,7 @@ test_expect_success '--local requires a repo' ' test_expect_code 128 nongit git config --local foo.bar ' -test_expect_failure '--replace-all does not invent newlines' ' +test_expect_success '--replace-all does not invent newlines' ' q_to_tab >.git/config <<-\EOF && [abc]key QkeepSection -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 05/15] t1300: avoid relying on a bug 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (3 preceding siblings ...) 2018-04-03 16:28 ` [PATCH v2 04/15] config --replace-all: avoid extra line breaks Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 06/15] t1300: remove unreasonable expectation from TODO Johannes Schindelin ` (11 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley The test case 'unset with cont. lines' relied on a bug that is about to be fixed: it tests *explicitly* that removing the last entry from a config section leaves an *empty* section behind. Let's fix this test case not to rely on that behavior, simply by preventing the section from becoming empty. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index aed12be492f..7c0ee208dea 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -108,6 +108,7 @@ bar = foo [beta] baz = multiple \ lines +foo = bar EOF test_expect_success 'unset with cont. lines' ' @@ -118,6 +119,7 @@ cat > expect <<\EOF [alpha] bar = foo [beta] +foo = bar EOF test_expect_success 'unset with cont. lines is correct' 'test_cmp expect .git/config' -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 06/15] t1300: remove unreasonable expectation from TODO 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (4 preceding siblings ...) 2018-04-03 16:28 ` [PATCH v2 05/15] t1300: avoid relying on a bug Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 07/15] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin ` (10 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley In https://public-inbox.org/git/7vvc8alzat.fsf@alter.siamese.dyndns.org/ a reasonable patch was made quite a bit less so by changing a test case demonstrating a bug to a test case that demonstrates that we ask for too much: the test case 'unsetting the last key in a section removes header' now expects a future bug fix to be able to determine whether a free-form comment above a section header refers to said section or not. Rather than shooting for the stars (and not even getting off the ground), let's start shooting for something obtainable and be reasonably confident that we *can* get it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 7c0ee208dea..187fc5b195f 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1413,7 +1413,7 @@ test_expect_success 'urlmatch with wildcard' ' ' # good section hygiene -test_expect_failure 'unsetting the last key in a section removes header' ' +test_expect_failure '--unset last key removes section (except if commented)' ' cat >.git/config <<-\EOF && # some generic comment on the configuration file itself # a comment specific to this "section" section. 
@@ -1427,6 +1427,25 @@ test_expect_failure 'unsetting the last key in a section removes header' ' cat >expect <<-\EOF && # some generic comment on the configuration file itself + # a comment specific to this "section" section. + [section] + # some intervening lines + # that should also be dropped + + # please be careful when you update the above variable + EOF + + git config --unset section.key && + test_cmp expect .git/config && + + cat >.git/config <<-\EOF && + [section] + key = value + [next-section] + EOF + + cat >expect <<-\EOF && + [next-section] EOF git config --unset section.key && -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 07/15] t1300: `--unset-all` can leave an empty section behind (bug) 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (5 preceding siblings ...) 2018-04-03 16:28 ` [PATCH v2 06/15] t1300: remove unreasonable expectation from TODO Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 08/15] config: introduce an optional event stream while parsing Johannes Schindelin ` (9 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley We already have a test demonstrating that removing the last entry from a config section fails to remove the section header of the now-empty section. The same can happen, of course, if we remove the last entries in one fell swoop. This is *also* a bug, and should be fixed at the same time. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 187fc5b195f..10b9bf4b088 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1452,6 +1452,17 @@ test_expect_failure '--unset last key removes section (except if commented)' ' test_cmp expect .git/config ' +test_expect_failure '--unset-all removes section if empty & uncommented' ' + cat >.git/config <<-\EOF && + [section] + key = value1 + key = value2 + EOF + + git config --unset-all section.key && + test_line_count = 0 .git/config +' + test_expect_failure 'adding a key into an empty section reuses header' ' cat >.git/config <<-\EOF && [section] -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v2 08/15] config: introduce an optional event stream while parsing
  2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin
  ` (6 preceding siblings ...)
  2018-04-03 16:28 ` [PATCH v2 07/15] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin
@ 2018-04-03 16:28 ` Johannes Schindelin
  2018-04-06 21:22 ` Jeff King
  2018-04-03 16:28 ` [PATCH v2 09/15] config: avoid using the global variable `store` Johannes Schindelin
  ` (8 subsequent siblings)
  16 siblings, 1 reply; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-03 16:28 UTC
To: git
Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

This extends our config parser so that it can optionally produce an event
stream via callback function, where it reports e.g. when a comment was
parsed, or a section header, etc.

This parser will be used subsequently to handle the scenarios better where
removing config entries would make sections empty, or where a new entry
could be added to an already-existing, empty section.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-------- config.h | 25 ++++++++++++++++ 2 files changed, 115 insertions(+), 12 deletions(-) diff --git a/config.c b/config.c index f10f8c6f52f..4cd745f6628 100644 --- a/config.c +++ b/config.c @@ -653,7 +653,46 @@ static int get_base_var(struct strbuf *name) } } -static int git_parse_source(config_fn_t fn, void *data) +struct parse_event_data { + enum config_event_t previous_type; + size_t previous_offset; + const struct config_options *opts; +}; + +static inline int do_event(enum config_event_t type, + struct parse_event_data *data) +{ + size_t offset; + + if (!data->opts || !data->opts->event_fn) + return 0; + + if (type == CONFIG_EVENT_WHITESPACE && + data->previous_type == type) + return 0; + + offset = cf->do_ftell(cf); + /* + * At EOF, the parser always "inserts" an extra '\n', therefore + * the end offset of the event is the current file position, otherwise + * we will already have advanced to the next event. + */ + if (type != CONFIG_EVENT_EOF) + offset--; + + if (data->previous_type != CONFIG_EVENT_EOF && + data->opts->event_fn(data->previous_type, data->previous_offset, + offset, data->opts->event_fn_data) < 0) + return -1; + + data->previous_type = type; + data->previous_offset = offset; + + return 0; +} + +static int git_parse_source(config_fn_t fn, void *data, + const struct config_options *opts) { int comment = 0; int baselen = 0; @@ -664,8 +703,15 @@ static int git_parse_source(config_fn_t fn, void *data) /* U+FEFF Byte Order Mark in UTF8 */ const char *bomptr = utf8_bom; + /* For the parser event callback */ + struct parse_event_data event_data = { + CONFIG_EVENT_EOF, 0, opts + }; + for (;;) { - int c = get_next_char(); + int c; + + c = get_next_char(); if (bomptr && *bomptr) { /* We are at the file beginning; skip UTF8-encoded BOM * if present. 
Sane editors won't put this in on their @@ -682,18 +728,33 @@ static int git_parse_source(config_fn_t fn, void *data) } } if (c == '\n') { - if (cf->eof) + if (cf->eof) { + if (do_event(CONFIG_EVENT_EOF, &event_data) < 0) + return -1; return 0; + } + if (do_event(CONFIG_EVENT_WHITESPACE, &event_data) < 0) + return -1; comment = 0; continue; } - if (comment || isspace(c)) + if (comment) continue; + if (isspace(c)) { + if (do_event(CONFIG_EVENT_WHITESPACE, &event_data) < 0) + return -1; + continue; + } if (c == '#' || c == ';') { + if (do_event(CONFIG_EVENT_COMMENT, &event_data) < 0) + return -1; comment = 1; continue; } if (c == '[') { + if (do_event(CONFIG_EVENT_SECTION, &event_data) < 0) + return -1; + /* Reset prior to determining a new stem */ strbuf_reset(var); if (get_base_var(var) < 0 || var->len < 1) @@ -704,6 +765,10 @@ static int git_parse_source(config_fn_t fn, void *data) } if (!isalpha(c)) break; + + if (do_event(CONFIG_EVENT_ENTRY, &event_data) < 0) + return -1; + /* * Truncate the var name back to the section header * stem prior to grabbing the suffix part of the name @@ -715,6 +780,9 @@ static int git_parse_source(config_fn_t fn, void *data) break; } + if (do_event(CONFIG_EVENT_ERROR, &event_data) < 0) + return -1; + switch (cf->origin_type) { case CONFIG_ORIGIN_BLOB: error_msg = xstrfmt(_("bad config line %d in blob %s"), @@ -1398,7 +1466,8 @@ int git_default_config(const char *var, const char *value, void *dummy) * fgetc, ungetc, ftell of top need to be initialized before calling * this function. 
*/ -static int do_config_from(struct config_source *top, config_fn_t fn, void *data) +static int do_config_from(struct config_source *top, config_fn_t fn, void *data, + const struct config_options *opts) { int ret; @@ -1410,7 +1479,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data) strbuf_init(&top->var, 1024); cf = top; - ret = git_parse_source(fn, data); + ret = git_parse_source(fn, data, opts); /* pop config-file parsing state stack */ strbuf_release(&top->value); @@ -1423,7 +1492,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data) static int do_config_from_file(config_fn_t fn, const enum config_origin_type origin_type, const char *name, const char *path, FILE *f, - void *data) + void *data, const struct config_options *opts) { struct config_source top; @@ -1436,15 +1505,18 @@ static int do_config_from_file(config_fn_t fn, top.do_ungetc = config_file_ungetc; top.do_ftell = config_file_ftell; - return do_config_from(&top, fn, data); + return do_config_from(&top, fn, data, opts); } static int git_config_from_stdin(config_fn_t fn, void *data) { - return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data); + return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, + data, NULL); } -int git_config_from_file(config_fn_t fn, const char *filename, void *data) +int git_config_from_file_with_options(config_fn_t fn, const char *filename, + void *data, + const struct config_options *opts) { int ret = -1; FILE *f; @@ -1452,13 +1524,19 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) f = fopen_or_warn(filename, "r"); if (f) { flockfile(f); - ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data); + ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, + filename, f, data, opts); funlockfile(f); fclose(f); } return ret; } +int git_config_from_file(config_fn_t fn, const char *filename, void *data) +{ + return 
git_config_from_file_with_options(fn, filename, data, NULL); +} + int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_type, const char *name, const char *buf, size_t len, void *data) { @@ -1475,7 +1553,7 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ top.do_ungetc = config_buf_ungetc; top.do_ftell = config_buf_ftell; - return do_config_from(&top, fn, data); + return do_config_from(&top, fn, data, NULL); } int git_config_from_blob_oid(config_fn_t fn, diff --git a/config.h b/config.h index ef70a9cac1e..5a2394daae2 100644 --- a/config.h +++ b/config.h @@ -28,15 +28,40 @@ enum config_origin_type { CONFIG_ORIGIN_CMDLINE }; +enum config_event_t { + CONFIG_EVENT_SECTION, + CONFIG_EVENT_ENTRY, + CONFIG_EVENT_WHITESPACE, + CONFIG_EVENT_COMMENT, + CONFIG_EVENT_EOF, + CONFIG_EVENT_ERROR +}; + +/* + * The parser event function (if not NULL) is called with the event type and + * the begin/end offsets of the parsed elements. + * + * Note: for CONFIG_EVENT_ENTRY (i.e. config variables), the trailing newline + * character is considered part of the element. 
+ */ +typedef int (*config_parser_event_fn_t)(enum config_event_t type, + size_t begin_offset, size_t end_offset, + void *event_fn_data); + struct config_options { unsigned int respect_includes : 1; const char *commondir; const char *git_dir; + config_parser_event_fn_t event_fn; + void *event_fn_data; }; typedef int (*config_fn_t)(const char *, const char *, void *); extern int git_default_config(const char *, const char *, void *); extern int git_config_from_file(config_fn_t fn, const char *, void *); +extern int git_config_from_file_with_options(config_fn_t fn, const char *, + void *, + const struct config_options *); extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type, const char *name, const char *buf, size_t len, void *data); extern int git_config_from_blob_oid(config_fn_t fn, const char *name, -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
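The deferred reporting done by `do_event()` can be easier to follow in isolation. The following is a minimal sketch, not code from the series: the enum and the callback typedef mirror the patch, while `struct emitter`, `emit()`, `struct event_stats`, `count_events()` and `replay_sample()` are hypothetical names introduced for illustration. It also assumes the caller passes the start offset of each new element explicitly, where the patch derives it from `cf->do_ftell(cf)`.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the enum the patch adds to config.h. */
enum config_event_t {
	CONFIG_EVENT_SECTION,
	CONFIG_EVENT_ENTRY,
	CONFIG_EVENT_WHITESPACE,
	CONFIG_EVENT_COMMENT,
	CONFIG_EVENT_EOF,
	CONFIG_EVENT_ERROR
};

/* Mirrors the callback type the patch adds to config.h. */
typedef int (*config_parser_event_fn_t)(enum config_event_t type,
					size_t begin_offset, size_t end_offset,
					void *event_fn_data);

/* Hypothetical consumer state: per-type counters plus the last range seen. */
struct event_stats {
	unsigned int counts[CONFIG_EVENT_ERROR + 1];
	size_t last_begin, last_end;
};

/* Hypothetical callback: records each reported element. */
static int count_events(enum config_event_t type, size_t begin, size_t end,
			void *data)
{
	struct event_stats *stats = data;

	assert(type <= CONFIG_EVENT_ERROR);
	if (end < begin)
		return -1; /* a negative return value aborts parsing */
	stats->counts[type]++;
	stats->last_begin = begin;
	stats->last_end = end;
	return 0;
}

/* Simplified stand-in for the patch's struct parse_event_data. */
struct emitter {
	enum config_event_t previous_type;
	size_t previous_offset;
	config_parser_event_fn_t event_fn;
	void *event_fn_data;
};

/*
 * Simplified model of do_event(): an element is reported only once the
 * next one starts, because only then is its end offset known.
 */
static int emit(struct emitter *e, enum config_event_t type, size_t offset)
{
	if (!e->event_fn)
		return 0;
	/* Consecutive whitespace is collapsed into a single event. */
	if (type == CONFIG_EVENT_WHITESPACE && e->previous_type == type)
		return 0;
	/* CONFIG_EVENT_EOF doubles as "nothing pending yet". */
	if (e->previous_type != CONFIG_EVENT_EOF &&
	    e->event_fn(e->previous_type, e->previous_offset, offset,
			e->event_fn_data) < 0)
		return -1;
	e->previous_type = type;
	e->previous_offset = offset;
	return 0;
}

/* Replays the element boundaries of the 11-byte file "[s]\n\tk = v\n". */
int replay_sample(struct event_stats *stats)
{
	struct emitter e = { CONFIG_EVENT_EOF, 0, count_events, NULL };

	e.event_fn_data = stats;
	if (emit(&e, CONFIG_EVENT_SECTION, 0) < 0 ||	/* '[' */
	    emit(&e, CONFIG_EVENT_WHITESPACE, 3) < 0 ||	/* '\n' after "[s]" */
	    emit(&e, CONFIG_EVENT_WHITESPACE, 4) < 0 ||	/* '\t', collapsed */
	    emit(&e, CONFIG_EVENT_ENTRY, 5) < 0 ||	/* 'k' */
	    emit(&e, CONFIG_EVENT_EOF, 11) < 0)		/* end of file */
		return -1;
	return 0;
}
```

Replaying the sample reports three elements: the section header, one collapsed run of whitespace, and the entry including its trailing newline. The EOF event only flushes the pending entry and is never reported itself, since nothing follows it.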
* Re: [PATCH v2 08/15] config: introduce an optional event stream while parsing
  2018-04-03 16:28 ` [PATCH v2 08/15] config: introduce an optional event stream while parsing Johannes Schindelin
@ 2018-04-06 21:22 ` Jeff King
  2018-04-09 7:35 ` Johannes Schindelin
  0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2018-04-06 21:22 UTC
To: Johannes Schindelin
Cc: git, Junio C Hamano, Thomas Rast, Phil Haack,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

On Tue, Apr 03, 2018 at 06:28:29PM +0200, Johannes Schindelin wrote:

> This extends our config parser so that it can optionally produce an event
> stream via callback function, where it reports e.g. when a comment was
> parsed, or a section header, etc.
>
> This parser will be used subsequently to handle the scenarios better where
> removing config entries would make sections empty, or where a new entry
> could be added to an already-existing, empty section.

Nice, it looks like this didn't end up being too bad to go in this
direction. It seems like this is an optional "also emit the events here"
function you can set.

I think in the long run we could actually just always emit the events to
this function. And then we could wrap that to provide an interface that
matches the existing callbacks (just an event-stream callback that sees
EVENT_ENTRY and calls the sub-callback).

But that might end up quite a pain, since we have a zillion entry points
into the config parser, making wrapping tough. So I'm perfectly happy to
stop here for now.

> +static inline int do_event(enum config_event_t type,
> +			   struct parse_event_data *data)

I'm not sure if "inline" here is a good idea, as it seems to get called
quite a few times. If we're trying to make things fast, bloating the
instruction cache may have the opposite effect.

-Peff
* Re: [PATCH v2 08/15] config: introduce an optional event stream while parsing
  2018-04-06 21:22 ` Jeff King
@ 2018-04-09 7:35 ` Johannes Schindelin
  0 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-09 7:35 UTC
To: Jeff King
Cc: git, Junio C Hamano, Thomas Rast, Phil Haack,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

Hi Peff,

On Fri, 6 Apr 2018, Jeff King wrote:

> On Tue, Apr 03, 2018 at 06:28:29PM +0200, Johannes Schindelin wrote:
>
> > This extends our config parser so that it can optionally produce an
> > event stream via callback function, where it reports e.g. when a
> > comment was parsed, or a section header, etc.
> >
> > This parser will be used subsequently to handle the scenarios better
> > where removing config entries would make sections empty, or where a
> > new entry could be added to an already-existing, empty section.
>
> Nice, it looks like this didn't end up being too bad to go in this
> direction. It seems like this is an optional "also emit the events here"
> function you can set.

Yes.

> I think in the long run we could actually just always emit the events to
> this function. And then we could wrap that to provide an interface that
> matches the existing callbacks (just an event-stream callback that sees
> EVENT_ENTRY and calls the sub-callback).

Well, not precisely. The event stream was implemented in a minimal
fashion, in particular *not* emitting enough information in the event
stream for that. To keep things as little intrusive as possible, the
CONFIG_EVENT_ENTRY event is only emitted *after* the config_fn is called,
and at that point we do not even know the key and the value any more. I
fear that it would make the code quite a bit more complicated to change it
in the way you suggested.

Side note: a slightly ugly aspect of my patch series is that the
CONFIG_EVENT_SECTION event *also* does not provide the interesting
information (in this case, the section name), but that it has to be
inferred from the cf->var field (which is file-local to config.c, and
which has been set to the section name followed by a single '.' at that
point). Again, this keeps the diff simpler to review, and that's why I did
it that way.

> But that might end up quite a pain, since we have a zillion entry points
> into the config parser, making wrapping tough. So I'm perfectly happy to
> stop here for now.

Right.

> > +static inline int do_event(enum config_event_t type,
> > +			   struct parse_event_data *data)
>
> I'm not sure if "inline" here is a good idea, as it seems to get called
> quite a few times. If we're trying to make things fast, bloating the
> instruction cache may have the opposite effect.

Good point. The reason I declared this as inline function was that I test
whether either data->opts or data->opts->event_fn are NULL, and whether we
are continuing to look at whitespace, for early returns from that
function. Which I wanted to avoid doing in a hot function (I'd rather skip
calling the function if it is pointless to call it). However, the config
code is hardly performance-critical, as we do not expect to parse hundreds
of kilobytes, right? So that "inline" was a premature optimization.

Thanks,
Dscho
* [PATCH v2 09/15] config: avoid using the global variable `store`
  2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin
  ` (7 preceding siblings ...)
  2018-04-03 16:28 ` [PATCH v2 08/15] config: introduce an optional event stream while parsing Johannes Schindelin
@ 2018-04-03 16:28 ` Johannes Schindelin
  2018-04-06 21:23 ` Jeff King
  2018-04-03 16:28 ` [PATCH v2 10/15] config_set_store: rename some fields for consistency Johannes Schindelin
  ` (7 subsequent siblings)
  16 siblings, 1 reply; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-03 16:28 UTC
To: git
Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

It is much easier to reason about, when the config code to set/unset
variables or to remove/rename sections does not rely on a global (or
file-local) variable.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.c | 119 +++++++++++++++++++++++++++++++++++---------------------------------------
 1 file changed, 66 insertions(+), 53 deletions(-)

diff --git a/config.c b/config.c
index 4cd745f6628..90ae71cb905 100644
--- a/config.c
+++ b/config.c
@@ -2297,7 +2297,7 @@ void git_die_config(const char *key, const char *err, ...)
  * Find all the stuff for git_config_set() below.
*/ -static struct { +struct config_set_store { int baselen; char *key; int do_not_match; @@ -2307,56 +2307,58 @@ static struct { unsigned int offset_alloc; enum { START, SECTION_SEEN, SECTION_END_SEEN, KEY_SEEN } state; unsigned int seen; -} store; +}; -static int matches(const char *key, const char *value) +static int matches(const char *key, const char *value, + const struct config_set_store *store) { - if (strcmp(key, store.key)) + if (strcmp(key, store->key)) return 0; /* not ours */ - if (!store.value_regex) + if (!store->value_regex) return 1; /* always matches */ - if (store.value_regex == CONFIG_REGEX_NONE) + if (store->value_regex == CONFIG_REGEX_NONE) return 0; /* never matches */ - return store.do_not_match ^ - (value && !regexec(store.value_regex, value, 0, NULL, 0)); + return store->do_not_match ^ + (value && !regexec(store->value_regex, value, 0, NULL, 0)); } static int store_aux(const char *key, const char *value, void *cb) { const char *ep; size_t section_len; + struct config_set_store *store = cb; - switch (store.state) { + switch (store->state) { case KEY_SEEN: - if (matches(key, value)) { - if (store.seen == 1 && store.multi_replace == 0) { + if (matches(key, value, store)) { + if (store->seen == 1 && store->multi_replace == 0) { warning(_("%s has multiple values"), key); } - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); + ALLOC_GROW(store->offset, store->seen + 1, + store->offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - store.seen++; + store->offset[store->seen] = cf->do_ftell(cf); + store->seen++; } break; case SECTION_SEEN: /* - * What we are looking for is in store.key (both + * What we are looking for is in store->key (both * section and var), and its section part is baselen * long. We found key (again, both section and var). * We would want to know if this key is in the same * section as what we are looking for. We already * know we are in the same section as what should - * hold store.key. 
+ * hold store->key. */ ep = strrchr(key, '.'); section_len = ep - key; - if ((section_len != store.baselen) || - memcmp(key, store.key, section_len+1)) { - store.state = SECTION_END_SEEN; + if ((section_len != store->baselen) || + memcmp(key, store->key, section_len+1)) { + store->state = SECTION_END_SEEN; break; } @@ -2364,26 +2366,27 @@ static int store_aux(const char *key, const char *value, void *cb) * Do not increment matches: this is no match, but we * just made sure we are in the desired section. */ - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); + ALLOC_GROW(store->offset, store->seen + 1, + store->offset_alloc); + store->offset[store->seen] = cf->do_ftell(cf); /* fallthru */ case SECTION_END_SEEN: case START: - if (matches(key, value)) { - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - store.state = KEY_SEEN; - store.seen++; + if (matches(key, value, store)) { + ALLOC_GROW(store->offset, store->seen + 1, + store->offset_alloc); + store->offset[store->seen] = cf->do_ftell(cf); + store->state = KEY_SEEN; + store->seen++; } else { - if (strrchr(key, '.') - key == store.baselen && - !strncmp(key, store.key, store.baselen)) { - store.state = SECTION_SEEN; - ALLOC_GROW(store.offset, - store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); + if (strrchr(key, '.') - key == store->baselen && + !strncmp(key, store->key, store->baselen)) { + store->state = SECTION_SEEN; + ALLOC_GROW(store->offset, + store->seen + 1, + store->offset_alloc); + store->offset[store->seen] = + cf->do_ftell(cf); } } } @@ -2398,31 +2401,33 @@ static int write_error(const char *filename) return 4; } -static struct strbuf store_create_section(const char *key) +static struct strbuf store_create_section(const char *key, + const struct config_set_store *store) { const char *dot; int i; struct strbuf sb = STRBUF_INIT; - dot = 
memchr(key, '.', store.baselen); + dot = memchr(key, '.', store->baselen); if (dot) { strbuf_addf(&sb, "[%.*s \"", (int)(dot - key), key); - for (i = dot - key + 1; i < store.baselen; i++) { + for (i = dot - key + 1; i < store->baselen; i++) { if (key[i] == '"' || key[i] == '\\') strbuf_addch(&sb, '\\'); strbuf_addch(&sb, key[i]); } strbuf_addstr(&sb, "\"]\n"); } else { - strbuf_addf(&sb, "[%.*s]\n", store.baselen, key); + strbuf_addf(&sb, "[%.*s]\n", store->baselen, key); } return sb; } -static ssize_t write_section(int fd, const char *key) +static ssize_t write_section(int fd, const char *key, + const struct config_set_store *store) { - struct strbuf sb = store_create_section(key); + struct strbuf sb = store_create_section(key, store); ssize_t ret; ret = write_in_full(fd, sb.buf, sb.len); @@ -2431,11 +2436,12 @@ static ssize_t write_section(int fd, const char *key) return ret; } -static ssize_t write_pair(int fd, const char *key, const char *value) +static ssize_t write_pair(int fd, const char *key, const char *value, + const struct config_set_store *store) { int i; ssize_t ret; - int length = strlen(key + store.baselen + 1); + int length = strlen(key + store->baselen + 1); const char *quote = ""; struct strbuf sb = STRBUF_INIT; @@ -2455,7 +2461,7 @@ static ssize_t write_pair(int fd, const char *key, const char *value) quote = "\""; strbuf_addf(&sb, "\t%.*s = %s", - length, key + store.baselen + 1, quote); + length, key + store->baselen + 1, quote); for (i = 0; value[i]; i++) switch (value[i]) { @@ -2565,6 +2571,9 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, char *filename_buf = NULL; char *contents = NULL; size_t contents_sz; + struct config_set_store store; + + memset(&store, 0, sizeof(store)); /* parse-key returns negative; flip the sign to feed exit(3) */ ret = 0 - git_config_parse_key(key, &store.key, &store.baselen); @@ -2607,8 +2616,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } store.key = 
(char *)key; - if (write_section(fd, key) < 0 || - write_pair(fd, key, value) < 0) + if (write_section(fd, key, &store) < 0 || + write_pair(fd, key, value, &store) < 0) goto write_err_out; } else { struct stat st; @@ -2647,7 +2656,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, * As a side effect, we make sure to transform only a valid * existing config file. */ - if (git_config_from_file(store_aux, config_filename, NULL)) { + if (git_config_from_file(store_aux, config_filename, &store)) { error("invalid config file %s", config_filename); free(store.key); if (store.value_regex != NULL && @@ -2731,10 +2740,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, /* write the pair (value == NULL means unset) */ if (value != NULL) { if (store.state == START) { - if (write_section(fd, key) < 0) + if (write_section(fd, key, &store) < 0) goto write_err_out; } - if (write_pair(fd, key, value) < 0) + if (write_pair(fd, key, value, &store) < 0) goto write_err_out; } @@ -2858,7 +2867,8 @@ static int section_name_is_ok(const char *name) /* if new_name == NULL, the section is removed instead */ static int git_config_copy_or_rename_section_in_file(const char *config_filename, - const char *old_name, const char *new_name, int copy) + const char *old_name, + const char *new_name, int copy) { int ret = 0, remove = 0; char *filename_buf = NULL; @@ -2868,6 +2878,9 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename FILE *config_file = NULL; struct stat st; struct strbuf copystr = STRBUF_INIT; + struct config_set_store store; + + memset(&store, 0, sizeof(store)); if (new_name && !section_name_is_ok(new_name)) { ret = error("invalid section name: %s", new_name); @@ -2937,7 +2950,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename } store.baselen = strlen(new_name); if (!copy) { - if (write_section(out_fd, new_name) < 0) { + if (write_section(out_fd, new_name, &store) < 
0) { ret = write_error(get_lock_file_path(&lock)); goto out; } @@ -2958,7 +2971,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename output[0] = '\t'; } } else { - copystr = store_create_section(new_name); + copystr = store_create_section(new_name, &store); } } remove = 0; -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v2 09/15] config: avoid using the global variable `store`
  2018-04-03 16:28 ` [PATCH v2 09/15] config: avoid using the global variable `store` Johannes Schindelin
@ 2018-04-06 21:23 ` Jeff King
  2018-04-09 7:36 ` Johannes Schindelin
  0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2018-04-06 21:23 UTC
To: Johannes Schindelin
Cc: git, Junio C Hamano, Thomas Rast, Phil Haack,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

On Tue, Apr 03, 2018 at 06:28:34PM +0200, Johannes Schindelin wrote:

> It is much easier to reason about, when the config code to set/unset
> variables or to remove/rename sections does not rely on a global (or
> file-local) variable.

Agreed.

> -static struct {
> +struct config_set_store {

This made me think of the existing "configset", which is quite a
different thing. Maybe just "config_store_data" or something would clash
less.

-Peff
* Re: [PATCH v2 09/15] config: avoid using the global variable `store`
  2018-04-06 21:23 ` Jeff King
@ 2018-04-09 7:36 ` Johannes Schindelin
  0 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-09 7:36 UTC
To: Jeff King
Cc: git, Junio C Hamano, Thomas Rast, Phil Haack,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

Hi Peff,

On Fri, 6 Apr 2018, Jeff King wrote:

> On Tue, Apr 03, 2018 at 06:28:34PM +0200, Johannes Schindelin wrote:
>
> > -static struct {
> > +struct config_set_store {
>
> This made me think of the existing "configset", which is quite a
> different thing. Maybe just "config_store_data" or something would clash
> less.

Sure,
Dscho
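The core of patch 09 is that the callback receives its state through the `void *` argument instead of reaching for a file-local variable. A minimal sketch of that pattern follows. Only the `config_fn_t` signature is taken from config.h; `struct store_data`, `count_matches()`, `for_each_pair()` and `count_key()` are hypothetical stand-ins, and `for_each_pair()` simply walks a hard-coded list where the real code would call `git_config_from_file()`.

```c
#include <stddef.h>
#include <string.h>

/* Same signature as config_fn_t in config.h. */
typedef int (*config_fn_t)(const char *, const char *, void *);

/* Hypothetical, pared-down stand-in for the patch's config_set_store. */
struct store_data {
	const char *key;	/* the key being looked for */
	unsigned int seen_nr;	/* how many times it was seen */
};

/* All state arrives via `cb`; nothing global is read or written. */
static int count_matches(const char *key, const char *value, void *cb)
{
	struct store_data *store = cb;

	(void)value; /* unused in this sketch */
	if (!strcmp(key, store->key))
		store->seen_nr++;
	return 0;
}

/* Hypothetical driver standing in for git_config_from_file(). */
int for_each_pair(config_fn_t fn, void *data)
{
	static const char *pairs[][2] = {
		{ "core.bare", "false" },
		{ "section.key", "value1" },
		{ "section.key", "value2" },
	};
	size_t i;

	for (i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
		if (fn(pairs[i][0], pairs[i][1], data) < 0)
			return -1;
	return 0;
}

/* Each call owns its state, so calls cannot interfere with each other. */
unsigned int count_key(const char *key)
{
	struct store_data store;

	memset(&store, 0, sizeof(store));
	store.key = key;
	if (for_each_pair(count_matches, &store) < 0)
		return 0;
	return store.seen_nr;
}
```

Because `store` lives on the caller's stack, two lookups can run back to back (or nested) without one clobbering the other's counters, which is the property that makes the refactored code easier to reason about.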
* [PATCH v2 10/15] config_set_store: rename some fields for consistency
  2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin
  ` (8 preceding siblings ...)
  2018-04-03 16:28 ` [PATCH v2 09/15] config: avoid using the global variable `store` Johannes Schindelin
@ 2018-04-03 16:28 ` Johannes Schindelin
  2018-04-03 16:28 ` [PATCH v2 11/15] git_config_set: do not use a state machine Johannes Schindelin
  ` (6 subsequent siblings)
  16 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-03 16:28 UTC
To: git
Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

The `seen` field is the actual length of the `offset` array, and the
`offset_alloc` field records what was allocated (to avoid resizing
wherever `seen` has to be incremented).

Elsewhere, we use the convention `name` for the array, where `name` is
descriptive enough to guess its purpose, `name_nr` for the actual length
and `name_alloc` to record the maximum length without needing to resize.

Let's make the names of the fields in question consistent with that
convention.

This will also help with the next steps where we will let the
git_config_set() machinery use the config event stream that we just
introduced.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 63 +++++++++++++++++++++++++++++++-------------------------------- 1 file changed, 31 insertions(+), 32 deletions(-) diff --git a/config.c b/config.c index 90ae71cb905..b73b48b5650 100644 --- a/config.c +++ b/config.c @@ -2303,10 +2303,9 @@ struct config_set_store { int do_not_match; regex_t *value_regex; int multi_replace; - size_t *offset; - unsigned int offset_alloc; + size_t *seen; + unsigned int seen_nr, seen_alloc; enum { START, SECTION_SEEN, SECTION_END_SEEN, KEY_SEEN } state; - unsigned int seen; }; static int matches(const char *key, const char *value, @@ -2332,15 +2331,15 @@ static int store_aux(const char *key, const char *value, void *cb) switch (store->state) { case KEY_SEEN: if (matches(key, value, store)) { - if (store->seen == 1 && store->multi_replace == 0) { + if (store->seen_nr == 1 && store->multi_replace == 0) { warning(_("%s has multiple values"), key); } - ALLOC_GROW(store->offset, store->seen + 1, - store->offset_alloc); + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); - store->offset[store->seen] = cf->do_ftell(cf); - store->seen++; + store->seen[store->seen_nr] = cf->do_ftell(cf); + store->seen_nr++; } break; case SECTION_SEEN: @@ -2366,26 +2365,26 @@ static int store_aux(const char *key, const char *value, void *cb) * Do not increment matches: this is no match, but we * just made sure we are in the desired section. 
*/ - ALLOC_GROW(store->offset, store->seen + 1, - store->offset_alloc); - store->offset[store->seen] = cf->do_ftell(cf); + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); /* fallthru */ case SECTION_END_SEEN: case START: if (matches(key, value, store)) { - ALLOC_GROW(store->offset, store->seen + 1, - store->offset_alloc); - store->offset[store->seen] = cf->do_ftell(cf); + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); store->state = KEY_SEEN; - store->seen++; + store->seen_nr++; } else { if (strrchr(key, '.') - key == store->baselen && !strncmp(key, store->key, store->baselen)) { store->state = SECTION_SEEN; - ALLOC_GROW(store->offset, - store->seen + 1, - store->offset_alloc); - store->offset[store->seen] = + ALLOC_GROW(store->seen, + store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); } } @@ -2645,10 +2644,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } } - ALLOC_GROW(store.offset, 1, store.offset_alloc); - store.offset[0] = 0; + ALLOC_GROW(store.seen, 1, store.seen_alloc); + store.seen[0] = 0; store.state = START; - store.seen = 0; + store.seen_nr = 0; /* * After this, store.offset will contain the *end* offset @@ -2676,8 +2675,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } /* if nothing to unset, or too many matches, error out */ - if ((store.seen == 0 && value == NULL) || - (store.seen > 1 && multi_replace == 0)) { + if ((store.seen_nr == 0 && value == NULL) || + (store.seen_nr > 1 && multi_replace == 0)) { ret = CONFIG_NOTHING_SET; goto out_free; } @@ -2708,19 +2707,19 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, goto out_free; } - if (store.seen == 0) - store.seen = 1; + if (store.seen_nr == 0) + store.seen_nr = 1; - for (i = 0, copy_begin = 0; i < store.seen; i++) { + for (i = 0, copy_begin 
= 0; i < store.seen_nr; i++) { new_line = 0; - if (store.offset[i] == 0) { - store.offset[i] = copy_end = contents_sz; + if (store.seen[i] == 0) { + store.seen[i] = copy_end = contents_sz; } else if (store.state != KEY_SEEN) { - copy_end = store.offset[i]; + copy_end = store.seen[i]; } else copy_end = find_beginning_of_line( contents, contents_sz, - store.offset[i], &new_line); + store.seen[i], &new_line); if (copy_end > 0 && contents[copy_end-1] != '\n') new_line = 1; @@ -2734,7 +2733,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, write_str_in_full(fd, "\n") < 0) goto write_err_out; } - copy_begin = store.offset[i]; + copy_begin = store.seen[i]; } /* write the pair (value == NULL means unset) */ -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
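The `name` / `name_nr` / `name_alloc` convention that patch 10 adopts can be illustrated outside of config.c. This is a minimal sketch under stated assumptions: `struct offsets`, `grow()`, `record_offset()` and `release_offsets()` are hypothetical names, and `grow()` merely stands in for Git's `ALLOC_GROW()` macro, with an illustrative growth policy and an error return where Git itself would die on allocation failure.

```c
#include <stdlib.h>

struct offsets {
	size_t *seen;			/* `name`: the array itself */
	unsigned int seen_nr;		/* `name_nr`: entries in use */
	unsigned int seen_alloc;	/* `name_alloc`: entries allocated */
};

/*
 * Hypothetical stand-in for ALLOC_GROW(o->seen, needed, o->seen_alloc):
 * make sure there is room for `needed` entries, over-allocating so that
 * most appends do not have to resize.
 */
static int grow(struct offsets *o, unsigned int needed)
{
	unsigned int alloc;
	size_t *p;

	if (needed <= o->seen_alloc)
		return 0;
	alloc = (o->seen_alloc + 16) * 3 / 2;
	if (alloc < needed)
		alloc = needed;
	p = realloc(o->seen, alloc * sizeof(*p));
	if (!p)
		return -1;
	o->seen = p;
	o->seen_alloc = alloc;
	return 0;
}

/* Append one offset: grow first, then store and bump `seen_nr`. */
int record_offset(struct offsets *o, size_t offset)
{
	if (grow(o, o->seen_nr + 1) < 0)
		return -1;
	o->seen[o->seen_nr++] = offset;
	return 0;
}

/* Free the array and reset all three fields together. */
void release_offsets(struct offsets *o)
{
	free(o->seen);
	o->seen = NULL;
	o->seen_nr = o->seen_alloc = 0;
}
```

With the three fields named after the same stem, it is apparent at every call site which counter is the length and which is the capacity, which was not the case when `seen` was a count and `offset` the array.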
* [PATCH v2 11/15] git_config_set: do not use a state machine
  2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin
  ` (9 preceding siblings ...)
  2018-04-03 16:28 ` [PATCH v2 10/15] config_set_store: rename some fields for consistency Johannes Schindelin
@ 2018-04-03 16:28 ` Johannes Schindelin
  2018-04-06 21:28 ` Jeff King
  2018-04-03 16:28 ` [PATCH v2 12/15] git_config_set: make use of the config parser's event stream Johannes Schindelin
  ` (5 subsequent siblings)
  16 siblings, 1 reply; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-03 16:28 UTC
To: git
Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

While a neat theoretical construct, state machines are hard to read. In
this instance, it does not even make a whole lot of sense because we are
more interested in flags, anyway: has the section been seen? Has the key
been seen? Does the current section match the key we are looking for?

Besides, the state `SECTION_SEEN` was named in a misleading way: it did
not indicate that we saw the section matching the key we are looking for,
but it instead indicated that we are *currently* in that section.

Let's just replace the state machine logic by clear and obvious flags.

This will also make it easier to review the upcoming patches to use the
newly-introduced `event_fn` callback of the config parser.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 59 +++++++++++++++++++++++++++++------------------------------ 1 file changed, 29 insertions(+), 30 deletions(-) diff --git a/config.c b/config.c index b73b48b5650..84e8f7ffeb8 100644 --- a/config.c +++ b/config.c @@ -2305,7 +2305,7 @@ struct config_set_store { int multi_replace; size_t *seen; unsigned int seen_nr, seen_alloc; - enum { START, SECTION_SEEN, SECTION_END_SEEN, KEY_SEEN } state; + unsigned int key_seen:1, section_seen:1, is_keys_section:1; }; static int matches(const char *key, const char *value, @@ -2328,8 +2328,7 @@ static int store_aux(const char *key, const char *value, void *cb) size_t section_len; struct config_set_store *store = cb; - switch (store->state) { - case KEY_SEEN: + if (store->key_seen) { if (matches(key, value, store)) { if (store->seen_nr == 1 && store->multi_replace == 0) { warning(_("%s has multiple values"), key); @@ -2341,8 +2340,8 @@ static int store_aux(const char *key, const char *value, void *cb) store->seen[store->seen_nr] = cf->do_ftell(cf); store->seen_nr++; } - break; - case SECTION_SEEN: + return 0; + } else if (store->is_keys_section) { /* * What we are looking for is in store->key (both * section and var), and its section part is baselen @@ -2357,10 +2356,9 @@ static int store_aux(const char *key, const char *value, void *cb) if ((section_len != store->baselen) || memcmp(key, store->key, section_len+1)) { - store->state = SECTION_END_SEEN; - break; + store->is_keys_section = 0; + return 0; } - /* * Do not increment matches: this is no match, but we * just made sure we are in the desired section. 
@@ -2368,27 +2366,29 @@ static int store_aux(const char *key, const char *value, void *cb) ALLOC_GROW(store->seen, store->seen_nr + 1, store->seen_alloc); store->seen[store->seen_nr] = cf->do_ftell(cf); - /* fallthru */ - case SECTION_END_SEEN: - case START: - if (matches(key, value, store)) { - ALLOC_GROW(store->seen, store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); - store->state = KEY_SEEN; - store->seen_nr++; - } else { - if (strrchr(key, '.') - key == store->baselen && - !strncmp(key, store->key, store->baselen)) { - store->state = SECTION_SEEN; - ALLOC_GROW(store->seen, - store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = - cf->do_ftell(cf); - } + } + + if (matches(key, value, store)) { + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); + store->seen_nr++; + store->key_seen = 1; + store->section_seen = 1; + store->is_keys_section = 1; + } else { + if (strrchr(key, '.') - key == store->baselen && + !strncmp(key, store->key, store->baselen)) { + store->section_seen = 1; + store->is_keys_section = 1; + ALLOC_GROW(store->seen, + store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = + cf->do_ftell(cf); } } + return 0; } @@ -2646,7 +2646,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, ALLOC_GROW(store.seen, 1, store.seen_alloc); store.seen[0] = 0; - store.state = START; store.seen_nr = 0; /* @@ -2714,7 +2713,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, new_line = 0; if (store.seen[i] == 0) { store.seen[i] = copy_end = contents_sz; - } else if (store.state != KEY_SEEN) { + } else if (!store.key_seen) { copy_end = store.seen[i]; } else copy_end = find_beginning_of_line( @@ -2738,7 +2737,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, /* write the pair (value == NULL means unset) */ if (value != NULL) { - if (store.state == 
START) { + if (!store.section_seen) { if (write_section(fd, key, &store) < 0) goto write_err_out; } -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
* Re: [PATCH v2 11/15] git_config_set: do not use a state machine 2018-04-03 16:28 ` [PATCH v2 11/15] git_config_set: do not use a state machine Johannes Schindelin @ 2018-04-06 21:28 ` Jeff King 2018-04-09 7:50 ` Johannes Schindelin 0 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-04-06 21:28 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Tue, Apr 03, 2018 at 06:28:42PM +0200, Johannes Schindelin wrote: > While a neat theoretical construct, state machines are hard to read. In > this instance, it does not even make a whole lot of sense because we are > more interested in flags, anyway: has the section been seen? Has the key > been seen? Does the current section match the key we are looking for? > > Besides, the state `SECTION_SEEN` was named in a misleading way: it did > not indicate that we saw the section matching the key we are looking > for, but it instead indicated that we are *currently* in that section. > > Let's just replace the state machine logic by clear and obvious flags. > > This will also make it easier to review the upcoming patches to use the > newly-introduced `event_fn` callback of the config parser. I think this is probably a good direction. But one thing state machines can help with is keeping the state to a manageable size. With 3 bits of flags, we now have 8 possible states, up from the previous 4. Clearly some of those are nonsensical (can you be in key_seen without section_seen? I'd think not), but it's up to the code to interpret and reset those manually. I'll defer to your judgement, though, on this making things for the future patches more readable. You spend a lot more time poking at it than I have. -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v2 11/15] git_config_set: do not use a state machine 2018-04-06 21:28 ` Jeff King @ 2018-04-09 7:50 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 7:50 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Fri, 6 Apr 2018, Jeff King wrote: > On Tue, Apr 03, 2018 at 06:28:42PM +0200, Johannes Schindelin wrote: > > > While a neat theoretical construct, state machines are hard to read. In > > this instance, it does not even make a whole lot of sense because we are > > more interested in flags, anyway: has the section been seen? Has the key > > been seen? Does the current section match the key we are looking for? > > > > Besides, the state `SECTION_SEEN` was named in a misleading way: it did > > not indicate that we saw the section matching the key we are looking > > for, but it instead indicated that we are *currently* in that section. > > > > Let's just replace the state machine logic by clear and obvious flags. > > > > This will also make it easier to review the upcoming patches to use the > > newly-introduced `event_fn` callback of the config parser. > > I think this is probably a good direction. But one thing state machines > can help with is keeping the state to a manageable size. With 3 bits of > flags, we now have 8 possible states, up from the previous 4. > > Clearly some of those are nonsensical (can you be in key_seen without > section_seen? I'd think not), but it's up to the code to interpret and > reset those manually. That is true. On the other hand, it is easy to miss incorrect state transitions in state machines (or to miss unused states). > I'll defer to your judgement, though, on this making things for the > future patches more readable. You spend a lot more time poking at it > than I have. 
The original reason to get rid of the state machine was: I did not need the states any more in the end. Since the section name is set via the event stream we now know in the config_fn whether we are in the correct section or not. I also liked the fact that it was much easier to reason about correct code: "Did I catch all the states that apply here?" is a hairier question than "Is this flag true?" Thanks, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
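To make the trade-off discussed above concrete, here is a minimal sketch, not code from the series. It assumes the one invariant Peff names (a key can only have been seen inside its section, so `key_seen` implies `section_seen`) and counts how many of the eight flag combinations remain meaningful. The names `store_flags`, `store_flags_are_sane` and `count_sane_states` are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three bit flags in struct config_set_store. */
struct store_flags {
	unsigned int key_seen:1, section_seen:1, is_keys_section:1;
};

/*
 * Assumed invariant: a matching key can only be seen inside its section,
 * so key_seen without section_seen is a nonsensical combination.
 */
bool store_flags_are_sane(const struct store_flags *f)
{
	return !(f->key_seen && !f->section_seen);
}

/* Enumerate all 2^3 flag combinations and count the sane ones. */
int count_sane_states(void)
{
	int bits, count = 0;

	for (bits = 0; bits < 8; bits++) {
		struct store_flags f;

		f.key_seen = bits & 1;
		f.section_seen = (bits >> 1) & 1;
		f.is_keys_section = (bits >> 2) & 1;
		if (store_flags_are_sane(&f))
			count++;
	}
	return count;
}
```

Under that single assumption, six of the eight combinations survive, which illustrates both points of view: flags answer one question at a time, but nothing in the type itself rules out the remaining two combinations.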
* [PATCH v2 12/15] git_config_set: make use of the config parser's event stream 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (10 preceding siblings ...) 2018-04-03 16:28 ` [PATCH v2 11/15] git_config_set: do not use a state machine Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:28 ` [PATCH v2 13/15] git config --unset: remove empty sections (in the common case) Johannes Schindelin ` (4 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley In the recent commit with the title "config: introduce an optional event stream while parsing", we introduced an optional callback to keep track of the config parser's events "comment", "white-space", "section header" and "entry". One motivation for this feature was to make use of it in the code that edits the config. And this commit makes it so. Note: this patch changes the meaning of the `seen` array that records whether we saw the config entry that is to be edited: previously, it contained the end offset of the found entry. Now, we introduce a new array `parsed` that keeps a record of *all* config parser events (with begin/end offsets), and the items in the `seen` array now point into the `parsed` array. There are two reasons why we do it this way: 1. To keep the implementation simple, the config parser's event stream reports the event only after the config callback was called, so we would not receive the begin offset otherwise. 2. In the following patches, we will re-use the `parsed` array to fix two long-standing bugs related to empty sections. Note that this also makes the code more robust with respect to finding the begin offset of the part(s) of the config file to be edited, as we no longer back-track to find the beginning of the line. 
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 170 ++++++++++++++++++++++++++++++--------------------------------- 1 file changed, 81 insertions(+), 89 deletions(-) diff --git a/config.c b/config.c index 84e8f7ffeb8..345b1d2f140 100644 --- a/config.c +++ b/config.c @@ -2303,8 +2303,11 @@ struct config_set_store { int do_not_match; regex_t *value_regex; int multi_replace; - size_t *seen; - unsigned int seen_nr, seen_alloc; + struct { + size_t begin, end; + enum config_event_t type; + } *parsed; + unsigned int parsed_nr, parsed_alloc, *seen, seen_nr, seen_alloc; unsigned int key_seen:1, section_seen:1, is_keys_section:1; }; @@ -2322,10 +2325,31 @@ static int matches(const char *key, const char *value, (value && !regexec(store->value_regex, value, 0, NULL, 0)); } +static int store_aux_event(enum config_event_t type, + size_t begin, size_t end, void *data) +{ + struct config_set_store *store = data; + + ALLOC_GROW(store->parsed, store->parsed_nr + 1, store->parsed_alloc); + store->parsed[store->parsed_nr].begin = begin; + store->parsed[store->parsed_nr].end = end; + store->parsed[store->parsed_nr].type = type; + store->parsed_nr++; + + if (type == CONFIG_EVENT_SECTION) { + if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.') + BUG("Invalid section name '%s'", cf->var.buf); + + /* Is this the section we were looking for? 
*/ + store->is_keys_section = cf->var.len - 1 == store->baselen && + !strncasecmp(cf->var.buf, store->key, store->baselen); + } + + return 0; +} + static int store_aux(const char *key, const char *value, void *cb) { - const char *ep; - size_t section_len; struct config_set_store *store = cb; if (store->key_seen) { @@ -2337,55 +2361,21 @@ static int store_aux(const char *key, const char *value, void *cb) ALLOC_GROW(store->seen, store->seen_nr + 1, store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); + store->seen[store->seen_nr] = store->parsed_nr; store->seen_nr++; } - return 0; } else if (store->is_keys_section) { /* - * What we are looking for is in store->key (both - * section and var), and its section part is baselen - * long. We found key (again, both section and var). - * We would want to know if this key is in the same - * section as what we are looking for. We already - * know we are in the same section as what should - * hold store->key. + * Do not increment matches yet: this may not be a match, but we + * are in the desired section. */ - ep = strrchr(key, '.'); - section_len = ep - key; - - if ((section_len != store->baselen) || - memcmp(key, store->key, section_len+1)) { - store->is_keys_section = 0; - return 0; - } - /* - * Do not increment matches: this is no match, but we - * just made sure we are in the desired section. 
- */ - ALLOC_GROW(store->seen, store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); - } - - if (matches(key, value, store)) { - ALLOC_GROW(store->seen, store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); - store->seen_nr++; - store->key_seen = 1; + ALLOC_GROW(store->seen, store->seen_nr + 1, store->seen_alloc); + store->seen[store->seen_nr] = store->parsed_nr; store->section_seen = 1; - store->is_keys_section = 1; - } else { - if (strrchr(key, '.') - key == store->baselen && - !strncmp(key, store->key, store->baselen)) { - store->section_seen = 1; - store->is_keys_section = 1; - ALLOC_GROW(store->seen, - store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = - cf->do_ftell(cf); + + if (matches(key, value, store)) { + store->seen_nr++; + store->key_seen = 1; } } @@ -2486,32 +2476,6 @@ static ssize_t write_pair(int fd, const char *key, const char *value, return ret; } -static ssize_t find_beginning_of_line(const char *contents, size_t size, - size_t offset_, int *found_bracket) -{ - size_t equal_offset = size, bracket_offset = size; - ssize_t offset; - -contline: - for (offset = offset_-2; offset > 0 - && contents[offset] != '\n'; offset--) - switch (contents[offset]) { - case '=': equal_offset = offset; break; - case ']': bracket_offset = offset; break; - } - if (offset > 0 && contents[offset-1] == '\\') { - offset_ = offset; - goto contline; - } - if (bracket_offset < equal_offset) { - *found_bracket = 1; - offset = bracket_offset+1; - } else - offset++; - - return offset; -} - int git_config_set_in_file_gently(const char *config_filename, const char *key, const char *value) { @@ -2622,6 +2586,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, struct stat st; size_t copy_begin, copy_end; int i, new_line = 0; + struct config_options opts; if (value_regex == NULL) store.value_regex = NULL; @@ -2644,17 +2609,24 @@ int 
git_config_set_multivar_in_file_gently(const char *config_filename, } } - ALLOC_GROW(store.seen, 1, store.seen_alloc); - store.seen[0] = 0; - store.seen_nr = 0; + ALLOC_GROW(store.parsed, 1, store.parsed_alloc); + store.parsed[0].end = 0; + + memset(&opts, 0, sizeof(opts)); + opts.event_fn = store_aux_event; + opts.event_fn_data = &store; /* - * After this, store.offset will contain the *end* offset - * of the last match, or remain at 0 if no match was found. + * After this, store.parsed will contain offsets of all the + * parsed elements, and store.seen will contain a list of + * matches, as indices into store.parsed. + * * As a side effect, we make sure to transform only a valid * existing config file. */ - if (git_config_from_file(store_aux, config_filename, &store)) { + if (git_config_from_file_with_options(store_aux, + config_filename, + &store, &opts)) { error("invalid config file %s", config_filename); free(store.key); if (store.value_regex != NULL && @@ -2706,19 +2678,39 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, goto out_free; } - if (store.seen_nr == 0) + if (store.seen_nr == 0) { + if (!store.seen_alloc) { + /* Did not see key nor section */ + ALLOC_GROW(store.seen, 1, store.seen_alloc); + store.seen[0] = store.parsed_nr + - !!store.parsed_nr; + } store.seen_nr = 1; + } for (i = 0, copy_begin = 0; i < store.seen_nr; i++) { + size_t replace_end; + int j = store.seen[i]; + new_line = 0; - if (store.seen[i] == 0) { - store.seen[i] = copy_end = contents_sz; - } else if (!store.key_seen) { - copy_end = store.seen[i]; - } else - copy_end = find_beginning_of_line( - contents, contents_sz, - store.seen[i], &new_line); + if (!store.key_seen) { + replace_end = copy_end = store.parsed[j].end; + } else { + replace_end = store.parsed[j].end; + copy_end = store.parsed[j].begin; + /* + * Swallow preceding white-space on the same + * line. 
+ */ + while (copy_end > 0 ) { + char c = contents[copy_end - 1]; + + if (isspace(c) && c != '\n') + copy_end--; + else + break; + } + } if (copy_end > 0 && contents[copy_end-1] != '\n') new_line = 1; @@ -2732,7 +2724,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, write_str_in_full(fd, "\n") < 0) goto write_err_out; } - copy_begin = store.seen[i]; + copy_begin = replace_end; } /* write the pair (value == NULL means unset) */ -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
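The begin/end bookkeeping described in this patch can be sketched in isolation. The following is a simplified illustration, not the code from config.c: `parsed_event`, `swallow_preceding_space` and `cut_event` are hypothetical names, and the sketch assumes that an entry's end offset lies just past its trailing newline. It shows how an entry is cut out by copying everything up to its (white-space-adjusted) begin offset and resuming at its end offset, with no back-tracking to find the start of the line.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified mirror of one element of the `parsed` array. */
struct parsed_event {
	size_t begin, end;	/* byte offsets into the file contents */
};

/*
 * Move `offset` backwards over white-space on the same line, stopping
 * at a newline or at the first non-space character.
 */
size_t swallow_preceding_space(const char *contents, size_t offset)
{
	while (offset > 0) {
		unsigned char c = contents[offset - 1];

		if (isspace(c) && c != '\n')
			offset--;
		else
			break;
	}
	return offset;
}

/*
 * Copy `contents` into `out`, leaving out the event `ev` together with
 * its leading indentation. `out` must have room for `len + 1` bytes.
 * Returns the number of bytes written, not counting the NUL terminator.
 */
size_t cut_event(const char *contents, size_t len,
		 const struct parsed_event *ev, char *out)
{
	size_t copy_end = swallow_preceding_space(contents, ev->begin);
	size_t n = copy_end;

	memcpy(out, contents, copy_end);
	memcpy(out + n, contents + ev->end, len - ev->end);
	n += len - ev->end;
	out[n] = '\0';
	return n;
}
```

For example, with the contents `"[a]\n\tx = 1\n\ty = 2\n"` and an event covering offsets 5 to 11 (the `x = 1` entry), the tab before `x` is swallowed as well and only the header and the `y` entry remain.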
* [PATCH v2 13/15] git config --unset: remove empty sections (in the common case) 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (11 preceding siblings ...) 2018-04-03 16:28 ` [PATCH v2 12/15] git_config_set: make use of the config parser's event stream Johannes Schindelin @ 2018-04-03 16:28 ` Johannes Schindelin 2018-04-03 16:29 ` [PATCH v2 14/15] git_config_set: reuse empty sections Johannes Schindelin ` (3 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:28 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley The original reasoning for not removing section headers upon removal of the last entry went like this: the user could have added comments about the section, or about the entries therein, and if there were other comments there, we would not know whether we should remove them. In particular, a concocted example was presented that looked like this (and was added to t1300): # some generic comment on the configuration file itself # a comment specific to this "section" section. [section] # some intervening lines # that should also be dropped key = value # please be careful when you update the above variable The ideal thing for `git config --unset section.key` in this case would be to leave only the first line behind, because all the other comments are now obsolete. However, this is unfeasible, short of adding a complete Natural Language Processing module to Git, which seems not only a lot of work, but a totally unreasonable feature (for little benefit to most users). Now, the real kicker about this problem is: most users do not edit their config files at all! In their use case, the config looks like this instead: [section] key = value ... and it is totally obvious what should happen if the entry is removed: the entire section should vanish. 
Let's generalize this observation to this conservative strategy: if we are removing the last entry from a section, and there are no comments inside that section nor surrounding it, then remove the entire section. Otherwise behave as before: leave the now-empty section (including those comments, even ones about the now-deleted entry). We have to be extra careful to handle the case where more than one entry is removed: any subset of them might be the last entries of their respective sections (and if there are no comments in or around that section, the section should be removed, too). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- t/t1300-config.sh | 4 +-- 2 files changed, 93 insertions(+), 4 deletions(-) diff --git a/config.c b/config.c index 345b1d2f140..271e9605ec1 100644 --- a/config.c +++ b/config.c @@ -2306,6 +2306,7 @@ struct config_set_store { struct { size_t begin, end; enum config_event_t type; + int is_keys_section; } *parsed; unsigned int parsed_nr, parsed_alloc, *seen, seen_nr, seen_alloc; unsigned int key_seen:1, section_seen:1, is_keys_section:1; @@ -2334,17 +2335,20 @@ static int store_aux_event(enum config_event_t type, store->parsed[store->parsed_nr].begin = begin; store->parsed[store->parsed_nr].end = end; store->parsed[store->parsed_nr].type = type; - store->parsed_nr++; if (type == CONFIG_EVENT_SECTION) { if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.') BUG("Invalid section name '%s'", cf->var.buf); /* Is this the section we were looking for? 
*/ - store->is_keys_section = cf->var.len - 1 == store->baselen && + store->is_keys_section = + store->parsed[store->parsed_nr].is_keys_section = + cf->var.len - 1 == store->baselen && !strncasecmp(cf->var.buf, store->key, store->baselen); } + store->parsed_nr++; + return 0; } @@ -2476,6 +2480,87 @@ static ssize_t write_pair(int fd, const char *key, const char *value, return ret; } +/* + * If we are about to unset the last key(s) in a section, and if there are + * no comments surrounding (or included in) the section, we will want to + * extend begin/end to remove the entire section. + * + * Note: the parameter `seen_ptr` points to the index into the store.seen + * array. * This index may be incremented if a section has more than one + * entry (which all are to be removed). + */ +static void maybe_remove_section(struct config_set_store *store, + const char *contents, + size_t *begin_offset, size_t *end_offset, + int *seen_ptr) +{ + size_t begin; + int i, seen, section_seen = 0; + + /* + * First, ensure that this is the first key, and that there are no + * comments before the entry nor before the section header. + */ + seen = *seen_ptr; + for (i = store->seen[seen]; i > 0; i--) { + enum config_event_t type = store->parsed[i - 1].type; + + if (type == CONFIG_EVENT_COMMENT) + /* There is a comment before this entry or section */ + return; + if (type == CONFIG_EVENT_ENTRY) { + if (!section_seen) + /* This is not the section's first entry. */ + return; + /* We encountered no comment before the section. */ + break; + } + if (type == CONFIG_EVENT_SECTION) { + if (!store->parsed[i - 1].is_keys_section) + break; + section_seen = 1; + } + } + begin = store->parsed[i].begin; + + /* + * Next, make sure that we are removing he last key(s) in the section, + * and that there are no comments that are possibly about the current + * section. 
+ */ + for (i = store->seen[seen] + 1; i < store->parsed_nr; i++) { + enum config_event_t type = store->parsed[i].type; + + if (type == CONFIG_EVENT_COMMENT) + return; + if (type == CONFIG_EVENT_SECTION) { + if (store->parsed[i].is_keys_section) + continue; + break; + } + if (type == CONFIG_EVENT_ENTRY) { + if (++seen < store->seen_nr && + i == store->seen[seen]) + /* We want to remove this entry, too */ + continue; + /* There is another entry in this section. */ + return; + } + } + + /* + * We are really removing the last entry/entries from this section, and + * there are no enclosed or surrounding comments. Remove the entire, + * now-empty section. + */ + *seen_ptr = seen; + *begin_offset = begin; + if (i < store->parsed_nr) + *end_offset = store->parsed[i].begin; + else + *end_offset = store->parsed[store->parsed_nr - 1].end; +} + int git_config_set_in_file_gently(const char *config_filename, const char *key, const char *value) { @@ -2698,6 +2783,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } else { replace_end = store.parsed[j].end; copy_end = store.parsed[j].begin; + if (!value) + maybe_remove_section(&store, contents, + &copy_end, + &replace_end, &i); /* * Swallow preceding white-space on the same * line. diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 10b9bf4b088..6d34513eedd 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1413,7 +1413,7 @@ test_expect_success 'urlmatch with wildcard' ' ' # good section hygiene -test_expect_failure '--unset last key removes section (except if commented)' ' +test_expect_success '--unset last key removes section (except if commented)' ' cat >.git/config <<-\EOF && # some generic comment on the configuration file itself # a comment specific to this "section" section. 
@@ -1452,7 +1452,7 @@ test_expect_failure '--unset last key removes section (except if commented)' ' test_cmp expect .git/config ' -test_expect_failure '--unset-all removes section if empty & uncommented' ' +test_expect_success '--unset-all removes section if empty & uncommented' ' cat >.git/config <<-\EOF && [section] key = value1 -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
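The conservative strategy from the commit message above can be sketched on a bare list of event types. This is a deliberately simplified illustration and not the series' `maybe_remove_section()`: it handles a single entry to be removed, treats the nearest preceding header as the entry's section, and ignores repeated headers of the same section. The enum `ev_type` and the function `section_becomes_removable` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical, simplified event types (the real enum is config_event_t). */
enum ev_type { EV_COMMENT, EV_WHITESPACE, EV_SECTION, EV_ENTRY };

/*
 * Return 1 if removing the entry at index `idx` leaves its section empty
 * and no comment sits in or around that section, 0 otherwise.
 */
int section_becomes_removable(const enum ev_type *ev, size_t nr, size_t idx)
{
	size_t i;
	int section_seen = 0;

	/* Scan backwards: the entry must be the first one in its section. */
	for (i = idx; i > 0; i--) {
		enum ev_type t = ev[i - 1];

		if (t == EV_COMMENT)
			return 0;	/* comment before entry or header */
		if (t == EV_ENTRY) {
			if (!section_seen)
				return 0;	/* not the section's first entry */
			break;		/* reached the previous section's entries */
		}
		if (t == EV_SECTION) {
			if (section_seen)
				break;	/* reached an earlier section header */
			section_seen = 1;
		}
	}
	if (!section_seen)
		return 0;	/* no header found; be conservative */

	/* Scan forwards: the entry must also be the last one. */
	for (i = idx + 1; i < nr; i++) {
		enum ev_type t = ev[i];

		if (t == EV_COMMENT || t == EV_ENTRY)
			return 0;
		if (t == EV_SECTION)
			break;
	}
	return 1;
}
```

With the events `{ EV_SECTION, EV_ENTRY }` the section is removable; as soon as a comment appears before the header, after the entry, or a second entry exists, the function falls back to the old behavior of keeping the header.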
* [PATCH v2 14/15] git_config_set: reuse empty sections 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (12 preceding siblings ...) 2018-04-03 16:28 ` [PATCH v2 13/15] git config --unset: remove empty sections (in the common case) Johannes Schindelin @ 2018-04-03 16:29 ` Johannes Schindelin 2018-04-03 16:30 ` [PATCH v2 00/15] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin ` (2 subsequent siblings) 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:29 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley It can happen quite easily that the last setting in a config section is removed, and to avoid confusion when there are comments in the config about that section, we keep a lone section header, i.e. an empty section. Now that we use the `event_fn` callback, it is easy to add support for re-using empty sections, so let's do that. Note: t5512-ls-remote requires that this change is applied *after* the patch "git config --unset: remove empty sections (in the common case)": without that patch, there would be empty `transfer` and `uploadpack` sections ready for reuse, but in the *wrong* order (and consequently, t5512's "overrides work between mixed transfer/upload-pack hideRefs" would fail). 
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 14 +++++++++++++- t/t1300-config.sh | 2 +- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/config.c b/config.c index 271e9605ec1..ee7ea24123d 100644 --- a/config.c +++ b/config.c @@ -2345,6 +2345,12 @@ static int store_aux_event(enum config_event_t type, store->parsed[store->parsed_nr].is_keys_section = cf->var.len - 1 == store->baselen && !strncasecmp(cf->var.buf, store->key, store->baselen); + if (store->is_keys_section) { + store->section_seen = 1; + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = store->parsed_nr; + } } store->parsed_nr++; @@ -2779,7 +2785,13 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, new_line = 0; if (!store.key_seen) { - replace_end = copy_end = store.parsed[j].end; + copy_end = store.parsed[j].end; + /* include '\n' when copying section header */ + if (copy_end > 0 && copy_end < contents_sz && + contents[copy_end - 1] != '\n' && + contents[copy_end] == '\n') + copy_end++; + replace_end = copy_end; } else { replace_end = store.parsed[j].end; copy_end = store.parsed[j].begin; diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 6d34513eedd..6d0e13020d1 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1463,7 +1463,7 @@ test_expect_success '--unset-all removes section if empty & uncommented' ' test_line_count = 0 .git/config ' -test_expect_failure 'adding a key into an empty section reuses header' ' +test_expect_success 'adding a key into an empty section reuses header' ' cat >.git/config <<-\EOF && [section] EOF -- 2.16.2.windows.1.26.g2cc3565eb4b ^ permalink raw reply related [flat|nested] 103+ messages in thread
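The newline adjustment in the hunk above can be looked at on its own. The sketch below is an illustration under one assumption read off the patch's condition: the recorded end offset of a section header may stop just before the line's newline. The function name `header_copy_end` is hypothetical.

```c
#include <stddef.h>

/*
 * Given the end offset of a section header event, return the offset up to
 * which the file should be copied so that the header's own '\n' is kept
 * and a newly added entry starts on a fresh line.
 */
size_t header_copy_end(const char *contents, size_t contents_sz, size_t end)
{
	if (end > 0 && end < contents_sz &&
	    contents[end - 1] != '\n' && contents[end] == '\n')
		end++;
	return end;
}
```

For the contents `"[section]\n"` and an end offset of 9 (just after `]`), the result is 10, so the reused header keeps its line break. If the offset already points past the newline, it is returned unchanged.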
* Re: [PATCH v2 00/15] Assorted fixes for `git config` (including the "empty sections" bug) 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (13 preceding siblings ...) 2018-04-03 16:29 ` [PATCH v2 14/15] git_config_set: reuse empty sections Johannes Schindelin @ 2018-04-03 16:30 ` Johannes Schindelin 2018-04-06 21:33 ` Jeff King 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin 16 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-03 16:30 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi team, On Tue, 3 Apr 2018, Johannes Schindelin wrote: > Johannes Schindelin (15): > git_config_set: fix off-by-two > t1300: rename it to reflect that `repo-config` was deprecated > t1300: demonstrate that --replace-all can "invent" newlines > config --replace-all: avoid extra line breaks > t1300: avoid relying on a bug > t1300: remove unreasonable expectation from TODO > t1300: `--unset-all` can leave an empty section behind (bug) > config: introduce an optional event stream while parsing > config: avoid using the global variable `store` > config_set_store: rename some fields for consistency > git_config_set: do not use a state machine > git_config_set: make use of the config parser's event stream > git config --unset: remove empty sections (in the common case) > git_config_set: reuse empty sections > TODOs Please note that the `TODOs` commit is a left-over of my internal book-keeping, and its diff is actually empty. Hence `format-patch` does not even generate a mail for it, so there is no [PATCH v2 15/15]. Thanks, Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v2 00/15] Assorted fixes for `git config` (including the "empty sections" bug) 2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin ` (14 preceding siblings ...) 2018-04-03 16:30 ` [PATCH v2 00/15] Assorted fixes for `git config` (including the "empty sections" bug) Johannes Schindelin @ 2018-04-06 21:33 ` Jeff King 2018-04-09 8:19 ` Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin 16 siblings, 1 reply; 103+ messages in thread From: Jeff King @ 2018-04-06 21:33 UTC (permalink / raw) To: Johannes Schindelin Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley On Tue, Apr 03, 2018 at 06:27:55PM +0200, Johannes Schindelin wrote: > I am very, very grateful for the time Peff spent on reviewing the previous > iteration, and hope that he realizes just how much the elegance of the > event-stream-based version is due to his excellent review. Unfortunately I ran out of time this week to give this version an equally careful review, and I'm about to go on vacation for a few weeks. I did give a cursory look over it, and the new maybe_remove_section() is much more pleasant. So aside from a few minor nits I pointed out, this generally looks good. One thing I'd like to have seen is a few more tests covering exotic cases that I turned up in my earlier review. Some of the weird multiline cases I care less about, but we should probably cover at least: 1. Comment behavior when removing a section that isn't at the beginning of the file. 2. Removing the final key from a section with a subsection. Those should both be natural fallouts of the new method, but it would be good to have test coverage. Thanks for reworking this, and if it's still not merged when I get back, I promise to review it more carefully then. :) -Peff ^ permalink raw reply [flat|nested] 103+ messages in thread
* Re: [PATCH v2 00/15] Assorted fixes for `git config` (including the "empty sections" bug) 2018-04-06 21:33 ` Jeff King @ 2018-04-09 8:19 ` Johannes Schindelin 0 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:19 UTC (permalink / raw) To: Jeff King Cc: git, Junio C Hamano, Thomas Rast, Phil Haack, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Hi Peff, On Fri, 6 Apr 2018, Jeff King wrote: > On Tue, Apr 03, 2018 at 06:27:55PM +0200, Johannes Schindelin wrote: > > > I am very, very grateful for the time Peff spent on reviewing the > > previous iteration, and hope that he realizes just how much the > > elegance of the event-stream-based version is due to his excellent > > review. > > Unfortunately I ran out of time this week to give this version an > equally careful review, and I'm about to go on vacation for a few weeks. No worries, and thank you for your review. I know I am adding more stuff to review these days than I review other stuff, but I promise that I will try to get more reviews in once I am done with this patch series (and with the --rebase-merges one). > I did give a cursory look over it, and the new maybe_remove_section() is > much more pleasant. So aside from a few minor nits I pointed out, this > generally looks good. Thanks! > One thing I'd like to have seen is a few more tests covering exotic > cases that I turned up in my earlier review. Some of the weird multiline > cases I care less about, but we should probably cover at least: > > 1. Comment behavior when removing a section that isn't at the > beginning of the file. > > 2. Removing the final key from a section with a subsection. > > Those should both be natural fallouts of the new method, but it would be > good to have test coverage. I added this, in a new commit I call "t1300: add a few more hairy examples of sections becoming empty". 
> Thanks for reworking this, and if it's still not merged when I get back, > I promise to review it more carefully then. :) :-) Have a good vacation! Dscho ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v3 00/15] Assorted fixes for `git config` (including the "empty sections" bug)
  2018-04-03 16:27 ` [PATCH v2 00/15] " Johannes Schindelin
                     ` (15 preceding siblings ...)
  2018-04-06 21:33   ` Jeff King
@ 2018-04-09  8:31   ` Johannes Schindelin
  2018-04-09  8:31     ` [PATCH v3 01/15] git_config_set: fix off-by-two Johannes Schindelin
                       ` (14 more replies)
  16 siblings, 15 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

This patch series originally only tried to help fix that annoying bug that has been reported several times over the years, where `git config --unset` would leave empty sections behind, and `git config --add` would not reuse them.

The first patch is somewhat of a "while at it" bug fix that I first thought would be a lot more critical than it actually is: It really only affects config files that start with a section followed immediately (i.e. without a newline) by a one-letter boolean setting (i.e. without a `= <value>` part). So while it is a real bug fix, I doubt anybody ever got bitten by it.

The next swath of patches add and fix some tests, while also fixing the bug where --replace-all would sometimes insert extra line breaks.

Then, I introduce a couple of building blocks: a "config parser event stream", i.e. an optional callback that can be used to report events such as "comment", "white-space", etc., together with the corresponding extents in the config file.

Finally, the interesting part, where I do two things, essentially (with preparatory steps for each thing):

1. I add the ability for `git config --unset/--unset-all` to detect that it can remove a section that has just become empty (see below for some more discussion of what I consider "become empty"), and

2. I add the ability for `git config [--add] key value` to re-use empty sections.

To reiterate: why does this patch series not conflict with my very early statements that we cannot simply remove empty sections because we may end up with stale comments?

Well, the patch in question takes pains to determine *iff* there are any comments surrounding, or included in, the section. If any are found: previous behavior. Under the assumption that the user edited the file, we keep it as intact as possible (see below for some argument against this). If no comments are found, and let's face it, this is probably *the* common case, as few people edit their config files by hand these days (neither should they because it is too easy to end up with an unparseable one), the now-empty section *is* removed.

So what is the argument against this extra care to detect comments? Well, if you have something like this:

	[section]
		; Here we comment about the variable called snarf
		snarf = froop

and we run `git config --unset section.snarf`, we end up with this config:

	[section]
		; Here we comment about the variable called snarf

which obviously does not make sense. However, that has been established behavior for quite a few years, and I do not even try to think of a way this could be solved.

Changes since v2:

- removed the `inline` attribute from the `do_event()` function.

- renamed `struct config_set_store` to `struct config_store_data`, to make its role more obvious.

- a whole slew of concocted test cases were added to the test to verify that a section that becomes empty is removed, based on Peff's analysis at https://public-inbox.org/git/20180329213229.GG2939@sigill.intra.peff.net/ Johannes Schindelin (15): git_config_set: fix off-by-two t1300: rename it to reflect that `repo-config` was deprecated t1300: demonstrate that --replace-all can "invent" newlines config --replace-all: avoid extra line breaks t1300: avoid relying on a bug t1300: remove unreasonable expectation from TODO t1300: add a few more hairy examples of sections becoming empty t1300: `--unset-all` can leave an empty section behind (bug) config: introduce an optional event stream while parsing config: avoid using the global variable `store` config_set_store: rename some fields for consistency git_config_set: do not use a state machine git_config_set: make use of the config parser's event stream git config --unset: remove empty sections (in the common case) git_config_set: reuse empty sections config.c | 448 ++++++++++++++------ config.h | 25 ++ t/{t1300-repo-config.sh => t1300-config.sh} | 102 ++++- 3 files changed, 439 insertions(+), 136 deletions(-) rename t/{t1300-repo-config.sh => t1300-config.sh} (95%) base-commit: 468165c1d8a442994a825f3684528361727cd8c0 Published-As: https://github.com/dscho/git/releases/tag/empty-config-section-v3 Fetch-It-Via: git fetch https://github.com/dscho/git empty-config-section-v3 Interdiff vs v2: diff --git a/config.c b/config.c index ee7ea24123d..6155d0651bd 100644 --- a/config.c +++ b/config.c @@ -659,8 +659,7 @@ struct parse_event_data { const struct config_options *opts; }; -static inline int do_event(enum config_event_t type, - struct parse_event_data *data) +static int do_event(enum config_event_t type, struct parse_event_data *data) { size_t offset; @@ -2297,7 +2296,7 @@ void git_die_config(const char *key, const char *err, ...) * Find all the stuff for git_config_set() below. 
*/ -struct config_set_store { +struct config_store_data { int baselen; char *key; int do_not_match; @@ -2313,7 +2312,7 @@ struct config_set_store { }; static int matches(const char *key, const char *value, - const struct config_set_store *store) + const struct config_store_data *store) { if (strcmp(key, store->key)) return 0; /* not ours */ @@ -2329,7 +2328,7 @@ static int matches(const char *key, const char *value, static int store_aux_event(enum config_event_t type, size_t begin, size_t end, void *data) { - struct config_set_store *store = data; + struct config_store_data *store = data; ALLOC_GROW(store->parsed, store->parsed_nr + 1, store->parsed_alloc); store->parsed[store->parsed_nr].begin = begin; @@ -2360,7 +2359,7 @@ static int store_aux_event(enum config_event_t type, static int store_aux(const char *key, const char *value, void *cb) { - struct config_set_store *store = cb; + struct config_store_data *store = cb; if (store->key_seen) { if (matches(key, value, store)) { @@ -2401,7 +2400,7 @@ static int write_error(const char *filename) } static struct strbuf store_create_section(const char *key, - const struct config_set_store *store) + const struct config_store_data *store) { const char *dot; int i; @@ -2424,7 +2423,7 @@ static struct strbuf store_create_section(const char *key, } static ssize_t write_section(int fd, const char *key, - const struct config_set_store *store) + const struct config_store_data *store) { struct strbuf sb = store_create_section(key, store); ssize_t ret; @@ -2436,7 +2435,7 @@ static ssize_t write_section(int fd, const char *key, } static ssize_t write_pair(int fd, const char *key, const char *value, - const struct config_set_store *store) + const struct config_store_data *store) { int i; ssize_t ret; @@ -2495,7 +2494,7 @@ static ssize_t write_pair(int fd, const char *key, const char *value, * array. * This index may be incremented if a section has more than one * entry (which all are to be removed). 
*/ -static void maybe_remove_section(struct config_set_store *store, +static void maybe_remove_section(struct config_store_data *store, const char *contents, size_t *begin_offset, size_t *end_offset, int *seen_ptr) @@ -2625,7 +2624,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, char *filename_buf = NULL; char *contents = NULL; size_t contents_sz; - struct config_set_store store; + struct config_store_data store; memset(&store, 0, sizeof(store)); @@ -2969,7 +2968,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename FILE *config_file = NULL; struct stat st; struct strbuf copystr = STRBUF_INIT; - struct config_set_store store; + struct config_store_data store; memset(&store, 0, sizeof(store)); diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 6d0e13020d1..eef0bbe4f9f 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1449,7 +1449,50 @@ test_expect_success '--unset last key removes section (except if commented)' ' EOF git config --unset section.key && - test_cmp expect .git/config + test_cmp expect .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = "multiline \ + QQ# with comment" + [two] + key = true + EOF + git config --unset two.key && + ! 
grep two .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = "multiline \ + QQ# with comment" + [one] + key = true + EOF + git config --unset-all one.key && + test_line_count = 0 .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = true + Q# a comment not at the start + [two] + Qkey = true + EOF + git config --unset two.key && + grep two .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = not [two "subsection"] + [two "subsection"] + [two "subsection"] + Qkey = true + [TWO "subsection"] + [one] + EOF + git config --unset two.subsection.key && + test "not [two subsection]" = "$(git config one.key)" && + test_line_count = 3 .git/config ' test_expect_success '--unset-all removes section if empty & uncommented' ' -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v3 01/15] git_config_set: fix off-by-two
  2018-04-09  8:31   ` [PATCH v3 " Johannes Schindelin
@ 2018-04-09  8:31     ` Johannes Schindelin
  2018-04-09  8:31     ` [PATCH v3 02/15] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin
                       ` (13 subsequent siblings)
  14 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

Currently, we are slightly overzealous when removing an entry from a config file of this form:

	[abc]a
	[xyz]
		key = value

Calling `git config --unset abc.a` on this file leaves this (invalid) config behind:

	[
	[xyz]
		key = value

The reason is that we try to search for the beginning of the line (or for the end of the preceding section header on the same line) that defines abc.a, but as an optimization, we subtract 2 from the offset pointing just after the definition before we call find_beginning_of_line(). That function, however, *also* performs that optimization and promptly fails to find the section header correctly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.c b/config.c
index b0c20e6cb8a..5cc049aaef0 100644
--- a/config.c
+++ b/config.c
@@ -2632,7 +2632,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 				} else
 					copy_end = find_beginning_of_line(
 						contents, contents_sz,
-						store.offset[i]-2, &new_line);
+						store.offset[i], &new_line);
 
 				if (copy_end > 0 && contents[copy_end-1] != '\n')
 					new_line = 1;
-- 
2.17.0.windows.1.4.g7e4058d72e3

^ permalink raw reply related [flat|nested] 103+ messages in thread
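
The double subtraction described in patch 01/15 can be illustrated with a small sketch. This is not git's code: `sketch_beginning_of_line()` is a hypothetical, simplified stand-in for find_beginning_of_line() (no continuation-line handling), and the assumption that the offset "just after the definition" of abc.a is 7 (the byte after the newline of `[abc]a`) is mine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for find_beginning_of_line().
 * Starting two bytes before `end_offset` (the callee's own "optimization"),
 * it walks back to the start of the line and remembers any ']' or '='
 * it passes. It returns the offset up to which the file should be kept.
 */
static size_t sketch_beginning_of_line(const char *contents, size_t end_offset)
{
	size_t size = strlen(contents);
	size_t equal_offset = size, bracket_offset = size;
	size_t offset;

	if (end_offset < 2)
		return 0;

	for (offset = end_offset - 2;
	     offset > 0 && contents[offset] != '\n';
	     offset--) {
		if (contents[offset] == '=')
			equal_offset = offset;
		else if (contents[offset] == ']')
			bracket_offset = offset;
	}

	/* A ']' before any '=' means a section header shares this line. */
	if (bracket_offset < equal_offset)
		return bracket_offset + 1;
	return offset + 1;
}
```

With the sample file from the commit message, passing the real end offset (7) makes the scan see the `]` and keep `[abc]` (result 5). Passing 7 - 2, as the caller did before the fix, starts the scan to the left of the `]`, so the header is missed and only `[` survives (result 1).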
* [PATCH v3 02/15] t1300: rename it to reflect that `repo-config` was deprecated 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 01/15] git_config_set: fix off-by-two Johannes Schindelin @ 2018-04-09 8:31 ` Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 03/15] t1300: demonstrate that --replace-all can "invent" newlines Johannes Schindelin ` (12 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/{t1300-repo-config.sh => t1300-config.sh} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename t/{t1300-repo-config.sh => t1300-config.sh} (100%) diff --git a/t/t1300-repo-config.sh b/t/t1300-config.sh similarity index 100% rename from t/t1300-repo-config.sh rename to t/t1300-config.sh -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply [flat|nested] 103+ messages in thread
* [PATCH v3 03/15] t1300: demonstrate that --replace-all can "invent" newlines 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 01/15] git_config_set: fix off-by-two Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 02/15] t1300: rename it to reflect that `repo-config` was deprecated Johannes Schindelin @ 2018-04-09 8:31 ` Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 04/15] config --replace-all: avoid extra line breaks Johannes Schindelin ` (11 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 4f8e6f5fde3..cc417687e8d 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1611,4 +1611,25 @@ test_expect_success '--local requires a repo' ' test_expect_code 128 nongit git config --local foo.bar ' +test_expect_failure '--replace-all does not invent newlines' ' + q_to_tab >.git/config <<-\EOF && + [abc]key + QkeepSection + [xyz] + Qkey = 1 + [abc] + Qkey = a + EOF + q_to_tab >expect <<-\EOF && + [abc] + QkeepSection + [xyz] + Qkey = 1 + [abc] + Qkey = b + EOF + git config --replace-all abc.key b && + test_cmp .git/config expect +' + test_done -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 04/15] config --replace-all: avoid extra line breaks 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin ` (2 preceding siblings ...) 2018-04-09 8:31 ` [PATCH v3 03/15] t1300: demonstrate that --replace-all can "invent" newlines Johannes Schindelin @ 2018-04-09 8:31 ` Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 05/15] t1300: avoid relying on a bug Johannes Schindelin ` (10 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley When replacing multiple config entries at once, we did not re-set the flag that indicates whether we need to insert a new-line before the new entry. As a consequence, an extra new-line was inserted under certain circumstances. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 1 + t/t1300-config.sh | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/config.c b/config.c index 5cc049aaef0..f10f8c6f52f 100644 --- a/config.c +++ b/config.c @@ -2625,6 +2625,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, store.seen = 1; for (i = 0, copy_begin = 0; i < store.seen; i++) { + new_line = 0; if (store.offset[i] == 0) { store.offset[i] = copy_end = contents_sz; } else if (store.state != KEY_SEEN) { diff --git a/t/t1300-config.sh b/t/t1300-config.sh index cc417687e8d..aed12be492f 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1611,7 +1611,7 @@ test_expect_success '--local requires a repo' ' test_expect_code 128 nongit git config --local foo.bar ' -test_expect_failure '--replace-all does not invent newlines' ' +test_expect_success '--replace-all does not invent newlines' ' q_to_tab >.git/config <<-\EOF && [abc]key QkeepSection -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
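
The stale-flag problem fixed by patch 04/15 can be modeled in a few lines. This is a toy model under my own assumptions, not the loop in git_config_set_multivar_in_file_gently(): `count_extra_newlines()` and its parameters are hypothetical names, and each array element simply says whether the text copied in that round ends in the middle of a line.

```c
#include <stddef.h>

/*
 * Toy model: for each replaced entry, a newline is inserted when the
 * `new_line` flag is set. If the flag is not reset at the top of each
 * round, a value set for an earlier entry leaks into later rounds.
 */
static int count_extra_newlines(const int *ends_mid_line, size_t nr,
				int reset_flag)
{
	int new_line = 0, inserted = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (reset_flag)
			new_line = 0; /* corresponds to the added `new_line = 0;` */
		if (ends_mid_line[i])
			new_line = 1;
		if (new_line)
			inserted++;
	}
	return inserted;
}
```

For two entries where only the first one ends mid-line, the model inserts two newlines without the reset and one with it, which mirrors the "invented" newline that the test in patch 03/15 demonstrates.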
* [PATCH v3 05/15] t1300: avoid relying on a bug 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin ` (3 preceding siblings ...) 2018-04-09 8:31 ` [PATCH v3 04/15] config --replace-all: avoid extra line breaks Johannes Schindelin @ 2018-04-09 8:31 ` Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 06/15] t1300: remove unreasonable expectation from TODO Johannes Schindelin ` (9 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley The test case 'unset with cont. lines' relied on a bug that is about to be fixed: it tests *explicitly* that removing the last entry from a config section leaves an *empty* section behind. Let's fix this test case not to rely on that behavior, simply by preventing the section from becoming empty. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index aed12be492f..7c0ee208dea 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -108,6 +108,7 @@ bar = foo [beta] baz = multiple \ lines +foo = bar EOF test_expect_success 'unset with cont. lines' ' @@ -118,6 +119,7 @@ cat > expect <<\EOF [alpha] bar = foo [beta] +foo = bar EOF test_expect_success 'unset with cont. lines is correct' 'test_cmp expect .git/config' -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 06/15] t1300: remove unreasonable expectation from TODO 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin ` (4 preceding siblings ...) 2018-04-09 8:31 ` [PATCH v3 05/15] t1300: avoid relying on a bug Johannes Schindelin @ 2018-04-09 8:31 ` Johannes Schindelin 2018-04-09 8:31 ` [PATCH v3 07/15] t1300: add a few more hairy examples of sections becoming empty Johannes Schindelin ` (8 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley In https://public-inbox.org/git/7vvc8alzat.fsf@alter.siamese.dyndns.org/ a reasonable patch was made quite a bit less so by changing a test case demonstrating a bug to a test case that demonstrates that we ask for too much: the test case 'unsetting the last key in a section removes header' now expects a future bug fix to be able to determine whether a free-form comment above a section header refers to said section or not. Rather than shooting for the stars (and not even getting off the ground), let's start shooting for something obtainable and be reasonably confident that we *can* get it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 7c0ee208dea..187fc5b195f 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1413,7 +1413,7 @@ test_expect_success 'urlmatch with wildcard' ' ' # good section hygiene -test_expect_failure 'unsetting the last key in a section removes header' ' +test_expect_failure '--unset last key removes section (except if commented)' ' cat >.git/config <<-\EOF && # some generic comment on the configuration file itself # a comment specific to this "section" section. 
@@ -1427,6 +1427,25 @@ test_expect_failure 'unsetting the last key in a section removes header' ' cat >expect <<-\EOF && # some generic comment on the configuration file itself + # a comment specific to this "section" section. + [section] + # some intervening lines + # that should also be dropped + + # please be careful when you update the above variable + EOF + + git config --unset section.key && + test_cmp expect .git/config && + + cat >.git/config <<-\EOF && + [section] + key = value + [next-section] + EOF + + cat >expect <<-\EOF && + [next-section] EOF git config --unset section.key && -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 07/15] t1300: add a few more hairy examples of sections becoming empty 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin ` (5 preceding siblings ...) 2018-04-09 8:31 ` [PATCH v3 06/15] t1300: remove unreasonable expectation from TODO Johannes Schindelin @ 2018-04-09 8:31 ` Johannes Schindelin 2018-04-09 8:32 ` [PATCH v3 08/15] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin ` (7 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:31 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley During the review of the first iteration of the patch series to remove sections that become empty upon --unset or --unset-all, Jeff King identified a couple of problematic cases with the backtracking approach that was still used then to "look backwards for the section header": https://public-inbox.org/git/20180329213229.GG2939@sigill.intra.peff.net/ This patch adds a couple of concocted examples designed to fool a backtracking parser. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 45 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index 187fc5b195f..bc30cfb3468 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1449,7 +1449,50 @@ test_expect_failure '--unset last key removes section (except if commented)' ' EOF git config --unset section.key && - test_cmp expect .git/config + test_cmp expect .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = "multiline \ + QQ# with comment" + [two] + key = true + EOF + git config --unset two.key && + ! 
grep two .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = "multiline \ + QQ# with comment" + [one] + key = true + EOF + git config --unset-all one.key && + test_line_count = 0 .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = true + Q# a comment not at the start + [two] + Qkey = true + EOF + git config --unset two.key && + grep two .git/config && + + q_to_tab >.git/config <<-\EOF && + [one] + Qkey = not [two "subsection"] + [two "subsection"] + [two "subsection"] + Qkey = true + [TWO "subsection"] + [one] + EOF + git config --unset two.subsection.key && + test "not [two subsection]" = "$(git config one.key)" && + test_line_count = 3 .git/config ' test_expect_failure 'adding a key into an empty section reuses header' ' -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 08/15] t1300: `--unset-all` can leave an empty section behind (bug) 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin ` (6 preceding siblings ...) 2018-04-09 8:31 ` [PATCH v3 07/15] t1300: add a few more hairy examples of sections becoming empty Johannes Schindelin @ 2018-04-09 8:32 ` Johannes Schindelin 2018-04-09 8:32 ` [PATCH v3 09/15] config: introduce an optional event stream while parsing Johannes Schindelin ` (6 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:32 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley We already have a test demonstrating that removing the last entry from a config section fails to remove the section header of the now-empty section. The same can happen, of course, if we remove the last entries in one fell swoop. This is *also* a bug, and should be fixed at the same time. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t1300-config.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t1300-config.sh b/t/t1300-config.sh index bc30cfb3468..9d23a8ca972 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -1495,6 +1495,17 @@ test_expect_failure '--unset last key removes section (except if commented)' ' test_line_count = 3 .git/config ' +test_expect_failure '--unset-all removes section if empty & uncommented' ' + cat >.git/config <<-\EOF && + [section] + key = value1 + key = value2 + EOF + + git config --unset-all section.key && + test_line_count = 0 .git/config +' + test_expect_failure 'adding a key into an empty section reuses header' ' cat >.git/config <<-\EOF && [section] -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 09/15] config: introduce an optional event stream while parsing 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin ` (7 preceding siblings ...) 2018-04-09 8:32 ` [PATCH v3 08/15] t1300: `--unset-all` can leave an empty section behind (bug) Johannes Schindelin @ 2018-04-09 8:32 ` Johannes Schindelin 2018-04-09 8:32 ` [PATCH v3 10/15] config: avoid using the global variable `store` Johannes Schindelin ` (5 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:32 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley This extends our config parser so that it can optionally produce an event stream via callback function, where it reports e.g. when a comment was parsed, or a section header, etc. This parser will be used subsequently to handle the scenarios better where removing config entries would make sections empty, or where a new entry could be added to an already-existing, empty section. 
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++------- config.h | 25 ++++++++++++++ 2 files changed, 114 insertions(+), 12 deletions(-) diff --git a/config.c b/config.c index f10f8c6f52f..03d8e7709fe 100644 --- a/config.c +++ b/config.c @@ -653,7 +653,45 @@ static int get_base_var(struct strbuf *name) } } -static int git_parse_source(config_fn_t fn, void *data) +struct parse_event_data { + enum config_event_t previous_type; + size_t previous_offset; + const struct config_options *opts; +}; + +static int do_event(enum config_event_t type, struct parse_event_data *data) +{ + size_t offset; + + if (!data->opts || !data->opts->event_fn) + return 0; + + if (type == CONFIG_EVENT_WHITESPACE && + data->previous_type == type) + return 0; + + offset = cf->do_ftell(cf); + /* + * At EOF, the parser always "inserts" an extra '\n', therefore + * the end offset of the event is the current file position, otherwise + * we will already have advanced to the next event. + */ + if (type != CONFIG_EVENT_EOF) + offset--; + + if (data->previous_type != CONFIG_EVENT_EOF && + data->opts->event_fn(data->previous_type, data->previous_offset, + offset, data->opts->event_fn_data) < 0) + return -1; + + data->previous_type = type; + data->previous_offset = offset; + + return 0; +} + +static int git_parse_source(config_fn_t fn, void *data, + const struct config_options *opts) { int comment = 0; int baselen = 0; @@ -664,8 +702,15 @@ static int git_parse_source(config_fn_t fn, void *data) /* U+FEFF Byte Order Mark in UTF8 */ const char *bomptr = utf8_bom; + /* For the parser event callback */ + struct parse_event_data event_data = { + CONFIG_EVENT_EOF, 0, opts + }; + for (;;) { - int c = get_next_char(); + int c; + + c = get_next_char(); if (bomptr && *bomptr) { /* We are at the file beginning; skip UTF8-encoded BOM * if present. 
Sane editors won't put this in on their @@ -682,18 +727,33 @@ static int git_parse_source(config_fn_t fn, void *data) } } if (c == '\n') { - if (cf->eof) + if (cf->eof) { + if (do_event(CONFIG_EVENT_EOF, &event_data) < 0) + return -1; return 0; + } + if (do_event(CONFIG_EVENT_WHITESPACE, &event_data) < 0) + return -1; comment = 0; continue; } - if (comment || isspace(c)) + if (comment) continue; + if (isspace(c)) { + if (do_event(CONFIG_EVENT_WHITESPACE, &event_data) < 0) + return -1; + continue; + } if (c == '#' || c == ';') { + if (do_event(CONFIG_EVENT_COMMENT, &event_data) < 0) + return -1; comment = 1; continue; } if (c == '[') { + if (do_event(CONFIG_EVENT_SECTION, &event_data) < 0) + return -1; + /* Reset prior to determining a new stem */ strbuf_reset(var); if (get_base_var(var) < 0 || var->len < 1) @@ -704,6 +764,10 @@ static int git_parse_source(config_fn_t fn, void *data) } if (!isalpha(c)) break; + + if (do_event(CONFIG_EVENT_ENTRY, &event_data) < 0) + return -1; + /* * Truncate the var name back to the section header * stem prior to grabbing the suffix part of the name @@ -715,6 +779,9 @@ static int git_parse_source(config_fn_t fn, void *data) break; } + if (do_event(CONFIG_EVENT_ERROR, &event_data) < 0) + return -1; + switch (cf->origin_type) { case CONFIG_ORIGIN_BLOB: error_msg = xstrfmt(_("bad config line %d in blob %s"), @@ -1398,7 +1465,8 @@ int git_default_config(const char *var, const char *value, void *dummy) * fgetc, ungetc, ftell of top need to be initialized before calling * this function. 
*/ -static int do_config_from(struct config_source *top, config_fn_t fn, void *data) +static int do_config_from(struct config_source *top, config_fn_t fn, void *data, + const struct config_options *opts) { int ret; @@ -1410,7 +1478,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data) strbuf_init(&top->var, 1024); cf = top; - ret = git_parse_source(fn, data); + ret = git_parse_source(fn, data, opts); /* pop config-file parsing state stack */ strbuf_release(&top->value); @@ -1423,7 +1491,7 @@ static int do_config_from(struct config_source *top, config_fn_t fn, void *data) static int do_config_from_file(config_fn_t fn, const enum config_origin_type origin_type, const char *name, const char *path, FILE *f, - void *data) + void *data, const struct config_options *opts) { struct config_source top; @@ -1436,15 +1504,18 @@ static int do_config_from_file(config_fn_t fn, top.do_ungetc = config_file_ungetc; top.do_ftell = config_file_ftell; - return do_config_from(&top, fn, data); + return do_config_from(&top, fn, data, opts); } static int git_config_from_stdin(config_fn_t fn, void *data) { - return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data); + return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, + data, NULL); } -int git_config_from_file(config_fn_t fn, const char *filename, void *data) +int git_config_from_file_with_options(config_fn_t fn, const char *filename, + void *data, + const struct config_options *opts) { int ret = -1; FILE *f; @@ -1452,13 +1523,19 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) f = fopen_or_warn(filename, "r"); if (f) { flockfile(f); - ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data); + ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, + filename, f, data, opts); funlockfile(f); fclose(f); } return ret; } +int git_config_from_file(config_fn_t fn, const char *filename, void *data) +{ + return 
git_config_from_file_with_options(fn, filename, data, NULL); +} + int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_type, const char *name, const char *buf, size_t len, void *data) { @@ -1475,7 +1552,7 @@ int git_config_from_mem(config_fn_t fn, const enum config_origin_type origin_typ top.do_ungetc = config_buf_ungetc; top.do_ftell = config_buf_ftell; - return do_config_from(&top, fn, data); + return do_config_from(&top, fn, data, NULL); } int git_config_from_blob_oid(config_fn_t fn, diff --git a/config.h b/config.h index ef70a9cac1e..5a2394daae2 100644 --- a/config.h +++ b/config.h @@ -28,15 +28,40 @@ enum config_origin_type { CONFIG_ORIGIN_CMDLINE }; +enum config_event_t { + CONFIG_EVENT_SECTION, + CONFIG_EVENT_ENTRY, + CONFIG_EVENT_WHITESPACE, + CONFIG_EVENT_COMMENT, + CONFIG_EVENT_EOF, + CONFIG_EVENT_ERROR +}; + +/* + * The parser event function (if not NULL) is called with the event type and + * the begin/end offsets of the parsed elements. + * + * Note: for CONFIG_EVENT_ENTRY (i.e. config variables), the trailing newline + * character is considered part of the element. 
+ */ +typedef int (*config_parser_event_fn_t)(enum config_event_t type, + size_t begin_offset, size_t end_offset, + void *event_fn_data); + struct config_options { unsigned int respect_includes : 1; const char *commondir; const char *git_dir; + config_parser_event_fn_t event_fn; + void *event_fn_data; }; typedef int (*config_fn_t)(const char *, const char *, void *); extern int git_default_config(const char *, const char *, void *); extern int git_config_from_file(config_fn_t fn, const char *, void *); +extern int git_config_from_file_with_options(config_fn_t fn, const char *, + void *, + const struct config_options *); extern int git_config_from_mem(config_fn_t fn, const enum config_origin_type, const char *name, const char *buf, size_t len, void *data); extern int git_config_from_blob_oid(config_fn_t fn, const char *name, -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
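For readers following along, here is a minimal sketch of how a consumer of the event callback might look. The enum and the typedef are copied from the config.h hunk above; `struct event_stats`, `count_event()` and `replay_sample_events()` are hypothetical names made up for this illustration, and the replayed offsets are hand-written for the buffer "[a]\n\tb = c\n" rather than produced by the real parser, which may split white-space differently.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the config.h hunk in the patch above. */
enum config_event_t {
	CONFIG_EVENT_SECTION,
	CONFIG_EVENT_ENTRY,
	CONFIG_EVENT_WHITESPACE,
	CONFIG_EVENT_COMMENT,
	CONFIG_EVENT_EOF,
	CONFIG_EVENT_ERROR
};

typedef int (*config_parser_event_fn_t)(enum config_event_t type,
					size_t begin_offset, size_t end_offset,
					void *event_fn_data);

/* Hypothetical consumer state; not part of the patch. */
struct event_stats {
	size_t count[CONFIG_EVENT_ERROR + 1];
	size_t last_end;
};

/* Event callback: tally events per type and remember the last end offset. */
static int count_event(enum config_event_t type,
		       size_t begin, size_t end, void *data)
{
	struct event_stats *stats = data;

	assert(begin <= end);
	stats->count[type]++;
	stats->last_end = end;
	return 0;
}

/*
 * Hypothetical stand-in for the parser: replays hand-written events for
 * the 11-byte buffer "[a]\n\tb = c\n". The entry event includes the
 * trailing newline, as the comment in config.h describes.
 */
static int replay_sample_events(config_parser_event_fn_t fn, void *data)
{
	static const struct {
		enum config_event_t type;
		size_t begin, end;
	} events[] = {
		{ CONFIG_EVENT_SECTION, 0, 3 },
		{ CONFIG_EVENT_WHITESPACE, 3, 5 },
		{ CONFIG_EVENT_ENTRY, 5, 11 },
		{ CONFIG_EVENT_EOF, 11, 11 },
	};
	size_t i;

	for (i = 0; i < sizeof(events) / sizeof(events[0]); i++) {
		int ret = fn(events[i].type, events[i].begin,
			     events[i].end, data);
		if (ret)
			return ret; /* a non-zero return stops the replay */
	}
	return 0;
}
```

In the real code, the callback and its data pointer would be handed over via the `event_fn` and `event_fn_data` fields of `struct config_options`, as the patch shows.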
* [PATCH v3 10/15] config: avoid using the global variable `store` 2018-04-09 8:31 ` [PATCH v3 " Johannes Schindelin ` (8 preceding siblings ...) 2018-04-09 8:32 ` [PATCH v3 09/15] config: introduce an optional event stream while parsing Johannes Schindelin @ 2018-04-09 8:32 ` Johannes Schindelin 2018-04-09 8:32 ` [PATCH v3 11/15] config_set_store: rename some fields for consistency Johannes Schindelin ` (4 subsequent siblings) 14 siblings, 0 replies; 103+ messages in thread From: Johannes Schindelin @ 2018-04-09 8:32 UTC (permalink / raw) To: git Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King, Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley It is much easier to reason about, when the config code to set/unset variables or to remove/rename sections does not rely on a global (or file-local) variable. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 119 ++++++++++++++++++++++++++++++------------------------- 1 file changed, 66 insertions(+), 53 deletions(-) diff --git a/config.c b/config.c index 03d8e7709fe..0c0a965267d 100644 --- a/config.c +++ b/config.c @@ -2296,7 +2296,7 @@ void git_die_config(const char *key, const char *err, ...) * Find all the stuff for git_config_set() below. 
*/ -static struct { +struct config_store_data { int baselen; char *key; int do_not_match; @@ -2306,56 +2306,58 @@ static struct { unsigned int offset_alloc; enum { START, SECTION_SEEN, SECTION_END_SEEN, KEY_SEEN } state; unsigned int seen; -} store; +}; -static int matches(const char *key, const char *value) +static int matches(const char *key, const char *value, + const struct config_store_data *store) { - if (strcmp(key, store.key)) + if (strcmp(key, store->key)) return 0; /* not ours */ - if (!store.value_regex) + if (!store->value_regex) return 1; /* always matches */ - if (store.value_regex == CONFIG_REGEX_NONE) + if (store->value_regex == CONFIG_REGEX_NONE) return 0; /* never matches */ - return store.do_not_match ^ - (value && !regexec(store.value_regex, value, 0, NULL, 0)); + return store->do_not_match ^ + (value && !regexec(store->value_regex, value, 0, NULL, 0)); } static int store_aux(const char *key, const char *value, void *cb) { const char *ep; size_t section_len; + struct config_store_data *store = cb; - switch (store.state) { + switch (store->state) { case KEY_SEEN: - if (matches(key, value)) { - if (store.seen == 1 && store.multi_replace == 0) { + if (matches(key, value, store)) { + if (store->seen == 1 && store->multi_replace == 0) { warning(_("%s has multiple values"), key); } - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); + ALLOC_GROW(store->offset, store->seen + 1, + store->offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - store.seen++; + store->offset[store->seen] = cf->do_ftell(cf); + store->seen++; } break; case SECTION_SEEN: /* - * What we are looking for is in store.key (both + * What we are looking for is in store->key (both * section and var), and its section part is baselen * long. We found key (again, both section and var). * We would want to know if this key is in the same * section as what we are looking for. We already * know we are in the same section as what should - * hold store.key. 
+ * hold store->key. */ ep = strrchr(key, '.'); section_len = ep - key; - if ((section_len != store.baselen) || - memcmp(key, store.key, section_len+1)) { - store.state = SECTION_END_SEEN; + if ((section_len != store->baselen) || + memcmp(key, store->key, section_len+1)) { + store->state = SECTION_END_SEEN; break; } @@ -2363,26 +2365,27 @@ static int store_aux(const char *key, const char *value, void *cb) * Do not increment matches: this is no match, but we * just made sure we are in the desired section. */ - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); + ALLOC_GROW(store->offset, store->seen + 1, + store->offset_alloc); + store->offset[store->seen] = cf->do_ftell(cf); /* fallthru */ case SECTION_END_SEEN: case START: - if (matches(key, value)) { - ALLOC_GROW(store.offset, store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); - store.state = KEY_SEEN; - store.seen++; + if (matches(key, value, store)) { + ALLOC_GROW(store->offset, store->seen + 1, + store->offset_alloc); + store->offset[store->seen] = cf->do_ftell(cf); + store->state = KEY_SEEN; + store->seen++; } else { - if (strrchr(key, '.') - key == store.baselen && - !strncmp(key, store.key, store.baselen)) { - store.state = SECTION_SEEN; - ALLOC_GROW(store.offset, - store.seen + 1, - store.offset_alloc); - store.offset[store.seen] = cf->do_ftell(cf); + if (strrchr(key, '.') - key == store->baselen && + !strncmp(key, store->key, store->baselen)) { + store->state = SECTION_SEEN; + ALLOC_GROW(store->offset, + store->seen + 1, + store->offset_alloc); + store->offset[store->seen] = + cf->do_ftell(cf); } } } @@ -2397,31 +2400,33 @@ static int write_error(const char *filename) return 4; } -static struct strbuf store_create_section(const char *key) +static struct strbuf store_create_section(const char *key, + const struct config_store_data *store) { const char *dot; int i; struct strbuf sb = STRBUF_INIT; - dot = 
memchr(key, '.', store.baselen); + dot = memchr(key, '.', store->baselen); if (dot) { strbuf_addf(&sb, "[%.*s \"", (int)(dot - key), key); - for (i = dot - key + 1; i < store.baselen; i++) { + for (i = dot - key + 1; i < store->baselen; i++) { if (key[i] == '"' || key[i] == '\\') strbuf_addch(&sb, '\\'); strbuf_addch(&sb, key[i]); } strbuf_addstr(&sb, "\"]\n"); } else { - strbuf_addf(&sb, "[%.*s]\n", store.baselen, key); + strbuf_addf(&sb, "[%.*s]\n", store->baselen, key); } return sb; } -static ssize_t write_section(int fd, const char *key) +static ssize_t write_section(int fd, const char *key, + const struct config_store_data *store) { - struct strbuf sb = store_create_section(key); + struct strbuf sb = store_create_section(key, store); ssize_t ret; ret = write_in_full(fd, sb.buf, sb.len); @@ -2430,11 +2435,12 @@ static ssize_t write_section(int fd, const char *key) return ret; } -static ssize_t write_pair(int fd, const char *key, const char *value) +static ssize_t write_pair(int fd, const char *key, const char *value, + const struct config_store_data *store) { int i; ssize_t ret; - int length = strlen(key + store.baselen + 1); + int length = strlen(key + store->baselen + 1); const char *quote = ""; struct strbuf sb = STRBUF_INIT; @@ -2454,7 +2460,7 @@ static ssize_t write_pair(int fd, const char *key, const char *value) quote = "\""; strbuf_addf(&sb, "\t%.*s = %s", - length, key + store.baselen + 1, quote); + length, key + store->baselen + 1, quote); for (i = 0; value[i]; i++) switch (value[i]) { @@ -2564,6 +2570,9 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, char *filename_buf = NULL; char *contents = NULL; size_t contents_sz; + struct config_store_data store; + + memset(&store, 0, sizeof(store)); /* parse-key returns negative; flip the sign to feed exit(3) */ ret = 0 - git_config_parse_key(key, &store.key, &store.baselen); @@ -2606,8 +2615,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } store.key = 
(char *)key; - if (write_section(fd, key) < 0 || - write_pair(fd, key, value) < 0) + if (write_section(fd, key, &store) < 0 || + write_pair(fd, key, value, &store) < 0) goto write_err_out; } else { struct stat st; @@ -2646,7 +2655,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, * As a side effect, we make sure to transform only a valid * existing config file. */ - if (git_config_from_file(store_aux, config_filename, NULL)) { + if (git_config_from_file(store_aux, config_filename, &store)) { error("invalid config file %s", config_filename); free(store.key); if (store.value_regex != NULL && @@ -2730,10 +2739,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, /* write the pair (value == NULL means unset) */ if (value != NULL) { if (store.state == START) { - if (write_section(fd, key) < 0) + if (write_section(fd, key, &store) < 0) goto write_err_out; } - if (write_pair(fd, key, value) < 0) + if (write_pair(fd, key, value, &store) < 0) goto write_err_out; } @@ -2857,7 +2866,8 @@ static int section_name_is_ok(const char *name) /* if new_name == NULL, the section is removed instead */ static int git_config_copy_or_rename_section_in_file(const char *config_filename, - const char *old_name, const char *new_name, int copy) + const char *old_name, + const char *new_name, int copy) { int ret = 0, remove = 0; char *filename_buf = NULL; @@ -2867,6 +2877,9 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename FILE *config_file = NULL; struct stat st; struct strbuf copystr = STRBUF_INIT; + struct config_store_data store; + + memset(&store, 0, sizeof(store)); if (new_name && !section_name_is_ok(new_name)) { ret = error("invalid section name: %s", new_name); @@ -2936,7 +2949,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename } store.baselen = strlen(new_name); if (!copy) { - if (write_section(out_fd, new_name) < 0) { + if (write_section(out_fd, new_name, &store) < 
0) { ret = write_error(get_lock_file_path(&lock)); goto out; } @@ -2957,7 +2970,7 @@ static int git_config_copy_or_rename_section_in_file(const char *config_filename output[0] = '\t'; } } else { - copystr = store_create_section(new_name); + copystr = store_create_section(new_name, &store); } } remove = 0; -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 11/15] config_set_store: rename some fields for consistency
  2018-04-09  8:31 ` [PATCH v3 " Johannes Schindelin
                     ` (9 preceding siblings ...)
  2018-04-09  8:32   ` [PATCH v3 10/15] config: avoid using the global variable `store` Johannes Schindelin
@ 2018-04-09  8:32   ` Johannes Schindelin
  2018-04-09  8:32   ` [PATCH v3 12/15] git_config_set: do not use a state machine Johannes Schindelin
                     ` (3 subsequent siblings)
  14 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-09  8:32 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
	Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

The `seen` field is the actual length of the `offset` array, and the
`offset_alloc` field records what was allocated (to avoid resizing
wherever `seen` has to be incremented).

Elsewhere, we use the convention `name` for the array, where `name` is
descriptive enough to guess its purpose, `name_nr` for the actual length
and `name_alloc` to record the maximum length without needing to resize.

Let's make the names of the fields in question consistent with that
convention.

This will also help with the next steps where we will let the
git_config_set() machinery use the config event stream that we just
introduced.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 63 ++++++++++++++++++++++++++++---------------------------- 1 file changed, 31 insertions(+), 32 deletions(-) diff --git a/config.c b/config.c index 0c0a965267d..2341620c11a 100644 --- a/config.c +++ b/config.c @@ -2302,10 +2302,9 @@ struct config_store_data { int do_not_match; regex_t *value_regex; int multi_replace; - size_t *offset; - unsigned int offset_alloc; + size_t *seen; + unsigned int seen_nr, seen_alloc; enum { START, SECTION_SEEN, SECTION_END_SEEN, KEY_SEEN } state; - unsigned int seen; }; static int matches(const char *key, const char *value, @@ -2331,15 +2330,15 @@ static int store_aux(const char *key, const char *value, void *cb) switch (store->state) { case KEY_SEEN: if (matches(key, value, store)) { - if (store->seen == 1 && store->multi_replace == 0) { + if (store->seen_nr == 1 && store->multi_replace == 0) { warning(_("%s has multiple values"), key); } - ALLOC_GROW(store->offset, store->seen + 1, - store->offset_alloc); + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); - store->offset[store->seen] = cf->do_ftell(cf); - store->seen++; + store->seen[store->seen_nr] = cf->do_ftell(cf); + store->seen_nr++; } break; case SECTION_SEEN: @@ -2365,26 +2364,26 @@ static int store_aux(const char *key, const char *value, void *cb) * Do not increment matches: this is no match, but we * just made sure we are in the desired section. 
*/ - ALLOC_GROW(store->offset, store->seen + 1, - store->offset_alloc); - store->offset[store->seen] = cf->do_ftell(cf); + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); /* fallthru */ case SECTION_END_SEEN: case START: if (matches(key, value, store)) { - ALLOC_GROW(store->offset, store->seen + 1, - store->offset_alloc); - store->offset[store->seen] = cf->do_ftell(cf); + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); store->state = KEY_SEEN; - store->seen++; + store->seen_nr++; } else { if (strrchr(key, '.') - key == store->baselen && !strncmp(key, store->key, store->baselen)) { store->state = SECTION_SEEN; - ALLOC_GROW(store->offset, - store->seen + 1, - store->offset_alloc); - store->offset[store->seen] = + ALLOC_GROW(store->seen, + store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); } } @@ -2644,10 +2643,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } } - ALLOC_GROW(store.offset, 1, store.offset_alloc); - store.offset[0] = 0; + ALLOC_GROW(store.seen, 1, store.seen_alloc); + store.seen[0] = 0; store.state = START; - store.seen = 0; + store.seen_nr = 0; /* * After this, store.offset will contain the *end* offset @@ -2675,8 +2674,8 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, } /* if nothing to unset, or too many matches, error out */ - if ((store.seen == 0 && value == NULL) || - (store.seen > 1 && multi_replace == 0)) { + if ((store.seen_nr == 0 && value == NULL) || + (store.seen_nr > 1 && multi_replace == 0)) { ret = CONFIG_NOTHING_SET; goto out_free; } @@ -2707,19 +2706,19 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, goto out_free; } - if (store.seen == 0) - store.seen = 1; + if (store.seen_nr == 0) + store.seen_nr = 1; - for (i = 0, copy_begin = 0; i < store.seen; i++) { + for (i = 0, copy_begin 
= 0; i < store.seen_nr; i++) { new_line = 0; - if (store.offset[i] == 0) { - store.offset[i] = copy_end = contents_sz; + if (store.seen[i] == 0) { + store.seen[i] = copy_end = contents_sz; } else if (store.state != KEY_SEEN) { - copy_end = store.offset[i]; + copy_end = store.seen[i]; } else copy_end = find_beginning_of_line( contents, contents_sz, - store.offset[i], &new_line); + store.seen[i], &new_line); if (copy_end > 0 && contents[copy_end-1] != '\n') new_line = 1; @@ -2733,7 +2732,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, write_str_in_full(fd, "\n") < 0) goto write_err_out; } - copy_begin = store.offset[i]; + copy_begin = store.seen[i]; } /* write the pair (value == NULL means unset) */ -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
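The `name`/`name_nr`/`name_alloc` convention mentioned in the commit message can be sketched as follows. This is a simplified, self-contained stand-in for what `ALLOC_GROW()` does in Git, not Git's actual macro: `struct offsets` and `seen_push()` are hypothetical names, the growth formula is an arbitrary choice for the sketch, and the error handling is a bare assert.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container following the name/name_nr/name_alloc convention. */
struct offsets {
	size_t *seen;            /* the array itself */
	unsigned int seen_nr;    /* number of items in use */
	unsigned int seen_alloc; /* number of items allocated */
};

/* Append a value, growing the array only when seen_nr would exceed seen_alloc. */
static void seen_push(struct offsets *o, size_t value)
{
	if (o->seen_nr + 1 > o->seen_alloc) {
		/* growth formula is an assumption made for this sketch */
		unsigned int new_alloc = (o->seen_alloc + 16) * 3 / 2;
		size_t *p = realloc(o->seen, new_alloc * sizeof(*p));

		assert(p); /* toy error handling only */
		o->seen = p;
		o->seen_alloc = new_alloc;
	}
	o->seen[o->seen_nr++] = value;
}
```

With the three fields named alike, a reader can tell at a glance which counter belongs to which array, which is the point of the rename.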
* [PATCH v3 12/15] git_config_set: do not use a state machine
  2018-04-09  8:31 ` [PATCH v3 " Johannes Schindelin
                     ` (10 preceding siblings ...)
  2018-04-09  8:32   ` [PATCH v3 11/15] config_set_store: rename some fields for consistency Johannes Schindelin
@ 2018-04-09  8:32   ` Johannes Schindelin
  2018-04-09  8:32   ` [PATCH v3 13/15] git_config_set: make use of the config parser's event stream Johannes Schindelin
                     ` (2 subsequent siblings)
  14 siblings, 0 replies; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-09  8:32 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
	Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

While a neat theoretical construct, state machines are hard to read. In
this instance, it does not even make a whole lot of sense because we are
more interested in flags, anyway: has the section been seen? Has the key
been seen? Does the current section match the key we are looking for?

Besides, the state `SECTION_SEEN` was named in a misleading way: it did
not indicate that we saw the section matching the key we are looking
for, but it instead indicated that we are *currently* in that section.

Let's just replace the state machine logic by clear and obvious flags.

This will also make it easier to review the upcoming patches to use the
newly-introduced `event_fn` callback of the config parser.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 59 ++++++++++++++++++++++++++++---------------------------- 1 file changed, 29 insertions(+), 30 deletions(-) diff --git a/config.c b/config.c index 2341620c11a..3f1cbfa181e 100644 --- a/config.c +++ b/config.c @@ -2304,7 +2304,7 @@ struct config_store_data { int multi_replace; size_t *seen; unsigned int seen_nr, seen_alloc; - enum { START, SECTION_SEEN, SECTION_END_SEEN, KEY_SEEN } state; + unsigned int key_seen:1, section_seen:1, is_keys_section:1; }; static int matches(const char *key, const char *value, @@ -2327,8 +2327,7 @@ static int store_aux(const char *key, const char *value, void *cb) size_t section_len; struct config_store_data *store = cb; - switch (store->state) { - case KEY_SEEN: + if (store->key_seen) { if (matches(key, value, store)) { if (store->seen_nr == 1 && store->multi_replace == 0) { warning(_("%s has multiple values"), key); @@ -2340,8 +2339,8 @@ static int store_aux(const char *key, const char *value, void *cb) store->seen[store->seen_nr] = cf->do_ftell(cf); store->seen_nr++; } - break; - case SECTION_SEEN: + return 0; + } else if (store->is_keys_section) { /* * What we are looking for is in store->key (both * section and var), and its section part is baselen @@ -2356,10 +2355,9 @@ static int store_aux(const char *key, const char *value, void *cb) if ((section_len != store->baselen) || memcmp(key, store->key, section_len+1)) { - store->state = SECTION_END_SEEN; - break; + store->is_keys_section = 0; + return 0; } - /* * Do not increment matches: this is no match, but we * just made sure we are in the desired section. 
@@ -2367,27 +2365,29 @@ static int store_aux(const char *key, const char *value, void *cb) ALLOC_GROW(store->seen, store->seen_nr + 1, store->seen_alloc); store->seen[store->seen_nr] = cf->do_ftell(cf); - /* fallthru */ - case SECTION_END_SEEN: - case START: - if (matches(key, value, store)) { - ALLOC_GROW(store->seen, store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); - store->state = KEY_SEEN; - store->seen_nr++; - } else { - if (strrchr(key, '.') - key == store->baselen && - !strncmp(key, store->key, store->baselen)) { - store->state = SECTION_SEEN; - ALLOC_GROW(store->seen, - store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = - cf->do_ftell(cf); - } + } + + if (matches(key, value, store)) { + ALLOC_GROW(store->seen, store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = cf->do_ftell(cf); + store->seen_nr++; + store->key_seen = 1; + store->section_seen = 1; + store->is_keys_section = 1; + } else { + if (strrchr(key, '.') - key == store->baselen && + !strncmp(key, store->key, store->baselen)) { + store->section_seen = 1; + store->is_keys_section = 1; + ALLOC_GROW(store->seen, + store->seen_nr + 1, + store->seen_alloc); + store->seen[store->seen_nr] = + cf->do_ftell(cf); } } + return 0; } @@ -2645,7 +2645,6 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, ALLOC_GROW(store.seen, 1, store.seen_alloc); store.seen[0] = 0; - store.state = START; store.seen_nr = 0; /* @@ -2713,7 +2712,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, new_line = 0; if (store.seen[i] == 0) { store.seen[i] = copy_end = contents_sz; - } else if (store.state != KEY_SEEN) { + } else if (!store.key_seen) { copy_end = store.seen[i]; } else copy_end = find_beginning_of_line( @@ -2737,7 +2736,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, /* write the pair (value == NULL means unset) */ if (value != NULL) { - if (store.state == 
START) { + if (!store.section_seen) { if (write_section(fd, key, &store) < 0) goto write_err_out; } -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
* [PATCH v3 13/15] git_config_set: make use of the config parser's event stream
  2018-04-09  8:31 ` [PATCH v3 " Johannes Schindelin
                     ` (11 preceding siblings ...)
  2018-04-09  8:32   ` [PATCH v3 12/15] git_config_set: do not use a state machine Johannes Schindelin
@ 2018-04-09  8:32   ` Johannes Schindelin
  2018-05-08 13:42     ` Jeff King
  2018-04-09  8:32   ` [PATCH v3 14/15] git config --unset: remove empty sections (in the common case) Johannes Schindelin
  2018-04-09  8:32   ` [PATCH v3 15/15] git_config_set: reuse empty sections Johannes Schindelin
  14 siblings, 1 reply; 103+ messages in thread
From: Johannes Schindelin @ 2018-04-09  8:32 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
	Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

In the recent commit with the title "config: introduce an optional event
stream while parsing", we introduced an optional callback to keep track
of the config parser's events "comment", "white-space", "section header"
and "entry".

One motivation for this feature was to make use of it in the code that
edits the config. And this commit makes it so.

Note: this patch changes the meaning of the `seen` array that records
whether we saw the config entry that is to be edited: previously, it
contained the end offset of the found entry. Now, we introduce a new
array `parsed` that keeps a record of *all* config parser events (with
begin/end offsets), and the items in the `seen` array now point into the
`parsed` array.

There are two reasons why we do it this way:

1. To keep the implementation simple, the config parser's event stream
   reports the event only after the config callback was called, so we
   would not receive the begin offset otherwise.

2. In the following patches, we will re-use the `parsed` array to fix two
   long-standing bugs related to empty sections.

Note that this also makes the code more robust with respect to finding the begin offset of the part(s) of the config file to be edited, as we no longer back-track to find the beginning of the line. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.c | 170 ++++++++++++++++++++++++++----------------------------- 1 file changed, 81 insertions(+), 89 deletions(-) diff --git a/config.c b/config.c index 3f1cbfa181e..72d71fc9a4e 100644 --- a/config.c +++ b/config.c @@ -2302,8 +2302,11 @@ struct config_store_data { int do_not_match; regex_t *value_regex; int multi_replace; - size_t *seen; - unsigned int seen_nr, seen_alloc; + struct { + size_t begin, end; + enum config_event_t type; + } *parsed; + unsigned int parsed_nr, parsed_alloc, *seen, seen_nr, seen_alloc; unsigned int key_seen:1, section_seen:1, is_keys_section:1; }; @@ -2321,10 +2324,31 @@ static int matches(const char *key, const char *value, (value && !regexec(store->value_regex, value, 0, NULL, 0)); } +static int store_aux_event(enum config_event_t type, + size_t begin, size_t end, void *data) +{ + struct config_store_data *store = data; + + ALLOC_GROW(store->parsed, store->parsed_nr + 1, store->parsed_alloc); + store->parsed[store->parsed_nr].begin = begin; + store->parsed[store->parsed_nr].end = end; + store->parsed[store->parsed_nr].type = type; + store->parsed_nr++; + + if (type == CONFIG_EVENT_SECTION) { + if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.') + BUG("Invalid section name '%s'", cf->var.buf); + + /* Is this the section we were looking for? 
*/ + store->is_keys_section = cf->var.len - 1 == store->baselen && + !strncasecmp(cf->var.buf, store->key, store->baselen); + } + + return 0; +} + static int store_aux(const char *key, const char *value, void *cb) { - const char *ep; - size_t section_len; struct config_store_data *store = cb; if (store->key_seen) { @@ -2336,55 +2360,21 @@ static int store_aux(const char *key, const char *value, void *cb) ALLOC_GROW(store->seen, store->seen_nr + 1, store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); + store->seen[store->seen_nr] = store->parsed_nr; store->seen_nr++; } - return 0; } else if (store->is_keys_section) { /* - * What we are looking for is in store->key (both - * section and var), and its section part is baselen - * long. We found key (again, both section and var). - * We would want to know if this key is in the same - * section as what we are looking for. We already - * know we are in the same section as what should - * hold store->key. + * Do not increment matches yet: this may not be a match, but we + * are in the desired section. */ - ep = strrchr(key, '.'); - section_len = ep - key; - - if ((section_len != store->baselen) || - memcmp(key, store->key, section_len+1)) { - store->is_keys_section = 0; - return 0; - } - /* - * Do not increment matches: this is no match, but we - * just made sure we are in the desired section. 
- */ - ALLOC_GROW(store->seen, store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); - } - - if (matches(key, value, store)) { - ALLOC_GROW(store->seen, store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = cf->do_ftell(cf); - store->seen_nr++; - store->key_seen = 1; + ALLOC_GROW(store->seen, store->seen_nr + 1, store->seen_alloc); + store->seen[store->seen_nr] = store->parsed_nr; store->section_seen = 1; - store->is_keys_section = 1; - } else { - if (strrchr(key, '.') - key == store->baselen && - !strncmp(key, store->key, store->baselen)) { - store->section_seen = 1; - store->is_keys_section = 1; - ALLOC_GROW(store->seen, - store->seen_nr + 1, - store->seen_alloc); - store->seen[store->seen_nr] = - cf->do_ftell(cf); + + if (matches(key, value, store)) { + store->seen_nr++; + store->key_seen = 1; } } @@ -2485,32 +2475,6 @@ static ssize_t write_pair(int fd, const char *key, const char *value, return ret; } -static ssize_t find_beginning_of_line(const char *contents, size_t size, - size_t offset_, int *found_bracket) -{ - size_t equal_offset = size, bracket_offset = size; - ssize_t offset; - -contline: - for (offset = offset_-2; offset > 0 - && contents[offset] != '\n'; offset--) - switch (contents[offset]) { - case '=': equal_offset = offset; break; - case ']': bracket_offset = offset; break; - } - if (offset > 0 && contents[offset-1] == '\\') { - offset_ = offset; - goto contline; - } - if (bracket_offset < equal_offset) { - *found_bracket = 1; - offset = bracket_offset+1; - } else - offset++; - - return offset; -} - int git_config_set_in_file_gently(const char *config_filename, const char *key, const char *value) { @@ -2621,6 +2585,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, struct stat st; size_t copy_begin, copy_end; int i, new_line = 0; + struct config_options opts; if (value_regex == NULL) store.value_regex = NULL; @@ -2643,17 +2608,24 @@ int 
git_config_set_multivar_in_file_gently(const char *config_filename, } } - ALLOC_GROW(store.seen, 1, store.seen_alloc); - store.seen[0] = 0; - store.seen_nr = 0; + ALLOC_GROW(store.parsed, 1, store.parsed_alloc); + store.parsed[0].end = 0; + + memset(&opts, 0, sizeof(opts)); + opts.event_fn = store_aux_event; + opts.event_fn_data = &store; /* - * After this, store.offset will contain the *end* offset - * of the last match, or remain at 0 if no match was found. + * After this, store.parsed will contain offsets of all the + * parsed elements, and store.seen will contain a list of + * matches, as indices into store.parsed. + * * As a side effect, we make sure to transform only a valid * existing config file. */ - if (git_config_from_file(store_aux, config_filename, &store)) { + if (git_config_from_file_with_options(store_aux, + config_filename, + &store, &opts)) { error("invalid config file %s", config_filename); free(store.key); if (store.value_regex != NULL && @@ -2705,19 +2677,39 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, goto out_free; } - if (store.seen_nr == 0) + if (store.seen_nr == 0) { + if (!store.seen_alloc) { + /* Did not see key nor section */ + ALLOC_GROW(store.seen, 1, store.seen_alloc); + store.seen[0] = store.parsed_nr + - !!store.parsed_nr; + } store.seen_nr = 1; + } for (i = 0, copy_begin = 0; i < store.seen_nr; i++) { + size_t replace_end; + int j = store.seen[i]; + new_line = 0; - if (store.seen[i] == 0) { - store.seen[i] = copy_end = contents_sz; - } else if (!store.key_seen) { - copy_end = store.seen[i]; - } else - copy_end = find_beginning_of_line( - contents, contents_sz, - store.seen[i], &new_line); + if (!store.key_seen) { + replace_end = copy_end = store.parsed[j].end; + } else { + replace_end = store.parsed[j].end; + copy_end = store.parsed[j].begin; + /* + * Swallow preceding white-space on the same + * line. 
+ */ + while (copy_end > 0 ) { + char c = contents[copy_end - 1]; + + if (isspace(c) && c != '\n') + copy_end--; + else + break; + } + } if (copy_end > 0 && contents[copy_end-1] != '\n') new_line = 1; @@ -2731,7 +2723,7 @@ int git_config_set_multivar_in_file_gently(const char *config_filename, write_str_in_full(fd, "\n") < 0) goto write_err_out; } - copy_begin = store.seen[i]; + copy_begin = replace_end; } /* write the pair (value == NULL means unset) */ -- 2.17.0.windows.1.4.g7e4058d72e3 ^ permalink raw reply related [flat|nested] 103+ messages in thread
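To illustrate how the begin/end offsets replace the old back-tracking, here is a small self-contained sketch of dropping one entry from a buffer. The loop in `swallow_same_line_whitespace()` mirrors the one added in the hunk above; `drop_range()` is a hypothetical helper invented for this example (the real code writes to a lock file in pieces instead), and the sketch uses the standard `isspace()` rather than Git's own ctype replacement.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Move copy_end backwards over white-space on the same line, stopping
 * at a newline or at the first non-space character.
 */
static size_t swallow_same_line_whitespace(const char *contents,
					   size_t copy_end)
{
	while (copy_end > 0) {
		char c = contents[copy_end - 1];

		if (isspace((unsigned char)c) && c != '\n')
			copy_end--;
		else
			break;
	}
	return copy_end;
}

/*
 * Hypothetical helper: copy `contents` (of length `size`) into `out`,
 * leaving out the element at [begin, end) together with the indentation
 * in front of it. `out` must have room for size + 1 bytes.
 * Returns the length of the result.
 */
static size_t drop_range(const char *contents, size_t size,
			 size_t begin, size_t end, char *out)
{
	size_t copy_end = swallow_same_line_whitespace(contents, begin);
	size_t n = copy_end;

	memcpy(out, contents, copy_end);           /* everything before */
	memcpy(out + n, contents + end, size - end); /* everything after */
	n += size - end;
	out[n] = '\0';
	return n;
}
```

For the buffer "[a]\n\tb = c\n" with an entry recorded at offsets 5 to 11, the leading tab is swallowed and only the section header line remains, which is exactly the left-over empty section that the next patch in the series deals with.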
* Re: [PATCH v3 13/15] git_config_set: make use of the config parser's event stream
  2018-04-09  8:32   ` [PATCH v3 13/15] git_config_set: make use of the config parser's event stream Johannes Schindelin
@ 2018-05-08 13:42     ` Jeff King
  2018-05-08 14:00       ` Jeff King
  0 siblings, 1 reply; 103+ messages in thread
From: Jeff King @ 2018-05-08 13:42 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, Junio C Hamano, Thomas Rast, Phil Haack,
	Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

On Mon, Apr 09, 2018 at 10:32:20AM +0200, Johannes Schindelin wrote:

> +static int store_aux_event(enum config_event_t type,
> +			   size_t begin, size_t end, void *data)
> +{
> +	struct config_store_data *store = data;
> +
> +	ALLOC_GROW(store->parsed, store->parsed_nr + 1, store->parsed_alloc);
> +	store->parsed[store->parsed_nr].begin = begin;
> +	store->parsed[store->parsed_nr].end = end;
> +	store->parsed[store->parsed_nr].type = type;
> +	store->parsed_nr++;
> +
> +	if (type == CONFIG_EVENT_SECTION) {
> +		if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.')
> +			BUG("Invalid section name '%s'", cf->var.buf);

I triggered this BUG today while playing around. Here's a minimal
reproduction:

  echo '[broken' >config
  git config --file=config a.b c

I'm not sure if it should simply be a die() and not a BUG(), since
it depends on the input. Or if it is a BUG and we expected an earlier
part of the code (like the event generator) to catch this broken case
before we get to this function.

-Peff

^ permalink raw reply	[flat|nested] 103+ messages in thread
* Re: [PATCH v3 13/15] git_config_set: make use of the config parser's event stream
From: Jeff King @ 2018-05-08 14:00 UTC
To: Johannes Schindelin
Cc: git, Junio C Hamano, Thomas Rast, Phil Haack,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

On Tue, May 08, 2018 at 09:42:48AM -0400, Jeff King wrote:

> On Mon, Apr 09, 2018 at 10:32:20AM +0200, Johannes Schindelin wrote:
>
> > +static int store_aux_event(enum config_event_t type,
> > +			   size_t begin, size_t end, void *data)
> > +{
> > +	struct config_store_data *store = data;
> > +
> > +	ALLOC_GROW(store->parsed, store->parsed_nr + 1, store->parsed_alloc);
> > +	store->parsed[store->parsed_nr].begin = begin;
> > +	store->parsed[store->parsed_nr].end = end;
> > +	store->parsed[store->parsed_nr].type = type;
> > +	store->parsed_nr++;
> > +
> > +	if (type == CONFIG_EVENT_SECTION) {
> > +		if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.')
> > +			BUG("Invalid section name '%s'", cf->var.buf);
>
> I triggered this BUG today while playing around. Here's a minimal
> reproduction:
>
>   echo '[broken' >config
>   git config --file=config a.b c
>
> I'm not sure if it should simply be a die() and not a BUG(), since
> it depends on the input. Or if it is a BUG and we expected an earlier
> part of the code (like the event generator) to catch this broken case
> before we get to this function.

By the way, one side effect of BUG() here is that we call abort(), which
means that our atexit handlers don't run. And a crufty "config.lock"
file is left that prevents running the command again.

In our discussion elsewhere of having BUG() just call exit(), I'm not
sure if we'd want it to skip those cleanups or not (it's helpful to not
run them if you're trying to debug, but otherwise is annoying).

-Peff

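The point about abort() skipping atexit handlers can be checked in isolation. What follows is a minimal sketch, assuming a POSIX system; it is not Git's code. The names `cleanup` and `cleanup_ran` are hypothetical, and writing a byte to a pipe stands in for removing "config.lock".

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe the child uses to report that cleanup happened. */
static int notify_fd = -1;

/* Hypothetical stand-in for a handler that would remove "config.lock". */
static void cleanup(void)
{
	if (notify_fd >= 0) {
		ssize_t ignored = write(notify_fd, "x", 1);
		(void)ignored;
	}
}

/*
 * Fork a child that registers cleanup() with atexit() and then terminates
 * via abort() (use_abort != 0) or exit() (use_abort == 0).
 * Returns 1 if the handler ran, 0 if it did not, -1 on setup failure.
 */
int cleanup_ran(int use_abort)
{
	int fds[2];
	char byte;
	ssize_t n;
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: only the write end is needed. */
		close(fds[0]);
		notify_fd = fds[1];
		atexit(cleanup);
		if (use_abort)
			abort();	/* raises SIGABRT; atexit handlers are skipped */
		exit(1);		/* normal termination; atexit handlers run */
	}

	/* Parent: a byte arrives only if the child's handler ran. */
	close(fds[1]);
	n = read(fds[0], &byte, 1);
	close(fds[0]);
	waitpid(pid, NULL, 0);
	return n == 1;
}
```

Calling `cleanup_ran(0)` exercises the exit() path and `cleanup_ran(1)` the abort() path, so the two return values show which termination route leaves the stand-in "lock" behind.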
* [PATCH v3 14/15] git config --unset: remove empty sections (in the common case)
From: Johannes Schindelin @ 2018-04-09 8:32 UTC
To: git
Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

The original reasoning for not removing section headers upon removal of
the last entry went like this: the user could have added comments about
the section, or about the entries therein, and if there were other
comments there, we would not know whether we should remove them.

In particular, a concocted example was presented that looked like this
(and was added to t1300):

	# some generic comment on the configuration file itself
	# a comment specific to this "section" section.
	[section]
	# some intervening lines
	# that should also be dropped
	key = value
	# please be careful when you update the above variable

The ideal thing for `git config --unset section.key` in this case would
be to leave only the first line behind, because all the other comments
are now obsolete.

However, this is unfeasible, short of adding a complete Natural Language
Processing module to Git, which seems not only a lot of work, but a
totally unreasonable feature (for little benefit to most users).

Now, the real kicker about this problem is: most users do not edit their
config files at all! In their use case, the config looks like this
instead:

	[section]
		key = value

... and it is totally obvious what should happen if the entry is
removed: the entire section should vanish.

Let's generalize this observation to this conservative strategy: if we
are removing the last entry from a section, and there are no comments
inside that section nor surrounding it, then remove the entire section.
Otherwise behave as before: leave the now-empty section (including those
comments, even ones about the now-deleted entry).

We have to be extra careful to handle the case where more than one entry
is removed: any subset of them might be the last entries of their
respective sections (and if there are no comments in or around that
section, the section should be removed, too).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.c          | 93 ++++++++++++++++++++++++++++++++++++++++++++++-
 t/t1300-config.sh |  4 +-
 2 files changed, 93 insertions(+), 4 deletions(-)

diff --git a/config.c b/config.c
index 72d71fc9a4e..2c7a10acdaa 100644
--- a/config.c
+++ b/config.c
@@ -2305,6 +2305,7 @@ struct config_store_data {
 	struct {
 		size_t begin, end;
 		enum config_event_t type;
+		int is_keys_section;
 	} *parsed;
 	unsigned int parsed_nr, parsed_alloc, *seen, seen_nr, seen_alloc;
 	unsigned int key_seen:1, section_seen:1, is_keys_section:1;
@@ -2333,17 +2334,20 @@ static int store_aux_event(enum config_event_t type,
 	store->parsed[store->parsed_nr].begin = begin;
 	store->parsed[store->parsed_nr].end = end;
 	store->parsed[store->parsed_nr].type = type;
-	store->parsed_nr++;
 
 	if (type == CONFIG_EVENT_SECTION) {
 		if (cf->var.len < 2 || cf->var.buf[cf->var.len - 1] != '.')
 			BUG("Invalid section name '%s'", cf->var.buf);
 
 		/* Is this the section we were looking for? */
-		store->is_keys_section = cf->var.len - 1 == store->baselen &&
+		store->is_keys_section =
+			store->parsed[store->parsed_nr].is_keys_section =
+			cf->var.len - 1 == store->baselen &&
 			!strncasecmp(cf->var.buf, store->key, store->baselen);
 	}
 
+	store->parsed_nr++;
+
 	return 0;
 }
 
@@ -2475,6 +2479,87 @@ static ssize_t write_pair(int fd, const char *key, const char *value,
 	return ret;
 }
 
+/*
+ * If we are about to unset the last key(s) in a section, and if there are
+ * no comments surrounding (or included in) the section, we will want to
+ * extend begin/end to remove the entire section.
+ *
+ * Note: the parameter `seen_ptr` points to the index into the store.seen
+ * array. This index may be incremented if a section has more than one
+ * entry (which all are to be removed).
+ */
+static void maybe_remove_section(struct config_store_data *store,
+				 const char *contents,
+				 size_t *begin_offset, size_t *end_offset,
+				 int *seen_ptr)
+{
+	size_t begin;
+	int i, seen, section_seen = 0;
+
+	/*
+	 * First, ensure that this is the first key, and that there are no
+	 * comments before the entry nor before the section header.
+	 */
+	seen = *seen_ptr;
+	for (i = store->seen[seen]; i > 0; i--) {
+		enum config_event_t type = store->parsed[i - 1].type;
+
+		if (type == CONFIG_EVENT_COMMENT)
+			/* There is a comment before this entry or section */
+			return;
+		if (type == CONFIG_EVENT_ENTRY) {
+			if (!section_seen)
+				/* This is not the section's first entry. */
+				return;
+			/* We encountered no comment before the section. */
+			break;
+		}
+		if (type == CONFIG_EVENT_SECTION) {
+			if (!store->parsed[i - 1].is_keys_section)
+				break;
+			section_seen = 1;
+		}
+	}
+	begin = store->parsed[i].begin;
+
+	/*
+	 * Next, make sure that we are removing the last key(s) in the section,
+	 * and that there are no comments that are possibly about the current
+	 * section.
+	 */
+	for (i = store->seen[seen] + 1; i < store->parsed_nr; i++) {
+		enum config_event_t type = store->parsed[i].type;
+
+		if (type == CONFIG_EVENT_COMMENT)
+			return;
+		if (type == CONFIG_EVENT_SECTION) {
+			if (store->parsed[i].is_keys_section)
+				continue;
+			break;
+		}
+		if (type == CONFIG_EVENT_ENTRY) {
+			if (++seen < store->seen_nr &&
+			    i == store->seen[seen])
+				/* We want to remove this entry, too */
+				continue;
+			/* There is another entry in this section. */
+			return;
+		}
+	}
+
+	/*
+	 * We are really removing the last entry/entries from this section, and
+	 * there are no enclosed or surrounding comments. Remove the entire,
+	 * now-empty section.
+	 */
+	*seen_ptr = seen;
+	*begin_offset = begin;
+	if (i < store->parsed_nr)
+		*end_offset = store->parsed[i].begin;
+	else
+		*end_offset = store->parsed[store->parsed_nr - 1].end;
+}
+
 int git_config_set_in_file_gently(const char *config_filename,
 				  const char *key, const char *value)
 {
@@ -2697,6 +2782,10 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 			} else {
 				replace_end = store.parsed[j].end;
 				copy_end = store.parsed[j].begin;
+				if (!value)
+					maybe_remove_section(&store, contents,
+							     &copy_end,
+							     &replace_end, &i);
 				/*
 				 * Swallow preceding white-space on the same
 				 * line.
diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index 9d23a8ca972..d973fd53398 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1413,7 +1413,7 @@ test_expect_success 'urlmatch with wildcard' '
 '
 
 # good section hygiene
-test_expect_failure '--unset last key removes section (except if commented)' '
+test_expect_success '--unset last key removes section (except if commented)' '
 	cat >.git/config <<-\EOF &&
 	# some generic comment on the configuration file itself
 	# a comment specific to this "section" section.
@@ -1495,7 +1495,7 @@ test_expect_failure '--unset last key removes section (except if commented)' '
 	test_line_count = 3 .git/config
 '
 
-test_expect_failure '--unset-all removes section if empty & uncommented' '
+test_expect_success '--unset-all removes section if empty & uncommented' '
 	cat >.git/config <<-\EOF &&
 	[section]
 	key = value1
-- 
2.17.0.windows.1.4.g7e4058d72e3

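The conservative strategy described in the commit message above can be modelled on a plain array of parse events. This is a simplified sketch and not the patch's `maybe_remove_section()`: the enum and the function `may_remove_section` are hypothetical names, only a single entry is removed at a time, and every section header is treated as a boundary (the patch additionally handles repeated headers of the same section and several entries being removed at once).

```c
#include <stddef.h>

/* Simplified stand-ins for the parser's event types (hypothetical names). */
enum event_type { EV_SECTION, EV_ENTRY, EV_COMMENT, EV_WHITESPACE };

/*
 * Return 1 if removing the entry at index `idx` may also remove its whole
 * section: the entry must be the only one in its section, and no comment
 * may appear between the previous entry or header and this section's
 * header, inside the section, or after the entry up to the next header.
 * Return 0 otherwise.
 */
int may_remove_section(const enum event_type *events, size_t nr, size_t idx)
{
	size_t i;
	int header_seen = 0;

	if (idx >= nr || events[idx] != EV_ENTRY)
		return 0;

	/* Walk backwards, past this section's header. */
	for (i = idx; i > 0; i--) {
		enum event_type type = events[i - 1];

		if (type == EV_COMMENT)
			return 0;	/* comment before the entry or the header */
		if (type == EV_ENTRY) {
			if (!header_seen)
				return 0;	/* not the section's first entry */
			break;		/* entry of a previous section */
		}
		if (type == EV_SECTION) {
			if (header_seen)
				break;	/* header of a previous section */
			header_seen = 1;
		}
	}
	if (!header_seen)
		return 0;	/* entry without any section header */

	/* Walk forwards: only whitespace is allowed before the next header. */
	for (i = idx + 1; i < nr; i++) {
		if (events[i] == EV_COMMENT || events[i] == EV_ENTRY)
			return 0;
		if (events[i] == EV_SECTION)
			break;
	}
	return 1;
}
```

With the events "section header, entry" the function returns 1 for the entry; putting a comment in front of the header, or a second entry after the first, makes it return 0, which corresponds to the "behave as before" branch of the strategy.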
* [PATCH v3 15/15] git_config_set: reuse empty sections
From: Johannes Schindelin @ 2018-04-09 8:32 UTC
To: git
Cc: Junio C Hamano, Thomas Rast, Phil Haack, Jeff King,
    Ævar Arnfjörð Bjarmason, Stefan Beller, Jason Frey, Philip Oakley

It can happen quite easily that the last setting in a config section is
removed, and to avoid confusion when there are comments in the config
about that section, we keep a lone section header, i.e. an empty section.

Now that we use the `event_fn` callback, it is easy to add support for
re-using empty sections, so let's do that.

Note: t5512-ls-remote requires that this change is applied *after* the
patch "git config --unset: remove empty sections (in the common case)":
without that patch, there would be empty `transfer` and `uploadpack`
sections ready for reuse, but in the *wrong* order (and consequently,
t5512's "overrides work between mixed transfer/upload-pack hideRefs"
would fail).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 config.c          | 14 +++++++++++++-
 t/t1300-config.sh |  2 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/config.c b/config.c
index 2c7a10acdaa..6155d0651bd 100644
--- a/config.c
+++ b/config.c
@@ -2344,6 +2344,12 @@ static int store_aux_event(enum config_event_t type,
 			store->parsed[store->parsed_nr].is_keys_section =
 			cf->var.len - 1 == store->baselen &&
 			!strncasecmp(cf->var.buf, store->key, store->baselen);
+		if (store->is_keys_section) {
+			store->section_seen = 1;
+			ALLOC_GROW(store->seen, store->seen_nr + 1,
+				   store->seen_alloc);
+			store->seen[store->seen_nr] = store->parsed_nr;
+		}
 	}
 
 	store->parsed_nr++;
@@ -2778,7 +2784,13 @@ int git_config_set_multivar_in_file_gently(const char *config_filename,
 
 			new_line = 0;
 			if (!store.key_seen) {
-				replace_end = copy_end = store.parsed[j].end;
+				copy_end = store.parsed[j].end;
+				/* include '\n' when copying section header */
+				if (copy_end > 0 && copy_end < contents_sz &&
+				    contents[copy_end - 1] != '\n' &&
+				    contents[copy_end] == '\n')
+					copy_end++;
+				replace_end = copy_end;
 			} else {
 				replace_end = store.parsed[j].end;
 				copy_end = store.parsed[j].begin;
diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index d973fd53398..eef0bbe4f9f 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1506,7 +1506,7 @@ test_expect_success '--unset-all removes section if empty & uncommented' '
 	test_line_count = 0 .git/config
 '
 
-test_expect_failure 'adding a key into an empty section reuses header' '
+test_expect_success 'adding a key into an empty section reuses header' '
 	cat >.git/config <<-\EOF &&
 	[section]
 	EOF
-- 
2.17.0.windows.1.4.g7e4058d72e3