git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Shourya Shukla <periperidip@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, liu.denton@gmail.com,
	christian.couder@gmail.com, kaartic.sivaraam@gmail.com
Subject: Re: [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()'
Date: Sat, 19 Dec 2020 01:08:11 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.2012190104140.56@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <20201214231939.644175-2-periperidip@gmail.com>

Hi Shourya,

On Tue, 15 Dec 2020, Shourya Shukla wrote:

> Change the scope of the function 'directory_exists_in_index()' as well
> as declare it in 'dir.h'.
>
> Since the return type of the function is the enumerator 'exist_status',
> change its scope as well and declare it in 'dir.h'. While at it, rename
> the members of the aforementioned enum so as to avoid any naming clashes
> or confusions later on.

This makes it sound as if only existing code was adjusted, in a minimal
way, but no new code was introduced. But that's not true:

>
> Helped-by: Christian Couder <christian.couder@gmail.com>
> Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> Signed-off-by: Shourya Shukla <periperidip@gmail.com>
> ---
>  builtin/submodule--helper.c | 408 ++++++++++++++++++++++++++++++++++++
>  dir.c                       |  30 ++-
>  dir.h                       |   9 +
>  3 files changed, 429 insertions(+), 18 deletions(-)

Tons of new code there. And unfortunately...

>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index c30896c897..4dfad35d77 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -2744,6 +2744,414 @@ static int module_set_branch(int argc, const char **argv, const char *prefix)
>  	return !!ret;
>  }
>
> +struct add_data {
> +	const char *prefix;
> +	const char *branch;
> +	const char *reference_path;
> +	const char *sm_path;
> +	const char *sm_name;
> +	const char *repo;
> +	const char *realrepo;
> +	int depth;
> +	unsigned int force: 1;
> +	unsigned int quiet: 1;
> +	unsigned int progress: 1;
> +	unsigned int dissociate: 1;
> +};
> +#define ADD_DATA_INIT { 0 }
> +
> +/*
> + * Guess the directory name from the repository URL by performing the
> + * operations below in the following order:
> + *
> + * - If the URL ends with '/', remove that.
> + *
> + * - If the result of the above ends with zero or more ':', followed
> + *  by zero or more '/', followed by ".git", drop the matching part.
> + *
> + * - If the result of the above has '/' or ':' in it, remove everything
> + *  before it and '/' or ':' itself.
> + */
> +static char *guess_dir_name(const char *repo)
> +{
> +	const char *start, *end;
> +
> +	start = repo;
> +	end = repo + strlen(repo);
> +
> +	/* remove the trailing '/' */
> +	if (repo < end - 1 && end[-1] == '/')
> +		end--;
> +
> +	/* remove the trailing ':', '/' and '.git' */
> +	if (repo < end - 4 && !memcmp(".git", end - 4, 4)) {
> +		end -= 4;
> +		while (repo < end - 1 && end[-1] == '/')
> +			end--;
> +		while (repo < end - 1 && end[-1] == ':')
> +			end--;
> +	}
> +
> +	/* find the last ':' or '/' */
> +	for (start = end - 1; repo <= start; start--) {
> +		if (*start == '/' || *start == ':')
> +			break;
> +	}
> +	/* exclude '/' or ':' itself */
> +	start++;
> +
> +	return xmemdupz(start, end - start);
> +}
> +
> +static int can_create_submodule(unsigned int force, const char *path)
> +{
> +	int cache_pos, dir_in_cache = 0;
> +
> +	if (read_cache() < 0)
> +		die(_("index file corrupt"));
> +
> +	cache_pos = cache_name_pos(path, strlen(path));
> +	if(cache_pos < 0 &&
> +	   directory_exists_in_index(&the_index, path, strlen(path)) == is_cache_directory)
> +		dir_in_cache = 1;
> +
> +	if (!force) {
> +		if (cache_pos >= 0 || dir_in_cache)
> +			die(_("'%s' already exists in the index"), path);
> +	} else {
> +		struct cache_entry *ce = NULL;
> +		if (cache_pos >= 0)
> +			ce = the_index.cache[cache_pos];
> +		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
> +			die(_("'%s' already exists in the index and is not a "
> +			      "submodule"), path);
> +	}
> +	return 0;
> +}
> +
> +static const char *parse_token(const char *cp, int *len)
> +{
> +	const char *p = cp, *start, *end;
> +	char *str;
> +
> +	start = p;
> +	while (*p != ' ')
> +		p++;
> +	end = p;
> +	str = xstrndup(start, end - start);
> +
> +	while(*p == ' ')
> +		p++;
> +
> +	return str;
> +}

This function is not careful enough to avoid buffer overruns. It even
triggers a segmentation fault in our test suite:
https://github.com/gitgitgadget/git/runs/1574891976?check_suite_focus=true#step:6:3152

I need this to make it pass (only tested locally so far, but I trust you
to take the baton from here):

-- snipsnap --
From c28c0cd3ac21d546394335957fbaa350ab287c3f Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 19 Dec 2020 01:02:04 +0100
Subject: [PATCH] fixup??? dir: change the scope of function
 'directory_exists_in_index()'

This fixes the segmentation fault reported in the linux-musl job of our
CI builds. Valgrind has this to say about it:

==32354==
==32354== Process terminating with default action of signal 11 (SIGSEGV)
==32354==  Access not within mapped region at address 0x5C73000
==32354==    at 0x202F5A: parse_token (submodule--helper.c:2837)
==32354==    by 0x20319B: report_fetch_remotes (submodule--helper.c:2871)
==32354==    by 0x2033FD: add_submodule (submodule--helper.c:2898)
==32354==    by 0x204612: module_add (submodule--helper.c:3146)
==32354==    by 0x20478A: cmd_submodule__helper (submodule--helper.c:3202)
==32354==    by 0x12655E: run_builtin (git.c:458)
==32354==    by 0x1269B4: handle_builtin (git.c:712)
==32354==    by 0x126C79: run_argv (git.c:779)
==32354==    by 0x12715C: cmd_main (git.c:913)
==32354==    by 0x2149A2: main (common-main.c:52)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/submodule--helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 4f1d892b9a9..29a6f80b937 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2834,12 +2834,12 @@ static const char *parse_token(const char *cp, int *len)
 	char *str;

 	start = p;
-	while (*p != ' ')
+	while (*p && *p != ' ')
 		p++;
 	end = p;
 	str = xstrndup(start, end - start);

-	while(*p == ' ')
+	while(*p && *p == ' ')
 		p++;

 	return str;
--
2.29.2.windows.1.1.g3464b98ce68


  reply	other threads:[~2020-12-19  0:09 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-14 23:19 [PATCH v3 0/3] submodule: port subcommand add from shell to C Shourya Shukla
2020-12-14 23:19 ` [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
2020-12-19  0:08   ` Johannes Schindelin [this message]
2020-12-19  0:47     ` Junio C Hamano
2020-12-14 23:19 ` [PATCH v3 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
2020-12-14 23:19 ` [PATCH v3 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
2020-12-15 21:44 ` [PATCH v3 0/3] submodule: port subcommand add from shell to C Junio C Hamano
2020-12-17 14:16   ` Shourya Shukla
2020-12-17 22:20     ` Junio C Hamano
2020-12-22 23:42 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.2012190104140.56@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=kaartic.sivaraam@gmail.com \
    --cc=liu.denton@gmail.com \
    --cc=periperidip@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).