All of lore.kernel.org
 help / color / mirror / Atom feed
* smudge/clean filter needs filename
@ 2010-12-18 22:38 Pete Wyckoff
  2010-12-19 21:29 ` [PATCH] convert filter: supply path to external driver Pete Wyckoff
  0 siblings, 1 reply; 17+ messages in thread
From: Pete Wyckoff @ 2010-12-18 22:38 UTC (permalink / raw)
  To: git

I'm using git-p4 to import and work with upstream p4
repositories.  Some of the files are ktext, meaning they expect
expansion of $Id$ and similar identifiers.

Using the filter driver for this file, I can do the "clean" part
easily, but to calculate the "smudge" correctly, I need to know
the filename inside the filter driver.

E.g., inside file foo/Makefile, the clean line:

    # $File$

should be smudged into:

    # $File: //depot/project/foo/Makefile $

I know the //depot/project location from context in the commit
log message that git-p4 produces.  But I don't know the pathname
in the git repo that my smudge script is working on.

Would it make sense to pass that on the command line?  E.g.

    [filter "p4"]
    	clean  = git-p4smudge --clean %s
    	smudge = git-p4smudge --smudge %s

Or maybe put the path in an environment variable?

		-- Pete

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] convert filter: supply path to external driver
  2010-12-18 22:38 smudge/clean filter needs filename Pete Wyckoff
@ 2010-12-19 21:29 ` Pete Wyckoff
  2010-12-19 21:59   ` Junio C Hamano
  2010-12-20  8:04   ` [PATCH] " Johannes Sixt
  0 siblings, 2 replies; 17+ messages in thread
From: Pete Wyckoff @ 2010-12-19 21:29 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

Filtering to support keyword expansion may need the name of
the file being filtered.  In particular, to support p4 keywords
like

    $File: //depot/product/dir/script.sh $

the smudge filter needs to know the name of the file it is
smudging.

Add a "%s" conversion specifier to the gitattribute for filter.
It will be expanded with the path name to the file when invoking
the external filter command.

Signed-off-by: Pete Wyckoff <pw@padd.com>
---

pw@padd.com wrote on Sat, 18 Dec 2010 17:38 -0500:
> I'm using git-p4 to import and work with upstream p4
> repositories.  Some of the files are ktext, meaning they expect
> expansion of $Id$ and similar identifiers.
> 
> Using the filter driver for this file, I can do the "clean" part
> easily, but to calculate the "smudge" correctly, I need to know
> the filename inside the filter driver.

This works fine for me.  It is backward compatible, and leaves
open the possibility of adding other % modifiers if we find
a need later.

I have a filter that handles all the p4 $Keyword$ expansions,
now, using this patch.  It can go into contrib if others would
find it useful.  This seriously reduces many hassles associated
with my git-p4 workflow, where a plethora of ancillary systems
rely on keyword expansion in the source.

Cc Junio for comments as the original author of convert filter
code.

		-- Pete


 Documentation/gitattributes.txt |   12 ++++++++++++
 convert.c                       |   15 ++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 564586b..9ac2138 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -317,6 +317,18 @@ command is "cat").
 	smudge = cat
 ------------------------
 
+If your filter needs the path of the file it is working on,
+you can use the "%s" conversion specification.  It will be
+replaced with the relative path to the file.  This is important
+for keyword substitution that depends on the name of the
+file.  Like this:
+
+------------------------
+[filter "p4"]
+	clean = git-p4-filter --clean %s
+	smudge = git-p4-filter --smudge %s
+------------------------
+
 
 Interaction between checkin/checkout attributes
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/convert.c b/convert.c
index e41a31e..d4c6ed1 100644
--- a/convert.c
+++ b/convert.c
@@ -317,6 +317,7 @@ struct filter_params {
 	const char *src;
 	unsigned long size;
 	const char *cmd;
+	const char *path;
 };
 
 static int filter_buffer(int in, int out, void *data)
@@ -329,7 +330,16 @@ static int filter_buffer(int in, int out, void *data)
 	int write_err, status;
 	const char *argv[] = { NULL, NULL };
 
-	argv[0] = params->cmd;
+	/* replace optional %s with path */
+	struct strbuf cmd = STRBUF_INIT;
+	struct strbuf_expand_dict_entry dict[] = {
+	    "s", params->path,
+	    NULL, NULL,
+	};
+
+	strbuf_expand(&cmd, params->cmd, strbuf_expand_dict_cb, &dict);
+
+	argv[0] = cmd.buf;
 
 	memset(&child_process, 0, sizeof(child_process));
 	child_process.argv = argv;
@@ -349,6 +359,8 @@ static int filter_buffer(int in, int out, void *data)
 	status = finish_command(&child_process);
 	if (status)
 		error("external filter %s failed %d", params->cmd, status);
+
+	strbuf_release(&cmd);
 	return (write_err || status);
 }
 
@@ -376,6 +388,7 @@ static int apply_filter(const char *path, const char *src, size_t len,
 	params.src = src;
 	params.size = len;
 	params.cmd = cmd;
+	params.path = path;
 
 	fflush(NULL);
 	if (start_async(&async))
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] convert filter: supply path to external driver
  2010-12-19 21:29 ` [PATCH] convert filter: supply path to external driver Pete Wyckoff
@ 2010-12-19 21:59   ` Junio C Hamano
  2010-12-20  2:24     ` Jeff King
  2010-12-20 16:09     ` [PATCH v2] " Pete Wyckoff
  2010-12-20  8:04   ` [PATCH] " Johannes Sixt
  1 sibling, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2010-12-19 21:59 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Jeff King

Pete Wyckoff <pw@padd.com> writes:

> Filtering to support keyword expansion may need the name of
> the file being filtered.  In particular, to support p4 keywords
> like
>
>     $File: //depot/product/dir/script.sh $
>
> the smudge filter needs to know the name of the file it is
> smudging.
>
> Add a "%s" conversion specifier to the gitattribute for filter.
> It will be expanded with the path name to the file when invoking
> the external filter command.
>
> Signed-off-by: Pete Wyckoff <pw@padd.com>
> ---
>
> pw@padd.com wrote on Sat, 18 Dec 2010 17:38 -0500:
>> I'm using git-p4 to import and work with upstream p4
>> repositories.  Some of the files are ktext, meaning they expect
>> expansion of $Id$ and similar identifiers.
>> 
>> Using the filter driver for this file, I can do the "clean" part
>> easily, but to calculate the "smudge" correctly, I need to know
>> the filename inside the filter driver.
>
> This works fine for me.  It is backward compatible, and leaves
> open the possibility of adding other % modifiers if we find
> a need later.

This is not backward compatible for people who wanted to use '%' literal
on their filter command line for whatever reason, so please do not
advertise as such.  A fair argument you could make is "Even though this is
not strictly backward compatible, it is very unlikely that people passed a
literal % to their filter command line, and the benefit of being able to
give the pathname information would outweigh the downside of not being
compatible", and people can agree or disagree.

I am personally moderately negative about $any expansion$ (I don't use it
myself, and I don't think sane people use it either).  As far as I can
tell, this should has no impact on the correctness and very little impact
on the performance for people who do not use $any expansion$, so I am Ok
with the patch.

Modulo one worry.  Don't we have, or don't we at least plant to allow us
to have, a facility to cache expensive blob conversion result, similar to
the textconv caching?  How would this change interact with two blobs that
live in different paths?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] convert filter: supply path to external driver
  2010-12-19 21:59   ` Junio C Hamano
@ 2010-12-20  2:24     ` Jeff King
  2010-12-20  5:52       ` david
  2010-12-20 16:09     ` [PATCH v2] " Pete Wyckoff
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2010-12-20  2:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Pete Wyckoff, git

On Sun, Dec 19, 2010 at 01:59:50PM -0800, Junio C Hamano wrote:

> Modulo one worry.  Don't we have, or don't we at least plant to allow us
> to have, a facility to cache expensive blob conversion result, similar to
> the textconv caching?  How would this change interact with two blobs that
> live in different paths?

Yeah, it has been talked about, but I don't think anyone is working on
it (I don't personally use clean/smudge at all, so it is not something I
have thought _that_ much about).

This does definitely complicate matters, as the filtering is no longer a
pure mapping of sha1->sha1. However, I think in practice we could do
just fine by using a multi-level lookup. I.e., mapping a sha1 to be
filtered into a tree. Each tree entry would represent the remaining
cache parameters. In this case, the only other parameter we have is the
path given to the filter (but it could easily be extended to include
other parameters, if they existed, in this or other caching cases).

We only get a high-performance lookup for the first part of the
multi-level (i.e., the sha1), but that's OK if we assume the number of
second-level items is going to be small. Which I think is the case here
(a sha1 will tend to be found only under one or a few names).

An alternative would be to combine all parts of the filter under a
single lookup key. E.g., calculate and store under sha1(sha1(blob) +
filename)). But that means the notes keys are not actual object sha1s,
which throws off pruning.

Anyway, that's just my quick thinking on the subject. I don't see any
reason to restrict a feature just because we might want to cache it in
the future. At the very worst, we could always cache filters which do
not use %s, and make only %s users pay the penalty.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] convert filter: supply path to external driver
  2010-12-20  2:24     ` Jeff King
@ 2010-12-20  5:52       ` david
  0 siblings, 0 replies; 17+ messages in thread
From: david @ 2010-12-20  5:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Pete Wyckoff, git

On Sun, 19 Dec 2010, Jeff King wrote:

> On Sun, Dec 19, 2010 at 01:59:50PM -0800, Junio C Hamano wrote:
>
>> Modulo one worry.  Don't we have, or don't we at least plant to allow us
>> to have, a facility to cache expensive blob conversion result, similar to
>> the textconv caching?  How would this change interact with two blobs that
>> live in different paths?
>
> Yeah, it has been talked about, but I don't think anyone is working on
> it (I don't personally use clean/smudge at all, so it is not something I
> have thought _that_ much about).
>
> This does definitely complicate matters, as the filtering is no longer a
> pure mapping of sha1->sha1. However, I think in practice we could do
> just fine by using a multi-level lookup. I.e., mapping a sha1 to be
> filtered into a tree. Each tree entry would represent the remaining
> cache parameters. In this case, the only other parameter we have is the
> path given to the filter (but it could easily be extended to include
> other parameters, if they existed, in this or other caching cases).
>
> We only get a high-performance lookup for the first part of the
> multi-level (i.e., the sha1), but that's OK if we assume the number of
> second-level items is going to be small. Which I think is the case here
> (a sha1 will tend to be found only under one or a few names).
>
> An alternative would be to combine all parts of the filter under a
> single lookup key. E.g., calculate and store under sha1(sha1(blob) +
> filename)). But that means the notes keys are not actual object sha1s,
> which throws off pruning.
>
> Anyway, that's just my quick thinking on the subject. I don't see any
> reason to restrict a feature just because we might want to cache it in
> the future. At the very worst, we could always cache filters which do
> not use %s, and make only %s users pay the penalty.

or you could cache the stats of the filtered file, including any changes 
that are made.

David Lang

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] convert filter: supply path to external driver
  2010-12-19 21:29 ` [PATCH] convert filter: supply path to external driver Pete Wyckoff
  2010-12-19 21:59   ` Junio C Hamano
@ 2010-12-20  8:04   ` Johannes Sixt
  2010-12-20  8:52     ` Junio C Hamano
  2010-12-20 14:41     ` Pete Wyckoff
  1 sibling, 2 replies; 17+ messages in thread
From: Johannes Sixt @ 2010-12-20  8:04 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Junio C Hamano

Am 12/19/2010 22:29, schrieb Pete Wyckoff:
> Filtering to support keyword expansion may need the name of
> the file being filtered.  In particular, to support p4 keywords
> like
> 
>     $File: //depot/product/dir/script.sh $
> 
> the smudge filter needs to know the name of the file it is
> smudging.
> 
> Add a "%s" conversion specifier to the gitattribute for filter.
> It will be expanded with the path name to the file when invoking
> the external filter command.

What happens if there are any shell special characters in the path name
(or spaces, for that matter). Does this shell-escape the substituted path
name anywhere in the call chain?

-- Hannes

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] convert filter: supply path to external driver
  2010-12-20  8:04   ` [PATCH] " Johannes Sixt
@ 2010-12-20  8:52     ` Junio C Hamano
  2010-12-20 14:41     ` Pete Wyckoff
  1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2010-12-20  8:52 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Pete Wyckoff, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> What happens if there are any shell special characters in the path name
> (or spaces, for that matter). Does this shell-escape the substituted path
> name anywhere in the call chain?

Good eyes.  Thanks.

You would need something like %'s (that is "'" modifier applied to 's'
placeholder in strbuf_expand() to cause the expansion sq'ed), and then the
caller must write something like

	clean = git-p4-filter --clean '%s'

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] convert filter: supply path to external driver
  2010-12-20  8:04   ` [PATCH] " Johannes Sixt
  2010-12-20  8:52     ` Junio C Hamano
@ 2010-12-20 14:41     ` Pete Wyckoff
  1 sibling, 0 replies; 17+ messages in thread
From: Pete Wyckoff @ 2010-12-20 14:41 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, Junio C Hamano

j.sixt@viscovery.net wrote on Mon, 20 Dec 2010 09:04 +0100:
> Am 12/19/2010 22:29, schrieb Pete Wyckoff:
> > Filtering to support keyword expansion may need the name of
> > the file being filtered.  In particular, to support p4 keywords
> > like
> > 
> >     $File: //depot/product/dir/script.sh $
> > 
> > the smudge filter needs to know the name of the file it is
> > smudging.
> > 
> > Add a "%s" conversion specifier to the gitattribute for filter.
> > It will be expanded with the path name to the file when invoking
> > the external filter command.
> 
> What happens if there are any shell special characters in the path name
> (or spaces, for that matter). Does this shell-escape the substituted path
> name anywhere in the call chain?

Good catch---it doesn't.  I'll see if running everything through
sq_quote_buf will help.

Incidentally there appears to be no way to quote spaces in
filenames listed in .gitattributes, although fnmatch wildcards
can be used to work around that.

		-- Pete

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2] convert filter: supply path to external driver
  2010-12-19 21:59   ` Junio C Hamano
  2010-12-20  2:24     ` Jeff King
@ 2010-12-20 16:09     ` Pete Wyckoff
  2010-12-20 17:59       ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Pete Wyckoff @ 2010-12-20 16:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Johannes Sixt

Filtering to support keyword expansion may need the name of
the file being filtered.  In particular, to support p4 keywords
like

    $File: //depot/product/dir/script.sh $

the smudge filter needs to know the name of the file it is
smudging.

Add a "%s" conversion specifier to the gitattribute for filter.
It will be expanded with the path name to the file when invoking
the external filter command.  The path name is quoted and
special characters are escaped to prevent the shell from splitting
incorrectly.

Signed-off-by: Pete Wyckoff <pw@padd.com>
---
gitster@pobox.com wrote on Sun, 19 Dec 2010 13:59 -0800:
> This is not backward compatible for people who wanted to use '%' literal
> on their filter command line for whatever reason, so please do not
> advertise as such.  A fair argument you could make is "Even though this is
> not strictly backward compatible, it is very unlikely that people passed a
> literal % to their filter command line, and the benefit of being able to
> give the pathname information would outweigh the downside of not being
> compatible", and people can agree or disagree.

I overlooked that, but agree it is unlikely anyone was using % in
filter definitions.  Putting the path in an environment variable
is the other option I considered.

> I am personally moderately negative about $any expansion$ (I don't use it
> myself, and I don't think sane people use it either).  As far as I can
> tell, this should has no impact on the correctness and very little impact
> on the performance for people who do not use $any expansion$, so I am Ok
> with the patch.

Keyword expansion is indeed nasty.  My only motivation is to
support working with an upstream that relies on it.

This version of the patch handles quoting of shell
meta-characters, as pointed out by Hannes.  I decided to invoke
sq_quote_buf directly on the path before expanding %s, rather
than writing a dict entry to understand %'s.  There is no
requirement for users to use single-quotes around %s in their
config files, this way, either.

Also added a test case to make sure %s and quoting works as
advertised.

		-- Pete

 Documentation/gitattributes.txt |   12 ++++++++++
 convert.c                       |   22 +++++++++++++++++-
 t/t0021-conversion.sh           |   47 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 80 insertions(+), 1 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 564586b..9ac2138 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -317,6 +317,18 @@ command is "cat").
 	smudge = cat
 ------------------------
 
+If your filter needs the path of the file it is working on,
+you can use the "%s" conversion specification.  It will be
+replaced with the relative path to the file.  This is important
+for keyword substitution that depends on the name of the
+file.  Like this:
+
+------------------------
+[filter "p4"]
+	clean = git-p4-filter --clean %s
+	smudge = git-p4-filter --smudge %s
+------------------------
+
 
 Interaction between checkin/checkout attributes
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/convert.c b/convert.c
index e41a31e..1ef83a0 100644
--- a/convert.c
+++ b/convert.c
@@ -317,6 +317,7 @@ struct filter_params {
 	const char *src;
 	unsigned long size;
 	const char *cmd;
+	const char *path;
 };
 
 static int filter_buffer(int in, int out, void *data)
@@ -329,7 +330,23 @@ static int filter_buffer(int in, int out, void *data)
 	int write_err, status;
 	const char *argv[] = { NULL, NULL };
 
-	argv[0] = params->cmd;
+	/* replace optional %s with path */
+	struct strbuf cmd = STRBUF_INIT;
+	struct strbuf path = STRBUF_INIT;
+	struct strbuf_expand_dict_entry dict[] = {
+	    "s", NULL,
+	    NULL, NULL,
+	};
+
+	/* quote the path to preserve spaces, etc. */
+	sq_quote_buf(&path, params->path);
+	dict[0].value = path.buf;
+
+	/* expand all %s with the quoted path */
+	strbuf_expand(&cmd, params->cmd, strbuf_expand_dict_cb, &dict);
+	strbuf_release(&path);
+
+	argv[0] = cmd.buf;
 
 	memset(&child_process, 0, sizeof(child_process));
 	child_process.argv = argv;
@@ -349,6 +366,8 @@ static int filter_buffer(int in, int out, void *data)
 	status = finish_command(&child_process);
 	if (status)
 		error("external filter %s failed %d", params->cmd, status);
+
+	strbuf_release(&cmd);
 	return (write_err || status);
 }
 
@@ -376,6 +395,7 @@ static int apply_filter(const char *path, const char *src, size_t len,
 	params.src = src;
 	params.size = len;
 	params.cmd = cmd;
+	params.path = path;
 
 	fflush(NULL);
 	if (start_async(&async))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 828e35b..c5c394d 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,4 +93,51 @@ test_expect_success expanded_in_repo '
 	cmp expanded-keywords expected-output
 '
 
+cat <<EOF >argc.sh
+#!$SHELL_PATH
+echo argc: \$# "\$@"
+echo argc running >&2
+EOF
+chmod +x argc.sh
+
+#
+# The use of %s in a filter definition is expanded to the path to
+# the filename being smudged or cleaned.  It must be shell escaped.
+#
+test_expect_success 'shell-escaped filenames' '
+    norm=name-no-magic &&
+    spec=$(echo name:sgl\"dbl\ spc!bang | tr : \\047) &&
+    echo some test text > test
+    cat test > $norm &&
+    cat test > "$spec" &&
+    git add $norm &&
+    git add "$spec" &&
+    git commit -m "add files" &&
+
+    echo "name* filter=argc" > .gitattributes &&
+
+    # delete the files and check them out again, using the smudge filter
+    git config filter.argc.smudge "./argc.sh %s" &&
+    rm $norm "$spec" &&
+    git checkout -- $norm "$spec" &&
+
+    # make sure argc.sh counted the right number of args
+    echo "argc: 1 $norm" > res &&
+    cmp res $norm &&
+    echo "argc: 1 $spec" > res &&
+    cmp res "$spec" &&
+
+    # %s with other args
+    git config filter.argc.smudge "./argc.sh %s --myword" &&
+    rm $norm "$spec" &&
+    git checkout -- $norm "$spec" &&
+
+    # make sure argc.sh counted the right number of args
+    echo "argc: 2 $norm --myword" > res &&
+    cmp res $norm &&
+    echo "argc: 2 $spec --myword" > res &&
+    cmp res "$spec" &&
+    :
+'
+
 test_done
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] convert filter: supply path to external driver
  2010-12-20 16:09     ` [PATCH v2] " Pete Wyckoff
@ 2010-12-20 17:59       ` Junio C Hamano
  2010-12-21 13:44         ` [PATCH v3] " Pete Wyckoff
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2010-12-20 17:59 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Jeff King, Johannes Sixt

Pete Wyckoff <pw@padd.com> writes:

> This version of the patch handles quoting of shell
> meta-characters, as pointed out by Hannes.  I decided to invoke
> sq_quote_buf directly on the path before expanding %s, rather
> than writing a dict entry to understand %'s.

Ahh, that's much more preferable than my horrible suggestion ;-)  If the
user wants to have concatenation with other parameters, that can be left
to the shell that is invoked, e.g. "--clean foo/%s" for "hello world.c"
would expand to "--clean foo/'hello world.c'" and does what we want.

Another nitpick is if 's' is the right letter to use for the pathname
information.  I think you took 's' after "string", but if this is to be
extensible, you should anticipate that later there will be other kinds of
information you may want to throw at the filters, and expect that some of
them also will be of type "string", and you will have painted the person
who wants to add that new information into a tight corner as you already
took the valuable 's'.

Which would suggest that you shouldn't be naming the placeholder after the
type but after what the placeholder means, no?  Perhaps %f (for filename)
would be a better choice?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3] convert filter: supply path to external driver
  2010-12-20 17:59       ` Junio C Hamano
@ 2010-12-21 13:44         ` Pete Wyckoff
  2010-12-21 18:19           ` Jonathan Nieder
  0 siblings, 1 reply; 17+ messages in thread
From: Pete Wyckoff @ 2010-12-21 13:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Johannes Sixt

Filtering to support keyword expansion may need the name of
the file being filtered.  In particular, to support p4 keywords
like

    $File: //depot/product/dir/script.sh $

the smudge filter needs to know the name of the file it is
smudging.

Add a "%f" conversion specifier to the gitattribute for filter.
It will be expanded with the path name to the file when invoking
the external filter command.  The path name is quoted and
special characters are escaped to prevent the shell from splitting
incorrectly.

Signed-off-by: Pete Wyckoff <pw@padd.com>
---
gitster@pobox.com wrote on Mon, 20 Dec 2010 09:59 -0800:
> Another nitpick is if 's' is the right letter to use for the pathname
> information.  I think you took 's' after "string", but if this is to be
> extensible, you should anticipate that later there will be other kinds of
> information you may want to throw at the filters, and expect that some of
> them also will be of type "string", and you will have painted the person
> who wants to add that new information into a tight corner as you already
> took the valuable 's'.
> 
> Which would suggest that you shouldn't be naming the placeholder after the
> type but after what the placeholder means, no?  Perhaps %f (for filename)
> would be a better choice?

Agree.  The "s" doesn't convey much meaning.  Nothing in
pretty-formats.txt really applies either, but "f" makes sense.

		-- Pete

 Documentation/gitattributes.txt |   12 ++++++++++
 convert.c                       |   22 +++++++++++++++++-
 t/t0021-conversion.sh           |   47 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 80 insertions(+), 1 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 564586b..1afcf01 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -317,6 +317,18 @@ command is "cat").
 	smudge = cat
 ------------------------
 
+If your filter needs the path of the file it is working on,
+you can use the "%f" conversion specification.  It will be
+replaced with the relative path to the file.  This is important
+for keyword substitution that depends on the name of the
+file.  Like this:
+
+------------------------
+[filter "p4"]
+	clean = git-p4-filter --clean %f
+	smudge = git-p4-filter --smudge %f
+------------------------
+
 
 Interaction between checkin/checkout attributes
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/convert.c b/convert.c
index e41a31e..8f020bc 100644
--- a/convert.c
+++ b/convert.c
@@ -317,6 +317,7 @@ struct filter_params {
 	const char *src;
 	unsigned long size;
 	const char *cmd;
+	const char *path;
 };
 
 static int filter_buffer(int in, int out, void *data)
@@ -329,7 +330,23 @@ static int filter_buffer(int in, int out, void *data)
 	int write_err, status;
 	const char *argv[] = { NULL, NULL };
 
-	argv[0] = params->cmd;
+	/* apply % substitution to cmd */
+	struct strbuf cmd = STRBUF_INIT;
+	struct strbuf path = STRBUF_INIT;
+	struct strbuf_expand_dict_entry dict[] = {
+	    "f", NULL,
+	    NULL, NULL,
+	};
+
+	/* quote the path to preserve spaces, etc. */
+	sq_quote_buf(&path, params->path);
+	dict[0].value = path.buf;
+
+	/* expand all %f with the quoted path */
+	strbuf_expand(&cmd, params->cmd, strbuf_expand_dict_cb, &dict);
+	strbuf_release(&path);
+
+	argv[0] = cmd.buf;
 
 	memset(&child_process, 0, sizeof(child_process));
 	child_process.argv = argv;
@@ -349,6 +366,8 @@ static int filter_buffer(int in, int out, void *data)
 	status = finish_command(&child_process);
 	if (status)
 		error("external filter %s failed %d", params->cmd, status);
+
+	strbuf_release(&cmd);
 	return (write_err || status);
 }
 
@@ -376,6 +395,7 @@ static int apply_filter(const char *path, const char *src, size_t len,
 	params.src = src;
 	params.size = len;
 	params.cmd = cmd;
+	params.path = path;
 
 	fflush(NULL);
 	if (start_async(&async))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 828e35b..534a735 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,4 +93,51 @@ test_expect_success expanded_in_repo '
 	cmp expanded-keywords expected-output
 '
 
+cat <<EOF >argc.sh
+#!$SHELL_PATH
+echo argc: \$# "\$@"
+echo argc running >&2
+EOF
+chmod +x argc.sh
+
+#
+# The use of %f in a filter definition is expanded to the path to
+# the filename being smudged or cleaned.  It must be shell escaped.
+#
+test_expect_success 'shell-escaped filenames' '
+    norm=name-no-magic &&
+    spec=$(echo name:sgl\"dbl\ spc!bang | tr : \\047) &&
+    echo some test text > test
+    cat test > $norm &&
+    cat test > "$spec" &&
+    git add $norm &&
+    git add "$spec" &&
+    git commit -m "add files" &&
+
+    echo "name* filter=argc" > .gitattributes &&
+
+    # delete the files and check them out again, using the smudge filter
+    git config filter.argc.smudge "./argc.sh %f" &&
+    rm $norm "$spec" &&
+    git checkout -- $norm "$spec" &&
+
+    # make sure argc.sh counted the right number of args
+    echo "argc: 1 $norm" > res &&
+    cmp res $norm &&
+    echo "argc: 1 $spec" > res &&
+    cmp res "$spec" &&
+
+    # %f with other args
+    git config filter.argc.smudge "./argc.sh %f --myword" &&
+    rm $norm "$spec" &&
+    git checkout -- $norm "$spec" &&
+
+    # make sure argc.sh counted the right number of args
+    echo "argc: 2 $norm --myword" > res &&
+    cmp res $norm &&
+    echo "argc: 2 $spec --myword" > res &&
+    cmp res "$spec" &&
+    :
+'
+
 test_done
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3] convert filter: supply path to external driver
  2010-12-21 13:44         ` [PATCH v3] " Pete Wyckoff
@ 2010-12-21 18:19           ` Jonathan Nieder
  2010-12-21 20:33             ` [PATCH v4] " Pete Wyckoff
  0 siblings, 1 reply; 17+ messages in thread
From: Jonathan Nieder @ 2010-12-21 18:19 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: Junio C Hamano, git, Jeff King, Johannes Sixt

Hi,

Pete Wyckoff wrote:

> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh

Nitpicks (some silly, some not):

> @@ -93,4 +93,51 @@ test_expect_success expanded_in_repo '
>  	cmp expanded-keywords expected-output
>  '
>  
> +cat <<EOF >argc.sh
> +#!$SHELL_PATH
> +echo argc: \$# "\$@"
> +echo argc running >&2
> +EOF
> +chmod +x argc.sh

You can embed this in a test_expect_success stanza (like the next
one or the earlier "setup") like so:

	cat <<-EOF >argc.sh &&
	#!$SHELL_PATH
	...
	EOF
	chmod +x argc.sh &&

This way if the "chmod" fails on some platform the test would
catch that.

> +
> +#
> +# The use of %f in a filter definition is expanded to the path to
> +# the filename being smudged or cleaned.  It must be shell escaped.
> +#

I'd even prefer to see this comment inside the test_expect_success
assertion so it can be printed when running the test with "-v".
But I suppose consistency with the other test in the script suggests
otherwise.

[...]
> +    echo some test text > test
> +    cat test > $norm &&
> +    cat test > "$spec" &&

Missing && after "> test".  Probably best to remove the space
after > (just for consistency[1]).  Also, please use tabs to indent.

[...]
> +    # make sure argc.sh counted the right number of args
> +    echo "argc: 1 $norm" > res &&
> +    cmp res $norm &&

test_cmp?  (for nicer output)  See t/README.

[...]
> +    # %f with other args
> +    git config filter.argc.smudge "./argc.sh %f --myword" &&
> +    rm $norm "$spec" &&
> +    git checkout -- $norm "$spec" &&
> +
> +    # make sure argc.sh counted the right number of args
> +    echo "argc: 2 $norm --myword" > res &&
> +    cmp res $norm &&
> +    echo "argc: 2 $spec --myword" > res &&
> +    cmp res "$spec" &&

Probably would be clearer if this were a separate test assertion.

> +    :
> +'
> +
>  test_done

Thanks for the tests.  I haven't looked at the substance, alas,
but hope that helps nonetheless.

Jonathan

[1] Trumped up justification for the "no space after >" style: if I
always include a space after, I would be tempted to use

	noisy_command > /dev/null 2> &1

But that does not work because >& is recognized as a single token.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v4] convert filter: supply path to external driver
  2010-12-21 18:19           ` Jonathan Nieder
@ 2010-12-21 20:33             ` Pete Wyckoff
  2010-12-21 21:24               ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Pete Wyckoff @ 2010-12-21 20:33 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git, Jeff King, Johannes Sixt

Filtering to support keyword expansion may need the name of
the file being filtered.  In particular, to support p4 keywords
like

    $File: //depot/product/dir/script.sh $

the smudge filter needs to know the name of the file it is
smudging.

Add a "%f" conversion specifier to the gitattribute for filter.
It will be expanded with the path name to the file when invoking
the external filter command.  The path name is quoted and
special characters are escaped to prevent the shell from splitting
incorrectly.

Signed-off-by: Pete Wyckoff <pw@padd.com>
---
jrnieder@gmail.com wrote on Tue, 21 Dec 2010 12:19 -0600:
> [detailed review]

Thanks for the nitpicks.  I put the argc.sh and chmod +x into a
setup test.  Tried to put some more in there, and to break up the
test in two, but did not want to duplicate the complex
calculation of "norm" and "spec" variables.  So ended up with the
small setup, and just one big test still.

I couldn't quite bring myself to delete the nice spaces in
redirections like "> test".  Rest of the usage in t/ seems
to be about 1/3 for space, 2/3 against.

Got the test_cmp, tabs and missing &&, too, thanks for finding
those.

		-- Pete

 Documentation/gitattributes.txt |   12 +++++++++
 convert.c                       |   22 +++++++++++++++++-
 t/t0021-conversion.sh           |   48 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+), 1 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 564586b..1afcf01 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -317,6 +317,18 @@ command is "cat").
 	smudge = cat
 ------------------------
 
+If your filter needs the path of the file it is working on,
+you can use the "%f" conversion specification.  It will be
+replaced with the relative path to the file.  This is important
+for keyword substitution that depends on the name of the
+file.  Like this:
+
+------------------------
+[filter "p4"]
+	clean = git-p4-filter --clean %f
+	smudge = git-p4-filter --smudge %f
+------------------------
+
 
 Interaction between checkin/checkout attributes
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/convert.c b/convert.c
index e41a31e..8f020bc 100644
--- a/convert.c
+++ b/convert.c
@@ -317,6 +317,7 @@ struct filter_params {
 	const char *src;
 	unsigned long size;
 	const char *cmd;
+	const char *path;
 };
 
 static int filter_buffer(int in, int out, void *data)
@@ -329,7 +330,23 @@ static int filter_buffer(int in, int out, void *data)
 	int write_err, status;
 	const char *argv[] = { NULL, NULL };
 
-	argv[0] = params->cmd;
+	/* apply % substitution to cmd */
+	struct strbuf cmd = STRBUF_INIT;
+	struct strbuf path = STRBUF_INIT;
+	struct strbuf_expand_dict_entry dict[] = {
+	    "f", NULL,
+	    NULL, NULL,
+	};
+
+	/* quote the path to preserve spaces, etc. */
+	sq_quote_buf(&path, params->path);
+	dict[0].value = path.buf;
+
+	/* expand all %f with the quoted path */
+	strbuf_expand(&cmd, params->cmd, strbuf_expand_dict_cb, &dict);
+	strbuf_release(&path);
+
+	argv[0] = cmd.buf;
 
 	memset(&child_process, 0, sizeof(child_process));
 	child_process.argv = argv;
@@ -349,6 +366,8 @@ static int filter_buffer(int in, int out, void *data)
 	status = finish_command(&child_process);
 	if (status)
 		error("external filter %s failed %d", params->cmd, status);
+
+	strbuf_release(&cmd);
 	return (write_err || status);
 }
 
@@ -376,6 +395,7 @@ static int apply_filter(const char *path, const char *src, size_t len,
 	params.src = src;
 	params.size = len;
 	params.cmd = cmd;
+	params.path = path;
 
 	fflush(NULL);
 	if (start_async(&async))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 828e35b..69c22a6 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,4 +93,52 @@ test_expect_success expanded_in_repo '
 	cmp expanded-keywords expected-output
 '
 
+test_expect_success 'filter shell-escaped filenames setup' '
+	cat >argc.sh <<-EOF &&
+	#!$SHELL_PATH
+	echo argc: \$# "\$@"
+	EOF
+	chmod +x argc.sh
+'
+
+# The use of %f in a filter definition is expanded to the path to
+# the filename being smudged or cleaned.  It must be shell escaped.
+# First, set up some interesting file names and pet them in
+# .gitattributes.
+test_expect_success 'filter shell-escaped filenames test' '
+	norm=name-no-magic &&
+	spec=$(echo name:sgl\"dbl\ spc!bang | tr : \\047) &&
+	echo some test text > test &&
+	cat test > $norm &&
+	cat test > "$spec" &&
+	git add $norm &&
+	git add "$spec" &&
+	git commit -q -m "add files" &&
+	echo "name* filter=argc" > .gitattributes &&
+
+	# delete the files and check them out again, using a smudge filter
+	# that will count the args and echo the command-line back to us
+	git config filter.argc.smudge "./argc.sh %f" &&
+	rm $norm "$spec" &&
+	git checkout -- $norm "$spec" &&
+
+	# make sure argc.sh counted the right number of args
+	echo "argc: 1 $norm" > res &&
+	test_cmp res $norm &&
+	echo "argc: 1 $spec" > res &&
+	test_cmp res "$spec" &&
+
+	# do the same thing, but with more args in the filter expression
+	git config filter.argc.smudge "./argc.sh %f --myword" &&
+	rm $norm "$spec" &&
+	git checkout -- $norm "$spec" &&
+
+	# make sure argc.sh counted the right number of args
+	echo "argc: 2 $norm --myword" > res &&
+	test_cmp res $norm &&
+	echo "argc: 2 $spec --myword" > res &&
+	test_cmp res "$spec" &&
+	:
+'
+
 test_done
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v4] convert filter: supply path to external driver
  2010-12-21 20:33             ` [PATCH v4] " Pete Wyckoff
@ 2010-12-21 21:24               ` Junio C Hamano
  2010-12-22 14:40                 ` [PATCH v5] " Pete Wyckoff
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2010-12-21 21:24 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: Jonathan Nieder, git, Jeff King, Johannes Sixt

Pete Wyckoff <pw@padd.com> writes:

> Filtering to support keyword expansion may need the name of
> the file being filtered.  In particular, to support p4 keywords
> like
>
>     $File: //depot/product/dir/script.sh $
>
> the smudge filter needs to know the name of the file it is
> smudging.
>
> Add a "%f" conversion specifier to the gitattribute for filter.
> It will be expanded with the path name to the file when invoking
> the external filter command.  The path name is quoted and
> special characters are escaped to prevent the shell from splitting
> incorrectly.

You do not specify the filter in attributes file, and this is not just
about splitting incorrectly (think $var substitutions).  I rephrased the
last paragraph when I tentatively queued v3 (and then I had to discard it
after seeing this round) like this:

    Allow "%f" in the custom filter command line specified in the
    configuration.  This will be substituted by the filename inside a
    single-quote pair to be passed to the shell.

> I couldn't quite bring myself to delete the nice spaces in
> redirections like "> test".  Rest of the usage in t/ seems
> to be about 1/3 for space, 2/3 against.

While existing mistakes by others do not make a good excuse to throw
unnecessary code-churn patches at me, it is not an excuse to introduce
more mistakes of the same kind in new code.

>  Documentation/gitattributes.txt |   12 +++++++++
>  convert.c                       |   22 +++++++++++++++++-
>  t/t0021-conversion.sh           |   48 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 81 insertions(+), 1 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 564586b..1afcf01 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -317,6 +317,18 @@ command is "cat").
>  	smudge = cat
>  ------------------------
>  
> +If your filter needs the path of the file it is working on,
> +you can use the "%f" conversion specification.  It will be
> +replaced with the relative path to the file.  This is important
> +for keyword substitution that depends on the name of the
> +file.  Like this:

Maybe "important" to you, but not really.  Just let the reader decide the
importance.  I rephrased this when I tentatively queued v3 (and then I had
to discard it after seeing this round) like this:

    Sequence "%f" on the filter command line is replaced with the name of
    the file the filter is working on (a filter can use this in keyword
    substitution, for example):

> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index 828e35b..69c22a6 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -93,4 +93,52 @@ test_expect_success expanded_in_repo '
>  	cmp expanded-keywords expected-output
>  '
>  
> +test_expect_success 'filter shell-escaped filenames setup' '
> +	cat >argc.sh <<-EOF &&
> +	#!$SHELL_PATH
> +	echo argc: \$# "\$@"
> +	EOF
> +	chmod +x argc.sh
> +'
>
> +# The use of %f in a filter definition is expanded to the path to
> +# the filename being smudged or cleaned.  It must be shell escaped.
> +# First, set up some interesting file names and pet them in
> +# .gitattributes.
> +test_expect_success 'filter shell-escaped filenames test' '
> +	norm=name-no-magic &&
> +	spec=$(echo name:sgl\"dbl\ spc!bang | tr : \\047) &&

I think this is going overboard.  The correctness of sq_quote_buf() is not
what we are really testing with this test; we are primarily interested in
testing that '%f' is substituted here.  You can and should drop at least
dq from the funny pathname, so that this can be run on Windows as well.
Perhaps (note two SPs between "name" and "with"):

	special="name  with '\''sq'\'' and \$x" &&

Do not over-abbreviate variable names unnecessarily; it will only confuse
the readers.  "spec" typically stands for "specification", but you don't
mean that here.

Also if you define the filter as "sh ./argc.sh %f", you should be able to
stop worrying about the "chmod +x" failing, no?

> +	echo some test text > test &&
> +	cat test > $norm &&
> +	cat test > "$spec" &&

Quote both for consistency.

	cat test >"$norm"
        cat test >"$spec"

> +	git add $norm &&
> +	git add "$spec" &&

Why add them separately?

> +	echo "name* filter=argc" > .gitattributes &&
> +
> +	# delete the files and check them out again, using a smudge filter
> +	# that will count the args and echo the command-line back to us
> +	git config filter.argc.smudge "./argc.sh %f" &&
> +	rm $norm "$spec" &&
> +	git checkout -- $norm "$spec" &&
> +
> +	# make sure argc.sh counted the right number of args
> +	echo "argc: 1 $norm" > res &&

Perhaps "res" stands for "result", but both are misleading as it is
unclear if that is an expected result or an actual result.

	echo "argc: 1 $norm" >expect &&

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v5] convert filter: supply path to external driver
  2010-12-21 21:24               ` Junio C Hamano
@ 2010-12-22 14:40                 ` Pete Wyckoff
  2010-12-22 18:10                   ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Pete Wyckoff @ 2010-12-22 14:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git, Jeff King, Johannes Sixt

Filtering to support keyword expansion may need the name of
the file being filtered.  In particular, to support p4 keywords
like

    $File: //depot/product/dir/script.sh $

the smudge filter needs to know the name of the file it is
smudging.

Allow "%f" in the custom filter command line specified in the
configuration.  This will be substituted by the filename
inside a single-quote pair to be passed to the shell.

Signed-off-by: Pete Wyckoff <pw@padd.com>
---

gitster@pobox.com wrote on Tue, 21 Dec 2010 13:24 -0800:
> [detailed review]

Changes from v4:
- Updated commit message, docs per Junio mods
- Removed space after shell redirection ">"
- Simplified test case per Junio recommendations

Hopefully this is the last round of review and it is
safe to stage now.  Thanks,

		-- Pete

 Documentation/gitattributes.txt |   10 +++++++++
 convert.c                       |   22 +++++++++++++++++++-
 t/t0021-conversion.sh           |   42 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 73 insertions(+), 1 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 564586b..8c2fdd1 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -317,6 +317,16 @@ command is "cat").
 	smudge = cat
 ------------------------
 
+Sequence "%f" on the filter command line is replaced with the name of
+the file the filter is working on.  A filter might use this in keyword
+substitution.  For example:
+
+------------------------
+[filter "p4"]
+	clean = git-p4-filter --clean %f
+	smudge = git-p4-filter --smudge %f
+------------------------
+
 
 Interaction between checkin/checkout attributes
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/convert.c b/convert.c
index e41a31e..8f020bc 100644
--- a/convert.c
+++ b/convert.c
@@ -317,6 +317,7 @@ struct filter_params {
 	const char *src;
 	unsigned long size;
 	const char *cmd;
+	const char *path;
 };
 
 static int filter_buffer(int in, int out, void *data)
@@ -329,7 +330,23 @@ static int filter_buffer(int in, int out, void *data)
 	int write_err, status;
 	const char *argv[] = { NULL, NULL };
 
-	argv[0] = params->cmd;
+	/* apply % substitution to cmd */
+	struct strbuf cmd = STRBUF_INIT;
+	struct strbuf path = STRBUF_INIT;
+	struct strbuf_expand_dict_entry dict[] = {
+	    "f", NULL,
+	    NULL, NULL,
+	};
+
+	/* quote the path to preserve spaces, etc. */
+	sq_quote_buf(&path, params->path);
+	dict[0].value = path.buf;
+
+	/* expand all %f with the quoted path */
+	strbuf_expand(&cmd, params->cmd, strbuf_expand_dict_cb, &dict);
+	strbuf_release(&path);
+
+	argv[0] = cmd.buf;
 
 	memset(&child_process, 0, sizeof(child_process));
 	child_process.argv = argv;
@@ -349,6 +366,8 @@ static int filter_buffer(int in, int out, void *data)
 	status = finish_command(&child_process);
 	if (status)
 		error("external filter %s failed %d", params->cmd, status);
+
+	strbuf_release(&cmd);
 	return (write_err || status);
 }
 
@@ -376,6 +395,7 @@ static int apply_filter(const char *path, const char *src, size_t len,
 	params.src = src;
 	params.size = len;
 	params.cmd = cmd;
+	params.path = path;
 
 	fflush(NULL);
 	if (start_async(&async))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 828e35b..aacfd00 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,4 +93,46 @@ test_expect_success expanded_in_repo '
 	cmp expanded-keywords expected-output
 '
 
+# The use of %f in a filter definition is expanded to the path to
+# the filename being smudged or cleaned.  It must be shell escaped.
+# First, set up some interesting file names and pet them in
+# .gitattributes.
+test_expect_success 'filter shell-escaped filenames' '
+	cat >argc.sh <<-EOF &&
+	#!$SHELL_PATH
+	echo argc: \$# "\$@"
+	EOF
+	normal=name-no-magic &&
+	special="name  with '\''sq'\'' and \$x" &&
+	echo some test text >"$normal" &&
+	echo some test text >"$special" &&
+	git add "$normal" "$special" &&
+	git commit -q -m "add files" &&
+	echo "name* filter=argc" >.gitattributes &&
+
+	# delete the files and check them out again, using a smudge filter
+	# that will count the args and echo the command-line back to us
+	git config filter.argc.smudge "sh ./argc.sh %f" &&
+	rm "$normal" "$special" &&
+	git checkout -- "$normal" "$special" &&
+
+	# make sure argc.sh counted the right number of args
+	echo "argc: 1 $normal" >expect &&
+	test_cmp expect "$normal" &&
+	echo "argc: 1 $special" >expect &&
+	test_cmp expect "$special" &&
+
+	# do the same thing, but with more args in the filter expression
+	git config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" &&
+	rm "$normal" "$special" &&
+	git checkout -- "$normal" "$special" &&
+
+	# make sure argc.sh counted the right number of args
+	echo "argc: 2 $normal --my-extra-arg" >expect &&
+	test_cmp expect "$normal" &&
+	echo "argc: 2 $special --my-extra-arg" >expect &&
+	test_cmp expect "$special" &&
+	:
+'
+
 test_done
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v5] convert filter: supply path to external driver
  2010-12-22 14:40                 ` [PATCH v5] " Pete Wyckoff
@ 2010-12-22 18:10                   ` Junio C Hamano
  2010-12-22 23:22                     ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2010-12-22 18:10 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: Jonathan Nieder, git, Jeff King, Johannes Sixt

Just FYI.

As I build stuff with -Werror to be on the safe side, I had to fix this up
a bit before queueing the patch.

 convert.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/convert.c b/convert.c
index 0b24790..d5aebed 100644
--- a/convert.c
+++ b/convert.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "attr.h"
 #include "run-command.h"
+#include "quote.h"
 
 /*
  * convert.c - convert a file when checking it out and checking it in.
@@ -335,8 +336,8 @@ static int filter_buffer(int in, int out, void *data)
 	struct strbuf cmd = STRBUF_INIT;
 	struct strbuf path = STRBUF_INIT;
 	struct strbuf_expand_dict_entry dict[] = {
-	    "f", NULL,
-	    NULL, NULL,
+		{ "f", NULL, },
+		{ NULL, NULL, },
 	};
 
 	/* quote the path to preserve spaces, etc. */

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v5] convert filter: supply path to external driver
  2010-12-22 18:10                   ` Junio C Hamano
@ 2010-12-22 23:22                     ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2010-12-22 23:22 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: Jonathan Nieder, git, Jeff King, Johannes Sixt

Here is another, just FYI.

-- >8 --
Subject: [PATCH] t0021: avoid getting filter killed with SIGPIPE

The fake filter did not read from the standard input at all,
which caused the calling side to die with SIGPIPE, depending
on the timing.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t0021-conversion.sh |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index aacfd00..9078b84 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -100,6 +100,7 @@ test_expect_success expanded_in_repo '
 test_expect_success 'filter shell-escaped filenames' '
 	cat >argc.sh <<-EOF &&
 	#!$SHELL_PATH
+	cat >/dev/null
 	echo argc: \$# "\$@"
 	EOF
 	normal=name-no-magic &&
-- 
1.7.3.4.811.g38bda

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-12-22 23:23 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-18 22:38 smudge/clean filter needs filename Pete Wyckoff
2010-12-19 21:29 ` [PATCH] convert filter: supply path to external driver Pete Wyckoff
2010-12-19 21:59   ` Junio C Hamano
2010-12-20  2:24     ` Jeff King
2010-12-20  5:52       ` david
2010-12-20 16:09     ` [PATCH v2] " Pete Wyckoff
2010-12-20 17:59       ` Junio C Hamano
2010-12-21 13:44         ` [PATCH v3] " Pete Wyckoff
2010-12-21 18:19           ` Jonathan Nieder
2010-12-21 20:33             ` [PATCH v4] " Pete Wyckoff
2010-12-21 21:24               ` Junio C Hamano
2010-12-22 14:40                 ` [PATCH v5] " Pete Wyckoff
2010-12-22 18:10                   ` Junio C Hamano
2010-12-22 23:22                     ` Junio C Hamano
2010-12-20  8:04   ` [PATCH] " Johannes Sixt
2010-12-20  8:52     ` Junio C Hamano
2010-12-20 14:41     ` Pete Wyckoff

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.