All of lore.kernel.org
 help / color / mirror / Atom feed
* Infinite loop in cascade_filter_fn()
@ 2011-11-23 17:40 Henrik Grubbström
  2011-11-25 14:31 ` Carlos Martín Nieto
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Henrik Grubbström @ 2011-11-23 17:40 UTC (permalink / raw)
  To: Git Mailing list; +Cc: Junio C Hamano

[-- Attachment #1: Type: TEXT/PLAIN, Size: 689 bytes --]

Hi.

My git repository walker just got bitten by what seems to be a reasonably 
new bug in convert.c:cascade_filter_fn() (git 1.7.8.rc3 (gentoo)).

How to reproduce:

   git clone git@github.com:pikelang/Pike.git

   git checkout -f 0e2080f838c6f0bc7d670ac7549676a353451dca^

   git checkout -f 0e2080f838c6f0bc7d670ac7549676a353451dca

The first two commands complete as expected, while the last hangs forever.
Performing the same with git 1.7.6.4 works as expected.

The problematic file seems to be /src/modules/_Crypto/rijndael_ecb_vt.txt 
which has the attributes: text ident eol=crlf

Thanks,

--
Henrik Grubbström					grubba@grubba.org
Roxen Internet Software AB				grubba@roxen.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-23 17:40 Infinite loop in cascade_filter_fn() Henrik Grubbström
@ 2011-11-25 14:31 ` Carlos Martín Nieto
  2011-11-25 15:38 ` Carlos Martín Nieto
  2011-11-25 15:43 ` Henrik Grubbström
  2 siblings, 0 replies; 16+ messages in thread
From: Carlos Martín Nieto @ 2011-11-25 14:31 UTC (permalink / raw)
  To: Henrik Grubbström; +Cc: Git Mailing list, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1326 bytes --]

On Wed, Nov 23, 2011 at 06:40:47PM +0100, Henrik Grubbström wrote:
> Hi.
> 
> My git repository walker just got bitten by what seems to be a
> reasonably new bug in convert.c:cascade_filter_fn() (git 1.7.8.rc3
> (gentoo)).

It looks like it's a bug between cascade_filter_fn and the actual
filter function lf_to_crlf_filter_fn that gets triggered when the
output buffer is too small. In this particular case, *isize_p=378 and
*osize_p=1 which causes cascade_filter_fn to feed the filter data
which it can't process because it doesn't have anywhere to put it.

I think that the function assumes that the output buffer is always
large enough, but there are many indirections, so it might be an
off-by-one.

> 
> How to reproduce:
> 
>   git clone git@github.com:pikelang/Pike.git
> 
>   git checkout -f 0e2080f838c6f0bc7d670ac7549676a353451dca^
> 
>   git checkout -f 0e2080f838c6f0bc7d670ac7549676a353451dca
> 
> The first two commands complete as expected, while the last hangs forever.
> Performing the same with git 1.7.6.4 works as expected.
> 
> The problematic file seems to be
> /src/modules/_Crypto/rijndael_ecb_vt.txt which has the attributes:
> text ident eol=crlf
> 
> Thanks,
> 
> --
> Henrik Grubbström					grubba@grubba.org
> Roxen Internet Software AB				grubba@roxen.com


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-23 17:40 Infinite loop in cascade_filter_fn() Henrik Grubbström
  2011-11-25 14:31 ` Carlos Martín Nieto
@ 2011-11-25 15:38 ` Carlos Martín Nieto
  2011-11-25 16:14   ` Henrik Grubbström
  2011-11-25 15:43 ` Henrik Grubbström
  2 siblings, 1 reply; 16+ messages in thread
From: Carlos Martín Nieto @ 2011-11-25 15:38 UTC (permalink / raw)
  To: Henrik Grubbström; +Cc: Git Mailing list, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2907 bytes --]

On Wed, Nov 23, 2011 at 06:40:47PM +0100, Henrik Grubbström wrote:
> Hi.
> 
> My git repository walker just got bitten by what seems to be a
> reasonably new bug in convert.c:cascade_filter_fn() (git 1.7.8.rc3
> (gentoo)).
> 
> How to reproduce:
> 
>   git clone git@github.com:pikelang/Pike.git
> 
>   git checkout -f 0e2080f838c6f0bc7d670ac7549676a353451dca^
> 
>   git checkout -f 0e2080f838c6f0bc7d670ac7549676a353451dca
> 
> The first two commands complete as expected, while the last hangs forever.
> Performing the same with git 1.7.6.4 works as expected.
> 
> The problematic file seems to be
> /src/modules/_Crypto/rijndael_ecb_vt.txt which has the attributes:
> text ident eol=crlf

It looks like you won the lottery. The problem was that the output
buffer only has one byte available when we see a LF. We check whether
there is enough space (two bytes) to store CRLF in the output buffer,
see that there isn't and return. cascade_filter_fn sees that the
buffer hasn't been written fully and calls lf_to_crlf_filter_fn with
the same output buffer, which we still can't fill, because it's too
short.

This patch fixes this, but I think it would still break if the LF is
at the end of the file. Changing the `if (!input)` to put the LF in
the output buffer may or may not be the right soulution. I feel like
this should be handled by cascade_filter_fn rather than the actual
filter somehow, but Junio's comment (4ae66704 'stream filter: add "no
more input" to the filters') suggests otherwise.

I'm working on a cleaner patch that takes care of a bit of state, but
this is the general idea.

   cmn
--- 8< ---
Subject: [PATCH] convert: don't loop indefintely if at LF-to-CRLF streaming

If we find a LF when the output buffer is only has one byte remaining,
cascade_filter_fn won't notice that we need more input and won't drain
the output buffer.

In such a case, store whether we've outputted the CR so we can retake
it from there.

Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
---
 convert.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/convert.c b/convert.c
index 86e9c29..4218f40 100644
--- a/convert.c
+++ b/convert.c
@@ -881,6 +881,7 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 				char *output, size_t *osize_p)
 {
 	size_t count;
+	static int put_cr = 0;
 
 	if (!input)
 		return 0; /* we do not keep any states */
@@ -890,10 +891,14 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 		for (i = o = 0; o < *osize_p && i < count; i++) {
 			char ch = input[i];
 			if (ch == '\n') {
-				if (o + 1 < *osize_p)
+				if (put_cr) {
+					put_cr = 0;
+				} else {
 					output[o++] = '\r';
-				else
-					break;
+					put_cr = 1;
+					i--;
+					continue;
+				}
 			}
 			output[o++] = ch;
 		}
-- 
1.7.8.rc3.31.g017d1


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-23 17:40 Infinite loop in cascade_filter_fn() Henrik Grubbström
  2011-11-25 14:31 ` Carlos Martín Nieto
  2011-11-25 15:38 ` Carlos Martín Nieto
@ 2011-11-25 15:43 ` Henrik Grubbström
  2011-11-25 15:53   ` Carlos Martín Nieto
  2 siblings, 1 reply; 16+ messages in thread
From: Henrik Grubbström @ 2011-11-25 15:43 UTC (permalink / raw)
  To: Git Mailing list; +Cc: Junio C Hamano

[-- Attachment #1: Type: TEXT/PLAIN, Size: 938 bytes --]

On Wed, 23 Nov 2011, Henrik Grubbström wrote:

> Hi.
>
> My git repository walker just got bitten by what seems to be a reasonably new 
> bug in convert.c:cascade_filter_fn() (git 1.7.8.rc3 (gentoo)).

After some tracing, the problem is triggered by the variable "remaining"
being set to 1 in the beginning of the cascade_filter_fn() loop, which 
causes filter "two" to be called with an output buffer size of 1.
Filter "two" in this case is lf_to_crlf_filter_fn(), and the next input 
character is a "\n". lf_to_crlf_filter_fn() wants to convert this to 
"\r\n", but that doesn't fit into the buffer, so it breaks out and returns 
zero. Upon seing the zero cascade_filter_fn() thinks all is well, even 
though nothing has happened, and loops.

The bug is probably that lf_to_crlf_filter_fn() should return non-zero in 
this case (ie o and/or i being zero).

> Thanks,

--
Henrik Grubbström					grubba@roxen.com
Roxen Internet Software AB

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-25 15:43 ` Henrik Grubbström
@ 2011-11-25 15:53   ` Carlos Martín Nieto
  2011-11-25 15:59     ` Henrik Grubbström
  0 siblings, 1 reply; 16+ messages in thread
From: Carlos Martín Nieto @ 2011-11-25 15:53 UTC (permalink / raw)
  To: Henrik Grubbström; +Cc: Git Mailing list, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1277 bytes --]

On Fri, Nov 25, 2011 at 04:43:41PM +0100, Henrik Grubbström wrote:
> On Wed, 23 Nov 2011, Henrik Grubbström wrote:
> 
> >Hi.
> >
> >My git repository walker just got bitten by what seems to be a
> >reasonably new bug in convert.c:cascade_filter_fn() (git 1.7.8.rc3
> >(gentoo)).
> 
> After some tracing, the problem is triggered by the variable "remaining"
> being set to 1 in the beginning of the cascade_filter_fn() loop,
> which causes filter "two" to be called with an output buffer size of
> 1.
> Filter "two" in this case is lf_to_crlf_filter_fn(), and the next
> input character is a "\n". lf_to_crlf_filter_fn() wants to convert
> this to "\r\n", but that doesn't fit into the buffer, so it breaks
> out and returns zero. Upon seing the zero cascade_filter_fn() thinks
> all is well, even though nothing has happened, and loops.
> 
> The bug is probably that lf_to_crlf_filter_fn() should return
> non-zero in this case (ie o and/or i being zero).

non-zero? That would cause the filter to abort, which definitely not
what we want. Have you seen my other e-mails regarding this? I'm
trying to figure out which is the best way to go about this. The
solution is to keep track of the fact that we're missing a LF in the
output buffer.

   cmn

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-25 15:53   ` Carlos Martín Nieto
@ 2011-11-25 15:59     ` Henrik Grubbström
  0 siblings, 0 replies; 16+ messages in thread
From: Henrik Grubbström @ 2011-11-25 15:59 UTC (permalink / raw)
  To: Carlos Martín Nieto; +Cc: Git Mailing list, Junio C Hamano

[-- Attachment #1: Type: TEXT/PLAIN, Size: 692 bytes --]

On Fri, 25 Nov 2011, Carlos Martín Nieto wrote:

> On Fri, Nov 25, 2011 at 04:43:41PM +0100, Henrik Grubbström wrote:
>>
>> The bug is probably that lf_to_crlf_filter_fn() should return
>> non-zero in this case (ie o and/or i being zero).
>
> non-zero? That would cause the filter to abort, which definitely not
> what we want. Have you seen my other e-mails regarding this? I'm
> trying to figure out which is the best way to go about this. The
> solution is to keep track of the fact that we're missing a LF in the
> output buffer.

True, I misread the code.

Keeping track of the filter state is the way to go.

>   cmn

--
Henrik Grubbström					grubba@roxen.com
Roxen Internet Software AB

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-25 15:38 ` Carlos Martín Nieto
@ 2011-11-25 16:14   ` Henrik Grubbström
  2011-11-25 17:02     ` Carlos Martín Nieto
  0 siblings, 1 reply; 16+ messages in thread
From: Henrik Grubbström @ 2011-11-25 16:14 UTC (permalink / raw)
  To: Carlos Martín Nieto; +Cc: Git Mailing list, Junio C Hamano

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1007 bytes --]

On Fri, 25 Nov 2011, Carlos Martín Nieto wrote:

> This patch fixes this, but I think it would still break if the LF is
> at the end of the file. Changing the `if (!input)` to put the LF in
> the output buffer may or may not be the right soulution. I feel like
> this should be handled by cascade_filter_fn rather than the actual
> filter somehow, but Junio's comment (4ae66704 'stream filter: add "no
> more input" to the filters') suggests otherwise.
>
> I'm working on a cleaner patch that takes care of a bit of state, but
> this is the general idea.

Looks good to me (and seems to work in my case).
Typo in the commit subject though.

>   cmn
> --- 8< ---
> Subject: [PATCH] convert: don't loop indefintely if at LF-to-CRLF streaming
                                        ^^^^^^^^^^^
This should be either "infinitely", or "indefinitely", but since we know 
that the loop won't terminate "infinitely" is to be preferred.

Thanks,

--
Henrik Grubbström					grubba@roxen.com
Roxen Internet Software AB

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-25 16:14   ` Henrik Grubbström
@ 2011-11-25 17:02     ` Carlos Martín Nieto
  2011-11-26 22:48       ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Carlos Martín Nieto @ 2011-11-25 17:02 UTC (permalink / raw)
  To: Henrik Grubbström; +Cc: Git Mailing list, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2949 bytes --]

On Fri, Nov 25, 2011 at 05:14:17PM +0100, Henrik Grubbström wrote:
> On Fri, 25 Nov 2011, Carlos Martín Nieto wrote:
> 
> >This patch fixes this, but I think it would still break if the LF is
> >at the end of the file. Changing the `if (!input)` to put the LF in
> >the output buffer may or may not be the right soulution. I feel like
> >this should be handled by cascade_filter_fn rather than the actual
> >filter somehow, but Junio's comment (4ae66704 'stream filter: add "no
> >more input" to the filters') suggests otherwise.
> >
> >I'm working on a cleaner patch that takes care of a bit of state, but
> >this is the general idea.
> 
> Looks good to me (and seems to work in my case).

That patch would give wrong output if the same happened at the end of
a file. The attached patch should also cover this case.

> Typo in the commit subject though.
> 
> >  cmn
> >--- 8< ---
> >Subject: [PATCH] convert: don't loop indefintely if at LF-to-CRLF streaming
>                                        ^^^^^^^^^^^
> This should be either "infinitely", or "indefinitely", but since we
> know that the loop won't terminate "infinitely" is to be preferred.

Thanks for noticing. I went with a different title in the end. Junio,
could you consider this one for inclusion in the next RC?

--- 8< ---
Subject: [PATCH] convert: track state in LF-to-CRLF filter

There may not be enough space to store CRLF in the output. If we don't
fill the buffer, then the filter will keep getting called with the same
short buffer and will loop forever.

Instead, always store the CR and record there's a missing LF if
necessary it so we store it in the output buffer the next time the
function gets called.

Reported-by: Henrik Grubbström <grubba@roxen.com>
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
---
 convert.c |   23 ++++++++++++++++-------
 1 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/convert.c b/convert.c
index 86e9c29..c050b86 100644
--- a/convert.c
+++ b/convert.c
@@ -880,20 +880,29 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 				const char *input, size_t *isize_p,
 				char *output, size_t *osize_p)
 {
-	size_t count;
+	size_t count, o = 0;
+	static int want_lf = 0;
+
+	/* Output a pending LF if we need to */
+	if (want_lf) {
+		output[o++] = '\n';
+		want_lf = 0;
+	}
 
 	if (!input)
-		return 0; /* we do not keep any states */
+		return 0; /* We've already dealt with the state */
+
 	count = *isize_p;
 	if (count) {
-		size_t i, o;
-		for (i = o = 0; o < *osize_p && i < count; i++) {
+		size_t i;
+		for (i = 0; o < *osize_p && i < count; i++) {
 			char ch = input[i];
 			if (ch == '\n') {
-				if (o + 1 < *osize_p)
-					output[o++] = '\r';
-				else
+				output[o++] = '\r';
+				if (o >= *osize_p) {
+					want_lf = 1;
 					break;
+				}
 			}
 			output[o++] = ch;
 		}
-- 
1.7.8.rc3.31.g017d1



[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-25 17:02     ` Carlos Martín Nieto
@ 2011-11-26 22:48       ` Junio C Hamano
  2011-11-28 10:48         ` Carlos Martín Nieto
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2011-11-26 22:48 UTC (permalink / raw)
  To: Carlos Martín Nieto
  Cc: Henrik Grubbström, Git Mailing list, Junio C Hamano

Carlos Martín Nieto <cmn@elego.de> writes:

> diff --git a/convert.c b/convert.c
> index 86e9c29..c050b86 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -880,20 +880,29 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
>  				const char *input, size_t *isize_p,
>  				char *output, size_t *osize_p)
>  {
> -	size_t count;
> +	size_t count, o = 0;
> +	static int want_lf = 0;

I do not think we want function scope static state anywhere in the cascade
filter chain, as it will forbid us from running more than one output chain
at the same time in the future. I think the correct way to structure it
would be to create lf_to_crlf_filter as a proper subclass of stream_filter
(see how cascade_filter_fn() casts its filter argument down to an instance
of the cascade_filter class and uses it to keep track of its state) and
keep this variable as its own filter state [*1*].

[Footnote]

*1* We currently use a singleton instance of lf_to_crlf_filter object
because the implementation assumed there is no need for per-instance
state.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-26 22:48       ` Junio C Hamano
@ 2011-11-28 10:48         ` Carlos Martín Nieto
  2011-11-28 19:18           ` Junio C Hamano
  2011-12-16 22:01           ` Junio C Hamano
  0 siblings, 2 replies; 16+ messages in thread
From: Carlos Martín Nieto @ 2011-11-28 10:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Henrik Grubbström, Git Mailing list

[-- Attachment #1: Type: text/plain, Size: 4081 bytes --]

On Sat, Nov 26, 2011 at 02:48:12PM -0800, Junio C Hamano wrote:
> Carlos Martín Nieto <cmn@elego.de> writes:
> 
> > diff --git a/convert.c b/convert.c
> > index 86e9c29..c050b86 100644
> > --- a/convert.c
> > +++ b/convert.c
> > @@ -880,20 +880,29 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
> >  				const char *input, size_t *isize_p,
> >  				char *output, size_t *osize_p)
> >  {
> > -	size_t count;
> > +	size_t count, o = 0;
> > +	static int want_lf = 0;
> 
> I do not think we want function scope static state anywhere in the cascade
> filter chain, as it will forbid us from running more than one output chain
> at the same time in the future. I think the correct way to structure it
> would be to create lf_to_crlf_filter as a proper subclass of stream_filter
> (see how cascade_filter_fn() casts its filter argument down to an instance
> of the cascade_filter class and uses it to keep track of its state) and
> keep this variable as its own filter state [*1*].

Good point, here's a patch that does that.

   cmn

--- 8< ---
Subject: [PATCHv2] convert: track state in LF-to-CRLF filter

There may not be enough space to store CRLF in the output. If we don't
fill the buffer, then the filter will keep getting called with the same
short buffer and will loop forever.

Instead, always store the CR and record whether there's a missing LF
if so we store it in the output buffer the next time the function gets
called.

Reported-by: Henrik Grubbström <grubba@roxen.com>
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
---
 convert.c |   50 +++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/convert.c b/convert.c
index 86e9c29..1c91409 100644
--- a/convert.c
+++ b/convert.c
@@ -876,24 +876,39 @@ int is_null_stream_filter(struct stream_filter *filter)
 /*
  * LF-to-CRLF filter
  */
+
+struct lf_to_crlf_filter {
+	struct stream_filter filter;
+	int want_lf;
+};
+
 static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 				const char *input, size_t *isize_p,
 				char *output, size_t *osize_p)
 {
-	size_t count;
+	size_t count, o = 0;
+	struct lf_to_crlf_filter *lfcrlf = (struct lf_to_crlf_filter *) filter;
+
+	/* Output a pending LF if we need to */
+	if (lfcrlf->want_lf) {
+		output[o++] = '\n';
+		lfcrlf->want_lf = 0;
+	}
 
 	if (!input)
-		return 0; /* we do not keep any states */
+		return 0; /* We've already dealt with the state */
+
 	count = *isize_p;
 	if (count) {
-		size_t i, o;
-		for (i = o = 0; o < *osize_p && i < count; i++) {
+		size_t i;
+		for (i = 0; o < *osize_p && i < count; i++) {
 			char ch = input[i];
 			if (ch == '\n') {
-				if (o + 1 < *osize_p)
-					output[o++] = '\r';
-				else
-					break;
+				output[o++] = '\r';
+				if (o >= *osize_p) {
+					lfcrlf->want_lf = 1;
+					continue; /* We need to increase i */
+				}
 			}
 			output[o++] = ch;
 		}
@@ -904,15 +919,24 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 	return 0;
 }
 
+static void lf_to_crlf_free_fn(struct stream_filter *filter)
+{
+	free(filter);
+}
+
 static struct stream_filter_vtbl lf_to_crlf_vtbl = {
 	lf_to_crlf_filter_fn,
-	null_free_fn,
+	lf_to_crlf_free_fn,
 };
 
-static struct stream_filter lf_to_crlf_filter_singleton = {
-	&lf_to_crlf_vtbl,
-};
+static struct stream_filter *lf_to_crlf_filter(void)
+{
+	struct lf_to_crlf_filter *lfcrlf = xmalloc(sizeof(*lfcrlf));
 
+	lfcrlf->filter.vtbl = &lf_to_crlf_vtbl;
+	lfcrlf->want_lf = 0;
+	return (struct stream_filter *)lfcrlf;
+}
 
 /*
  * Cascade filter
@@ -1194,7 +1218,7 @@ struct stream_filter *get_stream_filter(const char *path, const unsigned char *s
 
 	else if (output_eol(crlf_action) == EOL_CRLF &&
 		 !(crlf_action == CRLF_AUTO || crlf_action == CRLF_GUESS))
-		filter = cascade_filter(filter, &lf_to_crlf_filter_singleton);
+		filter = cascade_filter(filter, lf_to_crlf_filter());
 
 	return filter;
 }
-- 
1.7.8.rc3.31.g017d1



[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-28 10:48         ` Carlos Martín Nieto
@ 2011-11-28 19:18           ` Junio C Hamano
  2011-12-16 22:01           ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2011-11-28 19:18 UTC (permalink / raw)
  To: Carlos Martín Nieto; +Cc: Henrik Grubbström, Git Mailing list

Carlos Martín Nieto <cmn@elego.de> writes:

>> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="M9NhX3UHpAaciwkO"
>> Content-Disposition: inline

Please do not do this. It makes it unnecessarily cumbersome to handle
patches without adding much value to the patch.

> --- 8< ---
> Subject: [PATCHv2] convert: track state in LF-to-CRLF filter
>
> There may not be enough space to store CRLF in the output. If we don't
> fill the buffer, then the filter will keep getting called with the same
> short buffer and will loop forever.
>
> Instead, always store the CR and record whether there's a missing LF
> if so we store it in the output buffer the next time the function gets
> called.
>
> Reported-by: Henrik Grubbström <grubba@roxen.com>
> Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
> ---
>  convert.c |   50 +++++++++++++++++++++++++++++++++++++-------------
>  1 files changed, 37 insertions(+), 13 deletions(-)
>
> diff --git a/convert.c b/convert.c
> index 86e9c29..1c91409 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -876,24 +876,39 @@ int is_null_stream_filter(struct stream_filter *filter)
>  /*
>   * LF-to-CRLF filter
>   */
> +
> +struct lf_to_crlf_filter {
> +	struct stream_filter filter;
> +	int want_lf;
> +};
> +
>  static int lf_to_crlf_filter_fn(struct stream_filter *filter,
>  				const char *input, size_t *isize_p,
>  				char *output, size_t *osize_p)
>  {
> -	size_t count;
> +	size_t count, o = 0;
> +	struct lf_to_crlf_filter *lfcrlf = (struct lf_to_crlf_filter *) filter;
> ...
> -};
> +static struct stream_filter *lf_to_crlf_filter(void)
> +{
> +	struct lf_to_crlf_filter *lfcrlf = xmalloc(sizeof(*lfcrlf));
>  
> +	lfcrlf->filter.vtbl = &lf_to_crlf_vtbl;
> +	lfcrlf->want_lf = 0;
> +	return (struct stream_filter *)lfcrlf;
> +}

Patch looks sane; you may want to rename the variable to lf_crlf at least,
though. The name does not consist of three tokens ("lf", "cr" and "lf")
but of two ("lf" and "crlf"), and your naming loses it.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-11-28 10:48         ` Carlos Martín Nieto
  2011-11-28 19:18           ` Junio C Hamano
@ 2011-12-16 22:01           ` Junio C Hamano
  2011-12-16 22:43             ` [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining Junio C Hamano
  2011-12-19 16:42             ` Infinite loop in cascade_filter_fn() Carlos Martín Nieto
  1 sibling, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2011-12-16 22:01 UTC (permalink / raw)
  To: Carlos Martín Nieto; +Cc: Henrik Grubbström, Git Mailing list

Carlos Martín Nieto <cmn@elego.de> writes:

> Subject: [PATCHv2] convert: track state in LF-to-CRLF filter
>
> There may not be enough space to store CRLF in the output. If we don't
> fill the buffer, then the filter will keep getting called with the same
> short buffer and will loop forever.
>
> Instead, always store the CR and record whether there's a missing LF
> if so we store it in the output buffer the next time the function gets
> called.
>
> Reported-by: Henrik Grubbström <grubba@roxen.com>
> Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
> ---
>  convert.c |   50 +++++++++++++++++++++++++++++++++++++-------------
>  1 files changed, 37 insertions(+), 13 deletions(-)
>
> diff --git a/convert.c b/convert.c
> index 86e9c29..1c91409 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -876,24 +876,39 @@ int is_null_stream_filter(struct stream_filter *filter)
>  /*
>   * LF-to-CRLF filter
>   */
> +
> +struct lf_to_crlf_filter {
> +	struct stream_filter filter;
> +	int want_lf;
> +};
> +
>  static int lf_to_crlf_filter_fn(struct stream_filter *filter,
>  				const char *input, size_t *isize_p,
>  				char *output, size_t *osize_p)
>  {
> -	size_t count;
> +	size_t count, o = 0;
> +	struct lf_to_crlf_filter *lfcrlf = (struct lf_to_crlf_filter *) filter;
> +
> +	/* Output a pending LF if we need to */
> +	if (lfcrlf->want_lf) {
> +		output[o++] = '\n';
> +		lfcrlf->want_lf = 0;
> +	}
>  
>  	if (!input)
> -		return 0; /* we do not keep any states */
> +		return 0; /* We've already dealt with the state */
> +

Shouldn't we be decrementing *osize_p by 'o' to signal that we used that
many bytes in the output buffer here before returning to the caller?

>  	count = *isize_p;
>  	if (count) {
> -		size_t i, o;
> -		for (i = o = 0; o < *osize_p && i < count; i++) {
> +		size_t i;
> +		for (i = 0; o < *osize_p && i < count; i++) {
>  			char ch = input[i];
>  			if (ch == '\n') {
> -				if (o + 1 < *osize_p)
> -					output[o++] = '\r';
> -				else
> -					break;
> +				output[o++] = '\r';
> +				if (o >= *osize_p) {
> +					lfcrlf->want_lf = 1;
> +					continue; /* We need to increase i */
> +				}
>  			}
>  			output[o++] = ch;
>  		}

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining
  2011-12-16 22:01           ` Junio C Hamano
@ 2011-12-16 22:43             ` Junio C Hamano
  2011-12-19 10:19               ` Henrik Grubbström
  2011-12-19 16:42             ` Infinite loop in cascade_filter_fn() Carlos Martín Nieto
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2011-12-16 22:43 UTC (permalink / raw)
  To: Git Mailing list; +Cc: Carlos Martín Nieto, Henrik Grubbström

This can only happen when the input size is multiple of the
buffer size of the cascade filter (16k) and ends with an LF,
but in such a case, the code forgot to tell the caller that
it added the "\n" it could not add during the last round.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 convert.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/convert.c b/convert.c
index c2c2c11..c028275 100644
--- a/convert.c
+++ b/convert.c
@@ -879,7 +879,7 @@ int is_null_stream_filter(struct stream_filter *filter)
 
 struct lf_to_crlf_filter {
 	struct stream_filter filter;
-	int want_lf;
+	unsigned want_lf:1;
 };
 
 static int lf_to_crlf_filter_fn(struct stream_filter *filter,
@@ -895,8 +895,11 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 		lf_to_crlf->want_lf = 0;
 	}
 
-	if (!input)
-		return 0; /* We've already dealt with the state */
+	/* We are told to drain */
+	if (!input) {
+		*osize_p -= o;
+		return 0;
+	}
 
 	count = *isize_p;
 	if (count) {
@@ -931,10 +934,9 @@ static struct stream_filter_vtbl lf_to_crlf_vtbl = {
 
 static struct stream_filter *lf_to_crlf_filter(void)
 {
-	struct lf_to_crlf_filter *lf_to_crlf = xmalloc(sizeof(*lf_to_crlf));
+	struct lf_to_crlf_filter *lf_to_crlf = xcalloc(1, sizeof(*lf_to_crlf));
 
 	lf_to_crlf->filter.vtbl = &lf_to_crlf_vtbl;
-	lf_to_crlf->want_lf = 0;
 	return (struct stream_filter *)lf_to_crlf;
 }
 
-- 
1.7.8.351.g9111b

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining
  2011-12-16 22:43             ` [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining Junio C Hamano
@ 2011-12-19 10:19               ` Henrik Grubbström
  2011-12-19 20:23                 ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Henrik Grubbström @ 2011-12-19 10:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing list, Carlos Martín Nieto

[-- Attachment #1: Type: TEXT/PLAIN, Size: 532 bytes --]

On Fri, 16 Dec 2011, Junio C Hamano wrote:

> This can only happen when the input size is multiple of the
> buffer size of the cascade filter (16k) and ends with an LF,
> but in such a case, the code forgot to tell the caller that
> it added the "\n" it could not add during the last round.

We probably ought to have a corresponding test in the testsuite.
A blob consisting of a singe 'A' followed by 8192 linefeeds should
be sufficient to trigger the problems.

--
Henrik Grubbström					grubba@roxen.com
Roxen Internet Software AB

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Infinite loop in cascade_filter_fn()
  2011-12-16 22:01           ` Junio C Hamano
  2011-12-16 22:43             ` [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining Junio C Hamano
@ 2011-12-19 16:42             ` Carlos Martín Nieto
  1 sibling, 0 replies; 16+ messages in thread
From: Carlos Martín Nieto @ 2011-12-19 16:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Henrik Grubbström, Git Mailing list

[-- Attachment #1: Type: text/plain, Size: 2503 bytes --]

On Fri, Dec 16, 2011 at 02:01:50PM -0800, Junio C Hamano wrote:
> Carlos Martín Nieto <cmn@elego.de> writes:
> 
> > Subject: [PATCHv2] convert: track state in LF-to-CRLF filter
> >
> > There may not be enough space to store CRLF in the output. If we don't
> > fill the buffer, then the filter will keep getting called with the same
> > short buffer and will loop forever.
> >
> > Instead, always store the CR and record whether there's a missing LF
> > if so we store it in the output buffer the next time the function gets
> > called.
> >
> > Reported-by: Henrik Grubbström <grubba@roxen.com>
> > Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
> > ---
> >  convert.c |   50 +++++++++++++++++++++++++++++++++++++-------------
> >  1 files changed, 37 insertions(+), 13 deletions(-)
> >
> > diff --git a/convert.c b/convert.c
> > index 86e9c29..1c91409 100644
> > --- a/convert.c
> > +++ b/convert.c
> > @@ -876,24 +876,39 @@ int is_null_stream_filter(struct stream_filter *filter)
> >  /*
> >   * LF-to-CRLF filter
> >   */
> > +
> > +struct lf_to_crlf_filter {
> > +	struct stream_filter filter;
> > +	int want_lf;
> > +};
> > +
> >  static int lf_to_crlf_filter_fn(struct stream_filter *filter,
> >  				const char *input, size_t *isize_p,
> >  				char *output, size_t *osize_p)
> >  {
> > -	size_t count;
> > +	size_t count, o = 0;
> > +	struct lf_to_crlf_filter *lfcrlf = (struct lf_to_crlf_filter *) filter;
> > +
> > +	/* Output a pending LF if we need to */
> > +	if (lfcrlf->want_lf) {
> > +		output[o++] = '\n';
> > +		lfcrlf->want_lf = 0;
> > +	}
> >  
> >  	if (!input)
> > -		return 0; /* we do not keep any states */
> > +		return 0; /* We've already dealt with the state */
> > +
> 
> Shouldn't we be decrementing *osize_p by 'o' to signal that we used that
> many bytes in the output buffer here before returning to the caller?

Yes we should, thanks for spotting it.

> 
> >  	count = *isize_p;
> >  	if (count) {
> > -		size_t i, o;
> > -		for (i = o = 0; o < *osize_p && i < count; i++) {
> > +		size_t i;
> > +		for (i = 0; o < *osize_p && i < count; i++) {
> >  			char ch = input[i];
> >  			if (ch == '\n') {
> > -				if (o + 1 < *osize_p)
> > -					output[o++] = '\r';
> > -				else
> > -					break;
> > +				output[o++] = '\r';
> > +				if (o >= *osize_p) {
> > +					lfcrlf->want_lf = 1;
> > +					continue; /* We need to increase i */
> > +				}
> >  			}
> >  			output[o++] = ch;
> >  		}
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining
  2011-12-19 10:19               ` Henrik Grubbström
@ 2011-12-19 20:23                 ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2011-12-19 20:23 UTC (permalink / raw)
  To: Henrik Grubbström; +Cc: Git Mailing list, Carlos Martín Nieto

Henrik Grubbström <grubba@roxen.com> writes:

> We probably ought to have a corresponding test in the testsuite.
> A blob consisting of a singe 'A' followed by 8192 linefeeds should
> be sufficient to trigger the problems.

Sounds good. Please make it so.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2011-12-19 20:23 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-23 17:40 Infinite loop in cascade_filter_fn() Henrik Grubbström
2011-11-25 14:31 ` Carlos Martín Nieto
2011-11-25 15:38 ` Carlos Martín Nieto
2011-11-25 16:14   ` Henrik Grubbström
2011-11-25 17:02     ` Carlos Martín Nieto
2011-11-26 22:48       ` Junio C Hamano
2011-11-28 10:48         ` Carlos Martín Nieto
2011-11-28 19:18           ` Junio C Hamano
2011-12-16 22:01           ` Junio C Hamano
2011-12-16 22:43             ` [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining Junio C Hamano
2011-12-19 10:19               ` Henrik Grubbström
2011-12-19 20:23                 ` Junio C Hamano
2011-12-19 16:42             ` Infinite loop in cascade_filter_fn() Carlos Martín Nieto
2011-11-25 15:43 ` Henrik Grubbström
2011-11-25 15:53   ` Carlos Martín Nieto
2011-11-25 15:59     ` Henrik Grubbström

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.