dash.vger.kernel.org archive mirror
* Line continuation and variables
@ 2014-08-26 12:15 Oleg Bulatov
  2014-08-26 12:34 ` Eric Blake
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Bulatov @ 2014-08-26 12:15 UTC
  To: dash

Hi!

While playing with sh generators I found that dash and bash interpret the
<backslash><newline> sequence differently.

$ dash -c 'EDIT=xxx; echo $EDIT\
> OR'
xxxOR
$ bash -c 'EDIT=xxx; echo $EDIT\
OR'
/usr/bin/vim

$ dash -c 'echo "$\
(pwd)"'
$(pwd)

Is it undefined behaviour in POSIX?

-- 
WBR, Oleg Bulatov

* Re: Line continuation and variables
  2014-08-26 12:15 Line continuation and variables Oleg Bulatov
@ 2014-08-26 12:34 ` Eric Blake
  2014-09-29 14:55   ` Herbert Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Blake @ 2014-08-26 12:34 UTC
  To: Oleg Bulatov, dash

On 08/26/2014 06:15 AM, Oleg Bulatov wrote:
> Hi!
> 
> While playing with sh generators I found that dash and bash interpret the
> <backslash><newline> sequence differently.
> 
> $ dash -c 'EDIT=xxx; echo $EDIT\
>> OR'
> xxxOR

Buggy.

> $ bash -c 'EDIT=xxx; echo $EDIT\
> OR'
> /usr/bin/vim

Correct behavior.

> 
> $ dash -c 'echo "$\
> (pwd)"'
> $(pwd)
> 
> Is it undefined behaviour in POSIX?

No, it's well-defined, and dash is buggy.  POSIX says:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03

"the shell shall break its input into tokens by applying the first
applicable rule below to the next character in its input"

Rule 4 covers backslash handling, while rule 5 covers locating the end
of a word to be subject to $ expansion.  Therefore, rule 4 should happen
first.  Rule 4 defers to the section on quoting, with the caveat that
<newline> joining is the only substitution that happens immediately as
part of the parsing:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02

"If a <newline> follows the <backslash>, the shell shall interpret this
as line continuation. The <backslash> and <newline> shall be removed
before splitting the input into tokens. Since the escaped <newline> is
removed entirely from the input and is not replaced by any white space,
it cannot serve as a token separator."

So the fact that dash is treating the elided backslash-newline as a
token separator, and parsing your input as if ${EDIT}OR instead of
${EDITOR} is a bug in dash.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: Line continuation and variables
  2014-08-26 12:34 ` Eric Blake
@ 2014-09-29 14:55   ` Herbert Xu
  2014-09-29 14:57     ` Herbert Xu
  2014-10-29 21:52     ` Jilles Tjoelker
  0 siblings, 2 replies; 11+ messages in thread
From: Herbert Xu @ 2014-09-29 14:55 UTC
  To: Eric Blake; +Cc: Oleg Bulatov, dash

On Tue, Aug 26, 2014 at 12:34:42PM +0000, Eric Blake wrote:
> On 08/26/2014 06:15 AM, Oleg Bulatov wrote:
> > Hi!
> > 
> > While playing with sh generators I found that dash and bash interpret
> > the <backslash><newline> sequence differently.
> > 
> > $ dash -c 'EDIT=xxx; echo $EDIT\
> >> OR'
> > xxxOR
> 
> Buggy.
> 
> > $ bash -c 'EDIT=xxx; echo $EDIT\
> > OR'
> > /usr/bin/vim
> 
> Correct behavior.
> 
> > 
> > $ dash -c 'echo "$\
> > (pwd)"'
> > $(pwd)
> > 
> > Is it undefined behaviour in POSIX?
> 
> No, it's well-defined, and dash is buggy.  POSIX says:
> 
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03
> 
> "the shell shall break its input into tokens by applying the first
> applicable rule below to the next character in its input"
> 
> Rule 4 covers backslash handling, while rule 5 covers locating the end
> of a word to be subject to $ expansion.  Therefore, rule 4 should happen
> first.  Rule 4 defers to the section on quoting, with the caveat that
> <newline> joining is the only substitution that happens immediately as
> part of the parsing:
> 
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02
> 
> "If a <newline> follows the <backslash>, the shell shall interpret this
> as line continuation. The <backslash> and <newline> shall be removed
> before splitting the input into tokens. Since the escaped <newline> is
> removed entirely from the input and is not replaced by any white space,
> it cannot serve as a token separator."
> 
> So the fact that dash is treating the elided backslash-newline as a
> token separator, and parsing your input as if ${EDIT}OR instead of
> ${EDITOR} is a bug in dash.

I agree.  The following patch should fix this:

commit ef91d3d6a4c39421fd3a391e02cd82f9f3aee4a8
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Mon Sep 29 22:52:41 2014 +0800

    [PARSER] Handle backslash newlines properly after dollar sign
    
    On Tue, Aug 26, 2014 at 12:34:42PM +0000, Eric Blake wrote:
    > On 08/26/2014 06:15 AM, Oleg Bulatov wrote:
    > > Hi!
    > >
    > > While playing with sh generators I found that dash and bash interpret
    > > the <backslash><newline> sequence differently.
    > >
    > > $ dash -c 'EDIT=xxx; echo $EDIT\
    > >> OR'
    > > xxxOR
    >
    > Buggy.
    >
    > > $ bash -c 'EDIT=xxx; echo $EDIT\
    > > OR'
    > > /usr/bin/vim
    >
    > Correct behavior.
    >
    > >
    > > $ dash -c 'echo "$\
    > > (pwd)"'
    > > $(pwd)
    > >
    > > Is it undefined behaviour in POSIX?
    >
    > No, it's well-defined, and dash is buggy.  POSIX says:
    >
    > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03
    >
    > "the shell shall break its input into tokens by applying the first
    > applicable rule below to the next character in its input"
    >
    > Rule 4 covers backslash handling, while rule 5 covers locating the end
    > of a word to be subject to $ expansion.  Therefore, rule 4 should happen
    > first.  Rule 4 defers to the section on quoting, with the caveat that
    > <newline> joining is the only substitution that happens immediately as
    > part of the parsing:
    >
    > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02
    >
    > "If a <newline> follows the <backslash>, the shell shall interpret this
    > as line continuation. The <backslash> and <newline> shall be removed
    > before splitting the input into tokens. Since the escaped <newline> is
    > removed entirely from the input and is not replaced by any white space,
    > it cannot serve as a token separator."
    >
    > So the fact that dash is treating the elided backslash-newline as a
    > token separator, and parsing your input as if ${EDIT}OR instead of
    > ${EDITOR} is a bug in dash.
    
    I agree.  This patch should resolve this problem and similar ones
    affecting backslash newlines after we encounter a dollar sign.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/ChangeLog b/ChangeLog
index 0fbc514..398bd15 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,7 @@
 2014-09-29  Herbert Xu <herbert@gondor.apana.org.au>
 
 	* Kill pgetc_macro.
+	* Handle backslash newlines properly after dollar sign.
 
 2014-09-28  Herbert Xu <herbert@gondor.apana.org.au>
 
diff --git a/src/parser.c b/src/parser.c
index c4eaae2..2b07437 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -827,6 +827,24 @@ breakloop:
 #undef RETURN
 }
 
+static int pgetc_eatbnl(void)
+{
+	int c;
+
+	while ((c = pgetc()) == '\\') {
+		if (pgetc() != '\n') {
+			pungetc();
+			break;
+		}
+
+		plinno++;
+		if (doprompt)
+			setprompt(2);
+	}
+
+	return c;
+}
+
 
 
 /*
@@ -1179,7 +1197,7 @@ parsesub: {
 	char *p;
 	static const char types[] = "}-+?=";
 
-	c = pgetc();
+	c = pgetc_eatbnl();
 	if (
 		(checkkwd & CHKEOFMARK) ||
 		c <= PEOA  ||
@@ -1188,7 +1206,7 @@ parsesub: {
 		USTPUTC('$', out);
 		pungetc();
 	} else if (c == '(') {	/* $(command) or $((arith)) */
-		if (pgetc() == '(') {
+		if (pgetc_eatbnl() == '(') {
 			PARSEARITH();
 		} else {
 			pungetc();
@@ -1200,25 +1218,25 @@ parsesub: {
 		STADJUST(1, out);
 		subtype = VSNORMAL;
 		if (likely(c == '{')) {
-			c = pgetc();
+			c = pgetc_eatbnl();
 			subtype = 0;
 		}
 varname:
 		if (is_name(c)) {
 			do {
 				STPUTC(c, out);
-				c = pgetc();
+				c = pgetc_eatbnl();
 			} while (is_in_name(c));
 		} else if (is_digit(c)) {
 			do {
 				STPUTC(c, out);
-				c = pgetc();
+				c = pgetc_eatbnl();
 			} while (is_digit(c));
 		}
 		else if (is_special(c)) {
 			int cc = c;
 
-			c = pgetc();
+			c = pgetc_eatbnl();
 
 			if (!subtype && cc == '#') {
 				subtype = VSLENGTH;
@@ -1227,7 +1245,7 @@ varname:
 					goto varname;
 
 				cc = c;
-				c = pgetc();
+				c = pgetc_eatbnl();
 				if (cc == '}' || c != '}') {
 					pungetc();
 					subtype = 0;
@@ -1245,7 +1263,7 @@ varname:
 			switch (c) {
 			case ':':
 				subtype = VSNUL;
-				c = pgetc();
+				c = pgetc_eatbnl();
 				/*FALLTHROUGH*/
 			default:
 				p = strchr(types, c);
@@ -1259,7 +1277,7 @@ varname:
 					int cc = c;
 					subtype = c == '#' ? VSTRIMLEFT :
 							     VSTRIMRIGHT;
-					c = pgetc();
+					c = pgetc_eatbnl();
 					if (c == cc)
 						subtype++;
 					else

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: Line continuation and variables
  2014-09-29 14:55   ` Herbert Xu
@ 2014-09-29 14:57     ` Herbert Xu
  2014-10-29 21:52     ` Jilles Tjoelker
  1 sibling, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2014-09-29 14:57 UTC
  To: Eric Blake; +Cc: Oleg Bulatov, dash

On Mon, Sep 29, 2014 at 10:55:07PM +0800, Herbert Xu wrote:
>
> I agree.  The following patch should fix this:
> 
> commit ef91d3d6a4c39421fd3a391e02cd82f9f3aee4a8
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date:   Mon Sep 29 22:52:41 2014 +0800
> 
>     [PARSER] Handle backslash newlines properly after dollar sign

Here is a small clean-up on top of it:

commit 6df87cf1d4b7c0c490ab1803b863de10579df92e
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Mon Sep 29 22:53:53 2014 +0800

    [PARSER] Add nlprompt/nlnoprompt helpers
    
    This patch adds the nlprompt/nlnoprompt helpers to isolate code
    dealing with newlines and prompting.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/ChangeLog b/ChangeLog
index 398bd15..f161a13 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -2,6 +2,7 @@
 
 	* Kill pgetc_macro.
 	* Handle backslash newlines properly after dollar sign.
+	* Add nlprompt/nlnoprompt helpers.
 
 2014-09-28  Herbert Xu <herbert@gondor.apana.org.au>
 
diff --git a/src/parser.c b/src/parser.c
index 2b07437..f6c43be 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -743,6 +743,19 @@ out:
 	return (t);
 }
 
+static void nlprompt(void)
+{
+	plinno++;
+	if (doprompt)
+		setprompt(2);
+}
+
+static void nlnoprompt(void)
+{
+	plinno++;
+	needprompt = doprompt;
+}
+
 
 /*
  * Read the next input token.
@@ -786,16 +799,13 @@ xxreadtoken(void)
 			continue;
 		case '\\':
 			if (pgetc() == '\n') {
-				plinno++;
-				if (doprompt)
-					setprompt(2);
+				nlprompt();
 				continue;
 			}
 			pungetc();
 			goto breakloop;
 		case '\n':
-			plinno++;
-			needprompt = doprompt;
+			nlnoprompt();
 			RETURN(TNL);
 		case PEOF:
 			RETURN(TEOF);
@@ -837,9 +847,7 @@ static int pgetc_eatbnl(void)
 			break;
 		}
 
-		plinno++;
-		if (doprompt)
-			setprompt(2);
+		nlprompt();
 	}
 
 	return c;
@@ -913,9 +921,7 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 				if (syntax == BASESYNTAX)
 					goto endword;	/* exit outer loop */
 				USTPUTC(c, out);
-				plinno++;
-				if (doprompt)
-					setprompt(2);
+				nlprompt();
 				c = pgetc();
 				goto loop;		/* continue outer loop */
 			case CWORD:
@@ -934,9 +940,7 @@ readtoken1(int firstc, char const *syntax, char *eofmark, int striptabs)
 					USTPUTC('\\', out);
 					pungetc();
 				} else if (c == '\n') {
-					plinno++;
-					if (doprompt)
-						setprompt(2);
+					nlprompt();
 				} else {
 					if (
 						dblquote &&
@@ -1092,8 +1096,7 @@ checkend: {
 
 		if (c == '\n' || c == PEOF) {
 			c = PEOF;
-			plinno++;
-			needprompt = doprompt;
+			nlnoprompt();
 		} else {
 			int len;
 
@@ -1342,9 +1345,7 @@ parsebackq: {
 
 			case '\\':
                                 if ((pc = pgetc()) == '\n') {
-					plinno++;
-					if (doprompt)
-						setprompt(2);
+					nlprompt();
 					/*
 					 * If eating a newline, avoid putting
 					 * the newline into the new character
@@ -1366,8 +1367,7 @@ parsebackq: {
 				synerror("EOF in backquote substitution");
 
 			case '\n':
-				plinno++;
-				needprompt = doprompt;
+				nlnoprompt();
 				break;
 
 			default:

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: Line continuation and variables
  2014-09-29 14:55   ` Herbert Xu
  2014-09-29 14:57     ` Herbert Xu
@ 2014-10-29 21:52     ` Jilles Tjoelker
  2014-10-30  2:10       ` Herbert Xu
  2015-01-05 12:00       ` [0/4] input: Allow two consecutive calls to pungetc Herbert Xu
  1 sibling, 2 replies; 11+ messages in thread
From: Jilles Tjoelker @ 2014-10-29 21:52 UTC
  To: Herbert Xu; +Cc: Eric Blake, Oleg Bulatov, dash

On Mon, Sep 29, 2014 at 10:55:07PM +0800, Herbert Xu wrote:
> On Tue, Aug 26, 2014 at 12:34:42PM +0000, Eric Blake wrote:
> [snip]
> > So the fact that dash is treating the elided backslash-newline as a
> > token separator, and parsing your input as if ${EDIT}OR instead of
> > ${EDITOR} is a bug in dash.

> I agree.  The following patch should fix this:

> commit ef91d3d6a4c39421fd3a391e02cd82f9f3aee4a8
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date:   Mon Sep 29 22:52:41 2014 +0800

>     [PARSER] Handle backslash newlines properly after dollar sign
> [snip]

> diff --git a/ChangeLog b/ChangeLog
> index 0fbc514..398bd15 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,6 +1,7 @@
>  2014-09-29  Herbert Xu <herbert@gondor.apana.org.au>
>  
>  	* Kill pgetc_macro.
> +	* Handle backslash newlines properly after dollar sign.
>  
>  2014-09-28  Herbert Xu <herbert@gondor.apana.org.au>
>  
> diff --git a/src/parser.c b/src/parser.c
> index c4eaae2..2b07437 100644
> --- a/src/parser.c
> +++ b/src/parser.c
> @@ -827,6 +827,24 @@ breakloop:
>  #undef RETURN
>  }
>  
> +static int pgetc_eatbnl(void)
> +{
> +	int c;
> +
> +	while ((c = pgetc()) == '\\') {
> +		if (pgetc() != '\n') {
> +			pungetc();
> +			break;
> +		}
> +
> +		plinno++;
> +		if (doprompt)
> +			setprompt(2);
> +	}
> +
> +	return c;
> +}
> +
>  
>  
>  /*

This implementation of pgetc_eatbnl() does not allow pushing back a
backslash, since that would call pungetc() twice without an intervening
pgetc(). However, some places do attempt to push back a backslash. As a
result, a script file containing many repeated  ${w#\#}  will not be
parsed correctly. There is a similar bug with repeated  $\#  but this is
not specified by POSIX.

-- 
Jilles Tjoelker

* Re: Line continuation and variables
  2014-10-29 21:52     ` Jilles Tjoelker
@ 2014-10-30  2:10       ` Herbert Xu
  2015-01-05 12:00       ` [0/4] input: Allow two consecutive calls to pungetc Herbert Xu
  1 sibling, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2014-10-30  2:10 UTC
  To: Jilles Tjoelker; +Cc: Eric Blake, Oleg Bulatov, dash

On Wed, Oct 29, 2014 at 10:52:30PM +0100, Jilles Tjoelker wrote:
>
> This implementation of pgetc_eatbnl() does not allow pushing back a
> backslash, since that would call pungetc() twice without an intervening
> pgetc(). However, some places do attempt to push back a backslash. As a
> result, a script file containing many repeated  ${w#\#}  will not be
> parsed correctly. There is a similar bug with repeated  $\#  but this is
> not specified by POSIX.

Good catch! I guess I'll do something similar to tokpushback
to handle this.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* [0/4] input: Allow two consecutive calls to pungetc
  2014-10-29 21:52     ` Jilles Tjoelker
  2014-10-30  2:10       ` Herbert Xu
@ 2015-01-05 12:00       ` Herbert Xu
  2015-01-05 12:01         ` [PATCH 1/4] input: Make preadbuffer static Herbert Xu
                           ` (3 more replies)
  1 sibling, 4 replies; 11+ messages in thread
From: Herbert Xu @ 2015-01-05 12:00 UTC
  To: Jilles Tjoelker; +Cc: Eric Blake, Oleg Bulatov, dash, Juergen Daubert

On Wed, Oct 29, 2014 at 10:52:30PM +0100, Jilles Tjoelker wrote:
>
> This implementation of pgetc_eatbnl() does not allow pushing back a
> backslash, since that would call pungetc() twice without an intervening
> pgetc(). However, some places do attempt to push back a backslash. As a
> result, a script file containing many repeated  ${w#\#}  will not be
> parsed correctly. There is a similar bug with repeated  $\#  but this is
> not specified by POSIX.

I finally got around to fixing this.  I've decided to do things
a little differently by making it possible to do two pungetc's in
a row.

When I get some spare time I would like to make the parser reentrant
so we can do PS4 properly and fix some other corner cases.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* [PATCH 1/4] input: Make preadbuffer static
  2015-01-05 12:00       ` [0/4] input: Allow two consecutive calls to pungetc Herbert Xu
@ 2015-01-05 12:01         ` Herbert Xu
  2015-01-05 12:01         ` [PATCH 2/4] input: Remove HETIO Herbert Xu
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2015-01-05 12:01 UTC
  To: Jilles Tjoelker, Eric Blake, Oleg Bulatov, dash, Juergen Daubert

The function preadbuffer should be static as it's only used in
input.c.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 src/input.c |    4 ++--
 src/input.h |    1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/input.c b/src/input.c
index f11ac84..aa5dcfc 100644
--- a/src/input.c
+++ b/src/input.c
@@ -109,6 +109,7 @@ EditLine *el;			/* cookie for editline package */
 STATIC void pushfile(void);
 static int preadfd(void);
 static void setinputfd(int fd, int push);
+static int preadbuffer(void);
 
 #ifdef mkinit
 INCLUDE <stdio.h>
@@ -222,8 +223,7 @@ retry:
  * 4) Process input up to the next newline, deleting nul characters.
  */
 
-int
-preadbuffer(void)
+static int preadbuffer(void)
 {
 	char *q;
 	int more;
diff --git a/src/input.h b/src/input.h
index 775291b..90ff6c3 100644
--- a/src/input.h
+++ b/src/input.h
@@ -52,7 +52,6 @@ extern char *parsenextc;	/* next character in input buffer */
 
 int pgetc(void);
 int pgetc2(void);
-int preadbuffer(void);
 void pungetc(void);
 void pushstring(char *, void *);
 void popstring(void);

* [PATCH 2/4] input: Remove HETIO
  2015-01-05 12:00       ` [0/4] input: Allow two consecutive calls to pungetc Herbert Xu
  2015-01-05 12:01         ` [PATCH 1/4] input: Make preadbuffer static Herbert Xu
@ 2015-01-05 12:01         ` Herbert Xu
  2015-01-05 12:01         ` [PATCH 3/4] input: Move all input state into parsefile Herbert Xu
  2015-01-05 12:01         ` [PATCH 4/4] input: Allow two consecutive calls to pungetc Herbert Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2015-01-05 12:01 UTC
  To: Jilles Tjoelker, Eric Blake, Oleg Bulatov, dash, Juergen Daubert

It hasn't been possible to build HETIO for over ten years.  So
let's just kill it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 src/Makefile.am |    2 
 src/hetio.c     |  397 --------------------------------------------------------
 src/hetio.h     |   22 ---
 src/input.c     |    9 -
 src/main.c      |    8 -
 src/trap.c      |    7 
 6 files changed, 1 insertion(+), 444 deletions(-)

diff --git a/src/Makefile.am b/src/Makefile.am
index 2a37381..120ffa2 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -26,7 +26,7 @@ dash_CFILES = \
 dash_SOURCES = \
 	$(dash_CFILES) \
 	alias.h arith_yacc.h bltin/bltin.h cd.h error.h eval.h exec.h \
-	expand.h hetio.h \
+	expand.h \
 	init.h input.h jobs.h machdep.h mail.h main.h memalloc.h miscbltin.h \
 	myhistedit.h mystring.h options.h output.h parser.h redir.h shell.h \
 	show.h system.h trap.h var.h
diff --git a/src/hetio.c b/src/hetio.c
deleted file mode 100644
index f7d175f..0000000
--- a/src/hetio.c
+++ /dev/null
@@ -1,397 +0,0 @@
-/*
- * Termios command line History and Editting for NetBSD sh (ash)
- * Copyright (c) 1999
- *	Main code:	Adam Rogoyski <rogoyski@cs.utexas.edu>
- *	Etc:		Dave Cinege <dcinege@psychosis.com>
- *
- * You may use this code as you wish, so long as the original author(s)
- * are attributed in any redistributions of the source code.
- * This code is 'as is' with no warranty.
- * This code may safely be consumed by a BSD or GPL license.
- *
- * v 0.5  19990328	Initial release
- *
- * Future plans: Simple file and path name completion. (like BASH)
- *
- */
-
-/*
-Usage and Known bugs:
-	Terminal key codes are not extensive, and more will probably
-	need to be added. This version was created on Debian GNU/Linux 2.x.
-	Delete, Backspace, Home, End, and the arrow keys were tested
-	to work in an Xterm and console. Ctrl-A also works as Home.
-	Ctrl-E also works as End. Ctrl-D and Ctrl-U perform their respective
-	functions. The binary size increase is <3K.
-
-	Editting will not display correctly for lines greater then the
-	terminal width. (more then one line.) However, history will.
-*/
-
-#include <stdio.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <string.h>
-#include <termios.h>
-#include <ctype.h>
-#include <sys/ioctl.h>
-
-#include "input.h"
-#include "output.h"
-
-#include "hetio.h"
-
-
-#define  MAX_HISTORY   15			/* Maximum length of the linked list for the command line history */
-
-#define ESC	27
-#define DEL	127
-
-static struct history *his_front = NULL;	/* First element in command line list */
-static struct history *his_end = NULL;		/* Last element in command line list */
-static struct termios old_term, new_term;	/* Current termio and the previous termio before starting ash */
-
-static int history_counter = 0;			/* Number of commands in history list */
-static int reset_term = 0;			/* Set to true if the terminal needs to be reset upon exit */
-static int hetio_inter = 0;
-
-struct history
-{
-   char *s;
-   struct history *p;
-   struct history *n;
-};
-
-
-void input_delete    (int);
-void input_home      (int *);
-void input_end       (int *, int);
-void input_backspace (int *, int *);
-
-
-
-void hetio_init(void)
-{
-	hetio_inter = 1;
-}
-
-
-void hetio_reset_term(void)
-{
-	if (reset_term)
-		tcsetattr(1, TCSANOW, &old_term);
-}
-
-
-void setIO(struct termios *new, struct termios *old)	/* Set terminal IO to canonical mode, and save old term settings. */
-{
-	tcgetattr(0, old);
-	memcpy(new, old, sizeof(*new));
-	new->c_cc[VMIN] = 1;
-	new->c_cc[VTIME] = 0;
-	new->c_lflag &= ~ICANON; /* unbuffered input */
-	new->c_lflag &= ~ECHO;
-	tcsetattr(0, TCSANOW, new);
-}
-
-void input_home(int *cursor)				/* Command line input routines */
-{
- 	while (*cursor > 0) {
-		out1c('\b');
-		--*cursor;
-	}
-	flushout(out1);
-}
-
-
-void input_delete(int cursor)
-{
-	int j = 0;
-
-	memmove(parsenextc + cursor, parsenextc + cursor + 1,
-		BUFSIZ - cursor - 1);
-	for (j = cursor; j < (BUFSIZ - 1); j++) {
-		if (!*(parsenextc + j))
-			break;
-		else
-			out1c(*(parsenextc + j));
-	}
-
-	out1str(" \b");
-
-	while (j-- > cursor)
-		out1c('\b');
-	flushout(out1);
-}
-
-
-void input_end(int *cursor, int len)
-{
-	while (*cursor < len) {
-		out1str("\033[C");
-		++*cursor;
-	}
-	flushout(out1);
-}
-
-
-void
-input_backspace(int *cursor, int *len)
-{
-	int j = 0;
-
-	if (*cursor > 0) {
-		out1str("\b \b");
-		--*cursor;
-		memmove(parsenextc + *cursor, parsenextc + *cursor + 1,
-			BUFSIZ - *cursor + 1);
-
-		for (j = *cursor; j < (BUFSIZ - 1); j++) {
-			if (!*(parsenextc + j))
-				break;
-			else
-				out1c(*(parsenextc + j));
-		}
-
-		out1str(" \b");
-
-		while (j-- > *cursor)
-			out1c('\b');
-
-		--*len;
-		flushout(out1);
-	}
-}
-
-int hetio_read_input(int fd)
-{
-	int nr = 0;
-
-	/* Are we an interactive shell? */
-	if (!hetio_inter || fd) {
-		return -255;
-	} else {
-		int len = 0;
-		int j = 0;
-		int cursor = 0;
-		int break_out = 0;
-		int ret = 0;
-		char c = 0;
-		struct history *hp = his_end;
-
-		if (!reset_term) {
-			setIO(&new_term, &old_term);
-			reset_term = 1;
-		} else {
-			tcsetattr(0, TCSANOW, &new_term);
-		}
-
-		memset(parsenextc, 0, BUFSIZ);
-
-		while (1) {
-			if ((ret = read(fd, &c, 1)) < 1)
-				return ret;
-
-			switch (c) {
-   				case 1:		/* Control-A Beginning of line */
-   					input_home(&cursor);
-					break;
-				case 5:		/* Control-E EOL */
-					input_end(&cursor, len);
-					break;
-				case 4:		/* Control-D */
-					if (!len)
-						exitshell(0);
-					break;
-				case 21: 	/* Control-U */
-					/* Return to begining of line. */
-					for (; cursor > 0; cursor--)
-						out1c('\b');
-					/* Erase old command. */
-					for (j = 0; j < len; j++) {
-						/*
-						 * Clear buffer while we're at
-						 * it.
-						 */
-						parsenextc[j] = 0;
-						out1c(' ');
-					}
-					/* return to begining of line */
-					for (; len > 0; len--)
-						out1c('\b');
-					flushout(out1);
-					break;
-				case '\b':	/* Backspace */
-				case DEL:
-					input_backspace(&cursor, &len);
-					break;
-				case '\n':	/* Enter */
-					*(parsenextc + len++ + 1) = c;
-					out1c(c);
-					flushout(out1);
-					break_out = 1;
-					break;
-				case ESC:	/* escape sequence follows */
-					if ((ret = read(fd, &c, 1)) < 1)
-						return ret;
-
-					if (c == '[' ) {    /* 91 */
-						if ((ret = read(fd, &c, 1)) < 1)
-							return ret;
-
-						switch (c) {
-							case 'A':
-								if (hp && hp->p) {		/* Up */
-									hp = hp->p;
-									goto hop;
-								}
-								break;
-							case 'B':
-								if (hp && hp->n && hp->n->s) {	/* Down */
-									hp = hp->n;
-									goto hop;
-								}
-								break;
-
-hop:						/* hop */
-								len = strlen(parsenextc);
-
-								for (; cursor > 0; cursor--)		/* return to begining of line */
-									out1c('\b');
-
-		   						for (j = 0; j < len; j++)		/* erase old command */
-									out1c(' ');
-
-								for (; j > 0; j--)		/* return to begining of line */
-									out1c('\b');
-
-								strcpy (parsenextc, hp->s);		/* write new command */
-								len = strlen (hp->s);
-								out1str(parsenextc);
-								flushout(out1);
-								cursor = len;
-								break;
-							case 'C':		/* Right */
-								if (cursor < len) {
-									out1str("\033[C");
-									cursor++;
-									flushout(out1);
-						 		}
-								break;
-							case 'D':		/* Left */
-								if (cursor > 0) {
-									out1str("\033[D");
-									cursor--;
-									flushout(out1);
-								}
-								break;
-							case '3':		/* Delete */
-								if (cursor != len) {
-									input_delete(cursor);
-									len--;
-								}
-								break;
-							case '1':		/* Home (Ctrl-A) */
-								input_home(&cursor);
-								break;
-							case '4':		/* End (Ctrl-E) */
-								input_end(&cursor, len);
-								break;
-						}
-						if (c == '1' || c == '3' || c == '4')
-							if ((ret = read(fd, &c, 1)) < 1)
-								return ret;  /* read 126 (~) */
-					}
-
-					if (c == 'O') {	/* 79 */
-						if ((ret = read(fd, &c, 1)) < 1)
-							return ret;
-						switch (c) {
-							case 'H':		/* Home (xterm) */
-      								input_home(&cursor);
-								break;
-							case 'F':		/* End (xterm_ */
-								input_end(&cursor, len);
-								break;
-						}
-					}
-
-					c = 0;
-					break;
-
-				default:				/* If it's regular input, do the normal thing */
-					if (!isprint(c))		/* Skip non-printable characters */
-						break;
-
-	       				if (len >= (BUFSIZ - 2))	/* Need to leave space for enter */
-		  				break;
-
-					len++;
-
-					if (cursor == (len - 1)) {	/* Append if at the end of the line */
-						*(parsenextc + cursor) = c;
-					} else {			/* Insert otherwise */
-						memmove(parsenextc + cursor + 1, parsenextc + cursor,
-							len - cursor - 1);
-
-						*(parsenextc + cursor) = c;
-
-						for (j = cursor; j < len; j++)
-							out1c(*(parsenextc + j));
-						for (; j > cursor; j--)
-							out1str("\033[D");
-					}
-
-					cursor++;
-					out1c(c);
-					flushout(out1);
-					break;
-			}
-
-			if (break_out)		/* Enter is the command terminator, no more input. */
-				break;
-		}
-
-		nr = len + 1;
-		tcsetattr(0, TCSANOW, &old_term);
-
-		if (*(parsenextc)) {		/* Handle command history log */
-			struct history *h = his_end;
-
-			if (!h) {       /* No previous history */
-				h = his_front = malloc(sizeof (struct history));
-				h->n = malloc(sizeof (struct history));
-				h->p = NULL;
-				h->s = strdup(parsenextc);
-
-				h->n->p = h;
-				h->n->n = NULL;
-				h->n->s = NULL;
-				his_end = h->n;
-				history_counter++;
-			} else {	/* Add a new history command */
-
-				h->n = malloc(sizeof (struct history));
-
-				h->n->p = h;
-				h->n->n = NULL;
-				h->n->s = NULL;
-				h->s = strdup(parsenextc);
-				his_end = h->n;
-
-				if (history_counter >= MAX_HISTORY) {	/* After max history, remove the last known command */
-					struct history *p = his_front->n;
-
-					p->p = NULL;
-					free(his_front->s);
-					free(his_front);
-					his_front = p;
-				} else {
-					history_counter++;
-				}
-			}
-		}
-	}
-
-	return nr;
-}
diff --git a/src/hetio.h b/src/hetio.h
deleted file mode 100644
index c3e915c..0000000
--- a/src/hetio.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/*
- * Termios command line History and Editting for NetBSD sh (ash)
- * Copyright (c) 1999
- *	Main code:	Adam Rogoyski <rogoyski@cs.utexas.edu> 
- *	Etc:		Dave Cinege <dcinege@psychosis.com>
- *
- * You may use this code as you wish, so long as the original author(s)
- * are attributed in any redistributions of the source code.
- * This code is 'as is' with no warranty.
- * This code may safely be consumed by a BSD or GPL license.
- *
- * v 0.5  19990328	Initial release 
- *
- * Future plans: Simple file and path name completion. (like BASH)
- *
- */
-
-void hetio_init(void);
-int hetio_read_input(int fd);
-void hetio_reset_term(void);
-
-extern int hetio_inter;
diff --git a/src/input.c b/src/input.c
index aa5dcfc..232bb9c 100644
--- a/src/input.c
+++ b/src/input.c
@@ -58,10 +58,6 @@
 #include "myhistedit.h"
 #endif
 
-#ifdef HETIO
-#include "hetio.h"
-#endif
-
 #define EOF_NLEFT -99		/* value of parsenleft when EOF pushed back */
 #define IBUFSIZ (BUFSIZ + 1)
 
@@ -188,11 +184,6 @@ retry:
 
 	} else
 #endif
-
-#ifdef HETIO
-		nr = hetio_read_input(parsefile->fd);
-		if (nr == -255)
-#endif
 		nr = read(parsefile->fd, buf, IBUFSIZ - 1);
 
 
diff --git a/src/main.c b/src/main.c
index 985e8c4..bedb663 100644
--- a/src/main.c
+++ b/src/main.c
@@ -60,10 +60,6 @@
 #include "exec.h"
 #include "cd.h"
 
-#ifdef HETIO
-#include "hetio.h"
-#endif
-
 #define PROFILE 0
 
 int rootpid;
@@ -206,10 +202,6 @@ cmdloop(int top)
 	int numeof = 0;
 
 	TRACE(("cmdloop(%d) called\n", top));
-#ifdef HETIO
-	if(iflag && top)
-		hetio_init();
-#endif
 	for (;;) {
 		int skip;
 
diff --git a/src/trap.c b/src/trap.c
index b924661..82d4263 100644
--- a/src/trap.c
+++ b/src/trap.c
@@ -51,10 +51,6 @@
 #include "trap.h"
 #include "mystring.h"
 
-#ifdef HETIO
-#include "hetio.h"
-#endif

* [PATCH 3/4] input: Move all input state into parsefile
  2015-01-05 12:00       ` [0/4] input: Allow two consecutive calls to pungetc Herbert Xu
  2015-01-05 12:01         ` [PATCH 1/4] input: Make preadbuffer static Herbert Xu
  2015-01-05 12:01         ` [PATCH 2/4] input: Remove HETIO Herbert Xu
@ 2015-01-05 12:01         ` Herbert Xu
  2015-01-05 12:01         ` [PATCH 4/4] input: Allow two consecutive calls to pungetc Herbert Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2015-01-05 12:01 UTC (permalink / raw)
  To: Jilles Tjoelker, Eric Blake, Oleg Bulatov, dash, Juergen Daubert

Currently we maintain a copy of the input state (plinno, parsenleft,
parselleft and parsenextc) outside of parsefile.  This is redundant
and makes reentrancy difficult.  This patch removes the duplicate
globals so that everyone simply uses parsefile; plinno is kept as a
macro for parsefile->linno.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
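For readers following along, here is a minimal standalone sketch of
the idea, not code from this patch.  All names (struct src, src_push,
src_pop, src_getc, cur_linno) are hypothetical and the structure is
reduced to a few fields.  It only tries to show why push and pop no
longer need to save and restore anything once the state lives in the
per-source structure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct parsefile. */
struct src {
	struct src *prev;	/* preceding source on the stack */
	int linno;		/* current line of this source */
	int nleft;		/* characters left in this source */
	const char *nextc;	/* next character to hand out */
};

static struct src base_src = { NULL, 1, 0, "" };
static struct src *cur = &base_src;	/* the only global: current source */

/* Same trick as the new plinno macro: no separate global to keep in sync. */
#define cur_linno (cur->linno)

/* Make string s the current source.  Returns 0, or -1 if malloc fails. */
static int src_push(const char *s)
{
	struct src *p = malloc(sizeof(*p));

	if (p == NULL)
		return -1;
	p->prev = cur;		/* the old state simply stays where it is */
	p->linno = 1;
	p->nleft = (int)strlen(s);
	p->nextc = s;
	cur = p;
	return 0;
}

/* Return to the previous source; nothing has to be copied back. */
static void src_pop(void)
{
	struct src *p = cur;

	assert(p != &base_src);	/* popping the base source is a caller bug */
	cur = p->prev;
	free(p);
}

/* Next character of the current source, or -1 when it is exhausted. */
static int src_getc(void)
{
	int c;

	if (cur->nleft <= 0)
		return -1;
	cur->nleft--;
	c = (unsigned char)*cur->nextc++;
	if (c == '\n')
		cur->linno++;	/* simplification: in dash the caller maintains plinno */
	return c;
}
```

In this sketch, reading part of one source, calling src_push() for a
nested string, reading from it and calling src_pop() resumes the outer
source exactly where it stopped, line number included, with no
copy-in or copy-out step of the kind removed from pushfile and popfile
below.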

 src/input.c |  107 ++++++++++++++++++++----------------------------------------
 src/input.h |   33 ++++++++++++++++--
 2 files changed, 67 insertions(+), 73 deletions(-)

diff --git a/src/input.c b/src/input.c
index 232bb9c..6223a73 100644
--- a/src/input.c
+++ b/src/input.c
@@ -61,38 +61,7 @@
 #define EOF_NLEFT -99		/* value of parsenleft when EOF pushed back */
 #define IBUFSIZ (BUFSIZ + 1)
 
-MKINIT
-struct strpush {
-	struct strpush *prev;	/* preceding string on stack */
-	char *prevstring;
-	int prevnleft;
-	struct alias *ap;	/* if push was associated with an alias */
-	char *string;		/* remember the string since it may change */
-};
 
-/*
- * The parsefile structure pointed to by the global variable parsefile
- * contains information about the current file being read.
- */
-
-MKINIT
-struct parsefile {
-	struct parsefile *prev;	/* preceding file on stack */
-	int linno;		/* current line */
-	int fd;			/* file descriptor (or -1 if string) */
-	int nleft;		/* number of chars left in this line */
-	int lleft;		/* number of chars left in this buffer */
-	char *nextc;		/* next char in buffer */
-	char *buf;		/* input buffer */
-	struct strpush *strpush; /* for pushing strings at this level */
-	struct strpush basestrpush; /* so pushing one is fast */
-};
-
-
-int plinno = 1;			/* input line number */
-int parsenleft;			/* copy of parsefile->nleft */
-MKINIT int parselleft;		/* copy of parsefile->lleft */
-char *parsenextc;		/* copy of parsefile->nextc */
 MKINIT struct parsefile basepf;	/* top level input file */
 MKINIT char basebuf[IBUFSIZ];	/* buffer for top level input file */
 struct parsefile *parsefile = &basepf;	/* current input file */
@@ -114,10 +83,12 @@ INCLUDE "error.h"
 
 INIT {
 	basepf.nextc = basepf.buf = basebuf;
+	basepf.linno = 1;
 }
 
 RESET {
-	parselleft = parsenleft = 0;	/* clear input buffer */
+	/* clear input buffer */
+	basepf.lleft = basepf.nleft = 0;
 	popallfiles();
 }
 #endif
@@ -131,8 +102,8 @@ RESET {
 int
 pgetc(void)
 {
-	if (--parsenleft >= 0)
-		return (signed char)*parsenextc++;
+	if (--parsefile->nleft >= 0)
+		return (signed char)*parsefile->nextc++;
 	else
 		return preadbuffer();
 }
@@ -158,7 +129,7 @@ preadfd(void)
 {
 	int nr;
 	char *buf =  parsefile->buf;
-	parsenextc = buf;
+	parsefile->nextc = buf;
 
 retry:
 #ifndef SMALL
@@ -225,29 +196,32 @@ static int preadbuffer(void)
 
 	while (unlikely(parsefile->strpush)) {
 		if (
-			parsenleft == -1 && parsefile->strpush->ap &&
-			parsenextc[-1] != ' ' && parsenextc[-1] != '\t'
+			parsefile->nleft == -1 &&
+			parsefile->strpush->ap &&
+			parsefile->nextc[-1] != ' ' &&
+			parsefile->nextc[-1] != '\t'
 		) {
 			return PEOA;
 		}
 		popstring();
-		if (--parsenleft >= 0)
-			return (signed char)*parsenextc++;
+		if (--parsefile->nleft >= 0)
+			return (signed char)*parsefile->nextc++;
 	}
-	if (unlikely(parsenleft == EOF_NLEFT || parsefile->buf == NULL))
+	if (unlikely(parsefile->nleft == EOF_NLEFT ||
+		     parsefile->buf == NULL))
 		return PEOF;
 	flushall();
 
-	more = parselleft;
+	more = parsefile->lleft;
 	if (more <= 0) {
 again:
 		if ((more = preadfd()) <= 0) {
-			parselleft = parsenleft = EOF_NLEFT;
+			parsefile->lleft = parsefile->nleft = EOF_NLEFT;
 			return PEOF;
 		}
 	}
 
-	q = parsenextc;
+	q = parsefile->nextc;
 
 	/* delete nul characters */
 #ifndef SMALL
@@ -265,7 +239,7 @@ again:
 			q++;
 
 			if (c == '\n') {
-				parsenleft = q - parsenextc - 1;
+				parsefile->nleft = q - parsefile->nextc - 1;
 				break;
 			}
 
@@ -282,13 +256,13 @@ again:
 		}
 
 		if (more <= 0) {
-			parsenleft = q - parsenextc - 1;
-			if (parsenleft < 0)
+			parsefile->nleft = q - parsefile->nextc - 1;
+			if (parsefile->nleft < 0)
 				goto again;
 			break;
 		}
 	}
-	parselleft = more;
+	parsefile->lleft = more;
 
 	savec = *q;
 	*q = '\0';
@@ -298,13 +272,13 @@ again:
 		HistEvent he;
 		INTOFF;
 		history(hist, &he, whichprompt == 1? H_ENTER : H_APPEND,
-		    parsenextc);
+			parsefile->nextc);
 		INTON;
 	}
 #endif
 
 	if (vflag) {
-		out2str(parsenextc);
+		out2str(parsefile->nextc);
 #ifdef FLUSHERR
 		flushout(out2);
 #endif
@@ -312,7 +286,7 @@ again:
 
 	*q = savec;
 
-	return (signed char)*parsenextc++;
+	return (signed char)*parsefile->nextc++;
 }
 
 /*
@@ -323,8 +297,8 @@ again:
 void
 pungetc(void)
 {
-	parsenleft++;
-	parsenextc--;
+	parsefile->nleft++;
+	parsefile->nextc--;
 }
 
 /*
@@ -346,15 +320,15 @@ pushstring(char *s, void *ap)
 		parsefile->strpush = sp;
 	} else
 		sp = parsefile->strpush = &(parsefile->basestrpush);
-	sp->prevstring = parsenextc;
-	sp->prevnleft = parsenleft;
+	sp->prevstring = parsefile->nextc;
+	sp->prevnleft = parsefile->nleft;
 	sp->ap = (struct alias *)ap;
 	if (ap) {
 		((struct alias *)ap)->flag |= ALIASINUSE;
 		sp->string = s;
 	}
-	parsenextc = s;
-	parsenleft = len;
+	parsefile->nextc = s;
+	parsefile->nleft = len;
 	INTON;
 }
 
@@ -365,7 +339,8 @@ popstring(void)
 
 	INTOFF;
 	if (sp->ap) {
-		if (parsenextc[-1] == ' ' || parsenextc[-1] == '\t') {
+		if (parsefile->nextc[-1] == ' ' ||
+		    parsefile->nextc[-1] == '\t') {
 			checkkwd |= CHKALIAS;
 		}
 		if (sp->string != sp->ap->val) {
@@ -376,8 +351,8 @@ popstring(void)
 			unalias(sp->ap->name);
 		}
 	}
-	parsenextc = sp->prevstring;
-	parsenleft = sp->prevnleft;
+	parsefile->nextc = sp->prevstring;
+	parsefile->nleft = sp->prevnleft;
 /*dprintf("*** calling popstring: restoring to '%s'\n", parsenextc);*/
 	parsefile->strpush = sp->prev;
 	if (sp != &(parsefile->basestrpush))
@@ -426,7 +401,7 @@ setinputfd(int fd, int push)
 	parsefile->fd = fd;
 	if (parsefile->buf == NULL)
 		parsefile->buf = ckmalloc(IBUFSIZ);
-	parselleft = parsenleft = 0;
+	parsefile->lleft = parsefile->nleft = 0;
 	plinno = 1;
 }
 
@@ -440,8 +415,8 @@ setinputstring(char *string)
 {
 	INTOFF;
 	pushfile();
-	parsenextc = string;
-	parsenleft = strlen(string);
+	parsefile->nextc = string;
+	parsefile->nleft = strlen(string);
 	parsefile->buf = NULL;
 	plinno = 1;
 	INTON;
@@ -459,10 +434,6 @@ pushfile(void)
 {
 	struct parsefile *pf;
 
-	parsefile->nleft = parsenleft;
-	parsefile->lleft = parselleft;
-	parsefile->nextc = parsenextc;
-	parsefile->linno = plinno;
 	pf = (struct parsefile *)ckmalloc(sizeof (struct parsefile));
 	pf->prev = parsefile;
 	pf->fd = -1;
@@ -486,10 +457,6 @@ popfile(void)
 		popstring();
 	parsefile = pf->prev;
 	ckfree(pf);
-	parsenleft = parsefile->nleft;
-	parselleft = parsefile->lleft;
-	parsenextc = parsefile->nextc;
-	plinno = parsefile->linno;
 	INTON;
 }
 
diff --git a/src/input.h b/src/input.h
index 90ff6c3..ad8b463 100644
--- a/src/input.h
+++ b/src/input.h
@@ -41,14 +41,41 @@ enum {
 	INPUT_NOFILE_OK = 2,
 };
 
+struct alias;
+
+struct strpush {
+	struct strpush *prev;	/* preceding string on stack */
+	char *prevstring;
+	int prevnleft;
+	struct alias *ap;	/* if push was associated with an alias */
+	char *string;		/* remember the string since it may change */
+};
+
+/*
+ * The parsefile structure pointed to by the global variable parsefile
+ * contains information about the current file being read.
+ */
+
+struct parsefile {
+	struct parsefile *prev;	/* preceding file on stack */
+	int linno;		/* current line */
+	int fd;			/* file descriptor (or -1 if string) */
+	int nleft;		/* number of chars left in this line */
+	int lleft;		/* number of chars left in this buffer */
+	char *nextc;		/* next char in buffer */
+	char *buf;		/* input buffer */
+	struct strpush *strpush; /* for pushing strings at this level */
+	struct strpush basestrpush; /* so pushing one is fast */
+};
+
+extern struct parsefile *parsefile;
+
 /*
  * The input line number.  Input.c just defines this variable, and saves
  * and restores it when files are pushed and popped.  The user of this
  * package must set its value.
  */
-extern int plinno;
-extern int parsenleft;		/* number of characters left in input buffer */
-extern char *parsenextc;	/* next character in input buffer */
+#define plinno (parsefile->linno)
 
 int pgetc(void);
 int pgetc2(void);

* [PATCH 4/4] input: Allow two consecutive calls to pungetc
  2015-01-05 12:00       ` [0/4] input: Allow two consecutive calls to pungetc Herbert Xu
                           ` (2 preceding siblings ...)
  2015-01-05 12:01         ` [PATCH 3/4] input: Move all input state into parsefile Herbert Xu
@ 2015-01-05 12:01         ` Herbert Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2015-01-05 12:01 UTC (permalink / raw)
  To: Jilles Tjoelker, Eric Blake, Oleg Bulatov, dash, Juergen Daubert

Commit ef91d3d6a4c39421fd3a391e02cd82f9f3aee4a8 ([PARSER]
Handle backslash newlines properly after dollar sign) created
cases where we make two consecutive calls to pungetc.  As we
don't explicitly support that, there are corner cases where you
end up with garbage input, leading to undefined behaviour.

This patch adds explicit support for two consecutive calls to
pungetc by remembering the last two characters returned by pgetc
and counting the outstanding calls to pungetc.

Reported-by: Jilles Tjoelker <jilles@stack.nl>
Reported-by: Juergen Daubert <jue@jue.li>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
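As an illustration only, here is a minimal standalone sketch of the
two-character pushback scheme.  It is not the patch itself: struct
reader, reader_init, reader_getc, reader_ungetc and R_EOF are
hypothetical names, and the input is assumed to be a plain
NUL-terminated string rather than a parsefile.

```c
#include <assert.h>

#define R_EOF (-1)

/* Hypothetical reader over a NUL-terminated string. */
struct reader {
	const char *nextc;	/* next unread character */
	int lastc[2];		/* lastc[0] is the most recent character,
				 * lastc[1] the one before it */
	int unget;		/* outstanding pushbacks, 0..2 */
};

static void reader_init(struct reader *r, const char *s)
{
	r->nextc = s;
	r->lastc[0] = r->lastc[1] = R_EOF;
	r->unget = 0;
}

static int reader_getc(struct reader *r)
{
	int c;

	/* Replay pushed-back characters first, oldest one first. */
	if (r->unget)
		return r->lastc[--r->unget];

	c = *r->nextc ? (unsigned char)*r->nextc++ : R_EOF;

	/* Remember the last two characters actually read. */
	r->lastc[1] = r->lastc[0];
	r->lastc[0] = c;
	return c;
}

/* Undo one call to reader_getc; at most two may be outstanding. */
static void reader_ungetc(struct reader *r)
{
	assert(r->unget < 2);
	r->unget++;
}
```

Because the pushback is only a counter over remembered characters, the
read pointer is never moved backwards.  That also lets an end-of-input
marker be pushed back like any other character in this sketch.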

 src/input.c |   30 ++++++++++++++++++++++--------
 src/input.h |   12 ++++++++++++
 2 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/src/input.c b/src/input.c
index 6223a73..06c08d4 100644
--- a/src/input.c
+++ b/src/input.c
@@ -102,10 +102,20 @@ RESET {
 int
 pgetc(void)
 {
+	int c;
+
+	if (parsefile->unget)
+		return parsefile->lastc[--parsefile->unget];
+
 	if (--parsefile->nleft >= 0)
-		return (signed char)*parsefile->nextc++;
+		c = (signed char)*parsefile->nextc++;
 	else
-		return preadbuffer();
+		c = preadbuffer();
+
+	parsefile->lastc[1] = parsefile->lastc[0];
+	parsefile->lastc[0] = c;
+
+	return c;
 }
 
 
@@ -194,7 +204,7 @@ static int preadbuffer(void)
 #endif
 	char savec;
 
-	while (unlikely(parsefile->strpush)) {
+	if (unlikely(parsefile->strpush)) {
 		if (
 			parsefile->nleft == -1 &&
 			parsefile->strpush->ap &&
@@ -204,8 +214,7 @@ static int preadbuffer(void)
 			return PEOA;
 		}
 		popstring();
-		if (--parsefile->nleft >= 0)
-			return (signed char)*parsefile->nextc++;
+		return pgetc();
 	}
 	if (unlikely(parsefile->nleft == EOF_NLEFT ||
 		     parsefile->buf == NULL))
@@ -290,15 +299,14 @@ again:
 }
 
 /*
- * Undo the last call to pgetc.  Only one character may be pushed back.
+ * Undo a call to pgetc.  Only two characters may be pushed back.
  * PEOF may be pushed back.
  */
 
 void
 pungetc(void)
 {
-	parsefile->nleft++;
-	parsefile->nextc--;
+	parsefile->unget++;
 }
 
 /*
@@ -322,6 +330,8 @@ pushstring(char *s, void *ap)
 		sp = parsefile->strpush = &(parsefile->basestrpush);
 	sp->prevstring = parsefile->nextc;
 	sp->prevnleft = parsefile->nleft;
+	sp->unget = parsefile->unget;
+	memcpy(sp->lastc, parsefile->lastc, sizeof(sp->lastc));
 	sp->ap = (struct alias *)ap;
 	if (ap) {
 		((struct alias *)ap)->flag |= ALIASINUSE;
@@ -329,6 +339,7 @@ pushstring(char *s, void *ap)
 	}
 	parsefile->nextc = s;
 	parsefile->nleft = len;
+	parsefile->unget = 0;
 	INTON;
 }
 
@@ -353,6 +364,8 @@ popstring(void)
 	}
 	parsefile->nextc = sp->prevstring;
 	parsefile->nleft = sp->prevnleft;
+	parsefile->unget = sp->unget;
+	memcpy(parsefile->lastc, sp->lastc, sizeof(sp->lastc));
 /*dprintf("*** calling popstring: restoring to '%s'\n", parsenextc);*/
 	parsefile->strpush = sp->prev;
 	if (sp != &(parsefile->basestrpush))
@@ -439,6 +452,7 @@ pushfile(void)
 	pf->fd = -1;
 	pf->strpush = NULL;
 	pf->basestrpush.prev = NULL;
+	pf->unget = 0;
 	parsefile = pf;
 }
 
diff --git a/src/input.h b/src/input.h
index ad8b463..ec97c1d 100644
--- a/src/input.h
+++ b/src/input.h
@@ -49,6 +49,12 @@ struct strpush {
 	int prevnleft;
 	struct alias *ap;	/* if push was associated with an alias */
 	char *string;		/* remember the string since it may change */
+
+	/* Remember last two characters for pungetc. */
+	int lastc[2];
+
+	/* Number of outstanding calls to pungetc. */
+	int unget;
 };
 
 /*
@@ -66,6 +72,12 @@ struct parsefile {
 	char *buf;		/* input buffer */
 	struct strpush *strpush; /* for pushing strings at this level */
 	struct strpush basestrpush; /* so pushing one is fast */
+
+	/* Remember last two characters for pungetc. */
+	int lastc[2];
+
+	/* Number of outstanding calls to pungetc. */
+	int unget;
 };
 
 extern struct parsefile *parsefile;

Thread overview: 11+ messages
2014-08-26 12:15 Line continuation and variables Oleg Bulatov
2014-08-26 12:34 ` Eric Blake
2014-09-29 14:55   ` Herbert Xu
2014-09-29 14:57     ` Herbert Xu
2014-10-29 21:52     ` Jilles Tjoelker
2014-10-30  2:10       ` Herbert Xu
2015-01-05 12:00       ` [0/4] input: Allow two consecutive calls to pungetc Herbert Xu
2015-01-05 12:01         ` [PATCH 1/4] input: Make preadbuffer static Herbert Xu
2015-01-05 12:01         ` [PATCH 2/4] input: Remove HETIO Herbert Xu
2015-01-05 12:01         ` [PATCH 3/4] input: Move all input state into parsefile Herbert Xu
2015-01-05 12:01         ` [PATCH 4/4] input: Allow two consecutive calls to pungetc Herbert Xu
