dash.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Harald van Dijk <harald@gigawatt.nl>
To: Gioele Barabucci <gioele@svario.it>
Cc: dash@vger.kernel.org
Subject: Re: dash: read does not ignore trailing spaces
Date: Thu, 3 Dec 2015 22:02:14 +0100	[thread overview]
Message-ID: <5660ADD6.4020308@gigawatt.nl> (raw)
In-Reply-To: <n3nrqu$mj2$1@ger.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 1829 bytes --]

On 02/12/2015 23:37, Gioele Barabucci wrote:
> Hello,
>
> I am forwarding a bug [1] reported by a Debian user: `read` does not
> ignore trailing spaces. The current version of dash is affected by
> this bug.
>
> A simple test from the original reporter:
>
>      $ dash -c 'echo "  a b  " | { read v ; echo "<$v>" ; }'
>      <a b  >
>
>      $ bash -c 'echo "  a b  " | { read v ; echo "<$v>" ; }'
>      <a b>
>
> Other shells like posh and mksh behave like bash.

This is indeed a bug based on the current specification. In the past, 
the specification was unclear and the dash interpretation was a 
legitimate one, but currently it explicitly spells out that trailing IFS 
whitespace gets ignored.

However, unless I'm misreading the spec, mksh and bash don't seem to 
implement it properly either: only trailing IFS whitespace is supposed 
to be dropped. IFS non-whitespace is supposed to remain, even at the end 
of the input. mksh and bash remove it, posh and zsh leave it in:

   $ for shell in bash mksh posh zsh; do printf %s: "$shell"; $shell -c 
'IFS=,; echo a, | { read v; echo "<$v>"; }'; done
   bash:<a>
   mksh:<a>
   posh:<a,>
   zsh:<a,>

As far as I can tell, the posh/zsh behaviour is the correct behaviour, 
but I'm not convinced yet my interpretation is correct.

Attached is a not fully tested proof of concept to implement the 
posh/zsh behaviour in dash by extending ifsbreakup() to allow specifying 
a maximum number of arguments instead of fixing it up in 
readcmd_handle_line(). It returns <a b> in your test, and <a,> in mine. 
Feedback welcome.

Cheers,
Harald van Dijk

> This error is reproducible with dash 0.5.7 and with the current master
> git master branch, commit 2e5842258bd5b252ffdaa630db09c9a19a9717ca.
>
> [1] https://bugs.debian.org/794965
>
> --
> Gioele Barabucci <gioele@svario.it>

[-- Attachment #2: dash-read-ifs.patch --]
[-- Type: text/plain, Size: 3999 bytes --]

diff --git a/src/expand.c b/src/expand.c
index b2d710d..6afd562 100644
--- a/src/expand.c
+++ b/src/expand.c
@@ -203,7 +203,7 @@ expandarg(union node *arg, struct arglist *arglist, int flag)
 	 * TODO - EXP_REDIR
 	 */
 	if (flag & EXP_FULL) {
-		ifsbreakup(p, &exparg);
+		ifsbreakup(p, 0, &exparg);
 		*exparg.lastp = NULL;
 		exparg.lastp = &exparg.list;
 		expandmeta(exparg.list, flag);
@@ -1016,9 +1016,11 @@ recordregion(int start, int end, int nulonly)
  * Break the argument string into pieces based upon IFS and add the
  * strings to the argument list.  The regions of the string to be
  * searched for IFS characters have been stored by recordregion.
+ * If maxargs is non-zero, at most maxargs arguments will be created, by
+ * joining together the last arguments.
  */
 void
-ifsbreakup(char *string, struct arglist *arglist)
+ifsbreakup(char *string, int maxargs, struct arglist *arglist)
 {
 	struct ifsregion *ifsp;
 	struct strlist *sp;
@@ -1054,12 +1056,36 @@ ifsbreakup(char *string, struct arglist *arglist)
 						start = p;
 						continue;
 					}
+					/* If only reading one more argument, combine any field terminators,
+					 * except for trailing IFS whitespace. */
+					if (maxargs == 1) {
+						q = p;
+						p++;
+						if (ifsspc) {
+							/* Ignore IFS whitespace at end */
+							for (;;) {
+								if (p >= string + ifsp->endoff) {
+									*q = '\0';
+									goto add;
+								}
+								if (*p == (char)CTLESC)
+									p++;
+								ifsspc = strchr(ifs, *p) && strchr(defifs, *p);
+								p++;
+								if (!ifsspc) {
+									break;
+								}
+							}
+						}
+						continue;
+					}
 					*q = '\0';
 					sp = (struct strlist *)stalloc(sizeof *sp);
 					sp->text = start;
 					*arglist->lastp = sp;
 					arglist->lastp = &sp->next;
 					p++;
+					if (maxargs) maxargs--;
 					if (!nulonly) {
 						for (;;) {
 							if (p >= string + ifsp->endoff) {
diff --git a/src/expand.h b/src/expand.h
index 6a90f67..26dc5b4 100644
--- a/src/expand.h
+++ b/src/expand.h
@@ -69,7 +69,7 @@ char *_rmescapes(char *, int);
 int casematch(union node *, char *);
 void recordregion(int, int, int);
 void removerecordregions(int); 
-void ifsbreakup(char *, struct arglist *);
+void ifsbreakup(char *, int, struct arglist *);
 void ifsfree(void);
 
 /* From arith.y */
diff --git a/src/miscbltin.c b/src/miscbltin.c
index b596fd2..39b9c47 100644
--- a/src/miscbltin.c
+++ b/src/miscbltin.c
@@ -67,28 +67,21 @@
  *  less fields than variables -> remaining variables unset.
  *
  *  @param line complete line of input
+ *  @param ac argument count
  *  @param ap argument (variable) list
  *  @param len length of line including trailing '\0'
  */
 static void
-readcmd_handle_line(char *s, char **ap)
+readcmd_handle_line(char *s, int ac, char **ap)
 {
 	struct arglist arglist;
 	struct strlist *sl;
-	char *backup;
-	char *line;
 
-	/* ifsbreakup will fiddle with stack region... */
-	line = stackblock();
 	s = grabstackstr(s);
 
-	/* need a copy, so that delimiters aren't lost
-	 * in case there are more fields than variables */
-	backup = sstrdup(line);
-
 	arglist.lastp = &arglist.list;
 	
-	ifsbreakup(s, &arglist);
+	ifsbreakup(s, ac, &arglist);
 	*arglist.lastp = NULL;
 	ifsfree();
 
@@ -104,21 +97,6 @@ readcmd_handle_line(char *s, char **ap)
 			return;
 		}
 
-		/* remaining fields present, but no variables left. */
-		if (!ap[1] && sl->next) {
-			size_t offset;
-			char *remainder;
-
-			/* FIXME little bit hacky, assuming that ifsbreakup 
-			 * will not modify the length of the string */
-			offset = sl->text - s;
-			remainder = backup + offset;
-			rmescapes(remainder);
-			setvar(*ap, remainder, 0);
-
-			return;
-		}
-		
 		/* set variable to field */
 		rmescapes(sl->text);
 		setvar(*ap, sl->text, 0);
@@ -211,7 +189,7 @@ start:
 out:
 	recordregion(startloc, p - (char *)stackblock(), 0);
 	STACKSTRNUL(p);
-	readcmd_handle_line(p + 1, ap);
+	readcmd_handle_line(p + 1, argc - (ap - argv), ap);
 	return status;
 }
 

  parent reply	other threads:[~2015-12-03 21:02 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-02 22:37 dash: read does not ignore trailing spaces Gioele Barabucci
2015-12-02 23:25 ` Jilles Tjoelker
2015-12-03 21:02 ` Harald van Dijk [this message]
2015-12-03 21:17   ` Stephane Chazelas
2015-12-03 21:43     ` Martijn Dekker
2015-12-03 23:04       ` Stephane Chazelas
2015-12-03 23:17         ` Stephane Chazelas
2015-12-04  0:00           ` Stephane Chazelas
2015-12-03 22:26     ` Harald van Dijk
2015-12-04 19:51       ` Harald van Dijk
2016-01-29 12:57         ` Martijn Dekker
2016-06-06  8:48         ` [PATCH v2] builtin: Fix handling of trailing IFS white spaces Herbert Xu
2016-06-06 20:43           ` Harald van Dijk
2016-06-07  9:25             ` [PATCH v3] " Herbert Xu
2016-06-12 10:35               ` Harald van Dijk
2016-06-12 11:06                 ` Herbert Xu
2016-06-12 11:12                   ` Harald van Dijk
2016-06-12 12:17                     ` [PATCH v4] " Herbert Xu
2016-06-19 22:01                       ` Harald van Dijk
2016-06-20  1:28                         ` Herbert Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5660ADD6.4020308@gigawatt.nl \
    --to=harald@gigawatt.nl \
    --cc=dash@vger.kernel.org \
    --cc=gioele@svario.it \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).