All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org>
To: Anton Staaf <robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Cc: devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
Subject: Re: [PATCH v2 2/3] dtc: Support character literals in cell lists
Date: Thu, 8 Sep 2011 13:47:42 +1000	[thread overview]
Message-ID: <20110908034742.GN30278@yookeroo.fritz.box> (raw)
In-Reply-To: <1315437340-1661-3-git-send-email-robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

On Wed, Sep 07, 2011 at 04:15:39PM -0700, Anton Staaf wrote:
> With this patch the following property assignment:
> 
>     property = <0x12345678 'a' '\r' 100>;
> 
> is equivalent to:
> 
>     property = <0x12345678 0x00000061 0x0000000D 0x00000064>
> 
> Signed-off-by: Anton Staaf <robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> Cc: Jon Loeliger <jdl-CYoMK+44s/E@public.gmane.org>
> Cc: David Gibson <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org>
> ---
>  Documentation/dts-format.txt |    2 +-
>  Documentation/manual.txt     |    7 ++++---
>  dtc-lexer.l                  |   14 ++++++++++++++
>  dtc-parser.y                 |    4 ++++
>  4 files changed, 23 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/dts-format.txt b/Documentation/dts-format.txt
> index a655b87..eae8b76 100644
> --- a/Documentation/dts-format.txt
> +++ b/Documentation/dts-format.txt
> @@ -33,7 +33,7 @@ Property values may be defined as an array of 32-bit integer cells, as
>  NUL-terminated strings, as bytestrings or a combination of these.
>  
>  * Arrays of cells are represented by angle brackets surrounding a
> -  space separated list of C-style integers
> +  space separated list of C-style integers or character literals.
>  
>  	e.g. interrupts = <17 0xc>;
>  
> diff --git a/Documentation/manual.txt b/Documentation/manual.txt
> index f8a8a7b..940fd3d 100644
> --- a/Documentation/manual.txt
> +++ b/Documentation/manual.txt
> @@ -213,10 +213,11 @@ For example:
>  
>  By default, all numeric values are hexadecimal.  Alternate bases
>  may be specified using a prefix "d#" for decimal, "b#" for binary,
> -and "o#" for octal.
> +and "o#" for octal.  Character literals are supported using the C
> +language character literal syntax of 'a'.

Hrm, this paragraph is clearly referring to the old-style dts-v0
syntax (hex is not the default, these days) - which may well mean the
whole document is hopelessly out of date.  And I think the char
literals should only be processed in dts-v1 mode.

> -Strings support common escape sequences from C: "\n", "\t", "\r",
> -"\(octal value)", "\x(hex value)".
> +Strings and character literals support common escape sequences from C:
> +"\n", "\t", "\r", "\(octal value)", "\x(hex value)".
>  
>  
>  4.3) Labels and References
> diff --git a/dtc-lexer.l b/dtc-lexer.l
> index e866ea5..d4f9eaa 100644
> --- a/dtc-lexer.l
> +++ b/dtc-lexer.l
> @@ -29,6 +29,8 @@ PROPNODECHAR	[a-zA-Z0-9,._+*#?@-]
>  PATHCHAR	({PROPNODECHAR}|[/])
>  LABEL		[a-zA-Z_][a-zA-Z0-9_]*
>  STRING		\"([^\\"]|\\.)*\"
> +CHAR_LITERAL	'[^\\']'
> +CHAR_ESCAPED	'\\([^']+|')'

I'd prefer a single regex here, of '[^']+', and then check that it is
indeed a single character when you process it.  An aside as to why...

So, when first working with lexer/parser generators, it is very
tempting to make the token regexes as tight as possible, so that as
soon as lex has recognized the token, you know it is 100%
syntactically valid.  The problem with this is that when the user
enters something that looks kinda like the token in question, but
isn't quite right, then flex will re-interpret those characters as
some other tokens.  That usually results in a deeply incomprehensible
syntax error from the parser.

I think the better approach, therefore, is to make the token regexes
as wide as they reasonably can be without ambiguity, then do extra
validation once the token is recognized by flex.  That way you can
produce a much more helpful "Bad format for token X" type error
message.  Currently dtc is a bit mixed in the extent it does this, but
that's the direction I'd prefer to move.

>  WS		[[:space:]]
>  COMMENT		"/*"([^*]|\*+[^*/])*\*+"/"
>  LINECOMMENT	"//".*\n
> @@ -109,6 +111,18 @@ static int pop_input_file(void);
>  			return DT_LITERAL;
>  		}
>  
> +<V1>{CHAR_LITERAL}	{
> +			yylval.byte = yytext[1];
> +			DPRINT("Character literal: %s\n", yytext);
> +			return DT_BYTE;
> +		}
> +
> +<V1>{CHAR_ESCAPED}	{
> +			yylval.byte = get_escape_char_exact(yytext+1, yyleng-2);
> +			DPRINT("Character literal escaped: %s\n", yytext);

So, I'd prefer the error handling to be done here - that is if there's
a bad escape sequence or more than one character, report it here.  For
now, a die() will do, although longer term I'd prefer a not
immediately fatal message (allowing the rest of the file to be checked
for errors) with an error triggered after the parse is complete.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

  parent reply	other threads:[~2011-09-08  3:47 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-07 23:15 [PATCH v2 0/3] Support character literals Anton Staaf
     [not found] ` <1315437340-1661-1-git-send-email-robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2011-09-07 23:15   ` [PATCH v2 1/3] dtc: Refactor character literal parsing code Anton Staaf
     [not found]     ` <1315437340-1661-2-git-send-email-robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2011-09-08  3:22       ` David Gibson
     [not found]         ` <20110908032250.GM30278-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-08 16:06           ` Anton Staaf
2011-09-07 23:15   ` [PATCH v2 2/3] dtc: Support character literals in cell lists Anton Staaf
     [not found]     ` <1315437340-1661-3-git-send-email-robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2011-09-08  3:47       ` David Gibson [this message]
     [not found]         ` <20110908034742.GN30278-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-08 21:20           ` Anton Staaf
     [not found]             ` <CAF6FioU7ur-bRpvscFV6XG7vWpu-DLezdeyPGAwz-J+moPXyXQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-09-09  0:52               ` David Gibson
2011-09-07 23:15   ` [PATCH v2 3/3] dtc: Support character literals in bytestrings Anton Staaf
     [not found]     ` <1315437340-1661-4-git-send-email-robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2011-09-08  3:51       ` David Gibson
     [not found]         ` <20110908035149.GO30278-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-08  7:07           ` Grant Likely
     [not found]             ` <20110908070716.GC15955-e0URQFbLeQY2iJbIjFUEsiwD8/FfD2ys@public.gmane.org>
2011-09-08 13:01               ` David Gibson
     [not found]                 ` <20110908130109.GW30278-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-09  6:30                   ` Anton Staaf
     [not found]                     ` <CAF6FioWOuK3szFb8FD+_fde0HpZ0TJJXx51ZL=BAcZFSojRGCQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-09-09  7:25                       ` David Gibson
     [not found]                         ` <20110909072532.GC9025-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-09 18:37                           ` Anton Staaf
     [not found]                             ` <CAF6FioVxYZW7grfwE1Q_FP5T95Omh9sqzAVyzsZr8EbfM6WV_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-09-12  0:48                               ` David Gibson
     [not found]                                 ` <20110912004807.GH9025-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-12  5:09                                   ` Anton Staaf
     [not found]                                     ` <CAF6FioW9iY1JAEq5_vn0ZbHC81LBF7xN6JNzBVTbs+k19p2pRg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-09-12 17:53                                       ` Anton Staaf
2011-09-09 14:22                   ` Jon Loeliger
2011-09-08 16:02               ` Anton Staaf
2011-09-08  3:18   ` [PATCH v2 0/3] Support character literals David Gibson
     [not found]     ` <20110908031842.GL30278-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-08  5:03       ` David Gibson
     [not found]         ` <20110908050354.GP30278-787xzQ0H9iQXU02nzanrWNbf9cGiqdzd@public.gmane.org>
2011-09-08 15:51           ` Anton Staaf
     [not found]             ` <CAF6FioVSMD2rT89tBeZEirNWj3DKO2=TXy9cE+z_FvYSn3M_dA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-09-09  0:54               ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110908034742.GN30278@yookeroo.fritz.box \
    --to=david-xt8fgy+axnrb3ne2bgzf6laj5h9x9tb+@public.gmane.org \
    --cc=devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org \
    --cc=robotboy-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.