From: Cyril Hrubis <chrubis@suse.cz>
To: ltp@lists.linux.it
Subject: [LTP] [PATCH v4 1/1] docparse: Handle special characters in JSON
Date: Thu, 6 May 2021 16:44:10 +0200
Message-ID: <YJQAukLCEqSX1X/9@yuki>
In-Reply-To: <20210506132745.16973-1-pvorel@suse.cz>

Hi!
> * escape backslash (/) and double quote (")
                      ^
		      \
>   escaping backslash effectively escapes other C escaped strings (\t,
>   \n, ...), which we sometimes want (in the comment) but sometimes not
>   (in .option we want to have them interpreted)
> * replace tab with 8x space
> * skip and TWARN invalid chars (< 0x20, i.e. anything before space)
             ^
	     warn on? We are not actually using TWARN here, right?
>   defined by RFC 8259 (https://tools.ietf.org/html/rfc8259#page-9)
> 
> NOTE: atm fix is required only for ", but tab was problematic in the past.
> 
> TODO: This is just a "hot fix" solution before release. Proper solution
> would be to check if chars needed to be escaped (", \, /) aren't already
> escaped.
> 
> Also for correct decision whether \n, \t should be escaped or interpreted
> we should decide in the parser which has the context. C string should be
> probably interpreted (thus nothing needed to be done as it escapes in
> a compatible way with JSON), but comments probably should display \n, \t
> thus add extra \.
>
> Fixes: c39b29f0a ("bpf: Check truncation on 32bit div/mod by zero")
> 
> Suggested-by: Cyril Hrubis <chrubis@suse.cz>
> Co-developed-by: Cyril Hrubis <chrubis@suse.cz>
> Signed-off-by: Petr Vorel <pvorel@suse.cz>
> ---
>  docparse/data_storage.h | 36 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/docparse/data_storage.h b/docparse/data_storage.h
> index ef420c08f..9f36dd6f0 100644
> --- a/docparse/data_storage.h
> +++ b/docparse/data_storage.h
> @@ -256,6 +256,40 @@ static inline void data_fprintf(FILE *f, unsigned int padd, const char *fmt, ...
>  	va_end(va);
>  }
>  
> +
> +static inline void data_fprintf_esc(FILE *f, unsigned int padd, const char *str)
> +{
> +	while (padd-- > 0)
> +		fputc(' ', f);
> +
> +	fputc('"', f);

	int was_backslash = 0;

> +	while (*str) {

		/*
		 * Print the deferred backslash before the character that
		 * follows it, unless that character is a double quote.
		 */
		if (was_backslash && *str != '"')
			fputs("\\\\", f);

> +		switch (*str) {
> +		case '\\':
> +		break;
> +		case '"':
> +			fputs("\\\"", f);
			was_backslash = 0;
> +			break;
> +		case '\t':
> +			fputs("        ", f);
> +			break;
> +		default:
> +			/* RFC 8259 specify  chars before 0x20 as invalid */
> +			if (*str >= 0x20)
> +				putc(*str, f);
> +			else
> +				fprintf(stderr, "%s:%d %s(): invalid character for JSON: %x\n",
> +						__FILE__, __LINE__, __func__, *str);
> +			break;
> +		}

		was_backslash = (*str == '\\');
> +		str++;
> +	}

	/* Do not lose a backslash at the very end of the string. */
	if (was_backslash)
		fputs("\\\\", f);
> +
> +	fputc('"', f);
> +}

This should avoid "unescaping" an escaped double quote. We defer
printing the backslash until we know the character after it, and we make
sure that we do not escape a backslash that precedes a ".

Consider what would happen if someone put "\"text\"" into an option
string: the original code would escape the backslashes and we would end
up with "\\"text"\\", which would break the parser again.

This way we can at least avoid parsing errors until we fix the problem
one level down in the parser where we have the context required for a
proper fix.
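
To make the deferred-backslash idea easier to try outside of docparse,
here is a minimal standalone sketch. It writes into a caller-supplied
buffer instead of a FILE so the result can be inspected. The names
json_escape_deferred() and append() are hypothetical and not part of
LTP, and silently skipping control characters and output that does not
fit is a simplification of mine, not what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append s to out if it fits, keeping out NUL-terminated. */
static void append(char *out, size_t outsz, size_t *len, const char *s)
{
	size_t n = strlen(s);

	if (*len + n < outsz) {
		memcpy(out + *len, s, n);
		*len += n;
		out[*len] = '\0';
	}
}

/*
 * Hypothetical helper: escape str for use inside a JSON string.
 * A backslash is not written immediately; what to do with it is
 * decided once the following character is known.
 */
static void json_escape_deferred(const char *str, char *out, size_t outsz)
{
	size_t len = 0;
	int pending_backslash = 0;

	assert(out != NULL && outsz > 0);
	out[0] = '\0';

	for (; *str; str++) {
		unsigned char c = (unsigned char)*str;
		char one[2] = { (char)c, '\0' };

		if (c == '"') {
			/* Both a bare " and an already escaped \" end up as \" */
			append(out, outsz, &len, "\\\"");
			pending_backslash = 0;
			continue;
		}

		/* Any other follower: the pending backslash gets escaped. */
		if (pending_backslash)
			append(out, outsz, &len, "\\\\");

		pending_backslash = (c == '\\');
		if (c == '\\')
			continue;

		if (c == '\t')
			append(out, outsz, &len, "        ");
		else if (c >= 0x20)
			append(out, outsz, &len, one);
		/* Characters below 0x20 are skipped in this sketch. */
	}

	/* A backslash at the very end of the input must not be lost. */
	if (pending_backslash)
		append(out, outsz, &len, "\\\\");
}
```

With this sketch the input \"text\" comes out unchanged as \"text\",
while a\nb (a literal backslash followed by n) becomes a\\nb.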

>  static inline void data_to_json_(struct data_node *self, FILE *f, unsigned int padd, int do_padd)
>  {
>  	unsigned int i;
> @@ -263,7 +297,7 @@ static inline void data_to_json_(struct data_node *self, FILE *f, unsigned int p
>  	switch (self->type) {
>  	case DATA_STRING:
>  		padd = do_padd ? padd : 0;
> -		data_fprintf(f, padd, "\"%s\"", self->string.val);
> +		data_fprintf_esc(f, padd, self->string.val);
>  	break;
>  	case DATA_HASH:
>  		for (i = 0; i < self->hash.elems_used; i++) {
> -- 
> 2.31.1
> 

-- 
Cyril Hrubis
chrubis@suse.cz
