* L'\0' handling
@ 2010-04-08 14:59 Yura Pakhuchiy
  2010-04-08 15:22 ` Michael Stefaniuc
  0 siblings, 1 reply; 12+ messages in thread
From: Yura Pakhuchiy @ 2010-04-08 14:59 UTC (permalink / raw)
  To: linux-sparse

Hi,

It looks like sparse does not understand constructs like L'\0'.

$ cat b.c
int main(void)
{
	L'\0';
	return 0;
}
$ gcc b.c
$ cgcc b.c
b.c:3:10: error: Expected ; at end of statement
b.c:3:10: error: got `\0'
b.c:3:9: error: undefined identifier `L'

This causes problems with the /usr/include/wchar.h and
/usr/include/bits/wchar.h headers shipped with Ubuntu:

/usr/include/bits/wchar.h:38:8: error: garbage at end: `\0' - 1 > 0

/usr/include/wchar.h:393:51: error: Expected ) in expression
/usr/include/wchar.h:393:51: error: got `\0'

Relevant lines from these headers:

extern int __wctob_alias (wint_t __c) __asm ("wctob");
__extern_inline int
__NTH (wctob (wint_t __wc))
{ return (__builtin_constant_p (__wc) && __wc >= L'\0' && __wc <= L'\x7f'
	  ? (int) __wc : __wctob_alias (__wc)); }

and

#ifdef __WCHAR_UNSIGNED__
#define __WCHAR_MIN       L'\0'

/* Failing that, rely on the preprocessor's knowledge of the
   signedness of wchar_t.  */
#elif L'\0' - 1 > 0
#define __WCHAR_MIN       L'\0'
#else
#define __WCHAR_MIN       (-__WCHAR_MAX - 1)
#endif
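
For context, L'\0' is a wide character constant of type wchar_t, and the
"#elif L'\0' - 1 > 0" line above asks the preprocessor whether wchar_t is
unsigned. A minimal run-time sketch of the same question, assuming a C11
compiler; wchar_is_unsigned() is a hypothetical helper name, not something
from glibc or sparse:

```c
#include <assert.h>   /* static_assert */
#include <wchar.h>    /* wchar_t, WCHAR_MIN */

/* In C a wide character constant has type wchar_t, not char or int. */
static_assert(sizeof(L'\0') == sizeof(wchar_t),
	      "wide constants have type wchar_t");

/* Run-time counterpart of the "#elif L'\0' - 1 > 0" test in the header. */
static int wchar_is_unsigned(void)
{
	/* Converting -1 wraps to the maximum value only for unsigned types. */
	return (wchar_t)-1 > (wchar_t)0;
}
```

On a conforming implementation wchar_is_unsigned() should agree with the
test WCHAR_MIN == 0, because the C standard requires WCHAR_MIN to be 0
exactly when wchar_t is an unsigned type.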


-- 
Best regards,
        Yura



* Re: L'\0' handling
  2010-04-08 14:59 L'\0' handling Yura Pakhuchiy
@ 2010-04-08 15:22 ` Michael Stefaniuc
  2010-04-08 15:39   ` Yura Pakhuchiy
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Stefaniuc @ 2010-04-08 15:22 UTC (permalink / raw)
  To: Yura Pakhuchiy; +Cc: linux-sparse

Hello,

Yura Pakhuchiy wrote:
> It looks like sparse does not understand constructs like L'\0'.
Yep, it doesn't. I ran into that problem two years ago. But for Wine
that is actually a "feature", as wide char/string literals are forbidden
there and cannot be used. So my interest in adding support for them to
sparse died.

bye
	michael


* Re: L'\0' handling
  2010-04-08 15:22 ` Michael Stefaniuc
@ 2010-04-08 15:39   ` Yura Pakhuchiy
  2010-04-08 15:54     ` Michael Stefaniuc
  0 siblings, 1 reply; 12+ messages in thread
From: Yura Pakhuchiy @ 2010-04-08 15:39 UTC (permalink / raw)
  To: Michael Stefaniuc; +Cc: linux-sparse

On Thu, 08/04/2010 at 17:22 +0200, Michael Stefaniuc wrote:
> Yura Pakhuchiy wrote:
> > It looks like sparse does not understand constructs like L'\0'.
> Yep, it doesn't. I ran into that problem two years ago. But for Wine
> that is actually a "feature", as wide char/string literals are forbidden
> there and cannot be used. So my interest in adding support for them to
> sparse died.

Wine is not the only project that uses sparse. Without this support,
sparse is broken for any userspace program that includes wchar.h on
Ubuntu.

-- 
Best regards,
        Yura

--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: L'\0' handling
  2010-04-08 15:39   ` Yura Pakhuchiy
@ 2010-04-08 15:54     ` Michael Stefaniuc
  2010-04-08 20:19       ` Christopher Li
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Stefaniuc @ 2010-04-08 15:54 UTC (permalink / raw)
  To: Yura Pakhuchiy; +Cc: linux-sparse

Yura Pakhuchiy wrote:
> On Thu, 08/04/2010 at 17:22 +0200, Michael Stefaniuc wrote:
>> Yura Pakhuchiy wrote:
>>> It looks like sparse does not understand constructs like L'\0'.
>> Yep, it doesn't. I ran into that problem two years ago. But for Wine
>> that is actually a "feature", as wide char/string literals are forbidden
>> there and cannot be used. So my interest in adding support for them to
>> sparse died.
> 
> Wine is not the only project which uses sparse. Sparse is broken for any
I'm a Wine developer, not a sparse developer. I was just saying that
I ran into this problem in sparse, thought about fixing it, but dropped
the idea and instead submitted patches to Wine to remove the wide
char/string literals.

> userspace program which includes wchar.h in Ubuntu without supporting
> this.
I didn't assert that the feature shouldn't be implemented, just that it
is a known missing feature that the normal sparse consumer didn't need
until now.

bye
	michael

* Re: L'\0' handling
  2010-04-08 15:54     ` Michael Stefaniuc
@ 2010-04-08 20:19       ` Christopher Li
       [not found]         ` <1270758815.2167.13.camel@yura-tl>
  0 siblings, 1 reply; 12+ messages in thread
From: Christopher Li @ 2010-04-08 20:19 UTC (permalink / raw)
  To: Michael Stefaniuc; +Cc: Yura Pakhuchiy, linux-sparse

[-- Attachment #1: Type: text/plain, Size: 545 bytes --]

On Thu, Apr 8, 2010 at 8:54 AM, Michael Stefaniuc <mstefani@redhat.com> wrote:
> I didn't assert that the feature shouldn't be implemented, just that it
> is a known missing feature that the normal sparse consumer didn't need
> until now.

Yura, do you want to give this patch a try?

It is nasty that L'\0' starts with an identifier letter. I tried my best
not to slow down the hot path: the test is done inside
get_one_identifier(), after the ident hash is built. It looks a little
out of place there, but it is faster than testing for 'L' beforehand.
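
The ordering described above can be sketched in isolation: scan the
identifier on the common path first, and only then check whether it was
exactly "L" followed by a quote. This is a toy classifier under stated
assumptions; toy_classify() and enum toy_token are hypothetical names, and
sparse's real get_one_identifier() works on a stream and a hashed ident
rather than on a C string:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch; not sparse's enum token_type. */
enum toy_token { TOY_OTHER, TOY_IDENT, TOY_CHAR, TOY_WIDE_CHAR };

/* Classify the token that starts at s. */
static enum toy_token toy_classify(const char *s)
{
	size_t len = 0;

	assert(s != NULL);

	if (s[0] == '\'')
		return TOY_CHAR;
	if (!isalpha((unsigned char)s[0]) && s[0] != '_')
		return TOY_OTHER;

	/* Common path: consume the whole identifier first. */
	while (isalnum((unsigned char)s[len]) || s[len] == '_')
		len++;

	/* Only afterwards check for the rare L'...' case. */
	if (len == 1 && s[0] == 'L' && s[len] == '\'')
		return TOY_WIDE_CHAR;

	return TOY_IDENT;
}
```

With this ordering ordinary identifiers such as "Lx" or "LL" never pay for
an extra comparison before the scan; the one additional test runs once per
identifier, after its length is already known.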

Chris

[-- Attachment #2: 0005-Allow-parsing-L-0.patch --]
[-- Type: application/octet-stream, Size: 3585 bytes --]

From 6c8d9169ebe8198d8b994825ce9d4eb1b7339129 Mon Sep 17 00:00:00 2001
From: Christopher Li <sparse@chrisli.org>
Date: Thu, 8 Apr 2010 14:05:29 -0700
Subject: [PATCH 5/5] Allow parsing L'\0'

Signed-off-by: Christopher Li <sparse@chrisli.org>
---
 expression.c  |    3 ++-
 ident-list.h  |    3 +++
 pre-process.c |    1 +
 token.h       |    1 +
 tokenize.c    |   12 ++++++++----
 5 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/expression.c b/expression.c
index c9277da..67e05e7 100644
--- a/expression.c
+++ b/expression.c
@@ -397,9 +397,10 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 
 	switch (token_type(token)) {
 	case TOKEN_CHAR:
+	case TOKEN_LONG_CHAR:
 		expr = alloc_expression(token->pos, EXPR_VALUE);   
 		expr->flags = Int_const_expr;
-		expr->ctype = &int_ctype; 
+		expr->ctype = token_type(token) == TOKEN_CHAR ? &int_ctype : &long_ctype;
 		expr->value = (unsigned char) token->character;
 		token = token->next;
 		break;
diff --git a/ident-list.h b/ident-list.h
index 0ee81bc..b94aece 100644
--- a/ident-list.h
+++ b/ident-list.h
@@ -25,6 +25,9 @@ IDENT(__attribute); IDENT(__attribute__);
 IDENT(volatile); IDENT(__volatile); IDENT(__volatile__);
 IDENT(double);
 
+/* Special case for L'\t' */
+IDENT(L);
+
 /* Extended gcc identifiers */
 IDENT(asm); IDENT_RESERVED(__asm); IDENT_RESERVED(__asm__);
 IDENT(alignof); IDENT_RESERVED(__alignof); IDENT_RESERVED(__alignof__); 
diff --git a/pre-process.c b/pre-process.c
index 34b21ff..058f24b 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -864,6 +864,7 @@ static int token_different(struct token *t1, struct token *t2)
 		different = t1->argnum != t2->argnum;
 		break;
 	case TOKEN_CHAR:
+	case TOKEN_LONG_CHAR:
 		different = t1->character != t2->character;
 		break;
 	case TOKEN_STRING: {
diff --git a/token.h b/token.h
index ebc94b4..c527e78 100644
--- a/token.h
+++ b/token.h
@@ -67,6 +67,7 @@ enum token_type {
 	TOKEN_ZERO_IDENT,
 	TOKEN_NUMBER,
 	TOKEN_CHAR,
+	TOKEN_LONG_CHAR,
 	TOKEN_STRING,
 	TOKEN_SPECIAL,
 	TOKEN_STREAMBEGIN,
diff --git a/tokenize.c b/tokenize.c
index 93dd007..cf05826 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -145,7 +145,8 @@ const char *show_token(const struct token *token)
 	case TOKEN_SPECIAL:
 		return show_special(token->special);
 
-	case TOKEN_CHAR: {
+	case TOKEN_CHAR: 
+	case TOKEN_LONG_CHAR: {
 		char *ptr = buffer;
 		int c = token->character;
 		*ptr++ = '\'';
@@ -527,7 +528,7 @@ static int escapechar(int first, int type, stream_t *stream, int *valp)
 	return next;
 }
 
-static int get_char_token(int next, stream_t *stream)
+static int get_char_token(int next, stream_t *stream, enum token_type type)
 {
 	int value;
 	struct token *token;
@@ -540,7 +541,7 @@ static int get_char_token(int next, stream_t *stream)
 	}
 
 	token = stream->token;
-	token_type(token) = TOKEN_CHAR;
+	token_type(token) = type;
 	token->character = value & 0xff;
 
 	add_token(stream);
@@ -702,7 +703,7 @@ static int get_one_special(int c, stream_t *stream)
 	case '"':
 		return get_string_token(next, stream);
 	case '\'':
-		return get_char_token(next, stream);
+		return get_char_token(next, stream, TOKEN_CHAR);
 	case '/':
 		if (next == '/')
 			return drop_stream_eoln(stream);
@@ -880,6 +881,9 @@ static int get_one_identifier(int c, stream_t *stream)
 
 	ident = create_hashed_ident(buf, len, hash);
 
+	if (ident == &L_ident && next == '\'')
+		return get_char_token(nextchar(stream), stream, TOKEN_LONG_CHAR);
+
 	/* Pass it on.. */
 	token = stream->token;
 	token_type(token) = TOKEN_IDENT;
-- 
1.6.6.1



* Re: L'\0' handling
       [not found]         ` <1270758815.2167.13.camel@yura-tl>
@ 2010-04-08 20:46           ` Christopher Li
  2010-04-08 20:58             ` Michael Stefaniuc
  0 siblings, 1 reply; 12+ messages in thread
From: Christopher Li @ 2010-04-08 20:46 UTC (permalink / raw)
  To: Yura Pakhuchiy; +Cc: Linux-Sparse

On Thu, Apr 8, 2010 at 1:33 PM, Yura Pakhuchiy <pakhuchiy@gmail.com> wrote:
> Hi Chris,
>
> On Thu, 08/04/2010 at 13:19 -0700, Christopher Li wrote:
>> Yura, do you want to give this patch a try?
>
> Works great for me! Fixed all problems with wchar.h related includes.
> Thanks!

Great. Change pushed.

Chris

* Re: L'\0' handling
  2010-04-08 20:46           ` Christopher Li
@ 2010-04-08 20:58             ` Michael Stefaniuc
  2010-04-08 23:18               ` Christopher Li
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Stefaniuc @ 2010-04-08 20:58 UTC (permalink / raw)
  To: Christopher Li; +Cc: Yura Pakhuchiy, Linux-Sparse

Hello Chris,

On 04/08/2010 10:46 PM, Christopher Li wrote:
> On Thu, Apr 8, 2010 at 1:33 PM, Yura Pakhuchiy<pakhuchiy@gmail.com>  wrote:
>> Hi Chris,
>>
>> On Thu, 08/04/2010 at 13:19 -0700, Christopher Li wrote:
>>> Yura, do you want to give this patch a try?
>>
>> Works great for me! Fixed all problems with wchar.h related includes.
>> Thanks!
>
> Great. Change pushed.
I have looked at the patch, but I don't see it handling wchar_t string
literals like L"Hello World\n".

bye
	michael

* Re: L'\0' handling
  2010-04-08 20:58             ` Michael Stefaniuc
@ 2010-04-08 23:18               ` Christopher Li
  2010-04-09  8:57                 ` Michael Stefaniuc
  0 siblings, 1 reply; 12+ messages in thread
From: Christopher Li @ 2010-04-08 23:18 UTC (permalink / raw)
  To: Michael Stefaniuc; +Cc: Yura Pakhuchiy, Linux-Sparse

On Thu, Apr 8, 2010 at 1:58 PM, Michael Stefaniuc <mstefani@redhat.com> wrote:
>
> I have looked at the patch but I don't see it handle wchar_t string literals
> like L"Hello World\n".

That is on purpose. L"Hello worlds" is very questionable.

Chris


* Re: L'\0' handling
  2010-04-08 23:18               ` Christopher Li
@ 2010-04-09  8:57                 ` Michael Stefaniuc
  2010-04-09 20:07                   ` Christopher Li
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Stefaniuc @ 2010-04-09  8:57 UTC (permalink / raw)
  To: Christopher Li; +Cc: Yura Pakhuchiy, Linux-Sparse

Christopher Li wrote:
> On Thu, Apr 8, 2010 at 1:58 PM, Michael Stefaniuc <mstefani@redhat.com> wrote:
>> I have looked at the patch but I don't see it handle wchar_t string literals
>> like L"Hello World\n".
> 
> That is on purpose. L"Hello worlds" is very questionable.
Huh? Care to explain this one? That is a valid wide char string literal
in C and sparse doesn't support those. I don't see much point in
supporting only wide char literals and not the wide char string literals.

bye
	michael



* Re: L'\0' handling
  2010-04-09  8:57                 ` Michael Stefaniuc
@ 2010-04-09 20:07                   ` Christopher Li
  2010-04-09 20:28                     ` Michael Stefaniuc
  0 siblings, 1 reply; 12+ messages in thread
From: Christopher Li @ 2010-04-09 20:07 UTC (permalink / raw)
  To: Michael Stefaniuc; +Cc: Yura Pakhuchiy, Linux-Sparse

On Fri, Apr 9, 2010 at 1:57 AM, Michael Stefaniuc <mstefani@redhat.com> wrote:
> Christopher Li wrote:
>> On Thu, Apr 8, 2010 at 1:58 PM, Michael Stefaniuc <mstefani@redhat.com> wrote:
>>> I have looked at the patch but I don't see it handle wchar_t string literals
>>> like L"Hello World\n".
>>
>> That is on purpose. L"Hello worlds" is very questionable.
> Huh? Care to explain this one? That is a valid wide char string literal
> in C and sparse doesn't support those. I don't see much point in
> supporting only wide char literals and not the wide char string literals.

Ah, silly me. I did not realize that the point of this change is to
support wide char literals. I just looked up what wide char string
literals are, and now I have a better idea. You are right, we should
support both. My previous patch was wrong to set the type of a wide char
literal to "long".

So a L"hello world\n" pointer is incompatible with a char * pointer,
right? And wchar_t is implementation specific. I am wondering whether I
should just pick 16 bit or 32 bit in sparse. Maybe just make it
compatible with what gcc does.

Obviously, it needs more patches to support wide char string literals.

Chris


* Re: L'\0' handling
  2010-04-09 20:07                   ` Christopher Li
@ 2010-04-09 20:28                     ` Michael Stefaniuc
  2010-06-18  0:30                       ` Christopher Li
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Stefaniuc @ 2010-04-09 20:28 UTC (permalink / raw)
  To: Christopher Li; +Cc: Linux-Sparse

On 04/09/2010 10:07 PM, Christopher Li wrote:
> On Fri, Apr 9, 2010 at 1:57 AM, Michael Stefaniuc<mstefani@redhat.com>  wrote:
>> Christopher Li wrote:
>>> On Thu, Apr 8, 2010 at 1:58 PM, Michael Stefaniuc<mstefani@redhat.com>  wrote:
>>>> I have looked at the patch but I don't see it handle wchar_t string literals
>>>> like L"Hello World\n".
>>>
>>> That is on purpose. L"Hello worlds" is very questionable.
>> Huh? Care to explain this one? That is a valid wide char string literal
>> in C and sparse doesn't support those. I don't see much point in
>> supporting only wide char literals and not the wide char string literals.
>
> Ah, silly me. I did not realized the nature of this change is to support wide
> char literals. Just look up what wide char string literals is, now I
> have a better
> idea. You are right. We should support both. My previous patch is wrong
> to set the type of wide char string as "long" type.
>
> So L"hello word\n" pointer are incompatible with char * pointer right?
Yes, they are incompatible.

> And the wchar_t is implementation specific. I am wondering should I just
> pick 16 bit or 32 bit in sparse. Maybe just make it compatible with what
> gcc does.
As far as I know, only Windows has 16-bit wide chars; everything else
seems to use 32 bits.

gcc supports -fshort-wchar to make wide chars 16 bits; the primary
consumer of that is mingw, and it is probably useless for everything else.
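
Both points (the width of wchar_t and the incompatibility with char *) can
be checked on a given platform. A small sketch, assuming a C11 compiler for
_Generic; wchar_bits(), IS_WIDE_PTR and the two wrapper functions are
hypothetical names used only for illustration:

```c
#include <assert.h>   /* static_assert */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* wchar_t */

/* Each element of a wide string literal is one wchar_t. */
static_assert(sizeof(L"hi"[0]) == sizeof(wchar_t),
	      "wide string elements are wchar_t");

/* Width of wchar_t in bits on the platform this is compiled for. */
static int wchar_bits(void)
{
	return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* 1 if p is a wchar_t *, 0 if it is a char * (C11 generic selection). */
#define IS_WIDE_PTR(p) _Generic((p), wchar_t *: 1, char *: 0)

static int wide_literal_is_wide(void)
{
	return IS_WIDE_PTR(&L"hello"[0]);
}

static int narrow_literal_is_wide(void)
{
	return IS_WIDE_PTR(&"hello"[0]);
}
```

Going by the discussion above, wchar_bits() would be expected to differ
between a default gcc build and one using -fshort-wchar, though that was
not verified here. The two wrappers select different branches because
wchar_t * and char * are distinct pointer types.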

> Obviously, it need more patches to support wide char string literals.
Yeah, that's what I remember from having looked back then at sparse. 
Luckily the correct fix for Wine was to remove the wide char literals :)

bye
	michael


* Re: L'\0' handling
  2010-04-09 20:28                     ` Michael Stefaniuc
@ 2010-06-18  0:30                       ` Christopher Li
  0 siblings, 0 replies; 12+ messages in thread
From: Christopher Li @ 2010-06-18  0:30 UTC (permalink / raw)
  To: Michael Stefaniuc; +Cc: Linux-Sparse

[-- Attachment #1: Type: text/plain, Size: 704 bytes --]

On Fri, Apr 9, 2010 at 1:28 PM, Michael Stefaniuc <mstefani@redhat.com> wrote:
>> Ah, silly me. I did not realized the nature of this change is to support
>> wide
>> char literals. Just look up what wide char string literals is, now I
>> have a better
>> idea. You are right. We should support both. My previous patch is wrong
>> to set the type of wide char string as "long" type.
>>
>> So L"hello word\n" pointer are incompatible with char * pointer right?
>
> Yes, they are incompatible.

A blast from the past: I found this patch while cleaning up my tree and
had totally forgotten about it.

At least sparse should parse L"hello world" now.
Getting the base type right is more work, though.
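
To illustrate what getting the base type right involves: the wide literal
has the same number of elements as the narrow one, but each element is a
wchar_t, so its storage size differs. A minimal sketch assuming a C11
compiler; narrow_bytes() and wide_bytes() are hypothetical helper names:

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* size_t, wchar_t */

/* An empty wide literal still holds one wchar_t terminator. */
static_assert(sizeof(L"") == sizeof(wchar_t),
	      "empty wide literal is a single wchar_t");

/* Storage for the narrow literal: 11 characters plus the terminator. */
static size_t narrow_bytes(void)
{
	return sizeof("hello world");
}

/* Storage for the wide literal: the same 12 elements, each a wchar_t. */
static size_t wide_bytes(void)
{
	return sizeof(L"hello world");
}
```

A checker that stores wide strings like normal strings would report the
narrow size for both, which is the gap the message above leaves for later
patches.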

Chris

[-- Attachment #2: 0001-Parsing-wide-char-string.patch --]
[-- Type: application/octet-stream, Size: 5578 bytes --]

From 49adf11b99cfce04ddcae7be0a272cc2df31436d Mon Sep 17 00:00:00 2001
From: Christopher Li <sparse@chrisli.org>
Date: Thu, 17 Jun 2010 17:08:09 -0700
Subject: [PATCH 1/4] Parsing wide char string

A follow up change to parse the wide char string.
It currently only parse and store it like normal strings.
Need more change to reflect the base type and size etc.

Signed-off-by: Christopher Li <sparse@chrisli.org>
---
 expression.c  |   13 ++++++++-----
 expression.h  |    5 ++++-
 pre-process.c |    5 +++--
 token.h       |    3 ++-
 tokenize.c    |   17 +++++++++++------
 5 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/expression.c b/expression.c
index 67e05e7..7e06e60 100644
--- a/expression.c
+++ b/expression.c
@@ -224,17 +224,18 @@ static struct token *string_expression(struct token *token, struct expression *e
 {
 	struct string *string = token->string;
 	struct token *next = token->next;
+	int stringtype = token_type(token);
 
 	convert_function(token);
 
-	if (token_type(next) == TOKEN_STRING) {
+	if (token_type(next) == stringtype) {
 		int totlen = string->length-1;
 		char *data;
 
 		do {
 			totlen += next->string->length-1;
 			next = next->next;
-		} while (token_type(next) == TOKEN_STRING);
+		} while (token_type(next) == stringtype);
 
 		if (totlen > MAX_STRING) {
 			warning(token->pos, "trying to concatenate %d-character string (%d bytes max)", totlen, MAX_STRING);
@@ -256,7 +257,7 @@ static struct token *string_expression(struct token *token, struct expression *e
 			next = next->next;
 			memcpy(data, s->data, len);
 			data += len;
-		} while (token_type(next) == TOKEN_STRING);
+		} while (token_type(next) == stringtype);
 		*data = '\0';
 	}
 	expr->string = string;
@@ -397,7 +398,7 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 
 	switch (token_type(token)) {
 	case TOKEN_CHAR:
-	case TOKEN_LONG_CHAR:
+	case TOKEN_WIDE_CHAR:
 		expr = alloc_expression(token->pos, EXPR_VALUE);   
 		expr->flags = Int_const_expr;
 		expr->ctype = token_type(token) == TOKEN_CHAR ? &int_ctype : &long_ctype;
@@ -464,9 +465,11 @@ struct token *primary_expression(struct token *token, struct expression **tree)
 		break;
 	}
 
-	case TOKEN_STRING: {
+	case TOKEN_STRING:
+	case TOKEN_WIDE_STRING: {
 	handle_string:
 		expr = alloc_expression(token->pos, EXPR_STRING);
+		expr->wide = token_type(token) == TOKEN_WIDE_STRING;
 		token = string_expression(token, expr);
 		break;
 	}
diff --git a/expression.h b/expression.h
index 631224f..9778de8 100644
--- a/expression.h
+++ b/expression.h
@@ -76,7 +76,10 @@ struct expression {
 		long double fvalue;
 
 		// EXPR_STRING
-		struct string *string;
+		struct {
+			int wide;
+			struct string *string;
+		};
 
 		// EXPR_UNOP, EXPR_PREOP and EXPR_POSTOP
 		struct /* unop */ {
diff --git a/pre-process.c b/pre-process.c
index 058f24b..656acaa 100644
--- a/pre-process.c
+++ b/pre-process.c
@@ -864,10 +864,11 @@ static int token_different(struct token *t1, struct token *t2)
 		different = t1->argnum != t2->argnum;
 		break;
 	case TOKEN_CHAR:
-	case TOKEN_LONG_CHAR:
+	case TOKEN_WIDE_CHAR:
 		different = t1->character != t2->character;
 		break;
-	case TOKEN_STRING: {
+	case TOKEN_STRING:
+	case TOKEN_WIDE_STRING: {
 		struct string *s1, *s2;
 
 		s1 = t1->string;
diff --git a/token.h b/token.h
index c527e78..a7ec77e 100644
--- a/token.h
+++ b/token.h
@@ -67,8 +67,9 @@ enum token_type {
 	TOKEN_ZERO_IDENT,
 	TOKEN_NUMBER,
 	TOKEN_CHAR,
-	TOKEN_LONG_CHAR,
+	TOKEN_WIDE_CHAR,
 	TOKEN_STRING,
+	TOKEN_WIDE_STRING,
 	TOKEN_SPECIAL,
 	TOKEN_STREAMBEGIN,
 	TOKEN_STREAMEND,
diff --git a/tokenize.c b/tokenize.c
index cf05826..4c97517 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -137,6 +137,7 @@ const char *show_token(const struct token *token)
 		return show_ident(token->ident);
 
 	case TOKEN_STRING:
+	case TOKEN_WIDE_STRING:
 		return show_string(token->string);
 
 	case TOKEN_NUMBER:
@@ -146,7 +147,7 @@ const char *show_token(const struct token *token)
 		return show_special(token->special);
 
 	case TOKEN_CHAR: 
-	case TOKEN_LONG_CHAR: {
+	case TOKEN_WIDE_CHAR: {
 		char *ptr = buffer;
 		int c = token->character;
 		*ptr++ = '\'';
@@ -548,7 +549,7 @@ static int get_char_token(int next, stream_t *stream, enum token_type type)
 	return nextchar(stream);
 }
 
-static int get_string_token(int next, stream_t *stream)
+static int get_string_token(int next, stream_t *stream, enum token_type type)
 {
 	static char buffer[MAX_STRING];
 	struct string *string;
@@ -581,7 +582,7 @@ static int get_string_token(int next, stream_t *stream)
 
 	/* Pass it on.. */
 	token = stream->token;
-	token_type(token) = TOKEN_STRING;
+	token_type(token) = type;
 	token->string = string;
 	add_token(stream);
 	
@@ -701,7 +702,7 @@ static int get_one_special(int c, stream_t *stream)
 			return get_one_number(c, next, stream);
 		break;
 	case '"':
-		return get_string_token(next, stream);
+		return get_string_token(next, stream, TOKEN_STRING);
 	case '\'':
 		return get_char_token(next, stream, TOKEN_CHAR);
 	case '/':
@@ -881,8 +882,12 @@ static int get_one_identifier(int c, stream_t *stream)
 
 	ident = create_hashed_ident(buf, len, hash);
 
-	if (ident == &L_ident && next == '\'')
-		return get_char_token(nextchar(stream), stream, TOKEN_LONG_CHAR);
+	if (ident == &L_ident) {
+		if (next == '\'')
+			return get_char_token(nextchar(stream), stream, TOKEN_WIDE_CHAR);
+		if (next == '\"')
+			return get_string_token(nextchar(stream), stream, TOKEN_WIDE_STRING);
+	}
 
 	/* Pass it on.. */
 	token = stream->token;
-- 
1.6.6.1

