* Bad interaction between macro expansion and literal concatenation @ 2015-01-30 22:16 Rasmus Villemoes 2015-01-31 1:23 ` [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals Luc Van Oostenryck 2015-01-31 5:16 ` Bad interaction between macro expansion and literal concatenation Christopher Li 0 siblings, 2 replies; 28+ messages in thread From: Rasmus Villemoes @ 2015-01-30 22:16 UTC (permalink / raw) To: linux-sparse Hi, I'm trying to write a type checker for the kernel's %p format extensions, but I've run into a problem with sparse which I can't debug myself. It seems that something goes wrong when sparse is concatenating literal strings, one of which was hidden behind a macro. As a random example, drivers/net/ethernet/realtek/atp.c:471 has this format string: KERN_DEBUG "%s: Reset: current Rx mode %d.\n" KERN_DEBUG ultimately expands to "\001" "7". When I print the corresponding ->string->data, I get ^A6%s7%s: Reset: current Rx mode %d. (^A is just less' way of showing the \001 byte - that part is fine). Note the extra "6%s". Needless to say, such random extra content makes format string verification very hard... If I replace KERN_DEBUG by its expansion "\001" "7", it works fine. If I only replace it by its immediate replacement KERN_SOH "7", I see the same bad behaviour. Any ideas? Rasmus ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-01-30 22:16 Bad interaction between macro expansion and literal concatenation Rasmus Villemoes @ 2015-01-31 1:23 ` Luc Van Oostenryck 2015-02-03 22:38 ` Rasmus Villemoes 2015-01-31 5:16 ` Bad interaction between macro expansion and literal concatenation Christopher Li 1 sibling, 1 reply; 28+ messages in thread From: Luc Van Oostenryck @ 2015-01-31 1:23 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: linux-sparse, Christopher Li In get_string_constant(), the code tried to reuse the storage for the string but only if the expansion of the string was not bigger than its unexpanded form. But this fail when the string constant is a sequence of adjacent string litterals (each being possibly shared, used elsewhere, isolated or in another order). The minimal exemple would be something like this: #define P "\001" const char a[] = P "a"; const char b[] = P "b"; The expansion for 'a' will produce a string which is smaller than the unexpanded "\001" (2 instead of 4). By trying to reuse the storage, all further occurrence of "\001" (probably only from the same 'origin', here the macro P) will then be replaced by "\001a". The fix is thus to not try to reuse the storage for the string if it consit of several adjacent litterals. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- char.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/char.c b/char.c index 08ca2230..ce1a0700 100644 --- a/char.c +++ b/char.c @@ -93,6 +93,7 @@ struct token *get_string_constant(struct token *token, struct expression *expr) static char buffer[MAX_STRING]; int len = 0; int bits; + int parts = 0; while (!done) { switch (token_type(next)) { @@ -117,13 +118,14 @@ struct token *get_string_constant(struct token *token, struct expression *expr) len++; } token = token->next; + parts++; } if (len > MAX_STRING) { warning(token->pos, "trying to concatenate %d-character string (%d bytes max)", len, MAX_STRING); len = MAX_STRING; } - if (len >= string->length) /* can't cannibalize */ + if (len >= string->length || parts > 1) /* safe to reuse the string buffer */ string = __alloc_string(len+1); string->length = len+1; memcpy(string->data, buffer, len); -- 2.2.2 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-01-31 1:23 ` [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals Luc Van Oostenryck @ 2015-02-03 22:38 ` Rasmus Villemoes 2015-02-04 0:32 ` Luc Van Oostenryck 2015-02-04 2:01 ` [PATCH v2] Avoid reusing string buffer when doing string expansion Luc Van Oostenryck 0 siblings, 2 replies; 28+ messages in thread From: Rasmus Villemoes @ 2015-02-03 22:38 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: linux-sparse, Christopher Li On Sat, Jan 31 2015, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > In get_string_constant(), the code tried to reuse the storage for the string > but only if the expansion of the string was not bigger than its unexpanded form. > But this fail when the string constant is a sequence of adjacent string litterals > (each being possibly shared, used elsewhere, isolated or in another order). > The minimal exemple would be something like this: > > #define P "\001" > const char a[] = P "a"; > const char b[] = P "b"; > > The expansion for 'a' will produce a string which is smaller than > the unexpanded "\001" (2 instead of 4). > By trying to reuse the storage, all further occurrence of "\001" > (probably only from the same 'origin', here the macro P) will then be replaced by "\001a". > > The fix is thus to not try to reuse the storage for the string if it consit of > several adjacent litterals. > Thanks, but there's still something wrong. Using your show-data feature on this: === #define BACKSLASH "\\" #define LETTER_t "t" static const char s1[] = BACKSLASH; /* static const char s2[] = BACKSLASH; */ static const char s3[] = BACKSLASH LETTER_t; static const char s4[] = "a" BACKSLASH LETTER_t "b"; === I get symbol s1: char static const [toplevel] s1[0] bit_size = 16 val = "\\" symbol s3: char static const [toplevel] s3[0] bit_size = 24 val = "\0t" symbol s4: char static const [toplevel] s4[0] bit_size = 40 val = "a\0tb" Now if I do the same with s2 not commented out, I get symbol s1: char static const [toplevel] s1[0] bit_size = 16 val = "\0" symbol s2: char static const [toplevel] s2[0] bit_size = 16 val = "\0" symbol s3: char static const [toplevel] s3[0] bit_size = 24 val = "\0t" symbol s4: char static const [toplevel] s4[0] bit_size = 40 val = "a\0tb" So the expansion of BACKSLASH changes depending on how often it is expanded... The LETTER_t thing above is because I thought I had somehow provoked a double expansion, making BACKSLASH LETTER_t (or some variant) expand to a single-character string containing just a tab. But I can't seem to reproduce that particular behaviour, so maybe I'm imagining stuff. Anyway, the above is certainly real. Thanks, Rasmus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-03 22:38 ` Rasmus Villemoes @ 2015-02-04 0:32 ` Luc Van Oostenryck 2015-02-04 3:26 ` Christopher Li 2015-02-04 8:39 ` Rasmus Villemoes 2015-02-04 2:01 ` [PATCH v2] Avoid reusing string buffer when doing string expansion Luc Van Oostenryck 1 sibling, 2 replies; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-04 0:32 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: linux-sparse, Christopher Li On Tue, Feb 03, 2015 at 11:38:02PM +0100, Rasmus Villemoes wrote: > On Sat, Jan 31 2015, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > > In get_string_constant(), the code tried to reuse the storage for the string > > but only if the expansion of the string was not bigger than its unexpanded form. > > But this fail when the string constant is a sequence of adjacent string litterals > > (each being possibly shared, used elsewhere, isolated or in another order). > > The minimal exemple would be something like this: > > > > #define P "\001" > > const char a[] = P "a"; > > const char b[] = P "b"; > > > > The expansion for 'a' will produce a string which is smaller than > > the unexpanded "\001" (2 instead of 4). > > By trying to reuse the storage, all further occurrence of "\001" > > (probably only from the same 'origin', here the macro P) will then be replaced by "\001a". > > > > The fix is thus to not try to reuse the storage for the string if it consit of > > several adjacent litterals. > > > > Thanks, but there's still something wrong. Using your show-data feature > on this: > > === > #define BACKSLASH "\\" > #define LETTER_t "t" > > static const char s1[] = BACKSLASH; > /* static const char s2[] = BACKSLASH; */ > static const char s3[] = BACKSLASH LETTER_t; > static const char s4[] = "a" BACKSLASH LETTER_t "b"; > === > > I get > > symbol s1: > char static const [toplevel] s1[0] > bit_size = 16 > val = "\\" > symbol s3: > char static const [toplevel] s3[0] > bit_size = 24 > val = "\0t" > symbol s4: > char static const [toplevel] s4[0] > bit_size = 40 > val = "a\0tb" > > Now if I do the same with s2 not commented out, I get > > > symbol s1: > char static const [toplevel] s1[0] > bit_size = 16 > val = "\0" > symbol s2: > char static const [toplevel] s2[0] > bit_size = 16 > val = "\0" > symbol s3: > char static const [toplevel] s3[0] > bit_size = 24 > val = "\0t" > symbol s4: > char static const [toplevel] s4[0] > bit_size = 40 > val = "a\0tb" > > So the expansion of BACKSLASH changes depending on how often it is > expanded... > > The LETTER_t thing above is because I thought I had somehow provoked a > double expansion, making BACKSLASH LETTER_t (or some variant) expand to > a single-character string containing just a tab. But I can't seem to > reproduce that particular behaviour, so maybe I'm imagining > stuff. Anyway, the above is certainly real. > > Thanks, > Rasmus > -- Yes, I see. Now thinking about it, it's obvious that the string buffer can't be reused at all if there is any kind of expansion done on it, the adjacent strings concatenation make just the thing worse but are not the cause of it. I'll post an updated patch later. Regards, Luc ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-04 0:32 ` Luc Van Oostenryck @ 2015-02-04 3:26 ` Christopher Li 2015-02-04 8:39 ` Rasmus Villemoes 1 sibling, 0 replies; 28+ messages in thread From: Christopher Li @ 2015-02-04 3:26 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Rasmus Villemoes, Linux-Sparse On Tue, Feb 3, 2015 at 4:32 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > Now thinking about it, it's obvious that the string buffer can't be reused at all > if there is any kind of expansion done on it, the adjacent strings concatenation > make just the thing worse but are not the cause of it. > Right. That is what I think after reading your patch too. String concatenation is a not a good indicator on macro expand. There should be a fix base on the macro expand. Even though I haven't construct an test case like Rasmus did. Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-04 0:32 ` Luc Van Oostenryck 2015-02-04 3:26 ` Christopher Li @ 2015-02-04 8:39 ` Rasmus Villemoes 2015-02-04 8:58 ` Rasmus Villemoes 1 sibling, 1 reply; 28+ messages in thread From: Rasmus Villemoes @ 2015-02-04 8:39 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: linux-sparse, Christopher Li On Wed, Feb 04 2015, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > On Tue, Feb 03, 2015 at 11:38:02PM +0100, Rasmus Villemoes wrote: >> >> Thanks, but there's still something wrong. Using your show-data feature >> on this: >> >> === >> #define BACKSLASH "\\" >> #define LETTER_t "t" >> >> static const char s1[] = BACKSLASH; >> /* static const char s2[] = BACKSLASH; */ >> static const char s3[] = BACKSLASH LETTER_t; >> static const char s4[] = "a" BACKSLASH LETTER_t "b"; >> === >> >> I get >> >> symbol s1: >> char static const [toplevel] s1[0] >> bit_size = 16 >> val = "\\" >> symbol s3: >> char static const [toplevel] s3[0] >> bit_size = 24 >> val = "\0t" >> symbol s4: >> char static const [toplevel] s4[0] >> bit_size = 40 >> val = "a\0tb" >> >> Now if I do the same with s2 not commented out, I get >> >> >> symbol s1: >> char static const [toplevel] s1[0] >> bit_size = 16 >> val = "\0" >> symbol s2: >> char static const [toplevel] s2[0] >> bit_size = 16 >> val = "\0" >> symbol s3: >> char static const [toplevel] s3[0] >> bit_size = 24 >> val = "\0t" >> symbol s4: >> char static const [toplevel] s4[0] >> bit_size = 40 >> val = "a\0tb" >> >> So the expansion of BACKSLASH changes depending on how often it is >> expanded... >> >> The LETTER_t thing above is because I thought I had somehow provoked a >> double expansion, making BACKSLASH LETTER_t (or some variant) expand to >> a single-character string containing just a tab. But I can't seem to >> reproduce that particular behaviour, so maybe I'm imagining >> stuff. Anyway, the above is certainly real. >> >> Thanks, >> Rasmus >> -- > Yes, I see. > > Now thinking about it, it's obvious that the string buffer can't be reused at all > if there is any kind of expansion done on it, the adjacent strings concatenation > make just the thing worse but are not the cause of it. > That was also my conclusion from looking at the code, but I was unable to do anything about it. And I wasn't hallucinating, I was just overcomplicating things: #define NOT_TAB "\\t" static const char s1[] = NOT_TAB; static const char s2[] = NOT_TAB; indeed fails. Thanks, Rasmus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-04 8:39 ` Rasmus Villemoes @ 2015-02-04 8:58 ` Rasmus Villemoes 2015-02-04 16:20 ` Christopher Li 0 siblings, 1 reply; 28+ messages in thread From: Rasmus Villemoes @ 2015-02-04 8:58 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: linux-sparse, Christopher Li On Wed, Feb 04 2015, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > And I wasn't hallucinating, I was just overcomplicating things: > > #define NOT_TAB "\\t" > > static const char s1[] = NOT_TAB; > static const char s2[] = NOT_TAB; > > indeed fails. While we're collecting examples, let me also mention that __FILE__ doesn't work for files with backslash in their name. Sane people of course don't put backslashes in file names, but they are a rather normal occurence in path names on a certain operating system. Rasmus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-04 8:58 ` Rasmus Villemoes @ 2015-02-04 16:20 ` Christopher Li 2015-02-06 21:52 ` Rasmus Villemoes 0 siblings, 1 reply; 28+ messages in thread From: Christopher Li @ 2015-02-04 16:20 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: Luc Van Oostenryck, Linux-Sparse On Wed, Feb 4, 2015 at 12:58 AM, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > On Wed, Feb 04 2015, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > >> And I wasn't hallucinating, I was just overcomplicating things: >> >> #define NOT_TAB "\\t" >> >> static const char s1[] = NOT_TAB; >> static const char s2[] = NOT_TAB; >> >> indeed fails. > > While we're collecting examples, let me also mention that __FILE__ > doesn't work for files with backslash in their name. Sane people of > course don't put backslashes in file names, but they are a rather normal > occurence in path names on a certain operating system. Can you submit a patch for adding the test case you found? I will include those into the the test suit. Thanks Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-04 16:20 ` Christopher Li @ 2015-02-06 21:52 ` Rasmus Villemoes 2015-02-07 1:30 ` Christopher Li 0 siblings, 1 reply; 28+ messages in thread From: Rasmus Villemoes @ 2015-02-06 21:52 UTC (permalink / raw) To: Christopher Li; +Cc: Luc Van Oostenryck, Linux-Sparse On Wed, Feb 04 2015, Christopher Li <sparse@chrisli.org> wrote: > On Wed, Feb 4, 2015 at 12:58 AM, Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: >> On Wed, Feb 04 2015, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: >> >>> And I wasn't hallucinating, I was just overcomplicating things: >>> >>> #define NOT_TAB "\\t" >>> >>> static const char s1[] = NOT_TAB; >>> static const char s2[] = NOT_TAB; >>> >>> indeed fails. >> >> While we're collecting examples, let me also mention that __FILE__ >> doesn't work for files with backslash in their name. Sane people of >> course don't put backslashes in file names, but they are a rather normal >> occurence in path names on a certain operating system. > > Can you submit a patch for adding the test case you found? > I will include those into the the test suit. I'd like to, but I'm not sure how to write the test in terms of sparse's test frame work. How do I check that the string is as expected? Rasmus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-06 21:52 ` Rasmus Villemoes @ 2015-02-07 1:30 ` Christopher Li 2015-02-09 21:48 ` Damien Lespiau 0 siblings, 1 reply; 28+ messages in thread From: Christopher Li @ 2015-02-07 1:30 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: Luc Van Oostenryck, Linux-Sparse On Fri, Feb 6, 2015 at 1:52 PM, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: >> Can you submit a patch for adding the test case you found? >> I will include those into the the test suit. > > I'd like to, but I'm not sure how to write the test in terms of sparse's > test frame work. How do I check that the string is as expected? You can use ./test-parsing to evaluate the test file. $ cat /tmp/v.c #define BACKSLASH "\\" static const char a[] = BACKSLASH "a"; static const char b[] = BACKSLASH "b"; # on current master branch it shows: $ ./test-parsing /tmp/v.c .align 1 char static const [toplevel] a[0] = movi.64 v2,&"\7b" ld.24 v3,[v2] , .align 1 char static const [toplevel] b[0] = movi.64 v5,&"\7b" ld.24 v6,[v5] # on the review-immutable-string branch it shows: $ ./test-parsing /tmp/v.c .align 1 char static const [toplevel] a[0] = movi.64 v2,&"\\a" ld.24 v3,[v2] , .align 1 char static const [toplevel] b[0] = movi.64 v5,&"\\b" ld.24 v6,[v5] Notice that "\\a" and "\\b" was not there before the bug was fixed. The test suit is just some C file in the "validations" directory. Each file has a test name and optional how to invoke the C file and what to expect from the running result. You should able to find some example in the test suit. Or, you can just submit a patch to include those C file in the "validations" directory. Let some one help you complete the test suit. You run the test suit as: make check Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals 2015-02-07 1:30 ` Christopher Li @ 2015-02-09 21:48 ` Damien Lespiau 0 siblings, 0 replies; 28+ messages in thread From: Damien Lespiau @ 2015-02-09 21:48 UTC (permalink / raw) To: Christopher Li; +Cc: Rasmus Villemoes, Luc Van Oostenryck, Linux-Sparse On 7 February 2015 at 01:30, Christopher Li <sparse@chrisli.org> wrote: > On Fri, Feb 6, 2015 at 1:52 PM, Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: >>> Can you submit a patch for adding the test case you found? >>> I will include those into the the test suit. >> >> I'd like to, but I'm not sure how to write the test in terms of sparse's >> test frame work. How do I check that the string is as expected? > > You can use ./test-parsing to evaluate the test file. There's even a bit of documentation! http://git.kernel.org/cgit/devel/sparse/sparse.git/tree/Documentation/test-suite -- Damien ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-03 22:38 ` Rasmus Villemoes 2015-02-04 0:32 ` Luc Van Oostenryck @ 2015-02-04 2:01 ` Luc Van Oostenryck 2015-02-04 5:30 ` Christopher Li 1 sibling, 1 reply; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-04 2:01 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: linux-sparse, Christopher Li, Luc Van Oostenryck In get_string_constant(), the code tried to reuse the storage for the string but only if the expansion of the string was not bigger than its unexpanded form. But this string can be shared with other expressions and reusing the buffer will result in later corruption A minimal exemple would be something like: const char a[] = BACKSLASH; const char b[] = BACKSLASH; The expansion for 'a' will correctly produce the two-char string consisting of a backslash char followed by a null char. But then the expansion of 'b' will expand this once more, producing the expansion of "\0": the two-char string: { '\0', '\0' }. The fix is to not reuse the storage for the string if any king of expansion have been done. Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- char.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/char.c b/char.c index 08ca2230..2e21bb77 100644 --- a/char.c +++ b/char.c @@ -123,11 +123,15 @@ struct token *get_string_constant(struct token *token, struct expression *expr) len = MAX_STRING; } - if (len >= string->length) /* can't cannibalize */ + /* The input string can be shared with other expression and so + * its storage can't be reused if any kind of expansion have been done on it. + */ + if ((len != string->length) || memcmp(buffer, string->data, len)) { string = __alloc_string(len+1); - string->length = len+1; - memcpy(string->data, buffer, len); - string->data[len] = '\0'; + string->length = len+1; + memcpy(string->data, buffer, len); + string->data[len] = '\0'; + } expr->string = string; expr->wide = is_wide; return token; -- 2.2.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-04 2:01 ` [PATCH v2] Avoid reusing string buffer when doing string expansion Luc Van Oostenryck @ 2015-02-04 5:30 ` Christopher Li 2015-02-04 6:22 ` Luc Van Oostenryck 0 siblings, 1 reply; 28+ messages in thread From: Christopher Li @ 2015-02-04 5:30 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Rasmus Villemoes, Linux-Sparse On Tue, Feb 3, 2015 at 6:01 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > In get_string_constant(), the code tried to reuse the storage for the string > but only if the expansion of the string was not bigger than its unexpanded form. > But this string can be shared with other expressions and reusing the buffer will > result in later corruption > > A minimal exemple would be something like: > const char a[] = BACKSLASH; > const char b[] = BACKSLASH; > > The expansion for 'a' will correctly produce the two-char string consisting > of a backslash char followed by a null char. > But then the expansion of 'b' will expand this once more, > producing the expansion of "\0": the two-char string: { '\0', '\0' }. Are you sure about this behavior? You mean you see "b" has the string size as 2. I haven't understand how this can happen. > The fix is to not reuse the storage for the string if any king of expansion > have been done. That is a bit over kill. We only need to avoid reuse storage if the destination part of the string is come from a preprocessor macro. It is pretty common string contain escape sequence. We don't want to allocate extra memory copy if it is not part of a macro expansion. > > Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> > --- > char.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/char.c b/char.c > index 08ca2230..2e21bb77 100644 > --- a/char.c > +++ b/char.c > @@ -123,11 +123,15 @@ struct token *get_string_constant(struct token *token, struct expression *expr) > len = MAX_STRING; > } > > - if (len >= string->length) /* can't cannibalize */ > + /* The input string can be shared with other expression and so > + * its storage can't be reused if any kind of expansion have been done on it. > + */ > + if ((len != string->length) || memcmp(buffer, string->data, len)) { I don' think this check take into account the preprocessor macro has been used or not. In other words, any general "hello world\n" which contain the escape character will produce a different buffer, there for, a new copy of the string. Which is not necessary. That is a pretty common case. I am working on patch to address it in the preprocessor macro. The idea is that just mark the string as immutable if it is part of the macro expansion. I will see how it goes. Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-04 5:30 ` Christopher Li @ 2015-02-04 6:22 ` Luc Van Oostenryck 2015-02-04 8:01 ` Christopher Li 0 siblings, 1 reply; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-04 6:22 UTC (permalink / raw) To: Christopher Li; +Cc: Rasmus Villemoes, Linux-Sparse On Tue, Feb 03, 2015 at 09:30:15PM -0800, Christopher Li wrote: > On Tue, Feb 3, 2015 at 6:01 PM, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > > > In get_string_constant(), the code tried to reuse the storage for the string > > but only if the expansion of the string was not bigger than its unexpanded form. > > But this string can be shared with other expressions and reusing the buffer will > > result in later corruption > > > > A minimal exemple would be something like: > > const char a[] = BACKSLASH; > > const char b[] = BACKSLASH; > > > > The expansion for 'a' will correctly produce the two-char string consisting > > of a backslash char followed by a null char. > > But then the expansion of 'b' will expand this once more, > > producing the expansion of "\0": the two-char string: { '\0', '\0' }. > > Are you sure about this behavior? You mean you see "b" has the string > size as 2. I haven't understand how this can happen. Using the show_data() / sparse -vdata on: === #define BACKSLASH "\\" const char a[] = BACKSLASH; === gives, correctly: === symbol a: char const [addressable] [toplevel] b[0] bit_size = 16 val = "\\" === But if the macro is used several times: === #define BACKSLASH "\\" const char a[] = BACKSLASH; const char b[] = BACKSLASH; const char c[] = "<" BACKSLASH ">"; === the, we get: === symbol a: char const [addressable] [toplevel] a[0] bit_size = 16 val = "\0" symbol b: char const [addressable] [toplevel] b[0] bit_size = 16 val = "\0" symbol c: char const [addressable] [toplevel] c[0] bit_size = 32 val = "<\0>" === And even worse: === #define BACKSLASH "(\\)" const char m[] = BACKSLASH; const char n[] = BACKSLASH; const char k[] = "<" BACKSLASH ">"; === gives: === symbol m: char const [addressable] [toplevel] m[0] bit_size = 24 val = "()" symbol n: char const [addressable] [toplevel] n[0] bit_size = 24 val = "()" symbol k: char const [addressable] [toplevel] k[0] bit_size = 40 val = "<()>" === > > The fix is to not reuse the storage for the string if any king of expansion > > have been done. > > That is a bit over kill. We only need to avoid reuse storage if the > destination part of the string is come from a preprocessor macro. > It is pretty common string contain escape sequence. We don't > want to allocate extra memory copy if it is not part of a macro > expansion. Well yes ... Is it only with macros that the string structure is so shared? And have we a way to test if the string is coming from a macro? > > > > Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > > Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> > > --- > > char.c | 12 ++++++++---- > > 1 file changed, 8 insertions(+), 4 deletions(-) > > > > diff --git a/char.c b/char.c > > index 08ca2230..2e21bb77 100644 > > --- a/char.c > > +++ b/char.c > > @@ -123,11 +123,15 @@ struct token *get_string_constant(struct token *token, struct expression *expr) > > len = MAX_STRING; > > } > > > > - if (len >= string->length) /* can't cannibalize */ > > + /* The input string can be shared with other expression and so > > + * its storage can't be reused if any kind of expansion have been done on it. > > + */ > > + if ((len != string->length) || memcmp(buffer, string->data, len)) { > > I don' think this check take into account the preprocessor macro has > been used or not. In other words, any general "hello world\n" which > contain the escape character will produce a different buffer, there for, > a new copy of the string. Which is not necessary. That is a pretty > common case. No, indeed, it does not. It just allocate a new buffer every time there is any modification/expansion so that the original one is not touched (in case it is used elsewhere). > > I am working on patch to address it in the preprocessor macro. > The idea is that just mark the string as immutable if it is part of the > macro expansion. I will see how it goes. > > Chris > -- A simpler and safer way would be to directly do the string expansion just after a string token is recognized, or even better in the lexer itself. So the string buffer, macro or not, will always directly contain the right values. But maybe there was good reasons to not do it this way. Luc ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-04 6:22 ` Luc Van Oostenryck @ 2015-02-04 8:01 ` Christopher Li 2015-02-04 16:38 ` Christopher Li 2015-02-04 23:38 ` Luc Van Oostenryck 0 siblings, 2 replies; 28+ messages in thread From: Christopher Li @ 2015-02-04 8:01 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Rasmus Villemoes, Linux-Sparse On Tue, Feb 3, 2015 at 10:22 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: >> Are you sure about this behavior? You mean you see "b" has the string >> size as 2. I haven't understand how this can happen. > > > But if the macro is used several times: > === > #define BACKSLASH "\\" > const char a[] = BACKSLASH; > const char b[] = BACKSLASH; > const char c[] = "<" BACKSLASH ">"; > === > > the, we get: > === > symbol a: > char const [addressable] [toplevel] a[0] > bit_size = 16 > val = "\0" > symbol b: > char const [addressable] [toplevel] b[0] > bit_size = 16 The value buffer is corrupted. But the bit_size is still 16, which is correct. I just think that in your example it shouldn't corrupt the size. Your test case seems confirm that. > Is it only with macros that the string structure is so shared? That is right. I haven't see it can happen any other way. The tokenizer always construct new token and string structure from the C source file. It is the preprocessor using macro expand which copy and duplicate the token list. The token has a pointer point to the string which is shared across different invocation of macro. > And have we a way to test if the string is coming from a macro? Not right now. But we can add it. > > A simpler and safer way would be to directly do the string expansion just after > a string token is recognized, or even better in the lexer itself. > So the string buffer, macro or not, will always directly contain the right values. > But maybe there was good reasons to not do it this way. I have an counter example that will not work. Let say #define b(a, d) a##d wchar_t s[] = b(L, "\xabcdabc"); When the lexer process the escape char, you did not know the string is wide char or not. That can be changed after the macro expansion. Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-04 8:01 ` Christopher Li @ 2015-02-04 16:38 ` Christopher Li 2015-02-04 23:38 ` Luc Van Oostenryck 2015-02-04 23:38 ` Luc Van Oostenryck 1 sibling, 1 reply; 28+ messages in thread From: Christopher Li @ 2015-02-04 16:38 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Rasmus Villemoes, Linux-Sparse On Wed, Feb 4, 2015 at 12:01 AM, Christopher Li <sparse@chrisli.org> wrote: > > When the lexer process the escape char, you did not know the string > is wide char or not. That can be changed after the macro expansion. With that in mind, we can't actually perform the escape char substitution before the pre-processor stage. Here is an untested patch adding the immutable string to macro body. I need to double check if the macro arguments needs it as well. Can you guys try it on the test case you have? Thanks Chris diff --git a/char.c b/char.c index 08ca223..7f86b8a 100644 --- a/char.c +++ b/char.c @@ -123,7 +123,7 @@ struct token *get_string_constant(struct token *token, struct expression *expr) len = MAX_STRING; } - if (len >= string->length) /* can't cannibalize */ + if (string->immutable || len >= string->length) /* can't cannibalize */ string = __alloc_string(len+1); string->length = len+1; memcpy(string->data, buffer, len); diff --git a/pre-process.c b/pre-process.c index 1aa3d2c..57d84b9 100644 --- a/pre-process.c +++ b/pre-process.c @@ -1259,8 +1259,16 @@ static struct token *parse_expansion(struct token *expansion, struct token *argl } else { try_arg(token, TOKEN_MACRO_ARGUMENT, arglist); } - if (token_type(token) == TOKEN_ERROR) + switch (token_type(token)) { + case TOKEN_ERROR: goto Earg; + + case TOKEN_STRING: + case TOKEN_WIDE_STRING: + token->string->immutable = 1; + default: + ; + } } token = alloc_token(&expansion->pos); token_type(token) = TOKEN_UNTAINT; diff --git a/token.h b/token.h index 8dbd80f..af66b2b 100644 --- a/token.h +++ b/token.h @@ -164,7 +164,8 @@ enum special_token { }; struct string { - unsigned int length; + unsigned int length:31; + unsigned int immutable:1; char data[]; }; ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-04 16:38 ` Christopher Li @ 2015-02-04 23:38 ` Luc Van Oostenryck 2015-02-06 13:58 ` Christopher Li 0 siblings, 1 reply; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-04 23:38 UTC (permalink / raw) To: Christopher Li; +Cc: Rasmus Villemoes, Linux-Sparse On Wed, Feb 04, 2015 at 08:38:19AM -0800, Christopher Li wrote: > On Wed, Feb 4, 2015 at 12:01 AM, Christopher Li <sparse@chrisli.org> wrote: > > > > When the lexer process the escape char, you did not know the string > > is wide char or not. That can be changed after the macro expansion. > > With that in mind, we can't actually perform the escape char substitution > before the pre-processor stage. > > Here is an untested patch adding the immutable string to macro body. > I need to double check if the macro arguments needs it as well. > > Can you guys try it on the test case you have? > > Thanks > > Chris Working perfectly here. Greetings, Luc ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-04 23:38 ` Luc Van Oostenryck @ 2015-02-06 13:58 ` Christopher Li 2015-02-06 20:32 ` Rasmus Villemoes 0 siblings, 1 reply; 28+ messages in thread From: Christopher Li @ 2015-02-06 13:58 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Rasmus Villemoes, Linux-Sparse On Wed, Feb 4, 2015 at 3:38 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > Working perfectly here. > Any update of the string test case? I commit a review version of the immutable string fix in review-immutable-string branch. Compare to the last patch, it take into account that string can come from macro arguments rather than macro body. It also avoid the string copy if there is no change, that is pretty common as well. Care to give this version a try? https://git.kernel.org/cgit/devel/sparse/chrisl/sparse.git/commit/?h=review-immutable-string&id=de1fa7e60d3d179a1b67c47a0429b2d0ac4e4842 Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-06 13:58 ` Christopher Li @ 2015-02-06 20:32 ` Rasmus Villemoes 0 siblings, 0 replies; 28+ messages in thread From: Rasmus Villemoes @ 2015-02-06 20:32 UTC (permalink / raw) To: Christopher Li; +Cc: Luc Van Oostenryck, Linux-Sparse On Fri, Feb 06 2015, Christopher Li <sparse@chrisli.org> wrote: > On Wed, Feb 4, 2015 at 3:38 PM, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: >> >> Working perfectly here. >> > > Any update of the string test case? > > I commit a review version of the immutable string fix in review-immutable-string > branch. Compare to the last patch, it take into account that string > can come from > macro arguments rather than macro body. > > It also avoid the string copy if there is no change, that is pretty > common as well. > > Care to give this version a try? > > https://git.kernel.org/cgit/devel/sparse/chrisl/sparse.git/commit/?h=review-immutable-string&id=de1fa7e60d3d179a1b67c47a0429b2d0ac4e4842 Sorry for taking so long to reply. This seems to work for the cases I can think of, but I haven't put it to the real test yet (whether the format strings are going to be messed up). Rasmus ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v2] Avoid reusing string buffer when doing string expansion 2015-02-04 8:01 ` Christopher Li 2015-02-04 16:38 ` Christopher Li @ 2015-02-04 23:38 ` Luc Van Oostenryck 1 sibling, 0 replies; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-04 23:38 UTC (permalink / raw) To: Christopher Li; +Cc: Rasmus Villemoes, Linux-Sparse On Wed, Feb 04, 2015 at 12:01:39AM -0800, Christopher Li wrote: > On Tue, Feb 3, 2015 at 10:22 PM, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > >> Are you sure about this behavior? You mean you see "b" has the string > >> size as 2. I haven't understand how this can happen. > > > > > > But if the macro is used several times: > > === > > #define BACKSLASH "\\" > > const char a[] = BACKSLASH; > > const char b[] = BACKSLASH; > > const char c[] = "<" BACKSLASH ">"; > > === > > > > the, we get: > > === > > symbol a: > > char const [addressable] [toplevel] a[0] > > bit_size = 16 > > val = "\0" > > symbol b: > > char const [addressable] [toplevel] b[0] > > bit_size = 16 > > The value buffer is corrupted. But the bit_size is still 16, which > is correct. I just think that in your example it shouldn't corrupt > the size. Your test case seems confirm that. > > > Is it only with macros that the string structure is so shared? > > That is right. I haven't see it can happen any other way. > The tokenizer always construct new token and string structure > from the C source file. > > It is the preprocessor using macro expand which copy and duplicate > the token list. The token has a pointer point to the string which > is shared across different invocation of macro. Fine. I was affraid that there was other possibilities, like, for exemple, if the identical string litterals are put in an hash table, like it is done for identifiers. > > And have we a way to test if the string is coming from a macro? > > Not right now. But we can add it. > > > > > A simpler and safer way would be to directly do the string expansion just after > > a string token is recognized, or even better in the lexer itself. > > So the string buffer, macro or not, will always directly contain the right values. > > But maybe there was good reasons to not do it this way. > > I have an counter example that will not work. Let say > > #define b(a, d) a##d > wchar_t s[] = b(L, "\xabcdabc"); > > When the lexer process the escape char, you did not know the string > is wide char or not. That can be changed after the macro expansion. > > Chris Yes, I see. BTW, I've checked and there is a lot of problems with wide strings. I'll send some test case later. Regards, Luc ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Bad interaction between macro expansion and literal concatenation 2015-01-30 22:16 Bad interaction between macro expansion and literal concatenation Rasmus Villemoes 2015-01-31 1:23 ` [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals Luc Van Oostenryck @ 2015-01-31 5:16 ` Christopher Li 2015-02-01 2:19 ` [PATCH 0/4] Teach sparse to display data/initial values Luc Van Oostenryck 1 sibling, 1 reply; 28+ messages in thread From: Christopher Li @ 2015-01-31 5:16 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: Linux-Sparse On Fri, Jan 30, 2015 at 2:16 PM, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > > KERN_DEBUG "%s: Reset: current Rx mode %d.\n" > > KERN_DEBUG ultimately expands to "\001" "7". When I print the > corresponding ->string->data, I get > > ^A6%s7%s: Reset: current Rx mode %d. Can you construct a small C program to demonstrate the problem without including any real kernel headers? e.g. some thing like: #define KERNEL_DEBUG "\001" "7" void foo(void) { printk(KERN_DEBUG "%s: .....", ...); } Obviously, my test case is not a complete one. It is not even valid C. It just show you some idea how a test case can be constructed. Once you have a small test case, it is much easier to discuss and debug what is going on there. BTW, how do you print the "string->data"? It helps to show the functions that does the print. Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 0/4] Teach sparse to display data/initial values 2015-01-31 5:16 ` Bad interaction between macro expansion and literal concatenation Christopher Li @ 2015-02-01 2:19 ` Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 1/4] Add support for '-vdata', the equivalent of '-ventry' but for data Luc Van Oostenryck ` (3 more replies) 0 siblings, 4 replies; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-01 2:19 UTC (permalink / raw) To: sparse; +Cc: linux-sparse, linux The patches in this serie implement the display of toplevel variable's initializer. It should allow to easily investigate the bug reported here: http://article.gmane.org/gmane.comp.parsers.sparse/4080 but can be usefull for others things. For convenience, this patch series can also be found at git://github.com/looxix/sparse.git show-data Luc Van Oostenryck (4): Add support for '-vdata', the equivalent of '-ventry' but for data Add support for show_data() Teach sparse to display data/initial values Small test/exemple for using '-vdata' lib.c | 2 ++ lib.h | 1 + linearize.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ linearize.h | 1 + sparse.c | 3 +++ validation/show_data.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 168 insertions(+) ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 1/4] Add support for '-vdata', the equivalent of '-ventry' but for data 2015-02-01 2:19 ` [PATCH 0/4] Teach sparse to display data/initial values Luc Van Oostenryck @ 2015-02-01 2:19 ` Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 2/4] Add support for show_data() Luc Van Oostenryck ` (2 subsequent siblings) 3 siblings, 0 replies; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-01 2:19 UTC (permalink / raw) To: sparse; +Cc: linux-sparse, linux, Luc Van Oostenryck Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- lib.c | 2 ++ lib.h | 1 + 2 files changed, 3 insertions(+) diff --git a/lib.c b/lib.c index 8dc5bcf9..73714109 100644 --- a/lib.c +++ b/lib.c @@ -243,6 +243,7 @@ int Wuninitialized = 1; int Wvla = 1; int dbg_entry = 0; +int dbg_data = 0; int dbg_dead = 0; int preprocess_only; @@ -517,6 +518,7 @@ static char **handle_switch_W(char *arg, char **next) static struct warning debugs[] = { { "entry", &dbg_entry}, + { "data", &dbg_data}, { "dead", &dbg_dead}, }; diff --git a/lib.h b/lib.h index 15b69fa2..2e1170b4 100644 --- a/lib.h +++ b/lib.h @@ -129,6 +129,7 @@ extern int Wuninitialized; extern int Wvla; extern int dbg_entry; +extern int dbg_data; extern int dbg_dead; extern int arch_m64; -- 2.2.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 2/4] Add support for show_data() 2015-02-01 2:19 ` [PATCH 0/4] Teach sparse to display data/initial values Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 1/4] Add support for '-vdata', the equivalent of '-ventry' but for data Luc Van Oostenryck @ 2015-02-01 2:19 ` Luc Van Oostenryck 2015-02-02 5:30 ` Christopher Li 2015-02-01 2:19 ` [PATCH 3/4] Teach sparse to display data/initial values Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 4/4] Small test/exemple for using '-vdata' Luc Van Oostenryck 3 siblings, 1 reply; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-01 2:19 UTC (permalink / raw) To: sparse; +Cc: linux-sparse, linux, Luc Van Oostenryck The purpose of show_data() is to visualize data, more or less like show_entry() allow to visualize the code. The support is not complete but at least the basic types are handled. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- linearize.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ linearize.h | 1 + 2 files changed, 87 insertions(+) diff --git a/linearize.c b/linearize.c index c6ada1e8..28beb271 100644 --- a/linearize.c +++ b/linearize.c @@ -2220,3 +2220,89 @@ struct entrypoint *linearize_symbol(struct symbol *sym) return linearize_fn(sym, base_type); return NULL; } + +static void do_indent(int level) +{ + while (level--) + printf("\t"); +} + +static void show_data_expr(struct expression *expr, int indent) +{ + struct expression *entry; + struct symbol *sym; + + switch (expr->type) { + case EXPR_VALUE: + printf("%lld", expr->value); + break; + case EXPR_SYMBOL: + sym = expr->symbol; + if (sym->initializer && sym->initializer->type == EXPR_STRING) + return show_data_expr(sym->initializer, indent); + printf("%s", show_ident(expr->symbol_name)); + break; + case EXPR_IMPLIED_CAST: + show_data_expr(expr->cast_expression, indent); + break; + case EXPR_STRING: + printf("%s", show_string(expr->string)); + break; + case EXPR_INITIALIZER: + printf("{"); + FOR_EACH_PTR(expr->expr_list, entry) { + switch (entry->type) { + case EXPR_POS: + case EXPR_INDEX: + case EXPR_IDENTIFIER: + continue; + } + do_indent(indent+1); + show_data_expr(entry, indent+1); + printf(",\n"); + } END_FOR_EACH_PTR(entry); + do_indent(indent); + printf("}"); + break; + case EXPR_PREOP: + if (expr->op == '*') { + show_data_expr(expr->unop, indent); + break; + } + // Fall through + default: + printf("!!unhandled expression type (%d)", expr->type); + break; + } +} + +void show_data(struct symbol *sym) +{ + struct expression *expr = sym->initializer; + + printf("symbol %s:\n", show_ident(sym->ident)); + printf("\t"); show_type(sym); printf("\n"); + printf("\tbit_size = %i\n", sym->bit_size); + printf("\tval = "); + switch (sym->ctype.base_type->type) { + case SYM_BASETYPE: + case SYM_ENUM: + show_data_expr(expr, 0); + break; + case SYM_PTR: + printf("&"); + show_data_expr(expr, 0); + break; + case SYM_ARRAY: + show_data_expr(expr, 0); + break; + case SYM_STRUCT: + case SYM_UNION: + show_data_expr(expr, 1); + break; + default: + printf("\ttype = %d\n", sym->ctype.base_type->type); + break; + } + printf("\n"); +} diff --git a/linearize.h b/linearize.h index 61fbd831..7b2d0681 100644 --- a/linearize.h +++ b/linearize.h @@ -344,6 +344,7 @@ void show_entry(struct entrypoint *ep); const char *show_pseudo(pseudo_t pseudo); void show_bb(struct basic_block *bb); const char *show_instruction(struct instruction *insn); +void show_data(struct symbol *sym); #endif /* LINEARIZE_H */ -- 2.2.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/4] Add support for show_data() 2015-02-01 2:19 ` [PATCH 2/4] Add support for show_data() Luc Van Oostenryck @ 2015-02-02 5:30 ` Christopher Li 2015-02-04 0:50 ` Luc Van Oostenryck 0 siblings, 1 reply; 28+ messages in thread From: Christopher Li @ 2015-02-02 5:30 UTC (permalink / raw) To: Luc Van Oostenryck; +Cc: Linux-Sparse, Rasmus Villemoes On Sat, Jan 31, 2015 at 6:19 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > The purpose of show_data() is to visualize data, more or less I think show_data() is a bad name. Data is very generic term, it has not specific meaning what exactly is that data. What you really want to show is the initializer expression. Make the function name reflect that. > diff --git a/linearize.c b/linearize.c Why is this function belongs to linearize.c? It has nothing to do with linearization instruction. It has every thing to do with the AST structure. I suggest that move this code to show-parse.c. There are many functions showing internal of the AST there. Chris ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/4] Add support for show_data() 2015-02-02 5:30 ` Christopher Li @ 2015-02-04 0:50 ` Luc Van Oostenryck 0 siblings, 0 replies; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-04 0:50 UTC (permalink / raw) To: Christopher Li; +Cc: Linux-Sparse, Rasmus Villemoes On Sun, Feb 01, 2015 at 09:30:44PM -0800, Christopher Li wrote: > On Sat, Jan 31, 2015 at 6:19 PM, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > The purpose of show_data() is to visualize data, more or less > > I think show_data() is a bad name. Data is very generic term, > it has not specific meaning what exactly is that data. > > What you really want to show is the initializer expression. > Make the function name reflect that. Yes, the name is too generic. The purpose was to use 'data' in opposition to 'code'. > > diff --git a/linearize.c b/linearize.c > > Why is this function belongs to linearize.c? > It has nothing to do with linearization instruction. It has > every thing to do with the AST structure. > > I suggest that move this code to show-parse.c. > There are many functions showing internal of the AST > there. Well, I put it in linearize because it's where show_entry() is defined and since I wanted this new show function be its complement, it seemed to me much better to put it there than in show-parse.c Even more so because I'm not interested in showing anything close to the AST, on the contrary, I want to show the values like you would find them in an assembler code file generated by a compiler after constant folding, string expansion, type coercion, ... Anyway, my intention was not to have it included as is (mainly because it's not complete) but only as a quick tool to investigate the bug that Rasmus have reported. I'll try to complete it and resubmit it then, taking your remarks in account. Regards, Luc ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 3/4] Teach sparse to display data/initial values 2015-02-01 2:19 ` [PATCH 0/4] Teach sparse to display data/initial values Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 1/4] Add support for '-vdata', the equivalent of '-ventry' but for data Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 2/4] Add support for show_data() Luc Van Oostenryck @ 2015-02-01 2:19 ` Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 4/4] Small test/exemple for using '-vdata' Luc Van Oostenryck 3 siblings, 0 replies; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-01 2:19 UTC (permalink / raw) To: sparse; +Cc: linux-sparse, linux, Luc Van Oostenryck Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- sparse.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sparse.c b/sparse.c index 6b3324cf..e87f8c4f 100644 --- a/sparse.c +++ b/sparse.c @@ -285,6 +285,9 @@ static void check_symbols(struct symbol_list *list) show_entry(ep); check_context(ep); + } else { + if (dbg_data) + show_data(sym); } } END_FOR_EACH_PTR(sym); -- 2.2.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 4/4] Small test/exemple for using '-vdata' 2015-02-01 2:19 ` [PATCH 0/4] Teach sparse to display data/initial values Luc Van Oostenryck ` (2 preceding siblings ...) 2015-02-01 2:19 ` [PATCH 3/4] Teach sparse to display data/initial values Luc Van Oostenryck @ 2015-02-01 2:19 ` Luc Van Oostenryck 3 siblings, 0 replies; 28+ messages in thread From: Luc Van Oostenryck @ 2015-02-01 2:19 UTC (permalink / raw) To: sparse; +Cc: linux-sparse, linux, Luc Van Oostenryck Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- validation/show_data.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+) create mode 100644 validation/show_data.c diff --git a/validation/show_data.c b/validation/show_data.c new file mode 100644 index 00000000..b620b610 --- /dev/null +++ b/validation/show_data.c @@ -0,0 +1,75 @@ +#define P "\001" + +static const char a[] = P "a"; +static const char b[] = P "b"; + +static const char *c = "abc"; + +static int i = 42; + +static int *ipi = &i; +static void *vpi = &i; + +struct s { + int i; +}; +static struct s s = { + .i = 42, +}; + +union u { + int i; + void *p; +}; +static union u u = { + .i = 42, +}; + +enum e { + first, + next, + last +}; +static enum e e = next; +/* + * check-name: show_data.c + * check-command: sparse -m32 -vdata $file + * check-output-start: +symbol a: + char static const [toplevel] a[0] + bit_size = 24 + val = "\1a" +symbol b: + char static const [toplevel] b[0] + bit_size = 32 + val = "\1b" +symbol c: + char const *static [toplevel] c + bit_size = 32 + val = &"abc" +symbol i: + int static [signed] [addressable] [toplevel] i + bit_size = 32 + val = 42 +symbol ipi: + int *static [toplevel] ipi + bit_size = 32 + val = &i +symbol vpi: + void *static [toplevel] vpi + bit_size = 32 + val = &i +symbol s: + struct s static [toplevel] s + bit_size = 32 + val = { } +symbol u: + union u static [toplevel] u + bit_size = 32 + val = { } +symbol e: + int enum e static [signed] [toplevel] e + bit_size = 32 + val = 1 + * check-output-end + */ -- 2.2.0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
end of thread, other threads:[~2015-02-09 21:48 UTC | newest] Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-01-30 22:16 Bad interaction between macro expansion and literal concatenation Rasmus Villemoes 2015-01-31 1:23 ` [PATCH] Avoid reuse of string buffer when concatening adjacent string litterals Luc Van Oostenryck 2015-02-03 22:38 ` Rasmus Villemoes 2015-02-04 0:32 ` Luc Van Oostenryck 2015-02-04 3:26 ` Christopher Li 2015-02-04 8:39 ` Rasmus Villemoes 2015-02-04 8:58 ` Rasmus Villemoes 2015-02-04 16:20 ` Christopher Li 2015-02-06 21:52 ` Rasmus Villemoes 2015-02-07 1:30 ` Christopher Li 2015-02-09 21:48 ` Damien Lespiau 2015-02-04 2:01 ` [PATCH v2] Avoid reusing string buffer when doing string expansion Luc Van Oostenryck 2015-02-04 5:30 ` Christopher Li 2015-02-04 6:22 ` Luc Van Oostenryck 2015-02-04 8:01 ` Christopher Li 2015-02-04 16:38 ` Christopher Li 2015-02-04 23:38 ` Luc Van Oostenryck 2015-02-06 13:58 ` Christopher Li 2015-02-06 20:32 ` Rasmus Villemoes 2015-02-04 23:38 ` Luc Van Oostenryck 2015-01-31 5:16 ` Bad interaction between macro expansion and literal concatenation Christopher Li 2015-02-01 2:19 ` [PATCH 0/4] Teach sparse to display data/initial values Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 1/4] Add support for '-vdata', the equivalent of '-ventry' but for data Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 2/4] Add support for show_data() Luc Van Oostenryck 2015-02-02 5:30 ` Christopher Li 2015-02-04 0:50 ` Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 3/4] Teach sparse to display data/initial values Luc Van Oostenryck 2015-02-01 2:19 ` [PATCH 4/4] Small test/exemple for using '-vdata' Luc Van Oostenryck
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.