All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Change the default for Mixed declarations.
@ 2023-02-14 16:07 Juan Quintela
  2023-03-23 19:00 ` Daniel P. Berrangé
  0 siblings, 1 reply; 10+ messages in thread
From: Juan Quintela @ 2023-02-14 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela

Hi

I want to enter a discussion about changing the default of the style
guide.

There are several reasons for that:
- they exist since C99 (i.e. all supported compilers support them)
- they eliminate the posibility of an unitialized variable.
- (at least for me), declaring the index inside the for make clear
  that index is not used outside the for.
- Current documentation already declares that they are allowed in some
  cases.
- Lots of places already use them.

We can change the text to whatever you want, just wondering if it is
valib to change the standard.

Doing a trivial grep through my local qemu messages (around 100k) it
shows that some people are complaining that they are not allowed, and
other saying that they are used all over the place.

Discuss.
---
 docs/devel/style.rst | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/docs/devel/style.rst b/docs/devel/style.rst
index 68aa776930..dc248aa9e4 100644
--- a/docs/devel/style.rst
+++ b/docs/devel/style.rst
@@ -202,15 +202,20 @@ Furthermore, it is the QEMU coding style.
 Declarations
 ============
 
-Mixed declarations (interleaving statements and declarations within
-blocks) are generally not allowed; declarations should be at the beginning
-of blocks.
-
-Every now and then, an exception is made for declarations inside a
-#ifdef or #ifndef block: if the code looks nicer, such declarations can
-be placed at the top of the block even if there are statements above.
-On the other hand, however, it's often best to move that #ifdef/#ifndef
-block to a separate function altogether.
+Declaring variables at first use has two advantages:
+- we can see the right type of the variable just to the use
+- we completely remove the posibility of using a variable that is
+  unitialized.
+
+It is especially the case when we are in a for statement.
+
+for (int i = X; i++; ..) {
+    ...
+}
+
+Makes clear visually that this variable is not useed outside of the for.
+
+Mixing declarations an code has been allowed since the C99 standard.
 
 Conditional statements
 ======================
-- 
2.39.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-02-14 16:07 [PATCH] Change the default for Mixed declarations Juan Quintela
@ 2023-03-23 19:00 ` Daniel P. Berrangé
  2023-03-24  8:43   ` Philippe Mathieu-Daudé
                     ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Daniel P. Berrangé @ 2023-03-23 19:00 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
> Hi
> 
> I want to enter a discussion about changing the default of the style
> guide.
> 
> There are several reasons for that:
> - they exist since C99 (i.e. all supported compilers support them)
> - they eliminate the posibility of an unitialized variable.

Actually they don't do that reliably. In fact, when combined
with usage of 'goto', they introduce uninitialized variables,
despite the declaration having an initialization present, and
thus actively mislead reviewers into thinking their code is
safe.

Consider this example:

#include <stdlib.h>
#include <stdio.h>

void foo(int something) {
  if (something == 8) {
    goto cleanup;
  }
  
  int nitems = 3;
  int *items = malloc(sizeof(int) *nitems);
       
  printf("Hello world %p\n", items);

 cleanup:
  printf("nitems=%d items=%p\n", nitems, items);
  if (nitems) {
    free(items);
  }
}

int main(int argc, char **argv) {
  foo(atoi(argv[1]));
  return 0;
}

Superficially everything looks awesome right - the variables are
all initialized at time of declaration after all.

$ gcc -Wall -o mixed mixed.c

$ ./mixed 3
Hello world 0x18ef2a0
nitems=3 items=0x18ef2a0

$ ./mixed 8
nitems=32677 items=0x7fa5a9209000
free(): invalid pointer
Aborted (core dumped)


What happens is that when you 'goto $LABEL' across a variable
declaration, the variable is in scope at your target label, but
its declared initializers never get run :-(

Luckily you can protect against that with gcc:

$ gcc -Wjump-misses-init -Wall -o mixed mixed.c
mixed.c: In function ‘foo’:
mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
    7 |            goto cleanup;
      |            ^~~~
mixed.c:15:5: note: label ‘cleanup’ defined here
   15 |     cleanup:
      |     ^~~~~~~
mixed.c:11:13: note: ‘items’ declared here
   11 |        int *items = malloc(sizeof(int) *nitems);
      |             ^~~~~
mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
    7 |            goto cleanup;
      |            ^~~~
mixed.c:15:5: note: label ‘cleanup’ defined here
   15 |     cleanup:
      |     ^~~~~~~
mixed.c:10:12: note: ‘nitems’ declared here
   10 |        int nitems = 3;
      |            ^~~~~~


however that will warn about *all* cases where we jump over a
declared variable, even if the variable we're jumping over is
not used at the target label location. IOW, it has significant
false positive rates. There are quite a few triggers for this
in the QEMU code already if we turn on this warning.

It also doesn't alter that the code initialization is misleading
to read.

> - (at least for me), declaring the index inside the for make clear
>   that index is not used outside the for.

I'll admit that declaring loop indexes in the for() is a nice
bit, but I'm not a fan in general of mixing the declarations
in the middle of code for projects that use the 'goto cleanup'
pattern.

> - Current documentation already declares that they are allowed in some
>   cases.
> - Lots of places already use them.
> 
> We can change the text to whatever you want, just wondering if it is
> valib to change the standard.
> 
> Doing a trivial grep through my local qemu messages (around 100k) it
> shows that some people are complaining that they are not allowed, and
> other saying that they are used all over the place.

IMHO the status quo is bad because it is actively dangerous when
combined with goto and we aren't using any compiler warnings to
help us.

Either we allow it, but use -Wjump-misses-init to prevent mixing
delayed declarations with gotos, and just avoid this when it triggers
a false positive.

Or we forbid it, rewrite current cases that use it, and then add
-Wdeclaration-after-statement to enforce it.


IMHO if we are concerned about uninitialized variables then I think
a better approach is to add -ftrivial-auto-var-init=zero, which will
make the compiler initialize all variables to 0 if they lack an
explicit initializer. 

> Discuss.
> ---
>  docs/devel/style.rst | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/docs/devel/style.rst b/docs/devel/style.rst
> index 68aa776930..dc248aa9e4 100644
> --- a/docs/devel/style.rst
> +++ b/docs/devel/style.rst
> @@ -202,15 +202,20 @@ Furthermore, it is the QEMU coding style.
>  Declarations
>  ============
>  
> -Mixed declarations (interleaving statements and declarations within
> -blocks) are generally not allowed; declarations should be at the beginning
> -of blocks.
> -
> -Every now and then, an exception is made for declarations inside a
> -#ifdef or #ifndef block: if the code looks nicer, such declarations can
> -be placed at the top of the block even if there are statements above.
> -On the other hand, however, it's often best to move that #ifdef/#ifndef
> -block to a separate function altogether.
> +Declaring variables at first use has two advantages:
> +- we can see the right type of the variable just to the use
> +- we completely remove the posibility of using a variable that is
> +  unitialized.
> +
> +It is especially the case when we are in a for statement.
> +
> +for (int i = X; i++; ..) {
> +    ...
> +}
> +
> +Makes clear visually that this variable is not useed outside of the for.
> +
> +Mixing declarations an code has been allowed since the C99 standard.
>  
>  Conditional statements
>  ======================
> -- 
> 2.39.1
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-23 19:00 ` Daniel P. Berrangé
@ 2023-03-24  8:43   ` Philippe Mathieu-Daudé
  2023-03-24 14:04   ` Alex Bennée
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-24  8:43 UTC (permalink / raw)
  To: Daniel P. Berrangé, Juan Quintela; +Cc: qemu-devel

On 23/3/23 20:00, Daniel P. Berrangé wrote:
> On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
>> Hi
>>
>> I want to enter a discussion about changing the default of the style
>> guide.
>>
>> There are several reasons for that:
>> - they exist since C99 (i.e. all supported compilers support them)
>> - they eliminate the posibility of an unitialized variable.
> 
> Actually they don't do that reliably. In fact, when combined
> with usage of 'goto', they introduce uninitialized variables,
> despite the declaration having an initialization present, and
> thus actively mislead reviewers into thinking their code is
> safe.


> IMHO the status quo is bad because it is actively dangerous when
> combined with goto and we aren't using any compiler warnings to
> help us.

Thanks, TIL this, interesting.

> Either we allow it, but use -Wjump-misses-init to prevent mixing
> delayed declarations with gotos, and just avoid this when it triggers
> a false positive.
> 
> Or we forbid it, rewrite current cases that use it, and then add
> -Wdeclaration-after-statement to enforce it.

I guess various macros (Q/LIST/FOO_FOREACH_BAR i.e.) already abuse that.

> IMHO if we are concerned about uninitialized variables then I think
> a better approach is to add -ftrivial-auto-var-init=zero, which will
> make the compiler initialize all variables to 0 if they lack an
> explicit initializer.
But we need to be aware of:

      With the option '-ftrivial-auto-var-init', all the automatic
      variables that do not have explicit initializers will be
      initialized by the compiler.  These additional compiler
      initializations might incur run-time overhead, sometimes
      dramatically.

Also:

     ‘pattern’ Initialize automatic variables with values which will
     likely transform logic bugs into crashes down the line, are easily
     recognized in a crash dump and without being values that programmers
     can rely on for useful program semantics. The current value is
     byte-repeatable pattern with byte "0xFE". The values used for
     pattern initialization might be changed in the future.

If we use -ftrivial-auto-var-init, could the
-ftrivial-auto-var-init=pattern form could be more beneficial to us?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-23 19:00 ` Daniel P. Berrangé
  2023-03-24  8:43   ` Philippe Mathieu-Daudé
@ 2023-03-24 14:04   ` Alex Bennée
  2023-03-24 17:39   ` Juan Quintela
  2023-03-27 10:45   ` Markus Armbruster
  3 siblings, 0 replies; 10+ messages in thread
From: Alex Bennée @ 2023-03-24 14:04 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Juan Quintela, qemu-devel


Daniel P. Berrangé <berrange@redhat.com> writes:

> On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
>> Hi
>> 
>> I want to enter a discussion about changing the default of the style
>> guide.
>> 
>> There are several reasons for that:
>> - they exist since C99 (i.e. all supported compilers support them)
>> - they eliminate the posibility of an unitialized variable.
>
> Actually they don't do that reliably. In fact, when combined
> with usage of 'goto', they introduce uninitialized variables,
> despite the declaration having an initialization present, and
> thus actively mislead reviewers into thinking their code is
> safe.
>
<snip>
>
>> - (at least for me), declaring the index inside the for make clear
>>   that index is not used outside the for.
>
> I'll admit that declaring loop indexes in the for() is a nice
> bit, but I'm not a fan in general of mixing the declarations
> in the middle of code for projects that use the 'goto cleanup'
> pattern.

I think we could just finesse the rules to allow declaring within the
for() as allowable as start of block. I think more freedom to declare on
first use is only warranted when the compiler will stop you foot gunning
yourself (as it does in Rust).

>> - Current documentation already declares that they are allowed in some
>>   cases.
>> - Lots of places already use them.
>> 
>> We can change the text to whatever you want, just wondering if it is
>> valib to change the standard.
>> 
>> Doing a trivial grep through my local qemu messages (around 100k) it
>> shows that some people are complaining that they are not allowed, and
>> other saying that they are used all over the place.
>
> IMHO the status quo is bad because it is actively dangerous when
> combined with goto and we aren't using any compiler warnings to
> help us.
>
> Either we allow it, but use -Wjump-misses-init to prevent mixing
> delayed declarations with gotos, and just avoid this when it triggers
> a false positive.

Has anyone looked to see how much this triggers on the code base as is?

> Or we forbid it, rewrite current cases that use it, and then add
> -Wdeclaration-after-statement to enforce it.
>
>
> IMHO if we are concerned about uninitialized variables then I think
> a better approach is to add -ftrivial-auto-var-init=zero, which will
> make the compiler initialize all variables to 0 if they lack an
> explicit initializer.

Would that make us less likely to find the occasional bug that does fire
when missing inititizers could be random?

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-23 19:00 ` Daniel P. Berrangé
  2023-03-24  8:43   ` Philippe Mathieu-Daudé
  2023-03-24 14:04   ` Alex Bennée
@ 2023-03-24 17:39   ` Juan Quintela
  2023-03-24 17:56     ` Alex Bennée
  2023-03-27  9:10     ` Daniel P. Berrangé
  2023-03-27 10:45   ` Markus Armbruster
  3 siblings, 2 replies; 10+ messages in thread
From: Juan Quintela @ 2023-03-24 17:39 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel

Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
>> Hi
>> 
>> I want to enter a discussion about changing the default of the style
>> guide.
>> 
>> There are several reasons for that:
>> - they exist since C99 (i.e. all supported compilers support them)
>> - they eliminate the posibility of an unitialized variable.
>
> Actually they don't do that reliably. In fact, when combined
> with usage of 'goto', they introduce uninitialized variables,
> despite the declaration having an initialization present, and
> thus actively mislead reviewers into thinking their code is
> safe.

Wait a minute.
If you use goto, you are already in special rules.

And don't get confused, I fully agree when using goto for two reasons:
- performance
  if you show that the code is x% faster when using goto, it is
  justified.  It is even better if you send a bug report to gcc/clang,
  but I will not opose that use.
- code clearity
  Some code (basically error paths) are clearer with goto that without
  them.

But that don't mind that mixed declarations are bad.  It means that goto
is complicated.

> Consider this example:
>
> #include <stdlib.h>
> #include <stdio.h>
>
> void foo(int something) {
>   if (something == 8) {
>     goto cleanup;
>   }
>   
>   int nitems = 3;
>   int *items = malloc(sizeof(int) *nitems);
>        
>   printf("Hello world %p\n", items);
>
>  cleanup:
>   printf("nitems=%d items=%p\n", nitems, items);
>   if (nitems) {
>     free(items);
>   }
> }
>
> int main(int argc, char **argv) {
>   foo(atoi(argv[1]));
>   return 0;
> }
>
> Superficially everything looks awesome right - the variables are
> all initialized at time of declaration after all.

If you uses goto, you need to check for that.
Majority of functions in qemu (and elsewhere) don't use goto, so don't
make the rules for the exception.

> $ gcc -Wall -o mixed mixed.c
>
> $ ./mixed 3
> Hello world 0x18ef2a0
> nitems=3 items=0x18ef2a0
>
> $ ./mixed 8
> nitems=32677 items=0x7fa5a9209000
> free(): invalid pointer
> Aborted (core dumped)
>
>
> What happens is that when you 'goto $LABEL' across a variable
> declaration, the variable is in scope at your target label, but
> its declared initializers never get run :-(
>
> Luckily you can protect against that with gcc:
>
> $ gcc -Wjump-misses-init -Wall -o mixed mixed.c
> mixed.c: In function ‘foo’:
> mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
>     7 |            goto cleanup;
>       |            ^~~~
> mixed.c:15:5: note: label ‘cleanup’ defined here
>    15 |     cleanup:
>       |     ^~~~~~~
> mixed.c:11:13: note: ‘items’ declared here
>    11 |        int *items = malloc(sizeof(int) *nitems);
>       |             ^~~~~
> mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
>     7 |            goto cleanup;
>       |            ^~~~
> mixed.c:15:5: note: label ‘cleanup’ defined here
>    15 |     cleanup:
>       |     ^~~~~~~
> mixed.c:10:12: note: ‘nitems’ declared here
>    10 |        int nitems = 3;
>       |            ^~~~~~
>

Nice option.  I didn't know about it.

> however that will warn about *all* cases where we jump over a
> declared variable, even if the variable we're jumping over is
> not used at the target label location. IOW, it has significant
> false positive rates. There are quite a few triggers for this
> in the QEMU code already if we turn on this warning.
> It also doesn't alter that the code initialization is misleading
> to read.

I will take a look next week and see how many errors we have have like
that and then we can decide if it is fixable to add that option.

>> - (at least for me), declaring the index inside the for make clear
>>   that index is not used outside the for.
>
> I'll admit that declaring loop indexes in the for() is a nice
> bit, but I'm not a fan in general of mixing the declarations
> in the middle of code for projects that use the 'goto cleanup'
> pattern.

don't do that if we use goto.  I will resend the patch and put the
warning about goto.

>> - Current documentation already declares that they are allowed in some
>>   cases.
>> - Lots of places already use them.
>> 
>> We can change the text to whatever you want, just wondering if it is
>> valib to change the standard.
>> 
>> Doing a trivial grep through my local qemu messages (around 100k) it
>> shows that some people are complaining that they are not allowed, and
>> other saying that they are used all over the place.
>
> IMHO the status quo is bad because it is actively dangerous when
> combined with goto and we aren't using any compiler warnings to
> help us.
>
> Either we allow it, but use -Wjump-misses-init to prevent mixing
> delayed declarations with gotos, and just avoid this when it triggers
> a false positive.
>
> Or we forbid it, rewrite current cases that use it, and then add
> -Wdeclaration-after-statement to enforce it.
>
>
> IMHO if we are concerned about uninitialized variables then I think
> a better approach is to add -ftrivial-auto-var-init=zero, which will
> make the compiler initialize all variables to 0 if they lack an
> explicit initializer. 

I think this is a bad idea.
If we want to "catch" unitialized variables, using something like:

-ftrivial-auto-var-init=pattern sounds much saner.

Obviously gcc is missing

-ftrivial-auto-var-init=42

But well, no project is perfect.

Later, Juan.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-24 17:39   ` Juan Quintela
@ 2023-03-24 17:56     ` Alex Bennée
  2023-03-27  9:12       ` Daniel P. Berrangé
  2023-03-27 10:49       ` Markus Armbruster
  2023-03-27  9:10     ` Daniel P. Berrangé
  1 sibling, 2 replies; 10+ messages in thread
From: Alex Bennée @ 2023-03-24 17:56 UTC (permalink / raw)
  To: quintela; +Cc: Daniel P. Berrangé, qemu-devel


Juan Quintela <quintela@redhat.com> writes:

> Daniel P. Berrangé <berrange@redhat.com> wrote:
>> On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
>>> Hi
>>> 
>>> I want to enter a discussion about changing the default of the style
>>> guide.
>>> 
>>> There are several reasons for that:
>>> - they exist since C99 (i.e. all supported compilers support them)
>>> - they eliminate the posibility of an unitialized variable.
>>
>> Actually they don't do that reliably. In fact, when combined
>> with usage of 'goto', they introduce uninitialized variables,
>> despite the declaration having an initialization present, and
>> thus actively mislead reviewers into thinking their code is
>> safe.
>
> Wait a minute.
> If you use goto, you are already in special rules.
>
> And don't get confused, I fully agree when using goto for two reasons:
> - performance
>   if you show that the code is x% faster when using goto, it is
>   justified.  It is even better if you send a bug report to gcc/clang,
>   but I will not opose that use.

I await a clear example in the context of QEMU - there is almost always
a better way to structure things.

> - code clearity
>   Some code (basically error paths) are clearer with goto that without
>   them.

Now we have g_auto* and lock guards we should encourage their use. goto
error_path is a relic of a simpler time ;-)

<snip>
>> IMHO if we are concerned about uninitialized variables then I think
>> a better approach is to add -ftrivial-auto-var-init=zero, which will
>> make the compiler initialize all variables to 0 if they lack an
>> explicit initializer. 
>
> I think this is a bad idea.
> If we want to "catch" unitialized variables, using something like:
>
> -ftrivial-auto-var-init=pattern sounds much saner.
>
> Obviously gcc is missing
>
> -ftrivial-auto-var-init=42

I think we could at least eat the runtime cost of
-ftrvial-auto-var-init=0xDEADBEEF for our --enable-debug builds.

>
> But well, no project is perfect.
>
> Later, Juan.


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-24 17:39   ` Juan Quintela
  2023-03-24 17:56     ` Alex Bennée
@ 2023-03-27  9:10     ` Daniel P. Berrangé
  1 sibling, 0 replies; 10+ messages in thread
From: Daniel P. Berrangé @ 2023-03-27  9:10 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

On Fri, Mar 24, 2023 at 06:39:34PM +0100, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
> >> Hi
> >> 
> >> I want to enter a discussion about changing the default of the style
> >> guide.
> >> 
> >> There are several reasons for that:
> >> - they exist since C99 (i.e. all supported compilers support them)
> >> - they eliminate the posibility of an unitialized variable.
> >
> > Actually they don't do that reliably. In fact, when combined
> > with usage of 'goto', they introduce uninitialized variables,
> > despite the declaration having an initialization present, and
> > thus actively mislead reviewers into thinking their code is
> > safe.
> 
> Wait a minute.
> If you use goto, you are already in special rules.
> 
> And don't get confused, I fully agree when using goto for two reasons:
> - performance
>   if you show that the code is x% faster when using goto, it is
>   justified.  It is even better if you send a bug report to gcc/clang,
>   but I will not opose that use.
> - code clearity
>   Some code (basically error paths) are clearer with goto that without
>   them.
> 
> But that don't mind that mixed declarations are bad.  It means that goto
> is complicated.

Yes, goto is complicated and we shouldn't make that worse by using a
code pattern that encourages mistakes IMHO.

> >> - Current documentation already declares that they are allowed in some
> >>   cases.
> >> - Lots of places already use them.
> >> 
> >> We can change the text to whatever you want, just wondering if it is
> >> valib to change the standard.
> >> 
> >> Doing a trivial grep through my local qemu messages (around 100k) it
> >> shows that some people are complaining that they are not allowed, and
> >> other saying that they are used all over the place.
> >
> > IMHO the status quo is bad because it is actively dangerous when
> > combined with goto and we aren't using any compiler warnings to
> > help us.
> >
> > Either we allow it, but use -Wjump-misses-init to prevent mixing
> > delayed declarations with gotos, and just avoid this when it triggers
> > a false positive.
> >
> > Or we forbid it, rewrite current cases that use it, and then add
> > -Wdeclaration-after-statement to enforce it.
> >
> >
> > IMHO if we are concerned about uninitialized variables then I think
> > a better approach is to add -ftrivial-auto-var-init=zero, which will
> > make the compiler initialize all variables to 0 if they lack an
> > explicit initializer. 
> 
> I think this is a bad idea.
> If we want to "catch" unitialized variables, using something like:
> 
> -ftrivial-auto-var-init=pattern sounds much saner.

It depends on what you are aiming to achieve.

In almost all cases where we forgot to initialize something, all-zeros
is the value that we would have wanted to be present. IOW, init=zero
will (almost) always make the code do what we wanted it to do, and thus
is the safe option to make QEMU robust.

Using a non-zero value will be potentially dangerous in a number of
possible ways. It will lead to loops iterating when they should not,
potentially even infinite loops. It will lead to reads/writes off
the end of arrays. It will lead to attempts to free() invalid pointers.
Essentially all the ways in which an uninitialized value can make the
code go wrong wil still potentially happen if we initialized to a
non-zero value. The only benefit is that it will go horribly wrong
in the same way each time.

IOW...

* If you want the application to be robust and generally "do the
  right thing", even in the face of missing initializers, then
  using -ftrivial-auto-var-init=zero is the right answer.

* If you want the application to go horribly wrong, but in a
  repeatable manner, then -ftrivial-auto-var-init=pattern is the
  right answer

Personally I prefer QEMU to be robust and thus believe we should
initialize to zero in real world deployments.

Potentially there's a case for CI jobs to use a non-zero pattern
though to try to expose edge cases.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-24 17:56     ` Alex Bennée
@ 2023-03-27  9:12       ` Daniel P. Berrangé
  2023-03-27 10:49       ` Markus Armbruster
  1 sibling, 0 replies; 10+ messages in thread
From: Daniel P. Berrangé @ 2023-03-27  9:12 UTC (permalink / raw)
  To: Alex Bennée; +Cc: quintela, qemu-devel

On Fri, Mar 24, 2023 at 05:56:46PM +0000, Alex Bennée wrote:
> 
> Juan Quintela <quintela@redhat.com> writes:
> 
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >> On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
> >>> Hi
> >>> 
> >>> I want to enter a discussion about changing the default of the style
> >>> guide.
> >>> 
> >>> There are several reasons for that:
> >>> - they exist since C99 (i.e. all supported compilers support them)
> >>> - they eliminate the posibility of an unitialized variable.
> >>
> >> Actually they don't do that reliably. In fact, when combined
> >> with usage of 'goto', they introduce uninitialized variables,
> >> despite the declaration having an initialization present, and
> >> thus actively mislead reviewers into thinking their code is
> >> safe.
> >
> > Wait a minute.
> > If you use goto, you are already in special rules.
> >
> > And don't get confused, I fully agree when using goto for two reasons:
> > - performance
> >   if you show that the code is x% faster when using goto, it is
> >   justified.  It is even better if you send a bug report to gcc/clang,
> >   but I will not opose that use.
> 
> I await a clear example in the context of QEMU - there is almost always
> a better way to structure things.
> 
> > - code clearity
> >   Some code (basically error paths) are clearer with goto that without
> >   them.
> 
> Now we have g_auto* and lock guards we should encourage their use. goto
> error_path is a relic of a simpler time ;-)
> 
> <snip>
> >> IMHO if we are concerned about uninitialized variables then I think
> >> a better approach is to add -ftrivial-auto-var-init=zero, which will
> >> make the compiler initialize all variables to 0 if they lack an
> >> explicit initializer. 
> >
> > I think this is a bad idea.
> > If we want to "catch" unitialized variables, using something like:
> >
> > -ftrivial-auto-var-init=pattern sounds much saner.
> >
> > Obviously gcc is missing
> >
> > -ftrivial-auto-var-init=42
> 
> I think we could at least eat the runtime cost of
> -ftrvial-auto-var-init=0xDEADBEEF for our --enable-debug builds.

If there is ever a case where an uninitialized var gets used as a
loop counter, that's 3,735,928,559 iterations. A small value pattern
would avoid such CPU burn.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-23 19:00 ` Daniel P. Berrangé
                     ` (2 preceding siblings ...)
  2023-03-24 17:39   ` Juan Quintela
@ 2023-03-27 10:45   ` Markus Armbruster
  3 siblings, 0 replies; 10+ messages in thread
From: Markus Armbruster @ 2023-03-27 10:45 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Juan Quintela, qemu-devel

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
>> Hi
>> 
>> I want to enter a discussion about changing the default of the style
>> guide.
>> 
>> There are several reasons for that:
>> - they exist since C99 (i.e. all supported compilers support them)
>> - they eliminate the posibility of an unitialized variable.
>
> Actually they don't do that reliably. In fact, when combined
> with usage of 'goto', they introduce uninitialized variables,
> despite the declaration having an initialization present, and
> thus actively mislead reviewers into thinking their code is
> safe.
>
> Consider this example:

[...]

> What happens is that when you 'goto $LABEL' across a variable
> declaration, the variable is in scope at your target label, but
> its declared initializers never get run :-(
>
> Luckily you can protect against that with gcc:
>
> $ gcc -Wjump-misses-init -Wall -o mixed mixed.c
> mixed.c: In function ‘foo’:
> mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
>     7 |            goto cleanup;
>       |            ^~~~
> mixed.c:15:5: note: label ‘cleanup’ defined here
>    15 |     cleanup:
>       |     ^~~~~~~
> mixed.c:11:13: note: ‘items’ declared here
>    11 |        int *items = malloc(sizeof(int) *nitems);
>       |             ^~~~~
> mixed.c:7:12: warning: jump skips variable initialization [-Wjump-misses-init]
>     7 |            goto cleanup;
>       |            ^~~~
> mixed.c:15:5: note: label ‘cleanup’ defined here
>    15 |     cleanup:
>       |     ^~~~~~~
> mixed.c:10:12: note: ‘nitems’ declared here
>    10 |        int nitems = 3;
>       |            ^~~~~~
>
>
> however that will warn about *all* cases where we jump over a
> declared variable, even if the variable we're jumping over is
> not used at the target label location. IOW, it has significant
> false positive rates. There are quite a few triggers for this
> in the QEMU code already if we turn on this warning.
>
> It also doesn't alter that the code initialization is misleading
> to read.

Yup.  Strong dislike.

>> - (at least for me), declaring the index inside the for make clear
>>   that index is not used outside the for.
>
> I'll admit that declaring loop indexes in the for() is a nice
> bit, but I'm not a fan in general of mixing the declarations
> in the middle of code for projects that use the 'goto cleanup'
> pattern.

A declaration in a for statement's first operand is effectively at the
beginning of a block.  Therefore, use of this feature is already
sanctioned by the QEMU Coding Style.  The proposed patch at most
clarifies this.

>> - Current documentation already declares that they are allowed in some
>>   cases.
>> - Lots of places already use them.
>> 
>> We can change the text to whatever you want, just wondering if it is
>> valib to change the standard.
>> 
>> Doing a trivial grep through my local qemu messages (around 100k) it
>> shows that some people are complaining that they are not allowed, and
>> other saying that they are used all over the place.
>
> IMHO the status quo is bad because it is actively dangerous when
> combined with goto and we aren't using any compiler warnings to
> help us.
>
> Either we allow it, but use -Wjump-misses-init to prevent mixing
> delayed declarations with gotos, and just avoid this when it triggers
> a false positive.
>
> Or we forbid it, rewrite current cases that use it, and then add
> -Wdeclaration-after-statement to enforce it.

I'm in favour of -Wdeclaration-after-statement.

> IMHO if we are concerned about uninitialized variables then I think
> a better approach is to add -ftrivial-auto-var-init=zero, which will
> make the compiler initialize all variables to 0 if they lack an
> explicit initializer. 

How often do we get bitten by uninitialized variables despite
-Wmaybe-uninitialized?  Honest question!

>> Discuss.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Change the default for Mixed declarations.
  2023-03-24 17:56     ` Alex Bennée
  2023-03-27  9:12       ` Daniel P. Berrangé
@ 2023-03-27 10:49       ` Markus Armbruster
  1 sibling, 0 replies; 10+ messages in thread
From: Markus Armbruster @ 2023-03-27 10:49 UTC (permalink / raw)
  To: Alex Bennée; +Cc: quintela, Daniel P. Berrangé, qemu-devel

Alex Bennée <alex.bennee@linaro.org> writes:

> Juan Quintela <quintela@redhat.com> writes:
>
>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>>> On Tue, Feb 14, 2023 at 05:07:38PM +0100, Juan Quintela wrote:
>>>> Hi
>>>> 
>>>> I want to enter a discussion about changing the default of the style
>>>> guide.
>>>> 
>>>> There are several reasons for that:
>>>> - they exist since C99 (i.e. all supported compilers support them)
>>>> - they eliminate the posibility of an unitialized variable.
>>>
>>> Actually they don't do that reliably. In fact, when combined
>>> with usage of 'goto', they introduce uninitialized variables,
>>> despite the declaration having an initialization present, and
>>> thus actively mislead reviewers into thinking their code is
>>> safe.
>>
>> Wait a minute.
>> If you use goto, you are already in special rules.
>>
>> And don't get confused, I fully agree when using goto for two reasons:
>> - performance
>>   if you show that the code is x% faster when using goto, it is
>>   justified.  It is even better if you send a bug report to gcc/clang,
>>   but I will not opose that use.
>
> I await a clear example in the context of QEMU - there is almost always
> a better way to structure things.
>
>> - code clearity
>>   Some code (basically error paths) are clearer with goto that without
>>   them.
>
> Now we have g_auto* and lock guards we should encourage their use. goto
> error_path is a relic of a simpler time ;-)

Only 8004 places to "modernize" (not counting generated code and
documentation) before presence of goto ceases to be a concern.

[...]



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-03-27 10:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-14 16:07 [PATCH] Change the default for Mixed declarations Juan Quintela
2023-03-23 19:00 ` Daniel P. Berrangé
2023-03-24  8:43   ` Philippe Mathieu-Daudé
2023-03-24 14:04   ` Alex Bennée
2023-03-24 17:39   ` Juan Quintela
2023-03-24 17:56     ` Alex Bennée
2023-03-27  9:12       ` Daniel P. Berrangé
2023-03-27 10:49       ` Markus Armbruster
2023-03-27  9:10     ` Daniel P. Berrangé
2023-03-27 10:45   ` Markus Armbruster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.