All of lore.kernel.org
 help / color / mirror / Atom feed
* Buggy variable-length array code...or compiler?
@ 2010-02-25 23:17 Steven J. Magnani
  2010-02-25 23:23 ` Samuel Thibault
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Steven J. Magnani @ 2010-02-25 23:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: microblaze-uclinux, dan.j.williams

When I run a memcpy dmatest with a Microblaze 2.6.33 noMMU kernel, the
system crashes after about 400 iterations. After much head scratching, I
believe I've narrowed the problem to this fragment of code in
drivers/dma/dmatest.c:

static int dmatest_func(void *data)
{
	struct dmatest_thread	*thread = data;
...
	unsigned int		total_tests = 0;
	int			src_cnt;
	int			dst_cnt;

...
	if (thread->type == DMA_MEMCPY)
		src_cnt = dst_cnt = 1;
...

	while (!kthread_should_stop()
	       && !(iterations && total_tests >= iterations)) {
		struct dma_device *dev = chan->device;
		struct dma_async_tx_descriptor *tx = NULL;
		dma_addr_t dma_srcs[src_cnt];
		dma_addr_t dma_dsts[dst_cnt];

		...
		total_tests++;

		/* CODE ADDED BY ME FOR DEBUG */
		printk("dmatest: Iteration %d, dma_srcs = %p\n",
		       total_tests, dma_srcs);

		...
	}

With this code I get output like this:

dmatest: Iteration 1, dma_srcs = 2c963ee8
dmatest: Iteration 2, dma_srcs = 2c963ed8
dmatest: Iteration 3, dma_srcs = 2c963ec8
dmatest: Iteration 4, dma_srcs = 2c963eb8
...
dmatest: Iteration 420, dma_srcs = 2c9624b8

...and then the stack detonates and the kernel crashes with some strange
error or other.

Are there any language lawyers in the house who'd care to weigh in on
which of these possibilities is the right one?

1. There is a coding error in dmatest
2. There is a bug specific to Microblaze gcc compiler(s) [mine is 4.1.2]
3. There is a bug generic to specific versions of gcc compilers
4. There is a bug generic to all gcc compilers

Obviously, the options get more disturbing the higher you go. I don't
know if VLAs are used elsewhere in the kernel; a 'smatch' search might
be helpful.

Regards,
------------------------------------------------------------------------
 Steven J. Magnani               "I claim this network for MARS!
 www.digidescorp.com              Earthling, return my space modulator!"

 #include <standard.disclaimer>




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-25 23:17 Buggy variable-length array code...or compiler? Steven J. Magnani
@ 2010-02-25 23:23 ` Samuel Thibault
  2010-02-25 23:46 ` David Rientjes
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Samuel Thibault @ 2010-02-25 23:23 UTC (permalink / raw)
  To: Steven J. Magnani; +Cc: linux-kernel, microblaze-uclinux, dan.j.williams

IIRC C99 allows VLAs to not be freed immediately, but at function
termination, which makes things way simpler to implement for compilers.
I've noticed it on Solaris and cell's spugcc.

Steven J. Magnani, le Thu 25 Feb 2010 17:17:29 -0600, a écrit :
> 1. There is a coding error in dmatest

It needs to get out of the function from times to times to free the
VLAs.

> 2. There is a bug specific to Microblaze gcc compiler(s) [mine is 4.1.2]
> 3. There is a bug generic to specific versions of gcc compilers
> 4. There is a bug generic to all gcc compilers

I've usually seen gcc free VLAs as soon as possible, so I believe it's
only a few targets.

Samuel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-25 23:17 Buggy variable-length array code...or compiler? Steven J. Magnani
  2010-02-25 23:23 ` Samuel Thibault
@ 2010-02-25 23:46 ` David Rientjes
  2010-02-26 10:27   ` Dan Carpenter
  2010-02-26  0:46 ` J.A. Magallón
  2010-02-26 10:11 ` Dan Carpenter
  3 siblings, 1 reply; 10+ messages in thread
From: David Rientjes @ 2010-02-25 23:46 UTC (permalink / raw)
  To: Steven J. Magnani
  Cc: linux-kernel, microblaze-uclinux, dan.j.williams, Dave Hansen

On Thu, 25 Feb 2010, Steven J. Magnani wrote:

> When I run a memcpy dmatest with a Microblaze 2.6.33 noMMU kernel, the
> system crashes after about 400 iterations. After much head scratching, I
> believe I've narrowed the problem to this fragment of code in
> drivers/dma/dmatest.c:
> 
> static int dmatest_func(void *data)
> {
> 	struct dmatest_thread	*thread = data;
> ...
> 	unsigned int		total_tests = 0;
> 	int			src_cnt;
> 	int			dst_cnt;
> 
> ...
> 	if (thread->type == DMA_MEMCPY)
> 		src_cnt = dst_cnt = 1;
> ...
> 
> 	while (!kthread_should_stop()
> 	       && !(iterations && total_tests >= iterations)) {
> 		struct dma_device *dev = chan->device;
> 		struct dma_async_tx_descriptor *tx = NULL;
> 		dma_addr_t dma_srcs[src_cnt];
> 		dma_addr_t dma_dsts[dst_cnt];
> 
> 		...
> 		total_tests++;
> 
> 		/* CODE ADDED BY ME FOR DEBUG */
> 		printk("dmatest: Iteration %d, dma_srcs = %p\n",
> 		       total_tests, dma_srcs);
> 
> 		...
> 	}
> 
> With this code I get output like this:
> 
> dmatest: Iteration 1, dma_srcs = 2c963ee8
> dmatest: Iteration 2, dma_srcs = 2c963ed8
> dmatest: Iteration 3, dma_srcs = 2c963ec8
> dmatest: Iteration 4, dma_srcs = 2c963eb8
> ...
> dmatest: Iteration 420, dma_srcs = 2c9624b8
> 
> ...and then the stack detonates and the kernel crashes with some strange
> error or other.
> 

This could probably become the first kernel user of the flexible array 
library (see Documentation/flexible-arrays.txt).  Dan?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-25 23:17 Buggy variable-length array code...or compiler? Steven J. Magnani
  2010-02-25 23:23 ` Samuel Thibault
  2010-02-25 23:46 ` David Rientjes
@ 2010-02-26  0:46 ` J.A. Magallón
  2010-02-26 17:43   ` Samuel Thibault
  2010-02-26 18:52   ` Steven J. Magnani
  2010-02-26 10:11 ` Dan Carpenter
  3 siblings, 2 replies; 10+ messages in thread
From: J.A. Magallón @ 2010-02-26  0:46 UTC (permalink / raw)
  To: LKML

On Thu, 25 Feb 2010 17:17:29 -0600, "Steven J. Magnani" <steve@digidescorp.com> wrote:

> When I run a memcpy dmatest with a Microblaze 2.6.33 noMMU kernel, the
> system crashes after about 400 iterations. After much head scratching, I
> believe I've narrowed the problem to this fragment of code in
> drivers/dma/dmatest.c:
> 
> static int dmatest_func(void *data)
> {
> 	struct dmatest_thread	*thread = data;
> ...
> 	unsigned int		total_tests = 0;
> 	int			src_cnt;
> 	int			dst_cnt;
> 
> ...
> 	if (thread->type == DMA_MEMCPY)
> 		src_cnt = dst_cnt = 1;
> ...
> 
> 	while (!kthread_should_stop()
> 	       && !(iterations && total_tests >= iterations)) {
> 		struct dma_device *dev = chan->device;
> 		struct dma_async_tx_descriptor *tx = NULL;
> 		dma_addr_t dma_srcs[src_cnt];
> 		dma_addr_t dma_dsts[dst_cnt];
> 
> 		...
> 		total_tests++;
> 
> 		/* CODE ADDED BY ME FOR DEBUG */
> 		printk("dmatest: Iteration %d, dma_srcs = %p\n",
> 		       total_tests, dma_srcs);
> 
> 		...
> 	}
> 
> With this code I get output like this:
> 
> dmatest: Iteration 1, dma_srcs = 2c963ee8
> dmatest: Iteration 2, dma_srcs = 2c963ed8
> dmatest: Iteration 3, dma_srcs = 2c963ec8
> dmatest: Iteration 4, dma_srcs = 2c963eb8
> ...
> dmatest: Iteration 420, dma_srcs = 2c9624b8
> 
> ...and then the stack detonates and the kernel crashes with some strange
> error or other.
> 
> Are there any language lawyers in the house who'd care to weigh in on
> which of these possibilities is the right one?
> 
> 1. There is a coding error in dmatest
> 2. There is a bug specific to Microblaze gcc compiler(s) [mine is 4.1.2]
> 3. There is a bug generic to specific versions of gcc compilers
> 4. There is a bug generic to all gcc compilers
> 
> Obviously, the options get more disturbing the higher you go. I don't
> know if VLAs are used elsewhere in the kernel; a 'smatch' search might
> be helpful.

Can you try this in userspace ? I compiled in CentOS gcc 4.1.2 (just the
same), and the addresses are always the same:

#include <stdio.h>

int main()
{
    int c = 2;
    while (1)
    {
        int a[c];
        int b[c];
        a[0] = b[0];
        printf("%p %p\n",a,b);
    }
}

Could you post the full contents of the while loop ? Perhaps there's a
buglet in other piece of code that leaves something on the stack.
Which is the size of dma_addr_t ? Does it match the difference of 16 bytes
on each iteration ? cnt's are always 1, isn't it ?
Can you switch the size to a fixed '1' to see if this hangs again ?

TIA

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-25 23:17 Buggy variable-length array code...or compiler? Steven J. Magnani
                   ` (2 preceding siblings ...)
  2010-02-26  0:46 ` J.A. Magallón
@ 2010-02-26 10:11 ` Dan Carpenter
  3 siblings, 0 replies; 10+ messages in thread
From: Dan Carpenter @ 2010-02-26 10:11 UTC (permalink / raw)
  To: Steven J. Magnani; +Cc: linux-kernel, microblaze-uclinux, dan.j.williams

On Thu, Feb 25, 2010 at 05:17:29PM -0600, Steven J. Magnani wrote:
> When I run a memcpy dmatest with a Microblaze 2.6.33 noMMU kernel, the
> system crashes after about 400 iterations. After much head scratching, I
> believe I've narrowed the problem to this fragment of code in
> drivers/dma/dmatest.c:
> 
> static int dmatest_func(void *data)
> {
> 	struct dmatest_thread	*thread = data;
> ...
> 	unsigned int		total_tests = 0;
> 	int			src_cnt;
> 	int			dst_cnt;
> 
> ...
> 	if (thread->type == DMA_MEMCPY)
> 		src_cnt = dst_cnt = 1;
> ...
> 
> 	while (!kthread_should_stop()
> 	       && !(iterations && total_tests >= iterations)) {
> 		struct dma_device *dev = chan->device;
> 		struct dma_async_tx_descriptor *tx = NULL;
> 		dma_addr_t dma_srcs[src_cnt];
> 		dma_addr_t dma_dsts[dst_cnt];
> 
> 		...
> 		total_tests++;
> 
> 		/* CODE ADDED BY ME FOR DEBUG */
> 		printk("dmatest: Iteration %d, dma_srcs = %p\n",
> 		       total_tests, dma_srcs);
> 
> 		...
> 	}
> 
> With this code I get output like this:
> 
> dmatest: Iteration 1, dma_srcs = 2c963ee8
> dmatest: Iteration 2, dma_srcs = 2c963ed8
> dmatest: Iteration 3, dma_srcs = 2c963ec8
> dmatest: Iteration 4, dma_srcs = 2c963eb8
> ...
> dmatest: Iteration 420, dma_srcs = 2c9624b8
> 
> ...and then the stack detonates and the kernel crashes with some strange
> error or other.
> 
> Are there any language lawyers in the house who'd care to weigh in on
> which of these possibilities is the right one?
> 
> 1. There is a coding error in dmatest
> 2. There is a bug specific to Microblaze gcc compiler(s) [mine is 4.1.2]
> 3. There is a bug generic to specific versions of gcc compilers
> 4. There is a bug generic to all gcc compilers
> 
> Obviously, the options get more disturbing the higher you go. I don't
> know if VLAs are used elsewhere in the kernel; a 'smatch' search might
> be helpful.
> 

Sparse complains about variable length arrays.

drivers/dma/dmatest.c:293:37: error: bad constant expression

In my allmodconfig with staging:
~/progs/kernel/devel$ grep "bad constant expression" warns.txt  | sort -u | wc -l
133

regards,
dan carpenter


> Regards,
> ------------------------------------------------------------------------
>  Steven J. Magnani               "I claim this network for MARS!
>  www.digidescorp.com              Earthling, return my space modulator!"
> 
>  #include <standard.disclaimer>
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-25 23:46 ` David Rientjes
@ 2010-02-26 10:27   ` Dan Carpenter
  2010-02-26 19:15     ` Steven J. Magnani
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Carpenter @ 2010-02-26 10:27 UTC (permalink / raw)
  To: David Rientjes
  Cc: Steven J. Magnani, linux-kernel, microblaze-uclinux,
	dan.j.williams, Dave Hansen

On Thu, Feb 25, 2010 at 03:46:48PM -0800, David Rientjes wrote:
> On Thu, 25 Feb 2010, Steven J. Magnani wrote:
> 
> > When I run a memcpy dmatest with a Microblaze 2.6.33 noMMU kernel, the
> > system crashes after about 400 iterations. After much head scratching, I
> > believe I've narrowed the problem to this fragment of code in
> > drivers/dma/dmatest.c:
> > 
> > static int dmatest_func(void *data)
> > {
> > 	struct dmatest_thread	*thread = data;
> > ...
> > 	unsigned int		total_tests = 0;
> > 	int			src_cnt;
> > 	int			dst_cnt;
> > 
> > ...
> > 	if (thread->type == DMA_MEMCPY)
> > 		src_cnt = dst_cnt = 1;
> > ...
> > 
> > 	while (!kthread_should_stop()
> > 	       && !(iterations && total_tests >= iterations)) {
> > 		struct dma_device *dev = chan->device;
> > 		struct dma_async_tx_descriptor *tx = NULL;
> > 		dma_addr_t dma_srcs[src_cnt];
> > 		dma_addr_t dma_dsts[dst_cnt];
> > 
> > 		...
> > 		total_tests++;
> > 
> > 		/* CODE ADDED BY ME FOR DEBUG */
> > 		printk("dmatest: Iteration %d, dma_srcs = %p\n",
> > 		       total_tests, dma_srcs);
> > 
> > 		...
> > 	}
> > 
> > With this code I get output like this:
> > 
> > dmatest: Iteration 1, dma_srcs = 2c963ee8
> > dmatest: Iteration 2, dma_srcs = 2c963ed8
> > dmatest: Iteration 3, dma_srcs = 2c963ec8
> > dmatest: Iteration 4, dma_srcs = 2c963eb8
> > ...
> > dmatest: Iteration 420, dma_srcs = 2c9624b8
> > 
> > ...and then the stack detonates and the kernel crashes with some strange
> > error or other.
> > 
> 
> This could probably become the first kernel user of the flexible array 
> library (see Documentation/flexible-arrays.txt).  Dan?
> --

I think the max that src_cnt can be is 3 and the most dst_cnt can be is 2.  
We could just put that there.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-26  0:46 ` J.A. Magallón
@ 2010-02-26 17:43   ` Samuel Thibault
  2010-02-26 18:52   ` Steven J. Magnani
  1 sibling, 0 replies; 10+ messages in thread
From: Samuel Thibault @ 2010-02-26 17:43 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: LKML

J.A. Magallón, le Fri 26 Feb 2010 01:46:23 +0100, a écrit :
> int main()
> {
>     int c = 2;
>     while (1)
>     {
>         int a[c];
>         int b[c];

c is not variable here.

Samuel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-26  0:46 ` J.A. Magallón
  2010-02-26 17:43   ` Samuel Thibault
@ 2010-02-26 18:52   ` Steven J. Magnani
  1 sibling, 0 replies; 10+ messages in thread
From: Steven J. Magnani @ 2010-02-26 18:52 UTC (permalink / raw)
  To: J.A. Magallón; +Cc: LKML

On Fri, 2010-02-26 at 01:46 +0100, J.A. Magallón wrote:

> Can you try this in userspace ? I compiled in CentOS gcc 4.1.2 (just the
> same), and the addresses are always the same:
> 
> #include <stdio.h>
> 
> int main()
> {
>     int c = 2;
>     while (1)
>     {
>         int a[c];
>         int b[c];
>         a[0] = b[0];
>         printf("%p %p\n",a,b);
>     }
> }

On the Microblaze:

0x2d711f1c 0x2d711f10
0x2d711f04 0x2d711ef8
0x2d711eec 0x2d711ee0
0x2d711ed4 0x2d711ec8

This happens no matter what optimization setting I compile with.

> 
> Could you post the full contents of the while loop ? 

It's the stock drivers/dma/dmatest.c in 2.6.33.

> Which is the size of dma_addr_t ? 
Microblaze is a 32-bit arch; dma_addr_t is 32 bits.

> Does it match the difference of 16 bytes on each iteration ? 
No. It appears that the data going on the stack each iteration are:

  dma_srcs[0]: iteration n
  total_tests: [n-2]
  X
  dst_off:     [iteration n-1]
  dma_srcs[0]: iteration n-1
  total_tests [n-3]
  X

> cnt's are always 1, isn't it ?
For the memcpy test, yes/

> Can you switch the size to a fixed '1' to see if this hangs again ?
It does not.

Regards,
------------------------------------------------------------------------
 Steven J. Magnani               "I claim this network for MARS!
 www.digidescorp.com              Earthling, return my space modulator!"

 #include <standard.disclaimer>




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-26 10:27   ` Dan Carpenter
@ 2010-02-26 19:15     ` Steven J. Magnani
  2010-02-26 19:43       ` Dan Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Steven J. Magnani @ 2010-02-26 19:15 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: David Rientjes, linux-kernel, microblaze-uclinux, dan.j.williams,
	Dave Hansen

On Thu, 25 Feb 2010, Steven J. Magnani wrote:
> > 
> > > When I run a memcpy dmatest with a Microblaze 2.6.33 noMMU kernel, the
> > > system crashes after about 400 iterations. After much head scratching, I
> > > believe I've narrowed the problem to this fragment of code in
> > > drivers/dma/dmatest.c:
> > > 
> > > static int dmatest_func(void *data)
> > > {
> > > ...
> > >     int                     src_cnt;
> > >     int                     dst_cnt;
> > > ...
> > >     if (thread->type == DMA_MEMCPY)
> > >             src_cnt = dst_cnt = 1;
> > > ...
> > >     while (!kthread_should_stop()
> > >            && !(iterations && total_tests >= iterations)) {
> > > ...
> > >             dma_addr_t dma_srcs[src_cnt];
> > >             dma_addr_t dma_dsts[dst_cnt];
> > > ...

On Thu, Feb 25, 2010 at 03:46:48PM -0800, David Rientjes wrote:
> > This could probably become the first kernel user of the flexible array 
> > library (see Documentation/flexible-arrays.txt).  Dan?
> > --

On Fri, 2010-02-26 at 13:27 +0300, Dan Carpenter wrote:
> I think the max that src_cnt can be is 3 and the most dst_cnt can be is 2.  
> We could just put that there.

src_cnt is dependent on module parameters.

The bug goes away if dma_srcs and dma_dsts are declared outside the
loop, but that requires knocking the loop in another tabstop. At that
point dmatest_func() would be begging for refactoring.

Regards,
------------------------------------------------------------------------
 Steven J. Magnani               "I claim this network for MARS!
 www.digidescorp.com              Earthling, return my space modulator!"

 #include <standard.disclaimer>




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Buggy variable-length array code...or compiler?
  2010-02-26 19:15     ` Steven J. Magnani
@ 2010-02-26 19:43       ` Dan Williams
  0 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2010-02-26 19:43 UTC (permalink / raw)
  To: steve
  Cc: Dan Carpenter, David Rientjes, linux-kernel, microblaze-uclinux,
	Dave Hansen

On Fri, Feb 26, 2010 at 12:15 PM, Steven J. Magnani
<steve@digidescorp.com> wrote:
> On Thu, 25 Feb 2010, Steven J. Magnani wrote:
>> >
>> > > When I run a memcpy dmatest with a Microblaze 2.6.33 noMMU kernel, the
>> > > system crashes after about 400 iterations. After much head scratching, I
>> > > believe I've narrowed the problem to this fragment of code in
>> > > drivers/dma/dmatest.c:
>> > >
>> > > static int dmatest_func(void *data)
>> > > {
>> > > ...
>> > >     int                     src_cnt;
>> > >     int                     dst_cnt;
>> > > ...
>> > >     if (thread->type == DMA_MEMCPY)
>> > >             src_cnt = dst_cnt = 1;
>> > > ...
>> > >     while (!kthread_should_stop()
>> > >            && !(iterations && total_tests >= iterations)) {
>> > > ...
>> > >             dma_addr_t dma_srcs[src_cnt];
>> > >             dma_addr_t dma_dsts[dst_cnt];
>> > > ...
>
> On Thu, Feb 25, 2010 at 03:46:48PM -0800, David Rientjes wrote:
>> > This could probably become the first kernel user of the flexible array
>> > library (see Documentation/flexible-arrays.txt).  Dan?
>> > --
>
> On Fri, 2010-02-26 at 13:27 +0300, Dan Carpenter wrote:
>> I think the max that src_cnt can be is 3 and the most dst_cnt can be is 2.
>> We could just put that there.
>
> src_cnt is dependent on module parameters.
>
> The bug goes away if dma_srcs and dma_dsts are declared outside the
> loop, but that requires knocking the loop in another tabstop. At that
> point dmatest_func() would be begging for refactoring.
>

...and even if you refactored it there is no guarantee that gcc would
not re-inline the functions and put you back in the same situation.
Especially given the finding that J.A.'s simple test, where the size
of the array should have been constant, still failed.

--
Dan

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-02-26 19:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-25 23:17 Buggy variable-length array code...or compiler? Steven J. Magnani
2010-02-25 23:23 ` Samuel Thibault
2010-02-25 23:46 ` David Rientjes
2010-02-26 10:27   ` Dan Carpenter
2010-02-26 19:15     ` Steven J. Magnani
2010-02-26 19:43       ` Dan Williams
2010-02-26  0:46 ` J.A. Magallón
2010-02-26 17:43   ` Samuel Thibault
2010-02-26 18:52   ` Steven J. Magnani
2010-02-26 10:11 ` Dan Carpenter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.