* qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Guenter Roeck @ 2016-04-23 17:53 UTC
  To: linux-next
  Cc: Boris Brezillon, Stephen Rothwell, linux-kernel, Roger Quadros,
	Brian Norris, Boris Brezillon, Tony Lindgren, linux-mtd,
	linux-omap

Hi,

since next-20160421, I get the following error and hang when trying to boot
an omap2plus_defconfig image with qemu, machine 'beagle' and omap3-beagle.dtb.
multi_v7_defconfig still works, as does machine 'beaglexm' with omap3-beagle-xm.dtb
and omap2plus_defconfig. This is with Linaro's version of qemu.

nand: timeout while waiting for chip to become ready

The message repeats until the test times out.

Bisect points to "Merge remote-tracking branch 'nand/nand/next'" as the offending
commit. However, the nand/nand/next branch itself is fine, as is the merge just
prior to the nand/nand/next merge ("Merge remote-tracking branch 'l2-mtd/master'").

After some digging, I found that reverting commit "mtd: nand: omap2: Implement
NAND ready using gpiolib" fixes the problem. What I don't know, though, is why
the problem is only seen with omap2plus_defconfig, but not with multi_v7_defconfig,
and why it is only seen with beagle/omap3-beagle.dtb but not with
beaglexm/omap3-beagle-xm.dtb.

The 'rb-gpios' property is only defined in omap3-beagle.dts, but not in
omap3-beagle-xm.dts, which may be part of the explanation. That still doesn't
explain, though, why multi_v7_defconfig still works, but not omap2plus_defconfig.

Any ideas, anyone ?

Thanks,
Guenter

* Re: qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Boris Brezillon @ 2016-04-23 19:46 UTC
  To: Guenter Roeck
  Cc: linux-next, Stephen Rothwell, linux-kernel, Roger Quadros,
	Brian Norris, Tony Lindgren, linux-mtd, linux-omap

Hi Guenter,

On Sat, 23 Apr 2016 10:53:06 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> Hi,
> 
> since next-20160421, I get the following error and hang when trying to boot
> an omap2plus_defconfig image with qemu, machine 'beagle' and omap3-beagle.dtb.
> multi_v7_defconfig still works, as does machine 'beaglexm' with omap3-beagle-xm.dtb
> and omap2plus_defconfig. This is with Linaro's version of qemu.
> 
> nand: timeout while waiting for chip to become ready
> 
> The message repeats until the test times out.
> 
> Bisect points to "Merge remote-tracking branch 'nand/nand/next'" as the offending
> commit. However, the nand/nand/next branch itself is fine, as is the merge just
> prior to the nand/nand/next merge ("Merge remote-tracking branch 'l2-mtd/master'").
> 
> After some digging, I found that reverting commit "mtd: nand: omap2: Implement
> NAND ready using gpiolib" fixes the problem. What I don't know, though, is why
> the problem is only seen with omap2plus_defconfig, but not with multi_v7_defconfig,
> and why it is only seen with beagle/omap3-beagle.dtb but not with
> beaglexm/omap3-beagle-xm.dtb.
> 
> The 'rb-gpios' property is only defined in omap3-beagle.dts, but not in
> omap3-beagle-xm.dts, which may be part of the explanation. That still doesn't
> explain, though, why multi_v7_defconfig still works, but not omap2plus_defconfig.
> 
> Any ideas, anyone ?

I think you got it right for the DT changes: if rb-gpios is not
defined, it works because the implementation falls back to "status
polling" mode, which does not rely on the new GPIO controller
implementation.
I don't know why it works with multi_v7_defconfig and not with
omap2plus_defconfig, though (maybe a different probe order makes
devm_gpiod_get_optional() return NULL instead of -EPROBE_DEFER?).

And the other question I have for Roger is, do you see a reason why the
rb-gpio mode would not work?

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* Re: qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Guenter Roeck @ 2016-04-24 16:42 UTC
  To: Boris Brezillon
  Cc: linux-next, Stephen Rothwell, linux-kernel, Roger Quadros,
	Brian Norris, Tony Lindgren, linux-mtd, linux-omap

On 04/23/2016 12:46 PM, Boris Brezillon wrote:
> Hi Guenter,
>
> On Sat, 23 Apr 2016 10:53:06 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
>
>> Hi,
>>
>> since next-20160421, I get the following error and hang when trying to boot
>> an omap2plus_defconfig image with qemu, machine 'beagle' and omap3-beagle.dtb.
>> multi_v7_defconfig still works, as does machine 'beaglexm' with omap3-beagle-xm.dtb
>> and omap2plus_defconfig. This is with Linaro's version of qemu.
>>
>> nand: timeout while waiting for chip to become ready
>>
>> The message repeats until the test times out.
>>
>> Bisect points to "Merge remote-tracking branch 'nand/nand/next'" as the offending
>> commit. However, the nand/nand/next branch itself is fine, as is the merge just
>> prior to the nand/nand/next merge ("Merge remote-tracking branch 'l2-mtd/master'").
>>
>> After some digging, I found that reverting commit "mtd: nand: omap2: Implement
>> NAND ready using gpiolib" fixes the problem. What I don't know, though, is why
>> the problem is only seen with omap2plus_defconfig, but not with multi_v7_defconfig,
>> and why it is only seen with beagle/omap3-beagle.dtb but not with
>> beaglexm/omap3-beagle-xm.dtb.
>>
>> The 'rb-gpios' property is only defined in omap3-beagle.dts, but not in
>> omap3-beagle-xm.dts, which may be part of the explanation. That still doesn't
>> explain, though, why multi_v7_defconfig still works, but not omap2plus_defconfig.
>>
>> Any ideas, anyone ?
>
> I think you got it right for the DT changes: if rb-gpios is not
> defined, it works because the implementation falls back to "status
> polling" mode, which does not rely on the new GPIO controller
> implementation.
> I don't know why it works with multi_v7_defconfig and not with
> omap2plus_defconfig, though (maybe a different probe order makes
> devm_gpiod_get_optional() return NULL instead of -EPROBE_DEFER?).
>
> And the other question I have for Roger is, do you see a reason why the
> rb-gpio mode would not work?
>

Hi Boris,

Turns out MTD_NAND_OMAP2 is not enabled on multi_v7_defconfig, thus the issue
does not arise there. After reverting 'mtd: nand: omap2: Implement NAND ready
using gpiolib', the driver uses omap_wait(), which as far as I can see is never
called in my tests. Since dev_ready is NULL in that case, it is never called
either (the chip is just assumed to be always ready), and thus the problem
does not arise.

So the big difference is that the dev_ready callback was not used prior to
commit 'mtd: nand: omap2: Implement NAND ready using gpiolib', and that
it is logically different from the wait function which was previously used.

In qemu, it looks like gpmc bit 0 is considered to be the NAND chip select,
which is distinctly different to a chip ready pin. Guess I would have to try
finding a chip datasheet to figure out what this pin is supposed to do, and
what is wrong. Since it is somewhat unlikely that I'll find the time to do that,
I just disabled MTD_NAND_OMAP2 in my qemu tests instead. Not an ideal solution,
of course, but the alternative would be to drop the beagle qemu tests entirely.

Guenter

* Re: qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Boris Brezillon @ 2016-04-24 17:14 UTC
  To: Guenter Roeck
  Cc: linux-next, Stephen Rothwell, linux-kernel, Roger Quadros,
	Brian Norris, Tony Lindgren, linux-mtd, linux-omap

On Sun, 24 Apr 2016 09:42:40 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> On 04/23/2016 12:46 PM, Boris Brezillon wrote:
> > Hi Guenter,
> >
> > On Sat, 23 Apr 2016 10:53:06 -0700
> > Guenter Roeck <linux@roeck-us.net> wrote:
> >  
> >> Hi,
> >>
> >> since next-20160421, I get the following error and hang when trying to boot
> >> an omap2plus_defconfig image with qemu, machine 'beagle' and omap3-beagle.dtb.
> >> multi_v7_defconfig still works, as does machine 'beaglexm' with omap3-beagle-xm.dtb
> >> and omap2plus_defconfig. This is with Linaro's version of qemu.
> >>
> >> nand: timeout while waiting for chip to become ready
> >>
> >> The message repeats until the test times out.
> >>
> >> Bisect points to "Merge remote-tracking branch 'nand/nand/next'" as the offending
> >> commit. However, the nand/nand/next branch itself is fine, as is the merge just
> >> prior to the nand/nand/next merge ("Merge remote-tracking branch 'l2-mtd/master'").
> >>
> >> After some digging, I found that reverting commit "mtd: nand: omap2: Implement
> >> NAND ready using gpiolib" fixes the problem. What I don't know, though, is why
> >> the problem is only seen with omap2plus_defconfig, but not with multi_v7_defconfig,
> >> and why it is only seen with beagle/omap3-beagle.dtb but not with
> >> beaglexm/omap3-beagle-xm.dtb.
> >>
> >> The 'rb-gpios' property is only defined in omap3-beagle.dts, but not in
> >> omap3-beagle-xm.dts, which may be part of the explanation. That still doesn't
> >> explain, though, why multi_v7_defconfig still works, but not omap2plus_defconfig.
> >>
> >> Any ideas, anyone ?  
> >
> > I think you got it right for the DT changes: if rb-gpios is not
> > defined, it works because the implementation falls back to "status
> > polling" mode, which does not rely on the new GPIO controller
> > implementation.
> > I don't know why it works with multi_v7_defconfig and not with
> > omap2plus_defconfig, though (maybe a different probe order makes
> > devm_gpiod_get_optional() return NULL instead of -EPROBE_DEFER?).
> >
> > And the other question I have for Roger is, do you see a reason why the
> > rb-gpio mode would not work?
> >  
> 
> Hi Boris,
> 
> Turns out MTD_NAND_OMAP2 is not enabled on multi_v7_defconfig, thus the issue
> does not arise there.

Okay, this explains why you don't see this problem with multi_v7.

> After reverting 'mtd: nand: omap2: Implement NAND ready
> using gpiolib', the driver uses omap_wait(), which as far as I can see is never
> called in my tests. Since dev_ready is NULL in that case, it is never called
> either (the chip is just assumed to be always ready), and thus the problem
> does not arise.

That's not entirely true: the NAND chip is not assumed to be always
ready, the core just uses a different method to get the R/B status (by
reading the STATUS register using the NAND_CMD_STATUS command). But
you're right in that when you revert this commit you end up not using
the new GPIO controller exposed by the GPMC driver.
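
As a rough illustration of the two readiness checks described above, here
is a minimal Python model. It is a sketch only, not the kernel code: the
FakeChip class and chip_is_ready() are hypothetical names made up for this
example. Only NAND_CMD_STATUS (0x70) and the ready bit of the status byte
(0x40) are standard NAND values.

```python
from typing import Callable, List, Optional

NAND_CMD_STATUS = 0x70    # standard NAND "read status" command
NAND_STATUS_READY = 0x40  # ready bit in the returned status byte


class FakeChip:
    """Hypothetical stand-in for a NAND chip, for illustration only."""

    def __init__(self, status_byte: int,
                 dev_ready: Optional[Callable[[], bool]] = None) -> None:
        self.status_byte = status_byte
        self.dev_ready = dev_ready          # None models "no R/B pin wired up"
        self.commands: List[int] = []       # commands issued to the chip

    def read_status(self) -> int:
        # Issue the STATUS command and return the status byte.
        self.commands.append(NAND_CMD_STATUS)
        return self.status_byte


def chip_is_ready(chip: FakeChip) -> bool:
    """Use the R/B callback when present, else fall back to STATUS polling."""
    if chip.dev_ready is not None:
        return chip.dev_ready()
    return bool(chip.read_status() & NAND_STATUS_READY)
```

With no dev_ready callback the STATUS byte decides. With a callback that
never reports ready, the STATUS byte is never consulted, which would match
the repeated timeout seen here.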

> 
> So the big difference is that the dev_ready callback was not used prior to
> commit 'mtd: nand: omap2: Implement NAND ready using gpiolib', and that
> it is logically different from the wait function which was previously used.

Yes.

> 
> In qemu, it looks like gpmc bit 0 is considered to be the NAND chip select,
> which is distinctly different to a chip ready pin.

Well, if you look at the GPIO controller implementation, you'll see
that gpiochip->get() adds 8 to the GPIO index, so the
implementation is actually testing bit 8 and not bit 0. Maybe this is
not emulated properly in qemu though...

> Guess I would have to try
> finding a chip datasheet to figure out what this pin is supposed to do, and
> what is wrong. Since it is somewhat unlikely that I'll find the time to do that,
> I just disabled MTD_NAND_OMAP2 in my qemu tests instead. Not an ideal solution,
> of course, but the alternative would be to drop the beagle qemu tests entirely.

It has been a long time since I looked at the qemu code, but IIRC there
was no proper support for the NAND layer (maybe this has changed since
then, though). And the R/B pin status emulation is probably much more
complicated to implement than just returning a valid STATUS byte in a
generic NAND chip emulation layer (you have to emulate the GPMC block and
all its external interfaces, like the R/B IOs, as well as the R/B pin at
the NAND chip emulation level)...


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* Re: qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Boris Brezillon @ 2016-04-24 17:34 UTC
  To: Guenter Roeck
  Cc: linux-next, Stephen Rothwell, linux-kernel, Roger Quadros,
	Brian Norris, Tony Lindgren, linux-mtd, linux-omap

On Sun, 24 Apr 2016 09:42:40 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> In qemu, it looks like gpmc bit 0 is considered to be the NAND chip select,
> which is distinctly different to a chip ready pin. Guess I would have to try
> finding a chip datasheet to figure out what this pin is supposed to do, and
> what is wrong. Since it is somewhat unlikely that I'll find the time to do that,
> I just disabled MTD_NAND_OMAP2 in my qemu tests instead. Not an ideal solution,
> of course, but the alternative would be to drop the beagle qemu tests entirely.

Here is a patch [1] which should fix your problem. It's obviously not
enough to handle the different use cases we have in the wild, but it
should fix your problem on the beagle board.

Regards,

Boris

[1] http://code.bulix.org/i5c4yc-97598

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* Re: qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Guenter Roeck @ 2016-04-24 18:10 UTC
  To: Boris Brezillon
  Cc: linux-next, Stephen Rothwell, linux-kernel, Roger Quadros,
	Brian Norris, Tony Lindgren, linux-mtd, linux-omap

Hi Boris,

On 04/24/2016 10:14 AM, Boris Brezillon wrote:
[ ... ]

>>
>> In qemu, it looks like gpmc bit 0 is considered to be the NAND chip select,
>> which is distinctly different to a chip ready pin.
>
> Well, if you look at the GPIO controller implementation, you'll see
> that gpiochip->get() adds 8 to the GPIO index, so the
> implementation is actually testing bit 8 and not bit 0. Maybe this is
> not emulated properly in qemu though...
>
That helps. The QEMU emulation always returns 0x0001 when reading gpmc register
0x54, which suggests that WAIT0STATUS reports as 0.

>> Guess I would have to try
>> finding a chip datasheet to figure out what this pin is supposed to do, and
>> what is wrong. Since it is somewhat unlikely that I'll find the time to do that,
>> I just disabled MTD_NAND_OMAP2 in my qemu tests instead. Not an ideal solution,
>> of course, but the alternative would be to drop the beagle qemu tests entirely.
>
> It has been a long time since I looked at the qemu code, but IIRC there
> was no proper support for the NAND layer (maybe this has changed since
> then, though). And the R/B pin status emulation is probably much more
> complicated to implement than just returning a valid STATUS byte in a
> generic NAND chip emulation layer (you have to emulate the GPMC block and
> all its external interfaces, like the R/B IOs, as well as the R/B pin at
> the NAND chip emulation level)...
>

Well enough for it to at least find the NAND chip.

So the qemu "fix" was to return 0x0101 instead of 0x0001 when reading gpmc
register 0x54.
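
For illustration, the bit arithmetic behind this can be sketched in a few
lines of Python. The constant and function names below are hypothetical.
The offset of 8 and the two register values come from this thread, and the
sketch assumes the wait pin reads as "ready" when its bit is set, which is
consistent with the behaviour observed here.

```python
WAIT_PIN_BIT_BASE = 8  # gpiochip->get() adds 8 to the GPIO index


def wait_pin_ready(status: int, gpio_index: int = 0) -> bool:
    """Return True if the wait-pin bit for gpio_index is set in status."""
    return bool(status & (1 << (WAIT_PIN_BIT_BASE + gpio_index)))


if __name__ == "__main__":
    # Values read from gpmc register 0x54 before and after the qemu change.
    for value in (0x0001, 0x0101):
        print(f"status={value:#06x} ready={wait_pin_ready(value)}")
```

With 0x0001 only bit 0 is set, so the check on bit 8 fails. With 0x0101
bit 8 is set and the chip is reported as ready.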

Now I get "INFO: suspicious RCU usage" on reboot, but that is a separate issue.

Thanks a lot for the hints!

Guenter

* Re: qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Guenter Roeck @ 2016-04-24 18:11 UTC
  To: Boris Brezillon
  Cc: linux-next, Stephen Rothwell, linux-kernel, Roger Quadros,
	Brian Norris, Tony Lindgren, linux-mtd, linux-omap

On 04/24/2016 10:34 AM, Boris Brezillon wrote:
> On Sun, 24 Apr 2016 09:42:40 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
>
>> In qemu, it looks like gpmc bit 0 is considered to be the NAND chip select,
>> which is distinctly different to a chip ready pin. Guess I would have to try
>> finding a chip datasheet to figure out what this pin is supposed to do, and
>> what is wrong. Since it is somewhat unlikely that I'll find the time to do that,
>> I just disabled MTD_NAND_OMAP2 in my qemu tests instead. Not an ideal solution,
>> of course, but the alternative would be to drop the beagle qemu tests entirely.
>
> Here is a patch [1] which should fix your problem. It's obviously not
> enough to handle the different use cases we have in the wild, but it
> should fix your problem on the beagle board.
>
> Regards,
>
> Boris
>
> [1] http://code.bulix.org/i5c4yc-97598
>
Yep, I figured out that one a minute ago :-)

Thanks again!

Guenter

* Re: qemu:beagle no longer booting with omap2plus_defconfig in -next
From: Boris Brezillon @ 2016-04-24 19:28 UTC
  To: Guenter Roeck, Roger Quadros
  Cc: linux-next, Stephen Rothwell, linux-kernel, Brian Norris,
	Tony Lindgren, linux-mtd, linux-omap

On Sat, 23 Apr 2016 21:46:17 +0200
Boris Brezillon <boris.brezillon@free-electrons.com> wrote:
> 
> And the other question I have for Roger is, do you see a reason why the
> rb-gpio mode would not work?
> 

Forget that one, this bug appeared to be caused by partial emulation of
the GPMC block in qemu.

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
