* Issue while oops and panic message logging to MTD partition
@ 2018-05-15 10:49 Jagdish Gediya
  2018-05-15 12:07 ` Boris Brezillon
  0 siblings, 1 reply; 4+ messages in thread
From: Jagdish Gediya @ 2018-05-15 10:49 UTC (permalink / raw)
  To: linux-mtd; +Cc: Prabhakar Kushwaha

Hi,

Setup details:
Board - Freescale ls1046ardb(ARM64)
MTD device - nand(IFC)

CONFIG_MTD_OOPS is enabled to collect oops and panic logs. 
Added bootargs to collect logs : mtdoops.mtddev=3 mtdoops.record_size=16384

Issue:
The kernel hangs during oops log collection in the function "fsl_ifc_run_command".
The exact location where it hangs is shown below:

/*
 * execute IFC NAND command and wait for it to complete
 */
static void fsl_ifc_run_command(struct mtd_info *mtd)
{
	/* ... */

	/* wait for command complete flag or timeout */
	wait_event_timeout(ctrl->nand_wait, ctrl->nand_stat,
			   msecs_to_jiffies(IFC_TIMEOUT_MSECS));

	/* ... */
}

The call to "wait_event_timeout" is the exact culprit where the kernel hangs. Since panic(...) disables local interrupts by calling local_irq_disable(),
this behavior looks expected: timer interrupts are disabled, so "wait_event_timeout" never times out and hangs forever.

The odd part is that "wait_event_timeout" sometimes does not hang. One possible reason is that, on a multicore processor, some other core receives the
timer interrupt and, as a result, "wait_event_timeout" gets unblocked.
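
To make the reasoning above concrete, here is a minimal user-space model of it. This is not kernel code: struct sim_cpu, sim_step() and sim_wait() are hypothetical names, and the model assumes that both wake-up sources (the jiffies-based timeout and the controller's completion flag) depend on interrupts being delivered to the waiting CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct sim_cpu {
	bool irqs_enabled;      /* false after a simulated local_irq_disable() */
	unsigned long jiffies;  /* advanced only by the simulated timer interrupt */
	bool cmd_done;          /* set only by the simulated controller interrupt */
};

/* One unit of time passes; interrupts are handled only if they are enabled. */
static void sim_step(struct sim_cpu *cpu, bool hw_finished)
{
	if (!cpu->irqs_enabled)
		return;               /* interrupts stay pending, nothing changes */
	cpu->jiffies++;               /* simulated timer interrupt handler */
	if (hw_finished)
		cpu->cmd_done = true; /* simulated controller interrupt handler */
}

/*
 * Models a waiter that is released only when cmd_done is set or the
 * jiffies-based timeout expires. max_steps bounds the simulation so the
 * model itself always terminates.
 * Returns true if the waiter was released, false if it is still blocked.
 */
static bool sim_wait(struct sim_cpu *cpu, unsigned long timeout,
		     unsigned long hw_latency, unsigned long max_steps)
{
	unsigned long deadline;
	unsigned long step;

	assert(cpu);
	deadline = cpu->jiffies + timeout;

	for (step = 1; step <= max_steps; step++) {
		sim_step(cpu, step >= hw_latency);
		if (cpu->cmd_done || cpu->jiffies >= deadline)
			return true;
	}
	return false;
}
```

With irqs_enabled set, sim_wait() returns as soon as the command completes or the timeout expires. With irqs_enabled cleared, neither condition can ever become true in this model, which matches the hang described above.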

How do other drivers handle timer-related work, if any, in the panic path, or in general when local interrupts are disabled?

Thanks,
Jagdish


* Re: Issue while oops and panic message logging to MTD partition
  2018-05-15 10:49 Issue while oops and panic message logging to MTD partition Jagdish Gediya
@ 2018-05-15 12:07 ` Boris Brezillon
  2018-05-24  7:22   ` Jagdish Gediya
  0 siblings, 1 reply; 4+ messages in thread
From: Boris Brezillon @ 2018-05-15 12:07 UTC (permalink / raw)
  To: Jagdish Gediya; +Cc: linux-mtd, Prabhakar Kushwaha

Hi,

On Tue, 15 May 2018 10:49:15 +0000
Jagdish Gediya <jagdish.gediya@nxp.com> wrote:

> Hi,
> 
> Setup details:
> Board - Freescale ls1046ardb(ARM64)
> MTD device - nand(IFC)
> 
> CONFIG_MTD_OOPS is enabled to collect oops and panic logs. 
> Added bootargs to collect logs : mtdoops.mtddev=3 mtdoops.record_size=16384
> 
> Issue:
> The kernel hangs during oops log collection in the function "fsl_ifc_run_command".
> The exact location where it hangs is shown below:
> 
> /*
>  * execute IFC NAND command and wait for it to complete
>  */
> static void fsl_ifc_run_command(struct mtd_info *mtd)
> {
> 	/* ... */
> 
> 	/* wait for command complete flag or timeout */
> 	wait_event_timeout(ctrl->nand_wait, ctrl->nand_stat,
> 			   msecs_to_jiffies(IFC_TIMEOUT_MSECS));
> 
> 	/* ... */
> }
> 
> The call to "wait_event_timeout" is the exact culprit where the kernel hangs. Since panic(...) disables local interrupts by calling local_irq_disable(),
> this behavior looks expected: timer interrupts are disabled, so "wait_event_timeout" never times out and hangs forever.
> 
> The odd part is that "wait_event_timeout" sometimes does not hang. One possible reason is that, on a multicore processor, some other core receives the
> timer interrupt and, as a result, "wait_event_timeout" gets unblocked.
> 
> How do other drivers handle timer-related work, if any, in the panic path, or in general when local interrupts are disabled?

MTD_OOPS is just a mess, and I'm sure most drivers simply don't support
it properly. If you still want to use the feature, you'll probably have
to fall back to status polling instead of using wait_event_timeout().
See what the core does here [1].
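
As a rough illustration of the status-polling idea, here is a user-space sketch. It is not the code behind [1] and not the IFC driver: struct fake_nand, fake_read_status(), fake_delay_1ms() and poll_until_ready() are hypothetical stand-ins, and the delay helper only counts calls where a driver would busy-wait. The point it shows is that the loop is bounded by an iteration count and needs neither the timer interrupt nor a wake-up from an interrupt handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ready bit of the status byte; 0x40 is the usual NAND "ready" bit. */
#define FAKE_STATUS_READY 0x40

/* Hypothetical device model; not a kernel structure. */
struct fake_nand {
	unsigned int busy_polls; /* status reads left until the device is ready */
	unsigned int delays;     /* number of simulated 1 ms busy-wait delays */
};

/* Stand-in for reading the status register directly from the device. */
static uint8_t fake_read_status(struct fake_nand *chip)
{
	if (chip->busy_polls > 0) {
		chip->busy_polls--;
		return 0;
	}
	return FAKE_STATUS_READY;
}

/* Stand-in for a busy-wait delay that does not rely on interrupts. */
static void fake_delay_1ms(struct fake_nand *chip)
{
	chip->delays++;
}

/*
 * Poll the status register at most timeout_ms times.
 * Returns true once the device reports ready, false on timeout.
 */
static bool poll_until_ready(struct fake_nand *chip, unsigned int timeout_ms)
{
	unsigned int i;

	assert(chip);
	for (i = 0; i < timeout_ms; i++) {
		if (fake_read_status(chip) & FAKE_STATUS_READY)
			return true;
		fake_delay_1ms(chip);
	}
	return false;
}
```

In a driver, the equivalent loop would only be taken on the panic path, so the normal path could keep sleeping on the wait queue.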

Still, I'd recommend not using MTD_OOPS if possible, because I fear
that's not the only problem you'll face. One problem I see is that the
locking is completely bypassed when ->panic_write() is called, so your
->cmdfunc() might be called to get the NAND status while another
operation (PROGRAM, ERASE, READ...) is still in progress.
Looking at the ifc code, it seems the driver is not ready to cope with
that.
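
To show what "another operation is still in progress" can mean in practice, here is a small user-space model. Everything in it is hypothetical (struct sim_ctrl, sim_issue_cmd(), sim_poll_idle(), sim_panic_issue()), and draining the in-flight operation with bounded polling is only one possible mitigation, assumed here for illustration; it is not a description of what the IFC driver does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller model; not a kernel structure. */
struct sim_ctrl {
	unsigned int busy_polls;  /* polls left until the in-flight op finishes */
	unsigned int cmds_issued; /* commands sent to the controller */
	bool corrupted;           /* a command overlapped an in-flight operation */
};

/* Issue a command without any locking, as the panic path would. */
static void sim_issue_cmd(struct sim_ctrl *c)
{
	if (c->busy_polls > 0)
		c->corrupted = true; /* clobbered the operation in progress */
	c->cmds_issued++;
}

/* Let one unit of time pass; returns true once the controller is idle. */
static bool sim_poll_idle(struct sim_ctrl *c)
{
	if (c->busy_polls > 0)
		c->busy_polls--;
	return c->busy_polls == 0;
}

/*
 * Lock-free panic path: wait, with a bounded number of polls, for the
 * in-flight operation to finish, then issue the command.
 * Returns false (and issues nothing) if the controller never went idle.
 */
static bool sim_panic_issue(struct sim_ctrl *c, unsigned int max_polls)
{
	unsigned int i;

	assert(c);
	for (i = 0; i < max_polls; i++) {
		if (sim_poll_idle(c)) {
			sim_issue_cmd(c);
			return true;
		}
	}
	return false;
}
```

Calling sim_issue_cmd() directly on a busy controller sets corrupted, while sim_panic_issue() either issues the command on an idle controller or gives up without touching it.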

Regards,

Boris

[1] https://elixir.bootlin.com/linux/v4.17-rc5/source/drivers/mtd/nand/raw/nand_base.c#L648


* RE: Issue while oops and panic message logging to MTD partition
  2018-05-15 12:07 ` Boris Brezillon
@ 2018-05-24  7:22   ` Jagdish Gediya
  2018-05-24  9:59     ` Boris Brezillon
  0 siblings, 1 reply; 4+ messages in thread
From: Jagdish Gediya @ 2018-05-24  7:22 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd, Prabhakar Kushwaha

Hi Boris,

I also tried this feature on NOR flash; however, it does not work. Nothing is written to the MTD partition specified through mtdoops.mtddev. Is this the expected behavior?

Has anyone tested the mtdoops feature on NOR flash?

Thanks,
Jagdish



* Re: Issue while oops and panic message logging to MTD partition
  2018-05-24  7:22   ` Jagdish Gediya
@ 2018-05-24  9:59     ` Boris Brezillon
  0 siblings, 0 replies; 4+ messages in thread
From: Boris Brezillon @ 2018-05-24  9:59 UTC (permalink / raw)
  To: Jagdish Gediya; +Cc: Prabhakar Kushwaha, linux-mtd

On Thu, 24 May 2018 07:22:35 +0000
Jagdish Gediya <jagdish.gediya@nxp.com> wrote:

> Hi Boris,
> 
> I also tried this feature on NOR flash; however, it does not work.
> Nothing is written to the MTD partition specified through
> mtdoops.mtddev. Is this the expected behavior?

It's probably not supported by the MTD driver. You can check which
drivers support panic writes with 'git grep "\->_panic_write"'.
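
The sketch below models why nothing ends up on flash when a driver does not provide the hook. It is a user-space illustration with hypothetical names (struct sim_mtd, sim_nand_panic_write(), sim_mtd_panic_write()); it assumes the core rejects a panic write with -EOPNOTSUPP when the driver leaves the hook unset, which should be checked against the kernel version in use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an MTD device with an optional panic-write hook. */
struct sim_mtd {
	int (*panic_write)(struct sim_mtd *mtd, const char *buf,
			   size_t len, size_t *retlen);
	size_t written; /* bytes that reached the simulated flash */
};

/* Example hook, as a driver with panic-write support might provide. */
static int sim_nand_panic_write(struct sim_mtd *mtd, const char *buf,
				size_t len, size_t *retlen)
{
	(void)buf; /* the model only counts bytes */
	mtd->written += len;
	*retlen = len;
	return 0;
}

/*
 * Core-side wrapper: refuse the write when the driver has no hook
 * (assumed behavior), otherwise forward to the driver.
 */
static int sim_mtd_panic_write(struct sim_mtd *mtd, const char *buf,
			       size_t len, size_t *retlen)
{
	assert(mtd && retlen);
	*retlen = 0;
	if (!mtd->panic_write)
		return -EOPNOTSUPP;
	return mtd->panic_write(mtd, buf, len, retlen);
}
```

A device whose panic_write pointer is NULL gets an error and zero bytes written, which would look exactly like an empty mtdoops partition after a panic.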

> 
> Has anyone tested the mtdoops feature on NOR flash?

I don't know whether anyone has tried that; I certainly have not.
