* OOB Test fails
@ 2016-10-26 16:07 Danesh Daroui
  2016-10-26 16:16 ` Steve deRosier
  0 siblings, 1 reply; 9+ messages in thread
From: Danesh Daroui @ 2016-10-26 16:07 UTC (permalink / raw)
  To: linux-mtd

Hello all,

I have run the MTD tests against an unmounted MTD device that is used with UBIFS. All MTD tests pass except the OOB test, which fails intermittently, and the failures differ from device to device. I have read here:

http://www.linux-mtd.infradead.org/faq/ubi.html#L_why_no_oob

that UBIFS does not use the OOB area. I therefore wanted to ask whether a failing OOB test does not mean that there is a problem in our system (in either HW or SW/configuration), so that we can safely ignore the results of this test.

Thanks in advance,

Danesh Daroui


* Re: OOB Test fails
  2016-10-26 16:07 OOB Test fails Danesh Daroui
@ 2016-10-26 16:16 ` Steve deRosier
  2016-10-26 16:28   ` Danesh Daroui
  2016-10-27  7:34   ` Boris Brezillon
  0 siblings, 2 replies; 9+ messages in thread
From: Steve deRosier @ 2016-10-26 16:16 UTC (permalink / raw)
  To: Danesh Daroui; +Cc: linux-mtd

Hi Danesh,


On Wed, Oct 26, 2016 at 9:07 AM, Danesh Daroui <Danesh.Daroui@ascom.com> wrote:
> Hello all,
>
> I have executed MTD tests against an unmounted MTD device which uses UBIFS. All MTD tests except OOB test is passed where the OOB test fails from time to time and from device to device. I have read here:
>
> http://www.linux-mtd.infradead.org/faq/ubi.html#L_why_no_oob
>
> that UBIFS does not use OOB area. Therefore wanted to ask if failing this test does not need that there is any problem in our system (both HW and SW/configuration) and we can safely ignore the results for this test.
>

UBIFS doesn't use the OOB area, but your MTD driver most likely does.
We don't want UBI or UBIFS messing with the OOB area because your
flash, NAND controller and the MTD driver will use that. Bad block
markers are set in the OOB area by the manufacturer in most NAND
flashes and your controller and MTD driver will store and use the bits
in the OOB to do error correction.
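
To make the bad block marker point concrete, here is a minimal sketch in Python. It assumes the common convention that a good block has 0xFF in the marker byte of its OOB area and that any other value marks the block bad. The marker offset, the 64-byte OOB size and the function name `is_marked_bad` are illustrative assumptions, not taken from any particular driver; the real position depends on the chip, so check the datasheet.

```python
ERASED_BYTE = 0xFF  # value of an erased (never-programmed) NAND byte


def is_marked_bad(oob: bytes, marker_offset: int = 0) -> bool:
    """Return True if the OOB bytes carry a bad block marker.

    Assumption: the marker is a single byte that reads 0xFF on a good
    block. Real chips may place it elsewhere or use more than one byte.
    """
    if not 0 <= marker_offset < len(oob):
        raise ValueError("marker offset lies outside the OOB area")
    return oob[marker_offset] != ERASED_BYTE


# Hypothetical 64-byte OOB areas, as found with many 2048-byte pages.
good_oob = bytes([0xFF] * 64)
bad_oob = bytes([0x00]) + bytes([0xFF] * 63)

print(is_marked_bad(good_oob))  # False
print(is_marked_bad(bad_oob))   # True
```

If OOB reads or writes are unreliable, a check like this can give wrong answers, which is one reason the OOB area has to work even when the file system itself never touches it.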

So, in short - the OOB area does need to function correctly.  As to if
and why the OOB test is or isn't working is a different issue and I
have no input on that.  All that depends on your mix of flash,
controller and driver.

- Steve


* RE: OOB Test fails
  2016-10-26 16:16 ` Steve deRosier
@ 2016-10-26 16:28   ` Danesh Daroui
  2016-10-27  7:38     ` Boris Brezillon
  2016-10-27  7:34   ` Boris Brezillon
  1 sibling, 1 reply; 9+ messages in thread
From: Danesh Daroui @ 2016-10-26 16:28 UTC (permalink / raw)
  To: Steve deRosier; +Cc: linux-mtd

Hi Steve,

Thank you for your prompt answer. When I run the OOB test (mtd_oobtest), one of the devices, for instance, always returns a verification failed error at a certain address. This is all we know, and it is all that the test reports. We use quite an old kernel, 2.6.39, and the outdated kernel is one of the things we suspect as a source of the problem. We are also considering a hardware failure, since some devices show no errors in the OOB test while others show more errors, and the failing address sometimes changes randomly.

Our main problem is that UBIFS sometimes forces the device into read-only mode due to a "bad CRC" error at startup, when the device is booted. I am now running the file system tests that are included in "mtd_utils". I have started with two tests, "simple/test_1" and "simple/test_2", which simply write until the drive is full and then read the data back and verify its correctness. During the test, I see lots of:

UBI: scrubbed PEB 585 (LEB 3:770), data moved to PEB 1772
UBI: scrubbed PEB 1045 (LEB 3:1261), data moved to PEB 828
UBI: scrubbed PEB 1493 (LEB 3:664), data moved to PEB 814
UBI: scrubbed PEB 751 (LEB 3:1260), data moved to PEB 1772

In my mind, this points to problematic hardware: data is corrupted in so many cells that UBIFS keeps moving the data whenever corruption is detected. My question is whether this guess is plausible, or whether this is mostly due to the old kernel we are using, so that upgrading to a new kernel would most likely solve the problems?
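
One way to see whether the scrubbing concentrates on a few blocks or is spread over the whole device is to tally the log lines. The following is a rough sketch in Python that parses messages of the form quoted above. The regular expression is written against those four sample lines only, the reading of "3:770" as volume:LEB is an assumption, and the names `parse_scrub_events` and `count_by` are made up for this example.

```python
import re
from collections import Counter

# Pattern written against the sample lines above; other kernel versions
# may word the message differently.
SCRUB_RE = re.compile(
    r"UBI: scrubbed PEB (\d+) \(LEB (\d+):(\d+)\), data moved to PEB (\d+)"
)


def parse_scrub_events(log_text):
    """Return one dict per scrub message found in the log text."""
    events = []
    for line in log_text.splitlines():
        match = SCRUB_RE.search(line)
        if match:
            src, vol, leb, dst = (int(g) for g in match.groups())
            events.append(
                {"src_peb": src, "vol": vol, "leb": leb, "dst_peb": dst}
            )
    return events


def count_by(events, key):
    """Count how often each value of `key` occurs across the events."""
    return Counter(event[key] for event in events)


SAMPLE_LOG = """\
UBI: scrubbed PEB 585 (LEB 3:770), data moved to PEB 1772
UBI: scrubbed PEB 1045 (LEB 3:1261), data moved to PEB 828
UBI: scrubbed PEB 1493 (LEB 3:664), data moved to PEB 814
UBI: scrubbed PEB 751 (LEB 3:1260), data moved to PEB 1772
"""

events = parse_scrub_events(SAMPLE_LOG)
print(count_by(events, "src_peb").most_common(5))
print(count_by(events, "dst_peb").most_common(5))
```

Fed with a full dmesg capture, the per-PEB counts show whether the same physical blocks are scrubbed again and again or whether the events are scattered across the device. The counts alone do not say whether the flash, the driver or the UBI layer is at fault.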

Thanks again,

Danesh Daroui



* Re: OOB Test fails
  2016-10-26 16:16 ` Steve deRosier
  2016-10-26 16:28   ` Danesh Daroui
@ 2016-10-27  7:34   ` Boris Brezillon
  2016-10-27 15:45     ` Boris Brezillon
  1 sibling, 1 reply; 9+ messages in thread
From: Boris Brezillon @ 2016-10-27  7:34 UTC (permalink / raw)
  To: Steve deRosier; +Cc: Danesh Daroui, linux-mtd

Hi Steve,

On Wed, 26 Oct 2016 09:16:42 -0700
Steve deRosier <derosier@gmail.com> wrote:

> Hi Danesh,
> 
> 
> On Wed, Oct 26, 2016 at 9:07 AM, Danesh Daroui <Danesh.Daroui@ascom.com> wrote:
> > Hello all,
> >
> > I have executed MTD tests against an unmounted MTD device which uses UBIFS. All MTD tests except OOB test is passed where the OOB test fails from time to time and from device to device. I have read here:
> >
> > http://www.linux-mtd.infradead.org/faq/ubi.html#L_why_no_oob
> >
> > that UBIFS does not use OOB area. Therefore wanted to ask if failing this test does not need that there is any problem in our system (both HW and SW/configuration) and we can safely ignore the results for this test.
> >  
> 
> UBIFS doesn't use the OOB area, but your MTD driver most likely does.
> We don't want UBI or UBIFS messing with the OOB area because your
> flash, NAND controller and the MTD driver will use that. Bad block
> markers are set in the OOB area by the manufacturer in most NAND
> flashes and your controller and MTD driver will store and use the bits
> in the OOB to do error correction.
> 
> So, in short - the OOB area does need to function correctly.  As to if
> and why the OOB test is or isn't working is a different issue and I
> have no input on that.  All that depends on your mix of flash,
> controller and driver.

UBI/UBIFS are not using the OOB area, but I don't think it's a good
idea to encourage people to partially implement the NAND interface.
What if another layer needs to use the OOB area (JFFS2 does)?

Having your NAND controller driver pass all the MTD tests is a good
practice, let's keep it this way, even if some features are not
strictly required for the UBI/UBIFS use case.

Thanks,

Boris


* Re: OOB Test fails
  2016-10-26 16:28   ` Danesh Daroui
@ 2016-10-27  7:38     ` Boris Brezillon
  2016-10-27 10:51       ` Danesh Daroui
  0 siblings, 1 reply; 9+ messages in thread
From: Boris Brezillon @ 2016-10-27  7:38 UTC (permalink / raw)
  To: Danesh Daroui; +Cc: Steve deRosier, linux-mtd

Hi Danesh,

On Wed, 26 Oct 2016 16:28:43 +0000
Danesh Daroui <Danesh.Daroui@ascom.com> wrote:

> Hi Steve,
> 
> Thank you for your prompt answer. When I run OOB test (mtd_oobtest), for instance, one of devices always return verification failed error on a certain address. This is all we know and all the test reports. We use a quite old kernel i.e. 2.6.39 and this is one of the things that we suspect as a source of the problem that the kernel is outdated. Also, we consider the hardware failure since on some devices no error is shown on OOB test while on others more errors are shown and the address is changed randomly sometimes.

Yes, please, try with a newer kernel: I won't help debugging such an
old thing.

> 
> Our main problem is that sometimes UBIFS forces the device into read-only mode due to "bad CRC" error at startup when the device is booted. I am now running tests which are in "mtd_utils" for testing file system. I have started running two tests which are "simple/test_1" and "simple/test_2" which simply write until the drive is full and the read the data back and verify the correctness. During the test, I see lots of:
> 
> UBI: scrubbed PEB 585 (LEB 3:770), data moved to PEB 1772
> UBI: scrubbed PEB 1045 (LEB 3:1261), data moved to PEB 828
> UBI: scrubbed PEB 1493 (LEB 3:664), data moved to PEB 814
> UBI: scrubbed PEB 751 (LEB 3:1260), data moved to PEB 1772
> 
> In my mind, this is related to problematic hardware that the data is corrupted on many cells that UBIFS tries to move the data when a corruption is detected. My question is, whether this guess can be valid or this is mostly due to old kernel that we are using and upgrading to a new kernel would most likely solve the problems?

Well, I can't tell. It can be caused by a buggy NAND controller driver,
a bug in the UBI layer or maybe your NAND is simply worn.

Try with a newer kernel, and let's see what the MTD tests and MTD utils
tests say.

BTW, which NAND and NAND controller are you testing on?

Regards,

Boris


* RE: OOB Test fails
  2016-10-27  7:38     ` Boris Brezillon
@ 2016-10-27 10:51       ` Danesh Daroui
  0 siblings, 0 replies; 9+ messages in thread
From: Danesh Daroui @ 2016-10-27 10:51 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: Steve deRosier, linux-mtd

Hi Boris,

Thanks for your help. We would really like to upgrade the kernel, and that is of course a wise approach, but we would first like to know whether the problem is due to the outdated kernel or to the hardware, since upgrading the kernel is a time-consuming and cumbersome task, although definitely necessary, as you mentioned. Right now I am trying to run the UBIFS tests that are included in "mtd_utils". I hope these tests will give me some hints as to whether there is a problem in the UBI/UBIFS layers. I had previously written my own stress test, which exercises the flash at the POSIX level (more or less the same level as the UBI/UBIFS layers), and I experienced some crashes but could not identify the cause. For instance, I could not find out whether the crashes are due to a bug in the driver, the file system, etc.

The flash memory we are using is a Micron NAND 1GiB 3.3V 8-bit device, and the driver is the one delivered with kernel 2.6.39. Have you heard of a similar problem before? Or would you like more information about the hardware and the system under test?

Thanks again for your help,

Danesh Daroui



* Re: OOB Test fails
  2016-10-27  7:34   ` Boris Brezillon
@ 2016-10-27 15:45     ` Boris Brezillon
  2016-10-28 10:23       ` Danesh Daroui
  0 siblings, 1 reply; 9+ messages in thread
From: Boris Brezillon @ 2016-10-27 15:45 UTC (permalink / raw)
  To: Steve deRosier; +Cc: Danesh Daroui, linux-mtd

On Thu, 27 Oct 2016 09:34:07 +0200
Boris Brezillon <boris.brezillon@free-electrons.com> wrote:

> Hi Steve,
> 
> On Wed, 26 Oct 2016 09:16:42 -0700
> Steve deRosier <derosier@gmail.com> wrote:
> 
> > Hi Danesh,
> > 
> > 
> > On Wed, Oct 26, 2016 at 9:07 AM, Danesh Daroui <Danesh.Daroui@ascom.com> wrote:  
> > > Hello all,
> > >
> > > I have executed MTD tests against an unmounted MTD device which uses UBIFS. All MTD tests except OOB test is passed where the OOB test fails from time to time and from device to device. I have read here:
> > >
> > > http://www.linux-mtd.infradead.org/faq/ubi.html#L_why_no_oob
> > >
> > > that UBIFS does not use OOB area. Therefore wanted to ask if failing this test does not need that there is any problem in our system (both HW and SW/configuration) and we can safely ignore the results for this test.
> > >    
> > 
> > UBIFS doesn't use the OOB area, but your MTD driver most likely does.
> > We don't want UBI or UBIFS messing with the OOB area because your
> > flash, NAND controller and the MTD driver will use that. Bad block
> > markers are set in the OOB area by the manufacturer in most NAND
> > flashes and your controller and MTD driver will store and use the bits
> > in the OOB to do error correction.
> > 
> > So, in short - the OOB area does need to function correctly.  As to if
> > and why the OOB test is or isn't working is a different issue and I
> > have no input on that.  All that depends on your mix of flash,
> > controller and driver.  

My bad, I misread your sentence. I thought you were saying "the OOB area
does *not* need to function correctly".

My apologies for this misunderstanding, it seems we're on the same page.


* RE: OOB Test fails
  2016-10-27 15:45     ` Boris Brezillon
@ 2016-10-28 10:23       ` Danesh Daroui
  2016-10-28 14:40         ` Ricard Wanderlof
  0 siblings, 1 reply; 9+ messages in thread
From: Danesh Daroui @ 2016-10-28 10:23 UTC (permalink / raw)
  To: Boris Brezillon, Steve deRosier; +Cc: linux-mtd

Hi Boris,

I have one question and would appreciate your help with it. As I wrote, we are using a Micron NAND 1GiB 3.3V 8-bit flash memory with UBIFS, and we are experiencing some problems. I am wondering whether vendors provide specific drivers for such parts, or whether the generic drivers shipped with the kernel for the given hardware architecture can be used, as we probably did in this case. I have downloaded a driver package from Micron that is apparently written for our NAND flash memory, but it relies on many platform-dependent read/write functions that the package does not implement, which makes sense.

Could you please advise whether we are on the right track in using the driver shipped with the kernel, or whether we need to find a more specific driver, in which case the problems we face now could be related to an incorrect driver?

Regards,

Danesh Daroui



* RE: OOB Test fails
  2016-10-28 10:23       ` Danesh Daroui
@ 2016-10-28 14:40         ` Ricard Wanderlof
  0 siblings, 0 replies; 9+ messages in thread
From: Ricard Wanderlof @ 2016-10-28 14:40 UTC (permalink / raw)
  To: Danesh Daroui; +Cc: Boris Brezillon, Steve deRosier, linux-mtd


On Fri, 28 Oct 2016, Danesh Daroui wrote:

> I have one question and would appreciate if you can assist me with that. 
> As I wrote we are using a Micron NAND 1GiB 3,3V 8-bit flash memory with 
> UBIFS where we experience some problems.

For what it's worth, we've been using precisely this flash in many of our 
products for quite a while without any special drivers and with no OOB 
problems.

/Ricard
-- 
Ricard Wolf Wanderlöf                           ricardw(at)axis.com
Axis Communications AB, Lund, Sweden            www.axis.com
Phone +46 46 272 2016                           Fax +46 46 13 61 30
