* Correct behaviour of ubiattach / ubiformat in the face of ECC errors?
@ 2014-06-19  3:15 Iwo Mergler
  2014-07-01 14:05 ` Artem Bityutskiy
  0 siblings, 1 reply; 3+ messages in thread
From: Iwo Mergler @ 2014-06-19  3:15 UTC (permalink / raw)
  To: linux-mtd


Hi all,


I'm debugging a NAND driver and I'm unsure about what the correct behaviour
of ubiformat / ubiattach ought to be in the face of uncorrectable ECC errors.

This is a 2.6.35 kernel, with UBI/UBIFS updated to the top of
git://git.infradead.org/~dedekind/ubifs-v2.6.35.git and mtd-utils v1.5.1.

My question is, what is ubiattach / ubiformat supposed to do if a non-correctable 
ECC error occurs in a block during the scan?

My gut feeling says that ubiattach should fail if the PEB was in use, but ubiformat
should mark the PEB as bad and continue.

This is not the behaviour I'm observing. The broken PEB in the examples below
contains an EC header, but no VID header - it's not in use.

To complicate matters, I'm also unsure about the correct return value from the
NAND driver in this situation. Depending on which documentation I read, it should
return -1 (i.e. -EPERM), -EIO or -EBADMSG.

So I tried all of them.
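
For reference, a bare -1 coming back from a driver is numerically the same as
-EPERM, which is why the tools below print "Operation not permitted" for it.
A minimal sketch in Python for decoding such return values (describe() is a
hypothetical helper, not part of mtd-utils; the names and messages it prints
are those of the platform it runs on):

```python
import errno
import os


def describe(ret):
    """Turn a negative kernel-style return value into '-NAME (message)'."""
    code = -ret
    name = errno.errorcode.get(code, "unknown")
    return "-%s (%s)" % (name, os.strerror(code))


# The three return values tried in this message.
for ret in (-1, -5, -74):
    print(ret, describe(ret))
```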

ubiformat fails for -EPERM and -EIO, and enters an infinite read loop for -EBADMSG:

[   14.800000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   14.810000] [T49]bovine_nand_read_page_hwecc:671: Returned -1 [-EPERM], page=0x5ff00
libmtd: error!: cannot read 64 bytes from mtd6 (eraseblock 13796, offset 0)
        error 1 (Operation not permitted)
ubiformat: error!: failed to scan mtd6 (/dev/mtd6)

[   14.840000] [T49]bovine_nand_read_page_hwecc:671: Returned -5 [-EIO], page=0x5ff00
libmtd: error!: cannot read 64 bytes from mtd6 (eraseblock 13796, offset 0)
        error 5 (Input/output error)
ubiformat: error!: failed to scan mtd6 (/dev/mtd6)

[   27.040000] HWECC(1) - MBE detected ([5ff00]-0000)
[   27.040000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   27.050000] [T4a]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   27.060000] HWECC(1) - MBE detected ([5ff00]-0000)
[   27.060000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   27.070000] [T4a]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   27.080000] HWECC(1) - MBE detected ([5ff00]-0000)
[   27.080000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   27.090000] [T4a]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   27.090000] HWECC(1) - MBE detected ([5ff00]-0000)
[   27.100000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   27.100000] [T4a]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   27.110000] HWECC(1) - MBE detected ([5ff00]-0000)
[   27.120000] SWECC(1) - MBE detected ([5ff00]-0000/0)
... Goes on forever.

ubiattach fails in all cases, although the failure differs significantly when
-EBADMSG is returned: a UBI assertion fails as well, and the attach error is
reported as -5 rather than -74:

[   19.910000] HWECC(1) - MBE detected ([5ff00]-0000)
[   19.920000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   19.920000] [T84]bovine_nand_read_page_hwecc:671: Returned -1 [-EPERM], page=0x5ff00
[   19.930000] UBI warning: ubi_io_read: error -1 while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   19.940000] HWECC(1) - MBE detected ([5ff00]-0000)
[   19.950000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   19.950000] [T84]bovine_nand_read_page_hwecc:671: Returned -1 [-EPERM], page=0x5ff00
[   19.960000] UBI warning: ubi_io_read: error -1 while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   19.970000] HWECC(1) - MBE detected ([5ff00]-0000)
[   19.970000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   19.980000] [T84]bovine_nand_read_page_hwecc:671: Returned -1 [-EPERM], page=0x5ff00
[   19.990000] UBI warning: ubi_io_read: error -1 while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   20.000000] HWECC(1) - MBE detected ([5ff00]-0000)
[   20.000000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   20.010000] [T84]bovine_nand_read_page_hwecc:671: Returned -1 [-EPERM], page=0x5ff00
[   20.020000] UBI error: ubi_io_read: error -1 while reading 64 bytes from PEB 13796:0, read 0 bytes
[   20.020000] [<c002a420>] (unwind_backtrace+0x0/0xec) from [<c01e23a0>] (ubi_io_read+0x1dc/0x2a4)
[   20.030000] [<c01e23a0>] (ubi_io_read+0x1dc/0x2a4) from [<c01e2814>] (ubi_io_read_ec_hdr+0x68/0x214)
[   20.040000] [<c01e2814>] (ubi_io_read_ec_hdr+0x68/0x214) from [<c01e6d30>] (scan_peb+0x54/0x5e8)
[   20.050000] [<c01e6d30>] (scan_peb+0x54/0x5e8) from [<c01e73a8>] (scan_all+0xe4/0x2f8)
[   20.060000] [<c01e73a8>] (scan_all+0xe4/0x2f8) from [<c01e75c8>] (ubi_attach+0xc/0xbc)
[   20.070000] [<c01e75c8>] (ubi_attach+0xc/0xbc) from [<c01dcabc>] (ubi_attach_mtd_dev+0x1e4/0x570)
[   20.080000] [<c01dcabc>] (ubi_attach_mtd_dev+0x1e4/0x570) from [<c01dcfb8>] (ctrl_cdev_ioctl+0xd0/0x160)
[   20.090000] [<c01dcfb8>] (ctrl_cdev_ioctl+0xd0/0x160) from [<c0095898>] (vfs_ioctl+0x2c/0x70)
[   20.090000] [<c0095898>] (vfs_ioctl+0x2c/0x70) from [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c)
[   20.100000] [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c) from [<c0096008>] (sys_ioctl+0x38/0x5c)
[   20.110000] [<c0096008>] (sys_ioctl+0x38/0x5c) from [<c0024e80>] (ret_fast_syscall+0x0/0x2c)
[   20.120000] UBI error: ubi_attach_mtd_dev: failed to attach mtd6, error -1
ubiattach: error!: cannot attach mtd6
           error 1 (Operation not permitted)
ubimount: [ubiattach /dev/ubi_ctrl -m 6 -d 11 -O 4096]->255

[   19.940000] HWECC(1) - MBE detected ([5ff00]-0000)
[   19.940000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   19.950000] [T01]bovine_nand_read_page_hwecc:671: Returned -5 [-EIO], page=0x5ff00
[   19.960000] UBI warning: ubi_io_read: error -5 while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   19.970000] HWECC(1) - MBE detected ([5ff00]-0000)
[   19.970000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   19.980000] [T01]bovine_nand_read_page_hwecc:671: Returned -5 [-EIO], page=0x5ff00
[   19.980000] UBI warning: ubi_io_read: error -5 while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   19.990000] HWECC(1) - MBE detected ([5ff00]-0000)
[   20.000000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   20.000000] [T01]bovine_nand_read_page_hwecc:671: Returned -5 [-EIO], page=0x5ff00
[   20.010000] UBI warning: ubi_io_read: error -5 while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   20.020000] HWECC(1) - MBE detected ([5ff00]-0000)
[   20.030000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   20.030000] [T01]bovine_nand_read_page_hwecc:671: Returned -5 [-EIO], page=0x5ff00
[   20.040000] UBI error: ubi_io_read: error -5 while reading 64 bytes from PEB 13796:0, read 0 bytes
[   20.050000] [<c002a420>] (unwind_backtrace+0x0/0xec) from [<c01e2374>] (ubi_io_read+0x1dc/0x2a4)
[   20.060000] [<c01e2374>] (ubi_io_read+0x1dc/0x2a4) from [<c01e27e8>] (ubi_io_read_ec_hdr+0x68/0x214)
[   20.070000] [<c01e27e8>] (ubi_io_read_ec_hdr+0x68/0x214) from [<c01e6d04>] (scan_peb+0x54/0x5e8)
[   20.070000] [<c01e6d04>] (scan_peb+0x54/0x5e8) from [<c01e737c>] (scan_all+0xe4/0x2f8)
[   20.080000] [<c01e737c>] (scan_all+0xe4/0x2f8) from [<c01e759c>] (ubi_attach+0xc/0xbc)
[   20.090000] [<c01e759c>] (ubi_attach+0xc/0xbc) from [<c01dca90>] (ubi_attach_mtd_dev+0x1e4/0x570)
[   20.100000] [<c01dca90>] (ubi_attach_mtd_dev+0x1e4/0x570) from [<c01dcf8c>] (ctrl_cdev_ioctl+0xd0/0x160)
[   20.110000] [<c01dcf8c>] (ctrl_cdev_ioctl+0xd0/0x160) from [<c0095898>] (vfs_ioctl+0x2c/0x70)
[   20.120000] [<c0095898>] (vfs_ioctl+0x2c/0x70) from [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c)
[   20.130000] [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c) from [<c0096008>] (sys_ioctl+0x38/0x5c)
[   20.130000] [<c0096008>] (sys_ioctl+0x38/0x5c) from [<c0024e80>] (ret_fast_syscall+0x0/0x2c)
[   20.150000] UBI error: ubi_attach_mtd_dev: failed to attach mtd6, error -5
ubiattach: error!: cannot attach mtd6
           error 5 (Input/output error)
ubimount: [ubiattach /dev/ubi_ctrl -m 6 -d 11 -O 4096]->255

[   20.080000] HWECC(1) - MBE detected ([5ff00]-0000)
[   20.080000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   20.090000] [T88]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   20.090000] UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   20.110000] HWECC(1) - MBE detected ([5ff00]-0000)
[   20.110000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   20.120000] [T88]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   20.120000] UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   20.140000] HWECC(1) - MBE detected ([5ff00]-0000)
[   20.140000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   20.150000] [T88]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   20.150000] UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
[   20.170000] HWECC(1) - MBE detected ([5ff00]-0000)
[   20.170000] SWECC(1) - MBE detected ([5ff00]-0000/0)
[   20.180000] [T88]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
[   20.180000] UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 13796:0, read 0 bytes
[   20.190000] [<c002a420>] (unwind_backtrace+0x0/0xec) from [<c01e23a0>] (ubi_io_read+0x1dc/0x2a4)
[   20.200000] [<c01e23a0>] (ubi_io_read+0x1dc/0x2a4) from [<c01e2814>] (ubi_io_read_ec_hdr+0x68/0x214)
[   20.210000] [<c01e2814>] (ubi_io_read_ec_hdr+0x68/0x214) from [<c01e6d30>] (scan_peb+0x54/0x5e8)
[   20.220000] [<c01e6d30>] (scan_peb+0x54/0x5e8) from [<c01e73a8>] (scan_all+0xe4/0x2f8)
[   20.230000] [<c01e73a8>] (scan_all+0xe4/0x2f8) from [<c01e75c8>] (ubi_attach+0xc/0xbc)
[   20.240000] [<c01e75c8>] (ubi_attach+0xc/0xbc) from [<c01dcabc>] (ubi_attach_mtd_dev+0x1e4/0x570)
[   20.240000] [<c01dcabc>] (ubi_attach_mtd_dev+0x1e4/0x570) from [<c01dcfb8>] (ctrl_cdev_ioctl+0xd0/0x160)
[   20.250000] [<c01dcfb8>] (ctrl_cdev_ioctl+0xd0/0x160) from [<c0095898>] (vfs_ioctl+0x2c/0x70)
[   20.260000] [<c0095898>] (vfs_ioctl+0x2c/0x70) from [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c)
[   20.270000] [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c) from [<c0096008>] (sys_ioctl+0x38/0x5c)
[   20.280000] [<c0096008>] (sys_ioctl+0x38/0x5c) from [<c0024e80>] (ret_fast_syscall+0x0/0x2c)
[   20.290000] UBI assert failed in ubi_io_read at 203 (pid 197)
[   20.290000] [<c002a420>] (unwind_backtrace+0x0/0xec) from [<c01e23e4>] (ubi_io_read+0x220/0x2a4)
[   20.300000] [<c01e23e4>] (ubi_io_read+0x220/0x2a4) from [<c01e2814>] (ubi_io_read_ec_hdr+0x68/0x214)
[   20.310000] [<c01e2814>] (ubi_io_read_ec_hdr+0x68/0x214) from [<c01e6d30>] (scan_peb+0x54/0x5e8)
[   20.320000] [<c01e6d30>] (scan_peb+0x54/0x5e8) from [<c01e73a8>] (scan_all+0xe4/0x2f8)
[   20.330000] [<c01e73a8>] (scan_all+0xe4/0x2f8) from [<c01e75c8>] (ubi_attach+0xc/0xbc)
[   20.340000] [<c01e75c8>] (ubi_attach+0xc/0xbc) from [<c01dcabc>] (ubi_attach_mtd_dev+0x1e4/0x570)
[   20.340000] [<c01dcabc>] (ubi_attach_mtd_dev+0x1e4/0x570) from [<c01dcfb8>] (ctrl_cdev_ioctl+0xd0/0x160)
[   20.350000] [<c01dcfb8>] (ctrl_cdev_ioctl+0xd0/0x160) from [<c0095898>] (vfs_ioctl+0x2c/0x70)
[   20.360000] [<c0095898>] (vfs_ioctl+0x2c/0x70) from [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c)
[   20.370000] [<c0095fa4>] (do_vfs_ioctl+0x300/0x32c) from [<c0096008>] (sys_ioctl+0x38/0x5c)
[   20.380000] [<c0096008>] (sys_ioctl+0x38/0x5c) from [<c0024e80>] (ret_fast_syscall+0x0/0x2c)
[   20.390000] UBI error: ubi_attach_mtd_dev: failed to attach mtd6, error -5
ubiattach: error!: cannot attach mtd6
           error 5 (Input/output error)
ubimount: [ubiattach /dev/ubi_ctrl -m 6 -d 11 -O 4096]->255


Best regards,

Iwo

______________________________________________________________________
This communication contains information which may be confidential or privileged. The information is intended solely for the use of the individual or entity named above.  If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited.  If you have received this communication in error, please notify me by telephone immediately.
______________________________________________________________________

* Re: Correct behaviour of ubiattach / ubiformat in the face of ECC errors?
  2014-06-19  3:15 Correct behaviour of ubiattach / ubiformat in the face of ECC errors? Iwo Mergler
@ 2014-07-01 14:05 ` Artem Bityutskiy
  2014-07-02 23:58   ` Iwo Mergler
  0 siblings, 1 reply; 3+ messages in thread
From: Artem Bityutskiy @ 2014-07-01 14:05 UTC (permalink / raw)
  To: Iwo Mergler; +Cc: linux-mtd

On Thu, 2014-06-19 at 13:15 +1000, Iwo Mergler wrote:
> Hi all,
> 
> 
> I'm debugging a NAND driver and I'm unsure about what the correct behaviour
> of ubiformat / ubiattach ought to be in the face of uncorrectable ECC errors.
> 
> This is a 2.6.35 kernel, with UBI/UBIFS updated to the top of
> git://git.infradead.org/~dedekind/ubifs-v2.6.35.git and mtd-utils v1.5.1.
> 
> My question is, what is ubiattach / ubiformat supposed to do if a non-correctable 
> ECC error occurs in a block during the scan?

Generally speaking, 'ubiformat' is about to destroy all the data anyway,
so it should ignore errors and just carry on.

'ubiattach' asks the kernel to attach the media, and the kernel should
generally do something about ECC errors.

> My gut feeling says that ubiattach should fail if the PEB was in use, but ubiformat
> should mark the PEB as bad and continue.

> This is not the behaviour I'm observing. The broken PEB in the examples below
> contains an EC header, but no VID header - it's not in use.
> 
> To complicate matters, I'm also unsure about the correct return value from the
> NAND driver in this situation. Depending on which documentation I read, it should
> return -1 (i.e. -EPERM), -EIO or -EBADMSG.

-EBADMSG. I think we have this documented in several places.

> [   20.080000] HWECC(1) - MBE detected ([5ff00]-0000)
> [   20.080000] SWECC(1) - MBE detected ([5ff00]-0000/0)
> [   20.090000] [T88]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
> [   20.090000] UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
> [   20.110000] HWECC(1) - MBE detected ([5ff00]-0000)
> [   20.110000] SWECC(1) - MBE detected ([5ff00]-0000/0)
> [   20.120000] [T88]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
> [   20.120000] UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry
> [   20.140000] HWECC(1) - MBE detected ([5ff00]-0000)
> [   20.140000] SWECC(1) - MBE detected ([5ff00]-0000/0)
> [   20.150000] [T88]bovine_nand_read_page_hwecc:671: Returned -74 [-EBADMSG], page=0x5ff00
> [   20.150000] UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 13796:0, read only 0 bytes, retry

Notice "read only 0 bytes". I believe the MTD API assumes that even if
there is an ECC error, the driver still returns the data. You use
-EBADMSG to indicate that the data is corrupted, but you return it
anyway. Your driver returns no data, saying that it read 0 bytes.
Please correct this.
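
A minimal sketch of that contract, written as a Python model rather than
kernel code. read_page() and its parameters are hypothetical names chosen
for illustration; the only value taken from this thread is 74 for EBADMSG,
as printed in the log above:

```python
EBADMSG = 74  # shown in the log above as "-74 [-EBADMSG]"


def read_page(flash_page, ecc_ok, buf):
    """Hypothetical page read: always hand the data over, then flag ECC failure.

    Returns (status, bytes_read). bytes_read is the full page length even
    when the status is -EBADMSG.
    """
    buf[:len(flash_page)] = flash_page      # copy the data, corrupted or not
    status = 0 if ecc_ok else -EBADMSG      # corruption is signalled separately
    return status, len(flash_page)
```

The point of the sketch is that the status and the byte count are independent:
an ECC failure changes the former, never the latter.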

-- 
Best Regards,
Artem Bityutskiy

* RE: Correct behaviour of ubiattach / ubiformat in the face of ECC errors?
  2014-07-01 14:05 ` Artem Bityutskiy
@ 2014-07-02 23:58   ` Iwo Mergler
  0 siblings, 0 replies; 3+ messages in thread
From: Iwo Mergler @ 2014-07-02 23:58 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

On Wed, 2 Jul 2014 00:05:53 +1000
Artem Bityutskiy <dedekind1@gmail.com> wrote:
> On Thu, 2014-06-19 at 13:15 +1000, Iwo Mergler wrote:

> > My question is, what is ubiattach / ubiformat supposed to do if a
> > non-correctable ECC error occurs in a block during the scan?
> 
> Generally speaking, 'ubiformat' is about to destroy all the data anyway,
> so it should ignore errors and just carry on.
> 
> 'ubiattach' asks the kernel to attach the media, and the kernel should
> generally do something about ECC errors.

Hi Artem,

Thanks for answering. I have since found out what I was doing wrong,
and meant to reply to myself to document it any day now... ;-)

You are right, and UBI/UBIFS do indeed behave correctly if the right
error codes arrive from MTD.

For the 2.6.35 kernel, the chain of return values goes like this:

Low-level ECC functions: return the number of corrected bit errors
(>= 0), or -EBADMSG when the errors are uncorrectable.

Page read functions: return 0 for correctable errors and -EBADMSG
for uncorrectable ones. They also increment ecc_stats.corrected by
the number of corrected bit errors.

MTD core read: checks the difference in ecc_stats.corrected (after
minus before) and returns -EUCLEAN if any bit errors were corrected.
Error codes (-EBADMSG) are passed through.
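
The chain above can be sketched as a small Python model. This follows the
description in this message only; it is not the kernel's code, and
EccStats, ecc_correct(), read_page(), mtd_read() and the "strength"
parameter are hypothetical names. 74 is the EBADMSG value from the logs
in this thread and 117 is the Linux value of EUCLEAN:

```python
EBADMSG = 74    # "-74 [-EBADMSG]" in the logs of this thread
EUCLEAN = 117   # Linux errno value for EUCLEAN


class EccStats:
    """Stand-in for the running ecc_stats counters."""

    def __init__(self):
        self.corrected = 0


def ecc_correct(bitflips, strength):
    """Low-level ECC: number of corrected bits (>= 0), or -EBADMSG."""
    return bitflips if bitflips <= strength else -EBADMSG


def read_page(stats, bitflips, strength):
    """Page read: 0 if the data could be corrected, -EBADMSG otherwise."""
    res = ecc_correct(bitflips, strength)
    if res < 0:
        return res
    stats.corrected += res
    return 0


def mtd_read(stats, bitflips, strength):
    """Core read: compare the counter before and after the page read."""
    before = stats.corrected
    ret = read_page(stats, bitflips, strength)
    if ret < 0:
        return ret                  # -EBADMSG is passed through
    if stats.corrected - before:
        return -EUCLEAN             # data is good, but bits were corrected
    return 0
```

In this model a clean page yields 0, a page with correctable bit flips
yields -EUCLEAN, and a page beyond the ECC strength yields -EBADMSG.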

In newer kernels (~v3.9?), the page read can also return -EUCLEAN
directly to signal that the number of corrected bit errors has reached
a threshold worth worrying about.

In all cases, the read must report having read all requested bytes,
even if the buffer may contain some garbage. Various users of MTD
(e.g. JFFS2, UBI) keep checksums over some of their data and can
independently verify, and possibly recover, the correct information.
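
To illustrate why the full byte count matters, here is a sketch of an upper
layer that decides from its own checksum instead of trusting the ECC status
alone. The 16-byte header layout (magic, counter, CRC-32) is invented for
this example and is not the real UBI or JFFS2 on-flash format;
make_header() and header_usable() are hypothetical names:

```python
import struct
import zlib

EBADMSG = 74
MAGIC = b"HDR#"   # hypothetical magic, not the real UBI on-flash format


def make_header(counter):
    """Build a 16-byte header: 4-byte magic, 8-byte counter, 4-byte CRC-32."""
    body = MAGIC + struct.pack(">Q", counter)
    return body + struct.pack(">I", zlib.crc32(body))


def header_usable(status, data):
    """Accept the header if its own checksum matches, even after -EBADMSG."""
    if status not in (0, -EBADMSG) or len(data) != 16:
        return False                      # hard error or short read
    body = data[:12]
    (crc,) = struct.unpack(">I", data[12:])
    return body[:4] == MAGIC and zlib.crc32(body) == crc
```

If the lower layer reports 0 bytes read, a check like this has nothing to
verify and has to give up, even though the data might have been usable.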

On my system, I did return -EUCLEAN from the page read function, which
MTD treated as a fatal error (i.e. it was neither 0 nor -EBADMSG).
This led to MTD-char returning 0 bytes read and the C library retrying
endlessly.
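
The endless retry can be modelled with a read loop that keeps going until
it has all the bytes it asked for. This is a sketch of the general pattern
in Python, not the actual libmtd or C library code; read_fully() and
max_zero_reads are hypothetical names:

```python
def read_fully(read_fn, length, max_zero_reads=None):
    """Call read_fn until `length` bytes have arrived.

    With max_zero_reads=None the loop never gives up on a source that keeps
    returning 0 bytes, which is the failure mode described above. Passing a
    limit turns the hang into an error.
    """
    data = b""
    zero_reads = 0
    while len(data) < length:
        chunk = read_fn(length - len(data))
        if not chunk:
            zero_reads += 1
            if max_zero_reads is not None and zero_reads >= max_zero_reads:
                raise IOError("no progress after %d empty reads" % zero_reads)
            continue
        data += chunk
    return data
```

A source that delivers data in short chunks still completes, while a source
that only ever returns 0 bytes is caught once max_zero_reads is set.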


Best regards,

Iwo
