* UBI: Question about error checking in ubi_eba_write_leb()
@ 2014-09-30 11:00 steve
2014-10-03 15:14 ` Artem Bityutskiy
0 siblings, 1 reply; 3+ messages in thread
From: steve @ 2014-09-30 11:00 UTC (permalink / raw)
To: linux-mtd
This is my first post to this mailing list, I think this is the right place for
my question.
I think the error checking in ubi_eba_write_leb() may be wrong and may cause
incorrect behaviour when extra checking is enabled in debugfs, here's the
scenario I have a problem with:
ubi_eba_write_leb() calls ubi_io_write_data() -> ubi_io_write() ->
ubi_dbg_check_all_ff()
ubi_dbg_check_all_ff() will perform an mtd_read() which can fail with EBADMSG
(if there is a failure to correct using ECC as in my case). This error code
will bubble back up to ubi_eba_write_leb(), but recover_peb() will only be
called if the error code was EIO. The question is should recover_peb() be
called in this case as well?
Thanks for any help with this,
Steve
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: UBI: Question about error checking in ubi_eba_write_leb()
2014-09-30 11:00 UBI: Question about error checking in ubi_eba_write_leb() steve
@ 2014-10-03 15:14 ` Artem Bityutskiy
2014-10-07 8:46 ` steve
0 siblings, 1 reply; 3+ messages in thread
From: Artem Bityutskiy @ 2014-10-03 15:14 UTC (permalink / raw)
To: steve; +Cc: linux-mtd
On Tue, 2014-09-30 at 12:00 +0100, steve wrote:
> This is my first post to this mailing list, I think this is the right place for
> my question.
>
> I think the error checking in ubi_eba_write_leb() may be wrong and may cause
> incorrect behaviour when extra checking is enabled in debugfs, here's the
> scenario I have a problem with:
>
> ubi_eba_write_leb() calls ubi_io_write_data() -> ubi_io_write() ->
> ubi_dbg_check_all_ff()
>
> ubi_dbg_check_all_ff() will perform an mtd_read() which can fail with EBADMSG
> (if there is a failure to correct using ECC as in my case).
Yep, this is possible.
> This error code
> will bubble back up to ubi_eba_write_leb()
And the question is - is this the right thing to do? Original intention
was to catch I/O problems and report about them promptly by dumping the
information to the log and returning an error.
The error we return is -EINVAL exactly in because we did not want the
upper layers try to 'deal' with the problem. All we wanted to do is to
catch the problems.
> , but recover_peb() will only be
> called if the error code was EIO.
Yeah, and we never meant the recover_peb() to actually try doing
anything about that.
> The question is should recover_peb() be
> called in this case as well?
Good question - this was not meant to be.
So see, you are now talking about the debugging code. It was written
basically to make developers' live simpler. This is just a tool to catch
bugs during the R&D.
So if during the R&D phase we see that the driver returns an "ECC
error" (-EBADMSG) when we read the data we just wrote - this is a
problem, and we _do not_ want to hide it by starting the recovery
mechanism - we want the write request to actually fail and the developer
to actually notice this.
The non-debug case is a completely different thing. Here we actually
would be interested to handle the issue transparently - move the data to
a different PEB, and mark the failing PEB as bad.
But the assumption is that the write() path never returns -EBADMSG. If a
driver internally does write-read-verify, we assume it should return
-EIO if there is an ECC error. But not -EBADMSG.
This is why we do not handle this error code in eba.c.
Now, what I see is that you are looking at some old code. In the new
code the function is named 'ubi_self_check_all_ff()', and if I read the
code correctly, it will return -EINVAL if mtd_read() returns -EBADMSG:
err = mtd_read(ubi->mtd, addr, len, &read, buf);
if (err && !mtd_is_bitflip(err)) {
ubi_err("error %d while reading %d bytes from PEB %d:%d, read %zd bytes",
err, len, pnum, offset, read);
goto error;
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: UBI: Question about error checking in ubi_eba_write_leb()
2014-10-03 15:14 ` Artem Bityutskiy
@ 2014-10-07 8:46 ` steve
0 siblings, 0 replies; 3+ messages in thread
From: steve @ 2014-10-07 8:46 UTC (permalink / raw)
To: Artem Bityutskiy; +Cc: linux-mtd
Thank you very much for your reply Artem, that makes sense to me now.
On Fri, 3 Oct 2014, Artem Bityutskiy wrote:
> On Tue, 2014-09-30 at 12:00 +0100, steve wrote:
>> This is my first post to this mailing list, I think this is the right place for
>> my question.
>>
>> I think the error checking in ubi_eba_write_leb() may be wrong and may cause
>> incorrect behaviour when extra checking is enabled in debugfs, here's the
>> scenario I have a problem with:
>>
>> ubi_eba_write_leb() calls ubi_io_write_data() -> ubi_io_write() ->
>> ubi_dbg_check_all_ff()
>>
>> ubi_dbg_check_all_ff() will perform an mtd_read() which can fail with EBADMSG
>> (if there is a failure to correct using ECC as in my case).
>
> Yep, this is possible.
>
>> This error code
>> will bubble back up to ubi_eba_write_leb()
>
> And the question is - is this the right thing to do? Original intention
> was to catch I/O problems and report about them promptly by dumping the
> information to the log and returning an error.
>
> The error we return is -EINVAL exactly in because we did not want the
> upper layers try to 'deal' with the problem. All we wanted to do is to
> catch the problems.
>
>> , but recover_peb() will only be
>> called if the error code was EIO.
>
> Yeah, and we never meant the recover_peb() to actually try doing
> anything about that.
>
>> The question is should recover_peb() be
>> called in this case as well?
>
> Good question - this was not meant to be.
>
> So see, you are now talking about the debugging code. It was written
> basically to make developers' live simpler. This is just a tool to catch
> bugs during the R&D.
>
> So if during the R&D phase we see that the driver returns an "ECC
> error" (-EBADMSG) when we read the data we just wrote - this is a
> problem, and we _do not_ want to hide it by starting the recovery
> mechanism - we want the write request to actually fail and the developer
> to actually notice this.
>
> The non-debug case is a completely different thing. Here we actually
> would be interested to handle the issue transparently - move the data to
> a different PEB, and mark the failing PEB as bad.
>
> But the assumption is that the write() path never returns -EBADMSG. If a
> driver internally does write-read-verify, we assume it should return
> -EIO if there is an ECC error. But not -EBADMSG.
>
> This is why we do not handle this error code in eba.c.
>
> Now, what I see is that you are looking at some old code. In the new
> code the function is named 'ubi_self_check_all_ff()', and if I read the
> code correctly, it will return -EINVAL if mtd_read() returns -EBADMSG:
>
>
> err = mtd_read(ubi->mtd, addr, len, &read, buf);
> if (err && !mtd_is_bitflip(err)) {
> ubi_err("error %d while reading %d bytes from PEB %d:%d, read %zd bytes",
> err, len, pnum, offset, read);
> goto error;
>
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2014-10-07 8:51 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-30 11:00 UBI: Question about error checking in ubi_eba_write_leb() steve
2014-10-03 15:14 ` Artem Bityutskiy
2014-10-07 8:46 ` steve
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.