linux-mtd.lists.infradead.org archive mirror
* What is wrong?
@ 2014-02-26 14:07 Leon Pollak
  2014-02-27  1:11 ` Brian Norris
  0 siblings, 1 reply; 10+ messages in thread
From: Leon Pollak @ 2014-02-26 14:07 UTC (permalink / raw)
  To: linux-mtd

Hello, all.
I will be very thankful for any hint...:)

I use JFFS2 version 2.2.(NAND)(SUMMARY)(C) 2001-2006 Red Hat, Inc. 
(comes with MontaVistaPro 5.0). Kernel 2.6.18.
I prepare the JFFS2 partition with mkfs.jffs2 and sumtool and write it with 
flash_eraseall and 'nandwrite -p'.
Everything works fine.

The NAND is write-protected by HW and the partition is mounted RO. 
At some point I need to update a small file. 
So I do: 
- HW write protect off, 
- remount RW, 
- update file, 
- sync, 
- remount RO, 
- write protect on.

On the Linux console I see a lot of messages like:
Erase at 0x00400000 failed immediately: errno -5
Erase at 0x003e0000 failed immediately: errno -5
......
Erase at 0x00340000 failed immediately: errno -5
jffs2_flush_wbuf(): Write failed with -5
Write of 2016 bytes at 0x002578a0 failed. returned -5, retlen 0
Not marking the space at 0x002578a0 as dirty because the flash driver 
returned retlen zero.


This repeats for a long time, but everything seems to work OK.
Subsequent boots and even file updates are also OK, without error 
messages.

What am I doing wrong? Thanks a lot.

--  
Leon


* Re: What is wrong?
  2014-02-26 14:07 What is wrong? Leon Pollak
@ 2014-02-27  1:11 ` Brian Norris
  2014-02-27  1:13   ` Brian Norris
       [not found]   ` <1553411.O1igofskp7@leonp.plris.com>
  0 siblings, 2 replies; 10+ messages in thread
From: Brian Norris @ 2014-02-27  1:11 UTC (permalink / raw)
  To: Leon Pollak; +Cc: linux-mtd

Hi Leon,

On Wed, Feb 26, 2014 at 04:07:21PM +0200, Leon Pollak wrote:
> I use JFFS2 version 2.2.(NAND)(SUMMARY)(C) 2001-2006 Red Hat, Inc. 
> (comes with MontaVistaPro 5.0). Kernel 2.6.18.
> I prepare JFFS2 partition with mkfs.jffs2 and sumtool and burn it with 
> flash_eraseall and 'nandwrite -p'.
> Everything works fine.
> 
> The NAND is write protected by HW and the partition is mounted as RO. 
> At some moment I need to update a small file. 
> So I do: 
> - HW write protect off, 
> - remount RW, 
> - update file, 
> - sync, 
> - remount RO, 
> - write protect on.
> 
> Looking at linux console I see a lot of messages like:
> Erase at 0x00400000 failed immediately: errno -5
> Erase at 0x003e0000 failed immediately: errno -5
> ......
> Erase at 0x00340000 failed immediately: errno -5
> jffs2_flush_wbuf(): Write failed with -5
> Write of 2016 bytes at 0x002578a0 failed. returned -5, retlen 0
> Not marking the space at 0x002578a0 as dirty because the flash driver 
> returned retlen zero.
> 
> 
> This is repeated for a long time, but everything seems work OK.
> The sequential starts and even file updates are also OK, without error 
> messages.
> 
> What do I do wrong? Thanks a lot.

It's possible you're seeing symptoms of this:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=807612db2f9940b9fa6deaef054eb16d51bd3e00

It seems like maybe JFFS2 is still doing some GC and/or write flushing
after the remount.

Could you try this?

 - HW write protect off, 
 - remount RW, 
 - update file, 
 - sync, 
 - remount RO, 
 - sync, <---------- add this, to see if you're experiencing any
                     writeback after remount
 - write protect on.
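
The sequence above, including the extra sync, could be scripted roughly as follows. This is only a sketch under assumptions: `hw_write_protect` is a hypothetical placeholder for the board-specific write-protect control (the thread does not say how it is toggled), and `MOUNTPOINT` and `RUN` are illustrative variables, not something from the original report. Setting `RUN=echo` prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the update sequence with the extra sync after the RO remount.
# Assumptions: MOUNTPOINT is where the JFFS2 partition is mounted, and
# hw_write_protect is a hypothetical, board-specific hook.

MOUNTPOINT="${MOUNTPOINT:-/home}"
RUN="${RUN:-}"   # set RUN=echo for a dry run that only prints the commands

hw_write_protect() {
    # Hypothetical placeholder: "$1" is "on" or "off".
    # Replace with the real GPIO/register access for your board.
    echo "hw write protect $1 (placeholder)"
}

update_with_extra_sync() {
    src="$1"
    dst="$2"
    hw_write_protect off
    # If the remount fails, restore the write protection and give up.
    $RUN mount -o remount,rw "$MOUNTPOINT" || { hw_write_protect on; return 1; }
    $RUN cp "$src" "$dst"
    status=$?
    $RUN sync
    $RUN mount -o remount,ro "$MOUNTPOINT"
    $RUN sync    # the added step: flush anything still pending after the remount
    hw_write_protect on
    return "$status"
}
```

A dry run such as `RUN=echo; update_with_extra_sync /tmp/new.cfg /home/app.cfg` prints the steps in order, so the sequence can be reviewed before it touches the flash.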

Brian


* Re: What is wrong?
  2014-02-27  1:11 ` Brian Norris
@ 2014-02-27  1:13   ` Brian Norris
       [not found]   ` <1553411.O1igofskp7@leonp.plris.com>
  1 sibling, 0 replies; 10+ messages in thread
From: Brian Norris @ 2014-02-27  1:13 UTC (permalink / raw)
  To: Leon Pollak; +Cc: Richard Weinberger, Andrew Ruder, linux-mtd

+ Richard, Andrew

(Meant to CC earlier)



* Re: What is wrong?
       [not found]   ` <1553411.O1igofskp7@leonp.plris.com>
@ 2014-02-27 21:22     ` Brian Norris
  2014-03-03 21:13       ` Andrew Ruder
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Norris @ 2014-02-27 21:22 UTC (permalink / raw)
  To: Leon Pollak
  Cc: Artem Bityutskiy, Richard Weinberger, Linux Kernel, linux-mtd,
	Al Viro, Andrew Ruder

+ others

Hi Leon,

Can you please keep the CC list intact? And please try to reply below
the quotes and trim context, rather than top-posting. Thanks!

On Thu, Feb 27, 2014 at 02:00:25PM +0200, Leon Pollak wrote:
> I am VERY(!) thankful to you for the answer.
> First, I am relieved that there is no error on my side and that the 
> system remains clean despite these messages.
> Second, yes, the workaround worked.

That's nice to hear, but that is (as you note) a workaround. You should
not need an extra sync after remounting read-only. Do you think you can
try the linked patch?

commit 807612db2f9940b9fa6deaef054eb16d51bd3e00
Author: Andrew Ruder <andrew.ruder@elecsyscorp.com>
Date:   Thu Jan 30 09:26:54 2014 -0600

    fs/super.c: sync ro remount after blocking writers

Perhaps Richard or Andrew can comment on whether this patch should help
you. But I think JFFS2 on NAND uses write-buffered support which can be
affected by this bug.

> Many thanks to you for your help!!!

You're welcome.

I have a few other questions: are you using NOR or NAND (it looks like
maybe NAND)?

Leaving most context intact for others, below.

> On Wednesday 26 February 2014 17:11:48 you wrote:
> > On Wed, Feb 26, 2014 at 04:07:21PM +0200, Leon Pollak wrote:
> > > The NAND is write protected by HW and the partition is mounted as
> > > RO.
> > > At some moment I need to update a small file.
> > > So I do:
> > > - HW write protect off,
> > > - remount RW,
> > > - update file,
> > > - sync,
> > > - remount RO,
> > > - write protect on.
> > > 
> > > Looking at linux console I see a lot of messages like:
> > > Erase at 0x00400000 failed immediately: errno -5
> > > Erase at 0x003e0000 failed immediately: errno -5
> > > ......
> > > Erase at 0x00340000 failed immediately: errno -5
> > > jffs2_flush_wbuf(): Write failed with -5
> > > Write of 2016 bytes at 0x002578a0 failed. returned -5, retlen 0
> > > Not marking the space at 0x002578a0 as dirty because the flash
> > > driver
> > > returned retlen zero.
> > > 
> > > 
> > > This is repeated for a long time, but everything seems work OK.
> > > The sequential starts and even file updates are also OK, without
> > > error messages.
> > > 
> > > What do I do wrong? Thanks a lot.
> > 
> > It's possible you're seeing symptoms of this:
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=807612db2f9940b9fa6deaef054eb16d51bd3e00
> > 
> > It seems like maybe JFFS2 is still doing some GC and/or write flushing
> > after the remount.
> > 
> > Could try this?
> > 
> >  - HW write protect off,
> >  - remount RW,
> >  - update file,
> >  - sync,
> >  - remount RO,
> >  - sync, <---------- add this, to see if you're experiencing any
> >                      writeback after remount
> >  - write protect on.

Regards,
Brian


* Re: What is wrong?
  2014-02-27 21:22     ` Brian Norris
@ 2014-03-03 21:13       ` Andrew Ruder
  2014-03-04  8:33         ` Brian Norris
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Ruder @ 2014-03-03 21:13 UTC (permalink / raw)
  To: Brian Norris
  Cc: Leon Pollak, Artem Bityutskiy, Richard Weinberger, Linux Kernel,
	linux-mtd, Al Viro

On Thu, Feb 27, 2014 at 01:22:08PM -0800, Brian Norris wrote:
> Perhaps Richard or Andrew can comment on whether this patch should help
> you. But I think JFFS2 on NAND uses write-buffered support which can be
> affected by this bug.

Definitely sounds like the same issue and I'm kind of glad to see it
crop up in another filesystem.  Also glad you Cc'd me with the URL
because I had the painful task of recreating this issue on another
filesystem on my TODO list as I didn't think it had ever been committed.

Cheers,
Andy


* Re: What is wrong?
  2014-03-03 21:13       ` Andrew Ruder
@ 2014-03-04  8:33         ` Brian Norris
  2014-03-04  8:54           ` Leon Pollak
  2014-03-27 11:50           ` What is wrong? Leon Pollak
  0 siblings, 2 replies; 10+ messages in thread
From: Brian Norris @ 2014-03-04  8:33 UTC (permalink / raw)
  To: Leon Pollak, Richard Weinberger, linux-mtd, Linux Kernel,
	Al Viro, Artem Bityutskiy

On Mon, Mar 03, 2014 at 03:13:36PM -0600, Andrew Ruder wrote:
> On Thu, Feb 27, 2014 at 01:22:08PM -0800, Brian Norris wrote:
> > Perhaps Richard or Andrew can comment on whether this patch should help
> > you. But I think JFFS2 on NAND uses write-buffered support which can be
> > affected by this bug.
> 
> Definitely sounds like the same issue and I'm kind of glad to see it
> crop up in another filesystem.

We haven't confirmed that the *patch* actually affects Leon's problem;
just that if he runs an additional 'sync' it solves his problem. Leon,
did you get to try the patch?

Anyway, should commit 807612db2f9940b9fa6deaef054eb16d51bd3e00 be marked
for -stable? 

Brian


* Re: What is wrong?
  2014-03-04  8:33         ` Brian Norris
@ 2014-03-04  8:54           ` Leon Pollak
  2014-03-04 16:40             ` Andrew Ruder
  2014-03-30  7:16             ` What is wrong? - Cont Leon Pollak
  2014-03-27 11:50           ` What is wrong? Leon Pollak
  1 sibling, 2 replies; 10+ messages in thread
From: Leon Pollak @ 2014-03-04  8:54 UTC (permalink / raw)
  To: Brian Norris
  Cc: Richard Weinberger, Artem Bityutskiy, linux-mtd, Linux Kernel, Al Viro

Hello, all.

I am really sorry for the silence - I was on a business trip and 
returned today.

I will recheck everything and try. Meanwhile, the news is not good: our 
guys say that it appears that the additional sync DOES NOT SOLVE the 
issue.
I apologize: as I did not know the exact procedure, I was mistaken and 
probably used an already GC-ed unit for the tests.

Sorry, again.

BR

On Tuesday 04 March 2014 00:33:25 Brian Norris wrote:
> On Mon, Mar 03, 2014 at 03:13:36PM -0600, Andrew Ruder wrote:
> > On Thu, Feb 27, 2014 at 01:22:08PM -0800, Brian Norris wrote:
> > > Perhaps Richard or Andrew can comment on whether this patch should
> > > help you. But I think JFFS2 on NAND uses write-buffered support
> > > which can be affected by this bug.
> > 
> > Definitely sounds like the same issue and I'm kind of glad to see it
> > crop up in another filesystem.
> 
> We haven't confirmed that the *patch* actually affects Leon's problem;
> just that if he runs an additional 'sync' it solves his problem.
> Leon, did you get to try the patch?
> 
> Anyway, should commit 807612db2f9940b9fa6deaef054eb16d51bd3e00 be
> marked for -stable?
> 
> Brian

-- 
Leon


* Re: What is wrong?
  2014-03-04  8:54           ` Leon Pollak
@ 2014-03-04 16:40             ` Andrew Ruder
  2014-03-30  7:16             ` What is wrong? - Cont Leon Pollak
  1 sibling, 0 replies; 10+ messages in thread
From: Andrew Ruder @ 2014-03-04 16:40 UTC (permalink / raw)
  To: Leon Pollak
  Cc: Artem Bityutskiy, Richard Weinberger, Linux Kernel, linux-mtd,
	Al Viro, Brian Norris

On Tue, Mar 04, 2014 at 10:54:04AM +0200, Leon Pollak wrote:
> I will recheck everything and try. Meanwhile, the news are not good: our 
> guys say that it appears that the additional sync DOES NOT SOLVE the 
> issue.

Gonna be honest, I have a tough time explaining this. :(  Unfortunately
I don't have a board here with a hardware write protect which would make
things easier to verify.

- Andy


* Re: What is wrong?
  2014-03-04  8:33         ` Brian Norris
  2014-03-04  8:54           ` Leon Pollak
@ 2014-03-27 11:50           ` Leon Pollak
  1 sibling, 0 replies; 10+ messages in thread
From: Leon Pollak @ 2014-03-27 11:50 UTC (permalink / raw)
  To: linux-mtd

Brian, hello.

I have a bit more free time now and wish to solve the issue.
I am a bit confused, though - what should I try to test? Can you help me?

Many thanks in advance.

On Tuesday 04 March 2014 00:33:25 Brian Norris wrote:
> On Mon, Mar 03, 2014 at 03:13:36PM -0600, Andrew Ruder wrote:
> > On Thu, Feb 27, 2014 at 01:22:08PM -0800, Brian Norris wrote:
> > > Perhaps Richard or Andrew can comment on whether this patch should
> > > help you. But I think JFFS2 on NAND uses write-buffered support
> > > which can be affected by this bug.
> > 
> > Definitely sounds like the same issue and I'm kind of glad to see it
> > crop up in another filesystem.
> 
> We haven't confirmed that the *patch* actually affects Leon's problem;
> just that if he runs an additional 'sync' it solves his problem.
> Leon, did you get to try the patch?
> 
> Anyway, should commit 807612db2f9940b9fa6deaef054eb16d51bd3e00 be
> marked for -stable?
> 
> Brian
> 

-- 
Leon


* Re: What is wrong? - Cont
  2014-03-04  8:54           ` Leon Pollak
  2014-03-04 16:40             ` Andrew Ruder
@ 2014-03-30  7:16             ` Leon Pollak
  1 sibling, 0 replies; 10+ messages in thread
From: Leon Pollak @ 2014-03-30  7:16 UTC (permalink / raw)
  To: linux-mtd, Al Viro; +Cc: Richard Weinberger, Brian Norris, Artem Bityutskiy

Meanwhile, some more input:

First, as a reminder, the scenario:
After the initial flashing (from within U-Boot), the NAND is write 
protected by HW and the partition is mounted RO.
At some point I need to update a small file. 
So I do: 
- HW write protect off, 
- remount RW, 
- update file, <---- The main point  <----
- sync, 
- remount RO, 
- write protect on.
At this moment, the console is full of error messages - crash or a lot 
of "Erase at 0x..... failed immediately: errno -5".
-------------------------------

Now two more test cases:

1. After burning JFFS2 from U-Boot and BEFORE ever booting from flash, 
I boot over NFS, mount the partition RW, keep it mounted for 3-5 seconds 
and reboot from flash.
This works FINE: updating passes silently, with no crazy messages and/or 
crashes.

2. I added a flag file "first_time.dat" to the partition and added this 
script in rc3.d:
if [ -e first_time.dat ] ; then
    'HW unlock flash'
    mount -o remount,rw /home
    sleep 20
    mount -o remount,ro /home
    'HW lock flash'
    rm first_time.dat
fi
Contrary to my hopes, this did NOT help. Crashes and messages appeared as 
"usual"....
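
One detail of the script as posted: `rm first_time.dat` runs after the remount to RO and the HW lock, so if the flag file lives on that same partition the removal presumably cannot succeed. The variant below is only a sketch, not a confirmed fix for the erase errors. It removes the flag while the partition is still writable and syncs both before and after the RO remount. `hw_unlock_flash` and `hw_lock_flash` are hypothetical placeholders for the 'HW unlock flash' and 'HW lock flash' steps, and `MOUNTPOINT`, `FLAG`, `SETTLE_SECONDS` and `RUN` are illustrative names.

```shell
#!/bin/sh
# Sketch of a first-boot "settle" script. Assumptions: the flag file lives on
# the JFFS2 partition itself, and the two hw_* functions stand in for the
# board-specific write-protect control.

MOUNTPOINT="${MOUNTPOINT:-/home}"
FLAG="${FLAG:-$MOUNTPOINT/first_time.dat}"
RUN="${RUN:-}"   # set RUN=echo for a dry run that only prints the commands

hw_unlock_flash() { echo "HW unlock flash (placeholder)"; }
hw_lock_flash()   { echo "HW lock flash (placeholder)"; }

first_boot_settle() {
    # Nothing to do unless the first-boot flag is present.
    [ -e "$FLAG" ] || return 0
    hw_unlock_flash
    $RUN mount -o remount,rw "$MOUNTPOINT" || { hw_lock_flash; return 1; }
    $RUN sleep "${SETTLE_SECONDS:-20}"   # give background JFFS2 work time to run
    $RUN rm -f "$FLAG"                   # remove the flag while still writable
    $RUN sync
    $RUN mount -o remount,ro "$MOUNTPOINT"
    $RUN sync                            # flush anything pending after the remount
    hw_lock_flash
}
```

Calling `first_boot_settle` from the rc3.d script would replace the inline block; with `RUN=echo` it only prints the steps it would take.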


Any help will be highly appreciated.
-- 
Leon

