* non-fresh: what?
@ 2008-02-03  2:54 Dexter Filmore
  2008-02-04 22:05 ` when is a disk "non-fresh"? Dexter Filmore
  0 siblings, 1 reply; 7+ messages in thread
From: Dexter Filmore @ 2008-02-03  2:54 UTC (permalink / raw)
  To: linux-raid

[   40.671910] md: md0 stopped.
[   40.676923] md: bind<sdd1>
[   40.677136] md: bind<sda1>
[   40.677370] md: bind<sdb1>
[   40.677572] md: bind<sdc1>
[   40.677618] md: kicking non-fresh sdd1 from array!

When is a disk "non-fresh" and what might lead to this? 
Happened about 15 times now since I built the array.

Dex


-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d--(+)@ s-:+ a- C++++ UL++ P+>++ L+++>++++ E-- W++ N o? K-
w--(---) !O M+ V- PS+ PE Y++ PGP t++(---)@ 5 X+(++) R+(++) tv--(+)@ 
b++(+++) DI+++ D- G++ e* h>++ r* y?
------END GEEK CODE BLOCK------

http://www.vorratsdatenspeicherung.de


* when is a disk "non-fresh"?
  2008-02-03  2:54 non-fresh: what? Dexter Filmore
@ 2008-02-04 22:05 ` Dexter Filmore
  2008-02-05  2:02   ` Neil Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Dexter Filmore @ 2008-02-04 22:05 UTC (permalink / raw)
  To: linux-raid

Seems the other topic wasn't quite clear...
Occasionally a disk is kicked for being "non-fresh" - what does this mean and 
what causes it?

Dex




* Re: when is a disk "non-fresh"?
  2008-02-04 22:05 ` when is a disk "non-fresh"? Dexter Filmore
@ 2008-02-05  2:02   ` Neil Brown
  2008-02-07 22:16     ` Dexter Filmore
  0 siblings, 1 reply; 7+ messages in thread
From: Neil Brown @ 2008-02-05  2:02 UTC (permalink / raw)
  To: Dexter Filmore; +Cc: linux-raid

On Monday February 4, Dexter.Filmore@gmx.de wrote:
> Seems the other topic wasn't quite clear...

not necessarily.  sometimes it helps to repeat your question.  there
is a lot of noise on the internet and sometimes important things get
missed... :-)

> Occasionally a disk is kicked for being "non-fresh" - what does this mean and 
> what causes it?

The 'event' count is too small.  
Every event that happens on an array causes the event count to be
incremented.
If the event counts on different devices differ by more than 1, then
the smaller number is 'non-fresh'.
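
As a rough illustration of that rule, here is a minimal Python sketch. It
is not the kernel's actual code: the function name, the tolerance
parameter, and the example counts are assumptions made up for
illustration. It simply flags any device whose count lags the highest
count in the array by more than 1.

```python
def find_non_fresh(event_counts, tolerance=1):
    """Return device names whose event count lags the freshest by more than `tolerance`.

    `event_counts` maps a device name to its event count (hypothetical input).
    """
    freshest = max(event_counts.values())
    return sorted(dev for dev, events in event_counts.items()
                  if freshest - events > tolerance)


# Hypothetical counts: sdd1 missed several events, the others agree.
counts = {"sda1": 1149316, "sdb1": 1149316, "sdc1": 1149316, "sdd1": 1149310}
print(find_non_fresh(counts))
```

A device that is behind by exactly 1 is still accepted under this rule, so
only sdd1 is reported in the example.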

You need to look to the kernel logs of when the array was previously
shut down to figure out why it is now non-fresh.

NeilBrown



* Re: when is a disk "non-fresh"?
  2008-02-05  2:02   ` Neil Brown
@ 2008-02-07 22:16     ` Dexter Filmore
  2008-02-07 23:22       ` Neil Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Dexter Filmore @ 2008-02-07 22:16 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

On Tuesday 05 February 2008 03:02:00 Neil Brown wrote:
> On Monday February 4, Dexter.Filmore@gmx.de wrote:
> > Seems the other topic wasn't quite clear...
>
> not necessarily.  sometimes it helps to repeat your question.  there
> is a lot of noise on the internet and sometimes important things get
> missed... :-)
>
> > Occasionally a disk is kicked for being "non-fresh" - what does this mean
> > and what causes it?
>
> The 'event' count is too small.
> Every event that happens on an array causes the event count to be
> incremented.

An 'event' here is any atomic action? Like "write byte there" or "calc XOR"?


> If the event counts on different devices differ by more than 1, then
> the smaller number is 'non-fresh'.
>
> You need to look to the kernel logs of when the array was previously
> shut down to figure out why it is now non-fresh.

The kernel logs show absolutely nothing. Log's fine, next time I boot up, one 
disk is kicked, I got no clue why, badblocks is fine, smartctl is fine, self 
test fine, dmesg and /var/log/messages show nothing apart from that news that 
the disk was kicked and mdadm -E doesn't say anything suspicious either.

Question: what events occurred on the 3 other disks that didn't occur on the 
last? It only happens after reboots, not while the machine is up so the 
closest assumption is that the array is not properly shut down somehow during 
system shutdown - only I wouldn't know why.
Box is Slackware 11.0, 11 doesn't come with raid scripts of its own so I hacked 
them into the boot scripts myself and carefully watched that everything 
accessing the array is down before mdadm --stop --scan is issued.
No NFS, no Samba, no other funny daemons, disks are synced and so on.

I could write some failsafe into it by checking if the event count is the same 
on all disks before --stop, but even if it wasn't, I really wouldn't know 
what to do about it.

(btw mdadm -E gives me:     Events : 0.1149316 - what's with the 0. ?)

Dex




* Re: when is a disk "non-fresh"?
  2008-02-07 22:16     ` Dexter Filmore
@ 2008-02-07 23:22       ` Neil Brown
  2008-02-08  9:32         ` Dexter Filmore
  0 siblings, 1 reply; 7+ messages in thread
From: Neil Brown @ 2008-02-07 23:22 UTC (permalink / raw)
  To: Dexter Filmore; +Cc: linux-raid

On Thursday February 7, Dexter.Filmore@gmx.de wrote:
> On Tuesday 05 February 2008 03:02:00 Neil Brown wrote:
> > On Monday February 4, Dexter.Filmore@gmx.de wrote:
> > > Seems the other topic wasn't quite clear...
> >
> > not necessarily.  sometimes it helps to repeat your question.  there
> > is a lot of noise on the internet and sometimes important things get
> > missed... :-)
> >
> > > Occasionally a disk is kicked for being "non-fresh" - what does this mean
> > > and what causes it?
> >
> > The 'event' count is too small.
> > Every event that happens on an array causes the event count to be
> > incremented.
> 
> An 'event' here is any atomic action? Like "write byte there" or "calc XOR"?

An 'event' is
   - switch from clean to dirty
   - switch from dirty to clean
   - a device fails
   - a spare finishes recovery
things like that.

> 
> 
> > If the event counts on different devices differ by more than 1, then
> > the smaller number is 'non-fresh'.
> >
> > You need to look to the kernel logs of when the array was previously
> > shut down to figure out why it is now non-fresh.
> 
> The kernel logs show absolutely nothing. Log's fine, next time I boot up, one 
> disk is kicked, I got no clue why, badblocks is fine, smartctl is fine, self 
> test fine, dmesg and /var/log/messages show nothing apart from that news that 
> the disk was kicked and mdadm -E doesn't say anything suspicious either.

Can you get "mdadm -E" on all devices *before* attempting to assemble
the array?
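
One way to collect that is to run "mdadm -E" on each member device and
pull out the Events line. The following Python sketch is only an
illustration: the helper names are hypothetical, the device names are
placeholders based on the boot log earlier in this thread, and the
regular expression assumes the "Events : 0.1149316" line format quoted
below.

```python
import re
import subprocess

# Matches a line such as "    Events : 0.1149316" (format as quoted in this thread).
EVENTS_RE = re.compile(r"^\s*Events\s*:\s*(\S+)\s*$", re.MULTILINE)


def extract_events(examine_output):
    """Return the raw Events field from `mdadm -E` text, or None if absent."""
    match = EVENTS_RE.search(examine_output)
    return match.group(1) if match else None


def examine(device):
    """Run `mdadm -E <device>` and return its output (normally needs root)."""
    result = subprocess.run(["mdadm", "-E", device],
                            capture_output=True, text=True, check=True)
    return result.stdout


def report(devices):
    """Print the Events field of each device; run this before assembling."""
    for dev in devices:
        print(dev, extract_events(examine(dev)))


# Example call (placeholder device names):
# report(["/dev/sda1", "/dev/sdb1", "/dev/sdc1", "/dev/sdd1"])
```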

> 
> Question: what events occurred on the 3 other disks that didn't occur on the 
> last? It only happens after reboots, not while the machine is up so the 
> closest assumption is that the array is not properly shut down somehow during 
> system shutdown - only I wouldn't know why.

Yes, most likely the array didn't shut down properly.

> Box is Slackware 11.0, 11 doesn't come with raid scripts of its own so I hacked 
> them into the boot scripts myself and carefully watched that everything 
> accessing the array is down before mdadm --stop --scan is issued.
> No NFS, no Samba, no other funny daemons, disks are synced and so on.
> 
> I could write some failsafe into it by checking if the event count is the same 
> on all disks before --stop, but even if it wasn't, I really wouldn't know 
> what to do about it.
> 
> (btw mdadm -E gives me:     Events : 0.1149316 - what's with the 0. ?)
> 

The events count is a 64-bit number and for historical reasons it is
printed as two 32-bit numbers.  I agree this is ugly.
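
To make that concrete, here is a small Python sketch that turns the
printed form back into a single number. It assumes the part before the
dot is the high 32 bits and the part after it is the low 32 bits; the
function name is made up for illustration.

```python
def parse_events(text):
    """Combine an 'Events' value printed as 'hi.lo' into one 64-bit integer.

    Assumption: `hi` is the upper 32 bits and `lo` the lower 32 bits.
    """
    hi_str, lo_str = text.strip().split(".")
    hi, lo = int(hi_str), int(lo_str)
    if not (0 <= hi < 2**32 and 0 <= lo < 2**32):
        raise ValueError("each half must fit in 32 bits")
    return (hi << 32) | lo


print(parse_events("0.1149316"))
```

Under that assumption, the leading "0." just means the count has not yet
exceeded 32 bits.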

NeilBrown


* Re: when is a disk "non-fresh"?
  2008-02-07 23:22       ` Neil Brown
@ 2008-02-08  9:32         ` Dexter Filmore
  2008-02-10 10:36           ` David Greaves
  0 siblings, 1 reply; 7+ messages in thread
From: Dexter Filmore @ 2008-02-08  9:32 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

On Friday 08 February 2008 00:22:36 Neil Brown wrote:
> On Thursday February 7, Dexter.Filmore@gmx.de wrote:
> > On Tuesday 05 February 2008 03:02:00 Neil Brown wrote:
> > > On Monday February 4, Dexter.Filmore@gmx.de wrote:
> > > > Seems the other topic wasn't quite clear...
> > >
> > > not necessarily.  sometimes it helps to repeat your question.  there
> > > is a lot of noise on the internet and sometimes important things get
> > > missed... :-)
> > >
> > > > Occasionally a disk is kicked for being "non-fresh" - what does this
> > > > mean and what causes it?
> > >
> > > The 'event' count is too small.
> > > Every event that happens on an array causes the event count to be
> > > incremented.
> >
> > An 'event' here is any atomic action? Like "write byte there" or "calc
> > XOR"?
>
> An 'event' is
>    - switch from clean to dirty
>    - switch from dirty to clean
>    - a device fails
>    - a spare finishes recovery
> things like that.

Is there a glossary that explains "dirty" and such in detail?

>
> > > If the event counts on different devices differ by more than 1, then
> > > the smaller number is 'non-fresh'.
> > >
> > > You need to look to the kernel logs of when the array was previously
> > > shut down to figure out why it is now non-fresh.
> >
> > The kernel logs show absolutely nothing. Log's fine, next time I boot up,
> > one disk is kicked, I got no clue why, badblocks is fine, smartctl is
> > fine, self test fine, dmesg and /var/log/messages show nothing apart
> > from that news that the disk was kicked and mdadm -E doesn't say anything
> > suspicious either.
>
> Can you get "mdadm -E" on all devices *before* attempting to assemble
> the array?
>

Yes, can do. But the array is in sync again now; I guess you want the -E 
output from when it's degraded?


> > Question: what events occurred on the 3 other disks that didn't occur on
> > the last? It only happens after reboots, not while the machine is up so
> > the closest assumption is that the array is not properly shut down
> > somehow during system shutdown - only I wouldn't know why.
>
> Yes, most likely the array didn't shut down properly.

I noticed that *after* stopping the array I get some message on the console 
about SCSI caches, but it disappears too quickly to read and doesn't turn up 
in the logs. I will try to capture it on video, though I issue "sync" anyway 
before stopping the array.

>
> > Box is Slackware 11.0, 11 doesn't come with raid scripts of its own so I
> > hacked them into the boot scripts myself and carefully watched that
> > everything accessing the array is down before mdadm --stop --scan is
> > issued. No NFS, no Samba, no other funny daemons, disks are synced and so
> > on.
> >
> > I could write some failsafe into it by checking if the event count is the
> > same on all disks before --stop, but even if it wasn't, I really wouldn't
> > know what to do about it.
> >
> > (btw mdadm -E gives me:     Events : 0.1149316 - what's with the 0. ?)
>
> The events count is a 64-bit number and for historical reasons it is
> printed as two 32-bit numbers.  I agree this is ugly.
>
> NeilBrown




* Re: when is a disk "non-fresh"?
  2008-02-08  9:32         ` Dexter Filmore
@ 2008-02-10 10:36           ` David Greaves
  0 siblings, 0 replies; 7+ messages in thread
From: David Greaves @ 2008-02-10 10:36 UTC (permalink / raw)
  To: Dexter Filmore; +Cc: Neil Brown, linux-raid

Dexter Filmore wrote:
> On Friday 08 February 2008 00:22:36 Neil Brown wrote:
>> On Thursday February 7, Dexter.Filmore@gmx.de wrote:
>>> On Tuesday 05 February 2008 03:02:00 Neil Brown wrote:
>>>> On Monday February 4, Dexter.Filmore@gmx.de wrote:
>>>>> Seems the other topic wasn't quite clear...
>>>> not necessarily.  sometimes it helps to repeat your question.  there
>>>> is a lot of noise on the internet and sometimes important things get
>>>> missed... :-)
>>>>
>>>>> Occasionally a disk is kicked for being "non-fresh" - what does this
>>>>> mean and what causes it?
>>>> The 'event' count is too small.
>>>> Every event that happens on an array causes the event count to be
>>>> incremented.
>>> An 'event' here is any atomic action? Like "write byte there" or "calc
>>> XOR"?
>> An 'event' is
>>    - switch from clean to dirty
>>    - switch from dirty to clean
>>    - a device fails
>>    - a spare finishes recovery
>> things like that.
> 
> Is there a glossary that explains "dirty" and such in detail?

Not yet.

http://linux-raid.osdl.org/index.php?title=Glossary

David
