* constant array_state active after specific jobs
From: pdi @ 2017-03-23 8:46 UTC (permalink / raw)
To: linux-raid
Greetings all,
The problem in a nutshell is that an array is clean after boot, until
some specific jobs switch it to active, where it remains until reboot.
A similar problem was discussed, and solved, in
https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT,
it is not the same issue.
I would be grateful for any insights as to why this happens and/or how
to prevent it.
The relevant info follows, please let me know if anything further might
help.
Many thanks in advance.
- uname -a
Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
- mdadm -V
mdadm - v3.3.4 - 3rd August 2015
- Desktop drives without SCT/ERC support,
with timeout-mismatch correction as per
https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
- /dev/md9 is a raid10 array, 4 devices, far=2,
with various dirs used as samba and nfs shares
- The array is in *constant* array_state active
- mdadm -D /dev/md9 | grep 'State :'
State : active
- cat /sys/block/md9/md/array_state
active
- watch -d 'grep md9 /proc/diskstats'
output remains unchanged
- uptime
load average: 0.00, 0.00, 0.00
- cat /sys/block/md9/md/safe_mode_delay
0.201
- echo 0.1 > /sys/block/md9/md/safe_mode_delay
array_state remains active
- echo clean > /sys/block/md9/md/array_state
echo: write error: Device or resource busy
- reboot (with or without prior check)
array_state clean
- After reboot, array remains clean until some specific
jobs put it in constant active state. Such jobs so far
identified:
- echo check > /sys/block/md9/md/sync_action
- run an rsnapshot job
- start a qemu/kvm vm
- Other jobs, like text/doc editing, multimedia playback,
etc retain array_state clean
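To pin down exactly which job flips the array, the sysfs attribute can be polled and every transition logged with a timestamp. A minimal sketch, assuming the reporter's md9 path and a one-second poll interval (both adjustable):

```shell
#!/bin/sh
# watch_transitions FILE [INTERVAL]:
# Poll FILE and print a timestamped line on every change of its
# contents; stop once FILE disappears.  Intended to catch the moment
# /sys/block/md9/md/array_state flips from clean to active.
watch_transitions() {
    file=$1
    interval=${2:-1}
    last=
    while [ -e "$file" ]; do
        cur=$(cat "$file" 2>/dev/null)
        if [ "$cur" != "$last" ]; then
            printf '%s %s\n' "$(date -u +%FT%TZ)" "$cur"
            last=$cur
        fi
        sleep "$interval"
    done
}

# Real usage (run in the background while exercising suspect jobs):
#   watch_transitions /sys/block/md9/md/array_state 1
```

Running this alongside the suspect jobs (rsnapshot, a qemu/kvm start) would timestamp the exact transition.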
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: constant array_state active after specific jobs
From: NeilBrown @ 2017-03-24 5:25 UTC (permalink / raw)
To: pdi, linux-raid; +Cc: Shaohua Li
On Thu, Mar 23 2017, pdi wrote:
> Greetings all,
>
> The problem in a nutshell is that an array is clean after boot, until
> some specific jobs switch it to active where it remains until reboot.
>
> A similar problem was discussed, and solved, in
> https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT,
> it is not the same issue.
>
> I would be grateful for any insights as to why this happens and/or how
> to prevent it.
>
> The relevant info follows, please let me know if anything further might
> help.
>
> Many thanks in advance.
>
> - uname -a
> Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
> Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
> - mdadm -V
> mdadm - v3.3.4 - 3rd August 2015
> - Desktop drives without sct/erc,
> with timeout mismatch correction as per
> https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
> - /dev/md9 is a raid10 array, 4 devices, far=2,
> with various dirs used as samba and nfs shares
> - The array is in *constant* array_state active
> - mdadm -D /dev/md9 | grep 'State :'
> State : active
> - cat /sys/block/md9/md/array_state
> active
> - watch -d 'grep md9 /proc/diskstats'
> remain unchanged
> - uptime
> load average: 0.00, 0.00, 0.00
> - cat /sys/block/md9/md/safe_mode_delay
> 0.201
> - echo 0.1 > /sys/block/md9/md/safe_mode_delay
> array_state remains active
> - echo clean > /sys/block/md9/md/array_state
> echo: write error: Device or resource busy
> - reboot (with or without prior check)
> array_state clean
> - After reboot, array remains clean until some specific
> jobs put it in constant active state. Such jobs so far
> identified:
> - echo check > /sys/block/md9/md/sync_action
> - run an rsnapshot job
> - start a qemu/kvm vm
> - Other jobs, like text/doc editing, multimedia playback,
> etc retain array_state clean
This bug was introduced by
Commit: 20d0189b1012 ("block: Introduce new bio_split()")
in 3.14, and fixed by
Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is split")
in 4.8.
Maybe the latter patch should be sent to -stable ??
NeilBrown
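Since the fix first shipped in 4.8, a quick heuristic is to compare the running kernel against that version using `sort -V`. This is only indicative, not definitive: distribution kernels may carry the patch as a -stable backport at a lower version number.

```shell
#!/bin/sh
# Heuristic check: is the running kernel >= 4.8, the first release
# containing commit 9b622e2bbcf0 ("raid10: increment write counter
# after bio is split")?  A version compare cannot see -stable
# backports, so treat the result as a hint only.
version_ge() {
    # True if $1 >= $2 in version-sort order.
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

kver=$(uname -r | cut -d- -f1)   # e.g. "4.4.38"
if version_ge "$kver" 4.8; then
    echo "kernel $kver: fix should be present upstream"
else
    echo "kernel $kver: likely affected unless the fix was backported"
fi
```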
* Re: constant array_state active after specific jobs
From: pdi @ 2017-03-24 7:04 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid, Shaohua Li
On Fri, 24 Mar 2017 16:25:35 +1100
NeilBrown <neilb@suse.com> wrote:
> On Thu, Mar 23 2017, pdi wrote:
>
> > Greetings all,
> >
> > The problem in a nutshell is that an array is clean after boot,
> > until some specific jobs switch it to active where it remains until
> > reboot.
> >
> > A similar problem was discussed, and solved, in
> > https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT,
> > it is not the same issue.
> >
> > I would be grateful for any insights as to why this happens and/or
> > how to prevent it.
> >
> > The relevant info follows, please let me know if anything further
> > might help.
> >
> > Many thanks in advance.
> >
> > - uname -a
> > Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
> > Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
> > - mdadm -V
> > mdadm - v3.3.4 - 3rd August 2015
> > - Desktop drives without sct/erc,
> > with timeout mismatch correction as per
> > https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
> > - /dev/md9 is a raid10 array, 4 devices, far=2,
> > with various dirs used as samba and nfs shares
> > - The array is in *constant* array_state active
> > - mdadm -D /dev/md9 | grep 'State :'
> > State : active
> > - cat /sys/block/md9/md/array_state
> > active
> > - watch -d 'grep md9 /proc/diskstats'
> > remain unchanged
> > - uptime
> > load average: 0.00, 0.00, 0.00
> > - cat /sys/block/md9/md/safe_mode_delay
> > 0.201
> > - echo 0.1 > /sys/block/md9/md/safe_mode_delay
> > array_state remains active
> > - echo clean > /sys/block/md9/md/array_state
> > echo: write error: Device or resource busy
> > - reboot (with or without prior check)
> > array_state clean
> > - After reboot, array remains clean until some specific
> > jobs put it in constant active state. Such jobs so far
> > identified:
> > - echo check > /sys/block/md9/md/sync_action
> > - run an rsnapshot job
> > - start a qemu/kvm vm
> > - Other jobs, like text/doc editing, multimedia playback,
> > etc retain array_state clean
>
> This bug was introduced by
> Commit: 20d0189b1012 ("block: Introduce new bio_split()")
> in 3.14, and fixed by
> Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is
> split") in 4.8.
>
> Maybe the latter patch should be sent to -stable ??
>
> NeilBrown
NeilBrown, thank you for your swift and concise answer.
I gather you are referring to kernel version numbers. The described
behaviour was first noticed many months ago with kernel 2.6.37.6, and
persisted after a system upgrade to kernel 4.4.38. However, two things
were corrected with the upgrade: the timeout mismatch, and a
Current_Pending_Sector on one of the drives; which may, or may not,
explain the occurrence with the older kernel.
Is this constant active state on the data array something to worry
about, and worth trying a kernel >= 4.8, or shall I let it be?
pdi
* Re: constant array_state active after specific jobs
From: NeilBrown @ 2017-03-26 22:42 UTC (permalink / raw)
To: pdi; +Cc: linux-raid, Shaohua Li
On Fri, Mar 24 2017, pdi wrote:
> On Fri, 24 Mar 2017 16:25:35 +1100
> NeilBrown <neilb@suse.com> wrote:
>
>> On Thu, Mar 23 2017, pdi wrote:
>>
>> > Greetings all,
>> >
>> > The problem in a nutshell is that an array is clean after boot,
>> > until some specific jobs switch it to active where it remains until
>> > reboot.
>> >
>> > A similar problem was discussed, and solved, in
>> > https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT,
>> > it is not the same issue.
>> >
>> > I would be grateful for any insights as to why this happens and/or
>> > how to prevent it.
>> >
>> > The relevant info follows, please let me know if anything further
>> > might help.
>> >
>> > Many thanks in advance.
>> >
>> > - uname -a
>> > Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
>> > Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
>> > - mdadm -V
>> > mdadm - v3.3.4 - 3rd August 2015
>> > - Desktop drives without sct/erc,
>> > with timeout mismatch correction as per
>> > https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
>> > - /dev/md9 is a raid10 array, 4 devices, far=2,
>> > with various dirs used as samba and nfs shares
>> > - The array is in *constant* array_state active
>> > - mdadm -D /dev/md9 | grep 'State :'
>> > State : active
>> > - cat /sys/block/md9/md/array_state
>> > active
>> > - watch -d 'grep md9 /proc/diskstats'
>> > remain unchanged
>> > - uptime
>> > load average: 0.00, 0.00, 0.00
>> > - cat /sys/block/md9/md/safe_mode_delay
>> > 0.201
>> > - echo 0.1 > /sys/block/md9/md/safe_mode_delay
>> > array_state remains active
>> > - echo clean > /sys/block/md9/md/array_state
>> > echo: write error: Device or resource busy
>> > - reboot (with or without prior check)
>> > array_state clean
>> > - After reboot, array remains clean until some specific
>> > jobs put it in constant active state. Such jobs so far
>> > identified:
>> > - echo check > /sys/block/md9/md/sync_action
>> > - run an rsnapshot job
>> > - start a qemu/kvm vm
>> > - Other jobs, like text/doc editing, multimedia playback,
>> > etc retain array_state clean
>>
>> This bug was introduced by
>> Commit: 20d0189b1012 ("block: Introduce new bio_split()")
>> in 3.14, and fixed by
>> Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is
>> split") in 4.8.
>>
>> Maybe the latter patch should be sent to -stable ??
>>
>> NeilBrown
>
> NeilBrown, thank you for your swift and concise answer.
>
> I gather you are referring to kernel version numbers. The described
> behaviour was first noticed many months ago with kernel 2.6.37.6, and
> persisted after a system upgrade and kernel 4.4.38. However, after the
> upgrade two things were corrected, the timeout mismatch, and a
> Current_Pending_Sector in one of the drives; which may, or may not,
> explain the occurrence with the older kernel.
>
> Is this constant active state in the data array something to worry about
> and try kernel >= 4.8, or shall I let be?
The only important consequence of the constant active state is that if
your machine crashes at a moment when the array would otherwise have
been idle, then a resync will be needed after reboot. Without the
constant active state, that resync would not have been needed.
If you have a write-intent bitmap, this is not particularly relevant.
I cannot say how important it is to you to avoid a resync after a crash,
so I don't know if you should just let it be or not.
NeilBrown
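As a hedged illustration of the bitmap check Neil mentions: `mdadm -D` typically prints an "Intent Bitmap : Internal" line when one is configured (the line format is assumed from common mdadm output), and an internal bitmap can be added to a running array with `mdadm --grow`. The helper below only inspects text on stdin, so it works on either live or captured output; the privileged commands are shown as comments since they need root and a live array.

```shell
#!/bin/sh
# has_intent_bitmap: reads `mdadm -D`-style output on stdin and
# succeeds if a write-intent bitmap line is present.  The "Intent
# Bitmap" label is assumed from typical mdadm -D output.
has_intent_bitmap() {
    grep -q 'Intent Bitmap'
}

# Real usage (root, live array -- shown as comments only):
#   if ! mdadm -D /dev/md9 | has_intent_bitmap; then
#       mdadm --grow --bitmap=internal /dev/md9
#   fi
```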
* Re: constant array_state active after specific jobs
From: Shaohua Li @ 2017-03-27 18:08 UTC (permalink / raw)
To: NeilBrown; +Cc: pdi, linux-raid
On Fri, Mar 24, 2017 at 04:25:35PM +1100, Neil Brown wrote:
> On Thu, Mar 23 2017, pdi wrote:
>
> > Greetings all,
> >
> > The problem in a nutshell is that an array is clean after boot, until
> > some specific jobs switch it to active where it remains until reboot.
> >
> > A similar problem was discussed, and solved, in
> > https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT,
> > it is not the same issue.
> >
> > I would be grateful for any insights as to why this happens and/or how
> > to prevent it.
> >
> > The relevant info follows, please let me know if anything further might
> > help.
> >
> > Many thanks in advance.
> >
> > - uname -a
> > Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
> > Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
> > - mdadm -V
> > mdadm - v3.3.4 - 3rd August 2015
> > - Desktop drives without sct/erc,
> > with timeout mismatch correction as per
> > https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
> > - /dev/md9 is a raid10 array, 4 devices, far=2,
> > with various dirs used as samba and nfs shares
> > - The array is in *constant* array_state active
> > - mdadm -D /dev/md9 | grep 'State :'
> > State : active
> > - cat /sys/block/md9/md/array_state
> > active
> > - watch -d 'grep md9 /proc/diskstats'
> > remain unchanged
> > - uptime
> > load average: 0.00, 0.00, 0.00
> > - cat /sys/block/md9/md/safe_mode_delay
> > 0.201
> > - echo 0.1 > /sys/block/md9/md/safe_mode_delay
> > array_state remains active
> > - echo clean > /sys/block/md9/md/array_state
> > echo: write error: Device or resource busy
> > - reboot (with or without prior check)
> > array_state clean
> > - After reboot, array remains clean until some specific
> > jobs put it in constant active state. Such jobs so far
> > identified:
> > - echo check > /sys/block/md9/md/sync_action
> > - run an rsnapshot job
> > - start a qemu/kvm vm
> > - Other jobs, like text/doc editing, multimedia playback,
> > etc retain array_state clean
>
> This bug was introduced by
> Commit: 20d0189b1012 ("block: Introduce new bio_split()")
> in 3.14, and fixed by
> Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is split")
> in 4.8.
>
> Maybe the latter patch should be sent to -stable ??
Sure, looks suitable, will do it now.
Thanks,
Shaohua
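For reference, the usual mechanism for routing such a fix to -stable is a `Cc: stable` trailer in the commit message naming the earliest affected series. A sketch of what that trailer could look like here; the `# 3.14+` range is an assumption based on the versions Neil cites, since the bug entered in 3.14:

```
Cc: stable@vger.kernel.org # 3.14+
```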
* Re: constant array_state active after specific jobs
From: pdi @ 2017-03-28 13:44 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid, Shaohua Li
On Mon, 27 Mar 2017 09:42:29 +1100
NeilBrown <neilb@suse.com> wrote:
> On Fri, Mar 24 2017, pdi wrote:
>
> > On Fri, 24 Mar 2017 16:25:35 +1100
> > NeilBrown <neilb@suse.com> wrote:
> >
> >> On Thu, Mar 23 2017, pdi wrote:
> >>
> >> > Greetings all,
> >> >
> >> > The problem in a nutshell is that an array is clean after boot,
> >> > until some specific jobs switch it to active where it remains
> >> > until reboot.
> >> >
> >> > A similar problem was discussed, and solved, in
> >> > https://www.spinics.net/lists/raid/msg46450.html. However,
> >> > AFAICT, it is not the same issue.
> >> >
> >> > I would be grateful for any insights as to why this happens
> >> > and/or how to prevent it.
> >> >
> >> > The relevant info follows, please let me know if anything further
> >> > might help.
> >> >
> >> > Many thanks in advance.
> >> >
> >> > - uname -a
> >> > Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016
> >> > x86_64 Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel
> >> > GNU/Linux
> >> > - mdadm -V
> >> > mdadm - v3.3.4 - 3rd August 2015
> >> > - Desktop drives without sct/erc,
> >> > with timeout mismatch correction as per
> >> > https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
> >> > - /dev/md9 is a raid10 array, 4 devices, far=2,
> >> > with various dirs used as samba and nfs shares
> >> > - The array is in *constant* array_state active
> >> > - mdadm -D /dev/md9 | grep 'State :'
> >> > State : active
> >> > - cat /sys/block/md9/md/array_state
> >> > active
> >> > - watch -d 'grep md9 /proc/diskstats'
> >> > remain unchanged
> >> > - uptime
> >> > load average: 0.00, 0.00, 0.00
> >> > - cat /sys/block/md9/md/safe_mode_delay
> >> > 0.201
> >> > - echo 0.1 > /sys/block/md9/md/safe_mode_delay
> >> > array_state remains active
> >> > - echo clean > /sys/block/md9/md/array_state
> >> > echo: write error: Device or resource busy
> >> > - reboot (with or without prior check)
> >> > array_state clean
> >> > - After reboot, array remains clean until some specific
> >> > jobs put it in constant active state. Such jobs so far
> >> > identified:
> >> > - echo check > /sys/block/md9/md/sync_action
> >> > - run an rsnapshot job
> >> > - start a qemu/kvm vm
> >> > - Other jobs, like text/doc editing, multimedia playback,
> >> > etc retain array_state clean
> >>
> >> This bug was introduced by
> >> Commit: 20d0189b1012 ("block: Introduce new bio_split()")
> >> in 3.14, and fixed by
> >> Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is
> >> split") in 4.8.
> >>
> >> Maybe the latter patch should be sent to -stable ??
> >>
> >> NeilBrown
> >
> > NeilBrown, thank you for your swift and concise answer.
> >
> > I gather you are referring to kernel version numbers. The described
> > behaviour was first noticed many months ago with kernel 2.6.37.6,
> > and persisted after a system upgrade and kernel 4.4.38. However,
> > after the upgrade two things were corrected, the timeout mismatch,
> > and a Current_Pending_Sector in one of the drives; which may, or
> > may not, explain the occurrence with the older kernel.
> >
> > Is this constant active state in the data array something to worry
> > about and try kernel >= 4.8, or shall I let be?
>
> The only important consequence of the constant active state is that if
> your machine crashes at a moment when the array would otherwise have
> been idle, then a resync will be needed after reboot. Without the
> constant active state, that resync would not have been needed.
>
> If you have a write-intent bitmap, this is not particularly relevant.
>
> I cannot say how important it is to you to avoid a resync after a
> crash, so I don't know if you should just let it be or not.
>
> NeilBrown
NeilBrown,
Thank you for your clear explanation.
Best regards,
pdi