* IMSM RAID10: Rebuild/Resync difference on Linux vs Windows
@ 2017-02-22 13:41 Matthias Dahl
  2017-02-23 15:27 ` Artur Paszkiewicz
  0 siblings, 1 reply; 5+ messages in thread
From: Matthias Dahl @ 2017-02-22 13:41 UTC (permalink / raw)
  To: linux-raid

Hello @everyone,

I had an unclean shutdown today and the RAID was (as expected) out of
sync -- so with the next boot, Linux started a resync.

A while into the resync, I had to fire up Windows due to work, and the
Intel Rapid Storage Manager took over the rebuild as expected. But there
was a crucial difference: when I left Linux, the resync was at roughly
35% with another 2 hours or so to go, whereas IRSM showed 97% from the
the get-go and took just another 10 minutes or so to (apparently) finish
the job.

On Linux, the resync was running at 150 MiB/s to 200 MiB/s. So even
if Windows was syncing faster (i.e. in terms of transfer speed), there is
no way to account for a jump from ~35% to 97%.
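
Just as a rough sanity check (assuming the percentage is counted against
the ~1863 GiB per-disk sync area): 65% of 1863 GiB is about 1.2 TiB,
which at ~175 MiB/s is roughly two hours of work -- consistent with the
estimate above, and nowhere near ten minutes.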

That is a huge difference and it got me worried, since I have
unfortunately run into my fair share of bugs with IMSM on Linux.

Is this expected and normal behavior? Does IRSM on Windows use some
kind of optimization or shortcut during rebuild that is not implemented
on Linux? Or should this really not be happening, meaning I should
indeed be worried that the resync was not done properly?

This happened with kernel 4.9.5 and mdadm/mdmon 4.0 with a RAID10.

If you need any more information, please don't hesitate to ask and
I will gladly provide it and help figure this out.

Thanks in advance for any help.

So long,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu
  Hire me for contract work: Open Source, Proprietary, Web/Mobile/Desktop


* Re: IMSM RAID10: Rebuild/Resync difference on Linux vs Windows
  2017-02-22 13:41 IMSM RAID10: Rebuild/Resync difference on Linux vs Windows Matthias Dahl
@ 2017-02-23 15:27 ` Artur Paszkiewicz
  2017-02-27  9:30   ` Matthias Dahl
  2017-04-20  7:44   ` Matthias Dahl
  0 siblings, 2 replies; 5+ messages in thread
From: Artur Paszkiewicz @ 2017-02-23 15:27 UTC (permalink / raw)
  To: Matthias Dahl, linux-raid

On 02/22/2017 02:41 PM, Matthias Dahl wrote:
> Hello @everyone,
> 
> I had an unclean shutdown today and the RAID was (as expected) out of
> sync -- so with the next boot, Linux started a resync.
> 
> A while into the resync, I had to fire up Windows due to work, and the
> Intel Rapid Storage Manager took over the rebuild as expected. But there
> was a crucial difference: when I left Linux, the resync was at roughly
> 35% with another 2 hours or so to go, whereas IRSM showed 97% from the
> get-go and took just another 10 minutes or so to (apparently) finish
> the job.
> 
> On Linux, the resync was running at 150 MiB/s to 200 MiB/s. So even
> if Windows was syncing faster (i.e. in terms of transfer speed), there is
> no way to account for a jump from ~35% to 97%.
> 
> That is a huge difference and it got me worried, since I have
> unfortunately run into my fair share of bugs with IMSM on Linux.
> 
> Is this expected and normal behavior? Does IRSM on Windows use some
> kind of optimization or shortcut during rebuild that is not implemented
> on Linux? Or should this really not be happening, meaning I should
> indeed be worried that the resync was not done properly?
> 
> This happened with kernel 4.9.5 and mdadm/mdmon 4.0 with a RAID10.
> 
> If you need any more information, please don't hesitate to ask and
> I will gladly provide it and help figure this out.
> 
> Thanks in advance for any help.

Windows IRSM should not take any "shortcuts" in this case. If you
suspect that the resync was not completed properly in Windows, you can
run it manually by writing "repair" to /sys/block/<dev>/md/sync_action.
When it finishes, you can check md/mismatch_cnt to see if there were any
unsynchronized blocks.
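
For example (the md device name below is just a placeholder -- for IMSM
the volume usually shows up as a second md device inside the container,
e.g. md126):

  # replace md126 with your IMSM volume device
  echo repair > /sys/block/md126/md/sync_action

  # watch progress
  cat /proc/mdstat

  # once sync_action reads "idle" again, check how many mismatches
  # were found/corrected
  cat /sys/block/md126/md/mismatch_cnt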

Can you share the output of mdadm -E <raid disks> and mdadm --detail-platform?
What version of RST software are you using on Windows?

Thanks,
Artur


* Re: IMSM RAID10: Rebuild/Resync difference on Linux vs Windows
  2017-02-23 15:27 ` Artur Paszkiewicz
@ 2017-02-27  9:30   ` Matthias Dahl
  2017-04-20  7:44   ` Matthias Dahl
  1 sibling, 0 replies; 5+ messages in thread
From: Matthias Dahl @ 2017-02-27  9:30 UTC (permalink / raw)
  To: Artur Paszkiewicz, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1190 bytes --]

Hello Artur...

On 23/02/17 16:27, Artur Paszkiewicz wrote:

> Windows IRSM should not make any "shortcuts" in this case.

I expected and dreaded that answer equally. ;-)

> [...] When it finishes you can check md/mismatch_cnt if there were any
> unsynchronized blocks.

I did a complete resync on Linux and mismatch_cnt stayed at zero, which
is nice to know, but it unfortunately does not change the fact that there
was a huge jump in sync completion going from Linux to Windows the last
time around. :(

Knowing almost nothing about the on-disk metadata structure here, I'd
guess that ending up with an in-sync RAID could also be down to pure
luck in this case.

> Can you share the output of mdadm -E <raid disks> and
> mdadm --detail-platform?

It is attached. If there is anything else you need, please let me know.
The data was pulled, btw, before I did the "test resync" above -- if
that is relevant to you.

> What version of RST software are you using on Windows?

14.8.0.1042 on Win10 Pro (x86_64)

Thanks,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu
  Hire me for contract work: Open Source, Proprietary, Web/Mobile/Desktop

[-- Attachment #2: raid-details.txt --]
[-- Type: text/plain, Size: 5901 bytes --]

       Platform : Intel(R) Rapid Storage Technology
        Version : 14.6.0.2285
    RAID Levels : raid0 raid1 raid10 raid5
    Chunk Sizes : 4k 8k 16k 32k 64k 128k
    2TB volumes : supported
      2TB disks : supported
      Max Disks : 7
    Max Volumes : 2 per array, 4 per controller
 I/O Controller : /sys/devices/pci0000:00/0000:00:17.0 (SATA)
          Port5 : /dev/sdd (Z1X69GC2)
          Port3 : /dev/sdb (Z1X6VEEQ)
          Port4 : /dev/sdc (Z1X69HB2)
          Port1 : - non-disk device (PIONEER BD-RW   BDR-S09) -
          Port2 : /dev/sda (Z1X6V5JZ)
          Port0 : - no device attached -

/dev/sda:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.3.00
    Orig Family : 8bf33aa4
         Family : 8bf33aa4
     Generation : 00312a2c
     Attributes : All supported
           UUID : 9c45a5a7:83c364db:f1e8b47c:e9aee53f
       Checksum : 65c084b8 correct
    MPB Sectors : 2
          Disks : 4
   RAID Devices : 1

  Disk00 Serial : Z1X6V5JZ
          State : active
             Id : 00000002
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

[MainR10]:
           UUID : 92adf4b0:129c3b94:9a2b5610:d7c8c5d5
     RAID Level : 10
        Members : 4
          Slots : [UUUU]
    Failed disk : none
      This Slot : 0
     Array Size : 7814047744 (3726.03 GiB 4000.79 GB)
   Per Dev Size : 3907024136 (1863.01 GiB 2000.40 GB)
  Sector Offset : 0
    Num Stripes : 15261812
     Chunk Size : 64 KiB
       Reserved : 0
  Migrate State : idle
      Map State : normal
    Dirty State : dirty

  Disk01 Serial : Z1X6VEEQ
          State : active
             Id : 00000003
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk02 Serial : Z1X69HB2
          State : active
             Id : 00000004
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk03 Serial : Z1X69GC2
          State : active
             Id : 00000005
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

/dev/sdb:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.3.00
    Orig Family : 8bf33aa4
         Family : 8bf33aa4
     Generation : 00312a2c
     Attributes : All supported
           UUID : 9c45a5a7:83c364db:f1e8b47c:e9aee53f
       Checksum : 65c084b8 correct
    MPB Sectors : 2
          Disks : 4
   RAID Devices : 1

  Disk01 Serial : Z1X6VEEQ
          State : active
             Id : 00000003
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

[MainR10]:
           UUID : 92adf4b0:129c3b94:9a2b5610:d7c8c5d5
     RAID Level : 10
        Members : 4
          Slots : [UUUU]
    Failed disk : none
      This Slot : 1
     Array Size : 7814047744 (3726.03 GiB 4000.79 GB)
   Per Dev Size : 3907024136 (1863.01 GiB 2000.40 GB)
  Sector Offset : 0
    Num Stripes : 15261812
     Chunk Size : 64 KiB
       Reserved : 0
  Migrate State : idle
      Map State : normal
    Dirty State : dirty

  Disk00 Serial : Z1X6V5JZ
          State : active
             Id : 00000002
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk02 Serial : Z1X69HB2
          State : active
             Id : 00000004
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk03 Serial : Z1X69GC2
          State : active
             Id : 00000005
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

/dev/sdc:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.3.00
    Orig Family : 8bf33aa4
         Family : 8bf33aa4
     Generation : 00312a2e
     Attributes : All supported
           UUID : 9c45a5a7:83c364db:f1e8b47c:e9aee53f
       Checksum : 65c084ba correct
    MPB Sectors : 2
          Disks : 4
   RAID Devices : 1

  Disk02 Serial : Z1X69HB2
          State : active
             Id : 00000004
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

[MainR10]:
           UUID : 92adf4b0:129c3b94:9a2b5610:d7c8c5d5
     RAID Level : 10
        Members : 4
          Slots : [UUUU]
    Failed disk : none
      This Slot : 2
     Array Size : 7814047744 (3726.03 GiB 4000.79 GB)
   Per Dev Size : 3907024136 (1863.01 GiB 2000.40 GB)
  Sector Offset : 0
    Num Stripes : 15261812
     Chunk Size : 64 KiB
       Reserved : 0
  Migrate State : idle
      Map State : normal
    Dirty State : dirty

  Disk00 Serial : Z1X6V5JZ
          State : active
             Id : 00000002
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk01 Serial : Z1X6VEEQ
          State : active
             Id : 00000003
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk03 Serial : Z1X69GC2
          State : active
             Id : 00000005
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

/dev/sdd:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.3.00
    Orig Family : 8bf33aa4
         Family : 8bf33aa4
     Generation : 00312a2f
     Attributes : All supported
           UUID : 9c45a5a7:83c364db:f1e8b47c:e9aee53f
       Checksum : 65bf84bb correct
    MPB Sectors : 2
          Disks : 4
   RAID Devices : 1

  Disk03 Serial : Z1X69GC2
          State : active
             Id : 00000005
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

[MainR10]:
           UUID : 92adf4b0:129c3b94:9a2b5610:d7c8c5d5
     RAID Level : 10
        Members : 4
          Slots : [UUUU]
    Failed disk : none
      This Slot : 3
     Array Size : 7814047744 (3726.03 GiB 4000.79 GB)
   Per Dev Size : 3907024136 (1863.01 GiB 2000.40 GB)
  Sector Offset : 0
    Num Stripes : 15261812
     Chunk Size : 64 KiB
       Reserved : 0
  Migrate State : idle
      Map State : normal
    Dirty State : clean

  Disk00 Serial : Z1X6V5JZ
          State : active
             Id : 00000002
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk01 Serial : Z1X6VEEQ
          State : active
             Id : 00000003
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)

  Disk02 Serial : Z1X69HB2
          State : active
             Id : 00000004
    Usable Size : 3907024136 (1863.01 GiB 2000.40 GB)


* Re: IMSM RAID10: Rebuild/Resync difference on Linux vs Windows
  2017-02-23 15:27 ` Artur Paszkiewicz
  2017-02-27  9:30   ` Matthias Dahl
@ 2017-04-20  7:44   ` Matthias Dahl
  2017-04-20 10:06     ` Artur Paszkiewicz
  1 sibling, 1 reply; 5+ messages in thread
From: Matthias Dahl @ 2017-04-20  7:44 UTC (permalink / raw)
  To: Artur Paszkiewicz, linux-raid

Hello Artur...

So, the same thing happened again yesterday: the RAID10 was out of sync
due to an unclean shutdown, and while Linux was at 70.8% of the resync, I
rebooted into Windows 10 with RST 15.2.0.1020, which recognized the RAID
as clean without any visible/noticeable verification at all.

I did a manual verification run in Win this time, just for the sake of
it, and it came up fine.

Nevertheless, this really has me worried. Did you get / see my last mail
with the information you requested?

By the way, kernel version was 4.10.8 with mdadm/mdmon 4.0.

Thanks,
Matthias

-- 
Dipl.-Inf. (FH) Matthias Dahl | Software Engineer | binary-island.eu
 Hire me for contract work: Open Source, Proprietary, Web/Mobile/Desktop


* Re: IMSM RAID10: Rebuild/Resync difference on Linux vs Windows
  2017-04-20  7:44   ` Matthias Dahl
@ 2017-04-20 10:06     ` Artur Paszkiewicz
  0 siblings, 0 replies; 5+ messages in thread
From: Artur Paszkiewicz @ 2017-04-20 10:06 UTC (permalink / raw)
  To: Matthias Dahl, linux-raid

On 04/20/2017 09:44 AM, Matthias Dahl wrote:
> Hello Artur...
> 
> So, the same thing happened again yesterday: the RAID10 was out of sync
> due to an unclean shutdown, and while Linux was at 70.8% of the resync, I
> rebooted into Windows 10 with RST 15.2.0.1020, which recognized the RAID
> as clean without any visible/noticeable verification at all.
> 
> I did a manual verification run in Win this time, just for the sake of
> it, and it came up fine.
> 
> Nevertheless, this really has me worried. Did you get / see my last mail
> with the information you requested?
> 
> By the way, kernel version was 4.10.8 with mdadm/mdmon 4.0.

Hi Matthias,

This really is strange; I'll try to ask someone from the Windows RST
team about it. Unfortunately, I don't have any knowledge of the Windows
driver. I saw your earlier email and everything looked OK there.

Does this happen only when you reboot into Windows? Have you ever
rebooted back into Linux during a resync, and did the resync resume as
expected? Also, have you noticed what state the BIOS showed for the
array when you rebooted? I'm asking because this might not be caused by
Windows but rather by the BIOS/UEFI driver, or by Linux not updating the
metadata properly at shutdown.
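
If it happens again, it might also help to capture the resync progress
and metadata state right before rebooting and again after coming back to
Linux -- something along these lines:

  # resync progress as seen by md
  cat /proc/mdstat

  # per-disk IMSM metadata state (the fields from your mdadm -E output;
  # these are the member disks from your attachment)
  for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
      mdadm -E $d | grep -E 'Generation|Migrate State|Map State|Dirty State'
  done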

Thanks,
Artur


