* shown disk sizes
From: Christoph Anton Mitterer @ 2013-07-17 14:37 UTC
  To: linux-raid


Hi Neil, et al...

(btw: do we have an issue tracker somewhere?)

I was experimenting a bit... created two GPT partitions of exactly 10 GiB
each (i.e. 20971520 sectors of 512 B).
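(Partition size is easily double-checked with e.g.

  blockdev --getsz /dev/sda1

which counts in 512 B sectors and should print 20971520 here.)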

Created a raid 1 on them:
mdadm --create /dev/md/data --verbose --metadata=1.2 --raid-devices=2
--spare-devices=0 --size=max --chunk=32 --level=raid1 --bitmap=internal
--name=data /dev/sda1 /dev/sdb1

The size is a multiple of the 32 KiB chunk size, so no rounding effects
should kick in.


Now:
--examine gives for both devices:
Avail Dev Size : 20969472 (10.00 GiB 10.74 GB)
     Array Size : 20969328 (10.00 GiB 10.74 GB)
  Used Dev Size : 20969328 (10.00 GiB 10.74 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors

=> Avail is the available payload size on each component device,... so
given that we have the first 2048S for the superblock/bitmap/etc... that
fits exactly.
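(I.e. 20971520 S (partition) - 2048 S (data offset) = 20969472 S... matches
exactly.)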

=> Why is the array size / used dev size smaller?



--detail gives:
     Array Size : 10484664 (10.00 GiB 10.74 GB)
  Used Dev Size : 10484664 (10.00 GiB 10.74 GB)

=> That's half of the Array Size from above? Is that a bug?


--query gives yet another value:
/dev/md/data: 9.100GiB raid1 2 devices, 0 spares. Use mdadm --detail for
more detail.
=> But the device really seems to have 20969328 S... so the 9.1 GiB looks
a bit bogus as well?
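(For comparison: 20969328 S * 512 B = 10736295936 B = ~9.999 GiB = ~10.74 GB,
so nowhere near 9.1 GiB.)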


Last but not least... when the tools print values like "10.00 GiB 10.74
GB"... wouldn't it be better if they printed "~10.00 GiB ~10.74 GB" or
something like this to show that the values are rounded and not
_exactly_ 10 GiB... could be helpful to avoid misalignment issues.


Cheers,
Chris.


* Re: shown disk sizes
From: NeilBrown @ 2013-07-17 23:43 UTC
  To: Christoph Anton Mitterer; +Cc: linux-raid


On Wed, 17 Jul 2013 16:37:47 +0200 Christoph Anton Mitterer
<calestyo@scientia.net> wrote:

> Hi Neil, et al...
> 
> (btw: do we have an issue tracker somewhere?)

Yes - this mailing list.
The protocol is that as long as you care about the issue and haven't had
satisfactory response, you post a "Can anyone help with this" every week or
so.
That way we don't have a problem with lots of stale entries that no-one cares
about.

> 
> I was experimenting a bit... created two GPT partitions of exactly 10 GiB
> each (i.e. 20971520 sectors of 512 B).
> 
> Created a raid 1 on them:
> mdadm --create /dev/md/data --verbose --metadata=1.2 --raid-devices=2
> --spare-devices=0 --size=max --chunk=32 --level=raid1 --bitmap=internal
> --name=data /dev/sda1 /dev/sdb1
> 
> The size is a multiple of the 32 KiB chunk size, so no rounding effects
> should kick in.

chunksize is not very meaningful for RAID1.  If you add '-v' mdadm should
tell you:
  mdadm: chunk size ignored for this level

Maybe it should round the size down to a multiple of the given chunk size,
but as you said "--size=max", maybe not... Not sure.

> 
> 
> Now:
> --examine gives for both devices:
> Avail Dev Size : 20969472 (10.00 GiB 10.74 GB)
>      Array Size : 20969328 (10.00 GiB 10.74 GB)
>   Used Dev Size : 20969328 (10.00 GiB 10.74 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
> 
> => Avail is the available payload size on each component device,... so
> given that we have the first 2048S for the superblock/bitmap/etc... that
> fits exactly.
> 
> => Why is the array size / used dev size smaller?

Good question.  Not easy to answer ... it is rather convoluted.  Different
bits of code try to reserve space for things differently and they don't end
up agreeing.  I might try to simplify that.

> 
> 
> 
> --detail gives:
>      Array Size : 10484664 (10.00 GiB 10.74 GB)
>   Used Dev Size : 10484664 (10.00 GiB 10.74 GB)
> 
> => That's half of the Array Size from above? Is that a bug?

The number is in K rather than sectors.  Sorry :-(
The numbers in brackets, which have units, match.
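(So 10484664 K * 2 = 20969328 sectors - the same value --examine shows.)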

> 
> 
> --query gives yet another value:
> /dev/md/data: 9.100GiB raid1 2 devices, 0 spares. Use mdadm --detail for
> more detail.
> => But the device really seems to have 20969328 S... so the 9.1 GiB looks
> a bit bogus as well?

I think this is fixed in 3.3-rc1 by commit 570abc6f3881b5152cb1244

http://git.neil.brown.name/git?p=mdadm.git;a=commitdiff;h=570abc6f3881b5152cb1244

> 
> 
> Last but not least... when the tools print values like "10.00 GiB 10.74
> GB"... wouldn't it be better if they printed "~10.00 GiB ~10.74 GB" or
> something like this to show that the values are rounded and not
> _exactly_ 10 GiB... could be helpful to avoid misalignment issues.

Would that really help?  Given that 10.00GiB is not exactly the same as
10.74GB, isn't it obvious that they must be approximations?

I'm not exactly against adding '~' but it doesn't seem necessary.
Does anyone else have thoughts?

Thanks,
NeilBrown


* Re: shown disk sizes
From: Christoph Anton Mitterer @ 2013-07-18  0:39 UTC
  To: linux-raid


On Thu, 2013-07-18 at 09:43 +1000, NeilBrown wrote:
> Yes - this mailing list.
> The protocol is that as long as you care about the issue and haven't had
> satisfactory response, you post a "Can anyone help with this" every week or
> so.
> That way we don't have a problem with lots of stale entries that no-one cares
> about.
okay... guess I need to track my ideas for additions to the
documentation (like the ones below) somewhere else, then... =)


> chunksize is not very meaningful for RAID1.  If you add '-v' mdadm should
> tell you:
>   mdadm: chunk size ignored for this level
Yeah... sure... I just made some tests and re-used the history from a
previous raid6 and didn't remove the useless stuff ;)


> Maybe it should round the size down to a multiple of the given chunk size,
> but as you said "--size=max", maybe not... Not sure.
Interesting question... I'll try that later... maybe something we can
add to the manpage as well.


> > => Why is the array size / used dev size smaller?
> 
> Good question.  Not easy to answer ... it is rather convoluted.  Different
> bits of code try to reserve space for things differently and they don't end
> up agreeing.  I might try to simplify that.
Okay... I have no idea what you're talking about ;-)
It seems to me it's always 144 sectors that are "missing"...

What do you mean by simplify?


> > --detail gives:
> >      Array Size : 10484664 (10.00 GiB 10.74 GB)
> >   Used Dev Size : 10484664 (10.00 GiB 10.74 GB)
> > 
> > => That's half of the Array Size from above? Is that a bug?
> The number is in K rather than sectors.  Sorry :-(
You know that these are the reasons why kernel developers may end up in
hell?! ;-P

okay... I guess again something for the documentation... or would you
see a problem with changing the output to at least include the unit?
Like
"Array Size (KiB)" or so?


> Would that really help?  Given that 10.00GiB is not exactly the same as
> 10.74GB, isn't it obvious that they must be approximations?
> 
> I'm not exactly against adding '~' but it doesn't seem necessary.
> Does anyone else have thoughts?
I don't think it's strictly necessary either... but I guess it would be
cleaner...


Cheers,
Chris.


* Re: shown disk sizes
From: Mikael Abrahamsson @ 2013-07-18  7:35 UTC
  To: NeilBrown; +Cc: Christoph Anton Mitterer, linux-raid

On Thu, 18 Jul 2013, NeilBrown wrote:

>> Last but not least... when the tools print values like "10.00 GiB 10.74
>> GB"... wouldn't it be better if they printed "~10.00 GiB ~10.74 GB" or
>> something like this to show that the values are rounded and not
>> _exactly_ 10 GiB... could be helpful to avoid misalignment issues.
>
> Would that really help?  Given that 10.00GiB is not exactly the same as
> 10.74GB, isn't it obvious that they must be approximations?
>
> I'm not exactly against adding '~' but it doesn't seem necessary.
> Does anyone else have thoughts?

When I studied rounding I was taught that 10.74 implies a value anywhere
between 10.735000000... and 10.744999999... Same goes for 10.00, which can
be anywhere between 9.995000000... and 10.004999999...

As long as the value numbers are correctly rounded, I don't feel there is 
a need to put the "~" in there. 10.74GB doesn't imply 10.7400000000000.
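For instance, the 10.74 GB shown above stands for an exact 10.736 GB
(20969328 S * 512 B = 10736295936 B) - well inside those bounds.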

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: shown disk sizes
From: Christoph Anton Mitterer @ 2013-07-23 16:26 UTC
  To: NeilBrown; +Cc: linux-raid


On Thu, 2013-07-18 at 09:43 +1000, NeilBrown wrote: 
> > --examine gives for both devices:
> > Avail Dev Size : 20969472 (10.00 GiB 10.74 GB)
> >      Array Size : 20969328 (10.00 GiB 10.74 GB)
> >   Used Dev Size : 20969328 (10.00 GiB 10.74 GB)
> >     Data Offset : 2048 sectors
> >    Super Offset : 8 sectors
> > 
> > => Avail is the available payload size on each component device,... so
> > given that we have the first 2048S for the superblock/bitmap/etc... that
> > fits exactly.
> > 
> > => Why is the array size / used dev size smaller?
> 
> Good question.  Not easy to answer ... it is rather convoluted.  Different
> bits of code try to reserve space for things differently and they don't end
> up agreeing.  I might try to simplify that.


I played around a bit more here... and got even more confused:

Made two files for losetup:
-rw-r--r--  1 root root 524288000 Jul 23 17:41 image1
-rw-r--r--  1 root root 524288000 Jul 23 17:41 image2
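(Created and attached roughly like this - just a sketch, the loop numbers
may of course differ:

  truncate -s 500M image1    # 500 MiB = 524288000 bytes
  truncate -s 500M image2
  losetup /dev/loop0 image1
  losetup /dev/loop1 image2
)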

And a RAID1 out of it:
mdadm --create /dev/md/raid3 --verbose --metadata=1.2 --size=max
--level=raid1  --name=raid3  --raid-devices=2 /dev/loop0 /dev/loop1

Examine says:
# mdadm --examine /dev/loop0
/dev/loop0:
Avail Dev Size : 1023488 (499.83 MiB 524.03 MB)
     Array Size : 511680 (499.77 MiB 523.96 MB)
  Used Dev Size : 1023360 (499.77 MiB 523.96 MB)
    Data Offset : 512 sectors
   Super Offset : 8 sectors


Fine, so 524288000 B / 512 - 512 S = exactly the 1023488 S Avail Dev Size.
Array Size is in 1K and Used Dev Size in S, so these are identical.
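Spelled out, using only the numbers above:

  524288000 B / 512        = 1024000 S   (image size)
  1024000 S - 512 S        = 1023488 S   = Avail Dev Size
  511680 K * 2             = 1023360 S   = Used Dev Size
  1023488 S - 1023360 S    =     128 S   "missing" at the end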

Questions
1) How does mdadm choose the data alignment? It seems to also use
completely "odd" numbers like 262144 sectors

2) Again, some sectors (here 128 S) are missing... :(


Now for some fun:
cat /dev/md/raid3 > image
-rw-r--r--  1 root root 523960320 Jul 23 17:43 image
=> which is just the Array Size / Used Dev Size... hurray...
I edited that file with a hex editor, making the following changes:
# hd i
00000000  43 41 4c 45 53 54 59 4f  5f 42 45 47 49 4e 00 00  |CALESTYO_BEGIN..|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f3afff0  00 00 00 00 43 41 4c 45  53 54 59 4f 5f 45 4e 44  |....CALESTYO_END|
1f3b0000

and wrote it back (cat image > /dev/md/raid3).

Of course I couldn't resist stopping the raid and directly reading the
losetup image files:
# hd image1 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  fc 4e 2b a9 01 00 00 00  00 00 00 00 00 00 00 00  |.N+.............|
00001010  99 05 68 13 83 97 70 53  bd 98 b6 9e 04 9c 1a 79  |..h...pS.......y|
00001020  6c 63 67 2d 6c 72 7a 2d  70 75 70 70 65 74 3a 72  |lcg-lrz-puppet:r|
00001030  61 69 64 33 00 00 00 00  00 00 00 00 00 00 00 00  |aid3............|
00001040  ed a2 ee 51 00 00 00 00  01 00 00 00 00 00 00 00  |...Q............|
00001050  80 9d 0f 00 00 00 00 00  00 00 00 00 02 00 00 00  |................|
00001060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001080  00 02 00 00 00 00 00 00  00 9e 0f 00 00 00 00 00  |................|
00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000010a0  00 00 00 00 00 00 00 00  d4 58 14 b3 8c 07 e4 88  |.........X......|
000010b0  fa f7 61 90 83 2b 4c 0d  00 00 00 00 00 00 00 00  |..a..+L.........|
000010c0  2c a4 ee 51 00 00 00 00  13 00 00 00 00 00 00 00  |,..Q............|
000010d0  ff ff ff ff ff ff ff ff  c3 52 2b 16 80 00 00 00  |.........R+.....|
000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001100  00 00 01 00 fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
*
00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  62 69 74 6d 04 00 00 00  35 c5 62 da cf d2 21 ea  |bitm....5.b...!.|
00002010  1a e5 49 74 92 7b 49 2e  00 00 00 00 00 00 00 00  |..It.{I.........|
00002020  00 00 00 00 00 00 00 00  80 9d 0f 00 00 00 00 00  |................|
00002030  00 00 00 00 00 00 00 04  05 00 00 00 00 00 00 00  |................|
00002040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00002200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00040000  43 41 4c 45 53 54 59 4f  5f 42 45 47 49 4e 00 00  |CALESTYO_BEGIN..|
00040010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f3efff0  00 00 00 00 43 41 4c 45  53 54 59 4f 5f 45 4e 44  |....CALESTYO_END|
1f3f0000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f400000

(they are the same (especially the addresses), except for some places in
the header)

Okay here we go: 0x00040000 = 262144 B = 512 S... hurray... the data
starts at the data offset (who'd have expected this? :P)

The total size: 0x1f400000 = 524288000 B = 1024000 S, which is just the
size of my losetup image files

Payload size: 0x1f400000 - 0x00040000 = 524025856 B = 1023488 S, which is
again the Avail Dev Size... (i.e. NOT the array size)

And 0x1f400000 - 0x1f3f0000 = 65536 B are just the "missing" 128 S.
Yeehaw...

3) Stupid question... are these 128 S kept free for the 0.9/1.0
superblock-at-the-end disease? ;)
If so... I'd have expected that region to be at least 64 KiB and at most
128 KiB large... but I've also had this:
# mdadm --examine /dev/loop3
/dev/loop3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 968d9b80:d8714965:c3cb34cd:4d2952e6
           Name : lcg-lrz-puppet:raid2  (local to host lcg-lrz-puppet)
  Creation Time : Tue Jul 23 17:15:23 2013
     Raid Level : raid1
   Raid Devices : 2

Avail Dev Size : 1048313856 (499.88 GiB 536.74 GB)
     Array Size : 524156736 (499.87 GiB 536.74 GB)
  Used Dev Size : 1048313472 (499.87 GiB 536.74 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : bbcfa311:0e61c19a:5eafe78e:f53b1c15

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Jul 23 17:15:23 2013
       Checksum : 23cbc3d7 - correct
         Events : 0

=> and there we have 384 S...



I was working on a spreadsheet which, starting from a few variables like
the first sector of the MD component device's partition and so on... and
especially the desired (usable) array size... should give one the
necessary size for the component device.
Obviously, as long as I can't calculate how the size of the superblock
area comes together... I have no real chance.
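From the three examples so far the relation seems to be (my guess, not
taken from any documentation):

  component size = data offset + Used Dev Size + end reserve

with an end reserve of 144 S, 128 S and 384 S in the cases above... so
apparently not a fixed constant either.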


Cheers,
Chris.


* Re: shown disk sizes
From: Mikael Abrahamsson @ 2013-07-23 17:12 UTC
  To: Christoph Anton Mitterer; +Cc: NeilBrown, linux-raid

On Tue, 23 Jul 2013, Christoph Anton Mitterer wrote:

> Questions
> 1) How does mdadm choose the data alignment? It seems to also use
> completely "odd" numbers like 262144 sectors

2^18=262144

Nothing odd about it.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: shown disk sizes
From: Christoph Anton Mitterer @ 2013-07-23 17:22 UTC
  To: linux-raid


On Tue, 2013-07-23 at 19:12 +0200, Mikael Abrahamsson wrote: 
> 2^18=262144
> Nothing odd about it.
Sure... it's not e or pi ;-)

"odd" in the sense of "why such a big offset?", or rather "what does
it try to align to?"


Chris.


* Re: shown disk sizes
From: NeilBrown @ 2013-07-23 20:57 UTC
  To: Christoph Anton Mitterer; +Cc: linux-raid


On Tue, 23 Jul 2013 19:22:06 +0200 Christoph Anton Mitterer
<calestyo@scientia.net> wrote:

> On Tue, 2013-07-23 at 19:12 +0200, Mikael Abrahamsson wrote: 
> > 2^18=262144
> > Nothing odd about it.
> Sure... it's not e or pi ;-)
> 
> "odd" in the sense of "why such a big offset?" respectively "what does
> it try to align to?"
>

It isn't (all) alignment.  It is mostly spare space.

New feature in 3.3 is that when you reshape (e.g.) a RAID5 to a RAID6 it can
do so without using a "backup file" - which are a pain to work with and
slow things down a lot.
What it does instead is move the "data_offset" towards the start of the
device.  That way it is never writing onto live data, and so no backup is
needed.

For this to work, we need a buffer at the start of the device.  128M is
plenty big enough and just a tiny fraction of a 1TB drive.  (If you only have
a small drive it will still pick a tiny fraction).

So we are reserving a bit of space for future flexibility.
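As a sketch (assuming a 3-device RAID5 on /dev/md0 with a spare already
added), such a reshape is just:

  mdadm --grow /dev/md0 --level=6 --raid-devices=4

and on 3.3 it should no longer need --backup-file.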

NeilBrown


* Re: shown disk sizes
From: Christoph Anton Mitterer @ 2013-07-23 22:10 UTC
  To: linux-raid


On Wed, 2013-07-24 at 06:57 +1000, NeilBrown wrote:
> New feature in 3.3 is that when you reshape (e.g.) a RAID5 to a RAID6 it can
> do so without using a "backup file" - which are a pain to work with and
> slow things down a lot.
> What it does instead is move the "data_offset" towards the start of the
> device.  That way it is never writing onto live data, and so no backup is
> needed.
So you are stealing my precious space, which I was paying soo much money
for?! ;-)

Seriously,... sounds like a good idea... :)
But why does it also increase the "gap" at the end? Guess the whole
"gap"-at-the-end thingy is something we will never ever really get rid
of, will we? I mean, the few missing sectors aren't a problem as such...
it just makes life more complicated when doing manual geometry
calculations...


While talking about data_offset... is it planned to implement a way to
specify the data_offset manually, and/or a data_offset_offset (i.e. an
additional offset on top of what mdadm would have autodetected)...
similar to what LVM allows?

I mean the practical relevance is probably not that much... but manual
specification would probably be needed whenever you have some more complex
block layer (e.g. LVM), another MD or a hardware RAID below MD... and
you want to align to the stripes... or unusually large chunk sizes...

I've seen that you have some branch where you can specify the offset for
each device... but... guess that was only for some guys doing recovery
work?!



Thanks,
Chris.


* Re: shown disk sizes
From: NeilBrown @ 2013-07-23 22:27 UTC
  To: Christoph Anton Mitterer; +Cc: linux-raid


On Wed, 24 Jul 2013 00:10:31 +0200 Christoph Anton Mitterer
<calestyo@scientia.net> wrote:

> On Wed, 2013-07-24 at 06:57 +1000, NeilBrown wrote:
> > New feature in 3.3 is that when you reshape (e.g.) a RAID5 to a RAID6 it can
> > do so without using a "backup file" - which are a pain to work with and
> > slow things down a lot.
> > What it does instead is move the "data_offset" towards the start of the
> > device.  That way it is never writing onto live data, and so no backup is
> > needed.
> So you are stealing my precious space, which I was paying soo much money
> for?! ;-)
> 
> Seriously,... sounds like a good idea... :)
> But why does it also increase the "gap" at the end? Guess the whole
> "gap"-at-the-end thingy is something we will never ever really get rid
> of, will we? I mean, the few missing sectors aren't a problem as such...
> it just makes life more complicated when doing manual geometry
> calculations...

The "gap at the end" is due to clumsy coding that I indicated earlier that I
would try to find time to tidy up.

> 
> 
> While talking about data_offset... is it planned to implement a way to
> specify the data_offset manually, and/or a data_offset_offset (i.e. an
> additional offset on top of what mdadm would have autodetected)...
> similar to what LVM allows?

You mean like a "--data-offset" option to --create and --grow?  Yes mdadm
3.3 has that.
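For example (just a sketch - the bare number should be in kilobytes, or
append a K/M/G suffix):

  mdadm --create /dev/md0 --metadata=1.2 --level=1 --raid-devices=2 \
        --data-offset=128M /dev/sda1 /dev/sdb1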

> 
> I mean the practical relevance is probably not that much... but manual
> specification would probably be needed whenever you have some more complex
> block layer (e.g. LVM), another MD or a hardware RAID below MD... and
> you want to align to the stripes... or unusually large chunk sizes...
> 
> I've seen that you have some branch where you can specify the offset for
> each device... but... guess that was only for some guys doing recovery
> work?!

It is now part of mainline.

NeilBrown


