* Slow mounting raid1
From: Leonidas Spyropoulos @ 2017-07-31 18:30 UTC
  To: linux-btrfs

Hello,

I have a btrfs raid1 setup on an array of two HDDs. The fstab entry
has the following mount settings:
  # cat /etc/fstab | grep raid1
  UUID=c9db91e6-0ba8-4ae6-b471-8fd4ff7ee72d /media/raid1 btrfs rw,relatime,compress=lzo,space_cache 0 0

When I mount the array it consistently takes about 5 seconds:
  # time umount /media/raid1
  
  real    0m0.358s
  user    0m0.010s
  sys     0m0.010s
  # time mount /media/raid1
  
  real    0m5.605s
  user    0m0.504s
  sys     0m0.071s

I have had this setup for some time now, and since I created it the
mount time has gone up (I notice it at boot). When I first built it,
mounting was almost instant. In terms of maintenance, I run a scrub
regularly and a rebalance every now and then.

Running kernel 4.11.12 (with -ck patches)

Is there something I can do to speed it up (apart from buying 2 SSDs
:D)? I feel like I'm missing something, as the array isn't used very
often - mainly just for backups.

Thanks for your time.

-- 
Leonidas Spyropoulos

A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?



* Re: Slow mounting raid1
From: Duncan @ 2017-08-01  1:12 UTC
  To: linux-btrfs

Leonidas Spyropoulos posted on Mon, 31 Jul 2017 19:30:47 +0100 as
excerpted:

> Hello,
> 
> I have a btrfs raid1 setup on an array of two HDDs. The fstab entry
> has the following mount settings:
>   # cat /etc/fstab | grep raid1
>   UUID=c9db91e6-0ba8-4ae6-b471-8fd4ff7ee72d /media/raid1 btrfs
>   rw,relatime,compress=lzo,space_cache 0 0

If you're doing any snapshotting, you almost certainly want noatime, not 
the default relatime.  Even without snapshotting, and regardless of the 
filesystem (tho on btrfs it's a bigger factor due to COW), noatime is a 
recommended performance optimization.
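
For example, your fstab line above with noatime swapped in for relatime 
would be:

  UUID=c9db91e6-0ba8-4ae6-b471-8fd4ff7ee72d /media/raid1 btrfs rw,noatime,compress=lzo,space_cache 0 0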

The biggest caveat is if you're running something that actually depends 
on atime.  Few if any modern applications do; mutt, in some 
configurations, is an older application that still does.

Tho I haven't the foggiest whether it'd affect your mount times...

(FWIW I have a number of pair-device btrfs raid1s, but I'm all-ssd these 
days and mounts seem to be fast enough here.)

> When I mount the array it consistently takes about 5 seconds:
>   # time umount /media/raid1
> 
>   real    0m0.358s
>   user    0m0.010s
>   sys     0m0.010s
>   # time mount /media/raid1
> 
>   real    0m5.605s
>   user    0m0.504s
>   sys     0m0.071s
> 
> I have had this setup for some time now, and since I created it the
> mount time has gone up (I notice it at boot). When I first built it,
> mounting was almost instant. In terms of maintenance, I run a scrub
> regularly and a rebalance every now and then.
> 
> Running kernel 4.11.12 (with -ck patches)
> 
> Is there something I can do to speed it up (apart from buying 2 SSDs
> :D)? I feel like I'm missing something, as the array isn't used very
> often - mainly just for backups.

Is there anything suspect in dmesg during the mount?  What does smartctl 
say about the health of the devices?  (smartctl -AH at least, the selftest 
data is unlikely to be useful unless you actually run the selftests.)
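
If nothing jumps out, one way to isolate what's logged during the mount 
(a rough sketch; --clear assumes a util-linux dmesg):

  # dmesg --clear
  # mount /media/raid1
  # dmesg    # whatever was logged since the clear, i.e. during the mount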

To my awareness there are a few things that can affect mount speed, tho 
as I said I'm on ssd so I really don't know whether 5 seconds for 
spinning rust is unusual or not; you'll need the experience of others on 
that.

1) Attempting to mount filesystems with many devices is of course 
slower.  But two devices shouldn't be a problem.

2) Sometimes a device might take a while to "spin up" and initialize 
itself.  Since you're still on spinning rust, are the devices perhaps 
spinning down to save power, and the delay you see is them spinning back 
up?
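
You can check the power state without waking the drives, something like 
(assuming hdparm is installed; -C reports the state without spinning up):

  # hdparm -C /dev/sdd /dev/sde

If they report "standby" just before a slow mount, spin-up is likely 
your delay.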

SSDs may have a similar delay, tho generally shorter and for different 
reasons.  SSDs often have a capacitor that charges up so they can finish a 
write and avoid corrupting themselves in the event of an unexpected power 
loss in the middle of a write.  A lower end device might allow the device 
to appear ready while the capacitor is still charging to avoid long power-
on response times, while higher end devices both tend to have higher 
capacity capacitors, and don't signify ready until they are sufficiently 
charged to avoid issues in "blink" situations where the power supply 
comes back on but isn't immediately steady and might go out again right 
away.

If a device takes too long and times out you'll see resets and the like 
in dmesg, but that normally starts at ~30 seconds, not the 5 seconds you 
mention.  Still, doesn't hurt to check.

3) If the space cache is damaged the mount may take longer, but btrfs 
will complain so you'll see it in dmesg.
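
If dmesg does complain about the space cache, a one-time mount with the 
clear_cache option should force a rebuild (that mount will be slow, the 
following ones shouldn't be):

  # mount -o clear_cache /media/raid1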


[OT but quoting the signature]

> A: Because it messes up the order in which people normally read text.
> Q: Why is it such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing on usenet and in e-mail?

My sentiments exactly! (Well, except that HTML's even more annoying!) =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Slow mounting raid1
From: Leonidas Spyropoulos @ 2017-08-01  6:43 UTC
  To: Duncan; +Cc: linux-btrfs

Hi Duncan,

Thanks for your answer.
On 01/08/17, Duncan wrote:
> 
> If you're doing any snapshotting, you almost certainly want noatime, not 
> the default relatime.  Even without snapshotting, and regardless of the 
> filesystem (tho on btrfs it's a bigger factor due to COW), noatime is a 
> recommended performance optimization.
> 
> The biggest caveat is if you're running something that actually depends 
> on atime.  Few if any modern applications do; mutt, in some 
> configurations, is an older application that still does.
The array has no snapshots, and my mutt resides on a different btrfs 
SSD, so I can safely try this option.

> 
> Is there anything suspect in dmesg during the mount?  What does smartctl 
> say about the health of the devices?  (smartctl -AH at least, the selftest 
> data is unlikely to be useful unless you actually run the selftests.)
>
dmesg during the mount says:
  [19823.896790] BTRFS info (device sde): use lzo compression
  [19823.896798] BTRFS info (device sde): disk space caching is enabled
  [19823.896800] BTRFS info (device sde): has skinny extents

SMART self-tests are scheduled on all disks: a short test once a day and 
a long test every week.
smartctl output:
  # smartctl -AH /dev/sdd
  smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.11.12-1-ck] (local build)
  Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
  
  === START OF READ SMART DATA SECTION ===
  SMART overall-health self-assessment test result: PASSED
  
  SMART Attributes Data Structure revision number: 16
  Vendor Specific SMART Attributes with Thresholds:
  ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
    2 Throughput_Performance  0x0005   143   143   054    Pre-fail  Offline      -       67
    3 Spin_Up_Time            0x0007   124   124   024    Pre-fail  Always       -       185 (Average 185)
    4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       651
    5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
    7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
    8 Seek_Time_Performance   0x0005   110   110   020    Pre-fail  Offline      -       36
    9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       4594
   10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
   12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       353
  192 Power-Off_Retract_Count 0x0032   094   094   000    Old_age   Always       -       7671
  193 Load_Cycle_Count        0x0012   094   094   000    Old_age   Always       -       7671
  194 Temperature_Celsius     0x0002   162   162   000    Old_age   Always       -       37 (Min/Max 17/62)
  196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
  197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
  198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
  199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

  # smartctl -AH /dev/sde
  smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.11.12-1-ck] (local build)
  Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
  
  === START OF READ SMART DATA SECTION ===
  SMART overall-health self-assessment test result: PASSED
  
  SMART Attributes Data Structure revision number: 16
  Vendor Specific SMART Attributes with Thresholds:
  ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
    2 Throughput_Performance  0x0005   142   142   054    Pre-fail  Offline      -       69
    3 Spin_Up_Time            0x0007   123   123   024    Pre-fail  Always       -       186 (Average 187)
    4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       709
    5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
    7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
    8 Seek_Time_Performance   0x0005   113   113   020    Pre-fail  Offline      -       35
    9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       4678
   10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
   12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       353
  192 Power-Off_Retract_Count 0x0032   093   093   000    Old_age   Always       -       8407
  193 Load_Cycle_Count        0x0012   093   093   000    Old_age   Always       -       8407
  194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       36 (Min/Max 17/64)
  196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
  197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
  198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
  199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

> 1) Attempting to mount filesystems with many devices is of course 
> slower.  But two devices shouldn't be a problem.
> 
> 2) Sometimes a device might take a while to "spin up" and initialize 
> itself.  Since you're still on spinning rust, are the devices perhaps 
> spinning down to save power, and the delay you see is them spinning back 
> up?
> 
Good idea, but even when I unmount and immediately remount, as below, it 
still takes about 5.5 seconds:
  # time umount /media/raid1 && time mount /media/raid1
  
  real    0m0.501s
  user    0m0.046s
  sys     0m0.011s

  real    0m5.540s
  user    0m0.943s
  sys     0m0.062s

> SSDs may have a similar delay, tho generally shorter and for different 
> reasons.  SSDs often have a capacitor that charges up so they can finish a 
> write and avoid corrupting themselves in the event of an unexpected power 
> loss in the middle of a write.  A lower end device might allow the device 
> to appear ready while the capacitor is still charging to avoid long power-
> on response times, while higher end devices both tend to have higher 
> capacity capacitors, and don't signify ready until they are sufficiently 
> charged to avoid issues in "blink" situations where the power supply 
> comes back on but isn't immediately steady and might go out again right 
> away.
> 
> If a device takes too long and times out you'll see resets and the like 
> in dmesg, but that normally starts at ~30 seconds, not the 5 seconds you 
> mention.  Still, doesn't hurt to check.
Nothing related in dmesg, as you can see above.

> 
> 3) If the space cache is damaged the mount may take longer, but btrfs 
> will complain so you'll see it in dmesg.
> 
That was my thought too, which is why I asked on the ML.

I'll give 'noatime' a go and see if it changes anything.
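
For the record, it can be applied without a reboot, something like:
  # mount -o remount,noatime /media/raid1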

-- 
Leonidas Spyropoulos

A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?



* Re: Slow mounting raid1
From: E V @ 2017-08-01 12:32 UTC
  To: Leonidas Spyropoulos, linux-btrfs

On Tue, Aug 1, 2017 at 2:43 AM, Leonidas Spyropoulos
<artafinde@gmail.com> wrote:
> Hi Duncan,
>
> Thanks for your answer.

In general I think btrfs takes time roughly proportional to the size of 
your metadata to mount. Bigger and/or more fragmented metadata leads to 
longer mount times. My big backup fs with >300GB of metadata takes over 
20 minutes to mount, and that's with the free space tree 
(space_cache=v2), which is significantly faster than space cache v1.
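
You can see how much metadata you have with e.g.:
  # btrfs fi df /media/raid1/
(the "used" figure on the Metadata line is the relevant one here).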



* Re: Slow mounting raid1
From: Leonidas Spyropoulos @ 2017-08-01 20:21 UTC
  To: linux-btrfs

On 01/08/17, E V wrote:
> In general I think btrfs takes time roughly proportional to the size of 
> your metadata to mount. Bigger and/or more fragmented metadata leads to 
> longer mount times. My big backup fs with >300GB of metadata takes over 
> 20 minutes to mount, and that's with the free space tree 
> (space_cache=v2), which is significantly faster than space cache v1.
> 
Hmm, my raid1 doesn't seem nearly full, nor does it have significant 
metadata, so I don't think I'm in that case:
  # btrfs fi show /media/raid1/
  Label: 'raid1'  uuid: c9db91e6-0ba8-4ae6-b471-8fd4ff7ee72d
         Total devices 2 FS bytes used 516.18GiB
         devid    1 size 931.51GiB used 518.03GiB path /dev/sdd
         devid    2 size 931.51GiB used 518.03GiB path /dev/sde

  # btrfs fi df /media/raid1/
  Data, RAID1: total=513.00GiB, used=512.21GiB
  System, RAID1: total=32.00MiB, used=112.00KiB
  Metadata, RAID1: total=5.00GiB, used=3.97GiB
  GlobalReserve, single: total=512.00MiB, used=0.00B

I tried space_cache=v2 just to see if it would make any difference, but 
nothing changed:
  # cat /etc/fstab | grep raid1
  UUID=c9db91e6-0ba8-4ae6-b471-8fd4ff7ee72d   /media/raid1 btrfs   rw,noatime,compress=lzo,space_cache=v2         0 0
  # time umount /media/raid1 && time mount /media/raid1/

  real    0m0.807s
  user    0m0.237s
  sys     0m0.441s

  real    0m5.494s
  user    0m0.618s
  sys     0m0.116s

I did a couple of rebalances on metadata and data and it improved a bit:
  # btrfs balance start -musage=100 /media/raid1/
  # btrfs balance start -dusage=10 /media/raid1/
  [.. incremental dusage 10 -> 95]
  # btrfs balance start -dusage=95 /media/raid1
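
(The incremental dusage steps were along these lines - a sketch:)
  # for u in 10 25 50 75 95; do btrfs balance start -dusage=$u /media/raid1/; done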

Down to about 3.8 seconds:
  # time umount /media/raid1 && time mount /media/raid1/

  real    0m0.807s
  user    0m0.237s
  sys     0m0.441s

  real    0m3.790s
  user    0m0.430s
  sys     0m0.031s

I think maybe the next step is to disable compression if I want it to 
mount faster. Is it normal for btrfs performance to degrade over time 
like this?

Regards,

-- 
Leonidas Spyropoulos

A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?



* Re: Slow mounting raid1
From: Timofey Titovets @ 2017-08-01 20:40 UTC
  To: Leonidas Spyropoulos, linux-btrfs

2017-08-01 23:21 GMT+03:00 Leonidas Spyropoulos <artafinde@gmail.com>:
> [...]
> I tried space_cache=v2 just to see if it would make any difference, but
> nothing changed:
>   # cat /etc/fstab | grep raid1
>   UUID=c9db91e6-0ba8-4ae6-b471-8fd4ff7ee72d   /media/raid1 btrfs   rw,noatime,compress=lzo,space_cache=v2         0 0
> [...]

AFAIK, for space_cache=v2 you need to do something like (with the
filesystem unmounted, since btrfs check only works offline):
  btrfs check --clear-space-cache v1 /dev/sdd
  mount -o space_cache=v2 /dev/sdd <mount_point>
The first mount will be very slow, because it has to build the free
space tree from scratch.

Thanks.
-- 
Have a nice day,
Timofey.

