From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Anand Jain <anand.jain@oracle.com>, linux-btrfs@vger.kernel.org
Cc: clm@fb.com, dsterba@suse.cz
Subject: Re: [PATCH v2 00/15] Introduce device state 'failed', Hot spare and Auto replace
Date: Tue, 29 Mar 2016 13:30:13 -0400
Message-ID: <56FABBA5.4090402@gmail.com>
In-Reply-To: <1459261349-32206-1-git-send-email-anand.jain@oracle.com>

On 2016-03-29 10:22, Anand Jain wrote:
> Thanks for various comments, tests and feedback.
>
> Background: Hot spare and Auto replace:
>   Hot spare is predominantly used to narrow the time window that
>   storage spends in degraded mode, during which any further disk
>   failure might lead to catastrophic data loss. Data center
>   storage generally has a couple of disks reserved as spares.
>   This is mainly an enterprise storage feature
>   rather than a FS feature; I believe people acquainted with
>   enterprise storage use cases will appreciate the need for it,
>   which is why most/all enterprise storage has a hot spare feature.
>
> Btrfs device states:
>   This patch set adds the 'failed' state and makes provision for
>   the 'offline' state, giving two new device states. To summarize
>   the various device states and their meanings:
>
>   /* missing: device wasn't found at the time of mount */
>   int missing;
>
>   /*
>    * failed: device confirmed to have experienced critical
>    * io failure
>    */
>   int failed;
>
>   /*
>    * offline: there is no confirmation that the disk has failed,
>    * only an interim communication breakdown, so the device is
>    * not necessarily a candidate for device replace. It might
>    * come back online after user intervention or after block
>    * transport layer error recovery.
>    */
>   int offline;
>
>
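
To make the distinction between the three flags concrete, here is a minimal sketch of how they could be interpreted. The struct and both helper names are hypothetical, invented for illustration; they are not taken from the patches. The assumption that only 'failed' makes a device a replace candidate follows from the comments quoted above, which leave 'missing' unspecified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the three flags quoted above; not the patch's struct. */
struct dev_state {
	int missing;  /* device wasn't found at the time of mount */
	int failed;   /* confirmed critical io failure */
	int offline;  /* interim communication breakdown, may recover */
};

/* Assumption: only a confirmed failure makes a device a replace candidate. */
static bool is_replace_candidate(const struct dev_state *s)
{
	assert(s != NULL);
	return s->failed != 0;
}

/* Assumption: a device is "online" when none of the flags is set. */
static bool is_online(const struct dev_state *s)
{
	assert(s != NULL);
	return !s->missing && !s->failed && !s->offline;
}
```

With this reading, an offline device is neither online nor a replace candidate, which matches the idea that it may return after transport recovery.
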
> Device state transition Tuning and visualization:
>   Sysfs interfaces are planned to provide the required tuning of
>   device state transition sensitivities and visualization of device
>   states. However, the sysfs framework that could provide such an
>   interface is still being reviewed/tested and is not yet ready. So for
>   testing and debugging these features I have used an updated
>   version of the procfs patch which is on the ML.
>
>        [PATCH] btrfs: debug: procfs-devlist: introduce procfs interface for
> the device list for debugging
>
>   I find the above patch very useful and stable as compared to sysfs
> to visualize the device state.
>
> This patch set does not depend on any of the sysfs patches as such.
>
> Cross compatibility:
>   Adds a new incompat feature flag
>   (BTRFS_FEATURE_INCOMPAT_SPARE_DEV) to manage the spare device
>   when older kernels are used. It has been tested to work fine
>   with older kernel/progs versions.
>
>
> Auto replace:
>   Replace happens automatically: when any write or flush fails,
>   the device is marked as failed, which stops any further IO
>   attempts to that device. In the next commit cycle, auto replace
>   picks the spare device to replace the failed device, bringing
>   the btrfs volume back to a healthy state.
>
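
The sequence described above can be sketched roughly as follows. This is a simplified user-space model, not the kernel code: `struct sketch_dev` and all three function names are hypothetical, and the "first free spare wins" policy is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device record for illustration only. */
struct sketch_dev {
	const char *path;
	bool failed;   /* set once a write or flush error is seen */
	bool in_use;   /* for spares: already claimed for a replace */
};

/* Step 1: a write or flush error marks the device as failed. */
static void on_write_or_flush_error(struct sketch_dev *dev)
{
	assert(dev != NULL);
	dev->failed = true;
}

/* Step 2: no further IO is attempted on a failed device. */
static bool may_submit_io(const struct sketch_dev *dev)
{
	assert(dev != NULL);
	return !dev->failed;
}

/*
 * Step 3: at the next commit cycle, claim a spare for a failed device.
 * Returns NULL if the device is healthy or no free spare exists.
 * Assumption: the first free, healthy spare is chosen.
 */
static struct sketch_dev *commit_cycle_pick_spare(const struct sketch_dev *dev,
						  struct sketch_dev *spares,
						  size_t nr_spares)
{
	assert(dev != NULL);
	if (!dev->failed)
		return NULL;
	for (size_t i = 0; i < nr_spares; i++) {
		if (!spares[i].in_use && !spares[i].failed) {
			spares[i].in_use = true;
			return &spares[i];
		}
	}
	return NULL;
}
```

Since the spare pool is global, a claimed spare is marked in use so that a second failed device cannot pick the same one.
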
> Per FSID spare vs Global spare:
>   As of now only global hot spare is supported, that is, hot spare(s)
>   are for all the btrfs FS in the system. However, in the future there
>   will be a fs_info->no_auto_replace tunable which the user can set
>   to limit the use of the global spare.
>
>
> Example use case:
>   Here below is an example use case of the hot spare setup.
>
>   Add a spare device:
>          btrfs spare add /dev/sde -f
>
>   If there is a spare device which is already added before the,
>   just run
>
>          btrfs dev scan [/dev/sde]
>
>   Which will register the spare device to the kernel.
>
>          btrfs fi show
>           Label: none uuid: 52f170c1-725c-457d-8cfd-d57090460091
>            Total devices 2 FS bytes used 112.00KiB
>            devid 1 size 2.00GiB used 417.50MiB path /dev/sdc
>            devid 2 size 2.00GiB used 417.50MiB path /dev/sdd
>
>          Global spare
>            device size 3.00GiB path /dev/sde
>
>
> Patches:
>
> Kernel:
>   First, it needs Qu's per-chunk missing device patch set, which is
>   included in this set.
>
>   Next, patch 6/12 brings in support to manage the transition of
>   devices from online (no state) to the offline or failed state
>   dynamically, on top of static device states like the current
>   "missing" state.
>
>   Next, patches 7-10/12 add support for the spare device. On a kernel
>   without the spare feature, the spare device is kept away. A kernel
>   that supports the spare device refuses to mount it. These patches
>   also provide helper functions to pick a spare device and to
>   release a spare device back to the spare device pool.
>
>   Patch 11/12 provides functions for auto replace, mainly taken
>   from the existing replace code.
>   Last, 12/12 uses all these facilities, picks a failed device and
>   triggers an auto replace in a kthread (casualty_kthread()).
>
>
> Progs:
>   Needs the 4 patches below, which add the sub-command 'spare' to
>   manage the spare device. As of now, deleting a spare device has to
>   be done using wipefs. However, in the long run we would need a
>   proper btrfs command to do that job.
>
>
> V1->V2:
> Kernel:
>   (Based on tests and comments provided in the ML)
>   a. Now transition_kthread() wakes up the casualty_kthread to check
>      for device states. Instead of doing that in the transition_kthread()
>      itself. Cleaner and less pressure on transition_kthread().
>   b. Dropped
>       [PATCH 05/15] btrfs: optimize btrfs_check_degradable() for calls outside of barrier
>      as it was a wrong patch and the optimization was incomplete.
>   c. Merged patches
>      btrfs: check for failed device and hot replace
>        to
>      btrfs: check device for critical errors and mark failed
>      in order to make the change described in (a) above.
>
> Progs:
>   a. Added to call btrfs_register_one_device() when doing btrfs
>      spare add
>
>
> Anand Jain (7):
>    btrfs: introduce device dynamic state transition to offline or failed
>    btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV
>    btrfs: add check not to mount a spare device
>    btrfs: support btrfs dev scan for spare device
>    btrfs: provide framework to get and put a spare device
>    btrfs: introduce helper functions to perform hot replace
>    btrfs: check device for critical errors and mark failed
>
> Qu Wenruo (5):
>    btrfs: Introduce a new function to check if all chunks a OK for
>      degraded mount
>    btrfs: Do per-chunk check for mount time check
>    btrfs: Do per-chunk degraded check for remount
>    btrfs: Allow barrier_all_devices to do per-chunk device check
>    btrfs: Cleanup num_tolerated_disk_barrier_failures
>
>   fs/btrfs/ctree.h       |   8 +-
>   fs/btrfs/dev-replace.c |  24 +++++
>   fs/btrfs/dev-replace.h |   1 +
>   fs/btrfs/disk-io.c     | 256 +++++++++++++++++++++++++++++++++--------------
>   fs/btrfs/disk-io.h     |   4 +-
>   fs/btrfs/super.c       |  20 +++-
>   fs/btrfs/volumes.c     | 263 +++++++++++++++++++++++++++++++++++++++++++++----
>   fs/btrfs/volumes.h     |  27 +++++
>   8 files changed, 504 insertions(+), 99 deletions(-)
>
> Anand Jain (4):
>    btrfs-progs: Introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV SB flags
>    btrfs-progs: Introduce btrfs spare subcommand
>    btrfs-progs: add fi show for spare
>    btrfs-progs: add global spare device list to filesystem show
>
>   Android.mk        |   2 +-
>   Makefile.in       |   3 +-
>   btrfs.c           |   1 +
>   cmds-filesystem.c |   9 ++
>   cmds-spare.c      | 292 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   commands.h        |   2 +
>   ctree.h           |   4 +-
>   utils.h           |   1 +
>   volumes.c         |   4 +
>   volumes.h         |   2 +
>   10 files changed, 317 insertions(+), 3 deletions(-)
>   create mode 100644 cmds-spare.c
>
I can't provide the same degree of testing this time that I did for the 
previous version (the system I had set up with my normal testing harness 
is offline for the foreseeable future).  That said, I've built and 
booted a kernel with these patches in a VM on my laptop and tested the 
new functionality, and everything appears to work like it's supposed to 
without breaking any existing code, so for the patch-set as a whole:

Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>

