On 2015-11-16 17:07, Anand Jain wrote:
>
>
> On 11/16/2015 09:41 PM, Austin S Hemmelgarn wrote:
>> On 2015-11-09 05:56, Anand Jain wrote:
>>> These set of patches provides btrfs hot spare and auto replace support
>>> for you review and comments.
>>>
>>> First, here below are the simple example steps to configure the same:
>>>
>>> Add a spare device:
>>>     btrfs spare add /dev/sde -f
>>>
>>> OR if there is a spare device which is already added before the, just run
>>>
>>>     btrfs dev scan [/dev/sde]
>>>
>>> this will register the spare device to the kernel.
>>>
>>>     btrfs fi show
>>>     Label: none uuid: 52f170c1-725c-457d-8cfd-d57090460091
>>>     Total devices 2 FS bytes used 112.00KiB
>>>     devid 1 size 2.00GiB used 417.50MiB path /dev/sdc
>>>     devid 2 size 2.00GiB used 417.50MiB path /dev/sdd
>>>
>>>     Global spare
>>>     device size 3.00GiB path /dev/sde
>>>
>>> Thats it.
>>>
>>> Auto replace:
>>> Replace happens automatically, that is when there is any write
>>> failed or flush failed, the device will be marked as failed, which
>>> will stop any further IO attempt to that device. And in the next commit
>>> thread cycle the auto replace will pick the spare device (/dev/sde is
>>> above example) to replace the failed device. And so the btrfs volume is
>>> back to a healthy state.
>>>
>>>
>>> Its btrfs Global spare:
>>> as of now only global hot spare is supported, that is hot spare(s)
>>> are for all the btrfs FS in the system.
>>>
>>> No spare when device failed:
>>> It would scan for spare device at the rate of transaction commit
>>> and will trigger the auto replace when ever spare device is added.
>>>
>>> Priority:
>>> In some future work there can be some chronological order to pick
>>> a spare and the failed device.
>>>
>>>
>>> Patches:
>>>
>>> Kernel:
>>> First, it needs, Qu's per chunk missing device patchset,
>>> which is part of the set here and also there is a light optimization
>>> (patch 5/15) which was required as part of this enhancement.
>>>
>>> Next patches 7,8/15 brings in support, to manage the transition of
>>> devices from online (no state) to offline OR failed state dynamically.
>>> On top of static device state like the current "missing" state.
>>>
>>> Patch 9/15 fixes a bug where in we should have blocked the incompatible
>>> feature at the device scan/add level instead/also at in the mount level.
>>> This is because we don't have to bring a device into the device list,
>>> if it is incompatible.
>>>
>>> Next patches 10,11,12,13/15 adds support for Spare device. For the
>>> details on how to add a spare device kindly see further below.
>>> For kernel with out spare feature supported the spare device
>>> is kept away. And when the kernel supports the spare device, it will
>>> inhibit from mounting it. Further these patch set provides helper
>>> function to pick a spare device and release a spare device back to
>>> the spare device pool.
>>>
>>> Patch 14/15 provides function for auto replace, this is mainly
>>> from the existing replace code, and in the long run I see opportunity
>>> to merge these code with the replace code that is triggered from
>>> the user spare.
>>>
>>> Last 15/15, uses all these facilities, picks a failed device and
>>> triggers a auto replace in a kthread (casualty_kthread())
>>>
>>>
>>> Progs:
>>> Would need 4 patches as listed below.
>>>
>>>
>>> Known Bug:
>>>
>>> As now I see below stale kmem cache during module unload. Which
>>> I am digging.
>>> ------
>>> BUG btrfs_path (Not tainted): Objects remaining in btrfs_path on
>>> kmem_cache_close()
>>> ------
>>>
>>> Anand Jain (10):
>>>   btrfs: optimize btrfs_check_degradable() for calls outside of barrier
>>>   btrfs: introduce device dynamic state transition to offline or failed
>>>   btrfs: check device for critical errors and mark failed
>>>   btrfs: block incompatible optional features at scan
>>>   btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV
>>>   btrfs: add check not to mount a spare device
>>>   btrfs: support btrfs dev scan for spare device
>>>   btrfs: provide framework to get and put a spare device
>>>   btrfs: introduce helper functions to perform hot replace
>>>   btrfs: check for failed device and hot replace
>>>
>>> Qu Wenruo (5):
>>>   btrfs: Introduce a new function to check if all chunks a OK for
>>>     degraded mount
>>>   btrfs: Do per-chunk check for mount time check
>>>   btrfs: Do per-chunk degraded check for remount
>>>   btrfs: Allow barrier_all_devices to do per-chunk device check
>>>   btrfs: Cleanup num_tolerated_disk_barrier_failures
>>>
>>>  fs/btrfs/ctree.h       |   7 +-
>>>  fs/btrfs/dev-replace.c | 116 ++++++++++++++++++++
>>>  fs/btrfs/dev-replace.h |   1 +
>>>  fs/btrfs/disk-io.c     | 211 +++++++++++++++++++++++-------------
>>>  fs/btrfs/disk-io.h     |   2 -
>>>  fs/btrfs/super.c       |  20 +++-
>>>  fs/btrfs/transaction.c |   3 +-
>>>  fs/btrfs/volumes.c     | 283 ++++++++++++++++++++++++++++++++++++++++++++++---
>>>  fs/btrfs/volumes.h     |  27 +++++
>>>  9 files changed, 571 insertions(+), 99 deletions(-)
>>>
>> I've thrown everything I can think of at this over the weekend, and
>> nothing broke (at least, nothing broke that had anything to do with
>> these patches, I ended up triggering a couple of known bugs that I had
>> completely forgotten about), so you can add:
>> Tested-by: Austin S. Hemmelgarn
>>
>
> Thanks Austin.
> Yeah I should fix the known bug as listed above.
>
Actually, while I did see that, I also ran into a couple of other bugs
that are unrelated to these patches (including the balance-related bug I
was recently discussing in another thread on the ML, which, like everyone
else it has hit, I've sadly been unable to reproduce). Apart from the one
you mentioned in the cover letter, none of the bugs I hit were new with
these patches, and they didn't happen any more frequently with the
patches applied.
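
For anyone following the thread, the auto-replace flow described in the
quoted cover letter can be modelled roughly as below. This is only a toy
Python simulation of that description, not the kernel code: the names
Device, Filesystem, report_io_error and commit_cycle are hypothetical, and
taking the first spare in the pool is an assumption, since the cover
letter says no priority order exists yet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    path: str
    failed: bool = False


@dataclass
class Filesystem:
    devices: List[Device]

    def report_io_error(self, path: str) -> None:
        # A failed write or flush marks the device as failed, which
        # stops further IO to it (modelled here as a simple flag).
        for dev in self.devices:
            if dev.path == path:
                dev.failed = True


def commit_cycle(fs: Filesystem, spare_pool: List[Device]) -> Optional[str]:
    """Model one transaction-commit pass.

    Replaces at most one failed device with a spare from the global pool
    and returns the spare's path. Returns None when there is no failed
    device, or no spare yet (the check repeats on the next cycle).
    """
    for i, dev in enumerate(fs.devices):
        if dev.failed:
            if not spare_pool:
                return None
            spare = spare_pool.pop(0)  # assumption: first spare wins
            fs.devices[i] = spare
            return spare.path
    return None


if __name__ == "__main__":
    fs = Filesystem([Device("/dev/sdc"), Device("/dev/sdd")])
    spares: List[Device] = []

    fs.report_io_error("/dev/sdd")
    print(commit_cycle(fs, spares))  # no spare registered yet

    spares.append(Device("/dev/sde"))
    print(commit_cycle(fs, spares))  # spare picked up on a later cycle
```

The model keeps the spare pool outside any one Filesystem object to
mirror the "global spare" behaviour, where spares serve every btrfs FS on
the system.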