On 2019/1/16 上午9:34, Josef Bacik wrote: > On Wed, Jan 16, 2019 at 09:29:29AM +0800, Qu Wenruo wrote: >> >> >> On 2019/1/16 上午9:15, Josef Bacik wrote: >>> On Wed, Jan 16, 2019 at 09:07:26AM +0800, Qu Wenruo wrote: >>>> >>>> >>>> On 2019/1/16 上午8:31, Qu Wenruo wrote: >>>>> >>>>> >>>>> On 2019/1/16 上午1:26, Josef Bacik wrote: >>>>>> On Tue, Jan 15, 2019 at 04:15:57PM +0800, Qu Wenruo wrote: >>>>>>> This patchset can be fetched from github: >>>>>>> https://github.com/adam900710/linux/tree/qgroup_delayed_subtree >>>>>>> >>>>>>> Which is based on v5.0-rc1. >>>>>>> >>>>>> >>>>>> I've been testing these patches hoping to get rid of the qgroup deadlock that >>>>>> these patches are supposed to fix, but instead they make the box completely >>>>>> unusable with 100% cpu usage for minutes at a time at every transaction commit. >>>>> >>>>> I'm afraid it's related to the v5.0-rc1 base, not the patchset itself. >>>>> >>>>> Just try to balance metadata with 16 snapshots, you'll see btrfs bumping >>>>> its generation like crazy, no matter if quota is enabled or not. >>>>> >>>>> And since btrfs is committing transaction like crazy, no wonder it will >>>>> do tons of qgroup accounting. >>>>> >>>>> My bisect leads to commit 64403612b73a94bc7b02cf8ca126e3b8ced6e921 >>>>> btrfs: rework btrfs_check_space_for_delayed_refs. >>>> >>>> Furthermore, you could try this RFC test case to see the problem. >>>> https://patchwork.kernel.org/patch/10761715/ >>>> >>>> It would only take around 100s for v4.20 but over 500 for v5.0-rc1 with >>>> tons of unnecessary transaction committed for nothing, no quota enabled. >>>> >>>> So I'm afraid that commit is blocking my qgroup patchset. >>>> >>> >>> I've fixed the balance problem, it took 2 seconds to figure out, I'm just >>> waiting on xfstests to finish running. >>> >>> And your patch making things worse has nothing to do with that problem. Our >>> test doesn't run balance, so the issue you reported has nothing to do with the >>> fact that your patch makes our boxes unusable with qgroups on. >>> >>> The problem is with your deadlock avoidence patch we're now making a lot more >>> dirty extents to run in the critical section of the transaction commit. Also >>> because we're no longer pre-fetching the old roots, we're doing the old roots >>> and new roots lookup inside the critical section, so now each dirty extent takes >>> 2x as long. With my basic test it was taking 5 minutes to do the qgroup >>> accounting, which keeps the box from booting essentially. >>> >>> Without your patch it's still awful because btrfs-cleaner just sits there at >>> 100% while deleting snapshots, but at least it's not making the whole system >>> stop running while it does all that work in the transaction commit. >>> >>> And if you had done the proper root cause analysis you would have noticed that >>> we're taking tree locks in the find_parent_nodes() case when we're searching the >>> commit root, something we shouldn't be doing. So all that really needs to be >>> done is to check if (!path->skip_locking) btrfs_tree_read_lock(eb); in those >>> cases and the deadlock goes away. Because no matter what we shouldn't be taking >>> locks when we're not given a trans in the backref lookup code. >> >> That indeed looks much better than my current solution. >> >> Although I'm not 100% sure for cases like tree blocks shared between >> commit and current root (tree block not modified yet). > > Doesn't matter, the block can't be modified so we don't need the locking, If > the current root needs to change it then it cow's it and messes with the new eb, > not the one attached to the commit root. > >> >> I'll definitely invest more time to try to fix this bug. >> > > Don't worry about it, I'm already running the patch through my tests, I'll send > them in the morning when the tests are finished. Thanks, I can't be more happier dropping that deadlock fix patch. Thanks, Qu > > Josef >