From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA71AC6FD1D for ; Thu, 23 Mar 2023 08:21:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230342AbjCWIVF (ORCPT ); Thu, 23 Mar 2023 04:21:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47418 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230172AbjCWIVC (ORCPT ); Thu, 23 Mar 2023 04:21:02 -0400 Received: from mout.gmx.net (mout.gmx.net [212.227.17.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE8A32FCC2 for ; Thu, 23 Mar 2023 01:21:00 -0700 (PDT) Received: from [0.0.0.0] ([149.28.201.231]) by mail.gmx.net (mrgmx105 [212.227.17.174]) with ESMTPSA (Nemesis) id 1N49hB-1qfMzA2fon-0103Da; Thu, 23 Mar 2023 09:20:50 +0100 Message-ID: <6b1d2d63-ef00-3c6c-8bea-481e46b6fcef@gmx.com> Date: Thu, 23 Mar 2023 16:20:45 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0 Subject: Re: [PATCH 03/10] btrfs: offload all write I/O completions to a workqueue Content-Language: en-US To: Christoph Hellwig Cc: Chris Mason , Josef Bacik , David Sterba , Johannes Thumshirn , linux-btrfs@vger.kernel.org References: <20230314165910.373347-1-hch@lst.de> <20230314165910.373347-4-hch@lst.de> <2aa047a7-984e-8f6f-163e-8fe6d12a41d8@gmx.com> <20230320123059.GB9008@lst.de> <20230321125550.GB10470@lst.de> <5eebb0fc-0be3-c313-27cd-4e11a7b04405@gmx.com> <20230322083258.GA23315@lst.de> <20230323081237.GA21669@lst.de> From: Qu Wenruo In-Reply-To: <20230323081237.GA21669@lst.de> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K1:Psb9E//9BNKq4Csl90fGy6Nk5qeOtpXOytbydHZYuAk5fmU2oM3 65lsKbzb13rruHKg7cmR9sSKsY6SpGQ6nuQdskUONLCADTwMxFnuRuiRCRpL456Z55M5QOM Y7I1ai2qXLb9FTxTh0KrDaJuYJVeixzm02SUt1JDuMuAj5E7Fjk968uhAQKxAuNFkgV8hDq nVVaJQ5LxZ+bV3fV0Rfgw== UI-OutboundReport: notjunk:1;M01:P0:e9hUGDF2LB8=;6tT+VXln6byHvqoYPdqyERsjkGQ fFeLp9Wqm2zCN8rrAHaVXuac9ra7wL/CeFsFz6EtIGa8lbVz3Onlvy7VXij6EitmNws3U1MRZ uLKQXVP1hvu8HsyIHnAJNo/hJzjtIWVbb1yjPbbfmBpycLuAwVVOlq+AVJidAqAgP3VluTKBI RVmNBPAad7Xl4dLh1TB/6g1HsFUzO5ppGqZ9fen3FHt3w2e+dmTMfNpCwz9AW/lNzype/BvDR S7o7fSSwPaGTda1MCkqrtVCawDY7cTCtkK3P5+UFH80Ud6K3ycuZPfm+Svxh9ZR0l2F3njq+B hWbty5v2/NG9GFurG0+bd6E0nvL5iz5NKsqeQkzW2Y8MtMQh8pKVy7v5cJ6q5KYeY3r0JA4Z4 z4MGB7TUExaeJVXyCuaJ0ybXOc9inCKw2fVWNRak0V2oVCN/I9SnWN6m7vMdD8MYiKxMPBDyP VY9DA2zc46wt2C4e+KRsCRWBs7VMFfdwhoKHGWIs2wf5oraG355tDJjRoEfLbu+b4gduw3N13 ZQ7EIgpOyqWPuIpu6wI5rmUsAjBXB9ZJ9Xx3J22FXOvQDLepdyQz2Xxndlc+GMgIeon7Yv/uo +XX0sGvuKbWtqgmGhqAYvU0CpN+6kGGaqZbJpZH6lmjbq80rJC/TYpHUhrkazAOaAjp4TC8TH Vvjw+bq0vX/gGNzp9D9ndttKFnoviZv2N5H/RHSE9ocgfemUVddwO0rxiVEOQPb1zRDX1SXBk vDSaJGK39MF6sdUz0bqp/rs/OuHkId/el9Bd0SYW7LrUAyLNVLXt7wQgMWHowv3dyGQ8aONkA hfNOvI2fTbLFvfAaXYnRABnMSyWJ4kL13McDI2V6BOwMeZKyILUV1gcOP8Xbb3LIQXheTtEGb KPg35EZIDXuRqdIAv3X5eppA/bO8hMd/V0VnjPgBB4vLmiHAGi4rc4v2I8wZMaKDXBT8cw4ae nhClAMON4qe81cOQnTNw8cvuSso= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On 2023/3/23 16:12, Christoph Hellwig wrote: > On Thu, Mar 23, 2023 at 04:07:28PM +0800, Qu Wenruo wrote: >>>> And if a work load can only be deadlock free using the default max_active, >>>> but not any value smaller, then I'd say the work load itself is buggy. >>> >>> Anything that has an interaction between two instances of a work_struct >>> can deadlock. Only a single execution context is guaranteed (and even >>> that only with WQ_MEM_RECLAIM), and we've seen plenty of deadlocks due >>> to e.g. only using a global workqueue in file systems or block devices >>> that can stack. >> >> Shouldn't we avoid such cross workqueue workload at all cost? > > Yes, btrfs uses per-sb workqueues. As do most other places now, > but there as a bit of a learning curve years ago. > >>> So this is the first time I see an actual explanation, thanks for that >>> first. If this is the reason we should apply the max_active to all >>> workqueus that do csum an compression work, but not to other random >>> workqueues. >> >> If we're limiting the max_active for certain workqueues, then I'd say why >> not to all workqueues? >> >> If we have some usage relying on the amount of workers, at least we should >> be able to expose it and fix it. > > Again, btrfs is the odd one out allowing the user to set arbitrary limits. > This code predates using the kernel workqueues, and I'm a little > doubtful it still is useful. But for that I need to figure out why > it was even be kept when converting btrfs to use workqueues. > >> (IIRC we should have a better way with less cross-workqueue dependency) > > I've been very actively working on reducing the amount of different > workqueues. This series is an important part of that. > Yeah, your work is very appreciated, that's without any doubt. I'm just wondering if we all know that we should avoid cross-workqueue dependency to avoid deadlock, then why no one is trying to expose any such deadlocks by simply limiting max_active to 1 for all workqueues? Especially with lockdep, once we got a reproduce, it should not be that hard to fix. That's the only other valid usecase other than limiting CPU usage for csum/compression, I know regular users should not touch this, but us developer should still be able to create a worst case for our CI systems to suffer. Thanks, Qu