Date: Wed, 17 May 2023 09:15:34 +1000
From: Dave Chinner
To: Kent Overstreet
Cc: Christian Brauner, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-bcachefs@vger.kernel.org,
    Dave Chinner, Alexander Viro
Subject: Re: [PATCH 22/32] vfs: inode cache conversion to hash-bl
References: <20230509165657.1735798-1-kent.overstreet@linux.dev>
 <20230509165657.1735798-23-kent.overstreet@linux.dev>
 <20230510044557.GF2651828@dread.disaster.area>
 <20230516-brand-hocken-a7b5b07e406c@brauner>
List-ID: linux-bcachefs@vger.kernel.org

On Tue, May 16, 2023 at 12:17:04PM -0400, Kent Overstreet wrote:
> On Tue, May 16, 2023 at 05:45:19PM +0200, Christian Brauner wrote:
> > On Wed, May 10, 2023 at 02:45:57PM +1000, Dave Chinner wrote:
> > There's a bit of a backlog before I get around to looking at this but
> > it'd be great if we'd have a few reviewers for this change.
>
> It is well tested - it's been in the bcachefs tree for ages with zero
> issues. I'm pulling it out of the bcachefs-prerequisites series though
> since Dave's still got it in his tree; he's got a newer version with
> better commit messages.
>
> It's a significant performance boost on metadata-heavy workloads for
> any non-XFS filesystem, we should definitely get it in.

I've got an up-to-date vfs-scale tree here (6.4-rc1), but I have not
been able to test it effectively right now because my local performance
test server is broken. I'll do what I can on the old small machine that
I have to validate it when I get time, but that might be a few weeks
away....

git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git vfs-scale

As it is, the inode hash-bl changes have zero impact on XFS because it
has its own highly scalable, lockless, sharded inode cache. So unless
I'm explicitly testing ext4 or btrfs scalability (rare), it's not
getting a lot of scalability exercise. It is being used by the root
filesystems on all those test VMs, but that's about it...

That said, my vfs-scale tree also has Waiman Long's old dlist code
(per-cpu linked lists), which converts the sb inode list and removes
the global lock there. This does make a huge impact for XFS - the
current code limits inode cache cycling to about 600,000 inodes/sec on
>=16p machines. With dlists, however:

| 5.17.0 on an XFS filesystem with 50 million inodes in it on a 32p
| machine with a 1.6MIOPS/6.5GB/s block device.
|
| Fully concurrent full filesystem bulkstat:
|
|              wall time    sys time     IOPS   BW       rate
| unpatched:   1m56.035s    56m12.234s   8k     200MB/s  0.4M/s
| patched:     0m15.710s    3m45.164s    70k    1.9GB/s  3.4M/s
|
| Unpatched flat kernel profile:
|
|  81.97%  [kernel]  [k] __pv_queued_spin_lock_slowpath
|   1.84%  [kernel]  [k] do_raw_spin_lock
|   1.33%  [kernel]  [k] __raw_callee_save___pv_queued_spin_unlock
|   0.50%  [kernel]  [k] memset_erms
|   0.42%  [kernel]  [k] do_raw_spin_unlock
|   0.42%  [kernel]  [k] xfs_perag_get
|   0.40%  [kernel]  [k] xfs_buf_find
|   0.39%  [kernel]  [k] __raw_spin_lock_init
|
| Patched flat kernel profile:
|
|  10.90%  [kernel]  [k] do_raw_spin_lock
|   7.21%  [kernel]  [k] __raw_callee_save___pv_queued_spin_unlock
|   3.16%  [kernel]  [k] xfs_buf_find
|   3.06%  [kernel]  [k] rcu_segcblist_enqueue
|   2.73%  [kernel]  [k] memset_erms
|   2.31%  [kernel]  [k] __pv_queued_spin_lock_slowpath
|   2.15%  [kernel]  [k] __raw_spin_lock_init
|   2.15%  [kernel]  [k] do_raw_spin_unlock
|   2.12%  [kernel]  [k] xfs_perag_get
|   1.93%  [kernel]  [k] xfs_btree_lookup

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com