Date: Mon, 22 Jan 2024 22:53:36 +1100
From: Dave Chinner
To: Andi Kleen
Cc: linux-xfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re:
References: <20240118222216.4131379-1-david@fromorbit.com> <87zfwxk75o.fsf@linux.intel.com>
In-Reply-To: <87zfwxk75o.fsf@linux.intel.com>

On Mon, Jan 22, 2024 at 02:13:23AM -0800, Andi Kleen wrote:
> Dave Chinner writes:
>
> > Thoughts, comments, etc?
>
> The interesting part is if it will cause additional tail latencies
> allocating under fragmentation with direct reclaim, compaction
> etc. being triggered before it falls back to the base page path.

It's not like I don't know these problems exist with memory
allocation. Go have a look at xlog_kvmalloc() which is an open coded
kvmalloc() that allows the high order kmalloc allocations to
fail-fast without triggering all the expensive and unnecessary
direct reclaim overhead (e.g. compaction!) because we can fall back
to vmalloc without huge concerns. When high order allocations start
to fail, then we fall back to vmalloc and then we hit the long
standing vmalloc scalability problems before anything else in XFS or
the IO path becomes a bottleneck.

IOWs, we already know that fail-fast high-order allocation is a more
efficient and effective fast path than using vmalloc/vm_map_ram() all
the time. As this is an RFC, I haven't implemented stuff like this
yet - I haven't seen anything in the profiles indicating that high
order folio allocation is failing and causing lots of reclaim
overhead, so I simply haven't added fail-fast behaviour yet...

> In fact it is highly likely it will, the question is just how bad it is.
>
> Unfortunately benchmarking for that isn't that easy, it needs artificial
> memory fragmentation and then some high stress workload, and then
> instrumenting the transactions for individual latencies.

I stress test and measure XFS metadata performance under sustained
memory pressure all the time. This change has not caused any
obvious regressions in the short time I've been testing it.

I still need to do perf testing on large directory block sizes. That
is where high-order allocations will get stressed - that's where
xlog_kvmalloc() starts dominating the profiles as it trips over
vmalloc scalability issues...

> I would in any case add a tunable for it in case people run into this.

No tunables. It either works or it doesn't. If we can't make
it work reliably by default, we throw it in the dumpster, light it
on fire and walk away.

> Tail latencies are a common concern on many IO workloads.

Yes, for user data operations it's a common concern.
For metadata, not so much - there's so many far worse long tail
latencies in metadata operations (like waiting for journal space)
that memory allocation latencies in the metadata IO path are largely
noise....

-Dave.
-- 
Dave Chinner
david@fromorbit.com
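
For readers who have not looked at xlog_kvmalloc(), the fail-fast
pattern described above can be sketched roughly as below. This is a
userspace illustration only, not the XFS code: try_contig_alloc(),
fallback_alloc(), failfast_alloc(), sim_fragmented and enum alloc_path
are all hypothetical names made up for this sketch, and malloc()
stands in for both the contiguous allocation and the vmalloc
fallback. The flag sim_fragmented simply simulates high order
allocations starting to fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for memory state; not a kernel interface. */
static bool sim_fragmented = false;

/* Records which path satisfied the most recent request. */
enum alloc_path { PATH_NONE, PATH_FAST, PATH_FALLBACK };

/*
 * Stand-in for a high order allocation that is allowed to fail
 * immediately instead of entering reclaim or compaction.
 */
static void *try_contig_alloc(size_t size)
{
	if (sim_fragmented)
		return NULL;	/* fail fast: no retries, no reclaim */
	return malloc(size);
}

/* Stand-in for the slower fallback that does not need contiguity. */
static void *fallback_alloc(size_t size)
{
	return malloc(size);
}

/* Try the cheap path first; only pay for the fallback on failure. */
static void *failfast_alloc(size_t size, enum alloc_path *path)
{
	void *p;

	assert(size > 0);

	p = try_contig_alloc(size);
	if (p) {
		*path = PATH_FAST;
		return p;
	}

	p = fallback_alloc(size);
	*path = p ? PATH_FALLBACK : PATH_NONE;
	return p;
}
```

With sim_fragmented clear, failfast_alloc() reports PATH_FAST; once
it is set, every request is served via PATH_FALLBACK, which is the
point at which the cost of the fallback allocator starts to dominate.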