Date: Tue, 29 Nov 2022 11:35:44 -0500
From: Johannes Weiner
To: Vitaly Wool
Cc: ananda, linux-mm@kvack.org
Subject: Re: [PATCH v6] mm: add zblock - new allocator for use via zpool API
References: <20221104085856.18745-1-a.badmaev@clicknet.pro>

On Tue, Nov 29, 2022 at 08:48:27AM +0100, Vitaly Wool wrote:
> On Mon, Nov 28, 2022 at 9:01 PM Johannes Weiner wrote:
> >
> > On Fri, Nov 04, 2022 at 11:58:56AM +0300, ananda wrote:
> > > From: Ananda
> > >
> > > Zblock stores integer number of compressed objects per zblock block.
> >
> > What does that mean?
>
> It's explained later in the patch but anyway, an example: let's create
> an object with 4 adjacent pages, a total of 16384 bytes. We can divide
> it into 43 subblocks of size 381, plus we'll have one byte that's not
> used. Subblocks will then be treated as an array.

Thanks, that makes sense. The 'integer' threw me off, I think. Maybe
'stores a fixed number' would be a bit clearer?

> > > These blocks consist of several physical pages (1/2/4/8) and are arranged
> > > in linked lists.
> > > The range from 0 to PAGE_SIZE is divided into the number of intervals
> > > corresponding to the number of lists and each list only operates objects
> > > of size from its interval. Thus the block lists are isolated from each
> > > other, which makes it possible to simultaneously perform actions with
> > > several objects from different lists.
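To make the quoted description concrete, here is a minimal sketch of how an object size could select a list. It assumes equal-width intervals, a 4096-byte PAGE_SIZE, and 8 lists; those values and the name list_index are hypothetical and not taken from the patch, which may well size its intervals differently.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
NUM_LISTS = 8     # hypothetical number of block lists


def list_index(size, num_lists=NUM_LISTS, page_size=PAGE_SIZE):
    """Map an object size in (0, page_size] to the list owning its interval.

    Assumes equal-width intervals: list i takes sizes in
    (i * width, (i + 1) * width], where width = page_size // num_lists.
    """
    if not 0 < size <= page_size:
        raise ValueError("size must be in (0, page_size]")
    if page_size % num_lists:
        raise ValueError("this sketch needs page_size divisible by num_lists")
    width = page_size // num_lists
    return (size - 1) // width


if __name__ == "__main__":
    for size in (1, 512, 513, 4096):
        print(size, "-> list", list_index(size))
```

Since every size falls into exactly one interval, two objects that land in different lists never touch the same list head, which is the isolation property the patch description relies on.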
> >
> > This was benchmarked not long ago in the context of zsmalloc, and it
> > didn't seem to matter too much in real world applications:
> >
> > https://lore.kernel.org/linux-mm/20221107213114.916231-1-nphamcs@gmail.com/
>
> We basically reproduced this test and also ran it with zblock, and
> zblock performs better by 3.5% on a 8G ZRAM disk with btrfs and this
> difference is getting bigger with disk sizes getting bigger.
> I'm pretty sure that the difference will get even bigger over time
> because zsmalloc will run compaction more and more.

Very interesting results. Do you know if the difference is owed fully
to compaction, or is that just the factor you expect to have the
biggest scaling impact? The numbers speak for themselves, I mostly ask
out of curiosity.

> > Do you have situations where this matters?
> >
> > > Blocks make it possible to densely arrange objects of various sizes
> > > resulting in low internal fragmentation. Also this allocator tries to fill
> > > incomplete blocks instead of adding new ones thus in many cases providing
> > > a compression ratio substantially higher than z3fold and zbud.
> >
> > How does it compare to zsmalloc?
>
> That depends on the type of data being compressed, but typically
> zsmalloc is better by 5-10%.

Thanks. I think this would be great to include in the Kconfig help
text, to help users understand the tradeoff and choose accordingly.

> > > Zblock does not require MMU and also is superior to zsmalloc with
> > > regard to the worst execution times, thus allowing for better response time
> > > and real-time characteristics of the whole system.
> >
> > zsmalloc has depends on MMU, but which parts actually require it? It
> > has its own handle indirection and can migrate objects around and
> > replace backing pages without any virtual memory tricks. There is the
> > kmap stuff of course, because it supports highmem backing pages, but
> > that isn't relevant on NOMMU either.
> >
> > Also can you please elaborate on the worst execution time?
>
> I don't have the numbers at hand but zsmalloc (and z3fold, for that
> matter) do have high spikes when compaction kicks in, not to speak
> about longer disabled preemption.

Gotcha, makes sense.

> > My first impression is that this looks awfully close to zsmalloc, with
> > a couple fewer features and somewhat more static design choices. It's
> > in that sense reminiscent of the slob allocator, which we're in the
> > process of removing, because 3 slab allocators is a pain to
> > maintain. This would be the 4th zswap allocator, and it's not clear
> > that it's drastically outperforming or doing something that isn't
> > possible in one of the existing ones.
>
> I don't think this comparison is on point, at least because zblock's
> code is at least 4x smaller than zsmalloc's, and the execution
> overhead is lower too. For lower performance devices, zblock is a real
> enabler, and there's a class of high performance devices where it can
> be the best fit too.

That's fair enough.

> I get your point about 4 zswap allocators though, and have no problem
> obsoleting z3fold as soon as we get zblock in.

Ok no objection from me then.

Not to sound greedy or anything, but do you see a chance it could
supplant zbud as well in the longer term?

We noticed in tests across various Meta workloads that 2:1 packing is
pretty much always too low. The compression algorithms are just better
than that for the majority of data. The allocation strategy is fast
and simple, yes, but it wastes too much space.

zblock looks like a more reasonable balance between simplicity,
performance, and acceptable space efficiency. If it performs in the
same ballpark as zbud, it would be great to ditch that too and make
life easier for both developers and the users having to pick one.
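As a rough illustration of the packing argument, the sketch below compares objects stored per page under a 2:1 scheme with the fixed-subblock layout from Vitaly's example (4 pages, 381-byte subblocks). It assumes 4096-byte pages; the function names are hypothetical and nothing here is taken from the zblock or zbud code.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages (4 pages = 16384 bytes, as in the thread)

# A 2:1 scheme holds at most two objects per page, whatever their size.
TWO_TO_ONE_OBJECTS_PER_PAGE = 2


def subblocks_per_block(pages, subblock_size, page_size=PAGE_SIZE):
    """Split a block of `pages` pages into equal subblocks.

    Returns (number of subblocks, leftover bytes that hold nothing).
    """
    return divmod(pages * page_size, subblock_size)


def objects_per_page(pages, subblock_size, page_size=PAGE_SIZE):
    """Average number of objects one physical page holds in such a block."""
    count, _unused = subblocks_per_block(pages, subblock_size, page_size)
    return count / pages


if __name__ == "__main__":
    print(subblocks_per_block(4, 381))  # (43, 1)
    print(objects_per_page(4, 381))     # 10.75
```

With 381-byte objects, the 4-page block averages 10.75 objects per page, while a 2:1 layout stays at 2 no matter how well the data compresses.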