Date: Thu, 9 Mar 2023 11:58:53 -0800
From: Chris Li
To: "Huang, Ying"
Cc: Yosry Ahmed, lsf-pc@lists.linux-foundation.org, Johannes Weiner,
 Linux-MM, Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins,
 Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu,
 Minchan Kim, Andrew Morton
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
References: <87356e850j.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87356e850j.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Thu, Mar 09, 2023 at 08:48:28PM +0800, Huang, Ying wrote:
> Yosry
Ahmed writes:
> >
> > struct swap_desc {
> >     union { /* Use one bit to distinguish them */
> >         swp_entry_t swap_entry;
> >         struct zswap_entry *zswap_entry;
> >     };
> >     struct folio *swapcache;
> >     atomic_t swap_count;
> >     u32 id;
> > }
> >
> > Having the id in the swap_desc is convenient as we can directly map
> > the swap_desc to a swp_entry_t to place in the page tables, but I
> > don't think it's necessary. Without it, the struct size is 20 bytes,
> > so I think the extra 4 bytes are okay to use anyway if the slab
> > allocator only allocates multiples of 8 bytes.
> >
> > The idea here is to unify the swapcache and swap_count implementation
> > between different swap backends (swapfiles, zswap, etc), which would
> > create a better abstraction and reduce reinventing the wheel.
> >
> > We can reduce to only 8 bytes and only store the swap/zswap entry, but
> > we still need the swap cache anyway so might as well just store the
> > pointer in the struct and have a unified lookup-free swapcache, so
> > really 16 bytes is the minimum.
> >
> > If we stop at 16 bytes, then we need to handle swap count separately
> > in swapfiles and zswap. This is not the end of the world, but are the
> > 8 bytes worth this?
>
> If my understanding were correct, for current implementation, we need
> one swap cache pointer per swapped out page too. Even after calling
> __delete_from_swap_cache(), we store the "shadow" entry there. Although

That is correct. We have the "shadow" entry.

> it's possible to implement shadow entry reclaiming like that for file
> cache shadow entry (workingset_shadow_shrinker), we haven't done that
> yet. And, it appears that we can live with that. So, in current
> implementation, for each swapped out page, we use 9 bytes. If so, the
> memory usage ratio is 24 / 9 = 2.667, still not trivial, but not as
> horrible as 24 / 1 = 24.

The swap_desc proposal does not explicitly save the shadow entry in
swap_desc. So the math should be (24 + 8) vs (1 + 8).
There are about 20 bytes extra per page frame.

> > Instead, if we store a key else in swp_entry_t and use this to lookup
> > the swp_entry_t or zswap_entry pointer then that's essentially what
> > the swap_desc does. It just goes the extra mile of unifying the
> > swapcache as well and storing it directly in the swap_desc instead of
> > storing it in another lookup structure.
>
> If we choose to make sizeof(struct swap_desc) == 8, that is, store only
> swap_entry in swap_desc. The added indirection appears to be another
> level of page table with 1 entry. Then, we may use the similar method
> as supporting system with 2 level and 3 level page tables, like the code
> in include/asm-generic/pgtable-nopmd.h. But I haven't thought about
> this deeply.

I would like to explore other possibilities as well. More ideas and
discussion are welcome.

Chris
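
For anyone who wants to re-check the per-page arithmetic discussed in
this thread, here is a minimal user-space sketch. It is not kernel code.
The typedefs are stand-ins that mirror the shape of the kernel types
named in the quoted struct, and struct swap_desc_noid, SHADOW_BYTES,
SWAP_MAP_BYTES and the helper function names are hypothetical, introduced
only for illustration. It assumes an LP64 target (8-byte pointers and
8-byte unsigned long), as on x86-64 Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel types used in the quoted struct. */
typedef struct { unsigned long val; } swp_entry_t;
struct zswap_entry;                     /* opaque, only used via pointer */
struct folio;                           /* opaque, only used via pointer */
typedef struct { int counter; } atomic_t;
typedef uint32_t u32;

/* The layout quoted in the thread. */
struct swap_desc {
	union {                         /* one bit would distinguish the two */
		swp_entry_t swap_entry;
		struct zswap_entry *zswap_entry;
	};
	struct folio *swapcache;
	atomic_t swap_count;
	u32 id;
};

/* Hypothetical variant without the id field, to show the padding effect. */
struct swap_desc_noid {
	union {
		swp_entry_t swap_entry;
		struct zswap_entry *zswap_entry;
	};
	struct folio *swapcache;
	atomic_t swap_count;
};

/* Assumed per-page costs taken from the numbers in the thread. */
enum {
	SHADOW_BYTES   = 8,  /* shadow entry slot kept per swapped-out page */
	SWAP_MAP_BYTES = 1,  /* current one-byte swap count per slot */
};

/* Bytes per swapped-out page with a swap_desc plus a separate shadow entry. */
static inline size_t bytes_with_swap_desc(void)
{
	return sizeof(struct swap_desc) + SHADOW_BYTES;
}

/* Bytes per swapped-out page in the current scheme, as counted in the thread. */
static inline size_t bytes_current(void)
{
	return SWAP_MAP_BYTES + SHADOW_BYTES;
}

/* Difference between the two schemes. */
static inline size_t extra_bytes_per_page(void)
{
	return bytes_with_swap_desc() - bytes_current();
}

/* Sanity checks that only hold under the LP64 assumption stated above. */
static inline void check_lp64_layout(void)
{
	/* 20 bytes of fields round up to 24 because of 8-byte alignment. */
	assert(sizeof(struct swap_desc_noid) == 24);
	assert(sizeof(struct swap_desc) == 24);
}
```

Under these assumptions the comparison is 32 vs 9 bytes, a difference of
23 bytes, which is in line with the "about 20" figure. The sketch also
shows that dropping id does not shrink the struct on such a target,
since the compiler pads 20 bytes of fields up to 24.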