From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DD1FC433E0 for ; Thu, 14 Jan 2021 19:53:22 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 257CC23434 for ; Thu, 14 Jan 2021 19:53:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 257CC23434 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=konsulko.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8DDD98D011C; Thu, 14 Jan 2021 14:53:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 865BC8D00F0; Thu, 14 Jan 2021 14:53:21 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 72E408D011C; Thu, 14 Jan 2021 14:53:21 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0233.hostedemail.com [216.40.44.233]) by kanga.kvack.org (Postfix) with ESMTP id 578CC8D00F0 for ; Thu, 14 Jan 2021 14:53:21 -0500 (EST) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 1CEB8180AD81A for ; Thu, 14 Jan 2021 19:53:21 +0000 (UTC) X-FDA: 77705429802.02.play08_0606a4227529 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin02.hostedemail.com (Postfix) with ESMTP id F123F10097AA0 for ; Thu, 14 Jan 2021 19:53:20 +0000 (UTC) X-HE-Tag: play08_0606a4227529 X-Filterd-Recvd-Size: 7786 Received: from mail-lf1-f43.google.com (mail-lf1-f43.google.com [209.85.167.43]) by imf32.hostedemail.com (Postfix) with ESMTP for ; Thu, 14 Jan 2021 19:53:20 +0000 (UTC) Received: by mail-lf1-f43.google.com with SMTP id o19so9802889lfo.1 for ; Thu, 14 Jan 2021 11:53:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=konsulko.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=L+iian0bNKWWdyeoHyFsfgaU2B45qWEsKAbXCdXac7o=; b=SGPe+BfsFLcglQhOkai1pf9Q2uCLZFc9DWmD1sgmisk8LaAEz9JycFh0ECl4XJttwb yeshUlJ6WU0bcVuXV0c7yk3JayD99WKcIlr0ehfKJKYzXQayep80pHcHrrStQwKKekg5 y6yCMglTbKxmcAqQFMtzbuO3y2sauuKJOhTe4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=L+iian0bNKWWdyeoHyFsfgaU2B45qWEsKAbXCdXac7o=; b=e4DVNRpV7yf6KMOMQmJMzvdBbRu2NrTVpJC7Pwlv6/AvQd9+LlBoHNt406EdfeLiYP i84Lqd6FIw884z/+4sKZCj3cFoEkLDaIUPlWGhN7kmQIZ3sktGsoGe+c2pn5DRtwgsoK i9k9fRRNt0yzv1dtlp21mTDRfEe5NiQbcQDRHw8RKAeb/JWoO4VOmtkWwxO5JbAN2mve vMs1yXKENXZSGxtIE5quBZX0vF3qyYt7TG9OBmKUq3nRNracmf7+QV5Mg7lDAjbg9vTc I0yJO87GKhc1JzgFSg7cy/ZJeJhImHoh4UjFKAiyyI5MT0Wc9CzTKsSQSllTTyRznxck jPsQ== X-Gm-Message-State: AOAM530f7h31EDIwaLLYB/sAdDjiPs3M3UJcnpRHbj0K1wUE9kCvdVf9 axrBCSOukZ7NPmCLigXiaXcbHGcfn0gGydTJTNfK3w== X-Google-Smtp-Source: ABdhPJwpVmm3j9l7D2sla5bOaUgim30vH+EhtS3KgpqD3zij2uaJJrIKiV2b6TTrPBHvQ+Njwvf+KcOFEzi08sUfEq4= X-Received: by 2002:ac2:5edb:: with SMTP id d27mr3708226lfq.411.1610653998988; Thu, 14 Jan 2021 11:53:18 -0800 (PST) MIME-Version: 1.0 References: <1608894171-54174-1-git-send-email-tiantao6@hisilicon.com> <1608894171-54174-2-git-send-email-tiantao6@hisilicon.com> In-Reply-To: From: Vitaly Wool Date: Thu, 14 Jan 2021 20:53:07 +0100 Message-ID: Subject: Re: [RFC mm/zswap 1/2] mm/zswap: add the flag can_sleep_mapped To: Shakeel Butt Cc: Minchan Kim , Tian Tao , Seth Jennings , Dan Streetman , Andrew Morton , Barry Song , Linux-MM , Sergey Senozhatsky Content-Type: text/plain; charset="UTF-8" X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, Jan 14, 2021 at 8:21 PM Shakeel Butt wrote: > > On Thu, Jan 14, 2021 at 11:05 AM Vitaly Wool wrote: > > > > On Thu, Jan 14, 2021 at 7:56 PM Minchan Kim wrote: > > > > > > On Thu, Jan 14, 2021 at 07:40:50PM +0100, Vitaly Wool wrote: > > > > On Thu, Jan 14, 2021 at 7:29 PM Minchan Kim wrote: > > > > > > > > > > On Fri, Dec 25, 2020 at 07:02:50PM +0800, Tian Tao wrote: > > > > > > add a flag to zpool, named is "can_sleep_mapped", and have it set true > > > > > > for zbud/z3fold, set false for zsmalloc. Then zswap could go the current > > > > > > path if the flag is true; and if it's false, copy data from src to a > > > > > > temporary buffer, then unmap the handle, take the mutex, process the > > > > > > buffer instead of src to avoid sleeping function called from atomic > > > > > > context. > > > > > > > > > > > > Signed-off-by: Tian Tao > > > > > > --- > > > > > > include/linux/zpool.h | 3 +++ > > > > > > mm/zpool.c | 13 +++++++++++++ > > > > > > mm/zswap.c | 50 +++++++++++++++++++++++++++++++++++++++++++++----- > > > > > > 3 files changed, 61 insertions(+), 5 deletions(-) > > > > > > > > > > > > diff --git a/include/linux/zpool.h b/include/linux/zpool.h > > > > > > index 51bf430..e899701 100644 > > > > > > --- a/include/linux/zpool.h > > > > > > +++ b/include/linux/zpool.h > > > > > > @@ -73,6 +73,7 @@ u64 zpool_get_total_size(struct zpool *pool); > > > > > > * @malloc: allocate mem from a pool. > > > > > > * @free: free mem from a pool. > > > > > > * @shrink: shrink the pool. > > > > > > + * @sleep_mapped: whether zpool driver can sleep during map. > > > > > > > > > > I don't think it's a good idea. It just breaks zpool abstraction > > > > > in that it exposes internal implementation to user to avoid issue > > > > > zswap recently introduced. It also conflicts zpool_map_handle's > > > > > semantic. > > > > > > > > > > Rather than introducing another break in zpool due to the new > > > > > zswap feature recenlty introduced, zswap could introduce > > > > > CONFIG_ZSWAP_HW_COMPRESSOR. Once it's configured, zsmalloc could > > > > > be disabled. And with disabling CONFIG_ZSWAP_HW_COMPRESSOR, zswap > > > > > doesn't need to make any bounce buffer copy so that no existing > > > > > zsmalloc user will see performance regression. > > > > > > > > I believe it won't help that much -- the new compressor API presumes > > > > that the caller may sleep during compression and that will be an > > > > accident waiting to happen as long as we use it and keep the handle > > > > mapped in zsmalloc case. > > > > > > > > Or maybe I interpreted you wrong and you are suggesting re-introducing > > > > calls to the old API under this #ifdef, is that the case? > > > > > > Yub. zswap could abstract that part under #ifdef to keep old behavior. > > > > We can reconsider this option when zsmalloc implements reclaim > > callback. So far it's obviously too much a mess for a reason so weak. > > > > Sorry I don't understand the link between zsmalloc implementing shrink > callback and this patch. There is none. There is a link between taking all the burden to revive zsmalloc for zswap at the cost of extra zswap complexity and zsmalloc not being fully zswap compatible. The ultimate zswap goal is to cache hottest swapped-out pages in a compressed format. zsmalloc doesn't implement reclaim callback, and therefore zswap can *not* fulfill its goal since old pages are there to stay, and it can't accept new hotter ones. So, let me make it clear: zswap/zsmalloc combo is a legacy corner case. > This patch is adding an overhead for all > zswap+zsmalloc users irrespective of availability of hardware. If we > want to add support for new hardware, please add without impacting the > current users. No, it's not like that. zswap+zsmalloc combination is currently already broken and this patch implements a way to work that around. The users are already impacted and that is of course a problem. The workaround is not perfect but looks good enough for me. The suggested messy #ifdef-based alternative will turn zswap into something really tricky to understand and maintain and therefore is not going to work. The best possible thing here would be for zsmalloc to stop taking spinlock in map() callback and releasing in unmap() (it _is_ legit -- I don't argue that -- but it does look weird and goes against the recent efforts to make Linux kernel more realtime and generally more responsive; moreover, it may be just a way to conceal some race conditions which could be easily fixed otherwise). Best regards, Vitaly