From: huang ying
Date: Wed, 18 Nov 2020 10:44:13 +0800
Subject: Re: [PATCH 2/2] mm/vmalloc: rework the drain logic
To: Uladzislau Rezki
Cc: Andrew Morton, linux-mm@kvack.org, LKML, Hillf Danton, Michal Hocko,
    Matthew Wilcox, Oleksiy Avramchenko, Steven Rostedt, Huang Ying,
    Christoph Hellwig
References: <20201116220033.1837-1-urezki@gmail.com>
    <20201116220033.1837-2-urezki@gmail.com>
    <20201117130434.GA10769@pc636>
In-Reply-To: <20201117130434.GA10769@pc636>

On Tue, Nov 17, 2020 at 9:04 PM Uladzislau Rezki wrote:
>
> On Tue, Nov 17, 2020 at 10:37:34AM +0800, huang ying wrote:
> > On Tue, Nov 17, 2020 at 6:00 AM Uladzislau Rezki (Sony) wrote:
> > >
> > > A current "lazy drain" model suffers from at least two issues.
> > >
> > > First one is related to the unsorted list of vmap areas, thus
> > > in order to identify the [min:max] range of areas to be drained,
> > > it requires a full list scan. What is a time consuming if the
> > > list is too long.
> > >
> > > Second one and as a next step is about merging all fragments
> > > with a free space. What is also a time consuming because it
> > > has to iterate over entire list which holds outstanding lazy
> > > areas.
> > >
> > > See below the "preemptirqsoff" tracer that illustrates a high
> > > latency. It is ~24 676us. Our workloads like audio and video
> > > are effected by such long latency:
> >
> > This seems like a real problem.
> > But I found there's long latency
> > avoidance mechanism in the loop in __purge_vmap_area_lazy() as
> > follows,
> >
> >         if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
> >                 cond_resched_lock(&free_vmap_area_lock);
> >
> I have added that "resched threshold" because of on my tests i could
> simply hit out of memory, due to the fact that a drain work is not up
> to speed to process such long outstanding list of vmap areas.

OK.  Now I think I understand the problem.  For free area purging,
there are multiple "producers" but only one "consumer", and there is
no adequate mechanism to slow down the "producers" if the "consumer"
cannot catch up.  Your patch tries to resolve the problem by
accelerating the "consumer".  That isn't perfect, but I think we may
have quite a few opportunities to merge the free areas, so it should
just work.

I also found that the long latency avoidance logic in
__purge_vmap_area_lazy() appears problematic,

        if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
                cond_resched_lock(&free_vmap_area_lock);

Shouldn't it be something like the following?

        if (i >= BATCH && atomic_long_read(&vmap_lazy_nr) <
            resched_threshold) {
                cond_resched_lock(&free_vmap_area_lock);
                i = 0;
        } else {
                i++;
        }

This will accelerate the purging via batching and slow down vmalloc()
by holding free_vmap_area_lock for longer.  If it makes sense, can we
try this?

And, can we reduce lazy_max_pages() to control the length of the
purging list?  The list could grow beyond 8K entries if the
vmalloc/vfree size is small.

Best Regards,
Huang, Ying
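
To make the effect of the batching suggestion above concrete, here is
a minimal userspace sketch, not kernel code.  It only counts how often
the purge loop would reach the point where cond_resched_lock() is
called.  The function name count_resched_points(), its parameters and
the numbers used below are assumptions for illustration; the mail does
not give a value for BATCH.  Passing batch = 0 models the current
per-entry check.

```c
#include <stddef.h>

/*
 * Count how often a purge loop would hit its reschedule point.
 *
 * Hypothetical model, not kernel API:
 *   nr_areas          - entries on the purge list
 *   pages_per_area    - size of each entry in pages
 *   resched_threshold - stands in for the kernel's resched_threshold
 *   batch             - stands in for BATCH; 0 models the current code
 */
static size_t count_resched_points(size_t nr_areas, size_t pages_per_area,
				   size_t resched_threshold, size_t batch)
{
	size_t lazy_nr = nr_areas * pages_per_area; /* models vmap_lazy_nr */
	size_t resched = 0;
	size_t i = 0;

	for (size_t n = 0; n < nr_areas; n++) {
		lazy_nr -= pages_per_area; /* one area has been purged */

		if (i >= batch && lazy_nr < resched_threshold) {
			resched++; /* cond_resched_lock() would run here */
			i = 0;
		} else {
			i++;
		}
	}

	return resched;
}
```

With 8192 single-page areas and a threshold that is always satisfied,
count_resched_points(8192, 1, 16384, 0) returns 8192 while
count_resched_points(8192, 1, 16384, 32) returns 248, so the lock
would be offered to waiters far less often and the purge would run in
longer uninterrupted stretches.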