Date: Wed, 12 Feb 2020 19:31:00 -0400
From: Jason Gunthorpe
To: Daniel Jordan
Cc: lsf-pc@lists.linuxfoundation.org, linux-mm@kvack.org, Dan Williams,
    Dave Hansen, Tim Chen, Mike Kravetz, Herbert Xu, Steffen Klassert,
    Tejun Heo, Peter Zijlstra, Alex Williamson
Subject: Re: [LSF/MM/BPF TOPIC] kernel multithreading with padata
Message-ID: <20200212233100.GF31668@ziepe.ca>
References: <20200212224731.kmss6o6agekkg3mw@ca-dmjordan1.us.oracle.com>
In-Reply-To: <20200212224731.kmss6o6agekkg3mw@ca-dmjordan1.us.oracle.com>

On Wed, Feb 12, 2020 at 05:47:31PM -0500, Daniel Jordan wrote:
> padata has been undergoing some surgery over the last year[0] and now seems
> ready for another enhancement: splitting up and multithreading CPU-intensive
> kernel work.
>
> Quoting from an earlier series[1], the problem I'm trying to solve is
>
>   A single CPU can spend an excessive amount of time in the kernel operating
>   on large amounts of data.  Often these situations arise during initialization-
>   and destruction-related tasks, where the data involved scales with system
>   size.  These long-running jobs can slow startup and shutdown of applications
>   and the system itself while extra CPUs sit idle.
>
> Here are the current consumers:
>
>  - struct page init (boot, hotplug, pmem)
>  - VFIO page pinning (kvm guest init)
>  - fallocating a hugetlb file (database shared memory init)
>
> On a large-memory server, DRAM page init is ~23% of kernel boot (3.5s/15.2s),
> and it takes over a minute to start a VFIO-enabled kvm guest or fallocate a
> hugetlb file that occupy a significant fraction of memory.  This work results
> in 7-20x speedups and is currently increasing the uptime of our production
> kernels.
>
> Future areas include munmap/exit, umount, and __ib_umem_release.  Some of these
> need coarse locks broken up for multithreading (zone->lock, lru_lock).

I'm aware of this ib_umem_release request; it would be interesting to see.
The main workload here is put_page and dma_unmap.

Jason
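
For readers following the thread, here is a rough user-space sketch of the
splitting idea discussed above: a large range of per-page work is divided
into contiguous chunks and each chunk is handed to its own worker thread.
This is not the padata API and not kernel code. The names split_range(),
release_chunk(), release_all(), min_chunk and max_threads are all
hypothetical, and the loop body is only a stand-in for the put_page and
dma_unmap work mentioned in the reply.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, end, min_chunk, max_threads):
    """Split [start, end) into contiguous chunks, one per worker.

    The worker count is capped by max_threads and reduced further so that
    each chunk holds at least min_chunk items (a short range gets one chunk).
    """
    total = end - start
    if total <= 0:
        return []
    nthreads = max(1, min(max_threads, total // min_chunk))
    base, extra = divmod(total, nthreads)
    chunks = []
    pos = start
    for i in range(nthreads):
        # Spread the remainder over the first `extra` chunks.
        size = base + (1 if i < extra else 0)
        chunks.append((pos, pos + size))
        pos += size
    return chunks


def release_chunk(pages, lo, hi):
    """Hypothetical stand-in for the per-page release work on one chunk."""
    released = 0
    for i in range(lo, hi):
        pages[i] = None  # assumption: "releasing" just clears the slot
        released += 1
    return released


def release_all(pages, min_chunk=1024, max_threads=4):
    """Release every page, one worker thread per chunk; return the count."""
    chunks = split_range(0, len(pages), min_chunk, max_threads)
    if not chunks:
        return 0
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        counts = pool.map(lambda c: release_chunk(pages, c[0], c[1]), chunks)
        return sum(counts)
```

The sketch only shows the partitioning. In CPython the threads will not run
this loop in parallel, and it says nothing about the locking (zone->lock,
lru_lock) that the quoted mail notes would have to be broken up in the kernel.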