From: Chris Wilson
To: Jason Gunthorpe
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org, Andrew Morton
Subject: Re: [PATCH 1/2] mm/mmu_notifier: Mark up direct reclaim paths with MAYFAIL
Date: Wed, 24 Jun 2020 18:58:49 +0100
Message-ID: <159302152915.4527.9099070806700792078@build.alporthouse.com>
In-Reply-To: <20200624165057.GJ6578@ziepe.ca>

Quoting Jason Gunthorpe (2020-06-24 17:50:57)
> On Wed, Jun 24, 2020 at 03:37:32PM +0100, Chris Wilson wrote:
> > Quoting Jason Gunthorpe (2020-06-24 15:25:44)
> > > On Wed, Jun 24, 2020 at 03:21:49PM +0100, Chris Wilson wrote:
> > > > Quoting Jason Gunthorpe (2020-06-24 15:16:04)
> > > > > On Wed, Jun 24, 2020 at 03:12:42PM +0100, Chris Wilson wrote:
> > > > > > Quoting Jason Gunthorpe (2020-06-24 13:39:10)
> > > > > > > On Wed, Jun 24, 2020 at 01:21:03PM +0100, Chris Wilson wrote:
> > > > > > > > Quoting Jason Gunthorpe (2020-06-24 13:10:53)
> > > > > > > > > On Wed, Jun 24, 2020 at 09:02:47AM +0100, Chris Wilson wrote:
> > > > > > > > > > When direct reclaim enters the shrinker and tries to reclaim pages, it
> > > > > > > > > > has to opportunistically unmap them [try_to_unmap_one]. For direct
> > > > > > > > > > reclaim, the calling context is unknown and may include attempts to
> > > > > > > > > > unmap one page of a dma object while attempting to allocate more pages
> > > > > > > > > > for that object. Pass the information along that we are inside an
> > > > > > > > > > opportunistic unmap that can allow that page to remain referenced and
> > > > > > > > > > mapped, and let the callback opt in to avoiding a recursive wait.
> > > > > > > > >
> > > > > > > > > i915 should already not be holding locks shared with the notifiers
> > > > > > > > > across allocations that can trigger reclaim. This is already required
> > > > > > > > > to use notifiers correctly anyhow - why do we need something in the
> > > > > > > > > notifiers?
> > > > > > > >
> > > > > > > > for (n = 0; n < num_pages; n++)
> > > > > > > >     pin_user_page()
> > > > > > > >
> > > > > > > > may call try_to_unmap_page from the lru shrinker for [0, n-1].
> > > > > > >
> > > > > > > Yes, of course you can't hold any locks that intersect with notifiers
> > > > > > > across pin_user_page()/get_user_page()
> > > > > >
> > > > > > What lock though? It's just the page refcount, shrinker asks us to drop
> > > > > > it [via mmu], we reply we would like to keep using that page as freeing
> > > > > > it for the current allocation is "robbing Peter to pay Paul".
> > > > >
> > > > > Maybe I'm unclear what this series is actually trying to fix?
> > > > >
> > > > > You said "avoiding a recursive wait" which sounds like some locking
> > > > > deadlock to me.
> > > >
> > > > It's the shrinker being called while we are allocating for/on behalf of
> > > > the object. As we are actively using the object, we don't want to free
> > > > it -- the partial object allocation being the clearest, if the object
> > > > consists of 2 pages, trying to free page 0 in order to allocate page 1
> > > > has to fail (and the shrinker should find another candidate to reclaim,
> > > > or fail the allocation).
> > >
> > > mmu notifiers are not for influencing policy of the mm.
> >
> > Its policy is "this may fail" regardless of the mmu notifier at this
> > point. That is not changed.
>
> MMU notifiers are for tracking updates, they are not allowed to fail.
> The one slightly weird case of non-blocking is the only exception.
>
> > Your suggestion is that we move the pages to the unevictable mapping so
> > that the shrinker LRU is never invoked on pages we have grabbed with
> > pin_user_page. Does that work with the rest of the mmu notifiers?
>
> That is beyond what I'm familiar with - but generally - if you want to
> influence decisions the MM is making then it needs to be at the
> front of the process and not inside notifiers.
>
> So what you describe seems broadly appropriate to me.

Sadly, it's a mlock_vma_page problem all over again.

> I'm still a little unclear on what you are trying to fix - pinned
> pages are definitely not freed, do you have some case where pages
> which are pinned are being cleaned out from the MM despite being
> pinned? Sounds a bit strange, maybe that is worth addressing directly?

It suffices to say that pin_user_pages does not prevent try_to_unmap_one
from trying to revoke the page. But we could perhaps slip a
page_maybe_dma_pinned() in around there and see what happens.
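
A rough userspace sketch of what such a check might look like, not kernel
code and not the patch under discussion. It assumes a pin is recorded by
biasing the page refcount by 1024 (as I understand the FOLL_PIN accounting
for small pages), so the check is only a heuristic. Every toy_* name and
TOY_PIN_BIAS is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: a pin adds this bias to the refcount instead of one. */
#define TOY_PIN_BIAS 1024

/* Hypothetical stand-in for struct page; only what the sketch needs. */
struct toy_page {
	int refcount;
	bool mapped;
};

/* Model of pinning a page for DMA: bump the refcount by the bias. */
static inline void toy_pin(struct toy_page *page)
{
	page->refcount += TOY_PIN_BIAS;
}

/* Model of dropping the pin again. */
static inline void toy_unpin(struct toy_page *page)
{
	assert(page->refcount >= TOY_PIN_BIAS);
	page->refcount -= TOY_PIN_BIAS;
}

/*
 * Heuristic check: true if the refcount is large enough to contain a pin.
 * A page with 1024 or more ordinary references is a false positive.
 */
static inline bool toy_maybe_dma_pinned(const struct toy_page *page)
{
	return page->refcount >= TOY_PIN_BIAS;
}

/*
 * Opportunistic unmap: skip pages that look pinned so that reclaim can
 * move on to another candidate. Returns true if the page was unmapped.
 */
static inline bool toy_try_to_unmap(struct toy_page *page)
{
	if (toy_maybe_dma_pinned(page))
		return false;

	page->mapped = false;
	return true;
}
```

With this model, a page that has been through toy_pin() is left mapped by
toy_try_to_unmap(), and becomes reclaimable again after toy_unpin().
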
-Chris