Date: Thu, 15 Sep 2022 10:16:25 +0200
From: Jan Kara
To: Al Viro
Cc: Jan Kara, Christoph Hellwig, John Hubbard, Andrew Morton, Jens Axboe,
    Miklos Szeredi, "Darrick J. Wong", Trond Myklebust, Anna Schumaker,
    David Hildenbrand, Logan Gunthorpe, linux-block@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
    linux-nfs@vger.kernel.org, linux-mm@kvack.org, LKML
Subject: Re: [PATCH v2 4/7] iov_iter: new iov_iter_pin_pages*() routines
Message-ID: <20220915081625.6a72nza6yq4l5etp@quack3>

On Wed 14-09-22 17:42:40, Al Viro wrote:
> On Wed, Sep 14, 2022 at 04:52:33PM +0200, Jan Kara wrote:
> > > =================================================================================
> > > CASE 5: Pinning in order to write to the data within the page
> > > -------------------------------------------------------------
> > > Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
> > > write to a page's data, unpin" can cause a problem. Case 5 may be considered a
> > > superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
> > > other words, if the code is neither Case 1 nor Case 2, it may still require
> > > FOLL_PIN, for patterns like this:
> > >
> > > Correct (uses FOLL_PIN calls):
> > >     pin_user_pages()
> > >     write to the data within the pages
> > >     unpin_user_pages()
> > >
> > > INCORRECT (uses FOLL_GET calls):
> > >     get_user_pages()
> > >     write to the data within the pages
> > >     put_page()
> > > =================================================================================
> >
> > Yes, that was my point.
>
> The thing is, at which point do we pin those pages? pin_user_pages() works by
> userland address; by the time we get to any of those we have struct page
> references and no idea whether they are still mapped anywhere.

Yes, pin_user_pages() currently works by page address but there's nothing
fundamental about that.
Technically, a pin is currently just another type of page reference, so
we can just as well pin the page when given a struct page. In fact, John
Hubbard has added such a helper in this series.

> How would that work? What protects the area where you want to avoid running
> into pinned pages from previously acceptable page getting pinned? If "they
> must have been successfully unmapped" is a part of what you are planning, we
> really do have a problem...

But this is a very good question. So far the idea was that we lock the
page, unmap (or write-protect) the page, and then check pincount == 0, and
that this is a reliable method for making sure page data is stable (until we
unlock the page & release other locks blocking page faults and writes). But
once ordinary page references can suddenly be used to create pins, this does
not work anymore. Hrm.

Just brainstorming ideas now: So we'd either need to obtain the pins early,
when we still have the virtual address (I guess that is often not practical,
but it should work e.g. for the normal direct IO path), or we need some way
to "simulate" the page fault when pinning the page and just not map it into
page tables in the end. This simulated page fault could perhaps be avoided
if an rmap walk shows that the page is already mapped somewhere with
suitable permissions.

								Honza
-- 
Jan Kara
SUSE Labs, CR
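
The race discussed in this message can be illustrated with a minimal
user-space sketch. It is not kernel code: struct toy_page and every toy_*
function are hypothetical names invented for the illustration, and the
model assumes that pinning by virtual address has to go through a writable
mapping (and so is blocked once the page is write-protected), while pinning
by struct page consults no mapping at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page. All fields are hypothetical. */
struct toy_page {
	bool locked;     /* page lock held by the side that wants stable data */
	bool mapped_rw;  /* a writable userspace mapping currently exists */
	int refcount;    /* ordinary references (FOLL_GET style) */
	int pincount;    /* pins (FOLL_PIN style) */
};

/*
 * Pin through a user virtual address. In this model the pin needs a
 * writable mapping; a real kernel would take a page fault here and wait
 * on the page lock, which we approximate by simply failing.
 */
static bool toy_pin_by_vaddr(struct toy_page *p)
{
	if (!p->mapped_rw)
		return false;
	p->pincount++;
	return true;
}

/* Pin given only the struct page: any existing reference is enough. */
static bool toy_pin_by_page(struct toy_page *p)
{
	if (p->refcount == 0)
		return false;
	p->pincount++;
	return true;
}

static void toy_unpin(struct toy_page *p)
{
	assert(p->pincount > 0);
	p->pincount--;
}

/*
 * The stability check described in the message: lock the page,
 * write-protect it, then require pincount == 0. Returns true when the
 * caller may treat the page data as stable until it unlocks.
 */
static bool toy_begin_stable(struct toy_page *p)
{
	p->locked = true;
	p->mapped_rw = false;
	if (p->pincount != 0) {
		p->locked = false;
		return false;
	}
	return true;
}
```

In this model, once toy_begin_stable() has succeeded, toy_pin_by_vaddr()
can no longer create a pin, but toy_pin_by_page() still can, which is the
hole in the "check pincount == 0" scheme that the question points at.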