From: Hillf Danton
To: linux-mm
Cc: Andrew Morton, linux-kernel, Vlastimil Babka, Jan Kara, Mel Gorman, Jerome Glisse, Dan Williams, Ira Weiny, John Hubbard, Christoph Hellwig, Jonathan Corbet, Hillf Danton
Subject: [RFC] mm: gup: add helper page_try_gup_pin(page)
Date: Sun, 3 Nov 2019 19:21:13 +0800
Message-Id: <20191103112113.8256-1-hdanton@sina.com>

A helper is added to mitigate the gup issue described at
https://lwn.net/Articles/784574/: it is unsafe to write out a dirty
page that is already gup pinned for DMA.

In the current writeback context, dirty pages are written out without
checking whether they have been gup pinned, and nothing is marked to
keep gupers off. In the gup context, file pages can be pinned alongside
other gupers, and writeback is ignored.

The fact that there is no room in the current page struct, supposedly
not even one bit, for tracking gupers makes the issue harder to tackle.

The approach here is to make gupers mutually exclusive, because it makes
no sense to allow a file page to have multiple gupers at the same time.
A guper's singularity then helps to tell whether a guper exists by
looking at the change in the page count.

The result of that singularity is not yet 100% correct but "best
effort", as the effect of a random get_page() is perhaps also folded
into it.
It is assumed that best effort is feasible/acceptable in practice
without the cost of growing the page struct size by one bit, provided it
is true that something similar has been applied in the page migrate and
reclaim contexts for a while.

With the helper in place, we skip writing out a dirty page if a guper is
detected. On gupping, we give up pinning a file page if it is under
writeback or if we lose the race to become a guper.

The end result is that no gup-pinned page will be put under writeback.

It is based on next-20191031.

Signed-off-by: Hillf Danton
---

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1055,6 +1055,29 @@ static inline void put_page(struct page
 		__put_page(page);
 }
 
+/*
+ * @page must be pagecache page
+ */
+static inline bool page_try_gup_pin(struct page *page)
+{
+	int count;
+
+	page = compound_head(page);
+	count = page_ref_count(page);
+	smp_mb__after_atomic();
+
+	if (!count || count > page_mapcount(page) + 1 +
+				page_has_private(page))
+		return false;
+
+	if (page_ref_inc_return(page) == count + 1) {
+		smp_mb__after_atomic();
+		return true;
+	}
+	put_page(page);
+	return false;
+}
+
 /**
  * put_user_page() - release a gup-pinned page
  * @page: pointer to page to be released
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -253,7 +253,11 @@ retry:
 	}
 
 	if (flags & FOLL_GET) {
-		if (unlikely(!try_get_page(page))) {
+		if (page_is_file_cache(page)) {
+			if (PageWriteback(page) || !page_try_gup_pin(page))
+				goto pin_fail;
+		} else if (unlikely(!try_get_page(page))) {
+pin_fail:
 			page = ERR_PTR(-ENOMEM);
 			goto out;
 		}
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2202,6 +2202,9 @@ int write_cache_pages(struct address_spa
 
 			done_index = page->index;
 
+			if (!page_try_gup_pin(page))
+				continue;
+
 			lock_page(page);
 
 			/*
@@ -2215,6 +2218,7 @@ int write_cache_pages(struct address_spa
 			if (unlikely(page->mapping != mapping)) {
 continue_unlock:
 				unlock_page(page);
+				put_page(page);
 				continue;
 			}
 
@@ -2236,6 +2240,11 @@ continue_unlock:
 
 			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
 			error = (*writepage)(page, wbc, data);
+			/*
+			 * protection of gup pin is no longer needed after
+			 * putting page under writeback
+			 */
+			put_page(page);
 			if (unlikely(error)) {
 				/*
 				 * Handle errors according to the type of
--
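
For readers who want to poke at the refcount test outside the kernel,
here is a minimal userspace sketch of the check done in
page_try_gup_pin(). It is an illustration under stated assumptions, not
kernel code: struct model_page, model_try_gup_pin(), model_put_page()
and model_selfcheck() are hypothetical names, the three fields merely
stand in for page_ref_count(), page_mapcount() and page_has_private(),
and C11 atomics replace the kernel's page_ref helpers and barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct model_page {
	atomic_int refcount;	/* models page_ref_count() */
	int mapcount;		/* models page_mapcount() */
	bool has_private;	/* models page_has_private() */
};

/*
 * Take a pin only if every existing reference is accounted for by the
 * page cache (1), the mappings, and private data, and if nobody else
 * changed the count between our read and our increment.
 */
static bool model_try_gup_pin(struct model_page *p)
{
	int count = atomic_load(&p->refcount);
	int expected = p->mapcount + 1 + (p->has_private ? 1 : 0);

	if (count == 0 || count > expected)
		return false;	/* freed page, or an extra holder exists */

	/* fetch_add returns the old value; the patch compares new to count + 1 */
	if (atomic_fetch_add(&p->refcount, 1) == count)
		return true;

	/* lost the race: drop the reference we just took */
	atomic_fetch_sub(&p->refcount, 1);
	return false;
}

/* Models put_page() for a page that is not being freed. */
static void model_put_page(struct model_page *p)
{
	atomic_fetch_sub(&p->refcount, 1);
}

/* One page cache reference plus one mapping: refcount 2, mapcount 1. */
static bool model_selfcheck(void)
{
	struct model_page p = { .mapcount = 1, .has_private = false };
	bool first, second, third;

	atomic_init(&p.refcount, 2);
	first = model_try_gup_pin(&p);	/* no extra holder: pin succeeds */
	second = model_try_gup_pin(&p);	/* pinned already: pin refused */
	model_put_page(&p);		/* release the first pin */
	third = model_try_gup_pin(&p);	/* page can be pinned again */

	assert(first && !second && third);
	return first && !second && third;
}
```

Calling model_selfcheck() from a small main() walks through the
exclusive-pin behaviour described in the changelog. The model also makes
the "best effort" caveat visible: any unrelated reference raises
refcount above the expected value and is indistinguishable from a
guper.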