From: Joao Martins <joao.m.martins@oracle.com> To: Jason Gunthorpe <jgg@ziepe.ca> Cc: linux-mm@kvack.org, Dan Williams <dan.j.williams@intel.com>, Ira Weiny <ira.weiny@intel.com>, linux-nvdimm@lists.01.org, Matthew Wilcox <willy@infradead.org>, Jane Chu <jane.chu@oracle.com>, Muchun Song <songmuchun@bytedance.com>, Mike Kravetz <mike.kravetz@oracle.com>, Andrew Morton <akpm@linux-foundation.org>, Daniel Jordan <daniel.m.jordan@oracle.com>, John Hubbard <jhubbard@nvidia.com> Subject: Re: [PATCH RFC 7/9] mm/gup: Decrement head page once for group of subpages Date: Thu, 17 Dec 2020 19:05:37 +0000 [thread overview] Message-ID: <cf5585f0-1352-e3ab-9dbf-0185ad0a1b31@oracle.com> (raw) In-Reply-To: <20201208193446.GP5487@ziepe.ca> On 12/8/20 7:34 PM, Jason Gunthorpe wrote: >> @@ -274,6 +291,7 @@ void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages, >> bool make_dirty) >> { >> unsigned long index; >> + int refs = 1; >> >> /* >> * TODO: this can be optimized for huge pages: if a series of pages is >> @@ -286,8 +304,9 @@ void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages, >> return; >> } >> >> - for (index = 0; index < npages; index++) { >> + for (index = 0; index < npages; index += refs) { >> struct page *page = compound_head(pages[index]); >> + > > I think this is really hard to read, it should end up as some: > > for_each_compond_head(page_list, page_list_len, &head, &ntails) { > if (!PageDirty(head)) > set_page_dirty_lock(head, ntails); > unpin_user_page(head, ntails); > } > > And maybe you open code that iteration, but that basic idea to find a > compound_head and ntails should be computational work performed. > > No reason not to fix set_page_dirty_lock() too while you are here. > The wack of atomics you mentioned earlier you referred to, I suppose it ends being account_page_dirtied(). See partial diff at the end. I was looking at the latter part and renaming all the fs that supply set_page_dirty()... But now my concern is whether it's really safe to assume that filesystems that supply it ... have indeed the ability to dirty @ntails pages. Functionally, fixing set_page_dirty_lock() means we don't call set_page_dirty(head) @ntails times as it happens today, we would only call once with ntails as argument. Perhaps the safest thing to do is still to iterate over @ntails and call .set_page_dirty(page) and instead introduce a set_page_range_dirty() which individual filesystems can separately supply and give precedence of ->set_page_range_dirty() as opposed to ->set_page_dirty() ? Joao --------------------->8------------------------------ diff --git a/mm/gup.c b/mm/gup.c index 41ab3d48e1bb..5f8a0f16ab62 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -295,7 +295,7 @@ void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages, * next writeback cycle. This is harmless. */ if (!PageDirty(head)) - set_page_dirty_lock(head); + set_page_range_dirty_lock(head, ntails); put_compound_head(head, ntails, FOLL_PIN); } } diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 088729ea80b2..4642d037f657 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2417,7 +2417,8 @@ int __set_page_dirty_no_writeback(struct page *page, unsigned int ntails) * * NOTE: This relies on being atomic wrt interrupts. */ -void account_page_dirtied(struct page *page, struct address_space *mapping) +void account_page_dirtied(struct page *page, struct address_space *mapping, + unsigned int ntails) { struct inode *inode = mapping->host; @@ -2425,17 +2426,18 @@ void account_page_dirtied(struct page *page, struct address_space *mapping) if (mapping_can_writeback(mapping)) { struct bdi_writeback *wb; + int nr = ntails + 1; inode_attach_wb(inode, page); wb = inode_to_wb(inode); - __inc_lruvec_page_state(page, NR_FILE_DIRTY); - __inc_zone_page_state(page, NR_ZONE_WRITE_PENDING); - __inc_node_page_state(page, NR_DIRTIED); - inc_wb_stat(wb, WB_RECLAIMABLE); - inc_wb_stat(wb, WB_DIRTIED); - task_io_account_write(PAGE_SIZE); - current->nr_dirtied++; + mod_lruvec_page_state(page, NR_FILE_DIRTY, nr); + mod_zone_page_state(page_zone(page), NR_ZONE_WRITE_PENDING, nr); + mod_node_page_state(page_pgdat(page), NR_DIRTIED, nr); + __add_wb_stat(wb, WB_RECLAIMABLE, nr); + __add_wb_stat(wb, WB_DIRTIED, nr); + task_io_account_write(nr * PAGE_SIZE); + current->nr_dirtied += nr; this_cpu_inc(bdp_ratelimits); mem_cgroup_track_foreign_dirty(page, wb); @@ -2485,7 +2487,7 @@ int __set_page_dirty_nobuffers(struct page *page, unsigned int ntails) xa_lock_irqsave(&mapping->i_pages, flags); BUG_ON(page_mapping(page) != mapping); WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page)); - account_page_dirtied(page, mapping); + account_page_dirtied(page, mapping, ntails); __xa_set_mark(&mapping->i_pages, page_index(page), PAGECACHE_TAG_DIRTY); xa_unlock_irqrestore(&mapping->i_pages, flags); @@ -2624,6 +2626,27 @@ int set_page_dirty_lock(struct page *page) } EXPORT_SYMBOL(set_page_dirty_lock); +/* + * set_page_range_dirty() is racy if the caller has no reference against + * page->mapping->host, and if the page is unlocked. This is because another + * CPU could truncate the page off the mapping and then free the mapping. + * + * Usually, the page _is_ locked, or the caller is a user-space process which + * holds a reference on the inode by having an open file. + * + * In other cases, the page should be locked before running set_page_range_dirty(). + */ +int set_page_range_dirty_lock(struct page *page, unsigned int ntails) +{ + int ret; + + lock_page(page); + ret = set_page_range_dirty(page, ntails); + unlock_page(page); + return ret; +} +EXPORT_SYMBOL(set_page_range_dirty_lock); +
next prev parent reply other threads:[~2020-12-17 19:08 UTC|newest] Thread overview: 80+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-12-08 17:28 [PATCH RFC 0/9] mm, sparse-vmemmap: Introduce compound pagemaps Joao Martins 2020-12-08 17:28 ` [PATCH RFC 1/9] memremap: add ZONE_DEVICE support for compound pages Joao Martins 2020-12-09 5:59 ` John Hubbard 2020-12-09 6:33 ` Matthew Wilcox 2020-12-09 13:12 ` Joao Martins 2021-02-20 1:43 ` Dan Williams 2021-02-22 11:24 ` Joao Martins 2021-02-22 20:37 ` Dan Williams 2021-02-23 15:46 ` Joao Martins 2021-02-23 16:50 ` Dan Williams 2021-02-23 17:18 ` Joao Martins 2021-02-23 18:18 ` Dan Williams 2021-03-10 18:12 ` Joao Martins 2021-03-12 5:54 ` Dan Williams 2021-02-20 1:24 ` Dan Williams 2021-02-22 11:09 ` Joao Martins 2020-12-08 17:28 ` [PATCH RFC 2/9] sparse-vmemmap: Consolidate arguments in vmemmap section populate Joao Martins 2020-12-09 6:16 ` John Hubbard 2020-12-09 13:51 ` Joao Martins 2021-02-20 1:49 ` Dan Williams 2021-02-22 11:26 ` Joao Martins 2020-12-08 17:28 ` [PATCH RFC 3/9] sparse-vmemmap: Reuse vmemmap areas for a given mhp_params::align Joao Martins 2020-12-08 17:38 ` Joao Martins 2020-12-08 17:28 ` [PATCH RFC 3/9] sparse-vmemmap: Reuse vmemmap areas for a given page size Joao Martins 2021-02-20 3:34 ` Dan Williams 2021-02-22 11:42 ` Joao Martins 2021-02-22 22:40 ` Dan Williams 2021-02-23 15:46 ` Joao Martins 2020-12-08 17:28 ` [PATCH RFC 4/9] mm/page_alloc: Reuse tail struct pages for compound pagemaps Joao Martins 2021-02-20 6:17 ` Dan Williams 2021-02-22 12:01 ` Joao Martins 2020-12-08 17:28 ` [PATCH RFC 5/9] device-dax: Compound pagemap support Joao Martins 2020-12-08 17:28 ` [PATCH RFC 6/9] mm/gup: Grab head page refcount once for group of subpages Joao Martins 2020-12-08 19:49 ` Jason Gunthorpe 2020-12-09 11:05 ` Joao Martins 2020-12-09 15:15 ` Jason Gunthorpe 2020-12-09 16:02 ` Joao Martins 2020-12-09 16:24 ` Jason Gunthorpe 2020-12-09 17:27 ` Joao Martins 2020-12-09 18:14 ` Matthew Wilcox 2020-12-09 19:08 ` Jason Gunthorpe 2020-12-10 15:43 ` Joao Martins 2020-12-09 4:40 ` John Hubbard 2020-12-09 13:44 ` Joao Martins 2020-12-08 17:28 ` [PATCH RFC 7/9] mm/gup: Decrement head page " Joao Martins 2020-12-08 19:34 ` Jason Gunthorpe 2020-12-09 5:06 ` John Hubbard 2020-12-09 13:43 ` Jason Gunthorpe 2020-12-09 12:17 ` Joao Martins 2020-12-17 19:05 ` Joao Martins [this message] 2020-12-17 20:05 ` Jason Gunthorpe 2020-12-17 22:34 ` Joao Martins 2020-12-18 14:25 ` Jason Gunthorpe 2020-12-19 2:06 ` John Hubbard 2020-12-19 13:10 ` Joao Martins 2020-12-08 17:29 ` [PATCH RFC 8/9] RDMA/umem: batch page unpin in __ib_mem_release() Joao Martins 2020-12-08 19:29 ` Jason Gunthorpe 2020-12-09 10:59 ` Joao Martins 2020-12-19 13:15 ` Joao Martins 2020-12-09 5:18 ` John Hubbard 2020-12-08 17:29 ` [PATCH RFC 9/9] mm: Add follow_devmap_page() for devdax vmas Joao Martins 2020-12-08 19:57 ` Jason Gunthorpe 2020-12-09 8:05 ` Christoph Hellwig 2020-12-09 11:19 ` Joao Martins 2020-12-09 5:23 ` John Hubbard 2020-12-09 9:38 ` [PATCH RFC 0/9] mm, sparse-vmemmap: Introduce compound pagemaps David Hildenbrand 2020-12-09 9:52 ` [External] " Muchun Song 2021-02-20 1:18 ` Dan Williams 2021-02-22 11:06 ` Joao Martins 2021-02-22 14:32 ` Joao Martins 2021-02-23 16:28 ` Joao Martins 2021-02-23 16:44 ` Dan Williams 2021-02-23 17:15 ` Joao Martins 2021-02-23 18:15 ` Dan Williams 2021-02-23 18:54 ` Jason Gunthorpe 2021-02-23 22:48 ` Dan Williams 2021-02-23 23:07 ` Jason Gunthorpe 2021-02-24 0:14 ` Dan Williams 2021-02-24 1:00 ` Jason Gunthorpe 2021-02-24 1:32 ` Dan Williams
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=cf5585f0-1352-e3ab-9dbf-0185ad0a1b31@oracle.com \ --to=joao.m.martins@oracle.com \ --cc=akpm@linux-foundation.org \ --cc=dan.j.williams@intel.com \ --cc=daniel.m.jordan@oracle.com \ --cc=ira.weiny@intel.com \ --cc=jane.chu@oracle.com \ --cc=jgg@ziepe.ca \ --cc=jhubbard@nvidia.com \ --cc=linux-mm@kvack.org \ --cc=linux-nvdimm@lists.01.org \ --cc=mike.kravetz@oracle.com \ --cc=songmuchun@bytedance.com \ --cc=willy@infradead.org \ --subject='Re: [PATCH RFC 7/9] mm/gup: Decrement head page once for group of subpages' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).