From: John Hubbard <jhubbard@nvidia.com> To: Balbir Singh <bsingharora@gmail.com> Cc: Matthew Wilcox <willy@infradead.org>, Michal Hocko <mhocko@kernel.org>, Christopher Lameter <cl@linux.com>, Jason Gunthorpe <jgg@ziepe.ca>, Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>, linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>, LKML <linux-kernel@vger.kernel.org>, linux-rdma <linux-rdma@vger.kernel.org>, linux-fsdevel@vger.kernel.org Subject: Re: [PATCH 4/6] mm: introduce page->dma_pinned_flags, _count Date: Fri, 12 Oct 2018 17:15:51 -0700 [thread overview] Message-ID: <b115b2ce-8fe8-db03-da9c-452511c8ed27@nvidia.com> (raw) In-Reply-To: <20181012105612.GK8537@350D> On 10/12/18 3:56 AM, Balbir Singh wrote: > On Thu, Oct 11, 2018 at 11:00:12PM -0700, john.hubbard@gmail.com wrote: >> From: John Hubbard <jhubbard@nvidia.com> [...] >> + * Because page->dma_pinned_flags is unioned with page->lru, any page that >> + * uses these flags must NOT be on an LRU. That's partly enforced by >> + * ClearPageDmaPinned, which gives the page back to LRU. >> + * >> + * PageDmaPinned also corresponds to PageTail (the 0th bit in the first union >> + * of struct page), and this flag is checked without knowing whether it is a >> + * tail page or a PageDmaPinned page. Therefore, start the flags at bit 1 (0x2), >> + * rather than bit 0. >> + */ >> +#define PAGE_DMA_PINNED 0x2 >> +#define PAGE_DMA_PINNED_FLAGS (PAGE_DMA_PINNED) >> + > > This is really subtle, additional changes to compound_head will need to coordinate > with these flags? Also doesn't this bit need to be unique across all structs in > the union? I guess that is guaranteed by the fact that page == compound_head(page) > as per your assertion, but I've forgotten why that is true. Could you please > add some commentary on that > Yes, agreed. I've rewritten and augmented that comment block, plus removed the PAGE_DMA_PINNED_FLAGS (there are no more bits available, so it's just misleading to even have it). So now it looks like this: /* * Because page->dma_pinned_flags is unioned with page->lru, any page that * uses these flags must NOT be on an LRU. That's partly enforced by * ClearPageDmaPinned, which gives the page back to LRU. * * PageDmaPinned is checked without knowing whether it is a tail page or a * PageDmaPinned page. For that reason, PageDmaPinned avoids PageTail (the 0th * bit in the first union of struct page), and instead uses bit 1 (0x2), * rather than bit 0. * * PageDmaPinned can only be used if no other systems are using the same bit * across the first struct page union. In this regard, it is similar to * PageTail, and in fact, because of PageTail's constraint that bit 0 be left * alone, bit 1 is also left alone so far: other union elements (ignoring tail * pages) put pointers there, and pointer alignment leaves the lower two bits * available. * * So, constraints include: * * -- Only use PageDmaPinned on non-tail pages. * -- Remove the page from any LRU list first. */ #define PAGE_DMA_PINNED 0x2 /* * Because these flags are read outside of a lock, ensure visibility between * different threads, by using READ|WRITE_ONCE. */ static __always_inline int PageDmaPinned(struct page *page) { VM_BUG_ON(page != compound_head(page)); return (READ_ONCE(page->dma_pinned_flags) & PAGE_DMA_PINNED) != 0; } [...] >> +static __always_inline void SetPageDmaPinned(struct page *page) >> +{ >> + VM_BUG_ON(page != compound_head(page)); > > VM_BUG_ON(!list_empty(&page->lru)) There is only one place where we set this flag, and that is when (in patch 6/6) transitioning from a page that might (or might not) have been on an LRU. In that case, the calling code has already corrupted page->lru, by writing to page->dma_pinned_count, which is unions with page->lru: atomic_set(&page->dma_pinned_count, 1); SetPageDmaPinned(page); ...so it would be inappropriate to call a list function, such as list_empty(), on that field. Let's just leave it as-is. > >> + WRITE_ONCE(page->dma_pinned_flags, PAGE_DMA_PINNED); >> +} >> + >> +static __always_inline void ClearPageDmaPinned(struct page *page) >> +{ >> + VM_BUG_ON(page != compound_head(page)); >> + VM_BUG_ON_PAGE(!PageDmaPinnedFlags(page), page); >> + >> + /* This does a WRITE_ONCE to the lru.next, which is also the >> + * page->dma_pinned_flags field. So in addition to restoring page->lru, >> + * this provides visibility to other threads. >> + */ >> + INIT_LIST_HEAD(&page->lru); > > This assumes certain things about list_head, why not use the correct > initialization bits. > Yes, OK, changed to: static __always_inline void ClearPageDmaPinned(struct page *page) { VM_BUG_ON(page != compound_head(page)); VM_BUG_ON_PAGE(!PageDmaPinned(page), page); /* Provide visibility to other threads: */ WRITE_ONCE(page->dma_pinned_flags, 0); /* * Safety precaution: restore the list head, before possibly returning * the page to other subsystems. */ INIT_LIST_HEAD(&page->lru); } -- thanks, John Hubbard NVIDIA
WARNING: multiple messages have this Message-ID (diff)
From: John Hubbard <jhubbard@nvidia.com> To: Balbir Singh <bsingharora@gmail.com> Cc: Matthew Wilcox <willy@infradead.org>, Michal Hocko <mhocko@kernel.org>, Christopher Lameter <cl@linux.com>, Jason Gunthorpe <jgg@ziepe.ca>, Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>, <linux-mm@kvack.org>, Andrew Morton <akpm@linux-foundation.org>, LKML <linux-kernel@vger.kernel.org>, linux-rdma <linux-rdma@vger.kernel.org>, <linux-fsdevel@vger.kernel.org> Subject: Re: [PATCH 4/6] mm: introduce page->dma_pinned_flags, _count Date: Fri, 12 Oct 2018 17:15:51 -0700 [thread overview] Message-ID: <b115b2ce-8fe8-db03-da9c-452511c8ed27@nvidia.com> (raw) In-Reply-To: <20181012105612.GK8537@350D> On 10/12/18 3:56 AM, Balbir Singh wrote: > On Thu, Oct 11, 2018 at 11:00:12PM -0700, john.hubbard@gmail.com wrote: >> From: John Hubbard <jhubbard@nvidia.com> [...] >> + * Because page->dma_pinned_flags is unioned with page->lru, any page that >> + * uses these flags must NOT be on an LRU. That's partly enforced by >> + * ClearPageDmaPinned, which gives the page back to LRU. >> + * >> + * PageDmaPinned also corresponds to PageTail (the 0th bit in the first union >> + * of struct page), and this flag is checked without knowing whether it is a >> + * tail page or a PageDmaPinned page. Therefore, start the flags at bit 1 (0x2), >> + * rather than bit 0. >> + */ >> +#define PAGE_DMA_PINNED 0x2 >> +#define PAGE_DMA_PINNED_FLAGS (PAGE_DMA_PINNED) >> + > > This is really subtle, additional changes to compound_head will need to coordinate > with these flags? Also doesn't this bit need to be unique across all structs in > the union? I guess that is guaranteed by the fact that page == compound_head(page) > as per your assertion, but I've forgotten why that is true. Could you please > add some commentary on that > Yes, agreed. I've rewritten and augmented that comment block, plus removed the PAGE_DMA_PINNED_FLAGS (there are no more bits available, so it's just misleading to even have it). So now it looks like this: /* * Because page->dma_pinned_flags is unioned with page->lru, any page that * uses these flags must NOT be on an LRU. That's partly enforced by * ClearPageDmaPinned, which gives the page back to LRU. * * PageDmaPinned is checked without knowing whether it is a tail page or a * PageDmaPinned page. For that reason, PageDmaPinned avoids PageTail (the 0th * bit in the first union of struct page), and instead uses bit 1 (0x2), * rather than bit 0. * * PageDmaPinned can only be used if no other systems are using the same bit * across the first struct page union. In this regard, it is similar to * PageTail, and in fact, because of PageTail's constraint that bit 0 be left * alone, bit 1 is also left alone so far: other union elements (ignoring tail * pages) put pointers there, and pointer alignment leaves the lower two bits * available. * * So, constraints include: * * -- Only use PageDmaPinned on non-tail pages. * -- Remove the page from any LRU list first. */ #define PAGE_DMA_PINNED 0x2 /* * Because these flags are read outside of a lock, ensure visibility between * different threads, by using READ|WRITE_ONCE. */ static __always_inline int PageDmaPinned(struct page *page) { VM_BUG_ON(page != compound_head(page)); return (READ_ONCE(page->dma_pinned_flags) & PAGE_DMA_PINNED) != 0; } [...] >> +static __always_inline void SetPageDmaPinned(struct page *page) >> +{ >> + VM_BUG_ON(page != compound_head(page)); > > VM_BUG_ON(!list_empty(&page->lru)) There is only one place where we set this flag, and that is when (in patch 6/6) transitioning from a page that might (or might not) have been on an LRU. In that case, the calling code has already corrupted page->lru, by writing to page->dma_pinned_count, which is unions with page->lru: atomic_set(&page->dma_pinned_count, 1); SetPageDmaPinned(page); ...so it would be inappropriate to call a list function, such as list_empty(), on that field. Let's just leave it as-is. > >> + WRITE_ONCE(page->dma_pinned_flags, PAGE_DMA_PINNED); >> +} >> + >> +static __always_inline void ClearPageDmaPinned(struct page *page) >> +{ >> + VM_BUG_ON(page != compound_head(page)); >> + VM_BUG_ON_PAGE(!PageDmaPinnedFlags(page), page); >> + >> + /* This does a WRITE_ONCE to the lru.next, which is also the >> + * page->dma_pinned_flags field. So in addition to restoring page->lru, >> + * this provides visibility to other threads. >> + */ >> + INIT_LIST_HEAD(&page->lru); > > This assumes certain things about list_head, why not use the correct > initialization bits. > Yes, OK, changed to: static __always_inline void ClearPageDmaPinned(struct page *page) { VM_BUG_ON(page != compound_head(page)); VM_BUG_ON_PAGE(!PageDmaPinned(page), page); /* Provide visibility to other threads: */ WRITE_ONCE(page->dma_pinned_flags, 0); /* * Safety precaution: restore the list head, before possibly returning * the page to other subsystems. */ INIT_LIST_HEAD(&page->lru); } -- thanks, John Hubbard NVIDIA
next prev parent reply other threads:[~2018-10-13 0:15 UTC|newest] Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-10-12 6:00 [PATCH 0/6] RFC: gup+dma: tracking dma-pinned pages john.hubbard 2018-10-12 6:00 ` [PATCH 1/6] mm: get_user_pages: consolidate error handling john.hubbard 2018-10-12 6:30 ` Balbir Singh 2018-10-12 22:45 ` John Hubbard 2018-10-12 22:45 ` John Hubbard 2018-10-12 6:00 ` [PATCH 2/6] mm: introduce put_user_page*(), placeholder versions john.hubbard 2018-10-12 7:35 ` Balbir Singh 2018-10-12 22:31 ` John Hubbard 2018-10-12 22:31 ` John Hubbard 2018-10-12 6:00 ` [PATCH 3/6] infiniband/mm: convert put_page() to put_user_page*() john.hubbard 2018-10-12 6:00 ` [PATCH 4/6] mm: introduce page->dma_pinned_flags, _count john.hubbard 2018-10-12 10:56 ` Balbir Singh 2018-10-13 0:15 ` John Hubbard [this message] 2018-10-13 0:15 ` John Hubbard 2018-10-24 11:00 ` Balbir Singh 2018-11-02 23:27 ` John Hubbard 2018-11-02 23:27 ` John Hubbard 2018-10-13 3:55 ` Dave Chinner 2018-10-13 7:34 ` John Hubbard 2018-10-13 7:34 ` John Hubbard 2018-10-13 16:47 ` Christoph Hellwig 2018-10-13 21:19 ` John Hubbard 2018-10-13 21:19 ` John Hubbard 2018-11-05 7:10 ` John Hubbard 2018-11-05 7:10 ` John Hubbard 2018-11-05 9:54 ` Jan Kara 2018-11-06 0:26 ` John Hubbard 2018-11-06 0:26 ` John Hubbard 2018-11-06 2:47 ` Dave Chinner 2018-11-06 11:00 ` Jan Kara 2018-11-06 20:41 ` Dave Chinner 2018-11-07 6:36 ` John Hubbard 2018-11-07 6:36 ` John Hubbard 2018-10-13 23:01 ` Dave Chinner 2018-10-16 8:51 ` Jan Kara 2018-10-17 1:48 ` John Hubbard 2018-10-17 1:48 ` John Hubbard 2018-10-17 11:09 ` Michal Hocko 2018-10-18 0:03 ` John Hubbard 2018-10-18 0:03 ` John Hubbard 2018-10-19 8:11 ` Michal Hocko 2018-10-12 6:00 ` [PATCH 5/6] mm: introduce zone_gup_lock, for dma-pinned pages john.hubbard 2018-10-12 6:00 ` [PATCH 6/6] mm: track gup pages with page->dma_pinned_* fields john.hubbard 2018-10-12 11:07 ` Balbir Singh 2018-10-13 0:33 ` John Hubbard 2018-10-13 0:33 ` John Hubbard
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=b115b2ce-8fe8-db03-da9c-452511c8ed27@nvidia.com \ --to=jhubbard@nvidia.com \ --cc=akpm@linux-foundation.org \ --cc=bsingharora@gmail.com \ --cc=cl@linux.com \ --cc=dan.j.williams@intel.com \ --cc=jack@suse.cz \ --cc=jgg@ziepe.ca \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=linux-rdma@vger.kernel.org \ --cc=mhocko@kernel.org \ --cc=willy@infradead.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.