From: "Mika Penttilä" <mpenttil@redhat.com>
To: David Howells <dhowells@redhat.com>, Yunsheng Lin <linyunsheng@huawei.com>, Matthew Wilcox <willy@infradead.org>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Willem de Bruijn <willemdebruijn.kernel@gmail.com>, David Ahern <dsahern@kernel.org>, Jens Axboe <axboe@kernel.dk>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jeroen de Borst <jeroendb@google.com>, Catherine Sullivan <csully@google.com>, Shailend Chand <shailend@google.com>, Felix Fietkau <nbd@nbd.name>, John Crispin <john@phrozen.org>, Sean Wang <sean.wang@mediatek.com>, Mark Lee <Mark-MC.Lee@mediatek.com>, Lorenzo Bianconi <lorenzo@kernel.org>, Matthias Brugger <matthias.bgg@gmail.com>, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>, Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>, Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>, Chaitanya Kulkarni <kch@nvidia.com>, Andrew Morton <akpm@linux-foundation.org>, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, linux-nvme@lists.infradead.org
Subject: Re: [PATCH net-next 04/12] mm: Make the page_frag_cache allocator use multipage folios
Date: Fri, 26 May 2023 17:06:55 +0300
Message-ID: <5dd62fee-56bf-0b54-2e91-c31068a2b040@redhat.com>
In-Reply-To: <739166.1685105220@warthog.procyon.org.uk>

Hi,

On 26.5.2023 15.47, David Howells wrote:
> Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
>>> Change the page_frag_cache allocator to use multipage folios rather than
>>> groups of pages. This reduces page_frag_free to just a folio_put() or
>>> put_page().
>>
>> put_page() is not used in this patch, perhaps remove it to avoid
>> the confusion?
>
> Will do if I need to respin the patches.
>
>> Also, is there any significant difference between __free_pages()
>> and folio_put()? IOW, what does the 'reduces' part mean here?
>
> I meant that the folio code handles page compounding for us and we don't need
> to work out how big the page is for ourselves.
>
> If you look at __free_pages(), you can see a PageHead() call. folio_put()
> doesn't need that.
>
>> I followed some discussion about folio before, but have not really
>> understood the real difference between 'multipage folios' and
>> 'groups of pages' yet. Is folio mostly used to avoid the confusion
>> about whether a page is 'headpage of compound page', 'base page' or
>> 'tailpage of compound page'? Or is there any obvious benefit of
>> folio that I missed?
>
> There is a benefit: a folio pointer always points to the head page and so we
> never need to do "is this compound? where's the head?" logic to find it. When
> going from a page pointer, we still have to find the head.
>

But page_frag_free() uses folio_put(virt_to_folio(addr)), and virt_to_folio()
depends on the compound infrastructure to get the head page and folio.

> Ultimately, the aim is to reduce struct page to a typed pointer to massively
> reduce the amount of space consumed by mem_map[]. A page struct will then
> point at a folio or a slab struct or one of a number of different types. But
> to get to that point, we have to stop a whole lot of things from using page
> structs, but rather use some other type, such as folio.
>
> Eventually, there won't be a need for head pages and tail pages per se - just
> memory objects of different sizes.
>
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>> index 306a3d1a0fa6..d7c52a5979cc 100644
>>> --- a/include/linux/mm_types.h
>>> +++ b/include/linux/mm_types.h
>>> @@ -420,18 +420,13 @@ static inline void *folio_get_private(struct folio *folio)
>>>  }
>>>
>>>  struct page_frag_cache {
>>> -        void * va;
>>> -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>>> -        __u16 offset;
>>> -        __u16 size;
>>> -#else
>>> -        __u32 offset;
>>> -#endif
>>> +        struct folio *folio;
>>> +        unsigned int offset;
>>>          /* we maintain a pagecount bias, so that we dont dirty cache line
>>>           * containing page->_refcount every time we allocate a fragment.
>>>           */
>>> -        unsigned int pagecnt_bias;
>>> -        bool pfmemalloc;
>>> +        unsigned int pagecnt_bias;
>>> +        bool pfmemalloc;
>>>  };
>>
>> It seems the 'va' and 'size' fields are used to avoid touching 'struct page',
>> to avoid possible cache bouncing when more frags can be allocated
>> from the page while other frags are freed at the same time, before this patch?
>
> Hmmm... fair point, though va is calculated from the page pointer on most
> arches without the need to dereference struct page (only arc, m68k and sparc
> define WANT_PAGE_VIRTUAL).
>
> David
>

--Mika
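
The remark above, that virt_to_folio() still depends on the compound-page infrastructure, can be illustrated with a small user-space model. This is only a sketch and not kernel code: every toy_* name, the 4 KiB page size and the flat toy_mem_map[] array are assumptions made up for the example. The one idea taken from the kernel is that a tail page records its head page with bit 0 set, so going from an address to the owning head still needs a head lookup on the free path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for the model */
#define TOY_NR_PAGES  8u      /* size of the toy "physical memory" */

/* Hypothetical stand-in for struct page. A tail page stores a pointer to
 * its head page with bit 0 set; head and base pages store 0 here. */
struct toy_page {
	uintptr_t compound_head;
	int refcount;             /* only meaningful on a head or base page */
};

static struct toy_page toy_mem_map[TOY_NR_PAGES];
static unsigned char toy_memory[TOY_NR_PAGES * TOY_PAGE_SIZE];

/* Mark pages [first, first + nr) as one compound allocation. */
static void toy_make_compound(unsigned int first, unsigned int nr)
{
	toy_mem_map[first].compound_head = 0;
	toy_mem_map[first].refcount = 1;
	for (unsigned int i = 1; i < nr; i++)
		toy_mem_map[first + i].compound_head =
			(uintptr_t)&toy_mem_map[first] | 1u;
}

/* Address -> page: pure arithmetic, no dereference of the page itself. */
static struct toy_page *toy_virt_to_page(const void *addr)
{
	size_t off = (size_t)((const unsigned char *)addr - toy_memory);

	return &toy_mem_map[off / TOY_PAGE_SIZE];
}

/* Page -> head: this reads the page, which is the step that an
 * address-based free cannot avoid. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1u)
		return (struct toy_page *)(head - 1u);
	return page;
}

/* Address -> "folio" (modelled here as the head page). */
static struct toy_page *toy_virt_to_folio(const void *addr)
{
	return toy_compound_head(toy_virt_to_page(addr));
}

/* Model of freeing a fragment by address: find the head, drop one
 * reference, and return the remaining count. */
static int toy_frag_free(const void *addr)
{
	struct toy_page *head = toy_virt_to_folio(addr);

	return --head->refcount;
}
```

In this model, after toy_make_compound(0, 4), an address inside page 2 maps to &toy_mem_map[2] via toy_virt_to_page() but to &toy_mem_map[0] via toy_virt_to_folio(). A caller that already holds the head pointer can skip the lookup, while a caller that only has an address cannot.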