linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
@ 2020-08-16  7:15 Coly Li
  2020-08-16  7:15 ` [PATCH v5 2/3] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() Coly Li
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Coly Li @ 2020-08-16  7:15 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: netdev, stable, linux-kernel, Coly Li, Chaitanya Kulkarni,
	Christoph Hellwig, Hannes Reinecke, Jan Kara, Jens Axboe,
	Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka

The original problem was from nvme-over-tcp code, who mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
__GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
tail pages, sending them by kernel_sendpage() may trigger a kernel panic
from a corrupted kernel heap, because these pages are incorrectly freed
in network stack as page_count 0 pages.

This patch introduces a helper sendpage_ok(), it returns true if the
checking page,
- is not slab page: PageSlab(page) is false.
- has page refcount: page_count(page) is not zero

All drivers who want to send page to remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try other non sendpage method (e.g.
sock_no_sendpage()) to handle the page.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
---
Changelog:
v5, include linux/mm.h in include/linux/net.h
v4, change sendpage_ok() as an inline helper, and post it as
    separate patch.
v3, introduce a more common sendpage_ok()
v2, fix typo in patch subject
v1, the initial version.
 include/linux/net.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..a807fad31958 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -21,6 +21,7 @@
 #include <linux/rcupdate.h>
 #include <linux/once.h>
 #include <linux/fs.h>
+#include <linux/mm.h>
 #include <linux/sockptr.h>
 
 #include <uapi/linux/net.h>
@@ -286,6 +287,21 @@ do {									\
 #define net_get_random_once_wait(buf, nbytes)			\
 	get_random_once_wait((buf), (nbytes))
 
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+	return  (!PageSlab(page) && page_count(page) >= 1);
+}
+
 int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
 		   size_t num, size_t len);
 int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 2/3] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
  2020-08-16  7:15 [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
@ 2020-08-16  7:15 ` Coly Li
  2020-08-16  7:15 ` [PATCH v5 3/3] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage() Coly Li
  2020-08-16 17:55 ` [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Cong Wang
  2 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2020-08-16  7:15 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: netdev, stable, linux-kernel, Coly Li, Chaitanya Kulkarni,
	Christoph Hellwig, Hannes Reinecke, Jan Kara, Jens Axboe,
	Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka

Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But for pages allocated by __get_free_pages() without
__GFP_COMP, which also have refcount as 0, they are still sent by
kernel_sendpage() to remote end, this is problematic.

The new introduced helper sendpage_ok() checks both PageSlab tag and
page_count counter, and returns true if the checking page is OK to be
sent by kernel_sendpage().

This patch fixes the page checking issue of nvme_tcp_try_send_data()
with sendpage_ok(). If sendpage_ok() returns true, send this page by
kernel_sendpage(), otherwise use sock_no_sendpage to handle this page.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
---
Changelog:
v4, change sendpage_ok() as an inline helper, and post it as
    separate patch.
v3, introduce a more common sendpage_ok()
v2, fix typo in patch subject
v1, the initial version.
 drivers/nvme/host/tcp.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 62fbaecdc960..902fe742762b 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -912,12 +912,11 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
 		else
 			flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
 
-		/* can't zcopy slab pages */
-		if (unlikely(PageSlab(page))) {
-			ret = sock_no_sendpage(queue->sock, page, offset, len,
+		if (sendpage_ok(page)) {
+			ret = kernel_sendpage(queue->sock, page, offset, len,
 					flags);
 		} else {
-			ret = kernel_sendpage(queue->sock, page, offset, len,
+			ret = sock_no_sendpage(queue->sock, page, offset, len,
 					flags);
 		}
 		if (ret <= 0)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v5 3/3] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage()
  2020-08-16  7:15 [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
  2020-08-16  7:15 ` [PATCH v5 2/3] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() Coly Li
@ 2020-08-16  7:15 ` Coly Li
  2020-08-16 17:55 ` [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Cong Wang
  2 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2020-08-16  7:15 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: netdev, stable, linux-kernel, Coly Li, Philipp Reisner, Sagi Grimberg

In _drbd_send_page() a page is checked by following code before sending
it by kernel_sendpage(),
        (page_count(page) < 1) || PageSlab(page)
If the check is true, this page won't be send by kernel_sendpage() and
handled by sock_no_sendpage().

This kind of check is exactly what macro sendpage_ok() does, which is
introduced into include/linux/net.h to solve a similar send page issue
in nvme-tcp code.

This patch uses macro sendpage_ok() to replace the open coded checks to
page type and refcount in _drbd_send_page(), as a code cleanup.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
---
Changelog:
v3, introduce a more common sendpage_ok()
v2, fix typo in patch subject
v1, the initial version.
 drivers/block/drbd/drbd_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index cb687ccdbd96..55dc0c91781e 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1553,7 +1553,7 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 	 * put_page(); and would cause either a VM_BUG directly, or
 	 * __page_cache_release a page that would actually still be referenced
 	 * by someone, leading to some obscure delayed Oops somewhere else. */
-	if (drbd_disable_sendpage || (page_count(page) < 1) || PageSlab(page))
+	if (drbd_disable_sendpage || !sendpage_ok(page))
 		return _drbd_no_send_page(peer_device, page, offset, size, msg_flags);
 
 	msg_flags |= MSG_NOSIGNAL;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-08-16  7:15 [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
  2020-08-16  7:15 ` [PATCH v5 2/3] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() Coly Li
  2020-08-16  7:15 ` [PATCH v5 3/3] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage() Coly Li
@ 2020-08-16 17:55 ` Cong Wang
  2020-08-17  5:45   ` Christoph Hellwig
  2 siblings, 1 reply; 10+ messages in thread
From: Cong Wang @ 2020-08-16 17:55 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-block, linux-nvme, Linux Kernel Network Developers, stable,
	LKML, Chaitanya Kulkarni, Christoph Hellwig, Hannes Reinecke,
	Jan Kara, Jens Axboe, Mikhail Skorzhinskii, Philipp Reisner,
	Sagi Grimberg, Vlastimil Babka

On Sun, Aug 16, 2020 at 1:36 AM Coly Li <colyli@suse.de> wrote:
>
> The original problem was from nvme-over-tcp code, who mistakenly uses
> kernel_sendpage() to send pages allocated by __get_free_pages() without
> __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
> tail pages, sending them by kernel_sendpage() may trigger a kernel panic
> from a corrupted kernel heap, because these pages are incorrectly freed
> in network stack as page_count 0 pages.
>
> This patch introduces a helper sendpage_ok(), it returns true if the
> checking page,
> - is not slab page: PageSlab(page) is false.
> - has page refcount: page_count(page) is not zero
>
> All drivers who want to send page to remote end by kernel_sendpage()
> may use this helper to check whether the page is OK. If the helper does
> not return true, the driver should try other non sendpage method (e.g.
> sock_no_sendpage()) to handle the page.

Can we leave this helper to mm subsystem?

I know it is for sendpage, but its implementation is all about some
mm details and its two callers do not belong to net subsystem either.

Think this in another way: who would fix it if it is buggy? I bet mm people
should. ;)

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-08-16 17:55 ` [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Cong Wang
@ 2020-08-17  5:45   ` Christoph Hellwig
  2020-08-17 19:12     ` Cong Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2020-08-17  5:45 UTC (permalink / raw)
  To: Cong Wang
  Cc: Coly Li, linux-block, linux-nvme,
	Linux Kernel Network Developers, stable, LKML,
	Chaitanya Kulkarni, Christoph Hellwig, Hannes Reinecke, Jan Kara,
	Jens Axboe, Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka

On Sun, Aug 16, 2020 at 10:55:09AM -0700, Cong Wang wrote:
> On Sun, Aug 16, 2020 at 1:36 AM Coly Li <colyli@suse.de> wrote:
> >
> > The original problem was from nvme-over-tcp code, who mistakenly uses
> > kernel_sendpage() to send pages allocated by __get_free_pages() without
> > __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
> > tail pages, sending them by kernel_sendpage() may trigger a kernel panic
> > from a corrupted kernel heap, because these pages are incorrectly freed
> > in network stack as page_count 0 pages.
> >
> > This patch introduces a helper sendpage_ok(), it returns true if the
> > checking page,
> > - is not slab page: PageSlab(page) is false.
> > - has page refcount: page_count(page) is not zero
> >
> > All drivers who want to send page to remote end by kernel_sendpage()
> > may use this helper to check whether the page is OK. If the helper does
> > not return true, the driver should try other non sendpage method (e.g.
> > sock_no_sendpage()) to handle the page.
> 
> Can we leave this helper to mm subsystem?
> 
> I know it is for sendpage, but its implementation is all about some
> mm details and its two callers do not belong to net subsystem either.
> 
> Think this in another way: who would fix it if it is buggy? I bet mm people
> should. ;)

No.  This is all about a really unusual imitation in sendpage, which
is pretty much unexpected.  In fact the best thing would be to make
sock_sendpage do the right thing and call sock_no_sendpage based
on this condition, so that driver writers don't have to worry at all.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-08-17  5:45   ` Christoph Hellwig
@ 2020-08-17 19:12     ` Cong Wang
  2020-08-18  7:00       ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Cong Wang @ 2020-08-17 19:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Coly Li, linux-block, linux-nvme,
	Linux Kernel Network Developers, stable, LKML,
	Chaitanya Kulkarni, Hannes Reinecke, Jan Kara, Jens Axboe,
	Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka

On Sun, Aug 16, 2020 at 10:45 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Sun, Aug 16, 2020 at 10:55:09AM -0700, Cong Wang wrote:
> > On Sun, Aug 16, 2020 at 1:36 AM Coly Li <colyli@suse.de> wrote:
> > >
> > > The original problem was from nvme-over-tcp code, who mistakenly uses
> > > kernel_sendpage() to send pages allocated by __get_free_pages() without
> > > __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
> > > tail pages, sending them by kernel_sendpage() may trigger a kernel panic
> > > from a corrupted kernel heap, because these pages are incorrectly freed
> > > in network stack as page_count 0 pages.
> > >
> > > This patch introduces a helper sendpage_ok(), it returns true if the
> > > checking page,
> > > - is not slab page: PageSlab(page) is false.
> > > - has page refcount: page_count(page) is not zero
> > >
> > > All drivers who want to send page to remote end by kernel_sendpage()
> > > may use this helper to check whether the page is OK. If the helper does
> > > not return true, the driver should try other non sendpage method (e.g.
> > > sock_no_sendpage()) to handle the page.
> >
> > Can we leave this helper to mm subsystem?
> >
> > I know it is for sendpage, but its implementation is all about some
> > mm details and its two callers do not belong to net subsystem either.
> >
> > Think this in another way: who would fix it if it is buggy? I bet mm people
> > should. ;)
>
> No.  This is all about a really unusual imitation in sendpage, which

So netdev people will have to understand and support PageSlab() or
page_count()?

If it is unusual even for mm people, how could netdev people suppose
to understand this unusual mm bug? At least not any better.

> is pretty much unexpected.  In fact the best thing would be to make
> sock_sendpage do the right thing and call sock_no_sendpage based
> on this condition, so that driver writers don't have to worry at all.

Agreed, but kernel_sendpage() still relies on mm to provide a helper
to make the decision and ensure this helper is always up-to-date.

In short, it is all about ownership.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-08-17 19:12     ` Cong Wang
@ 2020-08-18  7:00       ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-08-18  7:00 UTC (permalink / raw)
  To: Cong Wang
  Cc: Christoph Hellwig, Coly Li, linux-block, linux-nvme,
	Linux Kernel Network Developers, stable, LKML,
	Chaitanya Kulkarni, Hannes Reinecke, Jan Kara, Jens Axboe,
	Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka

On Mon, Aug 17, 2020 at 12:12:12PM -0700, Cong Wang wrote:
> 
> So netdev people will have to understand and support PageSlab() or
> page_count()?

Yes.  As they came up with that contrived rule what is acceptable
for sendpage.  No one else really knows and other subsystems like the
block layer are perfectly fine with it.

> 
> If it is unusual even for mm people, how could netdev people suppose
> to understand this unusual mm bug? At least not any better.

It is not a mm bug, it is a networking quirk.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-08-18  8:08 ` Eric Dumazet
@ 2020-08-18  8:21   ` Coly Li
  0 siblings, 0 replies; 10+ messages in thread
From: Coly Li @ 2020-08-18  8:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linux-block, linux-nvme, netdev, stable, linux-kernel,
	Chaitanya Kulkarni, Christoph Hellwig, Hannes Reinecke, Jan Kara,
	Jens Axboe, Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka

On 2020/8/18 16:08, Eric Dumazet wrote:
> 
> 
> On 8/16/20 12:08 AM, Coly Li wrote:
>> The original problem was from nvme-over-tcp code, who mistakenly uses
>> kernel_sendpage() to send pages allocated by __get_free_pages() without
>> __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
>> tail pages, sending them by kernel_sendpage() may trigger a kernel panic
>> from a corrupted kernel heap, because these pages are incorrectly freed
>> in network stack as page_count 0 pages.
>>
>> This patch introduces a helper sendpage_ok(), it returns true if the
>> checking page,
>> - is not slab page: PageSlab(page) is false.
>> - has page refcount: page_count(page) is not zero
>>
>> All drivers who want to send page to remote end by kernel_sendpage()
>> may use this helper to check whether the page is OK. If the helper does
>> not return true, the driver should try other non sendpage method (e.g.
>> sock_no_sendpage()) to handle the page.
>>
>>
>>
>> diff --git a/include/linux/net.h b/include/linux/net.h
>> index d48ff1180879..a807fad31958 100644
>> --- a/include/linux/net.h
>> +++ b/include/linux/net.h
>> @@ -21,6 +21,7 @@
>>  #include <linux/rcupdate.h>
>>  #include <linux/once.h>
>>  #include <linux/fs.h>
>> +#include <linux/mm.h>
>>  #include <linux/sockptr.h>
>>  
>>  #include <uapi/linux/net.h>
>> @@ -286,6 +287,21 @@ do {									\
>>  #define net_get_random_once_wait(buf, nbytes)			\
>>  	get_random_once_wait((buf), (nbytes))
>>  
>> +/*
>> + * E.g. XFS meta- & log-data is in slab pages, or bcache meta
>> + * data pages, or other high order pages allocated by
>> + * __get_free_pages() without __GFP_COMP, which have a page_count
>> + * of 0 and/or have PageSlab() set. We cannot use send_page for
>> + * those, as that does get_page(); put_page(); and would cause
>> + * either a VM_BUG directly, or __page_cache_release a page that
>> + * would actually still be referenced by someone, leading to some
>> + * obscure delayed Oops somewhere else.
>> + */
>> +static inline bool sendpage_ok(struct page *page)
>> +{
>> +	return  (!PageSlab(page) && page_count(page) >= 1);
>> +}
>>

Hi Eric,

> 
> return (A);
> 
> Can simply be written :
> 
> return A;
> 
> In this case :
> 
> return !PageSlab(page) && page_count(page) >= 1;

Sure, I update it in v6 series.


> 
> BTW, do you have plans to refine code added with commit a10674bf2406afc2554f9c7d31b2dc65d6a27fd9
> ("tcp: detecting the misuse of .sendpage for Slab objects")
> 

Thanks for the hint, I will remove this piece in v6 series.

Coly Li

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-08-16  7:08 Coly Li
@ 2020-08-18  8:08 ` Eric Dumazet
  2020-08-18  8:21   ` Coly Li
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2020-08-18  8:08 UTC (permalink / raw)
  To: Coly Li, linux-block, linux-nvme
  Cc: netdev, stable, linux-kernel, Chaitanya Kulkarni,
	Christoph Hellwig, Hannes Reinecke, Jan Kara, Jens Axboe,
	Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka



On 8/16/20 12:08 AM, Coly Li wrote:
> The original problem was from nvme-over-tcp code, who mistakenly uses
> kernel_sendpage() to send pages allocated by __get_free_pages() without
> __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
> tail pages, sending them by kernel_sendpage() may trigger a kernel panic
> from a corrupted kernel heap, because these pages are incorrectly freed
> in network stack as page_count 0 pages.
> 
> This patch introduces a helper sendpage_ok(), it returns true if the
> checking page,
> - is not slab page: PageSlab(page) is false.
> - has page refcount: page_count(page) is not zero
> 
> All drivers who want to send page to remote end by kernel_sendpage()
> may use this helper to check whether the page is OK. If the helper does
> not return true, the driver should try other non sendpage method (e.g.
> sock_no_sendpage()) to handle the page.
> 
>
> 
> diff --git a/include/linux/net.h b/include/linux/net.h
> index d48ff1180879..a807fad31958 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -21,6 +21,7 @@
>  #include <linux/rcupdate.h>
>  #include <linux/once.h>
>  #include <linux/fs.h>
> +#include <linux/mm.h>
>  #include <linux/sockptr.h>
>  
>  #include <uapi/linux/net.h>
> @@ -286,6 +287,21 @@ do {									\
>  #define net_get_random_once_wait(buf, nbytes)			\
>  	get_random_once_wait((buf), (nbytes))
>  
> +/*
> + * E.g. XFS meta- & log-data is in slab pages, or bcache meta
> + * data pages, or other high order pages allocated by
> + * __get_free_pages() without __GFP_COMP, which have a page_count
> + * of 0 and/or have PageSlab() set. We cannot use send_page for
> + * those, as that does get_page(); put_page(); and would cause
> + * either a VM_BUG directly, or __page_cache_release a page that
> + * would actually still be referenced by someone, leading to some
> + * obscure delayed Oops somewhere else.
> + */
> +static inline bool sendpage_ok(struct page *page)
> +{
> +	return  (!PageSlab(page) && page_count(page) >= 1);
> +}
>

return (A);

Can simply be written :

return A;

In this case :

return !PageSlab(page) && page_count(page) >= 1;

BTW, do you have plans to refine code added with commit a10674bf2406afc2554f9c7d31b2dc65d6a27fd9
("tcp: detecting the misuse of .sendpage for Slab objects")

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h
@ 2020-08-16  7:08 Coly Li
  2020-08-18  8:08 ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Coly Li @ 2020-08-16  7:08 UTC (permalink / raw)
  To: linux-block, linux-nvme
  Cc: netdev, stable, linux-kernel, Coly Li, Chaitanya Kulkarni,
	Christoph Hellwig, Hannes Reinecke, Jan Kara, Jens Axboe,
	Mikhail Skorzhinskii, Philipp Reisner, Sagi Grimberg,
	Vlastimil Babka

The original problem was from nvme-over-tcp code, who mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
__GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
tail pages, sending them by kernel_sendpage() may trigger a kernel panic
from a corrupted kernel heap, because these pages are incorrectly freed
in network stack as page_count 0 pages.

This patch introduces a helper sendpage_ok(), it returns true if the
checking page,
- is not slab page: PageSlab(page) is false.
- has page refcount: page_count(page) is not zero

All drivers who want to send page to remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try other non sendpage method (e.g.
sock_no_sendpage()) to handle the page.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
---
v5, include linux/mm.h in include/linux/net.h
v4, change sendpage_ok() as an inline helper, and post it as
    separate patch.
v3, introduce a more common sendpage_ok()
v2, fix typo in patch subject
v1, the initial version.
 include/linux/net.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..a807fad31958 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -21,6 +21,7 @@
 #include <linux/rcupdate.h>
 #include <linux/once.h>
 #include <linux/fs.h>
+#include <linux/mm.h>
 #include <linux/sockptr.h>
 
 #include <uapi/linux/net.h>
@@ -286,6 +287,21 @@ do {									\
 #define net_get_random_once_wait(buf, nbytes)			\
 	get_random_once_wait((buf), (nbytes))
 
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+	return  (!PageSlab(page) && page_count(page) >= 1);
+}
+
 int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
 		   size_t num, size_t len);
 int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-08-18  8:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-16  7:15 [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
2020-08-16  7:15 ` [PATCH v5 2/3] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() Coly Li
2020-08-16  7:15 ` [PATCH v5 3/3] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage() Coly Li
2020-08-16 17:55 ` [PATCH v5 1/3] net: introduce helper sendpage_ok() in include/linux/net.h Cong Wang
2020-08-17  5:45   ` Christoph Hellwig
2020-08-17 19:12     ` Cong Wang
2020-08-18  7:00       ` Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2020-08-16  7:08 Coly Li
2020-08-18  8:08 ` Eric Dumazet
2020-08-18  8:21   ` Coly Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).