ceph-devel.vger.kernel.org archive mirror
* [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers
@ 2020-09-25 15:01 Coly Li
  2020-09-25 15:01 ` [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Chaitanya Kulkarni, Chris Leech,
	Christoph Hellwig, Cong Wang, David S . Miller, Eric Dumazet,
	Hannes Reinecke, Ilya Dryomov, Jan Kara, Jeff Layton, Jens Axboe,
	Lee Duncan, Mike Christie, Mikhail Skorzhinskii, Philipp Reisner,
	Sagi Grimberg, Vasily Averin, Vlastimil Babka

This series originated from a bug fix in the nvme-over-tcp driver, which only
checked whether a page was allocated from the slab allocator but forgot to
check its page_count: a page handled by sendpage should be neither a
slab page nor a page with a page_count of 0.

As Sagi Grimberg suggested, the original fix is refined into a more common
inline routine:
    static inline bool sendpage_ok(struct page *page)
    {
        return !PageSlab(page) && page_count(page) >= 1;
    }
If sendpage_ok() returns true, the page being checked can be handled by the
concrete zero-copy sendpage method in the network layer.
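
As a rough illustration only, the predicate can be modelled in userspace C.
The struct mock_page type and its two fields are hypothetical stand-ins for
the kernel's struct page, PageSlab() and page_count(); this is a sketch of
the logic, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct page. */
struct mock_page {
	bool slab;     /* models PageSlab(page) */
	int refcount;  /* models page_count(page) */
};

/* Same logic as sendpage_ok(): not a slab page, and refcounted. */
static inline bool mock_sendpage_ok(const struct mock_page *page)
{
	return !page->slab && page->refcount >= 1;
}
```

In this model, only a page that is both non-slab and holds at least one
reference passes the check.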

The v8 series has 7 patches:
- The 1st patch introduces sendpage_ok() in the header file
  include/linux/net.h.
- The 2nd patch adds WARN_ONCE() for improper zero-copy send in
  kernel_sendpage().
- The 3rd patch fixes the page checking issue in the nvme-over-tcp driver.
- The 4th patch adds a page_count check by using sendpage_ok() in
  do_tcp_sendpages(), as Eric Dumazet suggested.
- The 5th, 6th and 7th patches just replace existing open coded checks
  with the inline sendpage_ok() routine.

Coly Li

Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Chris Leech <cleech@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Cong Wang <amwang@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jan Kara <jack@suse.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Vlastimil Babka <vbabka@suse.com>
---
Changelog:
v8: add WARN_ONCE() in kernel_sendpage() as Christoph suggested.
v7: remove outer brackets from the return line of sendpage_ok() as
    Eric Dumazet suggested.
v6: fix page check in do_tcp_sendpages(), as Eric Dumazet suggested.
    replace other open coded checks with sendpage_ok() in libceph,
    iscsi drivers.
v5: include linux/mm.h in include/linux/net.h.
v4: change sendpage_ok() to an inline helper, and post it as a
    separate patch, as Christoph Hellwig suggested.
v3: introduce a more common sendpage_ok() as Sagi Grimberg suggested.
v2: fix typo in patch subject.
v1: the initial version.

Coly Li (7):
  net: introduce helper sendpage_ok() in include/linux/net.h
  net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send
  nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
  tcp: use sendpage_ok() to detect misused .sendpage
  drbd: code cleanup by using sendpage_ok() to check page for
    kernel_sendpage()
  scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
  libceph: use sendpage_ok() in ceph_tcp_sendpage()

 drivers/block/drbd/drbd_main.c |  2 +-
 drivers/nvme/host/tcp.c        |  7 +++----
 drivers/scsi/libiscsi_tcp.c    |  2 +-
 include/linux/net.h            | 16 ++++++++++++++++
 net/ceph/messenger.c           |  2 +-
 net/ipv4/tcp.c                 |  3 ++-
 net/socket.c                   |  6 ++++--
 7 files changed, 28 insertions(+), 10 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
@ 2020-09-25 15:01 ` Coly Li
  2020-09-25 15:18   ` Greg KH
  2020-09-25 15:01 ` [PATCH v8 2/7] net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send Coly Li
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Chaitanya Kulkarni, Christoph Hellwig,
	Hannes Reinecke, Jan Kara, Jens Axboe, Mikhail Skorzhinskii,
	Philipp Reisner, Sagi Grimberg, Vlastimil Babka, stable

The original problem came from the nvme-over-tcp code, which mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without the
__GFP_COMP flag. The tail pages of such allocations have no refcount
(page_count is 0); sending them with kernel_sendpage() may trigger a kernel
panic from a corrupted kernel heap, because the network stack incorrectly
frees them as page_count 0 pages.

This patch introduces a helper, sendpage_ok(), which returns true if the
page being checked
- is not a slab page: PageSlab(page) is false.
- has a page refcount: page_count(page) is not zero.

All drivers that want to send a page to the remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try another, non-sendpage method (e.g.
sock_no_sendpage()) to handle the page.
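
The tail-page case described above can be sketched with a small userspace
model. Everything here is a hypothetical simplification: struct mock_page,
mock_page_count() and mock_alloc_pages() are not kernel APIs. The model
assumes that only the first page of a high order allocation holds a
reference, and that with __GFP_COMP the count of a tail page is read from
its head page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct page. */
struct mock_page {
	bool slab;               /* models PageSlab(page) */
	int refcount;            /* this page's own reference count */
	struct mock_page *head;  /* set only for tail pages of a compound page */
};

/* Models page_count(): a compound tail page reports its head's count. */
static inline int mock_page_count(const struct mock_page *page)
{
	return page->head ? page->head->refcount : page->refcount;
}

static inline bool mock_sendpage_ok(const struct mock_page *page)
{
	return !page->slab && mock_page_count(page) >= 1;
}

/*
 * Models a high order allocation of n pages. Only the first page gets a
 * reference; tail pages link back to it only when "compound" is true
 * (the __GFP_COMP case in this simplified model).
 */
static inline void mock_alloc_pages(struct mock_page *pages, size_t n,
				    bool compound)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].slab = false;
		pages[i].refcount = (i == 0) ? 1 : 0;
		pages[i].head = (compound && i > 0) ? &pages[0] : NULL;
	}
}
```

Under these assumptions, the tail pages of a non-compound allocation fail
the check and would take the copying fallback, while the first page and any
page of a compound allocation pass it.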

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
---
 include/linux/net.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..05db8690f67e 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -21,6 +21,7 @@
 #include <linux/rcupdate.h>
 #include <linux/once.h>
 #include <linux/fs.h>
+#include <linux/mm.h>
 #include <linux/sockptr.h>
 
 #include <uapi/linux/net.h>
@@ -286,6 +287,21 @@ do {									\
 #define net_get_random_once_wait(buf, nbytes)			\
 	get_random_once_wait((buf), (nbytes))
 
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+	return  !PageSlab(page) && page_count(page) >= 1;
+}
+
 int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
 		   size_t num, size_t len);
 int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v8 2/7] net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send
  2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
  2020-09-25 15:01 ` [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
@ 2020-09-25 15:01 ` Coly Li
  2020-09-25 15:01 ` [PATCH v8 3/7] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() Coly Li
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Cong Wang, Christoph Hellwig,
	David S . Miller, Sridhar Samudrala

If a page sent into kernel_sendpage() is a slab page or doesn't have a
refcount, it is improper to send it by the zero-copy sendpage() method:
such a page might be unexpectedly released in the network code path and
cause an unpredictable panic due to corruption of kernel memory management
data structures.

This patch adds a WARN_ONCE() on the page before it is passed to the
concrete zero-copy sendpage() method. If the page is improper for the
zero-copy sendpage() method, a warning message can be observed before
the consequent unpredictable kernel panic.

This patch does not change the existing kernel_sendpage() behavior for an
improper zero-copy send; it just provides a warning message as a hint for
the potential panic that follows from kernel memory heap corruption.
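
The intended behavior (warn once, then send exactly as before) can be
sketched in userspace C. The names mock_warn_once(), mock_kernel_sendpage()
and warnings_printed are hypothetical, and the warning text is only
illustrative; the real WARN_ONCE() macro also dumps a backtrace, which this
model does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's struct page. */
struct mock_page {
	bool slab;     /* models PageSlab(page) */
	int refcount;  /* models page_count(page) */
};

static inline bool mock_sendpage_ok(const struct mock_page *page)
{
	return !page->slab && page->refcount >= 1;
}

static int warnings_printed;  /* how many times the warning was emitted */

/* Models WARN_ONCE(): report a true condition only the first time. */
static inline bool mock_warn_once(bool cond, const char *msg)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Models the patched kernel_sendpage(): warn, then send as before. */
static inline int mock_kernel_sendpage(const struct mock_page *page, int size)
{
	mock_warn_once(!mock_sendpage_ok(page),
		       "improper page for zero-copy send");
	return size;  /* the send itself is not refused or altered */
}
```

Sending the same improper page twice in this model produces a single
warning, and both calls still report the full size as sent.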

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Cong Wang <amwang@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Sridhar Samudrala <sri@us.ibm.com>
---
 net/socket.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 0c0144604f81..771456a1d947 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3638,9 +3638,11 @@ EXPORT_SYMBOL(kernel_getpeername);
 int kernel_sendpage(struct socket *sock, struct page *page, int offset,
 		    size_t size, int flags)
 {
-	if (sock->ops->sendpage)
+	if (sock->ops->sendpage) {
+		/* Warn in case the improper page to zero-copy send */
+		WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send");
 		return sock->ops->sendpage(sock, page, offset, size, flags);
-
+	}
 	return sock_no_sendpage(sock, page, offset, size, flags);
 }
 EXPORT_SYMBOL(kernel_sendpage);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v8 3/7] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
  2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
  2020-09-25 15:01 ` [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
  2020-09-25 15:01 ` [PATCH v8 2/7] net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send Coly Li
@ 2020-09-25 15:01 ` Coly Li
  2020-09-25 15:01 ` [PATCH v8 4/7] tcp: use sendpage_ok() to detect misused .sendpage Coly Li
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Chaitanya Kulkarni, Christoph Hellwig,
	Hannes Reinecke, Jan Kara, Jens Axboe, Mikhail Skorzhinskii,
	Philipp Reisner, Sagi Grimberg, Vlastimil Babka, stable

Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But pages allocated by __get_free_pages() without
__GFP_COMP, whose tail pages have a refcount of 0, are still sent to the
remote end by kernel_sendpage(), which is problematic.

The newly introduced helper sendpage_ok() checks both the PageSlab flag and
the page_count counter, and returns true if the page being checked is OK to
be sent by kernel_sendpage().

This patch fixes the page checking issue of nvme_tcp_try_send_data()
with sendpage_ok(). If sendpage_ok() returns true, the page is sent by
kernel_sendpage(); otherwise sock_no_sendpage() is used to handle it.
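
The difference between the old and the new check can be sketched in
userspace C. The types and function names below (struct mock_page,
old_choice(), new_choice()) are hypothetical and only model which send path
is selected; they are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct page. */
struct mock_page {
	bool slab;     /* models PageSlab(page) */
	int refcount;  /* models page_count(page) */
};

enum send_path {
	PATH_ZERO_COPY,  /* models kernel_sendpage() */
	PATH_COPY,       /* models sock_no_sendpage() */
};

/* Check before this patch: only slab pages avoid the zero-copy path. */
static inline enum send_path old_choice(const struct mock_page *page)
{
	return page->slab ? PATH_COPY : PATH_ZERO_COPY;
}

/* Check after this patch: slab pages and refcount 0 pages both fall back. */
static inline enum send_path new_choice(const struct mock_page *page)
{
	bool ok = !page->slab && page->refcount >= 1;

	return ok ? PATH_ZERO_COPY : PATH_COPY;
}
```

The two choices differ only for a non-slab page with a refcount of 0, which
is the case the old check let through to the zero-copy path.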

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jan Kara <jack@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Vlastimil Babka <vbabka@suse.com>
Cc: stable@vger.kernel.org
---
 drivers/nvme/host/tcp.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8f4f29f18b8c..d6a3e1487354 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -913,12 +913,11 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
 		else
 			flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
 
-		/* can't zcopy slab pages */
-		if (unlikely(PageSlab(page))) {
-			ret = sock_no_sendpage(queue->sock, page, offset, len,
+		if (sendpage_ok(page)) {
+			ret = kernel_sendpage(queue->sock, page, offset, len,
 					flags);
 		} else {
-			ret = kernel_sendpage(queue->sock, page, offset, len,
+			ret = sock_no_sendpage(queue->sock, page, offset, len,
 					flags);
 		}
 		if (ret <= 0)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v8 4/7] tcp: use sendpage_ok() to detect misused .sendpage
  2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
                   ` (2 preceding siblings ...)
  2020-09-25 15:01 ` [PATCH v8 3/7] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() Coly Li
@ 2020-09-25 15:01 ` Coly Li
  2020-09-25 15:01 ` [PATCH v8 5/7] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage() Coly Li
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Eric Dumazet, Vasily Averin,
	David S . Miller, stable

commit a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab
objects") adds the check for Slab pages, but pages that don't have a
page_count are still missing from the check.

The network layer's sendpage method is not designed to send page_count 0
pages either, therefore both PageSlab() and page_count() should be
checked for the page being sent. This is exactly what sendpage_ok()
does.

This patch uses sendpage_ok() in do_tcp_sendpages() to detect misused
.sendpage, to make the code more robust.

Fixes: a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: stable@vger.kernel.org
---
 net/ipv4/tcp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 31f3b858db81..2135ee7c806d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -970,7 +970,8 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
 
 	if (IS_ENABLED(CONFIG_DEBUG_VM) &&
-	    WARN_ONCE(PageSlab(page), "page must not be a Slab one"))
+	    WARN_ONCE(!sendpage_ok(page),
+		      "page must not be a Slab one and have page_count > 0"))
 		return -EINVAL;
 
 	/* Wait for a connection to finish. One exception is TCP Fast Open
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v8 5/7] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage()
  2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
                   ` (3 preceding siblings ...)
  2020-09-25 15:01 ` [PATCH v8 4/7] tcp: use sendpage_ok() to detect misused .sendpage Coly Li
@ 2020-09-25 15:01 ` Coly Li
  2020-09-25 15:01 ` [PATCH v8 6/7] scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map() Coly Li
  2020-09-25 15:01 ` [PATCH v8 7/7] libceph: use sendpage_ok() in ceph_tcp_sendpage() Coly Li
  6 siblings, 0 replies; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Philipp Reisner, Sagi Grimberg

In _drbd_send_page() a page is checked by the following code before it is
sent by kernel_sendpage():
        (page_count(page) < 1) || PageSlab(page)
If the check is true, the page won't be sent by kernel_sendpage() and is
handled by _drbd_no_send_page() instead.

This kind of check is exactly what the inline helper sendpage_ok() does,
which is introduced into include/linux/net.h to solve a similar send page
issue in the nvme-tcp code.

This patch uses sendpage_ok() to replace the open coded checks of page
type and refcount in _drbd_send_page(), as a code cleanup.
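
That the open coded condition is the logical negation of the helper can be
checked exhaustively over a small range in userspace C. The functions below
are hypothetical and take the slab flag and page count as plain values
instead of a struct page.

```c
#include <assert.h>
#include <stdbool.h>

/* The open coded rejection test, on plain values. */
static inline bool open_coded_reject(bool slab, int count)
{
	return (count < 1) || slab;
}

/* The same logic as sendpage_ok(), on plain values. */
static inline bool helper_ok(bool slab, int count)
{
	return !slab && count >= 1;
}

/*
 * Returns true if the two forms agree for both slab states and every
 * page count from 0 to max_count inclusive.
 */
static inline bool checks_agree(int max_count)
{
	for (int slab = 0; slab <= 1; slab++)
		for (int count = 0; count <= max_count; count++)
			if (open_coded_reject(slab, count) !=
			    !helper_ok(slab, count))
				return false;
	return true;
}
```

This is just De Morgan's law applied to the helper's condition, so the
replacement is not expected to change behavior.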

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/block/drbd/drbd_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 04b6bde9419d..573dbf6f0c31 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1553,7 +1553,7 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 	 * put_page(); and would cause either a VM_BUG directly, or
 	 * __page_cache_release a page that would actually still be referenced
 	 * by someone, leading to some obscure delayed Oops somewhere else. */
-	if (drbd_disable_sendpage || (page_count(page) < 1) || PageSlab(page))
+	if (drbd_disable_sendpage || !sendpage_ok(page))
 		return _drbd_no_send_page(peer_device, page, offset, size, msg_flags);
 
 	msg_flags |= MSG_NOSIGNAL;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v8 6/7] scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
  2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
                   ` (4 preceding siblings ...)
  2020-09-25 15:01 ` [PATCH v8 5/7] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage() Coly Li
@ 2020-09-25 15:01 ` Coly Li
  2020-09-25 20:54   ` Martin K. Petersen
  2020-09-25 15:01 ` [PATCH v8 7/7] libceph: use sendpage_ok() in ceph_tcp_sendpage() Coly Li
  6 siblings, 1 reply; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Vasily Averin, Cong Wang, Mike Christie,
	Lee Duncan, Chris Leech, Christoph Hellwig, Hannes Reinecke

In the iscsi driver, iscsi_tcp_segment_map() uses the following code to
check whether or not the page should be handled by sendpage:
    if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))

The "page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg))" part is to
make sure the page can be sent to the network layer's zero copy path. This
part is exactly what sendpage_ok() does.

This patch uses sendpage_ok() in iscsi_tcp_segment_map() to replace
the original open coded checks.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Cong Wang <amwang@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/libiscsi_tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
index 37e5d4e48c2f..83f14b2c8804 100644
--- a/drivers/scsi/libiscsi_tcp.c
+++ b/drivers/scsi/libiscsi_tcp.c
@@ -128,7 +128,7 @@ static void iscsi_tcp_segment_map(struct iscsi_segment *segment, int recv)
 	 * coalescing neighboring slab objects into a single frag which
 	 * triggers one of hardened usercopy checks.
 	 */
-	if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))
+	if (!recv && sendpage_ok(sg_page(sg)))
 		return;
 
 	if (recv) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v8 7/7] libceph: use sendpage_ok() in ceph_tcp_sendpage()
  2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
                   ` (5 preceding siblings ...)
  2020-09-25 15:01 ` [PATCH v8 6/7] scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map() Coly Li
@ 2020-09-25 15:01 ` Coly Li
  2020-09-25 15:13   ` Jeff Layton
  6 siblings, 1 reply; 13+ messages in thread
From: Coly Li @ 2020-09-25 15:01 UTC (permalink / raw)
  To: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi, ceph-devel
  Cc: linux-kernel, Coly Li, Ilya Dryomov, Jeff Layton

In libceph, ceph_tcp_sendpage() does the following check before handling
the page by the network layer's zero copy sendpage method:
	if (page_count(page) >= 1 && !PageSlab(page))

This check is exactly what sendpage_ok() does. This patch replaces the
open coded check with sendpage_ok() as a code cleanup.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jeff Layton <jlayton@kernel.org>
---
 net/ceph/messenger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index bdfd66ba3843..d4d7a0e52491 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -575,7 +575,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
 	 * coalescing neighboring slab objects into a single frag which
 	 * triggers one of hardened usercopy checks.
 	 */
-	if (page_count(page) >= 1 && !PageSlab(page))
+	if (sendpage_ok(page))
 		sendpage = sock->ops->sendpage;
 	else
 		sendpage = sock_no_sendpage;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 7/7] libceph: use sendpage_ok() in ceph_tcp_sendpage()
  2020-09-25 15:01 ` [PATCH v8 7/7] libceph: use sendpage_ok() in ceph_tcp_sendpage() Coly Li
@ 2020-09-25 15:13   ` Jeff Layton
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Layton @ 2020-09-25 15:13 UTC (permalink / raw)
  To: Coly Li, linux-block, linux-nvme, netdev, open-iscsi, linux-scsi,
	ceph-devel
  Cc: linux-kernel, Ilya Dryomov

On Fri, 2020-09-25 at 23:01 +0800, Coly Li wrote:
> In libceph, ceph_tcp_sendpage() does the following checks before handle
> the page by network layer's zero copy sendpage method,
> 	if (page_count(page) >= 1 && !PageSlab(page))
> 
> This check is exactly what sendpage_ok() does. This patch replace the
> open coded checks by sendpage_ok() as a code cleanup.
> 
> Signed-off-by: Coly Li <colyli@suse.de>
> Cc: Ilya Dryomov <idryomov@gmail.com>
> Cc: Jeff Layton <jlayton@kernel.org>
> ---
>  net/ceph/messenger.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index bdfd66ba3843..d4d7a0e52491 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -575,7 +575,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
>  	 * coalescing neighboring slab objects into a single frag which
>  	 * triggers one of hardened usercopy checks.
>  	 */
> -	if (page_count(page) >= 1 && !PageSlab(page))
> +	if (sendpage_ok(page))
>  		sendpage = sock->ops->sendpage;
>  	else
>  		sendpage = sock_no_sendpage;

Looks like a reasonable change to make. Assuming that there is no
objection to the new helper:

Acked-by: Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-09-25 15:01 ` [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
@ 2020-09-25 15:18   ` Greg KH
  2020-09-26 13:28     ` Coly Li
  0 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2020-09-25 15:18 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi,
	ceph-devel, linux-kernel, Chaitanya Kulkarni, Christoph Hellwig,
	Hannes Reinecke, Jan Kara, Jens Axboe, Mikhail Skorzhinskii,
	Philipp Reisner, Sagi Grimberg, Vlastimil Babka, stable

On Fri, Sep 25, 2020 at 11:01:13PM +0800, Coly Li wrote:
> The original problem was from nvme-over-tcp code, who mistakenly uses
> kernel_sendpage() to send pages allocated by __get_free_pages() without
> __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
> tail pages, sending them by kernel_sendpage() may trigger a kernel panic
> from a corrupted kernel heap, because these pages are incorrectly freed
> in network stack as page_count 0 pages.
> 
> This patch introduces a helper sendpage_ok(), it returns true if the
> checking page,
> - is not slab page: PageSlab(page) is false.
> - has page refcount: page_count(page) is not zero
> 
> All drivers who want to send page to remote end by kernel_sendpage()
> may use this helper to check whether the page is OK. If the helper does
> not return true, the driver should try other non sendpage method (e.g.
> sock_no_sendpage()) to handle the page.
> 
> Signed-off-by: Coly Li <colyli@suse.de>
> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: Jan Kara <jack@suse.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
> Cc: Philipp Reisner <philipp.reisner@linbit.com>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Vlastimil Babka <vbabka@suse.com>
> Cc: stable@vger.kernel.org
> ---
>  include/linux/net.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/include/linux/net.h b/include/linux/net.h
> index d48ff1180879..05db8690f67e 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -21,6 +21,7 @@
>  #include <linux/rcupdate.h>
>  #include <linux/once.h>
>  #include <linux/fs.h>
> +#include <linux/mm.h>
>  #include <linux/sockptr.h>
>  
>  #include <uapi/linux/net.h>
> @@ -286,6 +287,21 @@ do {									\
>  #define net_get_random_once_wait(buf, nbytes)			\
>  	get_random_once_wait((buf), (nbytes))
>  
> +/*
> + * E.g. XFS meta- & log-data is in slab pages, or bcache meta
> + * data pages, or other high order pages allocated by
> + * __get_free_pages() without __GFP_COMP, which have a page_count
> + * of 0 and/or have PageSlab() set. We cannot use send_page for
> + * those, as that does get_page(); put_page(); and would cause
> + * either a VM_BUG directly, or __page_cache_release a page that
> + * would actually still be referenced by someone, leading to some
> + * obscure delayed Oops somewhere else.
> + */
> +static inline bool sendpage_ok(struct page *page)
> +{
> +	return  !PageSlab(page) && page_count(page) >= 1;

Do you have one extra ' ' after "return" there?

And this feels like a mm thing, why put it in net.h and not mm.h?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 6/7] scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
  2020-09-25 15:01 ` [PATCH v8 6/7] scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map() Coly Li
@ 2020-09-25 20:54   ` Martin K. Petersen
  0 siblings, 0 replies; 13+ messages in thread
From: Martin K. Petersen @ 2020-09-25 20:54 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi,
	ceph-devel, linux-kernel, Vasily Averin, Cong Wang,
	Mike Christie, Lee Duncan, Chris Leech, Christoph Hellwig,
	Hannes Reinecke


Coly,

> In iscsci driver, iscsi_tcp_segment_map() uses the following code to
> check whether the page should or not be handled by sendpage:
>     if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))
>
> The "page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)" part is to
> make sure the page can be sent to network layer's zero copy path. This
> part is exactly what sendpage_ok() does.
>
> This patch uses  use sendpage_ok() in iscsi_tcp_segment_map() to replace
> the original open coded checks.

Looks fine to me.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-09-25 15:18   ` Greg KH
@ 2020-09-26 13:28     ` Coly Li
  2020-09-27 12:21       ` Greg KH
  0 siblings, 1 reply; 13+ messages in thread
From: Coly Li @ 2020-09-26 13:28 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi,
	ceph-devel, linux-kernel, Chaitanya Kulkarni, Christoph Hellwig,
	Hannes Reinecke, Jan Kara, Jens Axboe, Mikhail Skorzhinskii,
	Philipp Reisner, Sagi Grimberg, Vlastimil Babka, stable

On 2020/9/25 23:18, Greg KH wrote:
> On Fri, Sep 25, 2020 at 11:01:13PM +0800, Coly Li wrote:
>> The original problem was from nvme-over-tcp code, who mistakenly uses
>> kernel_sendpage() to send pages allocated by __get_free_pages() without
>> __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
>> tail pages, sending them by kernel_sendpage() may trigger a kernel panic
>> from a corrupted kernel heap, because these pages are incorrectly freed
>> in network stack as page_count 0 pages.
>>
>> This patch introduces a helper sendpage_ok(), it returns true if the
>> checking page,
>> - is not slab page: PageSlab(page) is false.
>> - has page refcount: page_count(page) is not zero
>>
>> All drivers who want to send page to remote end by kernel_sendpage()
>> may use this helper to check whether the page is OK. If the helper does
>> not return true, the driver should try other non sendpage method (e.g.
>> sock_no_sendpage()) to handle the page.
>>
>> Signed-off-by: Coly Li <colyli@suse.de>
>> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Cc: Jan Kara <jack@suse.com>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
>> Cc: Philipp Reisner <philipp.reisner@linbit.com>
>> Cc: Sagi Grimberg <sagi@grimberg.me>
>> Cc: Vlastimil Babka <vbabka@suse.com>
>> Cc: stable@vger.kernel.org
>> ---
>>  include/linux/net.h | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/include/linux/net.h b/include/linux/net.h
>> index d48ff1180879..05db8690f67e 100644
>> --- a/include/linux/net.h
>> +++ b/include/linux/net.h
>> @@ -21,6 +21,7 @@
>>  #include <linux/rcupdate.h>
>>  #include <linux/once.h>
>>  #include <linux/fs.h>
>> +#include <linux/mm.h>
>>  #include <linux/sockptr.h>
>>  
>>  #include <uapi/linux/net.h>
>> @@ -286,6 +287,21 @@ do {									\
>>  #define net_get_random_once_wait(buf, nbytes)			\
>>  	get_random_once_wait((buf), (nbytes))
>>  
>> +/*
>> + * E.g. XFS meta- & log-data is in slab pages, or bcache meta
>> + * data pages, or other high order pages allocated by
>> + * __get_free_pages() without __GFP_COMP, which have a page_count
>> + * of 0 and/or have PageSlab() set. We cannot use send_page for
>> + * those, as that does get_page(); put_page(); and would cause
>> + * either a VM_BUG directly, or __page_cache_release a page that
>> + * would actually still be referenced by someone, leading to some
>> + * obscure delayed Oops somewhere else.
>> + */
>> +static inline bool sendpage_ok(struct page *page)
>> +{
>> +	return  !PageSlab(page) && page_count(page) >= 1;
> 
> Do you have one extra ' ' after "return" there?

It will be fixed in the next version.

> 
> And this feels like a mm thing, why put it in net.h and not mm.h?

This check is specific for kernel_sendpage(), so I want to place it
closer to where kernel_sendpage() is declared.

And indeed there was a similar discussion in the v5 series about why this
helper is not in mm code. Christoph supported placing sendpage_ok() in
net.h; part of his opinion was "It is not a mm bug, it
is a networking quirk."

Thanks.

Coly Li

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h
  2020-09-26 13:28     ` Coly Li
@ 2020-09-27 12:21       ` Greg KH
  0 siblings, 0 replies; 13+ messages in thread
From: Greg KH @ 2020-09-27 12:21 UTC (permalink / raw)
  To: Coly Li
  Cc: linux-block, linux-nvme, netdev, open-iscsi, linux-scsi,
	ceph-devel, linux-kernel, Chaitanya Kulkarni, Christoph Hellwig,
	Hannes Reinecke, Jan Kara, Jens Axboe, Mikhail Skorzhinskii,
	Philipp Reisner, Sagi Grimberg, Vlastimil Babka, stable

On Sat, Sep 26, 2020 at 09:28:03PM +0800, Coly Li wrote:
> On 2020/9/25 23:18, Greg KH wrote:
> > On Fri, Sep 25, 2020 at 11:01:13PM +0800, Coly Li wrote:
> >> The original problem came from the nvme-over-tcp code, which
> >> mistakenly uses kernel_sendpage() to send pages allocated by
> >> __get_free_pages() without the __GFP_COMP flag. Such pages have no
> >> refcount (page_count is 0) on their tail pages, so sending them with
> >> kernel_sendpage() may trigger a kernel panic from a corrupted kernel
> >> heap, because the network stack incorrectly frees these pages as
> >> page_count 0 pages.
> >>
> >> This patch introduces a helper, sendpage_ok(), which returns true if
> >> the page being checked
> >> - is not a slab page: PageSlab(page) is false.
> >> - has a page refcount: page_count(page) is not zero.
> >>
> >> Any driver that wants to send a page to the remote end with
> >> kernel_sendpage() may use this helper to check whether the page is OK.
> >> If the helper does not return true, the driver should fall back to a
> >> non-sendpage method (e.g. sock_no_sendpage()) to handle the page.
> >>
> >> Signed-off-by: Coly Li <colyli@suse.de>
> >> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> >> Cc: Christoph Hellwig <hch@lst.de>
> >> Cc: Hannes Reinecke <hare@suse.de>
> >> Cc: Jan Kara <jack@suse.com>
> >> Cc: Jens Axboe <axboe@kernel.dk>
> >> Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com>
> >> Cc: Philipp Reisner <philipp.reisner@linbit.com>
> >> Cc: Sagi Grimberg <sagi@grimberg.me>
> >> Cc: Vlastimil Babka <vbabka@suse.com>
> >> Cc: stable@vger.kernel.org
> >> ---
> >>  include/linux/net.h | 16 ++++++++++++++++
> >>  1 file changed, 16 insertions(+)
> >>
> >> diff --git a/include/linux/net.h b/include/linux/net.h
> >> index d48ff1180879..05db8690f67e 100644
> >> --- a/include/linux/net.h
> >> +++ b/include/linux/net.h
> >> @@ -21,6 +21,7 @@
> >>  #include <linux/rcupdate.h>
> >>  #include <linux/once.h>
> >>  #include <linux/fs.h>
> >> +#include <linux/mm.h>
> >>  #include <linux/sockptr.h>
> >>  
> >>  #include <uapi/linux/net.h>
> >> @@ -286,6 +287,21 @@ do {									\
> >>  #define net_get_random_once_wait(buf, nbytes)			\
> >>  	get_random_once_wait((buf), (nbytes))
> >>  
> >> +/*
> >> + * E.g. XFS meta- & log-data is in slab pages, or bcache meta
> >> + * data pages, or other high order pages allocated by
> >> + * __get_free_pages() without __GFP_COMP, which have a page_count
> >> + * of 0 and/or have PageSlab() set. We cannot use send_page for
> >> + * those, as that does get_page(); put_page(); and would cause
> >> + * either a VM_BUG directly, or __page_cache_release a page that
> >> + * would actually still be referenced by someone, leading to some
> >> + * obscure delayed Oops somewhere else.
> >> + */
> >> +static inline bool sendpage_ok(struct page *page)
> >> +{
> >> +	return  !PageSlab(page) && page_count(page) >= 1;
> > 
> > Do you have one extra ' ' after "return" there?
> 
> > It will be fixed in the next version.
> 
> > 
> > And this feels like a mm thing, why put it in net.h and not mm.h?
> 
> > This check is specific to kernel_sendpage(), so I want to place it
> > close to where kernel_sendpage() is declared.
> > 
> > There was indeed a similar discussion in the v5 series about why this
> > helper is not in mm code. Christoph supported placing sendpage_ok() in
> > net.h; part of his opinion was "It is not a mm bug, it is a networking
> > quirk."

Ah, nevermind then, sorry for the noise :)

greg k-h


Thread overview: 13+ messages
2020-09-25 15:01 [PATCH v8 0/7] Introduce sendpage_ok() to detect misused sendpage in network related drivers Coly Li
2020-09-25 15:01 ` [PATCH v8 1/7] net: introduce helper sendpage_ok() in include/linux/net.h Coly Li
2020-09-25 15:18   ` Greg KH
2020-09-26 13:28     ` Coly Li
2020-09-27 12:21       ` Greg KH
2020-09-25 15:01 ` [PATCH v8 2/7] net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send Coly Li
2020-09-25 15:01 ` [PATCH v8 3/7] nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage() Coly Li
2020-09-25 15:01 ` [PATCH v8 4/7] tcp: use sendpage_ok() to detect misused .sendpage Coly Li
2020-09-25 15:01 ` [PATCH v8 5/7] drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage() Coly Li
2020-09-25 15:01 ` [PATCH v8 6/7] scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map() Coly Li
2020-09-25 20:54   ` Martin K. Petersen
2020-09-25 15:01 ` [PATCH v8 7/7] libceph: use sendpage_ok() in ceph_tcp_sendpage() Coly Li
2020-09-25 15:13   ` Jeff Layton
