* [PATCH 0/8] xen: harden frontends against malicious backends
@ 2021-05-13 10:02 Juergen Gross
  2021-05-13 10:02 ` [PATCH 2/8] xen/blkfront: read response from backend only once Juergen Gross
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Juergen Gross @ 2021-05-13 10:02 UTC (permalink / raw)
  To: xen-devel, linux-kernel, linux-block, netdev, linuxppc-dev
  Cc: Juergen Gross, Boris Ostrovsky, Stefano Stabellini,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Jens Axboe, David S. Miller, Jakub Kicinski, Greg Kroah-Hartman,
	Jiri Slaby

Xen backends of para-virtualized devices can live in the dom0 kernel, in
dom0 user land, or in a driver domain. A backend might therefore reside
in a less trusted environment than the Xen core components, so a backend
should not be able to harm a Xen guest (it can still corrupt I/O data,
but it should not be able to, e.g., crash a guest by other means or
cause a privilege escalation in the guest).

Unfortunately, many frontends in the Linux kernel fully trust their
respective backends. This series starts fixing the most important
frontends: console, disk and network.

Handling this as a security issue was considered, but the topic has
already been discussed in public, so it is not really a secret.

Juergen Gross (8):
  xen: sync include/xen/interface/io/ring.h with Xen's newest version
  xen/blkfront: read response from backend only once
  xen/blkfront: don't take local copy of a request from the ring page
  xen/blkfront: don't trust the backend response data blindly
  xen/netfront: read response from backend only once
  xen/netfront: don't read data from request on the ring page
  xen/netfront: don't trust the backend response data blindly
  xen/hvc: replace BUG_ON() with negative return value

 drivers/block/xen-blkfront.c    | 118 +++++++++-----
 drivers/net/xen-netfront.c      | 184 ++++++++++++++-------
 drivers/tty/hvc/hvc_xen.c       |  15 +-
 include/xen/interface/io/ring.h | 278 ++++++++++++++++++--------------
 4 files changed, 369 insertions(+), 226 deletions(-)

-- 
2.26.2



* [PATCH 2/8] xen/blkfront: read response from backend only once
  2021-05-13 10:02 [PATCH 0/8] xen: harden frontends against malicious backends Juergen Gross
@ 2021-05-13 10:02 ` Juergen Gross
  2021-05-17 13:50   ` Jan Beulich
  2021-05-13 10:02 ` [PATCH 3/8] xen/blkfront: don't take local copy of a request from the ring page Juergen Gross
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Juergen Gross @ 2021-05-13 10:02 UTC (permalink / raw)
  To: xen-devel, linux-block, linux-kernel
  Cc: Juergen Gross, Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe

In order to avoid problems in case the backend is modifying a response
on the ring page while the frontend has already seen it, just read the
response into a local buffer in one go and then operate on that buffer
only.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/block/xen-blkfront.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 10df39a8b18d..a8b56c153330 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1557,7 +1557,7 @@ static bool blkif_completion(unsigned long *id,
 static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 {
 	struct request *req;
-	struct blkif_response *bret;
+	struct blkif_response bret;
 	RING_IDX i, rp;
 	unsigned long flags;
 	struct blkfront_ring_info *rinfo = (struct blkfront_ring_info *)dev_id;
@@ -1574,8 +1574,9 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 	for (i = rinfo->ring.rsp_cons; i != rp; i++) {
 		unsigned long id;
 
-		bret = RING_GET_RESPONSE(&rinfo->ring, i);
-		id   = bret->id;
+		RING_COPY_RESPONSE(&rinfo->ring, i, &bret);
+		id = bret.id;
+
 		/*
 		 * The backend has messed up and given us an id that we would
 		 * never have given to it (we stamp it up to BLK_RING_SIZE -
@@ -1583,39 +1584,39 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 		 */
 		if (id >= BLK_RING_SIZE(info)) {
 			WARN(1, "%s: response to %s has incorrect id (%ld)\n",
-			     info->gd->disk_name, op_name(bret->operation), id);
+			     info->gd->disk_name, op_name(bret.operation), id);
 			/* We can't safely get the 'struct request' as
 			 * the id is busted. */
 			continue;
 		}
 		req  = rinfo->shadow[id].request;
 
-		if (bret->operation != BLKIF_OP_DISCARD) {
+		if (bret.operation != BLKIF_OP_DISCARD) {
 			/*
 			 * We may need to wait for an extra response if the
 			 * I/O request is split in 2
 			 */
-			if (!blkif_completion(&id, rinfo, bret))
+			if (!blkif_completion(&id, rinfo, &bret))
 				continue;
 		}
 
 		if (add_id_to_freelist(rinfo, id)) {
 			WARN(1, "%s: response to %s (id %ld) couldn't be recycled!\n",
-			     info->gd->disk_name, op_name(bret->operation), id);
+			     info->gd->disk_name, op_name(bret.operation), id);
 			continue;
 		}
 
-		if (bret->status == BLKIF_RSP_OKAY)
+		if (bret.status == BLKIF_RSP_OKAY)
 			blkif_req(req)->error = BLK_STS_OK;
 		else
 			blkif_req(req)->error = BLK_STS_IOERR;
 
-		switch (bret->operation) {
+		switch (bret.operation) {
 		case BLKIF_OP_DISCARD:
-			if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) {
+			if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) {
 				struct request_queue *rq = info->rq;
 				printk(KERN_WARNING "blkfront: %s: %s op failed\n",
-					   info->gd->disk_name, op_name(bret->operation));
+					   info->gd->disk_name, op_name(bret.operation));
 				blkif_req(req)->error = BLK_STS_NOTSUPP;
 				info->feature_discard = 0;
 				info->feature_secdiscard = 0;
@@ -1625,15 +1626,15 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 			break;
 		case BLKIF_OP_FLUSH_DISKCACHE:
 		case BLKIF_OP_WRITE_BARRIER:
-			if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) {
+			if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) {
 				printk(KERN_WARNING "blkfront: %s: %s op failed\n",
-				       info->gd->disk_name, op_name(bret->operation));
+				       info->gd->disk_name, op_name(bret.operation));
 				blkif_req(req)->error = BLK_STS_NOTSUPP;
 			}
-			if (unlikely(bret->status == BLKIF_RSP_ERROR &&
+			if (unlikely(bret.status == BLKIF_RSP_ERROR &&
 				     rinfo->shadow[id].req.u.rw.nr_segments == 0)) {
 				printk(KERN_WARNING "blkfront: %s: empty %s op failed\n",
-				       info->gd->disk_name, op_name(bret->operation));
+				       info->gd->disk_name, op_name(bret.operation));
 				blkif_req(req)->error = BLK_STS_NOTSUPP;
 			}
 			if (unlikely(blkif_req(req)->error)) {
@@ -1646,9 +1647,9 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 			fallthrough;
 		case BLKIF_OP_READ:
 		case BLKIF_OP_WRITE:
-			if (unlikely(bret->status != BLKIF_RSP_OKAY))
+			if (unlikely(bret.status != BLKIF_RSP_OKAY))
 				dev_dbg(&info->xbdev->dev, "Bad return from blkdev data "
-					"request: %x\n", bret->status);
+					"request: %x\n", bret.status);
 
 			break;
 		default:
-- 
2.26.2
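
The single-read idea from the commit message above can be sketched in
plain user-space C. This is only an illustration under assumptions: all
demo_* names and the simplified struct layout are hypothetical, and the
real RING_COPY_RESPONSE() macro comes from include/xen/interface/io/ring.h
rather than from this snippet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RING_SIZE 4u	/* hypothetical; must be a power of two */

/* Hypothetical, simplified stand-in for struct blkif_response. */
struct demo_response {
	uint64_t id;
	uint8_t  operation;
	int16_t  status;
};

/* Hypothetical shared ring; in the driver this page is writable by the backend. */
struct demo_ring {
	struct demo_response slot[DEMO_RING_SIZE];
};

/*
 * Copy the response at ring index idx into private memory in one go.
 * After this call the caller looks only at *dst, so later changes to
 * the shared slot cannot alter values that were already checked.
 */
static void demo_copy_response(const struct demo_ring *ring, unsigned int idx,
			       struct demo_response *dst)
{
	assert(ring != NULL && dst != NULL);
	memcpy(dst, &ring->slot[idx & (DEMO_RING_SIZE - 1)], sizeof(*dst));
}
```

The design point is that validation and use both operate on the private
copy, so there is no window between "check" and "use" in which the shared
page matters.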



* [PATCH 3/8] xen/blkfront: don't take local copy of a request from the ring page
  2021-05-13 10:02 [PATCH 0/8] xen: harden frontends against malicious backends Juergen Gross
  2021-05-13 10:02 ` [PATCH 2/8] xen/blkfront: read response from backend only once Juergen Gross
@ 2021-05-13 10:02 ` Juergen Gross
  2021-05-17 14:01   ` Jan Beulich
  2021-05-13 10:02 ` [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly Juergen Gross
  2021-05-21 10:43 ` [PATCH 0/8] xen: harden frontends against malicious backends Marek Marczykowski-Górecki
  3 siblings, 1 reply; 18+ messages in thread
From: Juergen Gross @ 2021-05-13 10:02 UTC (permalink / raw)
  To: xen-devel, linux-block, linux-kernel
  Cc: Juergen Gross, Boris Ostrovsky, Stefano Stabellini,
	Konrad Rzeszutek Wilk, Roger Pau Monné,
	Jens Axboe

To prevent a malicious backend from influencing the local copy of a
request, build the request locally first and then copy it to the ring
page, instead of doing it the other way round as today.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/block/xen-blkfront.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index a8b56c153330..c6a05de4f15f 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -546,7 +546,7 @@ static unsigned long blkif_ring_get_request(struct blkfront_ring_info *rinfo,
 	rinfo->shadow[id].status = REQ_WAITING;
 	rinfo->shadow[id].associated_id = NO_ASSOCIATED_ID;
 
-	(*ring_req)->u.rw.id = id;
+	rinfo->shadow[id].req.u.rw.id = id;
 
 	return id;
 }
@@ -554,11 +554,12 @@ static unsigned long blkif_ring_get_request(struct blkfront_ring_info *rinfo,
 static int blkif_queue_discard_req(struct request *req, struct blkfront_ring_info *rinfo)
 {
 	struct blkfront_info *info = rinfo->dev_info;
-	struct blkif_request *ring_req;
+	struct blkif_request *ring_req, *final_ring_req;
 	unsigned long id;
 
 	/* Fill out a communications ring structure. */
-	id = blkif_ring_get_request(rinfo, req, &ring_req);
+	id = blkif_ring_get_request(rinfo, req, &final_ring_req);
+	ring_req = &rinfo->shadow[id].req;
 
 	ring_req->operation = BLKIF_OP_DISCARD;
 	ring_req->u.discard.nr_sectors = blk_rq_sectors(req);
@@ -569,8 +570,8 @@ static int blkif_queue_discard_req(struct request *req, struct blkfront_ring_inf
 	else
 		ring_req->u.discard.flag = 0;
 
-	/* Keep a private copy so we can reissue requests when recovering. */
-	rinfo->shadow[id].req = *ring_req;
+	/* Copy the request to the ring page. */
+	*final_ring_req = *ring_req;
 
 	return 0;
 }
@@ -703,6 +704,7 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
 {
 	struct blkfront_info *info = rinfo->dev_info;
 	struct blkif_request *ring_req, *extra_ring_req = NULL;
+	struct blkif_request *final_ring_req, *final_extra_ring_req;
 	unsigned long id, extra_id = NO_ASSOCIATED_ID;
 	bool require_extra_req = false;
 	int i;
@@ -747,7 +749,8 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
 	}
 
 	/* Fill out a communications ring structure. */
-	id = blkif_ring_get_request(rinfo, req, &ring_req);
+	id = blkif_ring_get_request(rinfo, req, &final_ring_req);
+	ring_req = &rinfo->shadow[id].req;
 
 	num_sg = blk_rq_map_sg(req->q, req, rinfo->shadow[id].sg);
 	num_grant = 0;
@@ -798,7 +801,9 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
 		ring_req->u.rw.nr_segments = num_grant;
 		if (unlikely(require_extra_req)) {
 			extra_id = blkif_ring_get_request(rinfo, req,
-							  &extra_ring_req);
+							  &final_extra_ring_req);
+			extra_ring_req = &rinfo->shadow[extra_id].req;
+
 			/*
 			 * Only the first request contains the scatter-gather
 			 * list.
@@ -840,10 +845,10 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
 	if (setup.segments)
 		kunmap_atomic(setup.segments);
 
-	/* Keep a private copy so we can reissue requests when recovering. */
-	rinfo->shadow[id].req = *ring_req;
+	/* Copy request(s) to the ring page. */
+	*final_ring_req = *ring_req;
 	if (unlikely(require_extra_req))
-		rinfo->shadow[extra_id].req = *extra_ring_req;
+		*final_extra_ring_req = *extra_ring_req;
 
 	if (new_persistent_gnts)
 		gnttab_free_grant_references(setup.gref_head);
-- 
2.26.2



* [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-05-13 10:02 [PATCH 0/8] xen: harden frontends against malicious backends Juergen Gross
  2021-05-13 10:02 ` [PATCH 2/8] xen/blkfront: read response from backend only once Juergen Gross
  2021-05-13 10:02 ` [PATCH 3/8] xen/blkfront: don't take local copy of a request from the ring page Juergen Gross
@ 2021-05-13 10:02 ` Juergen Gross
  2021-05-17 14:11   ` Jan Beulich
  2021-05-21 10:43 ` [PATCH 0/8] xen: harden frontends against malicious backends Marek Marczykowski-Górecki
  3 siblings, 1 reply; 18+ messages in thread
From: Juergen Gross @ 2021-05-13 10:02 UTC (permalink / raw)
  To: xen-devel, linux-block, linux-kernel
  Cc: Juergen Gross, Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe

Today blkfront trusts the backend to send only sane response data.
To avoid privilege escalations or crashes caused by a malicious backend,
verify that the data is within expected limits. In particular, make
sure that the response always references an outstanding request.

Introduce a new state of the ring, BLKIF_STATE_ERROR, which is entered
when an inconsistency is detected.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/block/xen-blkfront.c | 62 +++++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 15 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index c6a05de4f15f..aa0f159829b4 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -80,6 +80,7 @@ enum blkif_state {
 	BLKIF_STATE_DISCONNECTED,
 	BLKIF_STATE_CONNECTED,
 	BLKIF_STATE_SUSPENDED,
+	BLKIF_STATE_ERROR,
 };
 
 struct grant {
@@ -89,6 +90,7 @@ struct grant {
 };
 
 enum blk_req_status {
+	REQ_PROCESSING,
 	REQ_WAITING,
 	REQ_DONE,
 	REQ_ERROR,
@@ -543,7 +545,7 @@ static unsigned long blkif_ring_get_request(struct blkfront_ring_info *rinfo,
 
 	id = get_id_from_freelist(rinfo);
 	rinfo->shadow[id].request = req;
-	rinfo->shadow[id].status = REQ_WAITING;
+	rinfo->shadow[id].status = REQ_PROCESSING;
 	rinfo->shadow[id].associated_id = NO_ASSOCIATED_ID;
 
 	rinfo->shadow[id].req.u.rw.id = id;
@@ -572,6 +574,7 @@ static int blkif_queue_discard_req(struct request *req, struct blkfront_ring_inf
 
 	/* Copy the request to the ring page. */
 	*final_ring_req = *ring_req;
+	rinfo->shadow[id].status = REQ_WAITING;
 
 	return 0;
 }
@@ -847,8 +850,11 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
 
 	/* Copy request(s) to the ring page. */
 	*final_ring_req = *ring_req;
-	if (unlikely(require_extra_req))
+	rinfo->shadow[id].status = REQ_WAITING;
+	if (unlikely(require_extra_req)) {
 		*final_extra_ring_req = *extra_ring_req;
+		rinfo->shadow[extra_id].status = REQ_WAITING;
+	}
 
 	if (new_persistent_gnts)
 		gnttab_free_grant_references(setup.gref_head);
@@ -1420,8 +1426,8 @@ static enum blk_req_status blkif_rsp_to_req_status(int rsp)
 static int blkif_get_final_status(enum blk_req_status s1,
 				  enum blk_req_status s2)
 {
-	BUG_ON(s1 == REQ_WAITING);
-	BUG_ON(s2 == REQ_WAITING);
+	BUG_ON(s1 < REQ_DONE);
+	BUG_ON(s2 < REQ_DONE);
 
 	if (s1 == REQ_ERROR || s2 == REQ_ERROR)
 		return BLKIF_RSP_ERROR;
@@ -1454,7 +1460,7 @@ static bool blkif_completion(unsigned long *id,
 		s->status = blkif_rsp_to_req_status(bret->status);
 
 		/* Wait the second response if not yet here. */
-		if (s2->status == REQ_WAITING)
+		if (s2->status < REQ_DONE)
 			return false;
 
 		bret->status = blkif_get_final_status(s->status,
@@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 	spin_lock_irqsave(&rinfo->ring_lock, flags);
  again:
 	rp = rinfo->ring.sring->rsp_prod;
+	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
+		pr_alert("%s: illegal number of responses %u\n",
+			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
+		goto err;
+	}
 	rmb(); /* Ensure we see queued responses up to 'rp'. */
 
 	for (i = rinfo->ring.rsp_cons; i != rp; i++) {
 		unsigned long id;
+		unsigned int op;
 
 		RING_COPY_RESPONSE(&rinfo->ring, i, &bret);
 		id = bret.id;
@@ -1588,14 +1600,28 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 		 * look in get_id_from_freelist.
 		 */
 		if (id >= BLK_RING_SIZE(info)) {
-			WARN(1, "%s: response to %s has incorrect id (%ld)\n",
-			     info->gd->disk_name, op_name(bret.operation), id);
-			/* We can't safely get the 'struct request' as
-			 * the id is busted. */
-			continue;
+			pr_alert("%s: response has incorrect id (%ld)\n",
+				 info->gd->disk_name, id);
+			goto err;
 		}
+		if (rinfo->shadow[id].status != REQ_WAITING) {
+			pr_alert("%s: response references no pending request\n",
+				 info->gd->disk_name);
+			goto err;
+		}
+
+		rinfo->shadow[id].status = REQ_PROCESSING;
 		req  = rinfo->shadow[id].request;
 
+		op = rinfo->shadow[id].req.operation;
+		if (op == BLKIF_OP_INDIRECT)
+			op = rinfo->shadow[id].req.u.indirect.indirect_op;
+		if (bret.operation != op) {
+			pr_alert("%s: response has wrong operation (%u instead of %u)\n",
+				 info->gd->disk_name, bret.operation, op);
+			goto err;
+		}
+
 		if (bret.operation != BLKIF_OP_DISCARD) {
 			/*
 			 * We may need to wait for an extra response if the
@@ -1620,7 +1646,8 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 		case BLKIF_OP_DISCARD:
 			if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) {
 				struct request_queue *rq = info->rq;
-				printk(KERN_WARNING "blkfront: %s: %s op failed\n",
+
+				pr_warn_ratelimited("blkfront: %s: %s op failed\n",
 					   info->gd->disk_name, op_name(bret.operation));
 				blkif_req(req)->error = BLK_STS_NOTSUPP;
 				info->feature_discard = 0;
@@ -1632,13 +1659,13 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 		case BLKIF_OP_FLUSH_DISKCACHE:
 		case BLKIF_OP_WRITE_BARRIER:
 			if (unlikely(bret.status == BLKIF_RSP_EOPNOTSUPP)) {
-				printk(KERN_WARNING "blkfront: %s: %s op failed\n",
+				pr_warn_ratelimited("blkfront: %s: %s op failed\n",
 				       info->gd->disk_name, op_name(bret.operation));
 				blkif_req(req)->error = BLK_STS_NOTSUPP;
 			}
 			if (unlikely(bret.status == BLKIF_RSP_ERROR &&
 				     rinfo->shadow[id].req.u.rw.nr_segments == 0)) {
-				printk(KERN_WARNING "blkfront: %s: empty %s op failed\n",
+				pr_warn_ratelimited("blkfront: %s: empty %s op failed\n",
 				       info->gd->disk_name, op_name(bret.operation));
 				blkif_req(req)->error = BLK_STS_NOTSUPP;
 			}
@@ -1653,8 +1680,8 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 		case BLKIF_OP_READ:
 		case BLKIF_OP_WRITE:
 			if (unlikely(bret.status != BLKIF_RSP_OKAY))
-				dev_dbg(&info->xbdev->dev, "Bad return from blkdev data "
-					"request: %x\n", bret.status);
+				dev_dbg_ratelimited(&info->xbdev->dev,
+					"Bad return from blkdev data request: %x\n", bret.status);
 
 			break;
 		default:
@@ -1680,6 +1707,11 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
 	spin_unlock_irqrestore(&rinfo->ring_lock, flags);
 
 	return IRQ_HANDLED;
+
+ err:
+	info->connected = BLKIF_STATE_ERROR;
+	pr_alert("%s disabled for further use\n", info->gd->disk_name);
+	return IRQ_HANDLED;
 }
 
 
-- 
2.26.2



* Re: [PATCH 2/8] xen/blkfront: read response from backend only once
  2021-05-13 10:02 ` [PATCH 2/8] xen/blkfront: read response from backend only once Juergen Gross
@ 2021-05-17 13:50   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-05-17 13:50 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, linux-block,
	xen-devel, linux-kernel

On 13.05.2021 12:02, Juergen Gross wrote:
> In order to avoid problems in case the backend is modifying a response
> on the ring page while the frontend has already seen it, just read the
> response into a local buffer in one go and then operate on that buffer
> only.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 3/8] xen/blkfront: don't take local copy of a request from the ring page
  2021-05-13 10:02 ` [PATCH 3/8] xen/blkfront: don't take local copy of a request from the ring page Juergen Gross
@ 2021-05-17 14:01   ` Jan Beulich
  2021-05-17 14:11     ` Juergen Gross
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-05-17 14:01 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Boris Ostrovsky, Stefano Stabellini, Konrad Rzeszutek Wilk,
	Roger Pau Monné,
	Jens Axboe, xen-devel, linux-kernel, linux-block

On 13.05.2021 12:02, Juergen Gross wrote:
> In order to avoid a malicious backend being able to influence the local
> copy of a request build the request locally first and then copy it to
> the ring page instead of doing it the other way round as today.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one remark/question:

> @@ -703,6 +704,7 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
>  {
>  	struct blkfront_info *info = rinfo->dev_info;
>  	struct blkif_request *ring_req, *extra_ring_req = NULL;
> +	struct blkif_request *final_ring_req, *final_extra_ring_req;

Without setting final_extra_ring_req to NULL just like is done for
extra_ring_req, ...

> @@ -840,10 +845,10 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
>  	if (setup.segments)
>  		kunmap_atomic(setup.segments);
>  
> -	/* Keep a private copy so we can reissue requests when recovering. */
> -	rinfo->shadow[id].req = *ring_req;
> +	/* Copy request(s) to the ring page. */
> +	*final_ring_req = *ring_req;
>  	if (unlikely(require_extra_req))
> -		rinfo->shadow[extra_id].req = *extra_ring_req;
> +		*final_extra_ring_req = *extra_ring_req;

... are you sure all supported compilers will recognize the
conditional use and not warn about use of a possibly uninitialized
variable?

Jan


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-05-13 10:02 ` [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly Juergen Gross
@ 2021-05-17 14:11   ` Jan Beulich
  2021-05-17 14:23     ` Juergen Gross
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-05-17 14:11 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel

On 13.05.2021 12:02, Juergen Gross wrote:
> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>  	spin_lock_irqsave(&rinfo->ring_lock, flags);
>   again:
>  	rp = rinfo->ring.sring->rsp_prod;
> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
> +		pr_alert("%s: illegal number of responses %u\n",
> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
> +		goto err;
> +	}
>  	rmb(); /* Ensure we see queued responses up to 'rp'. */

I think you want to insert after the barrier.

> @@ -1680,6 +1707,11 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>  	spin_unlock_irqrestore(&rinfo->ring_lock, flags);
>  
>  	return IRQ_HANDLED;
> +
> + err:
> +	info->connected = BLKIF_STATE_ERROR;
> +	pr_alert("%s disabled for further use\n", info->gd->disk_name);
> +	return IRQ_HANDLED;
>  }

Am I understanding that a suspend (and then resume) can be used to
recover from error state? If so - is this intentional? If so in turn,
would it make sense to spell this out in the description?

Jan


* Re: [PATCH 3/8] xen/blkfront: don't take local copy of a request from the ring page
  2021-05-17 14:01   ` Jan Beulich
@ 2021-05-17 14:11     ` Juergen Gross
  0 siblings, 0 replies; 18+ messages in thread
From: Juergen Gross @ 2021-05-17 14:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Boris Ostrovsky, Stefano Stabellini, Konrad Rzeszutek Wilk,
	Roger Pau Monné,
	Jens Axboe, xen-devel, linux-kernel, linux-block


On 17.05.21 16:01, Jan Beulich wrote:
> On 13.05.2021 12:02, Juergen Gross wrote:
>> In order to avoid a malicious backend being able to influence the local
>> copy of a request build the request locally first and then copy it to
>> the ring page instead of doing it the other way round as today.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with one remark/question:
> 
>> @@ -703,6 +704,7 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
>>   {
>>   	struct blkfront_info *info = rinfo->dev_info;
>>   	struct blkif_request *ring_req, *extra_ring_req = NULL;
>> +	struct blkif_request *final_ring_req, *final_extra_ring_req;
> 
> Without setting final_extra_ring_req to NULL just like is done for
> extra_ring_req, ...
> 
>> @@ -840,10 +845,10 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
>>   	if (setup.segments)
>>   		kunmap_atomic(setup.segments);
>>   
>> -	/* Keep a private copy so we can reissue requests when recovering. */
>> -	rinfo->shadow[id].req = *ring_req;
>> +	/* Copy request(s) to the ring page. */
>> +	*final_ring_req = *ring_req;
>>   	if (unlikely(require_extra_req))
>> -		rinfo->shadow[extra_id].req = *extra_ring_req;
>> +		*final_extra_ring_req = *extra_ring_req;
> 
> ... are you sure all supported compilers will recognize the
> conditional use and not warn about use of a possibly uninitialized
> variable?

Hmm, probably better safe than sorry. Will change it.


Juergen


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-05-17 14:11   ` Jan Beulich
@ 2021-05-17 14:23     ` Juergen Gross
  2021-05-17 15:12       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Juergen Gross @ 2021-05-17 14:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel


On 17.05.21 16:11, Jan Beulich wrote:
> On 13.05.2021 12:02, Juergen Gross wrote:
>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>   	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>    again:
>>   	rp = rinfo->ring.sring->rsp_prod;
>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>> +		pr_alert("%s: illegal number of responses %u\n",
>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>> +		goto err;
>> +	}
>>   	rmb(); /* Ensure we see queued responses up to 'rp'. */
> 
> I think you want to insert after the barrier.

Why? The relevant variable which is checked is "rp". The result of the
check is in no way depending on the responses themselves. And any change
of rsp_cons is protected by ring_lock, so there is no possibility of
reading an old value here.

> 
>> @@ -1680,6 +1707,11 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>   	spin_unlock_irqrestore(&rinfo->ring_lock, flags);
>>   
>>   	return IRQ_HANDLED;
>> +
>> + err:
>> +	info->connected = BLKIF_STATE_ERROR;
>> +	pr_alert("%s disabled for further use\n", info->gd->disk_name);
>> +	return IRQ_HANDLED;
>>   }
> 
> Am I understanding that a suspend (and then resume) can be used to
> recover from error state? If so - is this intentional? If so in turn,
> would it make sense to spell this out in the description?

I'd call it a nice side effect rather than intention. I can add a remark
to the commit message if you want.


Juergen


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-05-17 14:23     ` Juergen Gross
@ 2021-05-17 15:12       ` Jan Beulich
  2021-05-17 15:22         ` Juergen Gross
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-05-17 15:12 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel

On 17.05.2021 16:23, Juergen Gross wrote:
> On 17.05.21 16:11, Jan Beulich wrote:
>> On 13.05.2021 12:02, Juergen Gross wrote:
>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>   	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>    again:
>>>   	rp = rinfo->ring.sring->rsp_prod;
>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>> +		pr_alert("%s: illegal number of responses %u\n",
>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>> +		goto err;
>>> +	}
>>>   	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>
>> I think you want to insert after the barrier.
> 
> Why? The relevant variable which is checked is "rp". The result of the
> check is in no way depending on the responses themselves. And any change
> of rsp_cons is protected by ring_lock, so there is no possibility of
> reading an old value here.

But this is a standard double read situation: You might check a value
and then (via a separate read) use a different one past the barrier.

Jan


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-05-17 15:12       ` Jan Beulich
@ 2021-05-17 15:22         ` Juergen Gross
  2021-05-17 15:33           ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Juergen Gross @ 2021-05-17 15:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel


On 17.05.21 17:12, Jan Beulich wrote:
> On 17.05.2021 16:23, Juergen Gross wrote:
>> On 17.05.21 16:11, Jan Beulich wrote:
>>> On 13.05.2021 12:02, Juergen Gross wrote:
>>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>>    	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>>     again:
>>>>    	rp = rinfo->ring.sring->rsp_prod;
>>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>>> +		pr_alert("%s: illegal number of responses %u\n",
>>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>>> +		goto err;
>>>> +	}
>>>>    	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>>
>>> I think you want to insert after the barrier.
>>
>> Why? The relevant variable which is checked is "rp". The result of the
>> check is in no way depending on the responses themselves. And any change
>> of rsp_cons is protected by ring_lock, so there is no possibility of
>> reading an old value here.
> 
> But this is a standard double read situation: You might check a value
> and then (via a separate read) use a different one past the barrier.

Yes and no.

rsp_cons should never be written by the other side, and additionally
it would be read multiple times anyway.

So if the other side is writing it, the write could always happen after
the test and before the loop is started. This is no real issue here as
the frontend would very soon stumble over an illegal response (either
no request pending, or some other inconsistency). The test is meant to
have a more detailed error message in case it hits.

In the end it doesn't really matter, so I can change it. I just wanted
to point out that IMO both variants are equally valid.


Juergen


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-05-17 15:22         ` Juergen Gross
@ 2021-05-17 15:33           ` Jan Beulich
  2021-07-08  5:47             ` Juergen Gross
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-05-17 15:33 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel

On 17.05.2021 17:22, Juergen Gross wrote:
> On 17.05.21 17:12, Jan Beulich wrote:
>> On 17.05.2021 16:23, Juergen Gross wrote:
>>> On 17.05.21 16:11, Jan Beulich wrote:
>>>> On 13.05.2021 12:02, Juergen Gross wrote:
>>>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>>>    	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>>>     again:
>>>>>    	rp = rinfo->ring.sring->rsp_prod;
>>>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>>>> +		pr_alert("%s: illegal number of responses %u\n",
>>>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>>>> +		goto err;
>>>>> +	}
>>>>>    	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>>>
>>>> I think you want to insert after the barrier.
>>>
>>> Why? The relevant variable which is checked is "rp". The result of the
>>> check is in no way depending on the responses themselves. And any change
>>> of rsp_cons is protected by ring_lock, so there is no possibility of
>>> reading an old value here.
>>
>> But this is a standard double read situation: You might check a value
>> and then (via a separate read) use a different one past the barrier.
> 
> Yes and no.
> 
> rsp_cons should never be written by the other side, and additionally
> it would be read multiple times anyway.

But I'm talking about rsp_prod, as that's what rp gets loaded from.

Jan

> So if the other side is writing it, the write could always happen after
> the test and before the loop is started. This is no real issue here as
> the frontend would very soon stumble over an illegal response (either
> no request pending, or some other inconsistency). The test is meant to
> have a more detailed error message in case it hits.
> 
> In the end it doesn't really matter, so I can change it. I just wanted
> to point out that IMO both variants are equally valid.
> 
> 
> Juergen
> 



* Re: [PATCH 0/8] xen: harden frontends against malicious backends
  2021-05-13 10:02 [PATCH 0/8] xen: harden frontends against malicious backends Juergen Gross
                   ` (2 preceding siblings ...)
  2021-05-13 10:02 ` [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly Juergen Gross
@ 2021-05-21 10:43 ` Marek Marczykowski-Górecki
  3 siblings, 0 replies; 18+ messages in thread
From: Marek Marczykowski-Górecki @ 2021-05-21 10:43 UTC (permalink / raw)
  To: Juergen Gross
  Cc: xen-devel, linux-kernel, linux-block, netdev, linuxppc-dev,
	Boris Ostrovsky, Stefano Stabellini, Konrad Rzeszutek Wilk,
	Roger Pau Monné,
	Jens Axboe, David S. Miller, Jakub Kicinski, Greg Kroah-Hartman,
	Jiri Slaby


On Thu, May 13, 2021 at 12:02:54PM +0200, Juergen Gross wrote:
> Xen backends of para-virtualized devices can live in dom0 kernel, dom0
> user land, or in a driver domain. This means that a backend might
> reside in a less trusted environment than the Xen core components, so
> a backend should not be able to do harm to a Xen guest (it can still
> mess up I/O data, but it shouldn't be able to e.g. crash a guest by
> other means or cause a privilege escalation in the guest).
> 
> Unfortunately many frontends in the Linux kernel are fully trusting
> their respective backends. This series is starting to fix the most
> important frontends: console, disk and network.
> 
> It was discussed to handle this as a security problem, but the topic
> was discussed in public before, so it isn't a real secret.

Is this based on the patches we ship in Qubes[1], which I also sent here
some years ago[2]? I see a lot of similarities. If not, you may want to
compare them.

[1] https://github.com/QubesOS/qubes-linux-kernel/
[2] https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg02336.html


> Juergen Gross (8):
>   xen: sync include/xen/interface/io/ring.h with Xen's newest version
>   xen/blkfront: read response from backend only once
>   xen/blkfront: don't take local copy of a request from the ring page
>   xen/blkfront: don't trust the backend response data blindly
>   xen/netfront: read response from backend only once
>   xen/netfront: don't read data from request on the ring page
>   xen/netfront: don't trust the backend response data blindly
>   xen/hvc: replace BUG_ON() with negative return value
> 
>  drivers/block/xen-blkfront.c    | 118 +++++++++-----
>  drivers/net/xen-netfront.c      | 184 ++++++++++++++-------
>  drivers/tty/hvc/hvc_xen.c       |  15 +-
>  include/xen/interface/io/ring.h | 278 ++++++++++++++++++--------------
>  4 files changed, 369 insertions(+), 226 deletions(-)
> 
> -- 
> 2.26.2
> 
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-05-17 15:33           ` Jan Beulich
@ 2021-07-08  5:47             ` Juergen Gross
  2021-07-08  6:37               ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Juergen Gross @ 2021-07-08  5:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel



On 17.05.21 17:33, Jan Beulich wrote:
> On 17.05.2021 17:22, Juergen Gross wrote:
>> On 17.05.21 17:12, Jan Beulich wrote:
>>> On 17.05.2021 16:23, Juergen Gross wrote:
>>>> On 17.05.21 16:11, Jan Beulich wrote:
>>>>> On 13.05.2021 12:02, Juergen Gross wrote:
>>>>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>>>>     	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>>>>      again:
>>>>>>     	rp = rinfo->ring.sring->rsp_prod;
>>>>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>>>>> +		pr_alert("%s: illegal number of responses %u\n",
>>>>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>>>>> +		goto err;
>>>>>> +	}
>>>>>>     	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>>>>
>>>>> I think you want to insert after the barrier.
>>>>
>>>> Why? The relevant variable which is checked is "rp". The result of the
>>>> check is in no way depending on the responses themselves. And any change
>>>> of rsp_cons is protected by ring_lock, so there is no possibility of
>>>> reading an old value here.
>>>
>>> But this is a standard double read situation: You might check a value
>>> and then (via a separate read) use a different one past the barrier.
>>
>> Yes and no.
>>
>> rsp_cons should never be written by the other side, and additionally
>> it would be read multiple times anyway.
> 
> But I'm talking about rsp_prod, as that's what rp gets loaded from.

Oh, now I get your problem.

But wouldn't that be better solved by using READ_ONCE() when reading rp
instead?
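
To make the hazard concrete, the following is a small, deterministic
user-space model in C, not kernel code. It assumes a hostile peer that
changes the shared producer index between two fetches. All names
(demo_hostile_field, demo_fetch, demo_double_fetch, demo_single_fetch,
DEMO_RING_SIZE) are hypothetical. Each call to demo_fetch() stands in
for one load from the shared ring page.

```c
#include <stdint.h>

/* Hypothetical ring size, for illustration only. */
#define DEMO_RING_SIZE 32u

/* Models a shared field that a hostile peer rewrites after the first load. */
struct demo_hostile_field {
    uint32_t values[2];   /* value seen by the first load, then by later loads */
    unsigned int reads;   /* number of loads performed so far */
};

/* One load from the "shared page". */
static uint32_t demo_fetch(struct demo_hostile_field *f)
{
    uint32_t v = f->values[f->reads < 1 ? 0 : 1];

    f->reads++;
    return v;
}

/* Unsafe: validates one load, then acts on a second, unvalidated load. */
static uint32_t demo_double_fetch(struct demo_hostile_field *f, uint32_t cons)
{
    if (demo_fetch(f) - cons > DEMO_RING_SIZE)
        return 0;               /* rejected */
    return demo_fetch(f) - cons; /* may differ from the checked value */
}

/* Safe: loads once into a local and uses only that snapshot. */
static uint32_t demo_single_fetch(struct demo_hostile_field *f, uint32_t cons)
{
    uint32_t rp = demo_fetch(f);

    if (rp - cons > DEMO_RING_SIZE)
        return 0;               /* rejected */
    return rp - cons;
}
```

With the values 8 and then 5000, the double-fetch variant passes the
check on 8 but goes on to work with 5000, while the single-fetch variant
only ever sees 8.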


Juergen


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-07-08  5:47             ` Juergen Gross
@ 2021-07-08  6:37               ` Jan Beulich
  2021-07-08  6:40                 ` Juergen Gross
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-07-08  6:37 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel

On 08.07.2021 07:47, Juergen Gross wrote:
> On 17.05.21 17:33, Jan Beulich wrote:
>> On 17.05.2021 17:22, Juergen Gross wrote:
>>> On 17.05.21 17:12, Jan Beulich wrote:
>>>> On 17.05.2021 16:23, Juergen Gross wrote:
>>>>> On 17.05.21 16:11, Jan Beulich wrote:
>>>>>> On 13.05.2021 12:02, Juergen Gross wrote:
>>>>>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>>>>>     	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>>>>>      again:
>>>>>>>     	rp = rinfo->ring.sring->rsp_prod;
>>>>>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>>>>>> +		pr_alert("%s: illegal number of responses %u\n",
>>>>>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>>>>>> +		goto err;
>>>>>>> +	}
>>>>>>>     	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>>>>>
>>>>>> I think you want to insert after the barrier.
>>>>>
>>>>> Why? The relevant variable which is checked is "rp". The result of the
>>>>> check is in no way depending on the responses themselves. And any change
>>>>> of rsp_cons is protected by ring_lock, so there is no possibility of
>>>>> reading an old value here.
>>>>
>>>> But this is a standard double read situation: You might check a value
>>>> and then (via a separate read) use a different one past the barrier.
>>>
>>> Yes and no.
>>>
>>> rsp_cons should never be written by the other side, and additionally
>>> it would be read multiple times anyway.
>>
>> But I'm talking about rsp_prod, as that's what rp gets loaded from.
> 
> Oh, now I get your problem.
> 
> But shouldn't that better be solved by using READ_ONCE() for reading rp
> instead?

Not sure. As I understand it, the rmb() is needed anyway, so you could
just as well move your code addition.

Jan



* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-07-08  6:37               ` Jan Beulich
@ 2021-07-08  6:40                 ` Juergen Gross
  2021-07-08  6:52                   ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Juergen Gross @ 2021-07-08  6:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel



On 08.07.21 08:37, Jan Beulich wrote:
> On 08.07.2021 07:47, Juergen Gross wrote:
>> On 17.05.21 17:33, Jan Beulich wrote:
>>> On 17.05.2021 17:22, Juergen Gross wrote:
>>>> On 17.05.21 17:12, Jan Beulich wrote:
>>>>> On 17.05.2021 16:23, Juergen Gross wrote:
>>>>>> On 17.05.21 16:11, Jan Beulich wrote:
>>>>>>> On 13.05.2021 12:02, Juergen Gross wrote:
>>>>>>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>>>>>>      	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>>>>>>       again:
>>>>>>>>      	rp = rinfo->ring.sring->rsp_prod;
>>>>>>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>>>>>>> +		pr_alert("%s: illegal number of responses %u\n",
>>>>>>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>>>>>>> +		goto err;
>>>>>>>> +	}
>>>>>>>>      	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>>>>>>
>>>>>>> I think you want to insert after the barrier.
>>>>>>
>>>>>> Why? The relevant variable which is checked is "rp". The result of the
>>>>>> check is in no way depending on the responses themselves. And any change
>>>>>> of rsp_cons is protected by ring_lock, so there is no possibility of
>>>>>> reading an old value here.
>>>>>
>>>>> But this is a standard double read situation: You might check a value
>>>>> and then (via a separate read) use a different one past the barrier.
>>>>
>>>> Yes and no.
>>>>
>>>> rsp_cons should never be written by the other side, and additionally
>>>> it would be read multiple times anyway.
>>>
>>> But I'm talking about rsp_prod, as that's what rp gets loaded from.
>>
>> Oh, now I get your problem.
>>
>> But shouldn't that better be solved by using READ_ONCE() for reading rp
>> instead?
> 
> Not sure - the rmb() is needed anyway aiui, and hence you could as well
> move your code addition.

Sure.

My question was rather: does the rmb() really eliminate the possibility
of a double read introduced by the compiler? If yes, moving the code is
the correct solution.


Juergen


* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-07-08  6:40                 ` Juergen Gross
@ 2021-07-08  6:52                   ` Jan Beulich
  2021-07-08  6:56                     ` Juergen Gross
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-07-08  6:52 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel

On 08.07.2021 08:40, Juergen Gross wrote:
> On 08.07.21 08:37, Jan Beulich wrote:
>> On 08.07.2021 07:47, Juergen Gross wrote:
>>> On 17.05.21 17:33, Jan Beulich wrote:
>>>> On 17.05.2021 17:22, Juergen Gross wrote:
>>>>> On 17.05.21 17:12, Jan Beulich wrote:
>>>>>> On 17.05.2021 16:23, Juergen Gross wrote:
>>>>>>> On 17.05.21 16:11, Jan Beulich wrote:
>>>>>>>> On 13.05.2021 12:02, Juergen Gross wrote:
>>>>>>>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>>>>>>>      	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>>>>>>>       again:
>>>>>>>>>      	rp = rinfo->ring.sring->rsp_prod;
>>>>>>>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>>>>>>>> +		pr_alert("%s: illegal number of responses %u\n",
>>>>>>>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>>>>>>>> +		goto err;
>>>>>>>>> +	}
>>>>>>>>>      	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>>>>>>>
>>>>>>>> I think you want to insert after the barrier.
>>>>>>>
>>>>>>> Why? The relevant variable which is checked is "rp". The result of the
>>>>>>> check is in no way depending on the responses themselves. And any change
>>>>>>> of rsp_cons is protected by ring_lock, so there is no possibility of
>>>>>>> reading an old value here.
>>>>>>
>>>>>> But this is a standard double read situation: You might check a value
>>>>>> and then (via a separate read) use a different one past the barrier.
>>>>>
>>>>> Yes and no.
>>>>>
>>>>> rsp_cons should never be written by the other side, and additionally
>>>>> it would be read multiple times anyway.
>>>>
>>>> But I'm talking about rsp_prod, as that's what rp gets loaded from.
>>>
>>> Oh, now I get your problem.
>>>
>>> But shouldn't that better be solved by using READ_ONCE() for reading rp
>>> instead?
>>
>> Not sure - the rmb() is needed anyway aiui, and hence you could as well
>> move your code addition.
> 
> Sure.
> 
> My question was rather: does the rmb() really eliminate the possibility
> of a double read introduced by the compiler? If yes, moving the code is
> the correct solution.

It doesn't eliminate the possibility of a double read, but (leaving
aside split accesses) that's not what you care about here. What you
need is a single stable value to operate on. No matter how many
(non-split) reads the compiler may issue to fill "rp", the final
read's value will be used in the subsequent calculation.

Or at least that's been my understanding. Thinking about it, the
compiler might issue multiple reads into distinct registers ahead of
the barrier, and use different registers for different subsequent
operations. While this would look like intentionally inefficient
code generation to me, you may indeed want to play safe and use
ACCESS_ONCE() _and_ the barrier. I guess there are more places
that would want similar treatment, and it's not a problem that
this change introduces ...
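
The combination suggested here (a single marked read, then the barrier,
then the check on the local copy) could be sketched in user-space C
roughly as follows. This is only a sketch under stated assumptions. The
volatile cast in demo_read_once_u32() stands in for the kernel's
once-only access helper (ACCESS_ONCE() as named above; current Linux
kernels spell it READ_ONCE()). The C11 acquire fence stands in for
rmb(). The struct and function names and DEMO_RING_SIZE are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring size, for illustration only. */
#define DEMO_RING_SIZE 32u

/* Part of the ring that lives in the page shared with the peer. */
struct demo_shared_ring {
    uint32_t rsp_prod;          /* written by the (untrusted) peer */
};

/* Frontend-private view of the ring. */
struct demo_front_ring {
    uint32_t rsp_cons;          /* only ever written by the frontend */
    struct demo_shared_ring *sring;
};

/* Force exactly one load of *p; stand-in for the kernel helper. */
static uint32_t demo_read_once_u32(const uint32_t *p)
{
    return *(const volatile uint32_t *)p;
}

/*
 * Take one snapshot of the producer index, order later reads after it,
 * and validate the snapshot. Returns the number of pending responses,
 * or -1 if the peer advertised an impossible producer index.
 */
static int demo_snapshot_responses(const struct demo_front_ring *ring)
{
    uint32_t rp = demo_read_once_u32(&ring->sring->rsp_prod);

    /* Stand-in for rmb(): later response reads must not pass this load. */
    atomic_thread_fence(memory_order_acquire);

    uint32_t pending = rp - ring->rsp_cons;

    if (pending > DEMO_RING_SIZE)
        return -1;
    return (int)pending;
}
```

Everything after the snapshot works on the local rp only, so a later
change to rsp_prod by the peer cannot invalidate the check.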

Jan



* Re: [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly
  2021-07-08  6:52                   ` Jan Beulich
@ 2021-07-08  6:56                     ` Juergen Gross
  0 siblings, 0 replies; 18+ messages in thread
From: Juergen Gross @ 2021-07-08  6:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, Stefano Stabellini, Jens Axboe, xen-devel,
	linux-block, linux-kernel



On 08.07.21 08:52, Jan Beulich wrote:
> On 08.07.2021 08:40, Juergen Gross wrote:
>> On 08.07.21 08:37, Jan Beulich wrote:
>>> On 08.07.2021 07:47, Juergen Gross wrote:
>>>> On 17.05.21 17:33, Jan Beulich wrote:
>>>>> On 17.05.2021 17:22, Juergen Gross wrote:
>>>>>> On 17.05.21 17:12, Jan Beulich wrote:
>>>>>>> On 17.05.2021 16:23, Juergen Gross wrote:
>>>>>>>> On 17.05.21 16:11, Jan Beulich wrote:
>>>>>>>>> On 13.05.2021 12:02, Juergen Gross wrote:
>>>>>>>>>> @@ -1574,10 +1580,16 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
>>>>>>>>>>       	spin_lock_irqsave(&rinfo->ring_lock, flags);
>>>>>>>>>>        again:
>>>>>>>>>>       	rp = rinfo->ring.sring->rsp_prod;
>>>>>>>>>> +	if (RING_RESPONSE_PROD_OVERFLOW(&rinfo->ring, rp)) {
>>>>>>>>>> +		pr_alert("%s: illegal number of responses %u\n",
>>>>>>>>>> +			 info->gd->disk_name, rp - rinfo->ring.rsp_cons);
>>>>>>>>>> +		goto err;
>>>>>>>>>> +	}
>>>>>>>>>>       	rmb(); /* Ensure we see queued responses up to 'rp'. */
>>>>>>>>>
>>>>>>>>> I think you want to insert after the barrier.
>>>>>>>>
>>>>>>>> Why? The relevant variable which is checked is "rp". The result of the
>>>>>>>> check is in no way depending on the responses themselves. And any change
>>>>>>>> of rsp_cons is protected by ring_lock, so there is no possibility of
>>>>>>>> reading an old value here.
>>>>>>>
>>>>>>> But this is a standard double read situation: You might check a value
>>>>>>> and then (via a separate read) use a different one past the barrier.
>>>>>>
>>>>>> Yes and no.
>>>>>>
>>>>>> rsp_cons should never be written by the other side, and additionally
>>>>>> it would be read multiple times anyway.
>>>>>
>>>>> But I'm talking about rsp_prod, as that's what rp gets loaded from.
>>>>
>>>> Oh, now I get your problem.
>>>>
>>>> But shouldn't that better be solved by using READ_ONCE() for reading rp
>>>> instead?
>>>
>>> Not sure - the rmb() is needed anyway aiui, and hence you could as well
>>> move your code addition.
>>
>> Sure.
>>
>> My question was rather: does the rmb() really eliminate the possibility
>> of a double read introduced by the compiler? If yes, moving the code is
>> the correct solution.
> 
> It doesn't eliminate the possibility of a double read, but (leaving
> aside split accesses) that's not what you care about here. What you
> need is a single stable value to operate on. No matter how many
> (non-split) reads the compiler may issue to fill "rp", the final
> read's value will be used in the subsequent calculation. Or at
> least that's been my understanding; thinking about it the compiler
> might issue multiple reads into distinct registers ahead of the
> barrier, and use different registers for different subsequent
> operations. While this would look like intentionally inefficient
> code generation to me, you may indeed want to play safe and use
> ACCESS_ONCE() _and_ the barrier. I guess there are more places then
> which would want similar treatment, and it's not a problem that
> this change introduces ...

Nevertheless I think I can change it right away. It will also help
against load tearing.


Juergen


Thread overview: 18+ messages
2021-05-13 10:02 [PATCH 0/8] xen: harden frontends against malicious backends Juergen Gross
2021-05-13 10:02 ` [PATCH 2/8] xen/blkfront: read response from backend only once Juergen Gross
2021-05-17 13:50   ` Jan Beulich
2021-05-13 10:02 ` [PATCH 3/8] xen/blkfront: don't take local copy of a request from the ring page Juergen Gross
2021-05-17 14:01   ` Jan Beulich
2021-05-17 14:11     ` Juergen Gross
2021-05-13 10:02 ` [PATCH 4/8] xen/blkfront: don't trust the backend response data blindly Juergen Gross
2021-05-17 14:11   ` Jan Beulich
2021-05-17 14:23     ` Juergen Gross
2021-05-17 15:12       ` Jan Beulich
2021-05-17 15:22         ` Juergen Gross
2021-05-17 15:33           ` Jan Beulich
2021-07-08  5:47             ` Juergen Gross
2021-07-08  6:37               ` Jan Beulich
2021-07-08  6:40                 ` Juergen Gross
2021-07-08  6:52                   ` Jan Beulich
2021-07-08  6:56                     ` Juergen Gross
2021-05-21 10:43 ` [PATCH 0/8] xen: harden frontends against malicious backends Marek Marczykowski-Górecki
