linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] PCI: hv: Miscellaneous changes
@ 2022-03-28 14:42 Andrea Parri (Microsoft)
  2022-03-28 14:42 ` [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages Andrea Parri (Microsoft)
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Andrea Parri (Microsoft) @ 2022-03-28 14:42 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Michael Kelley, Wei Hu, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel, Andrea Parri (Microsoft)

A re-work of [1] to use vmbus_requestor.

Thanks,
  Andrea

[1] https://lkml.kernel.org/r/20220318174848.290621-1-parri.andrea@gmail.com

Andrea Parri (Microsoft) (4):
  Drivers: hv: vmbus: Remove special code for unsolicited messages
  PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus
    hardening
  Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid()
  PCI: hv: Fix synchronization between channel callback and
    hv_compose_msi_msg()

 drivers/hv/channel.c                |  51 ++++++++-----
 drivers/hv/hyperv_vmbus.h           |   2 +-
 drivers/hv/ring_buffer.c            |   4 +-
 drivers/pci/controller/pci-hyperv.c | 109 +++++++++++++++++++++++++---
 include/linux/hyperv.h              |   7 ++
 5 files changed, 141 insertions(+), 32 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages
  2022-03-28 14:42 [RFC PATCH 0/4] PCI: hv: Miscellaneous changes Andrea Parri (Microsoft)
@ 2022-03-28 14:42 ` Andrea Parri (Microsoft)
  2022-03-31 18:00   ` Michael Kelley (LINUX)
  2022-04-07  2:54   ` Andrea Parri
  2022-03-28 14:42 ` [RFC PATCH 2/4] PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening Andrea Parri (Microsoft)
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 13+ messages in thread
From: Andrea Parri (Microsoft) @ 2022-03-28 14:42 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Michael Kelley, Wei Hu, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel, Andrea Parri (Microsoft)

vmbus_requestor has included code for handling unsolicited messages
since its introduction with commit e8b7db38449ac ("Drivers: hv: vmbus:
Add vmbus_requestor data structure for VMBus hardening"); such code was
motivated by the early use of vmbus_requestor from storvsc.  Since
storvsc moved to a tag-based mechanism to generate/retrieve request IDs
with commit bf5fd8cae3c8f ("scsi: storvsc: Use blk_mq_unique_tag() to
generate requestIDs"), the special handling of unsolicited messages in
vmbus_requestor is not useful and can be removed.

Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
---
 drivers/hv/channel.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index dc5c35210c16a..a253eee3aeb1a 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -1243,11 +1243,7 @@ u64 vmbus_next_request_id(struct vmbus_channel *channel, u64 rqst_addr)
 
 	spin_unlock_irqrestore(&rqstor->req_lock, flags);
 
-	/*
-	 * Cannot return an ID of 0, which is reserved for an unsolicited
-	 * message from Hyper-V.
-	 */
-	return current_id + 1;
+	return current_id;
 }
 EXPORT_SYMBOL_GPL(vmbus_next_request_id);
 
@@ -1268,15 +1264,8 @@ u64 vmbus_request_addr(struct vmbus_channel *channel, u64 trans_id)
 	if (!channel->rqstor_size)
 		return VMBUS_NO_RQSTOR;
 
-	/* Hyper-V can send an unsolicited message with ID of 0 */
-	if (!trans_id)
-		return trans_id;
-
 	spin_lock_irqsave(&rqstor->req_lock, flags);
 
-	/* Data corresponding to trans_id is stored at trans_id - 1 */
-	trans_id--;
-
 	/* Invalid trans_id */
 	if (trans_id >= rqstor->size || !test_bit(trans_id, rqstor->req_bitmap)) {
 		spin_unlock_irqrestore(&rqstor->req_lock, flags);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH 2/4] PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening
  2022-03-28 14:42 [RFC PATCH 0/4] PCI: hv: Miscellaneous changes Andrea Parri (Microsoft)
  2022-03-28 14:42 ` [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages Andrea Parri (Microsoft)
@ 2022-03-28 14:42 ` Andrea Parri (Microsoft)
  2022-03-31 18:12   ` Michael Kelley (LINUX)
  2022-03-28 14:42 ` [RFC PATCH 3/4] Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid() Andrea Parri (Microsoft)
  2022-03-28 14:42 ` [RFC PATCH 4/4] PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg() Andrea Parri (Microsoft)
  3 siblings, 1 reply; 13+ messages in thread
From: Andrea Parri (Microsoft) @ 2022-03-28 14:42 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Michael Kelley, Wei Hu, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel, Andrea Parri (Microsoft)

Currently, pointers to guest memory are passed to Hyper-V as transaction
IDs in hv_pci.  In the face of errors or malicious behavior in Hyper-V,
hv_pci should not expose or trust the transaction IDs returned by
Hyper-V to be valid guest memory addresses.  Instead, use small integers
generated by vmbus_requestor as request (transaction) IDs.

Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
---
 drivers/pci/controller/pci-hyperv.c | 30 +++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index ae0bc2fee4ca8..9f963a46b8298 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -91,6 +91,9 @@ static enum pci_protocol_version_t pci_protocol_versions[] = {
 /* space for 32bit serial number as string */
 #define SLOT_NAME_SIZE 11
 
+/* Size of requestor for VMbus */
+#define HV_PCI_RQSTOR_SIZE 64
+
 /*
  * Message Types
  */
@@ -1407,7 +1410,7 @@ static void hv_int_desc_free(struct hv_pci_dev *hpdev,
 	int_pkt->wslot.slot = hpdev->desc.win_slot.slot;
 	int_pkt->int_desc = *int_desc;
 	vmbus_sendpacket(hpdev->hbus->hdev->channel, int_pkt, sizeof(*int_pkt),
-			 (unsigned long)&ctxt.pkt, VM_PKT_DATA_INBAND, 0);
+			 0, VM_PKT_DATA_INBAND, 0);
 	kfree(int_desc);
 }
 
@@ -2649,7 +2652,7 @@ static void hv_eject_device_work(struct work_struct *work)
 	ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE;
 	ejct_pkt->wslot.slot = hpdev->desc.win_slot.slot;
 	vmbus_sendpacket(hbus->hdev->channel, ejct_pkt,
-			 sizeof(*ejct_pkt), (unsigned long)&ctxt.pkt,
+			 sizeof(*ejct_pkt), 0,
 			 VM_PKT_DATA_INBAND, 0);
 
 	/* For the get_pcichild() in hv_pci_eject_device() */
@@ -2696,8 +2699,9 @@ static void hv_pci_onchannelcallback(void *context)
 	const int packet_size = 0x100;
 	int ret;
 	struct hv_pcibus_device *hbus = context;
+	struct vmbus_channel *chan = hbus->hdev->channel;
 	u32 bytes_recvd;
-	u64 req_id;
+	u64 req_id, req_addr;
 	struct vmpacket_descriptor *desc;
 	unsigned char *buffer;
 	int bufferlen = packet_size;
@@ -2743,11 +2747,13 @@ static void hv_pci_onchannelcallback(void *context)
 		switch (desc->type) {
 		case VM_PKT_COMP:
 
-			/*
-			 * The host is trusted, and thus it's safe to interpret
-			 * this transaction ID as a pointer.
-			 */
-			comp_packet = (struct pci_packet *)req_id;
+			req_addr = chan->request_addr_callback(chan, req_id);
+			if (req_addr == VMBUS_RQST_ERROR) {
+				dev_warn_ratelimited(&hbus->hdev->device,
+						     "Invalid request ID\n");
+				break;
+			}
+			comp_packet = (struct pci_packet *)req_addr;
 			response = (struct pci_response *)buffer;
 			comp_packet->completion_func(comp_packet->compl_ctxt,
 						     response,
@@ -3419,6 +3425,10 @@ static int hv_pci_probe(struct hv_device *hdev,
 		goto free_dom;
 	}
 
+	hdev->channel->next_request_id_callback = vmbus_next_request_id;
+	hdev->channel->request_addr_callback = vmbus_request_addr;
+	hdev->channel->rqstor_size = HV_PCI_RQSTOR_SIZE;
+
 	ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
 			 hv_pci_onchannelcallback, hbus);
 	if (ret)
@@ -3749,6 +3759,10 @@ static int hv_pci_resume(struct hv_device *hdev)
 
 	hbus->state = hv_pcibus_init;
 
+	hdev->channel->next_request_id_callback = vmbus_next_request_id;
+	hdev->channel->request_addr_callback = vmbus_request_addr;
+	hdev->channel->rqstor_size = HV_PCI_RQSTOR_SIZE;
+
 	ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
 			 hv_pci_onchannelcallback, hbus);
 	if (ret)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH 3/4] Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid()
  2022-03-28 14:42 [RFC PATCH 0/4] PCI: hv: Miscellaneous changes Andrea Parri (Microsoft)
  2022-03-28 14:42 ` [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages Andrea Parri (Microsoft)
  2022-03-28 14:42 ` [RFC PATCH 2/4] PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening Andrea Parri (Microsoft)
@ 2022-03-28 14:42 ` Andrea Parri (Microsoft)
  2022-03-31 19:47   ` Michael Kelley (LINUX)
  2022-03-28 14:42 ` [RFC PATCH 4/4] PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg() Andrea Parri (Microsoft)
  3 siblings, 1 reply; 13+ messages in thread
From: Andrea Parri (Microsoft) @ 2022-03-28 14:42 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Michael Kelley, Wei Hu, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel, Andrea Parri (Microsoft)

The function can be used to send a VMbus packet and retrieve the
corresponding transaction ID.  It will be used by hv_pci.

No functional change.

Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
---
 drivers/hv/channel.c      | 38 ++++++++++++++++++++++++++++++++------
 drivers/hv/hyperv_vmbus.h |  2 +-
 drivers/hv/ring_buffer.c  |  4 +++-
 include/linux/hyperv.h    |  7 +++++++
 4 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index a253eee3aeb1a..3eaa41c7ce15f 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -1022,11 +1022,13 @@ void vmbus_close(struct vmbus_channel *channel)
 EXPORT_SYMBOL_GPL(vmbus_close);
 
 /**
- * vmbus_sendpacket() - Send the specified buffer on the given channel
+ * vmbus_sendpacket_getid() - Send the specified buffer on the given channel
  * @channel: Pointer to vmbus_channel structure
  * @buffer: Pointer to the buffer you want to send the data from.
  * @bufferlen: Maximum size of what the buffer holds.
  * @requestid: Identifier of the request
+ * @trans_id: Identifier of the transaction associated to this request, if
+ *            the send is successful; undefined, otherwise.
  * @type: Type of packet that is being sent e.g. negotiate, time
  *	  packet etc.
  * @flags: 0 or VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED
@@ -1036,8 +1038,8 @@ EXPORT_SYMBOL_GPL(vmbus_close);
  *
  * Mainly used by Hyper-V drivers.
  */
-int vmbus_sendpacket(struct vmbus_channel *channel, void *buffer,
-			   u32 bufferlen, u64 requestid,
+int vmbus_sendpacket_getid(struct vmbus_channel *channel, void *buffer,
+			   u32 bufferlen, u64 requestid, u64 *trans_id,
 			   enum vmbus_packet_type type, u32 flags)
 {
 	struct vmpacket_descriptor desc;
@@ -1063,7 +1065,31 @@ int vmbus_sendpacket(struct vmbus_channel *channel, void *buffer,
 	bufferlist[2].iov_base = &aligned_data;
 	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
 
-	return hv_ringbuffer_write(channel, bufferlist, num_vecs, requestid);
+	return hv_ringbuffer_write(channel, bufferlist, num_vecs, requestid, trans_id);
+}
+EXPORT_SYMBOL(vmbus_sendpacket_getid);
+
+/**
+ * vmbus_sendpacket() - Send the specified buffer on the given channel
+ * @channel: Pointer to vmbus_channel structure
+ * @buffer: Pointer to the buffer you want to send the data from.
+ * @bufferlen: Maximum size of what the buffer holds.
+ * @requestid: Identifier of the request
+ * @type: Type of packet that is being sent e.g. negotiate, time
+ *	  packet etc.
+ * @flags: 0 or VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED
+ *
+ * Sends data in @buffer directly to Hyper-V via the vmbus.
+ * This will send the data unparsed to Hyper-V.
+ *
+ * Mainly used by Hyper-V drivers.
+ */
+int vmbus_sendpacket(struct vmbus_channel *channel, void *buffer,
+		     u32 bufferlen, u64 requestid,
+		     enum vmbus_packet_type type, u32 flags)
+{
+	return vmbus_sendpacket_getid(channel, buffer, bufferlen,
+				      requestid, NULL, type, flags);
 }
 EXPORT_SYMBOL(vmbus_sendpacket);
 
@@ -1122,7 +1148,7 @@ int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel,
 	bufferlist[2].iov_base = &aligned_data;
 	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
 
-	return hv_ringbuffer_write(channel, bufferlist, 3, requestid);
+	return hv_ringbuffer_write(channel, bufferlist, 3, requestid, NULL);
 }
 EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer);
 
@@ -1160,7 +1186,7 @@ int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel,
 	bufferlist[2].iov_base = &aligned_data;
 	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
 
-	return hv_ringbuffer_write(channel, bufferlist, 3, requestid);
+	return hv_ringbuffer_write(channel, bufferlist, 3, requestid, NULL);
 }
 EXPORT_SYMBOL_GPL(vmbus_sendpacket_mpb_desc);
 
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 3a1f007b678a0..64c0b9cbe183b 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -181,7 +181,7 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info);
 
 int hv_ringbuffer_write(struct vmbus_channel *channel,
 			const struct kvec *kv_list, u32 kv_count,
-			u64 requestid);
+			u64 requestid, u64 *trans_id);
 
 int hv_ringbuffer_read(struct vmbus_channel *channel,
 		       void *buffer, u32 buflen, u32 *buffer_actual_len,
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 71efacb909659..c8561c80c460c 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -283,7 +283,7 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info)
 /* Write to the ring buffer. */
 int hv_ringbuffer_write(struct vmbus_channel *channel,
 			const struct kvec *kv_list, u32 kv_count,
-			u64 requestid)
+			u64 requestid, u64 *trans_id)
 {
 	int i;
 	u32 bytes_avail_towrite;
@@ -354,6 +354,8 @@ int hv_ringbuffer_write(struct vmbus_channel *channel,
 	}
 	desc = hv_get_ring_buffer(outring_info) + old_write;
 	desc->trans_id = (rqst_id == VMBUS_NO_RQSTOR) ? requestid : rqst_id;
+	if (trans_id)
+		*trans_id = desc->trans_id;
 
 	/* Set previous packet start */
 	prev_indices = hv_get_ring_bufferindices(outring_info);
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index fe2e0179ed51e..a7cb596d893b1 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1161,6 +1161,13 @@ extern int vmbus_open(struct vmbus_channel *channel,
 
 extern void vmbus_close(struct vmbus_channel *channel);
 
+extern int vmbus_sendpacket_getid(struct vmbus_channel *channel,
+				  void *buffer,
+				  u32 bufferLen,
+				  u64 requestid,
+				  u64 *trans_id,
+				  enum vmbus_packet_type type,
+				  u32 flags);
 extern int vmbus_sendpacket(struct vmbus_channel *channel,
 				  void *buffer,
 				  u32 bufferLen,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH 4/4] PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg()
  2022-03-28 14:42 [RFC PATCH 0/4] PCI: hv: Miscellaneous changes Andrea Parri (Microsoft)
                   ` (2 preceding siblings ...)
  2022-03-28 14:42 ` [RFC PATCH 3/4] Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid() Andrea Parri (Microsoft)
@ 2022-03-28 14:42 ` Andrea Parri (Microsoft)
  2022-03-31 20:04   ` Michael Kelley (LINUX)
  3 siblings, 1 reply; 13+ messages in thread
From: Andrea Parri (Microsoft) @ 2022-03-28 14:42 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Michael Kelley, Wei Hu, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel, Andrea Parri (Microsoft)

Dexuan wrote:

  "[...]  when we disable AccelNet, the host PCI VSP driver sends a
   PCI_EJECT message first, and the channel callback may set
   hpdev->state to hv_pcichild_ejecting on a different CPU.  This can
   cause hv_compose_msi_msg() to exit from the loop and 'return', and
   the on-stack variable 'ctxt' is invalid.  Now, if the response
   message from the host arrives, the channel callback will try to
   access the invalid 'ctxt' variable, and this may cause a crash."

Schematically:

  Hyper-V sends PCI_EJECT msg
    hv_pci_onchannelcallback()
      state = hv_pcichild_ejecting
                                       hv_compose_msi_msg()
                                         alloc and init comp_pkt
                                         state == hv_pcichild_ejecting
  Hyper-V sends VM_PKT_COMP msg
    hv_pci_onchannelcallback()
      retrieve address of comp_pkt
                                         'free' comp_pkt and return
      comp_pkt->completion_func()

Dexuan also showed how the crash can be triggered after introducing
suitable delays in the driver code, thus validating the 'assumption'
that the host can still normally respond to the guest's compose_msi
request after the host has started to eject the PCI device.

Fix the synchronization by leveraging the requestor lock as follows:

  - Before 'return'-ing in hv_compose_msi_msg(), remove the ID (while
    holding the requestor lock) associated to the completion packet.

  - Retrieve the address *and call ->completion_func() within a same
    (requestor) critical section in hv_pci_onchannelcallback().

Fixes: de0aa7b2f97d3 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
Reported-by: Wei Hu <weh@microsoft.com>
Reported-by: Dexuan Cui <decui@microsoft.com>
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
---
 drivers/pci/controller/pci-hyperv.c | 83 ++++++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 9f963a46b8298..8876b318173f0 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1662,6 +1662,55 @@ static u32 hv_compose_msi_req_v3(
 	return sizeof(*int_pkt);
 }
 
+/* As in vmbus_request_addr() but without the requestor lock */
+static u64 __hv_pci_request_addr(struct vmbus_channel *channel, u64 trans_id)
+{
+	struct vmbus_requestor *rqstor = &channel->requestor;
+	u64 req_addr;
+
+	if (trans_id >= rqstor->size ||
+	    !test_bit(trans_id, rqstor->req_bitmap))
+		return VMBUS_RQST_ERROR;
+
+	req_addr = rqstor->req_arr[trans_id];
+	rqstor->req_arr[trans_id] = rqstor->next_request_id;
+	rqstor->next_request_id = trans_id;
+
+	bitmap_clear(rqstor->req_bitmap, trans_id, 1);
+
+	return req_addr;
+}
+
+/*
+ * Clear/remove @trans_id from @channel's requestor, provided the memory
+ * address stored at @trans_id equals @rqst_addr.
+ */
+static void hv_pci_request_addr_match(struct vmbus_channel *channel,
+				      u64 trans_id, u64 rqst_addr)
+{
+	struct vmbus_requestor *rqstor = &channel->requestor;
+	unsigned long flags;
+	u64 req_addr;
+
+	spin_lock_irqsave(&rqstor->req_lock, flags);
+
+	if (trans_id >= rqstor->size ||
+	    !test_bit(trans_id, rqstor->req_bitmap)) {
+		spin_unlock_irqrestore(&rqstor->req_lock, flags);
+		return;
+	}
+
+	req_addr = rqstor->req_arr[trans_id];
+	if (req_addr == rqst_addr) {
+		rqstor->req_arr[trans_id] = rqstor->next_request_id;
+		rqstor->next_request_id = trans_id;
+
+		bitmap_clear(rqstor->req_bitmap, trans_id, 1);
+	}
+
+	spin_unlock_irqrestore(&rqstor->req_lock, flags);
+}
+
 /**
  * hv_compose_msi_msg() - Supplies a valid MSI address/data
  * @data:	Everything about this MSI
@@ -1691,7 +1740,7 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 			struct pci_create_interrupt3 v3;
 		} int_pkts;
 	} __packed ctxt;
-
+	u64 trans_id;
 	u32 size;
 	int ret;
 
@@ -1753,10 +1802,10 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 		goto free_int_desc;
 	}
 
-	ret = vmbus_sendpacket(hpdev->hbus->hdev->channel, &ctxt.int_pkts,
-			       size, (unsigned long)&ctxt.pci_pkt,
-			       VM_PKT_DATA_INBAND,
-			       VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
+	ret = vmbus_sendpacket_getid(hpdev->hbus->hdev->channel, &ctxt.int_pkts,
+				     size, (unsigned long)&ctxt.pci_pkt,
+				     &trans_id, VM_PKT_DATA_INBAND,
+				     VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 	if (ret) {
 		dev_err(&hbus->hdev->device,
 			"Sending request for interrupt failed: 0x%x",
@@ -1835,6 +1884,16 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 
 enable_tasklet:
 	tasklet_enable(&channel->callback_event);
+	/*
+	 * The completion packet on the stack becomes invalid after 'return';
+	 * remove the ID from the VMbus requestor if the identifier is still
+	 * mapped to/associated with the packet.  (The identifier could have
+	 * been 're-used', i.e., already removed and (re-)mapped.)
+	 *
+	 * Cf. hv_pci_onchannelcallback().
+	 */
+	hv_pci_request_addr_match(channel, trans_id,
+				  (unsigned long)&ctxt.pci_pkt);
 free_int_desc:
 	kfree(int_desc);
 drop_reference:
@@ -2700,6 +2759,8 @@ static void hv_pci_onchannelcallback(void *context)
 	int ret;
 	struct hv_pcibus_device *hbus = context;
 	struct vmbus_channel *chan = hbus->hdev->channel;
+	struct vmbus_requestor *rqstor = &chan->requestor;
+	unsigned long flags;
 	u32 bytes_recvd;
 	u64 req_id, req_addr;
 	struct vmpacket_descriptor *desc;
@@ -2747,17 +2808,27 @@ static void hv_pci_onchannelcallback(void *context)
 		switch (desc->type) {
 		case VM_PKT_COMP:
 
-			req_addr = chan->request_addr_callback(chan, req_id);
+			spin_lock_irqsave(&rqstor->req_lock, flags);
+			req_addr = __hv_pci_request_addr(chan, req_id);
 			if (req_addr == VMBUS_RQST_ERROR) {
+				spin_unlock_irqrestore(&rqstor->req_lock, flags);
 				dev_warn_ratelimited(&hbus->hdev->device,
 						     "Invalid request ID\n");
 				break;
 			}
 			comp_packet = (struct pci_packet *)req_addr;
 			response = (struct pci_response *)buffer;
+			/*
+			 * Call ->completion_func() within the critical section to make
+			 * sure that the packet pointer is still valid during the call:
+			 * here 'valid' means that there's a task still waiting for the
+			 * completion, and that the packet data is still on the waiting
+			 * task's stack.  Cf. hv_compose_msi_msg().
+			 */
 			comp_packet->completion_func(comp_packet->compl_ctxt,
 						     response,
 						     bytes_recvd);
+			spin_unlock_irqrestore(&rqstor->req_lock, flags);
 			break;
 
 		case VM_PKT_DATA_INBAND:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* RE: [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages
  2022-03-28 14:42 ` [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages Andrea Parri (Microsoft)
@ 2022-03-31 18:00   ` Michael Kelley (LINUX)
  2022-04-07  2:54   ` Andrea Parri
  1 sibling, 0 replies; 13+ messages in thread
From: Michael Kelley (LINUX) @ 2022-03-31 18:00 UTC (permalink / raw)
  To: Andrea Parri (Microsoft),
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Wei Hu, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel

From: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Sent: Monday, March 28, 2022 7:43 AM
> 
> vmbus_requestor has included code for handling unsolicited messages
> since its introduction with commit e8b7db38449ac ("Drivers: hv: vmbus:
> Add vmbus_requestor data structure for VMBus hardening"); such code was
> motivated by the early use of vmbus_requestor from storvsc.  Since
> storvsc moved to a tag-based mechanism to generate/retrieve request IDs
> with commit bf5fd8cae3c8f ("scsi: storvsc: Use blk_mq_unique_tag() to
> generate requestIDs"), the special handling of unsolicited messages in
> vmbus_requestor is not useful and can be removed.
> 
> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
> ---
>  drivers/hv/channel.c | 13 +------------
>  1 file changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index dc5c35210c16a..a253eee3aeb1a 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -1243,11 +1243,7 @@ u64 vmbus_next_request_id(struct vmbus_channel
> *channel, u64 rqst_addr)
> 
>  	spin_unlock_irqrestore(&rqstor->req_lock, flags);
> 
> -	/*
> -	 * Cannot return an ID of 0, which is reserved for an unsolicited
> -	 * message from Hyper-V.
> -	 */
> -	return current_id + 1;
> +	return current_id;
>  }
>  EXPORT_SYMBOL_GPL(vmbus_next_request_id);
> 
> @@ -1268,15 +1264,8 @@ u64 vmbus_request_addr(struct vmbus_channel *channel,
> u64 trans_id)
>  	if (!channel->rqstor_size)
>  		return VMBUS_NO_RQSTOR;
> 
> -	/* Hyper-V can send an unsolicited message with ID of 0 */
> -	if (!trans_id)
> -		return trans_id;
> -
>  	spin_lock_irqsave(&rqstor->req_lock, flags);
> 
> -	/* Data corresponding to trans_id is stored at trans_id - 1 */
> -	trans_id--;
> -
>  	/* Invalid trans_id */
>  	if (trans_id >= rqstor->size || !test_bit(trans_id, rqstor->req_bitmap)) {
>  		spin_unlock_irqrestore(&rqstor->req_lock, flags);
> --
> 2.25.1

Reviewed-by: Michael Kelley <mikelley@microsoft.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [RFC PATCH 2/4] PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening
  2022-03-28 14:42 ` [RFC PATCH 2/4] PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening Andrea Parri (Microsoft)
@ 2022-03-31 18:12   ` Michael Kelley (LINUX)
  2022-04-01 16:00     ` Andrea Parri
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Kelley (LINUX) @ 2022-03-31 18:12 UTC (permalink / raw)
  To: Andrea Parri (Microsoft),
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Wei Hu, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel

From: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Sent: Monday, March 28, 2022 7:43 AM
> 
> Currently, pointers to guest memory are passed to Hyper-V as transaction
> IDs in hv_pci.  In the face of errors or malicious behavior in Hyper-V,
> hv_pci should not expose or trust the transaction IDs returned by
> Hyper-V to be valid guest memory addresses.  Instead, use small integers
> generated by vmbus_requestor as request (transaction) IDs.
> 
> Suggested-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
> ---
>  drivers/pci/controller/pci-hyperv.c | 30 +++++++++++++++++++++--------
>  1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index ae0bc2fee4ca8..9f963a46b8298 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -91,6 +91,9 @@ static enum pci_protocol_version_t pci_protocol_versions[] = {
>  /* space for 32bit serial number as string */
>  #define SLOT_NAME_SIZE 11
> 
> +/* Size of requestor for VMbus */
> +#define HV_PCI_RQSTOR_SIZE 64

Might include a comment about how this value was derived.  I *think*
it is an arbitrary value based on the assumption that having more than
one request outstanding is rare, and so 64 should be extremely generous
in ensuring that we don't ever run out.

> +
>  /*
>   * Message Types
>   */
> @@ -1407,7 +1410,7 @@ static void hv_int_desc_free(struct hv_pci_dev *hpdev,
>  	int_pkt->wslot.slot = hpdev->desc.win_slot.slot;
>  	int_pkt->int_desc = *int_desc;
>  	vmbus_sendpacket(hpdev->hbus->hdev->channel, int_pkt, sizeof(*int_pkt),
> -			 (unsigned long)&ctxt.pkt, VM_PKT_DATA_INBAND, 0);
> +			 0, VM_PKT_DATA_INBAND, 0);
>  	kfree(int_desc);
>  }
> 
> @@ -2649,7 +2652,7 @@ static void hv_eject_device_work(struct work_struct *work)
>  	ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE;
>  	ejct_pkt->wslot.slot = hpdev->desc.win_slot.slot;
>  	vmbus_sendpacket(hbus->hdev->channel, ejct_pkt,
> -			 sizeof(*ejct_pkt), (unsigned long)&ctxt.pkt,
> +			 sizeof(*ejct_pkt), 0,
>  			 VM_PKT_DATA_INBAND, 0);
> 
>  	/* For the get_pcichild() in hv_pci_eject_device() */
> @@ -2696,8 +2699,9 @@ static void hv_pci_onchannelcallback(void *context)
>  	const int packet_size = 0x100;
>  	int ret;
>  	struct hv_pcibus_device *hbus = context;
> +	struct vmbus_channel *chan = hbus->hdev->channel;

Having gotten the channel as a local variable, could also use the local as
the first argument to vmbus_recvpacket_raw().

>  	u32 bytes_recvd;
> -	u64 req_id;
> +	u64 req_id, req_addr;
>  	struct vmpacket_descriptor *desc;
>  	unsigned char *buffer;
>  	int bufferlen = packet_size;
> @@ -2743,11 +2747,13 @@ static void hv_pci_onchannelcallback(void *context)
>  		switch (desc->type) {
>  		case VM_PKT_COMP:
> 
> -			/*
> -			 * The host is trusted, and thus it's safe to interpret
> -			 * this transaction ID as a pointer.
> -			 */
> -			comp_packet = (struct pci_packet *)req_id;
> +			req_addr = chan->request_addr_callback(chan, req_id);
> +			if (req_addr == VMBUS_RQST_ERROR) {
> +				dev_warn_ratelimited(&hbus->hdev->device,
> +						     "Invalid request ID\n");

Could you include the req_id value in the error message that is output?  I
was recently debugging a problem in the storvsc driver where having that
value would have been handy.

> +				break;
> +			}
> +			comp_packet = (struct pci_packet *)req_addr;
>  			response = (struct pci_response *)buffer;
>  			comp_packet->completion_func(comp_packet->compl_ctxt,
>  						     response,
> @@ -3419,6 +3425,10 @@ static int hv_pci_probe(struct hv_device *hdev,
>  		goto free_dom;
>  	}
> 
> +	hdev->channel->next_request_id_callback = vmbus_next_request_id;
> +	hdev->channel->request_addr_callback = vmbus_request_addr;
> +	hdev->channel->rqstor_size = HV_PCI_RQSTOR_SIZE;
> +
>  	ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
>  			 hv_pci_onchannelcallback, hbus);
>  	if (ret)
> @@ -3749,6 +3759,10 @@ static int hv_pci_resume(struct hv_device *hdev)
> 
>  	hbus->state = hv_pcibus_init;
> 
> +	hdev->channel->next_request_id_callback = vmbus_next_request_id;
> +	hdev->channel->request_addr_callback = vmbus_request_addr;
> +	hdev->channel->rqstor_size = HV_PCI_RQSTOR_SIZE;
> +
>  	ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
>  			 hv_pci_onchannelcallback, hbus);
>  	if (ret)
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [RFC PATCH 3/4] Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid()
  2022-03-28 14:42 ` [RFC PATCH 3/4] Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid() Andrea Parri (Microsoft)
@ 2022-03-31 19:47   ` Michael Kelley (LINUX)
  2022-04-01 16:09     ` Andrea Parri
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Kelley (LINUX) @ 2022-03-31 19:47 UTC (permalink / raw)
  To: Andrea Parri (Microsoft),
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Wei Hu, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel

From: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Sent: Monday, March 28, 2022 7:43 AM
> 
> The function can be used to send a VMbus packet and retrieve the
> corresponding transaction ID.  It will be used by hv_pci.
> 
> No functional change.
> 
> Suggested-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
> ---
>  drivers/hv/channel.c      | 38 ++++++++++++++++++++++++++++++++------
>  drivers/hv/hyperv_vmbus.h |  2 +-
>  drivers/hv/ring_buffer.c  |  4 +++-
>  include/linux/hyperv.h    |  7 +++++++
>  4 files changed, 43 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index a253eee3aeb1a..3eaa41c7ce15f 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -1022,11 +1022,13 @@ void vmbus_close(struct vmbus_channel *channel)
>  EXPORT_SYMBOL_GPL(vmbus_close);
> 
>  /**
> - * vmbus_sendpacket() - Send the specified buffer on the given channel
> + * vmbus_sendpacket_getid() - Send the specified buffer on the given channel
>   * @channel: Pointer to vmbus_channel structure
>   * @buffer: Pointer to the buffer you want to send the data from.
>   * @bufferlen: Maximum size of what the buffer holds.
>   * @requestid: Identifier of the request
> + * @trans_id: Identifier of the transaction associated to this request, if
> + *            the send is successful; undefined, otherwise.
>   * @type: Type of packet that is being sent e.g. negotiate, time
>   *	  packet etc.
>   * @flags: 0 or VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED
> @@ -1036,8 +1038,8 @@ EXPORT_SYMBOL_GPL(vmbus_close);
>   *
>   * Mainly used by Hyper-V drivers.
>   */
> -int vmbus_sendpacket(struct vmbus_channel *channel, void *buffer,
> -			   u32 bufferlen, u64 requestid,
> +int vmbus_sendpacket_getid(struct vmbus_channel *channel, void *buffer,
> +			   u32 bufferlen, u64 requestid, u64 *trans_id,
>  			   enum vmbus_packet_type type, u32 flags)
>  {
>  	struct vmpacket_descriptor desc;
> @@ -1063,7 +1065,31 @@ int vmbus_sendpacket(struct vmbus_channel *channel,
> void *buffer,
>  	bufferlist[2].iov_base = &aligned_data;
>  	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
> 
> -	return hv_ringbuffer_write(channel, bufferlist, num_vecs, requestid);
> +	return hv_ringbuffer_write(channel, bufferlist, num_vecs, requestid, trans_id);
> +}
> +EXPORT_SYMBOL(vmbus_sendpacket_getid);
> +
> +/**
> + * vmbus_sendpacket() - Send the specified buffer on the given channel
> + * @channel: Pointer to vmbus_channel structure
> + * @buffer: Pointer to the buffer you want to send the data from.
> + * @bufferlen: Maximum size of what the buffer holds.
> + * @requestid: Identifier of the request
> + * @type: Type of packet that is being sent e.g. negotiate, time
> + *	  packet etc.
> + * @flags: 0 or VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED
> + *
> + * Sends data in @buffer directly to Hyper-V via the vmbus.
> + * This will send the data unparsed to Hyper-V.
> + *
> + * Mainly used by Hyper-V drivers.
> + */
> +int vmbus_sendpacket(struct vmbus_channel *channel, void *buffer,
> +		     u32 bufferlen, u64 requestid,
> +		     enum vmbus_packet_type type, u32 flags)
> +{
> +	return vmbus_sendpacket_getid(channel, buffer, bufferlen,
> +				      requestid, NULL, type, flags);
>  }
>  EXPORT_SYMBOL(vmbus_sendpacket);
> 
> @@ -1122,7 +1148,7 @@ int vmbus_sendpacket_pagebuffer(struct vmbus_channel
> *channel,
>  	bufferlist[2].iov_base = &aligned_data;
>  	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
> 
> -	return hv_ringbuffer_write(channel, bufferlist, 3, requestid);
> +	return hv_ringbuffer_write(channel, bufferlist, 3, requestid, NULL);
>  }
>  EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer);
> 
> @@ -1160,7 +1186,7 @@ int vmbus_sendpacket_mpb_desc(struct vmbus_channel
> *channel,
>  	bufferlist[2].iov_base = &aligned_data;
>  	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
> 
> -	return hv_ringbuffer_write(channel, bufferlist, 3, requestid);
> +	return hv_ringbuffer_write(channel, bufferlist, 3, requestid, NULL);
>  }
>  EXPORT_SYMBOL_GPL(vmbus_sendpacket_mpb_desc);
> 
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> index 3a1f007b678a0..64c0b9cbe183b 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -181,7 +181,7 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info
> *ring_info);
> 
>  int hv_ringbuffer_write(struct vmbus_channel *channel,
>  			const struct kvec *kv_list, u32 kv_count,
> -			u64 requestid);
> +			u64 requestid, u64 *trans_id);
> 
>  int hv_ringbuffer_read(struct vmbus_channel *channel,
>  		       void *buffer, u32 buflen, u32 *buffer_actual_len,
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index 71efacb909659..c8561c80c460c 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -283,7 +283,7 @@ void hv_ringbuffer_cleanup(struct hv_ring_buffer_info
> *ring_info)
>  /* Write to the ring buffer. */
>  int hv_ringbuffer_write(struct vmbus_channel *channel,
>  			const struct kvec *kv_list, u32 kv_count,
> -			u64 requestid)
> +			u64 requestid, u64 *trans_id)
>  {
>  	int i;
>  	u32 bytes_avail_towrite;
> @@ -354,6 +354,8 @@ int hv_ringbuffer_write(struct vmbus_channel *channel,
>  	}
>  	desc = hv_get_ring_buffer(outring_info) + old_write;
>  	desc->trans_id = (rqst_id == VMBUS_NO_RQSTOR) ? requestid : rqst_id;
> +	if (trans_id)
> +		*trans_id = desc->trans_id;

This line should *not* read the trans_id out of the ring buffer, since that
memory is shared with the Hyper-V host and subject to being maliciously
changed by the host.  Need to set *trans_id only from local variables, and
somehow ensure the compiler doesn't generate code that reads the value
from the ring buffer.  Maybe mark the desc->trans_id field as volatile, or cast
it as such?  Or does WRITE_ONCE() work when setting it?

Michael

> 
>  	/* Set previous packet start */
>  	prev_indices = hv_get_ring_bufferindices(outring_info);
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index fe2e0179ed51e..a7cb596d893b1 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1161,6 +1161,13 @@ extern int vmbus_open(struct vmbus_channel *channel,
> 
>  extern void vmbus_close(struct vmbus_channel *channel);
> 
> +extern int vmbus_sendpacket_getid(struct vmbus_channel *channel,
> +				  void *buffer,
> +				  u32 bufferLen,
> +				  u64 requestid,
> +				  u64 *trans_id,
> +				  enum vmbus_packet_type type,
> +				  u32 flags);
>  extern int vmbus_sendpacket(struct vmbus_channel *channel,
>  				  void *buffer,
>  				  u32 bufferLen,
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [RFC PATCH 4/4] PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg()
  2022-03-28 14:42 ` [RFC PATCH 4/4] PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg() Andrea Parri (Microsoft)
@ 2022-03-31 20:04   ` Michael Kelley (LINUX)
  2022-04-01 16:30     ` Andrea Parri
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Kelley (LINUX) @ 2022-03-31 20:04 UTC (permalink / raw)
  To: Andrea Parri (Microsoft),
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Wei Hu, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel

From: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Sent: Monday, March 28, 2022 7:43 AM
> 
> Dexuan wrote:
> 
>   "[...]  when we disable AccelNet, the host PCI VSP driver sends a
>    PCI_EJECT message first, and the channel callback may set
>    hpdev->state to hv_pcichild_ejecting on a different CPU.  This can
>    cause hv_compose_msi_msg() to exit from the loop and 'return', and
>    the on-stack variable 'ctxt' is invalid.  Now, if the response
>    message from the host arrives, the channel callback will try to
>    access the invalid 'ctxt' variable, and this may cause a crash."
> 
> Schematically:
> 
>   Hyper-V sends PCI_EJECT msg
>     hv_pci_onchannelcallback()
>       state = hv_pcichild_ejecting
>                                        hv_compose_msi_msg()
>                                          alloc and init comp_pkt
>                                          state == hv_pcichild_ejecting
>   Hyper-V sends VM_PKT_COMP msg
>     hv_pci_onchannelcallback()
>       retrieve address of comp_pkt
>                                          'free' comp_pkt and return
>       comp_pkt->completion_func()
> 
> Dexuan also showed how the crash can be triggered after introducing
> suitable delays in the driver code, thus validating the 'assumption'
> that the host can still normally respond to the guest's compose_msi
> request after the host has started to eject the PCI device.
> 
> Fix the synchronization by leveraging the requestor lock as follows:
> 
>   - Before 'return'-ing in hv_compose_msi_msg(), remove the ID (while
>     holding the requestor lock) associated to the completion packet.
> 
>   - Retrieve the address *and call ->completion_func() within a same
>     (requestor) critical section in hv_pci_onchannelcallback().
> 
> Fixes: de0aa7b2f97d3 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
> Reported-by: Wei Hu <weh@microsoft.com>
> Reported-by: Dexuan Cui <decui@microsoft.com>
> Suggested-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
> ---
>  drivers/pci/controller/pci-hyperv.c | 83 ++++++++++++++++++++++++++---
>  1 file changed, 77 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 9f963a46b8298..8876b318173f0 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1662,6 +1662,55 @@ static u32 hv_compose_msi_req_v3(
>  	return sizeof(*int_pkt);
>  }
> 
> +/* As in vmbus_request_addr() but without the requestor lock */
> +static u64 __hv_pci_request_addr(struct vmbus_channel *channel, u64 trans_id)
> +{
> +	struct vmbus_requestor *rqstor = &channel->requestor;
> +	u64 req_addr;
> +
> +	if (trans_id >= rqstor->size ||
> +	    !test_bit(trans_id, rqstor->req_bitmap))
> +		return VMBUS_RQST_ERROR;
> +
> +	req_addr = rqstor->req_arr[trans_id];
> +	rqstor->req_arr[trans_id] = rqstor->next_request_id;
> +	rqstor->next_request_id = trans_id;
> +
> +	bitmap_clear(rqstor->req_bitmap, trans_id, 1);
> +
> +	return req_addr;
> +}
> +
> +/*
> + * Clear/remove @trans_id from @channel's requestor, provided the memory
> + * address stored at @trans_id equals @rqst_addr.
> + */
> +static void hv_pci_request_addr_match(struct vmbus_channel *channel,
> +				      u64 trans_id, u64 rqst_addr)
> +{
> +	struct vmbus_requestor *rqstor = &channel->requestor;
> +	unsigned long flags;
> +	u64 req_addr;
> +
> +	spin_lock_irqsave(&rqstor->req_lock, flags);
> +
> +	if (trans_id >= rqstor->size ||
> +	    !test_bit(trans_id, rqstor->req_bitmap)) {
> +		spin_unlock_irqrestore(&rqstor->req_lock, flags);
> +		return;
> +	}
> +
> +	req_addr = rqstor->req_arr[trans_id];
> +	if (req_addr == rqst_addr) {
> +		rqstor->req_arr[trans_id] = rqstor->next_request_id;
> +		rqstor->next_request_id = trans_id;
> +
> +		bitmap_clear(rqstor->req_bitmap, trans_id, 1);
> +	}
> +
> +	spin_unlock_irqrestore(&rqstor->req_lock, flags);
> +}
> +

Even though these two new functions are used only in the Hyper-V
vPCI driver, it seems like they should go in drivers/hv/channel.c
along with vmbus_next_request_id() and vmbus_request_addr().
And maybe vmbus_request_addr(), which gets the spin lock,
could be implemented to call the new version above that
assumes the spin lock is already held.  Also, the new function
that requires matching on the rqst_addr might also be folded
into common code via an optional rqst_addr argument.

>  /**
>   * hv_compose_msi_msg() - Supplies a valid MSI address/data
>   * @data:	Everything about this MSI
> @@ -1691,7 +1740,7 @@ static void hv_compose_msi_msg(struct irq_data *data,
> struct msi_msg *msg)
>  			struct pci_create_interrupt3 v3;
>  		} int_pkts;
>  	} __packed ctxt;
> -
> +	u64 trans_id;
>  	u32 size;
>  	int ret;
> 
> @@ -1753,10 +1802,10 @@ static void hv_compose_msi_msg(struct irq_data *data,
> struct msi_msg *msg)
>  		goto free_int_desc;
>  	}
> 
> -	ret = vmbus_sendpacket(hpdev->hbus->hdev->channel, &ctxt.int_pkts,
> -			       size, (unsigned long)&ctxt.pci_pkt,
> -			       VM_PKT_DATA_INBAND,
> -			       VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
> +	ret = vmbus_sendpacket_getid(hpdev->hbus->hdev->channel, &ctxt.int_pkts,
> +				     size, (unsigned long)&ctxt.pci_pkt,
> +				     &trans_id, VM_PKT_DATA_INBAND,
> +
> VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
>  	if (ret) {
>  		dev_err(&hbus->hdev->device,
>  			"Sending request for interrupt failed: 0x%x",
> @@ -1835,6 +1884,16 @@ static void hv_compose_msi_msg(struct irq_data *data,
> struct msi_msg *msg)
> 
>  enable_tasklet:
>  	tasklet_enable(&channel->callback_event);
> +	/*
> +	 * The completion packet on the stack becomes invalid after 'return';
> +	 * remove the ID from the VMbus requestor if the identifier is still
> +	 * mapped to/associated with the packet.  (The identifier could have
> +	 * been 're-used', i.e., already removed and (re-)mapped.)
> +	 *
> +	 * Cf. hv_pci_onchannelcallback().
> +	 */
> +	hv_pci_request_addr_match(channel, trans_id,
> +				  (unsigned long)&ctxt.pci_pkt);
>  free_int_desc:
>  	kfree(int_desc);
>  drop_reference:
> @@ -2700,6 +2759,8 @@ static void hv_pci_onchannelcallback(void *context)
>  	int ret;
>  	struct hv_pcibus_device *hbus = context;
>  	struct vmbus_channel *chan = hbus->hdev->channel;
> +	struct vmbus_requestor *rqstor = &chan->requestor;
> +	unsigned long flags;
>  	u32 bytes_recvd;
>  	u64 req_id, req_addr;
>  	struct vmpacket_descriptor *desc;
> @@ -2747,17 +2808,27 @@ static void hv_pci_onchannelcallback(void *context)
>  		switch (desc->type) {
>  		case VM_PKT_COMP:
> 
> -			req_addr = chan->request_addr_callback(chan, req_id);
> +			spin_lock_irqsave(&rqstor->req_lock, flags);

Obtaining the lock (and releasing it below) might be better abstracted into
a lock_requestor() and unlock_requestor() pair that are implemented in
drivers/hv/channel.c along with the other related functions.

> +			req_addr = __hv_pci_request_addr(chan, req_id);
>  			if (req_addr == VMBUS_RQST_ERROR) {
> +				spin_unlock_irqrestore(&rqstor->req_lock, flags);
>  				dev_warn_ratelimited(&hbus->hdev->device,
>  						     "Invalid request ID\n");
>  				break;
>  			}
>  			comp_packet = (struct pci_packet *)req_addr;
>  			response = (struct pci_response *)buffer;
> +			/*
> +			 * Call ->completion_func() within the critical section to make
> +			 * sure that the packet pointer is still valid during the call:
> +			 * here 'valid' means that there's a task still waiting for the
> +			 * completion, and that the packet data is still on the waiting
> +			 * task's stack.  Cf. hv_compose_msi_msg().
> +			 */
>  			comp_packet->completion_func(comp_packet->compl_ctxt,
>  						     response,
>  						     bytes_recvd);
> +			spin_unlock_irqrestore(&rqstor->req_lock, flags);
>  			break;
> 
>  		case VM_PKT_DATA_INBAND:
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 2/4] PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening
  2022-03-31 18:12   ` Michael Kelley (LINUX)
@ 2022-04-01 16:00     ` Andrea Parri
  0 siblings, 0 replies; 13+ messages in thread
From: Andrea Parri @ 2022-04-01 16:00 UTC (permalink / raw)
  To: Michael Kelley (LINUX)
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Wei Hu, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczynski, Bjorn Helgaas, linux-hyperv, linux-pci,
	linux-kernel

> > @@ -91,6 +91,9 @@ static enum pci_protocol_version_t pci_protocol_versions[] = {
> >  /* space for 32bit serial number as string */
> >  #define SLOT_NAME_SIZE 11
> > 
> > +/* Size of requestor for VMbus */
> > +#define HV_PCI_RQSTOR_SIZE 64
> 
> Might include a comment about how this value was derived.  I *think*
> it is an arbitrary value based on the assumption that having more than
> one request outstanding is rare, and so 64 should be extremely generous
> in ensuring that we don't ever run out.

Right, I've added a comment to that effect.


> > @@ -2696,8 +2699,9 @@ static void hv_pci_onchannelcallback(void *context)
> >  	const int packet_size = 0x100;
> >  	int ret;
> >  	struct hv_pcibus_device *hbus = context;
> > +	struct vmbus_channel *chan = hbus->hdev->channel;
> 
> Having gotten the channel as a local variable, could also use the local as
> the first argument to vmbus_recvpacket_raw().

Applied.


> > @@ -2743,11 +2747,13 @@ static void hv_pci_onchannelcallback(void *context)
> >  		switch (desc->type) {
> >  		case VM_PKT_COMP:
> > 
> > -			/*
> > -			 * The host is trusted, and thus it's safe to interpret
> > -			 * this transaction ID as a pointer.
> > -			 */
> > -			comp_packet = (struct pci_packet *)req_id;
> > +			req_addr = chan->request_addr_callback(chan, req_id);
> > +			if (req_addr == VMBUS_RQST_ERROR) {
> > +				dev_warn_ratelimited(&hbus->hdev->device,
> > +						     "Invalid request ID\n");
> 
> Could you include the req_id value in the error message that is output?  I
> was recently debugging a problem in the storvsc driver where having that
> value would have been handy.

Sure.

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 3/4] Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid()
  2022-03-31 19:47   ` Michael Kelley (LINUX)
@ 2022-04-01 16:09     ` Andrea Parri
  0 siblings, 0 replies; 13+ messages in thread
From: Andrea Parri @ 2022-04-01 16:09 UTC (permalink / raw)
  To: Michael Kelley (LINUX)
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Wei Hu, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczynski, Bjorn Helgaas, linux-hyperv, linux-pci,
	linux-kernel

> > @@ -354,6 +354,8 @@ int hv_ringbuffer_write(struct vmbus_channel *channel,
> >  	}
> >  	desc = hv_get_ring_buffer(outring_info) + old_write;
> >  	desc->trans_id = (rqst_id == VMBUS_NO_RQSTOR) ? requestid : rqst_id;
> > +	if (trans_id)
> > +		*trans_id = desc->trans_id;
> 
> This line should *not* read the trans_id out of the ring buffer, since that
> memory is shared with the Hyper-V host and subject to being maliciously
> changed by the host.  Need to set *trans_id only from local variables, and
> somehow ensure the compiler doesn't generate code that reads the value
> from the ring buffer.  Maybe mark the desc->trans_id field as volatile, or cast
> it as such?  Or does WRITE_ONCE() work when setting it?

I'd stick to WRITE_ONCE() (with a comment).

Good catch!

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 4/4] PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg()
  2022-03-31 20:04   ` Michael Kelley (LINUX)
@ 2022-04-01 16:30     ` Andrea Parri
  0 siblings, 0 replies; 13+ messages in thread
From: Andrea Parri @ 2022-04-01 16:30 UTC (permalink / raw)
  To: Michael Kelley (LINUX)
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Wei Hu, Lorenzo Pieralisi, Rob Herring,
	Krzysztof Wilczynski, Bjorn Helgaas, linux-hyperv, linux-pci,
	linux-kernel

> > @@ -1662,6 +1662,55 @@ static u32 hv_compose_msi_req_v3(
> >  	return sizeof(*int_pkt);
> >  }
> > 
> > +/* As in vmbus_request_addr() but without the requestor lock */
> > +static u64 __hv_pci_request_addr(struct vmbus_channel *channel, u64 trans_id)
> > +{
> > +	struct vmbus_requestor *rqstor = &channel->requestor;
> > +	u64 req_addr;
> > +
> > +	if (trans_id >= rqstor->size ||
> > +	    !test_bit(trans_id, rqstor->req_bitmap))
> > +		return VMBUS_RQST_ERROR;
> > +
> > +	req_addr = rqstor->req_arr[trans_id];
> > +	rqstor->req_arr[trans_id] = rqstor->next_request_id;
> > +	rqstor->next_request_id = trans_id;
> > +
> > +	bitmap_clear(rqstor->req_bitmap, trans_id, 1);
> > +
> > +	return req_addr;
> > +}
> > +
> > +/*
> > + * Clear/remove @trans_id from @channel's requestor, provided the memory
> > + * address stored at @trans_id equals @rqst_addr.
> > + */
> > +static void hv_pci_request_addr_match(struct vmbus_channel *channel,
> > +				      u64 trans_id, u64 rqst_addr)
> > +{
> > +	struct vmbus_requestor *rqstor = &channel->requestor;
> > +	unsigned long flags;
> > +	u64 req_addr;
> > +
> > +	spin_lock_irqsave(&rqstor->req_lock, flags);
> > +
> > +	if (trans_id >= rqstor->size ||
> > +	    !test_bit(trans_id, rqstor->req_bitmap)) {
> > +		spin_unlock_irqrestore(&rqstor->req_lock, flags);
> > +		return;
> > +	}
> > +
> > +	req_addr = rqstor->req_arr[trans_id];
> > +	if (req_addr == rqst_addr) {
> > +		rqstor->req_arr[trans_id] = rqstor->next_request_id;
> > +		rqstor->next_request_id = trans_id;
> > +
> > +		bitmap_clear(rqstor->req_bitmap, trans_id, 1);
> > +	}
> > +
> > +	spin_unlock_irqrestore(&rqstor->req_lock, flags);
> > +}
> > +
> 
> Even though these two new functions are used only in the Hyper-V
> vPCI driver, it seems like they should go in drivers/hv/channel.c
> along with vmbus_next_request_id() and vmbus_request_addr().
> And maybe vmbus_request_addr(), which gets the spin lock,
> could be implemented to call the new version above that
> assumes the spin lock is already held.  Also, the new function
> that requires matching on the rqst_addr might also be folded
> into common code via an optional rqst_addr argument.

Noted.


> > @@ -2747,17 +2808,27 @@ static void hv_pci_onchannelcallback(void *context)
> >  		switch (desc->type) {
> >  		case VM_PKT_COMP:
> > 
> > -			req_addr = chan->request_addr_callback(chan, req_id);
> > +			spin_lock_irqsave(&rqstor->req_lock, flags);
> 
> Obtaining the lock (and releasing it below) might be better abstracted into
> a lock_requestor() and unlock_requestor() pair that are implemented in
> drivers/hv/channel.c along with the other related functions.

Seems like these helpers should go 'inline' in <linux/hyper.h>, let me
do it...

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages
  2022-03-28 14:42 ` [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages Andrea Parri (Microsoft)
  2022-03-31 18:00   ` Michael Kelley (LINUX)
@ 2022-04-07  2:54   ` Andrea Parri
  1 sibling, 0 replies; 13+ messages in thread
From: Andrea Parri @ 2022-04-07  2:54 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, Michael Kelley, Wei Hu, Lorenzo Pieralisi,
	Rob Herring, Krzysztof Wilczynski, Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel

On Mon, Mar 28, 2022 at 04:42:41PM +0200, Andrea Parri (Microsoft) wrote:
> vmbus_requestor has included code for handling unsolicited messages
> since its introduction with commit e8b7db38449ac ("Drivers: hv: vmbus:
> Add vmbus_requestor data structure for VMBus hardening"); such code was
> motivated by the early use of vmbus_requestor from storvsc.  Since
> storvsc moved to a tag-based mechanism to generate/retrieve request IDs
> with commit bf5fd8cae3c8f ("scsi: storvsc: Use blk_mq_unique_tag() to
> generate requestIDs"), the special handling of unsolicited messages in
> vmbus_requestor is not useful and can be removed.

As it turns out, this is not quite right.  In particular...


> @@ -1243,11 +1243,7 @@ u64 vmbus_next_request_id(struct vmbus_channel *channel, u64 rqst_addr)
>  
>  	spin_unlock_irqrestore(&rqstor->req_lock, flags);
>  
> -	/*
> -	 * Cannot return an ID of 0, which is reserved for an unsolicited
> -	 * message from Hyper-V.
> -	 */
> -	return current_id + 1;
> +	return current_id;

Hyper-V treats requests with ID of 0 as "non-transactional" and it does
not respond to such requests.


> @@ -1268,15 +1264,8 @@ u64 vmbus_request_addr(struct vmbus_channel *channel, u64 trans_id)
>  	if (!channel->rqstor_size)
>  		return VMBUS_NO_RQSTOR;
>  
> -	/* Hyper-V can send an unsolicited message with ID of 0 */
> -	if (!trans_id)
> -		return trans_id;

This remains problematic: I will elaborate and propose some solution in
the next iteration (to be sent shortly).

Thanks,
  Andrea

> -
>  	spin_lock_irqsave(&rqstor->req_lock, flags);
>  
> -	/* Data corresponding to trans_id is stored at trans_id - 1 */
> -	trans_id--;
> -
>  	/* Invalid trans_id */
>  	if (trans_id >= rqstor->size || !test_bit(trans_id, rqstor->req_bitmap)) {
>  		spin_unlock_irqrestore(&rqstor->req_lock, flags);

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-04-07  2:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-28 14:42 [RFC PATCH 0/4] PCI: hv: Miscellaneous changes Andrea Parri (Microsoft)
2022-03-28 14:42 ` [RFC PATCH 1/4] Drivers: hv: vmbus: Remove special code for unsolicited messages Andrea Parri (Microsoft)
2022-03-31 18:00   ` Michael Kelley (LINUX)
2022-04-07  2:54   ` Andrea Parri
2022-03-28 14:42 ` [RFC PATCH 2/4] PCI: hv: Use vmbus_requestor to generate transaction IDs for VMbus hardening Andrea Parri (Microsoft)
2022-03-31 18:12   ` Michael Kelley (LINUX)
2022-04-01 16:00     ` Andrea Parri
2022-03-28 14:42 ` [RFC PATCH 3/4] Drivers: hv: vmbus: Introduce vmbus_sendpacket_getid() Andrea Parri (Microsoft)
2022-03-31 19:47   ` Michael Kelley (LINUX)
2022-04-01 16:09     ` Andrea Parri
2022-03-28 14:42 ` [RFC PATCH 4/4] PCI: hv: Fix synchronization between channel callback and hv_compose_msi_msg() Andrea Parri (Microsoft)
2022-03-31 20:04   ` Michael Kelley (LINUX)
2022-04-01 16:30     ` Andrea Parri

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).