linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
@ 2020-04-27  1:30 Dexuan Cui
  0 siblings, 0 replies; 9+ messages in thread
From: Dexuan Cui @ 2020-04-27  1:30 UTC (permalink / raw)
  To: Wei Hu, KY Srinivasan, Haiyang Zhang, Stephen Hemminger, wei.liu,
	lorenzo.pieralisi, robh, bhelgaas, linux-hyperv, linux-pci,
	linux-kernel, Michael Kelley

> From: Wei Hu <weh@microsoft.com>
> Sent: Sunday, April 26, 2020 6:25 AM
> Subject: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt
> failed with invalid device state 0xC0000184.

The title looks too long. :-)
Ideally it should be shorter than 75 chars. I suggest the part 
"with invalid device state 0xC0000184. " should be removed.

> +#define STATUS_INVALID_DEVICE_STATE		0xC0000184
> +
> +static int hv_pci_bus_exit(struct hv_device *hdev, bool hibernating);

Should we change the name of the parameter 'hibernating'?

>  /**
>   * hv_pci_enter_d0() - Bring the "bus" into the D0 power state
>   * @hdev:	VMBus's tracking struct for this root PCI bus
> @@ -2748,8 +2752,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
>  	struct pci_bus_d0_entry *d0_entry;
>  	struct hv_pci_compl comp_pkt;
>  	struct pci_packet *pkt;
> +	bool retry = true;
>  	int ret;
> 
> +enter_d0_retry:
>  	/*
>  	 * Tell the host that the bus is ready to use, and moved into the
>  	 * powered-on state.  This includes telling the host which region
> @@ -2780,6 +2786,30 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
>  		dev_err(&hdev->device,
>  			"PCI Pass-through VSP failed D0 Entry with status %x\n",
>  			comp_pkt.completion_status);
> +
> +		/*
> +		 * In certain case (Kdump) the pci device of interest was
> +		 * not cleanly shut down and resource is still held on host
> +		 * side, the host could return STATUS_INVALID_DEVICE_STATE.
> +		 * We need to explicitly request host to release the resource
> +		 * and try to enter D0 again.
> +		 */
> +		if (comp_pkt.completion_status == STATUS_INVALID_DEVICE_STATE
> &&
> +		    retry) {

Maybe it's better to just retry for any error in comp_pkt.completion_status?
Just in case the host returns a slightly different error code in future.

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
  2020-04-28  6:41       ` Wei Hu
@ 2020-04-29 23:47         ` Michael Kelley
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Kelley @ 2020-04-29 23:47 UTC (permalink / raw)
  To: Wei Hu, KY Srinivasan, Haiyang Zhang, Stephen Hemminger, wei.liu,
	linux-hyperv, Dexuan Cui

From: Wei Hu <weh@microsoft.com>  Sent: Monday, April 27, 2020 11:42 PM
> > From: Michael Kelley <mikelley@microsoft.com>
> > > > The above is good.  But there's another error case that isn't
> > > > handled correctly.  If create_root_hv_pci_bus() fails,
> > > > hv_send_resources_released() should be called.
> > > >
> > > > Fixing these two error cases should probably go in a separate patch:
> > > > One patch for the retry problem in kdump, and a separate patch for
> > > > these error cases.
> > >
> > > [Wei Hu]
> > > hv_send_resources_released() is called in the added hv_pci_bus_exit().
> > > If hv_send_resources_allocated() fails, is it correct to call
> > hv_send_resources_released()?
> > > Allocation can fail in the middle. So I am not sure if calling
> > > hv_send_resources_released() won't cause any side effect.
> > >
> >
> > Ah yes, you are right.  But that brings up a separate problem.
> > If hv_pci_allocate_bridge_windows() or hv_send_resources_allocated() fails,
> > then the error path will call hv_pci_bus_exit(), which will call
> > hv_send_resources_released(), even if hv_send_resources_allocated() was
> > never called or didn't fully succeed.  As you noted,
> > hv_send_resources_allocated() does multiple steps, some of which might have
> > succeeded, and some of which didn't.  The mismatch might cause problems.
> > That means fixing this error handling is going to be a bit more complex.  Each
> > operation needs
> > to be individually undone, and only if it previously succeeded.   Could you
> > follow up
> > with the Hyper-V people to see if there's a problem with doing the
> > RESOURCES_RELEASED message on a slot where RESOURCES_ASSIGNED was
> > not done or wasn't successful?
> > If doing a spurious RESOURCES_RELEASED is harmless, that will make the error
> > cleanup easier.
> >
> 
> I think we can clean this up by doing like following, without the need of consulting
> Hyper-V team. The kdump retry also works with this by setting the wslot_res_allocated
> to 255 before calling hv_pci_bus_exit() to retry. Let me know what you think.

I like the overall approach.  I've reviewed your code below, and I think it works.
For the kdump situation, the assumption is that if we get the failure in enter_d0(),
then the PCI device must have been successfully set up in the main kernel.  The
occupied slots found by the kdump kernel must the same as the occupied slots
that were found (and setup) by the main kernel.  Therefore it is OK to iterate
through all 256 slots in hv_send_resources_released() on the retry path.  Make
sure to add a comment with that reasoning. :-)

Michael

> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 0a42c228b231..06f31f5777f9 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -480,6 +480,9 @@ struct hv_pcibus_device {
> 
>         struct workqueue_struct *wq;
> 
> +       /* Highest slot of child device allocated resources allocated*/
> +       int wslot_res_allocated;
> +
>         /* hypercall arg, must not cross page boundary */
>         struct hv_retarget_device_interrupt retarget_msi_interrupt_params;
> 
> @@ -2873,7 +2876,7 @@ static int hv_send_resources_allocated(struct hv_device *hdev)
>         struct hv_pci_dev *hpdev;
>         struct pci_packet *pkt;
>         size_t size_res;
> -       u32 wslot;
> +       int wslot;
>         int ret;
> 
>         size_res = (hbus->protocol_version < PCI_PROTOCOL_VERSION_1_2)
> @@ -2926,6 +2929,8 @@ static int hv_send_resources_allocated(struct hv_device *hdev)
>                                 comp_pkt.completion_status);
>                         break;
>                 }
> +
> +               hbus->wslot_res_allocated = wslot;
>         }
> 
>         kfree(pkt);
> @@ -2944,10 +2949,10 @@ static int hv_send_resources_released(struct hv_device *hdev)
>         struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
>         struct pci_child_message pkt;
>         struct hv_pci_dev *hpdev;
> -       u32 wslot;
> +       int wslot;
>         int ret;
> 
> -       for (wslot = 0; wslot < 256; wslot++) {
> +       for (wslot = hbus->wslot_res_allocated; wslot >= 0; wslot--) {
>                 hpdev = get_pcichild_wslot(hbus, wslot);
>                 if (!hpdev)
>                         continue;
> @@ -2962,8 +2967,12 @@ static int hv_send_resources_released(struct hv_device *hdev)
>                                        VM_PKT_DATA_INBAND, 0);
>                 if (ret)
>                         return ret;
> +
> +               hbus->wslot_res_allocated = wslot - 1;
>         }
> 
> +       hbus->wslot_res_allocated = -1;
> +
>         return 0;
>  }
> 
> @@ -3063,6 +3072,7 @@ static int hv_pci_probe(struct hv_device *hdev,
>         if (!hbus)
>                 return -ENOMEM;
>         hbus->state = hv_pcibus_init;
> +       hbus->wslot_res_allocated = -1;
> 
>         /*
>          * The PCI bus "domain" is what is called "segment" in ACPI and other
> @@ -3162,7 +3172,7 @@ static int hv_pci_probe(struct hv_device *hdev,
> 
>         ret = hv_pci_allocate_bridge_windows(hbus);
>         if (ret)
> -               goto free_irq_domain;
> +               goto exit_d0;
> 
>         ret = hv_send_resources_allocated(hdev);
>         if (ret)
> @@ -3180,6 +3190,8 @@ static int hv_pci_probe(struct hv_device *hdev,
> 
>  free_windows:
>         hv_pci_free_bridge_windows(hbus);
> +exit_d0:
> +       (void) hv_pci_bus_exit(hdev, true);
>  free_irq_domain:
>         irq_domain_remove(hbus->irq_domain);
>  free_fwnode:

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
  2020-04-27 15:36     ` Michael Kelley
@ 2020-04-28  6:41       ` Wei Hu
  2020-04-29 23:47         ` Michael Kelley
  0 siblings, 1 reply; 9+ messages in thread
From: Wei Hu @ 2020-04-28  6:41 UTC (permalink / raw)
  To: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	wei.liu, linux-hyperv, Dexuan Cui

> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> > > The above is good.  But there's another error case that isn't
> > > handled correctly.  If create_root_hv_pci_bus() fails,
> > > hv_send_resources_released() should be called.
> > >
> > > Fixing these two error cases should probably go in a separate patch:
> > > One patch for the retry problem in kdump, and a separate patch for
> > > these error cases.
> >
> > [Wei Hu]
> > hv_send_resources_released() is called in the added hv_pci_bus_exit().
> > If hv_send_resources_allocated() fails, is it correct to call
> hv_send_resources_released()?
> > Allocation can fail in the middle. So I am not sure if calling
> > hv_send_resources_released() won't cause any side effect.
> >
> 
> Ah yes, you are right.  But that brings up a separate problem.
> If hv_pci_allocate_bridge_windows() or hv_send_resources_allocated() fails,
> then the error path will call hv_pci_bus_exit(), which will call
> hv_send_resources_released(), even if hv_send_resources_allocated() was
> never called or didn't fully succeed.  As you noted,
> hv_send_resources_allocated() does multiple steps, some of which might have
> succeeded, and some of which didn't.  The mismatch might cause problems.
> That means fixing this error handling is going to be a bit more complex.  Each
> operation needs
> to be individually undone, and only if it previously succeeded.   Could you
> follow up
> with the Hyper-V people to see if there's a problem with doing the
> RESOURCES_RELEASED message on a slot where RESOURCES_ASSIGNED was
> not done or wasn't successful?
> If doing a spurious RESOURCES_RELEASED is harmless, that will make the error
> cleanup easier.
> 

I think we can clean this up by doing like following, without the need of consulting
Hyper-V team. The kdump retry also works with this by setting the wslot_res_allocated
to 255 before calling hv_pci_bus_exit() to retry. Let me know what you think.

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 0a42c228b231..06f31f5777f9 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -480,6 +480,9 @@ struct hv_pcibus_device {

        struct workqueue_struct *wq;

+       /* Highest slot of child device allocated resources allocated*/
+       int wslot_res_allocated;
+
        /* hypercall arg, must not cross page boundary */
        struct hv_retarget_device_interrupt retarget_msi_interrupt_params;

@@ -2873,7 +2876,7 @@ static int hv_send_resources_allocated(struct hv_device *hdev)
        struct hv_pci_dev *hpdev;
        struct pci_packet *pkt;
        size_t size_res;
-       u32 wslot;
+       int wslot;
        int ret;

        size_res = (hbus->protocol_version < PCI_PROTOCOL_VERSION_1_2)
@@ -2926,6 +2929,8 @@ static int hv_send_resources_allocated(struct hv_device *hdev)
                                comp_pkt.completion_status);
                        break;
                }
+
+               hbus->wslot_res_allocated = wslot;
        }

        kfree(pkt);
@@ -2944,10 +2949,10 @@ static int hv_send_resources_released(struct hv_device *hdev)
        struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
        struct pci_child_message pkt;
        struct hv_pci_dev *hpdev;
-       u32 wslot;
+       int wslot;
        int ret;

-       for (wslot = 0; wslot < 256; wslot++) {
+       for (wslot = hbus->wslot_res_allocated; wslot >= 0; wslot--) {
                hpdev = get_pcichild_wslot(hbus, wslot);
                if (!hpdev)
                        continue;
@@ -2962,8 +2967,12 @@ static int hv_send_resources_released(struct hv_device *hdev)
                                       VM_PKT_DATA_INBAND, 0);
                if (ret)
                        return ret;
+
+               hbus->wslot_res_allocated = wslot - 1;
        }

+       hbus->wslot_res_allocated = -1;
+
        return 0;
 }

@@ -3063,6 +3072,7 @@ static int hv_pci_probe(struct hv_device *hdev,
        if (!hbus)
                return -ENOMEM;
        hbus->state = hv_pcibus_init;
+       hbus->wslot_res_allocated = -1;

        /*
         * The PCI bus "domain" is what is called "segment" in ACPI and other
@@ -3162,7 +3172,7 @@ static int hv_pci_probe(struct hv_device *hdev,

        ret = hv_pci_allocate_bridge_windows(hbus);
        if (ret)
-               goto free_irq_domain;
+               goto exit_d0;

        ret = hv_send_resources_allocated(hdev);
        if (ret)
@@ -3180,6 +3190,8 @@ static int hv_pci_probe(struct hv_device *hdev,

 free_windows:
        hv_pci_free_bridge_windows(hbus);
+exit_d0:
+       (void) hv_pci_bus_exit(hdev, true);
 free_irq_domain:
        irq_domain_remove(hbus->irq_domain);
 free_fwnode:

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
  2020-04-26 13:24 Wei Hu
  2020-04-27  0:15 ` Michael Kelley
  2020-04-27 10:21 ` Wei Liu
@ 2020-04-27 22:51 ` Bjorn Helgaas
  2 siblings, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2020-04-27 22:51 UTC (permalink / raw)
  To: Wei Hu
  Cc: kys, haiyangz, sthemmin, wei.liu, lorenzo.pieralisi, robh,
	linux-hyperv, linux-pci, linux-kernel, decui, mikelley

Please pay attention to the changelog history and make yours match:

  $ git log --oneline drivers/pci/controller/pci-hyperv.c
  1cf106d93245 ("PCI: hv: Introduce hv_msi_entry")
  61bfd920abbf ("PCI: hv: Move retarget related structures into tlfs header")
  b00f80fcfaa0 ("PCI: hv: Move hypercall related definitions into tlfs header")
  067fb6c97e7e ("PCI: hv: Replace zero-length array with flexible-array member")
  999dd956d838 ("PCI: hv: Add support for protocol 1.3 and support PCI_BUS_RELATIONS2")
  f9ad0f361cf3 ("PCI: hv: Decouple the func definition in hv_dr_state from VSP message")
  42c3d41832ef ("PCI: hv: Add missing kfree(hbus) in hv_pci_probe()'s error handling path")
  e658a4fea8ef ("PCI: hv: Remove unnecessary type casting from kzalloc")

No period at end of subject.

On Sun, Apr 26, 2020 at 09:24:30PM +0800, Wei Hu wrote:
> In the case of kdump, the PCI device was not cleanly shut down
> before the kdump kernel starts. This causes the initial
> attempt of entering D0 state in the kdump kernel to fail with
> invalid device state 0xC0000184 returned from Hyper-V host.
> When this happens, explicitly call PCI bus exit and retry to
> enter the D0 state.
> 
> Also fix the PCI probe failure path to release the PCI device
> resource properly.

This sounds like two separate fixes that should be in separate
patches?

> Signed-off-by: Wei Hu <weh@microsoft.com>
> ---
>  drivers/pci/controller/pci-hyperv.c | 34 ++++++++++++++++++++++++++++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index e15022ff63e3..eb4781fa058d 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2736,6 +2736,10 @@ static void hv_free_config_window(struct hv_pcibus_device *hbus)
>  	vmbus_free_mmio(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH);
>  }
>  
> +#define STATUS_INVALID_DEVICE_STATE		0xC0000184
> +
> +static int hv_pci_bus_exit(struct hv_device *hdev, bool hibernating);
> +
>  /**
>   * hv_pci_enter_d0() - Bring the "bus" into the D0 power state
>   * @hdev:	VMBus's tracking struct for this root PCI bus
> @@ -2748,8 +2752,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
>  	struct pci_bus_d0_entry *d0_entry;
>  	struct hv_pci_compl comp_pkt;
>  	struct pci_packet *pkt;
> +	bool retry = true;
>  	int ret;
>  
> +enter_d0_retry:
>  	/*
>  	 * Tell the host that the bus is ready to use, and moved into the
>  	 * powered-on state.  This includes telling the host which region
> @@ -2780,6 +2786,30 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
>  		dev_err(&hdev->device,
>  			"PCI Pass-through VSP failed D0 Entry with status %x\n",
>  			comp_pkt.completion_status);
> +
> +		/*
> +		 * In certain case (Kdump) the pci device of interest was
> +		 * not cleanly shut down and resource is still held on host
> +		 * side, the host could return STATUS_INVALID_DEVICE_STATE.
> +		 * We need to explicitly request host to release the resource
> +		 * and try to enter D0 again.
> +		 */
> +		if (comp_pkt.completion_status == STATUS_INVALID_DEVICE_STATE &&
> +		    retry) {
> +			ret = hv_pci_bus_exit(hdev, true);
> +
> +			retry = false;
> +
> +			if (ret == 0) {
> +				kfree(pkt);
> +				goto enter_d0_retry;
> +			} else {
> +				dev_err(&hdev->device,
> +					"PCI bus D0 exit failed with ret %d\n",
> +					ret);
> +			}
> +		}
> +
>  		ret = -EPROTO;
>  		goto exit;
>  	}
> @@ -3136,7 +3166,7 @@ static int hv_pci_probe(struct hv_device *hdev,
>  
>  	ret = hv_pci_allocate_bridge_windows(hbus);
>  	if (ret)
> -		goto free_irq_domain;
> +		goto exit_d0;
>  
>  	ret = hv_send_resources_allocated(hdev);
>  	if (ret)
> @@ -3154,6 +3184,8 @@ static int hv_pci_probe(struct hv_device *hdev,
>  
>  free_windows:
>  	hv_pci_free_bridge_windows(hbus);
> +exit_d0:
> +	(void) hv_pci_bus_exit(hdev, true);
>  free_irq_domain:
>  	irq_domain_remove(hbus->irq_domain);
>  free_fwnode:
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
  2020-04-27  7:05   ` Wei Hu
@ 2020-04-27 15:36     ` Michael Kelley
  2020-04-28  6:41       ` Wei Hu
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Kelley @ 2020-04-27 15:36 UTC (permalink / raw)
  To: Wei Hu, KY Srinivasan, Haiyang Zhang, Stephen Hemminger, wei.liu,
	linux-hyperv, Dexuan Cui

From: Wei Hu <weh@microsoft.com> Sent: Monday, April 27, 2020 12:06 AM
> > -----Original Message-----
> > From: Michael Kelley <mikelley@microsoft.com>
> > > @@ -3136,7 +3166,7 @@ static int hv_pci_probe(struct hv_device *hdev,
> > >
> > >  	ret = hv_pci_allocate_bridge_windows(hbus);
> > >  	if (ret)
> > > -		goto free_irq_domain;
> > > +		goto exit_d0;
> > >
> > >  	ret = hv_send_resources_allocated(hdev);
> > >  	if (ret)
> >
> > The above is good.  But there's another error case that isn't handled
> > correctly.  If create_root_hv_pci_bus() fails, hv_send_resources_released()
> > should be called.
> >
> > Fixing these two error cases should probably go in a separate patch:  One
> > patch for the retry problem in kdump, and a separate patch for these error
> > cases.
> 
> [Wei Hu]
> hv_send_resources_released() is called in the added hv_pci_bus_exit().
> If hv_send_resources_allocated() fails, is it correct to call hv_send_resources_released()?
> Allocation can fail in the middle. So I am not sure if calling hv_send_resources_released()
> won't cause any side effect.
> 

Ah yes, you are right.  But that brings up a separate problem.
If hv_pci_allocate_bridge_windows() or hv_send_resources_allocated() fails, then the
error path will call hv_pci_bus_exit(), which will call hv_send_resources_released(),
even if hv_send_resources_allocated() was never called or didn't fully succeed.  As you
noted, hv_send_resources_allocated() does multiple steps, some of which might have
succeeded, and some of which didn't.  The mismatch might cause problems.  That means
fixing this error handling is going to be a bit more complex.  Each operation needs
to be individually undone, and only if it previously succeeded.   Could you follow up
with the Hyper-V people to see if there's a problem with doing the RESOURCES_RELEASED
message on a slot where RESOURCES_ASSIGNED was not done or wasn't successful?
If doing a spurious RESOURCES_RELEASED is harmless, that will make the error cleanup
easier.

Michael

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
  2020-04-26 13:24 Wei Hu
  2020-04-27  0:15 ` Michael Kelley
@ 2020-04-27 10:21 ` Wei Liu
  2020-04-27 22:51 ` Bjorn Helgaas
  2 siblings, 0 replies; 9+ messages in thread
From: Wei Liu @ 2020-04-27 10:21 UTC (permalink / raw)
  To: Wei Hu
  Cc: kys, haiyangz, sthemmin, wei.liu, lorenzo.pieralisi, robh,
	bhelgaas, linux-hyperv, linux-pci, linux-kernel, decui, mikelley

On Sun, Apr 26, 2020 at 09:24:30PM +0800, Wei Hu wrote:
> In the case of kdump, the PCI device was not cleanly shut down
> before the kdump kernel starts. This causes the initial
> attempt of entering D0 state in the kdump kernel to fail with
> invalid device state 0xC0000184 returned from Hyper-V host.
> When this happens, explicitly call PCI bus exit and retry to
> enter the D0 state.
> 
> Also fix the PCI probe failure path to release the PCI device
> resource properly.
> 
> Signed-off-by: Wei Hu <weh@microsoft.com>
> ---
>  drivers/pci/controller/pci-hyperv.c | 34 ++++++++++++++++++++++++++++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index e15022ff63e3..eb4781fa058d 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2736,6 +2736,10 @@ static void hv_free_config_window(struct hv_pcibus_device *hbus)
>  	vmbus_free_mmio(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH);
>  }
>  
> +#define STATUS_INVALID_DEVICE_STATE		0xC0000184
> +

Can you please move this along side STATUS_REVISION_MISMATCH?

Wei.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
  2020-04-27  0:15 ` Michael Kelley
@ 2020-04-27  7:05   ` Wei Hu
  2020-04-27 15:36     ` Michael Kelley
  0 siblings, 1 reply; 9+ messages in thread
From: Wei Hu @ 2020-04-27  7:05 UTC (permalink / raw)
  To: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	wei.liu, linux-hyperv, Dexuan Cui

> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> > @@ -3136,7 +3166,7 @@ static int hv_pci_probe(struct hv_device *hdev,
> >
> >  	ret = hv_pci_allocate_bridge_windows(hbus);
> >  	if (ret)
> > -		goto free_irq_domain;
> > +		goto exit_d0;
> >
> >  	ret = hv_send_resources_allocated(hdev);
> >  	if (ret)
> 
> The above is good.  But there's another error case that isn't handled
> correctly.  If create_root_hv_pci_bus() fails, hv_send_resources_released()
> should be called.
> 
> Fixing these two error cases should probably go in a separate patch:  One
> patch for the retry problem in kdump, and a separate patch for these error
> cases.

[Wei Hu] 
hv_send_resources_released() is called in the added hv_pci_bus_exit(). 
If hv_send_resources_allocated() fails, is it correct to call hv_send_resources_released()?
Allocation can fail in the middle. So I am not sure if calling hv_send_resources_released()
won't cause any side effect.

Agree it should to into a separate patch.

Wei

> 
> Michael
> 
> 
> > @@ -3154,6 +3184,8 @@ static int hv_pci_probe(struct hv_device *hdev,
> >
> >  free_windows:
> >  	hv_pci_free_bridge_windows(hbus);
> > +exit_d0:
> > +	(void) hv_pci_bus_exit(hdev, true);
> >  free_irq_domain:
> >  	irq_domain_remove(hbus->irq_domain);
> >  free_fwnode:
> > --
> > 2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
  2020-04-26 13:24 Wei Hu
@ 2020-04-27  0:15 ` Michael Kelley
  2020-04-27  7:05   ` Wei Hu
  2020-04-27 10:21 ` Wei Liu
  2020-04-27 22:51 ` Bjorn Helgaas
  2 siblings, 1 reply; 9+ messages in thread
From: Michael Kelley @ 2020-04-27  0:15 UTC (permalink / raw)
  To: Wei Hu, KY Srinivasan, Haiyang Zhang, Stephen Hemminger, wei.liu,
	lorenzo.pieralisi, robh, bhelgaas, linux-hyperv, linux-pci,
	linux-kernel, Dexuan Cui

From: Wei Hu <weh@microsoft.com> Sent: Sunday, April 26, 2020 6:25 AM
> 
> In the case of kdump, the PCI device was not cleanly shut down
> before the kdump kernel starts. This causes the initial
> attempt of entering D0 state in the kdump kernel to fail with
> invalid device state 0xC0000184 returned from Hyper-V host.
> When this happens, explicitly call PCI bus exit and retry to
> enter the D0 state.
> 
> Also fix the PCI probe failure path to release the PCI device
> resource properly.
> 
> Signed-off-by: Wei Hu <weh@microsoft.com>
> ---
>  drivers/pci/controller/pci-hyperv.c | 34 ++++++++++++++++++++++++++++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index e15022ff63e3..eb4781fa058d 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2736,6 +2736,10 @@ static void hv_free_config_window(struct hv_pcibus_device
> *hbus)
>  	vmbus_free_mmio(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH);
>  }
> 
> +#define STATUS_INVALID_DEVICE_STATE		0xC0000184
> +
> +static int hv_pci_bus_exit(struct hv_device *hdev, bool hibernating);
> +
>  /**
>   * hv_pci_enter_d0() - Bring the "bus" into the D0 power state
>   * @hdev:	VMBus's tracking struct for this root PCI bus
> @@ -2748,8 +2752,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
>  	struct pci_bus_d0_entry *d0_entry;
>  	struct hv_pci_compl comp_pkt;
>  	struct pci_packet *pkt;
> +	bool retry = true;
>  	int ret;
> 
> +enter_d0_retry:
>  	/*
>  	 * Tell the host that the bus is ready to use, and moved into the
>  	 * powered-on state.  This includes telling the host which region
> @@ -2780,6 +2786,30 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
>  		dev_err(&hdev->device,
>  			"PCI Pass-through VSP failed D0 Entry with status %x\n",
>  			comp_pkt.completion_status);

The above error message will be output even if a retry is attempted.
And if the retry succeeds, there's no further message, which could leave an
incorrect impression for someone looking at the boot logs.  If the error message
is output, there should be a follow-up message indicating the retry succeeded. 
Or don't output the above message at all -- output only a message that says
"doing a retry".  This could be accomplished by doing a separate test for
STATUS_INVALID_DEVICE_STATE that is not nested under checking the
completion_status for 0.  Here's a structure that also has the benefit of
reducing the indentation levels:

	if ((comp_pkt.completion_status == STATUS_INVALID_DEVICE_STATE) && retry) {
		retry = false;
		dev_err(&hdev->device, "Retrying D0 Entry\n");
		ret = hv_pci_bus_exit(hdev, true);
		if (ret == 0) {
			kfree(pkt);
			goto enter_do_retry;
		}
		dev_err(&hdev->device, "Retrying D0 Entry failed with %d\n", ret);
	} 

	if (comp_pkt.completion_status < 0) {
		dev_err(&hdev->device,
			 "PCI Pass-through VSP failed D0 Entry with status %x\n",
			 comp_pkt.completion_status);
		ret = -EPROTO;
		goto exit;
	}

> +
> +		/*
> +		 * In certain case (Kdump) the pci device of interest was
> +		 * not cleanly shut down and resource is still held on host
> +		 * side, the host could return STATUS_INVALID_DEVICE_STATE.
> +		 * We need to explicitly request host to release the resource
> +		 * and try to enter D0 again.
> +		 */
> +		if (comp_pkt.completion_status == STATUS_INVALID_DEVICE_STATE &&
> +		    retry) {
> +			ret = hv_pci_bus_exit(hdev, true);
> +
> +			retry = false;
> +
> +			if (ret == 0) {
> +				kfree(pkt);
> +				goto enter_d0_retry;
> +			} else {
> +				dev_err(&hdev->device,
> +					"PCI bus D0 exit failed with ret %d\n",
> +					ret);
> +			}
> +		}
> +
>  		ret = -EPROTO;
>  		goto exit;
>  	}
> @@ -3136,7 +3166,7 @@ static int hv_pci_probe(struct hv_device *hdev,
> 
>  	ret = hv_pci_allocate_bridge_windows(hbus);
>  	if (ret)
> -		goto free_irq_domain;
> +		goto exit_d0;
> 
>  	ret = hv_send_resources_allocated(hdev);
>  	if (ret)

The above is good.  But there's another error case that isn't handled
correctly.  If create_root_hv_pci_bus() fails, hv_send_resources_released()
should be called.

Fixing these two error cases should probably go in a separate patch:  One
patch for the retry problem in kdump, and a separate patch for these error
cases.

Michael


> @@ -3154,6 +3184,8 @@ static int hv_pci_probe(struct hv_device *hdev,
> 
>  free_windows:
>  	hv_pci_free_bridge_windows(hbus);
> +exit_d0:
> +	(void) hv_pci_bus_exit(hdev, true);
>  free_irq_domain:
>  	irq_domain_remove(hbus->irq_domain);
>  free_fwnode:
> --
> 2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184.
@ 2020-04-26 13:24 Wei Hu
  2020-04-27  0:15 ` Michael Kelley
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Wei Hu @ 2020-04-26 13:24 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, lorenzo.pieralisi, robh,
	bhelgaas, linux-hyperv, linux-pci, linux-kernel, decui, mikelley
  Cc: Wei Hu

In the case of kdump, the PCI device was not cleanly shut down
before the kdump kernel starts. This causes the initial
attempt of entering D0 state in the kdump kernel to fail with
invalid device state 0xC0000184 returned from Hyper-V host.
When this happens, explicitly call PCI bus exit and retry to
enter the D0 state.

Also fix the PCI probe failure path to release the PCI device
resource properly.

Signed-off-by: Wei Hu <weh@microsoft.com>
---
 drivers/pci/controller/pci-hyperv.c | 34 ++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index e15022ff63e3..eb4781fa058d 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2736,6 +2736,10 @@ static void hv_free_config_window(struct hv_pcibus_device *hbus)
 	vmbus_free_mmio(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH);
 }
 
+#define STATUS_INVALID_DEVICE_STATE		0xC0000184
+
+static int hv_pci_bus_exit(struct hv_device *hdev, bool hibernating);
+
 /**
  * hv_pci_enter_d0() - Bring the "bus" into the D0 power state
  * @hdev:	VMBus's tracking struct for this root PCI bus
@@ -2748,8 +2752,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
 	struct pci_bus_d0_entry *d0_entry;
 	struct hv_pci_compl comp_pkt;
 	struct pci_packet *pkt;
+	bool retry = true;
 	int ret;
 
+enter_d0_retry:
 	/*
 	 * Tell the host that the bus is ready to use, and moved into the
 	 * powered-on state.  This includes telling the host which region
@@ -2780,6 +2786,30 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
 		dev_err(&hdev->device,
 			"PCI Pass-through VSP failed D0 Entry with status %x\n",
 			comp_pkt.completion_status);
+
+		/*
+		 * In certain case (Kdump) the pci device of interest was
+		 * not cleanly shut down and resource is still held on host
+		 * side, the host could return STATUS_INVALID_DEVICE_STATE.
+		 * We need to explicitly request host to release the resource
+		 * and try to enter D0 again.
+		 */
+		if (comp_pkt.completion_status == STATUS_INVALID_DEVICE_STATE &&
+		    retry) {
+			ret = hv_pci_bus_exit(hdev, true);
+
+			retry = false;
+
+			if (ret == 0) {
+				kfree(pkt);
+				goto enter_d0_retry;
+			} else {
+				dev_err(&hdev->device,
+					"PCI bus D0 exit failed with ret %d\n",
+					ret);
+			}
+		}
+
 		ret = -EPROTO;
 		goto exit;
 	}
@@ -3136,7 +3166,7 @@ static int hv_pci_probe(struct hv_device *hdev,
 
 	ret = hv_pci_allocate_bridge_windows(hbus);
 	if (ret)
-		goto free_irq_domain;
+		goto exit_d0;
 
 	ret = hv_send_resources_allocated(hdev);
 	if (ret)
@@ -3154,6 +3184,8 @@ static int hv_pci_probe(struct hv_device *hdev,
 
 free_windows:
 	hv_pci_free_bridge_windows(hbus);
+exit_d0:
+	(void) hv_pci_bus_exit(hdev, true);
 free_irq_domain:
 	irq_domain_remove(hbus->irq_domain);
 free_fwnode:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-04-29 23:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-27  1:30 [PATCH] PCI: pci-hyperv: Retry PCI bus D0 entry when the first attempt failed with invalid device state 0xC0000184 Dexuan Cui
  -- strict thread matches above, loose matches on Subject: below --
2020-04-26 13:24 Wei Hu
2020-04-27  0:15 ` Michael Kelley
2020-04-27  7:05   ` Wei Hu
2020-04-27 15:36     ` Michael Kelley
2020-04-28  6:41       ` Wei Hu
2020-04-29 23:47         ` Michael Kelley
2020-04-27 10:21 ` Wei Liu
2020-04-27 22:51 ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).