intel-wired-lan.lists.osuosl.org archive mirror
* [Intel-wired-lan] [PATCH v2] ice: Fix enabling SR-IOV with Xen
@ 2024-04-29 12:49 Ross Lagerwall
  2024-04-29 13:04 ` Paul Menzel
  2024-05-06 21:27 ` Tony Nguyen
  0 siblings, 2 replies; 5+ messages in thread
From: Ross Lagerwall @ 2024-04-29 12:49 UTC (permalink / raw)
  To: netdev; +Cc: Tony Nguyen, Javi Merino, intel-wired-lan, Ross Lagerwall

When the PCI functions are created, Xen is informed about them and
caches the number of MSI-X entries each function has.  However, the
number of MSI-X entries is not set until after the hardware has been
configured and the VFs have been started. This prevents
PCI-passthrough from working because Xen rejects mapping MSI-X
interrupts to domains because it thinks the MSI-X interrupts don't
exist.

Fix this by moving the call to pci_enable_sriov() later so that the
number of MSI-X entries is set correctly in hardware by the time Xen
reads it.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Javi Merino <javi.merino@kernel.org>
---

In v2:
* Fix cleanup if pci_enable_sriov() fails.

 drivers/net/ethernet/intel/ice/ice_sriov.c | 23 +++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
index a958fcf3e6be..bc97493046a8 100644
--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
+++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
@@ -864,6 +864,8 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
 	int total_vectors = pf->hw.func_caps.common_cap.num_msix_vectors;
 	struct device *dev = ice_pf_to_dev(pf);
 	struct ice_hw *hw = &pf->hw;
+	struct ice_vf *vf;
+	unsigned int bkt;
 	int ret;
 
 	pf->sriov_irq_bm = bitmap_zalloc(total_vectors, GFP_KERNEL);
@@ -877,24 +879,20 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
 	set_bit(ICE_OICR_INTR_DIS, pf->state);
 	ice_flush(hw);
 
-	ret = pci_enable_sriov(pf->pdev, num_vfs);
-	if (ret)
-		goto err_unroll_intr;
-
 	mutex_lock(&pf->vfs.table_lock);
 
 	ret = ice_set_per_vf_res(pf, num_vfs);
 	if (ret) {
 		dev_err(dev, "Not enough resources for %d VFs, err %d. Try with fewer number of VFs\n",
 			num_vfs, ret);
-		goto err_unroll_sriov;
+		goto err_unroll_intr;
 	}
 
 	ret = ice_create_vf_entries(pf, num_vfs);
 	if (ret) {
 		dev_err(dev, "Failed to allocate VF entries for %d VFs\n",
 			num_vfs);
-		goto err_unroll_sriov;
+		goto err_unroll_intr;
 	}
 
 	ice_eswitch_reserve_cp_queues(pf, num_vfs);
@@ -905,6 +903,10 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
 		goto err_unroll_vf_entries;
 	}
 
+	ret = pci_enable_sriov(pf->pdev, num_vfs);
+	if (ret)
+		goto err_unroll_start_vfs;
+
 	clear_bit(ICE_VF_DIS, pf->state);
 
 	/* rearm global interrupts */
@@ -915,12 +917,15 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
 
 	return 0;
 
+err_unroll_start_vfs:
+	ice_for_each_vf(pf, bkt, vf) {
+		ice_dis_vf_mappings(vf);
+		ice_vf_vsi_release(vf);
+	}
 err_unroll_vf_entries:
 	ice_free_vf_entries(pf);
-err_unroll_sriov:
-	mutex_unlock(&pf->vfs.table_lock);
-	pci_disable_sriov(pf->pdev);
 err_unroll_intr:
+	mutex_unlock(&pf->vfs.table_lock);
 	/* rearm interrupts here */
 	ice_irq_dynamic_ena(hw, NULL, NULL);
 	clear_bit(ICE_OICR_INTR_DIS, pf->state);
-- 
2.43.0



* Re: [Intel-wired-lan] [PATCH v2] ice: Fix enabling SR-IOV with Xen
  2024-04-29 12:49 [Intel-wired-lan] [PATCH v2] ice: Fix enabling SR-IOV with Xen Ross Lagerwall
@ 2024-04-29 13:04 ` Paul Menzel
  2024-04-30  9:03   ` Ross Lagerwall
  2024-05-06 21:27 ` Tony Nguyen
  1 sibling, 1 reply; 5+ messages in thread
From: Paul Menzel @ 2024-04-29 13:04 UTC (permalink / raw)
  To: Ross Lagerwall; +Cc: netdev, Javi Merino, Tony Nguyen, intel-wired-lan

Dear Ross,


Thank you for your patch.

Am 29.04.24 um 14:49 schrieb Ross Lagerwall:
> When the PCI functions are created, Xen is informed about them and
> caches the number of MSI-X entries each function has.  However, the
> number of MSI-X entries is not set until after the hardware has been
> configured and the VFs have been started. This prevents
> PCI-passthrough from working because Xen rejects mapping MSI-X
> interrupts to domains because it thinks the MSI-X interrupts don't
> exist.

Thank you for this great problem description. Is there any log message 
shown that you could paste, so people can find this commit when searching 
for the log message?

Do you have a minimal test case, so the maintainers can reproduce this 
to test the fix?

> Fix this by moving the call to pci_enable_sriov() later so that the
> number of MSI-X entries is set correctly in hardware by the time Xen
> reads it.

It’d be great if you could be more specific on “later”, and why this is 
the correct place.

> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Javi Merino <javi.merino@kernel.org>
> ---
> 
> In v2:
> * Fix cleanup if pci_enable_sriov() fails.
> 
>   drivers/net/ethernet/intel/ice/ice_sriov.c | 23 +++++++++++++---------
>   1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
> index a958fcf3e6be..bc97493046a8 100644
> --- a/drivers/net/ethernet/intel/ice/ice_sriov.c
> +++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
> @@ -864,6 +864,8 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
>   	int total_vectors = pf->hw.func_caps.common_cap.num_msix_vectors;
>   	struct device *dev = ice_pf_to_dev(pf);
>   	struct ice_hw *hw = &pf->hw;
> +	struct ice_vf *vf;
> +	unsigned int bkt;
>   	int ret;
>   
>   	pf->sriov_irq_bm = bitmap_zalloc(total_vectors, GFP_KERNEL);
> @@ -877,24 +879,20 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
>   	set_bit(ICE_OICR_INTR_DIS, pf->state);
>   	ice_flush(hw);
>   
> -	ret = pci_enable_sriov(pf->pdev, num_vfs);
> -	if (ret)
> -		goto err_unroll_intr;
> -
>   	mutex_lock(&pf->vfs.table_lock);
>   
>   	ret = ice_set_per_vf_res(pf, num_vfs);
>   	if (ret) {
>   		dev_err(dev, "Not enough resources for %d VFs, err %d. Try with fewer number of VFs\n",
>   			num_vfs, ret);
> -		goto err_unroll_sriov;
> +		goto err_unroll_intr;
>   	}
>   
>   	ret = ice_create_vf_entries(pf, num_vfs);
>   	if (ret) {
>   		dev_err(dev, "Failed to allocate VF entries for %d VFs\n",
>   			num_vfs);
> -		goto err_unroll_sriov;
> +		goto err_unroll_intr;
>   	}
>   
>   	ice_eswitch_reserve_cp_queues(pf, num_vfs);
> @@ -905,6 +903,10 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
>   		goto err_unroll_vf_entries;
>   	}
>   
> +	ret = pci_enable_sriov(pf->pdev, num_vfs);
> +	if (ret)
> +		goto err_unroll_start_vfs;
> +
>   	clear_bit(ICE_VF_DIS, pf->state);
>   
>   	/* rearm global interrupts */
> @@ -915,12 +917,15 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
>   
>   	return 0;
>   
> +err_unroll_start_vfs:
> +	ice_for_each_vf(pf, bkt, vf) {
> +		ice_dis_vf_mappings(vf);
> +		ice_vf_vsi_release(vf);
> +	}

Why wasn’t this needed with `pci_enable_sriov()` done earlier?

>   err_unroll_vf_entries:
>   	ice_free_vf_entries(pf);
> -err_unroll_sriov:
> -	mutex_unlock(&pf->vfs.table_lock);
> -	pci_disable_sriov(pf->pdev);
>   err_unroll_intr:
> +	mutex_unlock(&pf->vfs.table_lock);
>   	/* rearm interrupts here */
>   	ice_irq_dynamic_ena(hw, NULL, NULL);
>   	clear_bit(ICE_OICR_INTR_DIS, pf->state);


Kind regards,

Paul


* Re: [Intel-wired-lan] [PATCH v2] ice: Fix enabling SR-IOV with Xen
  2024-04-29 13:04 ` Paul Menzel
@ 2024-04-30  9:03   ` Ross Lagerwall
  2024-05-08  9:23     ` Sergey Temerkhanov
  0 siblings, 1 reply; 5+ messages in thread
From: Ross Lagerwall @ 2024-04-30  9:03 UTC (permalink / raw)
  To: Paul Menzel; +Cc: netdev, Javi Merino, Tony Nguyen, intel-wired-lan

On Mon, Apr 29, 2024 at 2:04 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>
> Dear Ross,
>
>
> Thank you for your patch.
>
> Am 29.04.24 um 14:49 schrieb Ross Lagerwall:
> > When the PCI functions are created, Xen is informed about them and
> > caches the number of MSI-X entries each function has.  However, the
> > number of MSI-X entries is not set until after the hardware has been
> > configured and the VFs have been started. This prevents
> > PCI-passthrough from working because Xen rejects mapping MSI-X
> > interrupts to domains because it thinks the MSI-X interrupts don't
> > exist.
>
> Thank you for this great problem description. Is there any log message
> shown, you could paste, so people can find this commit when searching
> for the log message?

When this issue occurs, QEMU repeatedly reports:

msi_msix_setup: Error: Mapping of MSI-X (err: 22, vec: 0, entry 0x1)

>
> Do you have a minimal test case, so the maintainers can reproduce this
> to test the fix?

Testing this requires setting up Xen which I wouldn't expect anyone to
do unless they already have an environment set up.

In any case, a "minimal" test would be something like:

1. Set up Xen with dom0 and another VM running Linux.
2. Pass through a VF to the VM. See that QEMU reports the above message
   and the VF is not usable within the VM.
3. Rebuild the dom0 kernel with the attached patch.
4. Pass through a VF to the VM. See that the VF is usable within the
   VM.
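
A possible concrete version of those steps on a Xen host is sketched
below. The PCI addresses and domain name are placeholders, and the
commands assume a standard xl toolstack; treat this as an outline, not
a verified recipe:

```shell
# On dom0: create one VF on the ice PF (BDF is a placeholder)
echo 1 > /sys/bus/pci/devices/0000:3b:00.0/sriov_numvfs

# Make the resulting VF assignable and pass it through to guest "vm1"
xl pci-assignable-add 0000:3b:01.0
xl pci-attach vm1 0000:3b:01.0

# Inside the guest, check that the VF driver binds and MSI-X interrupts
# work. Without the fix, QEMU in dom0 instead logs the msi_msix_setup
# error quoted above and the VF is unusable.
```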

>
> > Fix this by moving the call to pci_enable_sriov() later so that the
> > number of MSI-X entries is set correctly in hardware by the time Xen
> > reads it.
>
> It’d be great if you could be more specific on “later”, and why this is
> the correct place.

"later" in this case means after ice_start_vfs() since it is at that
point that the hardware sets the number of MSI-X entries.
I expect that a maintainer or someone with more knowledge of the
hardware could explain why the hardware only sets the number of MSI-X
entries at this point.

>
> > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> > Signed-off-by: Javi Merino <javi.merino@kernel.org>
> > ---
> >
> > In v2:
> > * Fix cleanup if pci_enable_sriov() fails.
> >
> >   drivers/net/ethernet/intel/ice/ice_sriov.c | 23 +++++++++++++---------
> >   1 file changed, 14 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
> > index a958fcf3e6be..bc97493046a8 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_sriov.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
> > @@ -864,6 +864,8 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
> >       int total_vectors = pf->hw.func_caps.common_cap.num_msix_vectors;
> >       struct device *dev = ice_pf_to_dev(pf);
> >       struct ice_hw *hw = &pf->hw;
> > +     struct ice_vf *vf;
> > +     unsigned int bkt;
> >       int ret;
> >
> >       pf->sriov_irq_bm = bitmap_zalloc(total_vectors, GFP_KERNEL);
> > @@ -877,24 +879,20 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
> >       set_bit(ICE_OICR_INTR_DIS, pf->state);
> >       ice_flush(hw);
> >
> > -     ret = pci_enable_sriov(pf->pdev, num_vfs);
> > -     if (ret)
> > -             goto err_unroll_intr;
> > -
> >       mutex_lock(&pf->vfs.table_lock);
> >
> >       ret = ice_set_per_vf_res(pf, num_vfs);
> >       if (ret) {
> >               dev_err(dev, "Not enough resources for %d VFs, err %d. Try with fewer number of VFs\n",
> >                       num_vfs, ret);
> > -             goto err_unroll_sriov;
> > +             goto err_unroll_intr;
> >       }
> >
> >       ret = ice_create_vf_entries(pf, num_vfs);
> >       if (ret) {
> >               dev_err(dev, "Failed to allocate VF entries for %d VFs\n",
> >                       num_vfs);
> > -             goto err_unroll_sriov;
> > +             goto err_unroll_intr;
> >       }
> >
> >       ice_eswitch_reserve_cp_queues(pf, num_vfs);
> > @@ -905,6 +903,10 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
> >               goto err_unroll_vf_entries;
> >       }
> >
> > +     ret = pci_enable_sriov(pf->pdev, num_vfs);
> > +     if (ret)
> > +             goto err_unroll_start_vfs;
> > +
> >       clear_bit(ICE_VF_DIS, pf->state);
> >
> >       /* rearm global interrupts */
> > @@ -915,12 +917,15 @@ static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
> >
> >       return 0;
> >
> > +err_unroll_start_vfs:
> > +     ice_for_each_vf(pf, bkt, vf) {
> > +             ice_dis_vf_mappings(vf);
> > +             ice_vf_vsi_release(vf);
> > +     }
>
> Why wasn’t this needed with `pci_enable_sriov()` done earlier?

Previously, ice_start_vfs() was the last call in this function that
could fail. That is no longer the case, so when pci_enable_sriov()
fails, the error path needs to undo what was done in ice_start_vfs().

Thanks,
Ross


* Re: [Intel-wired-lan] [PATCH v2] ice: Fix enabling SR-IOV with Xen
  2024-04-29 12:49 [Intel-wired-lan] [PATCH v2] ice: Fix enabling SR-IOV with Xen Ross Lagerwall
  2024-04-29 13:04 ` Paul Menzel
@ 2024-05-06 21:27 ` Tony Nguyen
  1 sibling, 0 replies; 5+ messages in thread
From: Tony Nguyen @ 2024-05-06 21:27 UTC (permalink / raw)
  To: Ross Lagerwall, netdev; +Cc: Javi Merino, intel-wired-lan



On 4/29/2024 5:49 AM, Ross Lagerwall wrote:
> When the PCI functions are created, Xen is informed about them and
> caches the number of MSI-X entries each function has.  However, the
> number of MSI-X entries is not set until after the hardware has been
> configured and the VFs have been started. This prevents
> PCI-passthrough from working because Xen rejects mapping MSI-X
> interrupts to domains because it thinks the MSI-X interrupts don't
> exist.
> 
> Fix this by moving the call to pci_enable_sriov() later so that the
> number of MSI-X entries is set correctly in hardware by the time Xen
> reads it.
> 

Sorry, I missed this on initial review, but bug fixes should have a 
Fixes: tag

I assume you are targeting this for net; if so, can you mark it as 
'PATCH iwl-net'?

> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Javi Merino <javi.merino@kernel.org>

Also, sender should be the last sign-off.

Thanks,
Tony


* Re: [Intel-wired-lan] [PATCH v2] ice: Fix enabling SR-IOV with Xen
  2024-04-30  9:03   ` Ross Lagerwall
@ 2024-05-08  9:23     ` Sergey Temerkhanov
  0 siblings, 0 replies; 5+ messages in thread
From: Sergey Temerkhanov @ 2024-05-08  9:23 UTC (permalink / raw)
  To: ross.lagerwall
  Cc: intel-wired-lan, pmenzel, javi.merino, anthony.l.nguyen, netdev

This patch makes sense: VFs need to be assigned resources (especially the MSI-X
interrupt count) before they are made visible, so that the kernel PCI enumeration
code reads the correct MSI-X interrupt count for the VFs.

Regards,
Sergey

