* Re: [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio
  2018-09-27 16:37 ` [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio Xiaolin Zhang
@ 2018-09-27  7:16   ` Chris Wilson
  2018-09-27 11:03   ` Joonas Lahtinen
  1 sibling, 0 replies; 28+ messages in thread
From: Chris Wilson @ 2018-09-27  7:16 UTC (permalink / raw)
  To: Xiaolin Zhang, intel-gfx, intel-gvt-dev
  Cc: joonas.lahtinen, zhiyuan.lv, fei.jiang, zhenyu.z.wang, hang.yuan

Quoting Xiaolin Zhang (2018-09-27 17:37:46)
> This int-type module parameter controls which levels of the pvmmio
> feature are used for MMIO emulation in GVT.
> 
> The parameter defaults to zero, meaning no pvmmio feature is enabled.
> 
> Its permission is 0400, which means the user can only set its
> value on the command line; this prevents modification at runtime,
> which would break the pvmmio internal logic.
> 
> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h    |  3 +++
>  drivers/gpu/drm/i915/i915_params.c |  4 ++++
>  drivers/gpu/drm/i915/i915_params.h |  3 ++-
>  drivers/gpu/drm/i915/i915_pvinfo.h | 16 +++++++++++++++-
>  drivers/gpu/drm/i915/i915_vgpu.c   | 12 +++++++++++-
>  5 files changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8624b4b..174d618 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3871,4 +3871,7 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>                 return I915_HWS_CSB_WRITE_INDEX;
>  }
>  
> +#define PVMMIO_LEVEL_ENABLE(dev_priv, level)   \
> +       (intel_vgpu_active(dev_priv) && i915_modparams.enable_pvmmio & level)
> +
>  #endif
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index 295e981..5ee236ec 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -174,6 +174,10 @@ struct i915_params i915_modparams __read_mostly = {
>  i915_param_named(enable_gvt, bool, 0400,
>         "Enable support for Intel GVT-g graphics virtualization host support(default:false)");
>  
> +i915_param_named(enable_pvmmio, int, 0400,
> +       "Enable pv mmio feature, default TRUE. This parameter "
> +       "could only set from host, guest value is set through vgt_if");

We were placing gvt specific module parameters under gvt/
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio
  2018-09-27 16:37 ` [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio Xiaolin Zhang
@ 2018-09-27  7:17   ` Chris Wilson
  2018-09-28  7:31     ` Zhang, Xiaolin
  2018-10-09  2:31   ` Zhenyu Wang
  1 sibling, 1 reply; 28+ messages in thread
From: Chris Wilson @ 2018-09-27  7:17 UTC (permalink / raw)
  To: Xiaolin Zhang, intel-gfx, intel-gvt-dev
  Cc: joonas.lahtinen, zhiyuan.lv, fei.jiang, zhenyu.z.wang, hang.yuan

Quoting Xiaolin Zhang (2018-09-27 17:37:47)
> To enable the pvmmio feature, we need to prepare one 4K shared page
> which will be accessed by both the guest and the backend i915 driver.
> 
> The guest i915 driver allocates one page of memory and passes its guest
> physical address to the backend i915 driver through a PVINFO register, so
> that the backend i915 driver can access this shared page without
> hypervisor trap cost, exchanging shared data via the hypervisor's
> read_gpa functionality.
> 
> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
>  drivers/gpu/drm/i915/i915_drv.h    |  3 +++
>  drivers/gpu/drm/i915/i915_pvinfo.h | 25 ++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_vgpu.c   | 17 +++++++++++++++++
>  4 files changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index ade9bca..815a4dd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -885,6 +885,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
>                 return -ENODEV;
>  
>         spin_lock_init(&dev_priv->irq_lock);
> +       spin_lock_init(&dev_priv->shared_page_lock);

No. Do we not have a more appropriate struct for this to find a home in?
No one will ever guess that 'shared_page_lock' refers to vgpu.
-Chris

* Re: [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization
  2018-09-27 16:37 ` [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization Xiaolin Zhang
@ 2018-09-27  7:19   ` Chris Wilson
  2018-09-28  5:31     ` Zhang, Xiaolin
  2018-09-27 11:13   ` Joonas Lahtinen
  1 sibling, 1 reply; 28+ messages in thread
From: Chris Wilson @ 2018-09-27  7:19 UTC (permalink / raw)
  To: Xiaolin Zhang, intel-gfx, intel-gvt-dev
  Cc: joonas.lahtinen, zhiyuan.lv, fei.jiang, zhenyu.z.wang, hang.yuan

Quoting Xiaolin Zhang (2018-09-27 17:37:48)
> This is a performance optimization that reduces the number of MMIO traps
> from 4 to 1 during ELSP port writes (context submission).
> 
> During context submission, the elsp_data[4] values are cached in
> the shared page; only the last port write (elsp_data[0]) is trapped
> to GVT for the real context submission.
> 
> Use PVMMIO_ELSP_SUBMIT to control this level of pvmmio optimization.

> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_params.h |  2 +-
>  drivers/gpu/drm/i915/intel_lrc.c   | 37 ++++++++++++++++++++++++++++++++++++-
>  2 files changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
> index 0f6a38b..6c81c87 100644
> --- a/drivers/gpu/drm/i915/i915_params.h
> +++ b/drivers/gpu/drm/i915/i915_params.h
> @@ -69,7 +69,7 @@
>         param(bool, enable_dp_mst, true) \
>         param(bool, enable_dpcd_backlight, false) \
>         param(bool, enable_gvt, false) \
> -       param(int, enable_pvmmio, 0)
> +       param(int, enable_pvmmio, PVMMIO_ELSP_SUBMIT)
>  
>  #define MEMBER(T, member, ...) T member;
>  struct i915_params {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 4b28225..cdc713c 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -451,7 +451,12 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  {
>         struct intel_engine_execlists *execlists = &engine->execlists;
>         struct execlist_port *port = execlists->port;
> +       u32 __iomem *elsp =
> +               engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
> +       u32 *elsp_data;
>         unsigned int n;
> +       u32 descs[4];
> +       int i = 0;
>  
>         /*
>          * We can skip acquiring intel_runtime_pm_get() here as it was taken
> @@ -494,8 +499,24 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>                         GEM_BUG_ON(!n);
>                         desc = 0;
>                 }
> +               if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
> +                       GEM_BUG_ON(i >= 4);
> +                       descs[i] = upper_32_bits(desc);
> +                       descs[i + 1] = lower_32_bits(desc);
> +                       i += 2;
> +               } else {
> +                       write_desc(execlists, desc, n);
> +               }
> +       }
>  
> -               write_desc(execlists, desc, n);
> +       if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
> +               spin_lock(&engine->i915->shared_page_lock);
> +               elsp_data = engine->i915->shared_page->elsp_data;
> +               *elsp_data = descs[0];
> +               *(elsp_data + 1) = descs[1];
> +               *(elsp_data + 2) = descs[2];
> +               writel(descs[3], elsp);
> +               spin_unlock(&engine->i915->shared_page_lock);
>         }
>  
>         /* we need to manually load the submit queue */
> @@ -538,11 +559,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>         struct intel_engine_execlists *execlists = &engine->execlists;
>         struct intel_context *ce =
>                 to_intel_context(engine->i915->preempt_context, engine);
> +       u32 __iomem *elsp =
> +               engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
> +       u32 *elsp_data;
>         unsigned int n;
>  
>         GEM_BUG_ON(execlists->preempt_complete_status !=
>                    upper_32_bits(ce->lrc_desc));
>  
> +       if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
> +               spin_lock(&engine->i915->shared_page_lock);
> +               elsp_data = engine->i915->shared_page->elsp_data;
> +               *elsp_data = 0;
> +               *(elsp_data + 1) = 0;
> +               *(elsp_data + 2) = upper_32_bits(ce->lrc_desc);
> +               writel(lower_32_bits(ce->lrc_desc), elsp);
> +               spin_unlock(&engine->i915->shared_page_lock);
> +               return;
> +       }
> +

Really?
-Chris

* ✗ Fi.CI.CHECKPATCH: warning for i915 pvmmio to improve GVTg performance
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
@ 2018-09-27  7:20 ` Patchwork
  2018-09-27  7:24 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2018-09-27  7:20 UTC (permalink / raw)
  To: Xiaolin Zhang; +Cc: intel-gfx

== Series Details ==

Series: i915 pvmmio to improve GVTg performance
URL   : https://patchwork.freedesktop.org/series/50257/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
b7754818ea52 drm/i915/gvt: add module parameter enable_pvmmio
-:25: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'level' may be better as '(level)' to avoid precedence issues
#25: FILE: drivers/gpu/drm/i915/i915_drv.h:3876:
+#define PVMMIO_LEVEL_ENABLE(dev_priv, level)	\
+	(intel_vgpu_active(dev_priv) && i915_modparams.enable_pvmmio & level)

-:38: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#38: FILE: drivers/gpu/drm/i915/i915_params.c:178:
+i915_param_named(enable_pvmmio, int, 0400,
+	"Enable pv mmio feature, default TRUE. This parameter "

-:113: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#113: FILE: drivers/gpu/drm/i915/i915_vgpu.c:85:
+	__raw_i915_write32(dev_priv, vgtif_reg(enable_pvmmio),
+			i915_modparams.enable_pvmmio);

-:115: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#115: FILE: drivers/gpu/drm/i915/i915_vgpu.c:87:
+	i915_modparams.enable_pvmmio = __raw_i915_read32(dev_priv,
+			vgtif_reg(enable_pvmmio));

-:120: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#120: FILE: drivers/gpu/drm/i915/i915_vgpu.c:91:
+	DRM_INFO("Virtual GPU for Intel GVT-g detected with pvmmio 0x%x\n",
+		i915_modparams.enable_pvmmio);

total: 0 errors, 0 warnings, 5 checks, 80 lines checked
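
The MACRO_ARG_PRECEDENCE check reported for this commit can be illustrated with a small standalone sketch. Only the shape of the macro follows the patch; `LEVEL_A`, `LEVEL_B`, `vgpu_active()` and the local `enable_pvmmio` variable are stand-ins assumed for illustration, not the driver's definitions.

```c
#include <stdbool.h>

/* Stand-in level bits, assumed for illustration only. */
#define LEVEL_A 0x1
#define LEVEL_B 0x2

static int enable_pvmmio;                      /* stand-in for the modparam */
static bool vgpu_active(void) { return true; } /* stand-in for intel_vgpu_active() */

/* Shape used in the patch: 'level' is pasted in without parentheses. */
#define LEVEL_ENABLE_UNSAFE(level) \
	(vgpu_active() && enable_pvmmio & level)

/* Shape checkpatch hints at: the argument and the bit test are parenthesised. */
#define LEVEL_ENABLE_SAFE(level) \
	(vgpu_active() && (enable_pvmmio & (level)))

/* Both helpers ask "is level A or level B enabled?" for a given mask. */
static bool unsafe_any(int mask)
{
	enable_pvmmio = mask;
	/* Expands to: vgpu_active() && ((enable_pvmmio & 0x1) | 0x2) */
	return LEVEL_ENABLE_UNSAFE(LEVEL_A | LEVEL_B);
}

static bool safe_any(int mask)
{
	enable_pvmmio = mask;
	/* Expands to: vgpu_active() && (enable_pvmmio & (0x1 | 0x2)) */
	return LEVEL_ENABLE_SAFE(LEVEL_A | LEVEL_B);
}
```

With a mask of 0, the unparenthesised form still evaluates as true for a compound argument, because `&` binds tighter than `|`. Callers in the series pass a single flag, so the issue is latent rather than an observed bug.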
164efba08cbe drm/i915/gvt: get ready of memory for pvmmio
-:63: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#63: FILE: drivers/gpu/drm/i915/i915_drv.h:1630:
+	spinlock_t shared_page_lock;

-:137: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#137: FILE: drivers/gpu/drm/i915/i915_vgpu.c:103:
+		__raw_i915_write32(dev_priv, vgtif_reg(shared_page_gpa.lo),
+				lower_32_bits(shared_page_gpa));

-:139: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#139: FILE: drivers/gpu/drm/i915/i915_vgpu.c:105:
+		__raw_i915_write32(dev_priv, vgtif_reg(shared_page_gpa.hi),
+				upper_32_bits(shared_page_gpa));

total: 0 errors, 0 warnings, 3 checks, 105 lines checked
ec9f5f379874 drm/i915/gvt: context submission pvmmio optimization
f520b54318b0 drm/i915/gvt: master irq pvmmio optimization
4300a40028a1 drm/i915/gvt: ppgtt update pvmmio optimization
82d8fd860bb7 drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register
-:24: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#24: FILE: drivers/gpu/drm/i915/gvt/handlers.c:1249:
+			DRM_INFO("vgpu id=%d pvmmio=0x%x\n",
+				vgpu->id, VGPU_PVMMIO(vgpu));

total: 0 errors, 0 warnings, 1 checks, 46 lines checked
460761b5ad2c drm/i915/gvt: GVTg read_shared_page implementation
-:32: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#32: FILE: drivers/gpu/drm/i915/gvt/gvt.h:695:
+void intel_gvt_read_shared_page(struct intel_vgpu *vgpu,
+		unsigned int offset, void *buf, unsigned long len);

-:47: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#47: FILE: drivers/gpu/drm/i915/gvt/handlers.c:1257:
+		vgpu->shared_page_gpa = vgpu_vreg64_t(vgpu,
+				vgtif_reg(shared_page_gpa));

-:65: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#65: FILE: drivers/gpu/drm/i915/gvt/vgpu.c:598:
+void intel_gvt_read_shared_page(struct intel_vgpu *vgpu,
+		unsigned int offset, void *buf, unsigned long len)

total: 0 errors, 0 warnings, 3 checks, 44 lines checked
a7fbe585aa4e drm/i915/gvt: GVTg support context submission pvmmio optimization
8399e7a93462 drm/i915/gvt: GVTg support master irq pvmmio optimization
97b377e08a05 drm/i915/gvt: GVTg support ppgtt pvmmio optimization
-:74: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#74: FILE: drivers/gpu/drm/i915/gvt/gtt.c:1810:
+		gvt_vgpu_err("fail to create ppgtt for pdp 0x%llx\n",
+				px_dma(&mm->ppgtt->pml4));

-:104: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#104: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2819:
+int intel_vgpu_g2v_pv_ppgtt_alloc_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[])

-:129: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#129: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2844:
+int intel_vgpu_g2v_pv_ppgtt_clear_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[])

-:157: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'end' - possible side-effects?
#157: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2872:
+#define pml4_addr_end(addr, end)					\
+({	unsigned long __boundary = \
+			((addr) + GEN8_PML4E_SIZE) & GEN8_PML4E_SIZE_MASK; \
+	(__boundary < (end)) ? __boundary : (end);		\
+})

-:163: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'end' - possible side-effects?
#163: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2878:
+#define pdp_addr_end(addr, end)						\
+({	unsigned long __boundary = \
+			((addr) + GEN8_PDPE_SIZE) & GEN8_PDPE_SIZE_MASK; \
+	(__boundary < (end)) ? __boundary : (end);		\
+})

-:169: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'end' - possible side-effects?
#169: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2884:
+#define pd_addr_end(addr, end)						\
+({	unsigned long __boundary = \
+			((addr) + GEN8_PDE_SIZE) & GEN8_PDE_SIZE_MASK;	\
+	(__boundary < (end)) ? __boundary : (end);		\
+})

-:182: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#182: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2897:
+static int walk_pt_range(struct intel_vgpu *vgpu, u64 pt,
+				u64 start, u64 end, struct ppgtt_walk *walk)

-:195: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#195: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2910:
+	ret = intel_gvt_hypervisor_read_gpa(vgpu,
+		(pt & PAGE_MASK) + (start_index << info->gtt_entry_size_shift),

-:216: CHECK:LINE_SPACING: Please don't use multiple blank lines
#216: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2931:
+
+

-:218: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#218: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2933:
+static int walk_pd_range(struct intel_vgpu *vgpu, u64 pd,
+				u64 start, u64 end, struct ppgtt_walk *walk)

-:230: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#230: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2945:
+		ret = intel_gvt_hypervisor_read_gpa(vgpu,
+			(pd & PAGE_MASK) + (index <<

-:243: CHECK:LINE_SPACING: Please don't use multiple blank lines
#243: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2958:
+
+

-:245: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#245: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2960:
+static int walk_pdp_range(struct intel_vgpu *vgpu, u64 pdp,
+				  u64 start, u64 end, struct ppgtt_walk *walk)

-:257: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#257: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2972:
+		ret = intel_gvt_hypervisor_read_gpa(vgpu,
+			(pdp & PAGE_MASK) + (index <<

-:269: CHECK:LINE_SPACING: Please don't use multiple blank lines
#269: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2984:
+
+

-:271: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#271: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2986:
+static int walk_pml4_range(struct intel_vgpu *vgpu, u64 pml4,
+				u64 start, u64 end, struct ppgtt_walk *walk)

-:282: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#282: FILE: drivers/gpu/drm/i915/gvt/gtt.c:2997:
+		ret = intel_gvt_hypervisor_read_gpa(vgpu,
+			(pml4 & PAGE_MASK) + (index <<

-:295: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#295: FILE: drivers/gpu/drm/i915/gvt/gtt.c:3010:
+int intel_vgpu_g2v_pv_ppgtt_insert_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[])

-:329: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#329: FILE: drivers/gpu/drm/i915/gvt/gtt.c:3044:
+	walk.mfns = kmalloc_array(num_pages,
+			sizeof(unsigned long), GFP_KERNEL);

-:389: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#389: FILE: drivers/gpu/drm/i915/gvt/gtt.h:277:
+int intel_vgpu_g2v_pv_ppgtt_alloc_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[]);

-:392: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#392: FILE: drivers/gpu/drm/i915/gvt/gtt.h:280:
+int intel_vgpu_g2v_pv_ppgtt_clear_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[]);

-:395: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#395: FILE: drivers/gpu/drm/i915/gvt/gtt.h:283:
+int intel_vgpu_g2v_pv_ppgtt_insert_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[]);

total: 0 errors, 0 warnings, 22 checks, 395 lines checked
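
The MACRO_ARG_REUSE checks above concern arguments that appear more than once in a macro expansion. A minimal sketch of the effect, assuming GNU C statement expressions (as the flagged macros already use) and made-up `BLOCK_SIZE`/`BLOCK_MASK` values in place of the GEN8 constants:

```c
/* Hypothetical size and mask for illustration; not the GEN8 values. */
#define BLOCK_SIZE 0x1000UL
#define BLOCK_MASK (~(BLOCK_SIZE - 1))

/* Shape flagged by checkpatch: 'end' appears twice in the expansion. */
#define addr_end_reuse(addr, end)					\
({	unsigned long __boundary =					\
			((addr) + BLOCK_SIZE) & BLOCK_MASK;		\
	(__boundary < (end)) ? __boundary : (end);			\
})

/* One possible alternative: evaluate 'end' once into a local. */
#define addr_end_once(addr, end)					\
({	unsigned long __end = (end);					\
	unsigned long __boundary =					\
			((addr) + BLOCK_SIZE) & BLOCK_MASK;		\
	(__boundary < __end) ? __boundary : __end;			\
})

static int calls;

/* An argument with a side effect: counts how often it is evaluated. */
static unsigned long next_end(void)
{
	calls++;
	return 0x800UL;
}

static int evals_with_reuse(void)
{
	calls = 0;
	(void)addr_end_reuse(0x100UL, next_end());
	return calls;
}

static int evals_once(void)
{
	calls = 0;
	(void)addr_end_once(0x100UL, next_end());
	return calls;
}
```

Whether this matters for the series depends on the call sites: if `end` is always a plain variable, the reuse is harmless, which is why checkpatch reports it as a CHECK rather than an error.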

* ✗ Fi.CI.SPARSE: warning for i915 pvmmio to improve GVTg performance
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
  2018-09-27  7:20 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2018-09-27  7:24 ` Patchwork
  2018-09-27  7:43 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2018-09-27  7:24 UTC (permalink / raw)
  To: Xiaolin Zhang; +Cc: intel-gfx

== Series Details ==

Series: i915 pvmmio to improve GVTg performance
URL   : https://patchwork.freedesktop.org/series/50257/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Commit: drm/i915/gvt: add module parameter enable_pvmmio
Okay!

Commit: drm/i915/gvt: get ready of memory for pvmmio
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3720:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3723:16: warning: expression using sizeof(void)

Commit: drm/i915/gvt: context submission pvmmio optimization
Okay!

Commit: drm/i915/gvt: master irq pvmmio optimization
Okay!

Commit: drm/i915/gvt: ppgtt update pvmmio optimization
-O:drivers/gpu/drm/i915/i915_gem_gtt.c:1491:9: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/i915_gem_gtt.c:1491:9: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_gem_gtt.c:1517:9: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_gem_gtt.c:1517:9: warning: expression using sizeof(void)

Commit: drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register
Okay!

Commit: drm/i915/gvt: GVTg read_shared_page implementation
Okay!

Commit: drm/i915/gvt: GVTg support context submission pvmmio optimization
Okay!

Commit: drm/i915/gvt: GVTg support master irq pvmmio optimization
Okay!

Commit: drm/i915/gvt: GVTg support ppgtt pvmmio optimization
+./include/linux/slab.h:631:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/slab.h:631:13: warning: call with no type!

* ✓ Fi.CI.BAT: success for i915 pvmmio to improve GVTg performance
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
  2018-09-27  7:20 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
  2018-09-27  7:24 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2018-09-27  7:43 ` Patchwork
  2018-09-27 10:25 ` ✓ Fi.CI.IGT: " Patchwork
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2018-09-27  7:43 UTC (permalink / raw)
  To: Xiaolin Zhang; +Cc: intel-gfx

== Series Details ==

Series: i915 pvmmio to improve GVTg performance
URL   : https://patchwork.freedesktop.org/series/50257/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4890 -> Patchwork_10291 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/50257/revisions/1/mbox/

== Known issues ==

  Here are the changes found in Patchwork_10291 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@debugfs_test@read_all_entries:
      fi-icl-u:           NOTRUN -> DMESG-WARN (fdo#107724, fdo#108070)

    igt@drv_module_reload@basic-no-display:
      fi-icl-u:           NOTRUN -> DMESG-WARN (fdo#108071) +1

    igt@gem_exec_suspend@basic-s3:
      fi-blb-e6850:       PASS -> INCOMPLETE (fdo#107718)

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-icl-u:           NOTRUN -> DMESG-WARN (fdo#108070) +46

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
      fi-bdw-samus:       NOTRUN -> INCOMPLETE (fdo#107773)

    
    ==== Possible fixes ====

    igt@gem_exec_suspend@basic-s4-devices:
      fi-bdw-samus:       INCOMPLETE (fdo#107773) -> PASS

    igt@kms_psr@primary_page_flip:
      fi-kbl-r:           FAIL (fdo#107336) -> PASS

    igt@pm_rpm@module-reload:
      fi-glk-j4005:       FAIL (fdo#104767) -> PASS

    
  fdo#104767 https://bugs.freedesktop.org/show_bug.cgi?id=104767
  fdo#107336 https://bugs.freedesktop.org/show_bug.cgi?id=107336
  fdo#107718 https://bugs.freedesktop.org/show_bug.cgi?id=107718
  fdo#107724 https://bugs.freedesktop.org/show_bug.cgi?id=107724
  fdo#107773 https://bugs.freedesktop.org/show_bug.cgi?id=107773
  fdo#108070 https://bugs.freedesktop.org/show_bug.cgi?id=108070
  fdo#108071 https://bugs.freedesktop.org/show_bug.cgi?id=108071


== Participating hosts (41 -> 41) ==

  Additional (3): fi-icl-u fi-skl-caroline fi-gdg-551 
  Missing    (3): fi-bsw-cyan fi-ilk-m540 fi-icl-u2 


== Build changes ==

    * Linux: CI_DRM_4890 -> Patchwork_10291

  CI_DRM_4890: 3cf2dd7288b4d68f0702d9f5f0a2f0e0e0185c8d @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4652: 83352d08b52acd6ee677f9f62dd032b0b8d31835 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10291: 97b377e08a059ecd2ec5f3dc2bfb249a0f2aabc5 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

97b377e08a05 drm/i915/gvt: GVTg support ppgtt pvmmio optimization
8399e7a93462 drm/i915/gvt: GVTg support master irq pvmmio optimization
a7fbe585aa4e drm/i915/gvt: GVTg support context submission pvmmio optimization
460761b5ad2c drm/i915/gvt: GVTg read_shared_page implementation
82d8fd860bb7 drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register
4300a40028a1 drm/i915/gvt: ppgtt update pvmmio optimization
f520b54318b0 drm/i915/gvt: master irq pvmmio optimization
ec9f5f379874 drm/i915/gvt: context submission pvmmio optimization
164efba08cbe drm/i915/gvt: get ready of memory for pvmmio
b7754818ea52 drm/i915/gvt: add module parameter enable_pvmmio

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10291/issues.html

* ✓ Fi.CI.IGT: success for i915 pvmmio to improve GVTg performance
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (2 preceding siblings ...)
  2018-09-27  7:43 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-09-27 10:25 ` Patchwork
  2018-09-27 11:07 ` [RFC 00/10] " Joonas Lahtinen
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2018-09-27 10:25 UTC (permalink / raw)
  To: Xiaolin Zhang; +Cc: intel-gfx

== Series Details ==

Series: i915 pvmmio to improve GVTg performance
URL   : https://patchwork.freedesktop.org/series/50257/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4890_full -> Patchwork_10291_full =

== Summary - SUCCESS ==

  No regressions found.

  

== Known issues ==

  Here are the changes found in Patchwork_10291_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_userptr_blits@readonly-unsync:
      shard-apl:          PASS -> INCOMPLETE (fdo#103927)

    igt@kms_busy@basic-flip-f:
      shard-snb:          SKIP -> INCOMPLETE (fdo#105411)

    igt@kms_busy@extended-modeset-hang-newfb-render-a:
      shard-kbl:          NOTRUN -> DMESG-WARN (fdo#107956)

    igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-fullscreen:
      shard-glk:          PASS -> FAIL (fdo#103167)

    igt@kms_setmode@basic:
      shard-kbl:          PASS -> FAIL (fdo#99912)

    
    ==== Possible fixes ====

    igt@kms_setmode@basic:
      shard-apl:          FAIL (fdo#99912) -> PASS

    
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
  fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
  fdo#107956 https://bugs.freedesktop.org/show_bug.cgi?id=107956
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912


== Participating hosts (6 -> 5) ==

  Missing    (1): shard-skl 


== Build changes ==

    * Linux: CI_DRM_4890 -> Patchwork_10291

  CI_DRM_4890: 3cf2dd7288b4d68f0702d9f5f0a2f0e0e0185c8d @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4652: 83352d08b52acd6ee677f9f62dd032b0b8d31835 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10291: 97b377e08a059ecd2ec5f3dc2bfb249a0f2aabc5 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10291/shards.html

* Re: [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio
  2018-09-27 16:37 ` [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio Xiaolin Zhang
  2018-09-27  7:16   ` Chris Wilson
@ 2018-09-27 11:03   ` Joonas Lahtinen
  2018-09-28  6:09     ` Zhang, Xiaolin
  1 sibling, 1 reply; 28+ messages in thread
From: Joonas Lahtinen @ 2018-09-27 11:03 UTC (permalink / raw)
  To: Xiaolin Zhang, intel-gfx, intel-gvt-dev
  Cc: joonas.lahtinen, zhiyuan.lv, fei.jiang, zhenyu.z.wang, hang.yuan

Quoting Xiaolin Zhang (2018-09-27 19:37:46)
> This int-type module parameter controls which levels of the pvmmio
> feature are used for MMIO emulation in GVT.
> 
> The parameter defaults to zero, meaning no pvmmio feature is enabled.
> 
> Its permission is 0400, which means the user can only set its
> value on the command line; this prevents modification at runtime,
> which would break the pvmmio internal logic.
> 
> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>

This shouldn't really be a module parameter. We should detect the
capability from the vGPU device and use it always when possible.

Regards, Joonas
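
The suggestion above, detecting the capability from the vGPU device instead of a module parameter, could look roughly like the sketch below. Everything in it is an assumption for illustration: the `PV_CAP_*` bits, the idea of a single capability word advertised by the host, and the function names are hypothetical and do not come from the series or from the PVINFO layout.

```c
#include <stdint.h>

/* Hypothetical capability bits, one per optimization in the cover letter. */
#define PV_CAP_ELSP_SUBMIT  (1u << 0)
#define PV_CAP_MASTER_IRQ   (1u << 1)
#define PV_CAP_PPGTT_UPDATE (1u << 2)

/* What this (imaginary) guest driver build knows how to use. */
#define PV_CAP_DRIVER_SUPPORTED (PV_CAP_ELSP_SUBMIT | PV_CAP_MASTER_IRQ)

/*
 * Negotiated capabilities: whatever the host advertises AND the guest
 * driver supports. No user-visible knob is involved.
 */
static uint32_t pv_negotiate_caps(uint32_t host_caps)
{
	return host_caps & PV_CAP_DRIVER_SUPPORTED;
}

/* True when every bit in 'cap' was negotiated. */
static int pv_cap_enabled(uint32_t caps, uint32_t cap)
{
	return (caps & cap) == cap;
}
```

In this sketch a host advertising all three bits yields ELSP submit and master irq only, because the guest build does not list ppgtt update as supported.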

* Re: [RFC 00/10] i915 pvmmio to improve GVTg performance
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (3 preceding siblings ...)
  2018-09-27 10:25 ` ✓ Fi.CI.IGT: " Patchwork
@ 2018-09-27 11:07 ` Joonas Lahtinen
  2018-09-28  6:11   ` Zhang, Xiaolin
  2018-09-27 16:37 ` [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio Xiaolin Zhang
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 28+ messages in thread
From: Joonas Lahtinen @ 2018-09-27 11:07 UTC (permalink / raw)
  To: Xiaolin Zhang, intel-gfx, intel-gvt-dev
  Cc: joonas.lahtinen, zhiyuan.lv, fei.jiang, zhenyu.z.wang, hang.yuan

Quoting Xiaolin Zhang (2018-09-27 19:37:45)
> To improve GVTg performance, the number of MMIO access traps in the
> guest driver can be reduced in certain scenarios, since each MMIO
> access trap introduces VM exit/VM entry cost.
> 
> The solution in this patch set is to set up a shared memory region
> that is accessed by both the guest and GVTg without trap cost. The shared
> memory region is allocated by the guest driver, and the guest driver will
> pass the region's guest physical address to GVTg through a
> PVINFO register; GVTg can then access this region directly, without
> trap cost, to exchange data with the guest.
> 
> In this patch set, three kinds of pvmmio optimization are implemented,
> each controlled by a level flag in the enable_pvmmio PVINFO register:
> 1. workload submission (context submission): reduce 4 traps to 1 trap.
> 2. master irq: reduce 2 traps to 1 trap.
> 3. ppgtt update: eliminate the cost of ppgtt write protection.
> 
> In our experiments, performance improved by 4 percent on average
> across both media and 3D workload benchmarks.
> 
> The pvmmio framework could also be used to optimize more scenarios,
> such as global GTT updates and display plane and watermark updates with
> the guest.

Overall comments:

The patches should be properly prefixed and split down. We should have
"drm/i915:" patches that touch i915 portions, and those should not touch
any gvt parts. Then there should be "drm/i915/gvt:" parts which don't
touch anything from i915, and would be reviewed in the GVT list.

We'd then proceed to merge the i915 changes and the GVT changes would be
merged in the GVT tree.

Regards, Joonas

* Re: [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization
  2018-09-27 16:37 ` [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization Xiaolin Zhang
  2018-09-27  7:19   ` Chris Wilson
@ 2018-09-27 11:13   ` Joonas Lahtinen
  1 sibling, 0 replies; 28+ messages in thread
From: Joonas Lahtinen @ 2018-09-27 11:13 UTC (permalink / raw)
  To: Xiaolin Zhang, intel-gfx, intel-gvt-dev
  Cc: joonas.lahtinen, zhiyuan.lv, fei.jiang, zhenyu.z.wang, hang.yuan

Quoting Xiaolin Zhang (2018-09-27 19:37:48)
> This is a performance optimization that reduces the number of mmio
> traps from 4 to 1 during ELSP port writes (context submission).
> 
> When submitting a context, the elsp_data[4] values are cached in
> the shared page, and only the last ELSP port write (elsp_data[0]) is
> trapped to gvt for the real context submission.
> 
> Use PVMMIO_ELSP_SUBMIT to control this level of pvmmio optimization.
> 
> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>

These changes are somewhat invasive to the ELSP submit path.

We should instead look into providing "pvgpu" (just a suggestion for
name) flavour of virtual functions which do the para-virtualized talking.
We don't want to sprinkle pv code around the driver with ifs.

Regards, Joonas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC 00/10] i915 pvmmio to improve GVTg performance
@ 2018-09-27 16:37 Xiaolin Zhang
  2018-09-27  7:20 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (14 more replies)
  0 siblings, 15 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

To improve GVTg performance, we can reduce the number of mmio access
traps taken by the guest driver in certain scenarios, since each mmio
access trap introduces vm exit/vm enter cost.

The solution in this patch set is to set up a shared memory region
that both the guest and GVTg can access without trap cost. The shared
memory region is allocated by the guest driver, which passes the
region's guest physical address to GVTg through a PVINFO register.
GVTg can then access this region directly, without trap cost, to
exchange data with the guest.

This patch set implements 3 kinds of pvmmio optimization, each
controlled by a level flag in the enable_pvmmio PVINFO register:
1. workload submission (context submission): reduce 4 traps to 1 trap.
2. master irq: reduce 2 traps to 1 trap.
3. ppgtt update: eliminate the cost of ppgtt write protection.

In our experiments, performance improved by 4 percent on average
across both media and 3D workload benchmarks.

The pvmmio framework could also be used to optimize more scenarios,
such as global GTT updates and display plane and watermark updates
in the guest.

Xiaolin Zhang (10):
  drm/i915/gvt: add module parameter enable_pvmmio
  drm/i915/gvt: get ready of memory for pvmmio
  drm/i915/gvt: context submission pvmmio optimization
  drm/i915/gvt: master irq pvmmio optimization
  drm/i915/gvt: ppgtt update pvmmio optimization
  drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register
  drm/i915/gvt: GVTg read_shared_page implementation
  drm/i915/gvt: GVTg support context submission pvmmio optimization
  drm/i915/gvt: GVTg support master irq pvmmio optimization
  drm/i915/gvt: GVTg support ppgtt pvmmio optimization

 drivers/gpu/drm/i915/gvt/gtt.c       | 318 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/gvt/gtt.h       |   9 +
 drivers/gpu/drm/i915/gvt/gvt.h       |   4 +-
 drivers/gpu/drm/i915/gvt/handlers.c  |  44 ++++-
 drivers/gpu/drm/i915/gvt/interrupt.c |  17 +-
 drivers/gpu/drm/i915/gvt/vgpu.c      |  20 +++
 drivers/gpu/drm/i915/i915_drv.c      |   5 +
 drivers/gpu/drm/i915/i915_drv.h      |   6 +
 drivers/gpu/drm/i915/i915_gem_gtt.c  |  36 ++++
 drivers/gpu/drm/i915/i915_irq.c      |  29 +++-
 drivers/gpu/drm/i915/i915_params.c   |   4 +
 drivers/gpu/drm/i915/i915_params.h   |   4 +-
 drivers/gpu/drm/i915/i915_pvinfo.h   |  43 ++++-
 drivers/gpu/drm/i915/i915_vgpu.c     |  29 +++-
 drivers/gpu/drm/i915/intel_lrc.c     |  37 +++-
 15 files changed, 588 insertions(+), 17 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (4 preceding siblings ...)
  2018-09-27 11:07 ` [RFC 00/10] " Joonas Lahtinen
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27  7:16   ` Chris Wilson
  2018-09-27 11:03   ` Joonas Lahtinen
  2018-09-27 16:37 ` [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio Xiaolin Zhang
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

This int-type module parameter controls which levels of the pvmmio
feature are enabled for MMIO emulation in GVT.

The parameter defaults to zero, i.e. no pvmmio feature is enabled.

Its permission is 0400, which means the user can only set its value
through the cmdline. This prevents modification at runtime, which
would break the pvmmio internal logic.
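
For readers following along, here is a minimal userspace sketch (not
part of the patch) of how the level bitmask is meant to be tested. The
enum values are copied from the i915_pvinfo.h hunk below; struct
pv_state and pvmmio_level_enabled() are hypothetical stand-ins for
dev_priv, i915_modparams and the PVMMIO_LEVEL_ENABLE() macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Level flags as defined by the patch in i915_pvinfo.h. */
enum pvmmio_levels {
	PVMMIO_ELSP_SUBMIT     = 0x1,
	PVMMIO_PLANE_UPDATE    = 0x2,
	PVMMIO_PLANE_WM_UPDATE = 0x4,
	PVMMIO_MASTER_IRQ      = 0x8,
	PVMMIO_PPGTT_UPDATE    = 0x10,
};

/* Hypothetical stand-in for the state the real macro consults. */
struct pv_state {
	bool vgpu_active;       /* models intel_vgpu_active(dev_priv) */
	uint32_t enable_pvmmio; /* models i915_modparams.enable_pvmmio */
};

/* A level is usable only when running as a vGPU and its bit is set. */
static bool pvmmio_level_enabled(const struct pv_state *s, uint32_t level)
{
	assert(s);
	return s->vgpu_active && (s->enable_pvmmio & level) != 0;
}
```

Writing the test as a function with an explicitly parenthesized mask
avoids relying on operator precedence when a caller passes a compound
expression such as "A | B" as the level.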

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h    |  3 +++
 drivers/gpu/drm/i915/i915_params.c |  4 ++++
 drivers/gpu/drm/i915/i915_params.h |  3 ++-
 drivers/gpu/drm/i915/i915_pvinfo.h | 16 +++++++++++++++-
 drivers/gpu/drm/i915/i915_vgpu.c   | 12 +++++++++++-
 5 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8624b4b..174d618 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3871,4 +3871,7 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
 		return I915_HWS_CSB_WRITE_INDEX;
 }
 
+#define PVMMIO_LEVEL_ENABLE(dev_priv, level)	\
+	(intel_vgpu_active(dev_priv) && i915_modparams.enable_pvmmio & level)
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 295e981..5ee236ec 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -174,6 +174,10 @@ struct i915_params i915_modparams __read_mostly = {
 i915_param_named(enable_gvt, bool, 0400,
 	"Enable support for Intel GVT-g graphics virtualization host support(default:false)");
 
+i915_param_named(enable_pvmmio, int, 0400,
+	"Enable pv mmio feature, default TRUE. This parameter "
+	"could only set from host, guest value is set through vgt_if");
+
 static __always_inline void _print_param(struct drm_printer *p,
 					 const char *name,
 					 const char *type,
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 6c4d4a2..0f6a38b 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -68,7 +68,8 @@
 	param(bool, nuclear_pageflip, false) \
 	param(bool, enable_dp_mst, true) \
 	param(bool, enable_dpcd_backlight, false) \
-	param(bool, enable_gvt, false)
+	param(bool, enable_gvt, false) \
+	param(int, enable_pvmmio, 0)
 
 #define MEMBER(T, member, ...) T member;
 struct i915_params {
diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h b/drivers/gpu/drm/i915/i915_pvinfo.h
index eeaa3d5..697e998 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -49,6 +49,8 @@ enum vgt_g2v_type {
 	VGT_G2V_MAX,
 };
 
+#define VGPU_PVMMIO(vgpu) vgpu_vreg_t(vgpu, vgtif_reg(enable_pvmmio))
+
 /*
  * VGT capabilities type
  */
@@ -56,6 +58,17 @@ enum vgt_g2v_type {
 #define VGT_CAPS_HWSP_EMULATION		BIT(3)
 #define VGT_CAPS_HUGE_GTT		BIT(4)
 
+/*
+ * define different levels of PVMMIO optimization
+ */
+enum pvmmio_levels {
+	PVMMIO_ELSP_SUBMIT = 0x1,
+	PVMMIO_PLANE_UPDATE = 0x2,
+	PVMMIO_PLANE_WM_UPDATE = 0x4,
+	PVMMIO_MASTER_IRQ = 0x8,
+	PVMMIO_PPGTT_UPDATE = 0x10,
+};
+
 struct vgt_if {
 	u64 magic;		/* VGT_MAGIC */
 	u16 version_major;
@@ -106,8 +119,9 @@ struct vgt_if {
 
 	u32 execlist_context_descriptor_lo;
 	u32 execlist_context_descriptor_hi;
+	u32 enable_pvmmio;
 
-	u32  rsv7[0x200 - 24];    /* pad to one page */
+	u32  rsv7[0x200 - 25];    /* pad to one page */
 } __packed;
 
 #define vgtif_reg(x) \
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index 869cf4a..d22c5ca 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -77,8 +77,18 @@ void i915_check_vgpu(struct drm_i915_private *dev_priv)
 
 	dev_priv->vgpu.caps = __raw_i915_read32(dev_priv, vgtif_reg(vgt_caps));
 
+	/* If guest wants to enable pvmmio, it needs to enable it explicitly
+	 * through vgt_if interface, and then read back the enable state from
+	 * gvt layer.
+	 */
+	__raw_i915_write32(dev_priv, vgtif_reg(enable_pvmmio),
+			i915_modparams.enable_pvmmio);
+	i915_modparams.enable_pvmmio = __raw_i915_read32(dev_priv,
+			vgtif_reg(enable_pvmmio));
+
 	dev_priv->vgpu.active = true;
-	DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
+	DRM_INFO("Virtual GPU for Intel GVT-g detected with pvmmio 0x%x\n",
+		i915_modparams.enable_pvmmio);
 }
 
 bool intel_vgpu_has_full_48bit_ppgtt(struct drm_i915_private *dev_priv)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (5 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27  7:17   ` Chris Wilson
  2018-10-09  2:31   ` Zhenyu Wang
  2018-09-27 16:37 ` [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization Xiaolin Zhang
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

To enable the pvmmio feature, we need to prepare one 4K shared page
that is accessed by both the guest and the backend i915 driver.

The guest i915 driver allocates one page of memory and passes its guest
physical address to the backend i915 driver through a PVINFO register.
The backend i915 driver can then access the shared page via the
hypervisor read_gpa functionality, exchanging data with the guest
without hypervisor trap cost.
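
A minimal userspace sketch (not part of the patch) of the shared page
layout and of splitting the guest physical address into the lo/hi
PVINFO registers. The field names follow the i915_pvinfo.h hunk below.
The packed attribute, struct gpa_regs, gpa_split() and gpa_join() are
assumptions of this sketch: the posted structs are not marked packed,
and on ABIs that align u64 to 8 bytes the compiler may pad
pv_ppgtt_update, which would change the total size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as the patch; packed here so the size is ABI-independent. */
struct pv_ppgtt_update {
	uint64_t pdp;
	uint64_t start;
	uint64_t length;
	uint32_t cache_level;
} __attribute__((packed));

struct gvt_shared_page {
	uint32_t elsp_data[4];
	uint32_t reg_addr;
	uint32_t disable_irq;
	struct pv_ppgtt_update pv_ppgtt;
	uint32_t rsvd2[0x400 - 13]; /* 13 dwords used above, pad to 1024 */
} __attribute__((packed));

/* Catch layout drift at build time rather than at runtime. */
_Static_assert(sizeof(struct gvt_shared_page) == 4096,
	       "shared page must be exactly 4 KiB");

/* Hypothetical model of the vgt_if shared_page_gpa.{lo,hi} registers. */
struct gpa_regs {
	uint32_t lo;
	uint32_t hi;
};

/* Guest side: split the 64-bit GPA into two 32-bit register values. */
static struct gpa_regs gpa_split(uint64_t gpa)
{
	struct gpa_regs r = { (uint32_t)gpa, (uint32_t)(gpa >> 32) };
	return r;
}

/* Host side: reassemble the GPA once both halves have been written. */
static uint64_t gpa_join(struct gpa_regs r)
{
	return ((uint64_t)r.hi << 32) | r.lo;
}
```

Because the address arrives as two separate 32-bit writes, the host
side should only use the joined value after the second half is in.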

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
 drivers/gpu/drm/i915/i915_drv.h    |  3 +++
 drivers/gpu/drm/i915/i915_pvinfo.h | 25 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_vgpu.c   | 17 +++++++++++++++++
 4 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ade9bca..815a4dd 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -885,6 +885,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
 		return -ENODEV;
 
 	spin_lock_init(&dev_priv->irq_lock);
+	spin_lock_init(&dev_priv->shared_page_lock);
 	spin_lock_init(&dev_priv->gpu_error.lock);
 	mutex_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
@@ -987,6 +988,8 @@ static void i915_mmio_cleanup(struct drm_i915_private *dev_priv)
 
 	intel_teardown_mchbar(dev_priv);
 	pci_iounmap(pdev, dev_priv->regs);
+	if (intel_vgpu_active(dev_priv) && dev_priv->shared_page)
+		free_pages((unsigned long)dev_priv->shared_page, 0);
 }
 
 /**
@@ -1029,6 +1032,8 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
 	return 0;
 
 err_uncore:
+	if (intel_vgpu_active(dev_priv) && dev_priv->shared_page)
+		free_pages((unsigned long)dev_priv->shared_page, 0);
 	intel_uncore_fini(dev_priv);
 err_bridge:
 	pci_dev_put(dev_priv->bridge_dev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 174d618..76d7e9c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -56,6 +56,7 @@
 
 #include "i915_params.h"
 #include "i915_reg.h"
+#include "i915_pvinfo.h"
 #include "i915_utils.h"
 
 #include "intel_bios.h"
@@ -1623,6 +1624,8 @@ struct drm_i915_private {
 	resource_size_t stolen_usable_size;	/* Total size minus reserved ranges */
 
 	void __iomem *regs;
+	struct gvt_shared_page *shared_page;
+	spinlock_t shared_page_lock;
 
 	struct intel_uncore uncore;
 
diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h b/drivers/gpu/drm/i915/i915_pvinfo.h
index 697e998..ab839a7 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -49,6 +49,25 @@ enum vgt_g2v_type {
 	VGT_G2V_MAX,
 };
 
+struct pv_ppgtt_update {
+	u64 pdp;
+	u64 start;
+	u64 length;
+	u32 cache_level;
+};
+
+/*
+ * shared page(4KB) between gvt and VM, could be allocated by guest driver
+ * or a fixed location in PCI bar 0 region
+ */
+struct gvt_shared_page {
+	u32 elsp_data[4];
+	u32 reg_addr;
+	u32 disable_irq;
+	struct pv_ppgtt_update pv_ppgtt;
+	u32 rsvd2[0x400 - 13];
+};
+
 #define VGPU_PVMMIO(vgpu) vgpu_vreg_t(vgpu, vgtif_reg(enable_pvmmio))
 
 /*
@@ -120,8 +139,12 @@ struct vgt_if {
 	u32 execlist_context_descriptor_lo;
 	u32 execlist_context_descriptor_hi;
 	u32 enable_pvmmio;
+	struct {
+		u32 lo;
+		u32 hi;
+	} shared_page_gpa;
 
-	u32  rsv7[0x200 - 25];    /* pad to one page */
+	u32  rsv7[0x200 - 27];    /* pad to one page */
 } __packed;
 
 #define vgtif_reg(x) \
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index d22c5ca..10ae94b 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -62,6 +62,7 @@ void i915_check_vgpu(struct drm_i915_private *dev_priv)
 {
 	u64 magic;
 	u16 version_major;
+	u64 shared_page_gpa;
 
 	BUILD_BUG_ON(sizeof(struct vgt_if) != VGT_PVINFO_SIZE);
 
@@ -89,6 +90,22 @@ void i915_check_vgpu(struct drm_i915_private *dev_priv)
 	dev_priv->vgpu.active = true;
 	DRM_INFO("Virtual GPU for Intel GVT-g detected with pvmmio 0x%x\n",
 		i915_modparams.enable_pvmmio);
+
+	if (intel_vgpu_active(dev_priv) && i915_modparams.enable_pvmmio) {
+		dev_priv->shared_page =  (struct gvt_shared_page *)
+				__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0);
+		if (!dev_priv->shared_page) {
+			DRM_ERROR("out of memory for shared page memory\n");
+			return;
+		}
+		shared_page_gpa = __pa(dev_priv->shared_page);
+		__raw_i915_write32(dev_priv, vgtif_reg(shared_page_gpa.lo),
+				lower_32_bits(shared_page_gpa));
+		__raw_i915_write32(dev_priv, vgtif_reg(shared_page_gpa.hi),
+				upper_32_bits(shared_page_gpa));
+		DRM_INFO("VGPU shared page enabled\n");
+	}
+
 }
 
 bool intel_vgpu_has_full_48bit_ppgtt(struct drm_i915_private *dev_priv)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (6 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27  7:19   ` Chris Wilson
  2018-09-27 11:13   ` Joonas Lahtinen
  2018-09-27 16:37 ` [RFC 04/10] drm/i915/gvt: master irq " Xiaolin Zhang
                   ` (6 subsequent siblings)
  14 siblings, 2 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

This is a performance optimization that reduces the number of mmio
traps from 4 to 1 during ELSP port writes (context submission).

When submitting a context, the elsp_data[4] values are cached in
the shared page, and only the last ELSP port write (elsp_data[0]) is
trapped to gvt for the real context submission.

Use PVMMIO_ELSP_SUBMIT to control this level of pvmmio optimization.
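
A minimal userspace sketch (not part of the patch) of the trap count
before and after this change. struct fake_guest and all function names
are hypothetical; the trap counter stands in for the vm exit that a
real ELSP write would cause under GVTg.

```c
#include <assert.h>
#include <stdint.h>

struct fake_guest {
	uint32_t elsp_data[4]; /* models gvt_shared_page.elsp_data */
	uint32_t elsp_reg;     /* models the ELSP mmio register */
	unsigned int traps;    /* number of trapped mmio writes so far */
};

/* Every write to the emulated ELSP register costs one trap. */
static void trapped_elsp_write(struct fake_guest *g, uint32_t val)
{
	assert(g);
	g->elsp_reg = val;
	g->traps++;
}

/* Baseline: two descriptors, upper then lower dword, four traps. */
static void submit_trapped(struct fake_guest *g, uint64_t port1,
			   uint64_t port0)
{
	trapped_elsp_write(g, (uint32_t)(port1 >> 32));
	trapped_elsp_write(g, (uint32_t)port1);
	trapped_elsp_write(g, (uint32_t)(port0 >> 32));
	trapped_elsp_write(g, (uint32_t)port0);
}

/* pvmmio: stage three dwords in plain memory, trap only on the last. */
static void submit_pv(struct fake_guest *g, uint64_t port1, uint64_t port0)
{
	g->elsp_data[0] = (uint32_t)(port1 >> 32);
	g->elsp_data[1] = (uint32_t)port1;
	g->elsp_data[2] = (uint32_t)(port0 >> 32);
	trapped_elsp_write(g, (uint32_t)port0);
}
```

The staged dwords must be visible in the shared page before the final
register write, since that write is what tells the host to read them.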

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/i915_params.h |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c   | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 0f6a38b..6c81c87 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -69,7 +69,7 @@
 	param(bool, enable_dp_mst, true) \
 	param(bool, enable_dpcd_backlight, false) \
 	param(bool, enable_gvt, false) \
-	param(int, enable_pvmmio, 0)
+	param(int, enable_pvmmio, PVMMIO_ELSP_SUBMIT)
 
 #define MEMBER(T, member, ...) T member;
 struct i915_params {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4b28225..cdc713c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -451,7 +451,12 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists *execlists = &engine->execlists;
 	struct execlist_port *port = execlists->port;
+	u32 __iomem *elsp =
+		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
+	u32 *elsp_data;
 	unsigned int n;
+	u32 descs[4];
+	int i = 0;
 
 	/*
 	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
@@ -494,8 +499,24 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 			GEM_BUG_ON(!n);
 			desc = 0;
 		}
+		if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
+			GEM_BUG_ON(i >= 4);
+			descs[i] = upper_32_bits(desc);
+			descs[i + 1] = lower_32_bits(desc);
+			i += 2;
+		} else {
+			write_desc(execlists, desc, n);
+		}
+	}
 
-		write_desc(execlists, desc, n);
+	if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
+		spin_lock(&engine->i915->shared_page_lock);
+		elsp_data = engine->i915->shared_page->elsp_data;
+		*elsp_data = descs[0];
+		*(elsp_data + 1) = descs[1];
+		*(elsp_data + 2) = descs[2];
+		writel(descs[3], elsp);
+		spin_unlock(&engine->i915->shared_page_lock);
 	}
 
 	/* we need to manually load the submit queue */
@@ -538,11 +559,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 	struct intel_engine_execlists *execlists = &engine->execlists;
 	struct intel_context *ce =
 		to_intel_context(engine->i915->preempt_context, engine);
+	u32 __iomem *elsp =
+		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
+	u32 *elsp_data;
 	unsigned int n;
 
 	GEM_BUG_ON(execlists->preempt_complete_status !=
 		   upper_32_bits(ce->lrc_desc));
 
+	if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
+		spin_lock(&engine->i915->shared_page_lock);
+		elsp_data = engine->i915->shared_page->elsp_data;
+		*elsp_data = 0;
+		*(elsp_data + 1) = 0;
+		*(elsp_data + 2) = upper_32_bits(ce->lrc_desc);
+		writel(lower_32_bits(ce->lrc_desc), elsp);
+		spin_unlock(&engine->i915->shared_page_lock);
+		return;
+	}
+
 	/*
 	 * Switch to our empty preempt context so
 	 * the state of the GPU is known (idle).
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 04/10] drm/i915/gvt: master irq pvmmio optimization
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (7 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27 16:37 ` [RFC 05/10] drm/i915/gvt: ppgtt update " Xiaolin Zhang
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

The master irq register is accessed twice on every irq handling, so it
is trapped to SOS very frequently. Optimize this by moving the master
irq state to the shared page, so that the write does not need to be
trapped.

When the irq needs to be re-enabled, use another pvmmio register so
that SOS can inject pending irqs in a timely manner. This avoids one
trap when we disable the master irq.

Use PVMMIO_MASTER_IRQ to control this level of pvmmio optimization.
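
A minimal userspace sketch (not part of the patch) of the two irq
handler flows. struct fake_guest, trapped_write() and the handler names
are hypothetical, and MASTER_IRQ_CONTROL is only an illustrative value
standing in for GEN8_MASTER_IRQ_CONTROL.

```c
#include <assert.h>
#include <stdint.h>

#define MASTER_IRQ_CONTROL 0x80000000u /* illustrative enable bit */

struct fake_guest {
	uint32_t disable_irq;       /* models gvt_shared_page.disable_irq */
	uint32_t master_irq;        /* models the master irq register */
	uint32_t check_pending_irq; /* models the new PVINFO register */
	unsigned int traps;         /* number of trapped mmio writes */
};

/* Any write to an emulated register costs one trap. */
static void trapped_write(struct fake_guest *g, uint32_t *reg, uint32_t val)
{
	assert(g && reg);
	*reg = val;
	g->traps++;
}

/* Baseline handler: disable and re-enable both trap. */
static void handle_irq_trapped(struct fake_guest *g)
{
	trapped_write(g, &g->master_irq, 0);
	/* ... ack and process interrupt sources ... */
	trapped_write(g, &g->master_irq, MASTER_IRQ_CONTROL);
}

/* pvmmio handler: disable is a plain store, only re-enable traps. */
static void handle_irq_pv(struct fake_guest *g)
{
	g->disable_irq = 1;
	/* ... ack and process interrupt sources ... */
	g->disable_irq = 0;
	trapped_write(g, &g->check_pending_irq, 1);
}
```

The remaining trap on re-enable is what gives the host a point at
which to deliver interrupts that arrived while disable_irq was set.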

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c    | 29 +++++++++++++++++++++++------
 drivers/gpu/drm/i915/i915_params.h |  2 +-
 drivers/gpu/drm/i915/i915_pvinfo.h |  3 ++-
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2e24227..f911ed1 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2901,7 +2901,10 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 	if (!master_ctl)
 		return IRQ_NONE;
 
-	I915_WRITE_FW(GEN8_MASTER_IRQ, 0);
+	if (PVMMIO_LEVEL_ENABLE(dev_priv, PVMMIO_MASTER_IRQ))
+		dev_priv->shared_page->disable_irq = 1;
+	else
+		I915_WRITE_FW(GEN8_MASTER_IRQ, 0);
 
 	/* Find, clear, then process each source of interrupt */
 	gen8_gt_irq_ack(dev_priv, master_ctl, gt_iir);
@@ -2913,7 +2916,12 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 		enable_rpm_wakeref_asserts(dev_priv);
 	}
 
-	I915_WRITE_FW(GEN8_MASTER_IRQ, GEN8_MASTER_IRQ_CONTROL);
+	if (PVMMIO_LEVEL_ENABLE(dev_priv, PVMMIO_MASTER_IRQ)) {
+		dev_priv->shared_page->disable_irq = 0;
+		__raw_i915_write32(dev_priv, vgtif_reg(check_pending_irq), 1);
+	} else {
+		I915_WRITE_FW(GEN8_MASTER_IRQ, GEN8_MASTER_IRQ_CONTROL);
+	}
 
 	gen8_gt_irq_handler(dev_priv, master_ctl, gt_iir);
 
@@ -3598,8 +3606,12 @@ static void gen8_irq_reset(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	int pipe;
 
-	I915_WRITE(GEN8_MASTER_IRQ, 0);
-	POSTING_READ(GEN8_MASTER_IRQ);
+	if (PVMMIO_LEVEL_ENABLE(dev_priv, PVMMIO_MASTER_IRQ)) {
+		dev_priv->shared_page->disable_irq = 1;
+	} else {
+		I915_WRITE(GEN8_MASTER_IRQ, 0);
+		POSTING_READ(GEN8_MASTER_IRQ);
+	}
 
 	gen8_gt_irq_reset(dev_priv);
 
@@ -4244,8 +4256,13 @@ static int gen8_irq_postinstall(struct drm_device *dev)
 	if (HAS_PCH_SPLIT(dev_priv))
 		ibx_irq_postinstall(dev);
 
-	I915_WRITE(GEN8_MASTER_IRQ, GEN8_MASTER_IRQ_CONTROL);
-	POSTING_READ(GEN8_MASTER_IRQ);
+	if (PVMMIO_LEVEL_ENABLE(dev_priv, PVMMIO_MASTER_IRQ)) {
+		dev_priv->shared_page->disable_irq = 0;
+		__raw_i915_write32(dev_priv, vgtif_reg(check_pending_irq), 1);
+	} else {
+		I915_WRITE(GEN8_MASTER_IRQ, GEN8_MASTER_IRQ_CONTROL);
+		POSTING_READ(GEN8_MASTER_IRQ);
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 6c81c87..bfc30a0 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -69,7 +69,7 @@
 	param(bool, enable_dp_mst, true) \
 	param(bool, enable_dpcd_backlight, false) \
 	param(bool, enable_gvt, false) \
-	param(int, enable_pvmmio, PVMMIO_ELSP_SUBMIT)
+	param(int, enable_pvmmio, PVMMIO_ELSP_SUBMIT | PVMMIO_MASTER_IRQ)
 
 #define MEMBER(T, member, ...) T member;
 struct i915_params {
diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h b/drivers/gpu/drm/i915/i915_pvinfo.h
index ab839a7..60183c7 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -143,8 +143,9 @@ struct vgt_if {
 		u32 lo;
 		u32 hi;
 	} shared_page_gpa;
+	u32 check_pending_irq;
 
-	u32  rsv7[0x200 - 27];    /* pad to one page */
+	u32  rsv7[0x200 - 28];    /* pad to one page */
 } __packed;
 
 #define vgtif_reg(x) \
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 05/10] drm/i915/gvt: ppgtt update pvmmio optimization
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (8 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 04/10] drm/i915/gvt: master irq " Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27 16:37 ` [RFC 06/10] drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register Xiaolin Zhang
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

This patch extends g2v notification to notify host GVT-g of ppgtt
updates from the guest, including alloc_4lvl, clear_4lvl and
insert_4lvl. It uses the shared page to pass the additional parameters.
This patch also adds one new pvmmio level to control ppgtt update.

Use PVMMIO_PPGTT_UPDATE to control this level of pvmmio optimization.
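
A minimal userspace sketch (not part of the patch) of the notification
pattern: fill the parameter block in the shared page first, then do a
single trapped write to g2v_notify. The enum values, struct fake_guest
and pv_ppgtt_notify() are hypothetical stand-ins; the real opcodes are
the VGT_G2V_PPGTT_L4_* entries added in the i915_pvinfo.h hunk below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for VGT_G2V_PPGTT_L4_{ALLOC,CLEAR,INSERT}. */
enum pv_ppgtt_op {
	PV_PPGTT_L4_ALLOC = 1,
	PV_PPGTT_L4_CLEAR,
	PV_PPGTT_L4_INSERT,
};

/* Same fields as struct pv_ppgtt_update in the patch. */
struct pv_ppgtt_update {
	uint64_t pdp;
	uint64_t start;
	uint64_t length;
	uint32_t cache_level;
};

struct fake_guest {
	struct pv_ppgtt_update pv_ppgtt; /* lives in the shared page */
	uint32_t g2v_notify;             /* models the PVINFO register */
	unsigned int traps;              /* number of trapped writes */
};

/* Publish the parameters, then notify the host with one trapped write. */
static void pv_ppgtt_notify(struct fake_guest *g, enum pv_ppgtt_op op,
			    uint64_t pdp, uint64_t start, uint64_t length,
			    uint32_t cache_level)
{
	assert(g);
	g->pv_ppgtt.pdp = pdp;
	g->pv_ppgtt.start = start;
	g->pv_ppgtt.length = length;
	g->pv_ppgtt.cache_level = cache_level;

	g->g2v_notify = (uint32_t)op; /* the only trapped access */
	g->traps++;
}
```

Since the parameter block is a single slot in the shared page,
concurrent updates from the guest would need to be serialized so one
notification cannot overwrite the parameters of another.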

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 36 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_params.h  |  3 ++-
 drivers/gpu/drm/i915/i915_pvinfo.h  |  3 +++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f6c7ab4..fe0cbd2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -990,6 +990,8 @@ static void gen8_ppgtt_clear_4lvl(struct i915_address_space *vm,
 	struct i915_pml4 *pml4 = &ppgtt->pml4;
 	struct i915_page_directory_pointer *pdp;
 	unsigned int pml4e;
+	u64 orig_start = start;
+	u64 orig_length = length;
 
 	GEM_BUG_ON(!use_4lvl(vm));
 
@@ -1003,6 +1005,16 @@ static void gen8_ppgtt_clear_4lvl(struct i915_address_space *vm,
 
 		free_pdp(vm, pdp);
 	}
+
+	if (PVMMIO_LEVEL_ENABLE(vm->i915, PVMMIO_PPGTT_UPDATE)) {
+		struct drm_i915_private *dev_priv = vm->i915;
+		struct pv_ppgtt_update *pv_ppgtt =
+					&dev_priv->shared_page->pv_ppgtt;
+		pv_ppgtt->pdp = px_dma(pml4);
+		pv_ppgtt->start = orig_start;
+		pv_ppgtt->length = orig_length;
+		I915_WRITE(vgtif_reg(g2v_notify), VGT_G2V_PPGTT_L4_CLEAR);
+	}
 }
 
 static inline struct sgt_dma {
@@ -1244,6 +1256,18 @@ static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
 
 		vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 	}
+
+	if (PVMMIO_LEVEL_ENABLE(vm->i915, PVMMIO_PPGTT_UPDATE)) {
+		struct drm_i915_private *dev_priv = vm->i915;
+		struct pv_ppgtt_update *pv_ppgtt =
+					     &dev_priv->shared_page->pv_ppgtt;
+		pv_ppgtt->pdp = px_dma(&ppgtt->pml4);
+		pv_ppgtt->start = vma->node.start;
+		pv_ppgtt->length = vma->node.size;
+		pv_ppgtt->cache_level = cache_level;
+		I915_WRITE(vgtif_reg(g2v_notify), VGT_G2V_PPGTT_L4_INSERT);
+	}
+
 }
 
 static void gen8_free_page_tables(struct i915_address_space *vm,
@@ -1487,6 +1511,8 @@ static int gen8_ppgtt_alloc_4lvl(struct i915_address_space *vm,
 	u64 from = start;
 	u32 pml4e;
 	int ret;
+	u64 orig_start = start;
+	u64 orig_length = length;
 
 	gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
 		if (pml4->pdps[pml4e] == vm->scratch_pdp) {
@@ -1503,6 +1529,16 @@ static int gen8_ppgtt_alloc_4lvl(struct i915_address_space *vm,
 			goto unwind_pdp;
 	}
 
+	if (PVMMIO_LEVEL_ENABLE(vm->i915, PVMMIO_PPGTT_UPDATE)) {
+		struct drm_i915_private *dev_priv = vm->i915;
+		struct pv_ppgtt_update *pv_ppgtt =
+					&dev_priv->shared_page->pv_ppgtt;
+		pv_ppgtt->pdp = px_dma(pml4);
+		pv_ppgtt->start = orig_start;
+		pv_ppgtt->length = orig_length;
+		I915_WRITE(vgtif_reg(g2v_notify), VGT_G2V_PPGTT_L4_ALLOC);
+	}
+
 	return 0;
 
 unwind_pdp:
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index bfc30a0..f9fe6af 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -69,7 +69,8 @@
 	param(bool, enable_dp_mst, true) \
 	param(bool, enable_dpcd_backlight, false) \
 	param(bool, enable_gvt, false) \
-	param(int, enable_pvmmio, PVMMIO_ELSP_SUBMIT | PVMMIO_MASTER_IRQ)
+	param(int, enable_pvmmio, PVMMIO_ELSP_SUBMIT | PVMMIO_MASTER_IRQ \
+			| PVMMIO_PPGTT_UPDATE)
 
 #define MEMBER(T, member, ...) T member;
 struct i915_params {
diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h b/drivers/gpu/drm/i915/i915_pvinfo.h
index 60183c7..39721f8 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -46,6 +46,9 @@ enum vgt_g2v_type {
 	VGT_G2V_PPGTT_L4_PAGE_TABLE_DESTROY,
 	VGT_G2V_EXECLIST_CONTEXT_CREATE,
 	VGT_G2V_EXECLIST_CONTEXT_DESTROY,
+	VGT_G2V_PPGTT_L4_ALLOC,
+	VGT_G2V_PPGTT_L4_CLEAR,
+	VGT_G2V_PPGTT_L4_INSERT,
 	VGT_G2V_MAX,
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 06/10] drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (9 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 05/10] drm/i915/gvt: ppgtt update " Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27 16:37 ` [RFC 07/10] drm/i915/gvt: GVTg read_shared_page implementation Xiaolin Zhang
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

Implement the enable_pvmmio PVINFO register handler in GVTg to
control the different levels of pvmmio optimization within the guest.
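
A minimal userspace sketch (not part of the patch) of the negotiation
this handler implements: the guest writes the levels it wants, the host
keeps only those it also has enabled, and the guest reads the result
back. struct fake_vgpu and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct fake_vgpu {
	uint32_t host_mask;          /* models host enable_pvmmio param */
	uint32_t enable_pvmmio_vreg; /* models the vgpu's PVINFO vreg */
};

/* Host side: accept only levels the host has enabled. */
static void host_handle_enable_pvmmio(struct fake_vgpu *v, uint32_t data)
{
	assert(v);
	/* ANDing with a zero host mask also covers the "disabled" case. */
	v->enable_pvmmio_vreg = data & v->host_mask;
}

/* Guest side: request levels, then read back what was granted. */
static uint32_t guest_negotiate(struct fake_vgpu *v, uint32_t requested)
{
	host_handle_enable_pvmmio(v, requested); /* trapped write */
	return v->enable_pvmmio_vreg;            /* read back */
}
```

The guest should use only the value it reads back, never the value it
requested, since the host may grant fewer levels or none at all.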

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/gvt/handlers.c | 10 ++++++++++
 drivers/gpu/drm/i915/gvt/vgpu.c     |  6 ++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index d262587..36171e6 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1241,6 +1241,16 @@ static int pvinfo_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 	case _vgtif_reg(g2v_notify):
 		ret = handle_g2v_notification(vgpu, data);
 		break;
+	case _vgtif_reg(enable_pvmmio):
+		if (i915_modparams.enable_pvmmio) {
+			vgpu_vreg(vgpu, offset) = data &
+				i915_modparams.enable_pvmmio;
+			DRM_INFO("vgpu id=%d pvmmio=0x%x\n",
+				vgpu->id, VGPU_PVMMIO(vgpu));
+		} else {
+			vgpu_vreg(vgpu, offset) = 0;
+		}
+		break;
 	/* add xhot and yhot to handled list to avoid error log */
 	case _vgtif_reg(cursor_x_hot):
 	case _vgtif_reg(cursor_y_hot):
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index a4e8e3c..1aed197 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -62,6 +62,8 @@ void populate_pvinfo_page(struct intel_vgpu *vgpu)
 	vgpu_vreg_t(vgpu, vgtif_reg(cursor_x_hot)) = UINT_MAX;
 	vgpu_vreg_t(vgpu, vgtif_reg(cursor_y_hot)) = UINT_MAX;
 
+	vgpu_vreg_t(vgpu, vgtif_reg(enable_pvmmio)) = 0;
+
 	gvt_dbg_core("Populate PVINFO PAGE for vGPU %d\n", vgpu->id);
 	gvt_dbg_core("aperture base [GMADR] 0x%llx size 0x%llx\n",
 		vgpu_aperture_gmadr_base(vgpu), vgpu_aperture_sz(vgpu));
@@ -490,6 +492,8 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
 	return vgpu;
 }
 
+#define _vgtif_reg(x) \
+	(VGT_PVINFO_PAGE + offsetof(struct vgt_if, x))
 /**
  * intel_gvt_reset_vgpu_locked - reset a virtual GPU by DMLR or GT reset
  * @vgpu: virtual GPU
@@ -524,6 +528,7 @@ void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
 	struct intel_gvt *gvt = vgpu->gvt;
 	struct intel_gvt_workload_scheduler *scheduler = &gvt->scheduler;
 	unsigned int resetting_eng = dmlr ? ALL_ENGINES : engine_mask;
+	int enable_pvmmio = vgpu_vreg(vgpu, _vgtif_reg(enable_pvmmio));
 
 	gvt_dbg_core("------------------------------------------\n");
 	gvt_dbg_core("resseting vgpu%d, dmlr %d, engine_mask %08x\n",
@@ -555,6 +560,7 @@ void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
 
 		intel_vgpu_reset_mmio(vgpu, dmlr);
 		populate_pvinfo_page(vgpu);
+		vgpu_vreg(vgpu, _vgtif_reg(enable_pvmmio)) = enable_pvmmio;
 		intel_vgpu_reset_display(vgpu);
 
 		if (dmlr) {
-- 
1.8.3.1
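
The enable_pvmmio handler in the patch above amounts to a capability negotiation: the value written by the guest is masked with the levels the host allows, and the negotiated value is carried across a vGPU reset. A minimal standalone sketch of that behaviour could look as follows. The flag bit values and the struct vgpu_model type are assumptions made for illustration only; the real flag definitions come from patch 01 and are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical flag values; the real ones are defined in patch 01. */
#define PVMMIO_ELSP_SUBMIT  (1u << 0)
#define PVMMIO_MASTER_IRQ   (1u << 1)
#define PVMMIO_PPGTT_UPDATE (1u << 2)

/* Illustrative stand-in for the per-vGPU PVINFO state. */
struct vgpu_model {
	uint32_t enable_pvmmio; /* negotiated levels, as the guest reads them back */
	uint32_t cursor_x_hot;  /* another PVINFO field that a reset re-initialises */
};

/* Keep only the levels that both the guest requests and the host enables. */
static uint32_t pvmmio_negotiate(uint32_t host_levels, uint32_t guest_request)
{
	return guest_request & host_levels;
}

/* Model of a guest write to the enable_pvmmio PVINFO register. */
static void pvinfo_write_enable_pvmmio(struct vgpu_model *v,
				       uint32_t host_levels, uint32_t data)
{
	v->enable_pvmmio = pvmmio_negotiate(host_levels, data);
}

/* Model of a reset: repopulate defaults, then restore the negotiated levels. */
static void vgpu_reset(struct vgpu_model *v)
{
	uint32_t saved = v->enable_pvmmio;

	v->enable_pvmmio = 0;
	v->cursor_x_hot = UINT32_MAX;
	v->enable_pvmmio = saved;
}
```

Masking with a host value of zero already yields zero, so in this model no separate "host disabled" branch is needed; in the handler above that branch mainly serves to skip the log message.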

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 07/10] drm/i915/gvt: GVTg read_shared_page implementation
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (10 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 06/10] drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27 16:37 ` [RFC 08/10] drm/i915/gvt: GVTg support context submission pvmmio optimization Xiaolin Zhang
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

Implement the read_shared_page functionality in GVTg on top of
hypervisor_read_gpa().

The guest driver passes shared_page_gpa to GVTg through the PVINFO
shared_page_gpa register.

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/gvt/gvt.h      |  4 +++-
 drivers/gpu/drm/i915/gvt/handlers.c |  5 +++++
 drivers/gpu/drm/i915/gvt/vgpu.c     | 14 ++++++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 31f6cdb..7562f75 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -232,6 +232,7 @@ struct intel_vgpu {
 	struct completion vblank_done;
 
 	u32 scan_nonprivbb;
+	u64 shared_page_gpa;
 };
 
 /* validating GM healthy status*/
@@ -690,7 +691,8 @@ static inline void intel_gvt_mmio_set_in_ctx(
 void intel_gvt_debugfs_remove_vgpu(struct intel_vgpu *vgpu);
 int intel_gvt_debugfs_init(struct intel_gvt *gvt);
 void intel_gvt_debugfs_clean(struct intel_gvt *gvt);
-
+void intel_gvt_read_shared_page(struct intel_vgpu *vgpu,
+		unsigned int offset, void *buf, unsigned long len);
 
 #include "trace.h"
 #include "mpt.h"
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index 36171e6..a229770 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1251,6 +1251,11 @@ static int pvinfo_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 			vgpu_vreg(vgpu, offset) = 0;
 		}
 		break;
+	case _vgtif_reg(shared_page_gpa.lo):
+	case _vgtif_reg(shared_page_gpa.hi):
+		vgpu->shared_page_gpa = vgpu_vreg64_t(vgpu,
+				vgtif_reg(shared_page_gpa));
+		break;
 	/* add xhot and yhot to handled list to avoid error log */
 	case _vgtif_reg(cursor_x_hot):
 	case _vgtif_reg(cursor_y_hot):
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index 1aed197..7101e45 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -589,3 +589,17 @@ void intel_gvt_reset_vgpu(struct intel_vgpu *vgpu)
 	intel_gvt_reset_vgpu_locked(vgpu, true, 0);
 	mutex_unlock(&vgpu->vgpu_lock);
 }
+
+/**
+ * intel_gvt_read_shared_page - read content from shared page
+ */
+void intel_gvt_read_shared_page(struct intel_vgpu *vgpu,
+		unsigned int offset, void *buf, unsigned long len)
+{
+	int ret = 0;
+	unsigned long gpa = vgpu->shared_page_gpa + offset;
+
+	ret = intel_gvt_hypervisor_read_gpa(vgpu, gpa, buf, len);
+	if (ret)
+		gvt_vgpu_err("read shared page (offset %x) failed", offset);
+}
-- 
1.8.3.1
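
The shared page read in the patch above has two parts: the page address arrives as two 32-bit PVINFO halves (shared_page_gpa.lo and shared_page_gpa.hi), and each read adds an offset to that address before calling the hypervisor read_gpa hook. The following standalone sketch models both steps. Guest memory is modelled as a flat byte array, and model_read_gpa(), struct guest_mem and the 4K bounds check are assumptions of this sketch, not part of the GVT API. Unlike the posted helper, which returns void and only logs a failure, the sketch returns an error code so that a caller could react to a failed read.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHARED_PAGE_SIZE 4096u

/* Illustrative model of guest physical memory: a flat byte array. */
struct guest_mem {
	const uint8_t *bytes;
	uint64_t size;
};

/* Stand-in for the hypervisor read_gpa hook: copy len bytes from gpa. */
static int model_read_gpa(const struct guest_mem *mem, uint64_t gpa,
			  void *buf, size_t len)
{
	if (gpa > mem->size || len > mem->size - gpa)
		return -EFAULT;
	memcpy(buf, mem->bytes + gpa, len);
	return 0;
}

/* Combine the two 32-bit PVINFO halves into the 64-bit page address. */
static uint64_t shared_page_gpa(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Read from the shared page, rejecting accesses that leave the 4K page. */
static int shared_page_read(const struct guest_mem *mem, uint64_t page_gpa,
			    unsigned int offset, void *buf, size_t len)
{
	if (offset > SHARED_PAGE_SIZE || len > SHARED_PAGE_SIZE - offset)
		return -EINVAL;
	return model_read_gpa(mem, page_gpa + offset, buf, len);
}
```

The bounds check is a design choice of the sketch: the offset comes from offsetof() in the callers shown in this series, so it is only a guard against a future caller passing a length that runs past the page.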


* [RFC 08/10] drm/i915/gvt: GVTg support context submission pvmmio optimization
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (11 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 07/10] drm/i915/gvt: GVTg read_shared_page implementation Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27 16:37 ` [RFC 09/10] drm/i915/gvt: GVTg support master irq " Xiaolin Zhang
  2018-09-27 16:37 ` [RFC 10/10] drm/i915/gvt: GVTg support ppgtt " Xiaolin Zhang
  14 siblings, 0 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

Implement the context submission pvmmio optimization in GVTg.

GVTg reads the context submission data (elsp_data) directly from the
shared_page, without trap cost, to improve guest GPU performance.

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/gvt/handlers.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index a229770..9ddb78e 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1647,13 +1647,25 @@ static int elsp_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 	int ring_id = intel_gvt_render_mmio_to_ring_id(vgpu->gvt, offset);
 	struct intel_vgpu_execlist *execlist;
 	u32 data = *(u32 *)p_data;
+	u32 elsp_data[4];
 	int ret = 0;
+	u32 elsp_data_off;
 
 	if (WARN_ON(ring_id < 0 || ring_id >= I915_NUM_ENGINES))
 		return -EINVAL;
 
 	execlist = &vgpu->submission.execlist[ring_id];
 
+	if (VGPU_PVMMIO(vgpu) & PVMMIO_ELSP_SUBMIT) {
+		elsp_data_off = offsetof(struct gvt_shared_page, elsp_data);
+		intel_gvt_read_shared_page(vgpu, elsp_data_off, &elsp_data, 16);
+		execlist->elsp_dwords.data[3] = elsp_data[0];
+		execlist->elsp_dwords.data[2] = elsp_data[1];
+		execlist->elsp_dwords.data[1] = elsp_data[2];
+		execlist->elsp_dwords.data[0] = data;
+		return intel_vgpu_submit_execlist(vgpu, ring_id);
+	}
+
 	execlist->elsp_dwords.data[3 - execlist->elsp_dwords.index] = data;
 	if (execlist->elsp_dwords.index == 3) {
 		ret = intel_vgpu_submit_execlist(vgpu, ring_id);
-- 
1.8.3.1
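
The PV branch added to elsp_mmio_write() above fills the same four-dword buffer as the legacy path, just from a different source: three dwords come from the shared page and the fourth is the single trapped write. A small standalone model of both paths, assuming the legacy path wraps its index after the fourth write, shows that they end up with identical contents. The struct and function names are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Model of the four-dword ELSP buffer; data[3] holds the first dword written. */
struct elsp_dwords_model {
	uint32_t data[4];
	unsigned int index;
};

/* Legacy path: one trapped write per dword; returns 1 once all four arrived. */
static int elsp_write_trapped(struct elsp_dwords_model *e, uint32_t data)
{
	e->data[3 - e->index] = data;
	if (e->index == 3) {
		e->index = 0;
		return 1;
	}
	e->index++;
	return 0;
}

/* PV path: three dwords from the shared page plus the one trapped dword. */
static void elsp_write_pv(struct elsp_dwords_model *e,
			  const uint32_t shared[3], uint32_t trapped)
{
	e->data[3] = shared[0];
	e->data[2] = shared[1];
	e->data[1] = shared[2];
	e->data[0] = trapped;
}
```

In this model the PV path is ready to submit after one call, where the legacy path needs four calls before it reports a complete list.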


* [RFC 09/10] drm/i915/gvt: GVTg support master irq pvmmio optimization
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (12 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 08/10] drm/i915/gvt: GVTg support context submission pvmmio optimization Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  2018-09-27 16:37 ` [RFC 10/10] drm/i915/gvt: GVTg support ppgtt " Xiaolin Zhang
  14 siblings, 0 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

GVTg checks the master irq status in the shared_page instead
of the register.

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/gvt/handlers.c  |  4 ++++
 drivers/gpu/drm/i915/gvt/interrupt.c | 17 +++++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index 9ddb78e..a915b72 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1230,6 +1230,7 @@ static int pvinfo_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 {
 	u32 data;
 	int ret;
+	struct intel_gvt_irq_ops *ops = vgpu->gvt->irq.ops;
 
 	write_vreg(vgpu, offset, p_data, bytes);
 	data = vgpu_vreg(vgpu, offset);
@@ -1256,6 +1257,9 @@ static int pvinfo_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 		vgpu->shared_page_gpa = vgpu_vreg64_t(vgpu,
 				vgtif_reg(shared_page_gpa));
 		break;
+	case _vgtif_reg(check_pending_irq):
+		ops->check_pending_irq(vgpu);
+		break;
 	/* add xhot and yhot to handled list to avoid error log */
 	case _vgtif_reg(cursor_x_hot):
 	case _vgtif_reg(cursor_y_hot):
diff --git a/drivers/gpu/drm/i915/gvt/interrupt.c b/drivers/gpu/drm/i915/gvt/interrupt.c
index 5daa23a..c1884f8 100644
--- a/drivers/gpu/drm/i915/gvt/interrupt.c
+++ b/drivers/gpu/drm/i915/gvt/interrupt.c
@@ -465,10 +465,19 @@ static void gen8_check_pending_irq(struct intel_vgpu *vgpu)
 {
 	struct intel_gvt_irq *irq = &vgpu->gvt->irq;
 	int i;
-
-	if (!(vgpu_vreg(vgpu, i915_mmio_reg_offset(GEN8_MASTER_IRQ)) &
-				GEN8_MASTER_IRQ_CONTROL))
-		return;
+	u32 offset;
+	u32 disable_irq;
+
+	if (VGPU_PVMMIO(vgpu) & PVMMIO_MASTER_IRQ) {
+		offset = offsetof(struct gvt_shared_page, disable_irq);
+		intel_gvt_read_shared_page(vgpu, offset, &disable_irq, 4);
+		if (disable_irq)
+			return;
+	} else {
+		if (!(vgpu_vreg(vgpu, i915_mmio_reg_offset(GEN8_MASTER_IRQ)) &
+		       GEN8_MASTER_IRQ_CONTROL))
+			return;
+	}
 
 	for_each_set_bit(i, irq->irq_info_bitmap, INTEL_GVT_IRQ_INFO_MAX) {
 		struct intel_gvt_irq_info *info = irq->info[i];
-- 
1.8.3.1
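
The check added to gen8_check_pending_irq() above picks one of two sources and the two have opposite polarity: the shared page carries a "disable" flag, while the virtual register carries an "enable" bit. The following standalone predicate is a sketch of that decision. The PVMMIO_MASTER_IRQ bit value is hypothetical, and the master enable bit is assumed here to be bit 31.

```c
#include <stdbool.h>
#include <stdint.h>

#define PVMMIO_MASTER_IRQ  (1u << 1)  /* hypothetical bit value */
#define MASTER_IRQ_CONTROL (1u << 31) /* assumed position of the master enable bit */

/*
 * Return true when pending interrupts should be scanned and forwarded.
 * With the PV level enabled, the guest's disable_irq word in the shared
 * page decides; otherwise the emulated master IRQ register does.
 */
static bool irq_check_should_proceed(uint32_t pvmmio_levels,
				     uint32_t shared_disable_irq,
				     uint32_t master_irq_vreg)
{
	if (pvmmio_levels & PVMMIO_MASTER_IRQ)
		return shared_disable_irq == 0;
	return (master_irq_vreg & MASTER_IRQ_CONTROL) != 0;
}
```

When the PV level is active, the register value is ignored entirely in this model, which mirrors the if/else structure of the hunk above.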


* [RFC 10/10] drm/i915/gvt: GVTg support ppgtt pvmmio optimization
  2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
                   ` (13 preceding siblings ...)
  2018-09-27 16:37 ` [RFC 09/10] drm/i915/gvt: GVTg support master irq " Xiaolin Zhang
@ 2018-09-27 16:37 ` Xiaolin Zhang
  14 siblings, 0 replies; 28+ messages in thread
From: Xiaolin Zhang @ 2018-09-27 16:37 UTC (permalink / raw)
  To: intel-gvt-dev, intel-gfx
  Cc: zhenyu.z.wang, hang.yuan, joonas.lahtinen, fei.jiang, zhiyuan.lv

This patch handles ppgtt updates from g2v notifications.

It reads the ppgtt pte entries from the guest pte table pages and
converts them to host pfns.

It creates local ppgtt tables and inserts the content pages
into them directly. This avoids tracking the usage of the guest
page tables and removes the cost of write protection incurred by
the original shadow page mechanism.

Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
---
 drivers/gpu/drm/i915/gvt/gtt.c      | 318 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/gvt/gtt.h      |   9 +
 drivers/gpu/drm/i915/gvt/handlers.c |  13 +-
 3 files changed, 338 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 2402395..5a0c71b 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -1744,6 +1744,26 @@ static int ppgtt_handle_guest_write_page_table_bytes(
 	return 0;
 }
 
+static void invalidate_mm_pv(struct intel_vgpu_mm *mm)
+{
+	struct intel_vgpu *vgpu = mm->vgpu;
+	struct intel_gvt *gvt = vgpu->gvt;
+	struct intel_gvt_gtt *gtt = &gvt->gtt;
+	struct intel_gvt_gtt_pte_ops *ops = gtt->pte_ops;
+	struct intel_gvt_gtt_entry se;
+
+	i915_ppgtt_close(&mm->ppgtt->vm);
+	i915_ppgtt_put(mm->ppgtt);
+
+	ppgtt_get_shadow_root_entry(mm, &se, 0);
+	if (!ops->test_present(&se))
+		return;
+	se.val64 = 0;
+	ppgtt_set_shadow_root_entry(mm, &se, 0);
+
+	mm->ppgtt_mm.shadowed  = false;
+}
+
 static void invalidate_ppgtt_mm(struct intel_vgpu_mm *mm)
 {
 	struct intel_vgpu *vgpu = mm->vgpu;
@@ -1756,6 +1776,11 @@ static void invalidate_ppgtt_mm(struct intel_vgpu_mm *mm)
 	if (!mm->ppgtt_mm.shadowed)
 		return;
 
+	if (VGPU_PVMMIO(mm->vgpu) & PVMMIO_PPGTT_UPDATE) {
+		invalidate_mm_pv(mm);
+		return;
+	}
+
 	for (index = 0; index < ARRAY_SIZE(mm->ppgtt_mm.shadow_pdps); index++) {
 		ppgtt_get_shadow_root_entry(mm, &se, index);
 
@@ -1773,6 +1798,26 @@ static void invalidate_ppgtt_mm(struct intel_vgpu_mm *mm)
 	mm->ppgtt_mm.shadowed = false;
 }
 
+static int shadow_mm_pv(struct intel_vgpu_mm *mm)
+{
+	struct intel_vgpu *vgpu = mm->vgpu;
+	struct intel_gvt *gvt = vgpu->gvt;
+	struct intel_gvt_gtt_entry se;
+
+	mm->ppgtt = i915_ppgtt_create(gvt->dev_priv, NULL);
+	if (IS_ERR(mm->ppgtt)) {
+		gvt_vgpu_err("fail to create ppgtt for pdp 0x%llx\n",
+				px_dma(&mm->ppgtt->pml4));
+		return PTR_ERR(mm->ppgtt);
+	}
+
+	se.type = GTT_TYPE_PPGTT_ROOT_L4_ENTRY;
+	se.val64 = px_dma(&mm->ppgtt->pml4);
+	ppgtt_set_shadow_root_entry(mm, &se, 0);
+	mm->ppgtt_mm.shadowed  = true;
+
+	return 0;
+}
 
 static int shadow_ppgtt_mm(struct intel_vgpu_mm *mm)
 {
@@ -1787,6 +1832,9 @@ static int shadow_ppgtt_mm(struct intel_vgpu_mm *mm)
 	if (mm->ppgtt_mm.shadowed)
 		return 0;
 
+	if (VGPU_PVMMIO(mm->vgpu) & PVMMIO_PPGTT_UPDATE)
+		return shadow_mm_pv(mm);
+
 	mm->ppgtt_mm.shadowed = true;
 
 	for (index = 0; index < ARRAY_SIZE(mm->ppgtt_mm.guest_pdps); index++) {
@@ -2766,3 +2814,273 @@ void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu)
 	intel_vgpu_destroy_all_ppgtt_mm(vgpu);
 	intel_vgpu_reset_ggtt(vgpu, true);
 }
+
+int intel_vgpu_g2v_pv_ppgtt_alloc_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[])
+{
+	struct intel_vgpu_mm *mm;
+	int ret = 0;
+	u32 offset;
+	struct pv_ppgtt_update pv_ppgtt;
+
+	offset = offsetof(struct gvt_shared_page, pv_ppgtt);
+	intel_gvt_read_shared_page(vgpu, offset, &pv_ppgtt, sizeof(pv_ppgtt));
+
+	mm = intel_vgpu_find_ppgtt_mm(vgpu, &pv_ppgtt.pdp);
+	if (!mm) {
+		gvt_vgpu_err("failed to find pdp 0x%llx\n", pv_ppgtt.pdp);
+		ret = -EINVAL;
+	} else {
+		ret = mm->ppgtt->vm.allocate_va_range(&mm->ppgtt->vm,
+			pv_ppgtt.start, pv_ppgtt.length);
+		if (ret)
+			gvt_vgpu_err("failed to alloc %llx\n", pv_ppgtt.pdp);
+	}
+
+	return ret;
+}
+
+int intel_vgpu_g2v_pv_ppgtt_clear_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[])
+{
+	struct intel_vgpu_mm *mm;
+	int ret = 0;
+	u32 offset;
+	struct pv_ppgtt_update pv_ppgtt;
+
+	offset = offsetof(struct gvt_shared_page, pv_ppgtt);
+	intel_gvt_read_shared_page(vgpu, offset, &pv_ppgtt, sizeof(pv_ppgtt));
+	mm = intel_vgpu_find_ppgtt_mm(vgpu, &pv_ppgtt.pdp);
+	if (!mm) {
+		gvt_vgpu_err("failed to find pdp 0x%llx\n", pv_ppgtt.pdp);
+		ret = -EINVAL;
+	} else {
+		mm->ppgtt->vm.clear_range(&mm->ppgtt->vm,
+			pv_ppgtt.start, pv_ppgtt.length);
+	}
+
+	return ret;
+}
+
+#define GEN8_PML4E_SIZE		(1UL << GEN8_PML4E_SHIFT)
+#define GEN8_PML4E_SIZE_MASK	(~(GEN8_PML4E_SIZE - 1))
+#define GEN8_PDPE_SIZE		(1UL << GEN8_PDPE_SHIFT)
+#define GEN8_PDPE_SIZE_MASK	(~(GEN8_PDPE_SIZE - 1))
+#define GEN8_PDE_SIZE		(1UL << GEN8_PDE_SHIFT)
+#define GEN8_PDE_SIZE_MASK	(~(GEN8_PDE_SIZE - 1))
+
+#define pml4_addr_end(addr, end)					\
+({	unsigned long __boundary = \
+			((addr) + GEN8_PML4E_SIZE) & GEN8_PML4E_SIZE_MASK; \
+	(__boundary < (end)) ? __boundary : (end);		\
+})
+
+#define pdp_addr_end(addr, end)						\
+({	unsigned long __boundary = \
+			((addr) + GEN8_PDPE_SIZE) & GEN8_PDPE_SIZE_MASK; \
+	(__boundary < (end)) ? __boundary : (end);		\
+})
+
+#define pd_addr_end(addr, end)						\
+({	unsigned long __boundary = \
+			((addr) + GEN8_PDE_SIZE) & GEN8_PDE_SIZE_MASK;	\
+	(__boundary < (end)) ? __boundary : (end);		\
+})
+
+struct ppgtt_walk {
+	unsigned long *mfns;
+	int mfn_index;
+	unsigned long *pt;
+};
+
+static int walk_pt_range(struct intel_vgpu *vgpu, u64 pt,
+				u64 start, u64 end, struct ppgtt_walk *walk)
+{
+	const struct intel_gvt_device_info *info = &vgpu->gvt->device_info;
+	struct intel_gvt_gtt_gma_ops *gma_ops = vgpu->gvt->gtt.gma_ops;
+	unsigned long start_index, end_index;
+	int ret;
+	int i;
+	unsigned long mfn, gfn;
+
+	start_index = gma_ops->gma_to_pte_index(start);
+	end_index = ((end - start) >> PAGE_SHIFT) + start_index;
+
+	ret = intel_gvt_hypervisor_read_gpa(vgpu,
+		(pt & PAGE_MASK) + (start_index << info->gtt_entry_size_shift),
+		walk->pt + start_index,
+		(end_index - start_index) << info->gtt_entry_size_shift);
+	if (ret) {
+		gvt_vgpu_err("fail to read gpa %llx\n", pt);
+		return ret;
+	}
+
+	for (i = start_index; i < end_index; i++) {
+		gfn = walk->pt[i] >> PAGE_SHIFT;
+		mfn = intel_gvt_hypervisor_gfn_to_mfn(vgpu, gfn);
+		if (mfn == INTEL_GVT_INVALID_ADDR) {
+			gvt_vgpu_err("fail to translate gfn: 0x%lx\n", gfn);
+			return -ENXIO;
+		}
+		walk->mfns[walk->mfn_index++] = mfn << PAGE_SHIFT;
+	}
+
+	return 0;
+}
+
+
+static int walk_pd_range(struct intel_vgpu *vgpu, u64 pd,
+				u64 start, u64 end, struct ppgtt_walk *walk)
+{
+	const struct intel_gvt_device_info *info = &vgpu->gvt->device_info;
+	struct intel_gvt_gtt_gma_ops *gma_ops = vgpu->gvt->gtt.gma_ops;
+	unsigned long index;
+	u64 pt, next;
+	int ret  = 0;
+
+	do {
+		index = gma_ops->gma_to_pde_index(start);
+
+		ret = intel_gvt_hypervisor_read_gpa(vgpu,
+			(pd & PAGE_MASK) + (index <<
+			info->gtt_entry_size_shift), &pt, 8);
+		if (ret)
+			return ret;
+		next = pd_addr_end(start, end);
+		walk_pt_range(vgpu, pt, start, next, walk);
+
+		start = next;
+	} while (start != end);
+
+	return ret;
+}
+
+
+static int walk_pdp_range(struct intel_vgpu *vgpu, u64 pdp,
+				  u64 start, u64 end, struct ppgtt_walk *walk)
+{
+	const struct intel_gvt_device_info *info = &vgpu->gvt->device_info;
+	struct intel_gvt_gtt_gma_ops *gma_ops = vgpu->gvt->gtt.gma_ops;
+	unsigned long index;
+	u64 pd, next;
+	int ret  = 0;
+
+	do {
+		index = gma_ops->gma_to_l4_pdp_index(start);
+
+		ret = intel_gvt_hypervisor_read_gpa(vgpu,
+			(pdp & PAGE_MASK) + (index <<
+			info->gtt_entry_size_shift), &pd, 8);
+		if (ret)
+			return ret;
+		next = pdp_addr_end(start, end);
+		walk_pd_range(vgpu, pd, start, next, walk);
+		start = next;
+	} while (start != end);
+
+	return ret;
+}
+
+
+static int walk_pml4_range(struct intel_vgpu *vgpu, u64 pml4,
+				u64 start, u64 end, struct ppgtt_walk *walk)
+{
+	const struct intel_gvt_device_info *info = &vgpu->gvt->device_info;
+	struct intel_gvt_gtt_gma_ops *gma_ops = vgpu->gvt->gtt.gma_ops;
+	unsigned long index;
+	u64 pdp, next;
+	int ret  = 0;
+
+	do {
+		index = gma_ops->gma_to_pml4_index(start);
+		ret = intel_gvt_hypervisor_read_gpa(vgpu,
+			(pml4 & PAGE_MASK) + (index <<
+			info->gtt_entry_size_shift), &pdp, 8);
+		if (ret)
+			return ret;
+		next = pml4_addr_end(start, end);
+		walk_pdp_range(vgpu, pdp, start, next, walk);
+		start = next;
+	} while (start != end);
+
+	return ret;
+}
+
+int intel_vgpu_g2v_pv_ppgtt_insert_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[])
+{
+	struct intel_vgpu_mm *mm;
+	u64 pml4, start, length;
+	u32 cache_level;
+	int ret = 0;
+	struct sg_table st;
+	struct scatterlist *sg = NULL;
+	int num_pages;
+	struct i915_vma vma;
+	struct ppgtt_walk walk;
+	int i;
+	u32 offset;
+	struct pv_ppgtt_update pv_ppgtt;
+
+	offset = offsetof(struct gvt_shared_page, pv_ppgtt);
+	intel_gvt_read_shared_page(vgpu, offset, &pv_ppgtt, sizeof(pv_ppgtt));
+	pml4 = pv_ppgtt.pdp;
+	start = pv_ppgtt.start;
+	length = pv_ppgtt.length;
+	cache_level = pv_ppgtt.cache_level;
+	num_pages = length >> PAGE_SHIFT;
+
+	mm = intel_vgpu_find_ppgtt_mm(vgpu, &pml4);
+	if (!mm) {
+		gvt_vgpu_err("fail to find mm for pml4 0x%llx\n", pml4);
+		return -EINVAL;
+	}
+
+	walk.mfn_index = 0;
+	walk.mfns = NULL;
+	walk.pt = NULL;
+
+	walk.mfns = kmalloc_array(num_pages,
+			sizeof(unsigned long), GFP_KERNEL);
+	if (!walk.mfns) {
+		ret = -ENOMEM;
+		goto fail;
+	}
+
+	walk.pt = (unsigned long *)__get_free_pages(GFP_KERNEL, 0);
+	if (!walk.pt) {
+		ret = -ENOMEM;
+		goto fail;
+	}
+
+	if (sg_alloc_table(&st, num_pages, GFP_KERNEL)) {
+		ret = -ENOMEM;
+		goto fail;
+	}
+
+	ret = walk_pml4_range(vgpu, pml4, start, start + length, &walk);
+	if (ret)
+		goto fail_free_sg;
+
+	WARN_ON(num_pages != walk.mfn_index);
+
+	for_each_sg(st.sgl, sg, num_pages, i) {
+		sg->offset = 0;
+		sg->length = PAGE_SIZE;
+		sg_dma_address(sg) = walk.mfns[i];
+		sg_dma_len(sg) = PAGE_SIZE;
+	}
+
+	memset(&vma, 0, sizeof(vma));
+	vma.node.start = start;
+	vma.pages = &st;
+	mm->ppgtt->vm.insert_entries(&mm->ppgtt->vm, &vma, cache_level, 0);
+
+fail_free_sg:
+	sg_free_table(&st);
+fail:
+	kfree(walk.mfns);
+	free_page((unsigned long)walk.pt);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h
index 7a9b361..b25510f 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.h
+++ b/drivers/gpu/drm/i915/gvt/gtt.h
@@ -135,6 +135,7 @@ enum intel_gvt_mm_type {
 
 struct intel_vgpu_mm {
 	enum intel_gvt_mm_type type;
+	struct i915_hw_ppgtt *ppgtt;
 	struct intel_vgpu *vgpu;
 
 	struct kref ref;
@@ -272,4 +273,12 @@ int intel_vgpu_emulate_ggtt_mmio_read(struct intel_vgpu *vgpu,
 int intel_vgpu_emulate_ggtt_mmio_write(struct intel_vgpu *vgpu,
 	unsigned int off, void *p_data, unsigned int bytes);
 
+int intel_vgpu_g2v_pv_ppgtt_alloc_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[]);
+
+int intel_vgpu_g2v_pv_ppgtt_clear_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[]);
+
+int intel_vgpu_g2v_pv_ppgtt_insert_4lvl(struct intel_vgpu *vgpu,
+		u64 pdps[]);
 #endif /* _GVT_GTT_H_ */
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index a915b72..1528ce7 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1185,7 +1185,7 @@ static int handle_g2v_notification(struct intel_vgpu *vgpu, int notification)
 	intel_gvt_gtt_type_t root_entry_type = GTT_TYPE_PPGTT_ROOT_L4_ENTRY;
 	struct intel_vgpu_mm *mm;
 	u64 *pdps;
-
+	int ret = 0;
 	pdps = (u64 *)&vgpu_vreg64_t(vgpu, vgtif_reg(pdp[0]));
 
 	switch (notification) {
@@ -1198,6 +1198,15 @@ static int handle_g2v_notification(struct intel_vgpu *vgpu, int notification)
 	case VGT_G2V_PPGTT_L3_PAGE_TABLE_DESTROY:
 	case VGT_G2V_PPGTT_L4_PAGE_TABLE_DESTROY:
 		return intel_vgpu_put_ppgtt_mm(vgpu, pdps);
+	case VGT_G2V_PPGTT_L4_ALLOC:
+		ret = intel_vgpu_g2v_pv_ppgtt_alloc_4lvl(vgpu, pdps);
+			break;
+	case VGT_G2V_PPGTT_L4_INSERT:
+		ret = intel_vgpu_g2v_pv_ppgtt_insert_4lvl(vgpu, pdps);
+		break;
+	case VGT_G2V_PPGTT_L4_CLEAR:
+		ret = intel_vgpu_g2v_pv_ppgtt_clear_4lvl(vgpu, pdps);
+		break;
 	case VGT_G2V_EXECLIST_CONTEXT_CREATE:
 	case VGT_G2V_EXECLIST_CONTEXT_DESTROY:
 	case 1:	/* Remove this in guest driver. */
@@ -1205,7 +1214,7 @@ static int handle_g2v_notification(struct intel_vgpu *vgpu, int notification)
 	default:
 		gvt_vgpu_err("Invalid PV notification %d\n", notification);
 	}
-	return 0;
+	return ret;
 }
 
 static int send_display_ready_uevent(struct intel_vgpu *vgpu, int ready)
-- 
1.8.3.1
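
The walk_*_range() helpers in the patch above all follow the same pattern: split [start, end) at the next aligned boundary of the current level, handle that chunk, then continue from the boundary. The standalone sketch below shows the boundary computation as one function instead of three macros, plus a loop that counts how many 4K pages and how many page tables a walk at the page-directory level would touch. The shift values (21, 30, 39) are assumed to match the gen8 4-level layout, and count_pages() is a hypothetical helper for illustration; addresses close to the top of the 64-bit range, where addr + size would wrap, are not handled.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12
#define PDE_SHIFT     21 /* one page table covers 2 MiB */
#define PDPE_SHIFT    30 /* one page directory covers 1 GiB */
#define PML4E_SHIFT   39 /* one PDP covers 512 GiB */

/* End of the current 2^shift-aligned chunk, clamped to end. */
static uint64_t addr_end(uint64_t addr, uint64_t end, unsigned int shift)
{
	uint64_t size = UINT64_C(1) << shift;
	uint64_t boundary = (addr + size) & ~(size - 1);

	return boundary < end ? boundary : end;
}

/*
 * Count the 4K pages in [start, end) the way a per-page-table walk
 * would visit them; *tables receives the number of page tables touched.
 */
static uint64_t count_pages(uint64_t start, uint64_t end, unsigned int *tables)
{
	uint64_t pages = 0;

	*tables = 0;
	while (start < end) {
		uint64_t next = addr_end(start, end, PDE_SHIFT);

		pages += (next - start) >> PAGE_SHIFT_4K;
		(*tables)++;
		start = next;
	}
	return pages;
}
```

The sketch uses a while loop rather than do/while so that an empty range touches no table at all. A real walk would also want each level to return the error of the level below it, so that a failed translation stops the whole walk.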


* Re: [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization
  2018-09-27  7:19   ` Chris Wilson
@ 2018-09-28  5:31     ` Zhang, Xiaolin
  0 siblings, 0 replies; 28+ messages in thread
From: Zhang, Xiaolin @ 2018-09-28  5:31 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, intel-gvt-dev
  Cc: Lahtinen, Joonas, Lv, Zhiyuan, Jiang, Fei, Wang, Zhenyu Z, Yuan, Hang

On 09/27/2018 03:21 PM, Chris Wilson wrote:
> Quoting Xiaolin Zhang (2018-09-27 17:37:48)
>> This is a performance optimization that reduces the number of mmio
>> traps from 4 to 1 during ELSP port writes (context submission).
>>
>> On context submission, the elsp_data[4] values are cached in
>> the shared page; the last elsp_data[0] port write is trapped
>> to gvt for the real context submission.
>>
>> Use PVMMIO_ELSP_SUBMIT to control this level of pvmmio optimization.
>> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_params.h |  2 +-
>>  drivers/gpu/drm/i915/intel_lrc.c   | 37 ++++++++++++++++++++++++++++++++++++-
>>  2 files changed, 37 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
>> index 0f6a38b..6c81c87 100644
>> --- a/drivers/gpu/drm/i915/i915_params.h
>> +++ b/drivers/gpu/drm/i915/i915_params.h
>> @@ -69,7 +69,7 @@
>>         param(bool, enable_dp_mst, true) \
>>         param(bool, enable_dpcd_backlight, false) \
>>         param(bool, enable_gvt, false) \
>> -       param(int, enable_pvmmio, 0)
>> +       param(int, enable_pvmmio, PVMMIO_ELSP_SUBMIT)
>>  
>>  #define MEMBER(T, member, ...) T member;
>>  struct i915_params {
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 4b28225..cdc713c 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -451,7 +451,12 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>>  {
>>         struct intel_engine_execlists *execlists = &engine->execlists;
>>         struct execlist_port *port = execlists->port;
>> +       u32 __iomem *elsp =
>> +               engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
>> +       u32 *elsp_data;
>>         unsigned int n;
>> +       u32 descs[4];
>> +       int i = 0;
>>  
>>         /*
>>          * We can skip acquiring intel_runtime_pm_get() here as it was taken
>> @@ -494,8 +499,24 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>>                         GEM_BUG_ON(!n);
>>                         desc = 0;
>>                 }
>> +               if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
>> +                       GEM_BUG_ON(i >= 4);
>> +                       descs[i] = upper_32_bits(desc);
>> +                       descs[i + 1] = lower_32_bits(desc);
>> +                       i += 2;
>> +               } else {
>> +                       write_desc(execlists, desc, n);
>> +               }
>> +       }
>>  
>> -               write_desc(execlists, desc, n);
>> +       if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
>> +               spin_lock(&engine->i915->shared_page_lock);
>> +               elsp_data = engine->i915->shared_page->elsp_data;
>> +               *elsp_data = descs[0];
>> +               *(elsp_data + 1) = descs[1];
>> +               *(elsp_data + 2) = descs[2];
>> +               writel(descs[3], elsp);
>> +               spin_unlock(&engine->i915->shared_page_lock);
>>         }
>>  
>>         /* we need to manually load the submit queue */
>> @@ -538,11 +559,25 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>>         struct intel_engine_execlists *execlists = &engine->execlists;
>>         struct intel_context *ce =
>>                 to_intel_context(engine->i915->preempt_context, engine);
>> +       u32 __iomem *elsp =
>> +               engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
>> +       u32 *elsp_data;
>>         unsigned int n;
>>  
>>         GEM_BUG_ON(execlists->preempt_complete_status !=
>>                    upper_32_bits(ce->lrc_desc));
>>  
>> +       if (PVMMIO_LEVEL_ENABLE(engine->i915, PVMMIO_ELSP_SUBMIT)) {
>> +               spin_lock(&engine->i915->shared_page_lock);
>> +               elsp_data = engine->i915->shared_page->elsp_data;
>> +               *elsp_data = 0;
>> +               *(elsp_data + 1) = 0;
>> +               *(elsp_data + 2) = upper_32_bits(ce->lrc_desc);
>> +               writel(lower_32_bits(ce->lrc_desc), elsp);
>> +               spin_unlock(&engine->i915->shared_page_lock);
>> +               return;
>> +       }
>> +
> Really?
> -Chris
In the normal case, write_desc updates the port[0] and port[1] descriptors,
so 4 mmio writes in total are issued and trapped.
In the PVMMIO_ELSP_SUBMIT case, descs[0] ~ descs[2] are cached in elsp_data[4]
in the shared_page, and only one mmio access,
"writel(descs[3], elsp)", traps to GVTg for workload submission; GVTg
then reads descs[0] ~ descs[2] from the shared_page.
>
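
The guest-side flow described in this reply can be sketched as two small steps: split the two 64-bit port descriptors into four dwords (port 1 first, as in the quoted inject_preempt_context hunk), then store three of them to shared memory and issue a single register write. The code below is a standalone model under those assumptions. pack_elsp_descs() and submit_pv() are hypothetical names, and a plain pointer store stands in for writel().

```c
#include <stdint.h>

/* Split two 64-bit descriptors into the four ELSP dwords, port 1 first. */
static void pack_elsp_descs(uint64_t desc_port0, uint64_t desc_port1,
			    uint32_t descs[4])
{
	descs[0] = (uint32_t)(desc_port1 >> 32);
	descs[1] = (uint32_t)desc_port1;
	descs[2] = (uint32_t)(desc_port0 >> 32);
	descs[3] = (uint32_t)desc_port0;
}

/*
 * PV submit: three plain stores to the shared page, then one register
 * write. Only the register write would trap to the host, so the return
 * value (number of trapped accesses) is always 1.
 */
static unsigned int submit_pv(const uint32_t descs[4],
			      uint32_t shared_elsp_data[3],
			      uint32_t *elsp_reg)
{
	shared_elsp_data[0] = descs[0];
	shared_elsp_data[1] = descs[1];
	shared_elsp_data[2] = descs[2];
	*elsp_reg = descs[3]; /* stands in for writel(descs[3], elsp) */
	return 1;
}
```

The ordering matters in this model: the shared page has to be fully written before the register write, because that write is what prompts the host to read the other three dwords.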


* Re: [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio
  2018-09-27 11:03   ` Joonas Lahtinen
@ 2018-09-28  6:09     ` Zhang, Xiaolin
  2018-10-09  2:26       ` Zhenyu Wang
  0 siblings, 1 reply; 28+ messages in thread
From: Zhang, Xiaolin @ 2018-09-28  6:09 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx, intel-gvt-dev
  Cc: Lahtinen, Joonas, Lv, Zhiyuan, Jiang, Fei, Wang, Zhenyu Z, Yuan, Hang

On 09/27/2018 07:03 PM, Joonas Lahtinen wrote:
> Quoting Xiaolin Zhang (2018-09-27 19:37:46)
>> This int type module parameter is used to control the different
>> level pvmmio feature for MMIO emulation in GVT.
>>
>> This parameter is default zero, no pvmmio feature enabled.
>>
>> Its permission type is 0400 which means user could only change its
>> value through the cmdline, this is to prevent the dynamic modification
>> during runtime which would break the pvmmio internal logic.
>>
>> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
> This shouldn't really be a module parameter. We should detect the
> capability from the vGPU device and use it always when possible.
>
> Regards, Joonas
>
The pv optimization touches both the guest driver and GVTg. This
parameter controls the guest pv capability, because a pv-capable GVTg
has to support both pv-capable and non-pv guests.

BRs, Xiaolin


* Re: [RFC 00/10] i915 pvmmio to improve GVTg performance
  2018-09-27 11:07 ` [RFC 00/10] " Joonas Lahtinen
@ 2018-09-28  6:11   ` Zhang, Xiaolin
  0 siblings, 0 replies; 28+ messages in thread
From: Zhang, Xiaolin @ 2018-09-28  6:11 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx, intel-gvt-dev
  Cc: Lahtinen, Joonas, Lv, Zhiyuan, Jiang, Fei, Wang, Zhenyu Z, Yuan, Hang

On 09/27/2018 07:07 PM, Joonas Lahtinen wrote:
> Quoting Xiaolin Zhang (2018-09-27 19:37:45)
>> To improve GVTg performance, the number of mmio access traps in the
>> guest driver can be reduced in certain scenarios, since each mmio
>> access trap introduces vm exit/vm enter cost.
>>
>> The solution in this patch set is to set up a shared memory region
>> which is accessed by both the guest and GVTg without trap cost. The
>> shared memory region is allocated by the guest driver, which passes
>> the region's guest physical address to GVTg through a PVINFO
>> register; GVTg can then access this region directly, without trap
>> cost, to exchange data between guest and GVTg.
>>
>> In this patch set, 3 kinds of pvmmio optimization are implemented, each
>> controlled by a level flag in the enable_pvmmio PVINFO register:
>> 1. workload submission (context submission): reduce 4 traps to 1 trap.
>> 2. master irq: reduce 2 traps to 1 trap.
>> 3. ppgtt update: eliminate the cost of ppgtt write protection.
>>
>> Based on experiments, performance improved by 4 percent (on average)
>> across both media and 3D workload benchmarks.
>>
>> Based on the pvmmio framework, more scenarios could be optimized, such
>> as global GTT updates and display plane and watermark updates in the guest.
> Overall comments:
>
> The patches should be properly prefixed and split down. We should have
> "drm/i915:" patches that touch i915 portions, and those should not touch
> any gvt parts. Then there should be "drm/i915/gvt:" parts which don't
> touch anything from i915, and would be reviewed in the GVT list.
>
> We'd then proceed to merge the i915 changes and the GVT changes would be
> merged in the GVT tree.
>
> Regards, Joonas
>
Thanks for your comments. This makes sense and will be addressed in the next version.

BRs, Xiaolin



_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio
  2018-09-27  7:17   ` Chris Wilson
@ 2018-09-28  7:31     ` Zhang, Xiaolin
  0 siblings, 0 replies; 28+ messages in thread
From: Zhang, Xiaolin @ 2018-09-28  7:31 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, intel-gvt-dev
  Cc: Lahtinen, Joonas, Yuan, Hang, Lv, Zhiyuan, Wang, Zhenyu Z, Jiang, Fei

On 09/27/2018 03:18 PM, Chris Wilson wrote:
> Quoting Xiaolin Zhang (2018-09-27 17:37:47)
>> To enable the pvmmio feature, we need to prepare one 4K shared page
>> which will be accessed by both the guest and the backend i915 driver.
>>
>> The guest i915 driver allocates one page of memory and passes its guest
>> physical address to the backend i915 driver through a PVINFO register,
>> so that the backend i915 driver can access this shared page for data
>> exchange via the hypervisor read_gpa functionality, without hypervisor
>> trap cost.
>>
>> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
>>  drivers/gpu/drm/i915/i915_drv.h    |  3 +++
>>  drivers/gpu/drm/i915/i915_pvinfo.h | 25 ++++++++++++++++++++++++-
>>  drivers/gpu/drm/i915/i915_vgpu.c   | 17 +++++++++++++++++
>>  4 files changed, 49 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index ade9bca..815a4dd 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -885,6 +885,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
>>                 return -ENODEV;
>>  
>>         spin_lock_init(&dev_priv->irq_lock);
>> +       spin_lock_init(&dev_priv->shared_page_lock);
> No. Do we not have a more appropriate struct for this to find a home in?
> No one will ever guess that 'shared_page_lock' refers to vgpu.
> -Chris
Thanks for your comment. I found another place, "struct i915_virtual_gpu",
which I think is a better home for these fields. Do you think it is the
right place for them?

BRs, Xiaolin
> _______________________________________________
> intel-gvt-dev mailing list
> intel-gvt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio
  2018-09-28  6:09     ` Zhang, Xiaolin
@ 2018-10-09  2:26       ` Zhenyu Wang
  2018-10-10  6:48         ` Zhang, Xiaolin
  0 siblings, 1 reply; 28+ messages in thread
From: Zhenyu Wang @ 2018-10-09  2:26 UTC (permalink / raw)
  To: Zhang, Xiaolin
  Cc: intel-gfx, Yuan, Hang, Lahtinen, Joonas, Jiang, Fei,
	intel-gvt-dev, Lv, Zhiyuan


On 2018.09.28 14:09:45 +0800, Zhang, Xiaolin wrote:
> On 09/27/2018 07:03 PM, Joonas Lahtinen wrote:
> > Quoting Xiaolin Zhang (2018-09-27 19:37:46)
> >> This int type module parameter is used to control the different
> >> level pvmmio feature for MMIO emulation in GVT.
> >>
> >> This parameter is default zero, no pvmmio feature enabled.
> >>
> >> Its permission type is 0400 which means user could only change its
> >> value through the cmdline, this is to prevent the dynamic modification
> >> during runtime which would break the pvmmio internal logic.
> >>
> >> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
> > This shouldn't really be a module parameter. We should detect the
> > capability from the vGPU device and use it always when possible.
> >
> > Regards, Joonas
> >
> For pv optimization, we need to touch both the guest driver and GVTg.
> This parameter is used for guest pv capability, because a GVTg with pv
> capability will support both pv-capable and non-pv-capable guests.
> 

That's the purpose of 'vgt_caps' in PVINFO: to do capability checks between
host and guest. You need a new cap bit definition for PVMMIO and maybe another
field for checking the different PVMMIO level capabilities. A new parameter is
not useful here.
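
[Editor's sketch] The capability check suggested above could look roughly like the following user-space model. This is only an illustration under stated assumptions: VGT_CAPS_PVMMIO, the three PVMMIO_* level bits, struct pvinfo_view and both helper functions are hypothetical names, and the bit values are arbitrary rather than taken from i915_pvinfo.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bit in vgt_caps; the bit position is arbitrary. */
#define VGT_CAPS_PVMMIO      (1u << 8)

/* Hypothetical level bits, one per optimization named in the cover letter. */
#define PVMMIO_ELSP_SUBMIT   (1u << 0)
#define PVMMIO_MASTER_IRQ    (1u << 1)
#define PVMMIO_PPGTT_UPDATE  (1u << 2)

/* Stand-in for the PVINFO fields the guest reads; not the real struct vgt_if. */
struct pvinfo_view {
	uint32_t vgt_caps;     /* capability bits advertised by the host */
	uint32_t pvmmio_caps;  /* assumed extra field: levels the host supports */
};

/*
 * Return the levels the guest may use: nothing if the host does not
 * advertise PVMMIO at all, otherwise the intersection of what the host
 * offers and what the guest driver implements.
 */
static uint32_t pvmmio_negotiate(const struct pvinfo_view *pv,
				 uint32_t guest_levels)
{
	if (!(pv->vgt_caps & VGT_CAPS_PVMMIO))
		return 0;
	return pv->pvmmio_caps & guest_levels;
}

/* Test a single level bit against the negotiated mask. */
static bool pvmmio_level_enabled(uint32_t negotiated, uint32_t level)
{
	return (negotiated & level) != 0;
}
```

With this shape, an old host that never sets the capability bit yields a negotiated mask of 0, so a pv-capable guest falls back to trapped MMIO without needing a module parameter.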

Thanks

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio
  2018-09-27 16:37 ` [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio Xiaolin Zhang
  2018-09-27  7:17   ` Chris Wilson
@ 2018-10-09  2:31   ` Zhenyu Wang
  1 sibling, 0 replies; 28+ messages in thread
From: Zhenyu Wang @ 2018-10-09  2:31 UTC (permalink / raw)
  To: Xiaolin Zhang
  Cc: zhiyuan.lv, intel-gfx, hang.yuan, joonas.lahtinen, fei.jiang,
	intel-gvt-dev


On 2018.09.27 12:37:47 -0400, Xiaolin Zhang wrote:
> To enable the pvmmio feature, we need to prepare one 4K shared page
> which will be accessed by both the guest and the backend i915 driver.
> 
> The guest i915 driver allocates one page of memory and passes its guest
> physical address to the backend i915 driver through a PVINFO register,
> so that the backend i915 driver can access this shared page for data
> exchange via the hypervisor read_gpa functionality, without hypervisor
> trap cost.
> 
> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
>  drivers/gpu/drm/i915/i915_drv.h    |  3 +++
>  drivers/gpu/drm/i915/i915_pvinfo.h | 25 ++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_vgpu.c   | 17 +++++++++++++++++
>  4 files changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index ade9bca..815a4dd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -885,6 +885,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
>  		return -ENODEV;
>  
>  	spin_lock_init(&dev_priv->irq_lock);
> +	spin_lock_init(&dev_priv->shared_page_lock);
>  	spin_lock_init(&dev_priv->gpu_error.lock);
>  	mutex_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
> @@ -987,6 +988,8 @@ static void i915_mmio_cleanup(struct drm_i915_private *dev_priv)
>  
>  	intel_teardown_mchbar(dev_priv);
>  	pci_iounmap(pdev, dev_priv->regs);
> +	if (intel_vgpu_active(dev_priv) && dev_priv->shared_page)
> +		free_pages((unsigned long)dev_priv->shared_page, 0);
>  }
>  
>  /**
> @@ -1029,6 +1032,8 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
>  	return 0;
>  
>  err_uncore:
> +	if (intel_vgpu_active(dev_priv) && dev_priv->shared_page)
> +		free_pages((unsigned long)dev_priv->shared_page, 0);
>  	intel_uncore_fini(dev_priv);
>  err_bridge:
>  	pci_dev_put(dev_priv->bridge_dev);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 174d618..76d7e9c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -56,6 +56,7 @@
>  
>  #include "i915_params.h"
>  #include "i915_reg.h"
> +#include "i915_pvinfo.h"
>  #include "i915_utils.h"
>  
>  #include "intel_bios.h"
> @@ -1623,6 +1624,8 @@ struct drm_i915_private {
>  	resource_size_t stolen_usable_size;	/* Total size minus reserved ranges */
>  
>  	void __iomem *regs;
> +	struct gvt_shared_page *shared_page;
> +	spinlock_t shared_page_lock;
>  
>  	struct intel_uncore uncore;
>  
> diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h b/drivers/gpu/drm/i915/i915_pvinfo.h
> index 697e998..ab839a7 100644
> --- a/drivers/gpu/drm/i915/i915_pvinfo.h
> +++ b/drivers/gpu/drm/i915/i915_pvinfo.h
> @@ -49,6 +49,25 @@ enum vgt_g2v_type {
>  	VGT_G2V_MAX,
>  };
>  
> +struct pv_ppgtt_update {
> +	u64 pdp;
> +	u64 start;
> +	u64 length;
> +	u32 cache_level;
> +};
> +
> +/*
> + * shared page(4KB) between gvt and VM, could be allocated by guest driver
> + * or a fixed location in PCI bar 0 region
> + */
> +struct gvt_shared_page {
> +	u32 elsp_data[4];
> +	u32 reg_addr;
> +	u32 disable_irq;
> +	struct pv_ppgtt_update pv_ppgtt;
> +	u32 rsvd2[0x400 - 13];
> +};

Could we define offsets for the shared page fields instead of a struct?
I think the struct is wasting space.
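
[Editor's sketch] One way to read the offset suggestion is shown below as a user-space model. All PV_* names and the pv_read*/pv_write* helpers are hypothetical; the offsets simply follow the field order of the quoted struct with no padding (13 u32 slots, matching the "0x400 - 13" reserve). Note that the quoted structs are not marked __packed, so on ABIs that align u64 to 8 bytes (such as x86-64) the compiler would likely pad pv_ppgtt_update to 32 bytes and the struct would no longer be exactly 4096 bytes; explicit offsets avoid that dependence on layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHARED_PAGE_SIZE      4096u

/* Hypothetical byte offsets into the 4K shared page (unpadded field order). */
#define PV_ELSP_DATA(i)       (0x00u + 4u * (i))  /* four u32 slots, i = 0..3 */
#define PV_REG_ADDR           0x10u
#define PV_DISABLE_IRQ        0x14u
#define PV_PPGTT_PDP          0x18u
#define PV_PPGTT_START        0x20u
#define PV_PPGTT_LENGTH       0x28u
#define PV_PPGTT_CACHE_LEVEL  0x30u
#define PV_FIELDS_END         0x34u               /* 13 u32 slots in use */

_Static_assert(PV_FIELDS_END <= SHARED_PAGE_SIZE,
	       "shared page fields must fit in one 4K page");

/* Accessors use memcpy so no alignment or aliasing assumptions are needed. */
static void pv_write32(void *page, size_t off, uint32_t val)
{
	memcpy((uint8_t *)page + off, &val, sizeof(val));
}

static uint32_t pv_read32(const void *page, size_t off)
{
	uint32_t val;

	memcpy(&val, (const uint8_t *)page + off, sizeof(val));
	return val;
}

static void pv_write64(void *page, size_t off, uint64_t val)
{
	memcpy((uint8_t *)page + off, &val, sizeof(val));
}

static uint64_t pv_read64(const void *page, size_t off)
{
	uint64_t val;

	memcpy(&val, (const uint8_t *)page + off, sizeof(val));
	return val;
}
```

Both sides would then agree on the layout through the offset table alone, and no reserved array is needed to pad the definition out to a page.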

> +
>  #define VGPU_PVMMIO(vgpu) vgpu_vreg_t(vgpu, vgtif_reg(enable_pvmmio))
>  
>  /*
> @@ -120,8 +139,12 @@ struct vgt_if {
>  	u32 execlist_context_descriptor_lo;
>  	u32 execlist_context_descriptor_hi;
>  	u32 enable_pvmmio;
> +	struct {
> +		u32 lo;
> +		u32 hi;
> +	} shared_page_gpa;
>  
> -	u32  rsv7[0x200 - 25];    /* pad to one page */
> +	u32  rsv7[0x200 - 27];    /* pad to one page */
>  } __packed;
>  
>  #define vgtif_reg(x) \
> diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
> index d22c5ca..10ae94b 100644
> --- a/drivers/gpu/drm/i915/i915_vgpu.c
> +++ b/drivers/gpu/drm/i915/i915_vgpu.c
> @@ -62,6 +62,7 @@ void i915_check_vgpu(struct drm_i915_private *dev_priv)
>  {
>  	u64 magic;
>  	u16 version_major;
> +	u64 shared_page_gpa;
>  
>  	BUILD_BUG_ON(sizeof(struct vgt_if) != VGT_PVINFO_SIZE);
>  
> @@ -89,6 +90,22 @@ void i915_check_vgpu(struct drm_i915_private *dev_priv)
>  	dev_priv->vgpu.active = true;
>  	DRM_INFO("Virtual GPU for Intel GVT-g detected with pvmmio 0x%x\n",
>  		i915_modparams.enable_pvmmio);
> +
> +	if (intel_vgpu_active(dev_priv) && i915_modparams.enable_pvmmio) {
> +		dev_priv->shared_page =  (struct gvt_shared_page *)
> +				__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0);
> +		if (!dev_priv->shared_page) {
> +			DRM_ERROR("out of memory for shared page memory\n");
> +			return;
> +		}
> +		shared_page_gpa = __pa(dev_priv->shared_page);
> +		__raw_i915_write32(dev_priv, vgtif_reg(shared_page_gpa.lo),
> +				lower_32_bits(shared_page_gpa));
> +		__raw_i915_write32(dev_priv, vgtif_reg(shared_page_gpa.hi),
> +				upper_32_bits(shared_page_gpa));
> +		DRM_INFO("VGPU shared page enabled\n");
> +	}
> +
>  }
>  
>  bool intel_vgpu_has_full_48bit_ppgtt(struct drm_i915_private *dev_priv)
> -- 
> 1.8.3.1
> 
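
[Editor's sketch] The last hunk above publishes the shared page's guest physical address as two 32-bit PVINFO writes (shared_page_gpa.lo and shared_page_gpa.hi). The user-space model below shows the split and the reassembly the host side would need. The names gpa_regs, gpa_split, gpa_join and gpa_is_page_aligned are hypothetical stand-ins, not kernel helpers; the patch itself uses lower_32_bits() and upper_32_bits(). The alignment check is an assumption about what a host might want to validate, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHARED_PAGE_SIZE 4096u

/* Model of the lo/hi register pair added to struct vgt_if in the patch. */
struct gpa_regs {
	uint32_t lo;
	uint32_t hi;
};

/* Guest side: split a 64-bit guest physical address into two halves. */
static struct gpa_regs gpa_split(uint64_t gpa)
{
	struct gpa_regs r;

	r.lo = (uint32_t)(gpa & 0xffffffffu);
	r.hi = (uint32_t)(gpa >> 32);
	return r;
}

/* Host side: rebuild the 64-bit address from the two register values. */
static uint64_t gpa_join(struct gpa_regs r)
{
	return ((uint64_t)r.hi << 32) | r.lo;
}

/* A page returned by a page allocator should be 4K aligned. */
static bool gpa_is_page_aligned(uint64_t gpa)
{
	return (gpa & (SHARED_PAGE_SIZE - 1u)) == 0;
}
```

Because the two halves arrive as separate writes, a host would presumably treat the address as valid only once both have been written; the ordering convention is not specified in the quoted patch.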

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio
  2018-10-09  2:26       ` Zhenyu Wang
@ 2018-10-10  6:48         ` Zhang, Xiaolin
  0 siblings, 0 replies; 28+ messages in thread
From: Zhang, Xiaolin @ 2018-10-10  6:48 UTC (permalink / raw)
  To: Zhenyu Wang
  Cc: intel-gfx, Yuan, Hang, Lahtinen, Joonas, Jiang, Fei,
	intel-gvt-dev, Lv, Zhiyuan

On 10/09/2018 10:34 AM, Zhenyu Wang wrote:
> On 2018.09.28 14:09:45 +0800, Zhang, Xiaolin wrote:
>> On 09/27/2018 07:03 PM, Joonas Lahtinen wrote:
>>> Quoting Xiaolin Zhang (2018-09-27 19:37:46)
>>>> This int type module parameter is used to control the different
>>>> level pvmmio feature for MMIO emulation in GVT.
>>>>
>>>> This parameter is default zero, no pvmmio feature enabled.
>>>>
>>>> Its permission type is 0400 which means user could only change its
>>>> value through the cmdline, this is to prevent the dynamic modification
>>>> during runtime which would break the pvmmio internal logic.
>>>>
>>>> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
>>> This shouldn't really be a module parameter. We should detect the
>>> capability from the vGPU device and use it always when possible.
>>>
>>> Regards, Joonas
>>>
>> For pv optimization, we need to touch both the guest driver and GVTg.
>> This parameter is used for guest pv capability, because a GVTg with pv
>> capability will support both pv-capable and non-pv-capable guests.
>>
> That's the purpose of 'vgt_caps' in PVINFO: to do capability checks between
> host and guest. You need a new cap bit definition for PVMMIO and maybe another
> field for checking the different PVMMIO level capabilities. A new parameter is
> not useful here.
>
> Thanks
>
Sounds good to me. I will use it in the next version. Thanks very much.

BRs, Xiaolin

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-10-10  6:48 UTC | newest]

Thread overview: 28+ messages
2018-09-27 16:37 [RFC 00/10] i915 pvmmio to improve GVTg performance Xiaolin Zhang
2018-09-27  7:20 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-09-27  7:24 ` ✗ Fi.CI.SPARSE: " Patchwork
2018-09-27  7:43 ` ✓ Fi.CI.BAT: success " Patchwork
2018-09-27 10:25 ` ✓ Fi.CI.IGT: " Patchwork
2018-09-27 11:07 ` [RFC 00/10] " Joonas Lahtinen
2018-09-28  6:11   ` Zhang, Xiaolin
2018-09-27 16:37 ` [RFC 01/10] drm/i915/gvt: add module parameter enable_pvmmio Xiaolin Zhang
2018-09-27  7:16   ` Chris Wilson
2018-09-27 11:03   ` Joonas Lahtinen
2018-09-28  6:09     ` Zhang, Xiaolin
2018-10-09  2:26       ` Zhenyu Wang
2018-10-10  6:48         ` Zhang, Xiaolin
2018-09-27 16:37 ` [RFC 02/10] drm/i915/gvt: get ready of memory for pvmmio Xiaolin Zhang
2018-09-27  7:17   ` Chris Wilson
2018-09-28  7:31     ` Zhang, Xiaolin
2018-10-09  2:31   ` Zhenyu Wang
2018-09-27 16:37 ` [RFC 03/10] drm/i915/gvt: context submission pvmmio optimization Xiaolin Zhang
2018-09-27  7:19   ` Chris Wilson
2018-09-28  5:31     ` Zhang, Xiaolin
2018-09-27 11:13   ` Joonas Lahtinen
2018-09-27 16:37 ` [RFC 04/10] drm/i915/gvt: master irq " Xiaolin Zhang
2018-09-27 16:37 ` [RFC 05/10] drm/i915/gvt: ppgtt update " Xiaolin Zhang
2018-09-27 16:37 ` [RFC 06/10] drm/i915/gvt: GVTg handle enable_pvmmio PVINFO register Xiaolin Zhang
2018-09-27 16:37 ` [RFC 07/10] drm/i915/gvt: GVTg read_shared_page implementation Xiaolin Zhang
2018-09-27 16:37 ` [RFC 08/10] drm/i915/gvt: GVTg support context submission pvmmio optimization Xiaolin Zhang
2018-09-27 16:37 ` [RFC 09/10] drm/i915/gvt: GVTg support master irq " Xiaolin Zhang
2018-09-27 16:37 ` [RFC 10/10] drm/i915/gvt: GVTg support ppgtt " Xiaolin Zhang
