All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
@ 2018-01-22  8:24 ` Lofstedt, Marta
  2018-01-22  8:34   ` Sagar Arun Kamble
  2018-01-22  8:26 ` [PATCH 2/4] drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts Sagar Arun Kamble
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Lofstedt, Marta @ 2018-01-22  8:24 UTC (permalink / raw)
  To: Kamble, Sagar A, intel-gfx

I recommend that this is tested with a HACK to enable the GUC logs again, so that we can see if it really fixes the issue.

> -----Original Message-----
> From: Kamble, Sagar A
> Sent: Monday, January 22, 2018 10:26 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Kamble, Sagar A <sagar.a.kamble@intel.com>; Wajdeczko, Michal
> <Michal.Wajdeczko@intel.com>; Ceraolo Spurio, Daniele
> <daniele.ceraolospurio@intel.com>; Ursulin, Tvrtko
> <tvrtko.ursulin@intel.com>; Chris Wilson <chris@chris-wilson.co.uk>; Joonas
> Lahtinen <joonas.lahtinen@linux.intel.com>; Lofstedt, Marta
> <marta.lofstedt@intel.com>; Winiarski, Michal <michal.winiarski@intel.com>
> Subject: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel
> handling under struct_mutex
> 
> This patch fixes lockdep issue due to circular locking dependency of
> struct_mutex, i_mutex_key, mmap_sem, relay_channels_mutex.
> For GuC log relay channel we create debugfs file that requires i_mutex_key
> lock and we are doing that under struct_mutex. So we introduced newer
> dependency as:
>     &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm-
> >mmap_sem However, there is dependency from mmap_sem to
> struct_mutex. Hence we separate the relay create/destroy operation from
> under struct_mutex.
> 
> ======================================================
> WARNING: possible circular locking dependency detected 4.15.0-rc6-CI-
> Patchwork_7614+ #1 Not tainted
> ------------------------------------------------------
> debugfs_test/1388 is trying to acquire lock:
>  (&dev->struct_mutex){+.+.}, at: [<00000000d5e1d915>]
> i915_mutex_lock_interruptible+0x47/0x130 [i915]
> 
> but task is already holding lock:
>  (&mm->mmap_sem){++++}, at: [<0000000029a9c131>]
> __do_page_fault+0x106/0x560
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #3 (&mm->mmap_sem){++++}:
>        _copy_to_user+0x1e/0x70
>        filldir+0x8c/0xf0
>        dcache_readdir+0xeb/0x160
>        iterate_dir+0xdc/0x140
>        SyS_getdents+0xa0/0x130
>        entry_SYSCALL_64_fastpath+0x1c/0x89
> 
> -> #2 (&sb->s_type->i_mutex_key#3){++++}:
>        start_creating+0x59/0x110
>        __debugfs_create_file+0x2e/0xe0
>        relay_create_buf_file+0x62/0x80
>        relay_late_setup_files+0x84/0x250
>        guc_log_late_setup+0x4f/0x110 [i915]
>        i915_guc_log_register+0x32/0x40 [i915]
>        i915_driver_load+0x7b6/0x1720 [i915]
>        i915_pci_probe+0x2e/0x90 [i915]
>        pci_device_probe+0x9c/0x120
>        driver_probe_device+0x2a3/0x480
>        __driver_attach+0xd9/0xe0
>        bus_for_each_dev+0x57/0x90
>        bus_add_driver+0x168/0x260
>        driver_register+0x52/0xc0
>        do_one_initcall+0x39/0x150
>        do_init_module+0x56/0x1ef
>        load_module+0x231c/0x2d70
>        SyS_finit_module+0xa5/0xe0
>        entry_SYSCALL_64_fastpath+0x1c/0x89
> 
> -> #1 (relay_channels_mutex){+.+.}:
>        relay_open+0x12c/0x2b0
>        intel_guc_log_runtime_create+0xab/0x230 [i915]
>        intel_guc_init+0x81/0x120 [i915]
>        intel_uc_init+0x29/0xa0 [i915]
>        i915_gem_init+0x182/0x530 [i915]
>        i915_driver_load+0xaa9/0x1720 [i915]
>        i915_pci_probe+0x2e/0x90 [i915]
>        pci_device_probe+0x9c/0x120
>        driver_probe_device+0x2a3/0x480
>        __driver_attach+0xd9/0xe0
>        bus_for_each_dev+0x57/0x90
>        bus_add_driver+0x168/0x260
>        driver_register+0x52/0xc0
>        do_one_initcall+0x39/0x150
>        do_init_module+0x56/0x1ef
>        load_module+0x231c/0x2d70
>        SyS_finit_module+0xa5/0xe0
>        entry_SYSCALL_64_fastpath+0x1c/0x89
> 
> -> #0 (&dev->struct_mutex){+.+.}:
>        __mutex_lock+0x81/0x9b0
>        i915_mutex_lock_interruptible+0x47/0x130 [i915]
>        i915_gem_fault+0x201/0x790 [i915]
>        __do_fault+0x15/0x70
>        __handle_mm_fault+0x677/0xdc0
>        handle_mm_fault+0x14f/0x2f0
>        __do_page_fault+0x2d1/0x560
>        page_fault+0x4c/0x60
> 
> other info that might help us debug this:
> 
> Chain exists of:
>   &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm-
> >mmap_sem
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(&mm->mmap_sem);
>                                lock(&sb->s_type->i_mutex_key#3);
>                                lock(&mm->mmap_sem);
>   lock(&dev->struct_mutex);
> 
>  *** DEADLOCK ***
> 
> 1 lock held by debugfs_test/1388:
>  #0:  (&mm->mmap_sem){++++}, at: [<0000000029a9c131>]
> __do_page_fault+0x106/0x560
> 
> stack backtrace:
> CPU: 2 PID: 1388 Comm: debugfs_test Not tainted 4.15.0-rc6-CI-
> Patchwork_7614+ #1 Hardware name: To Be Filled By O.E.M. To Be Filled By
> O.E.M./J4205-ITX, BIOS P1.10 09/29/2016 Call Trace:
>  dump_stack+0x5f/0x86
>  print_circular_bug.isra.18+0x1d0/0x2c0
>  __lock_acquire+0x14ae/0x1b60
>  ? lock_acquire+0xaf/0x200
>  lock_acquire+0xaf/0x200
>  ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
>  __mutex_lock+0x81/0x9b0
>  ? i915_mutex_lock_interruptible+0x47/0x130 [i915]  ?
> i915_mutex_lock_interruptible+0x47/0x130 [i915]  ?
> i915_mutex_lock_interruptible+0x47/0x130 [i915]
>  i915_mutex_lock_interruptible+0x47/0x130 [i915]  ?
> __pm_runtime_resume+0x4f/0x80
>  i915_gem_fault+0x201/0x790 [i915]
>  __do_fault+0x15/0x70
>  ? _raw_spin_unlock+0x29/0x40
>  __handle_mm_fault+0x677/0xdc0
>  handle_mm_fault+0x14f/0x2f0
>  __do_page_fault+0x2d1/0x560
>  ? page_fault+0x36/0x60
>  page_fault+0x4c/0x60
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104693
> Testcase: igt/debugfs_test/read_all_entries # with enable_guc=1 and
> guc_log_level=1
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Marta Lofstedt <marta.lofstedt@intel.com>
> Cc: Michal Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c  |  12 +---
>  drivers/gpu/drm/i915/i915_drv.c      |   2 +-
>  drivers/gpu/drm/i915/i915_gem.c      |   4 +-
>  drivers/gpu/drm/i915/intel_guc.c     |   6 +-
>  drivers/gpu/drm/i915/intel_guc_log.c | 111
> ++++++++++++++++++++++++++---------
>  drivers/gpu/drm/i915/intel_guc_log.h |   9 +++
>  drivers/gpu/drm/i915/intel_uc.c      |  26 ++++++--
>  drivers/gpu/drm/i915/intel_uc.h      |   4 +-
>  8 files changed, 125 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 80dc679..b45be0d 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2467,7 +2467,6 @@ static int i915_guc_log_control_get(void *data, u64
> *val)  static int i915_guc_log_control_set(void *data, u64 val)  {
>  	struct drm_i915_private *dev_priv = data;
> -	int ret;
> 
>  	if (!HAS_GUC(dev_priv))
>  		return -ENODEV;
> @@ -2475,16 +2474,7 @@ static int i915_guc_log_control_set(void *data,
> u64 val)
>  	if (!dev_priv->guc.log.vma)
>  		return -EINVAL;
> 
> -	ret = mutex_lock_interruptible(&dev_priv-
> >drm.struct_mutex);
> -	if (ret)
> -		return ret;
> -
> -	intel_runtime_pm_get(dev_priv);
> -	ret = i915_guc_log_control(dev_priv, val);
> -	intel_runtime_pm_put(dev_priv);
> -
> -	mutex_unlock(&dev_priv->drm.struct_mutex);
> -	return ret;
> +	return i915_guc_log_control(dev_priv, val);
>  }
> 
>  DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
> diff --git a/drivers/gpu/drm/i915/i915_drv.c
> b/drivers/gpu/drm/i915/i915_drv.c index 1fbe378..29048b9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -626,7 +626,7 @@ static void i915_gem_fini(struct drm_i915_private
> *dev_priv)
>  	i915_gem_contexts_fini(dev_priv);
>  	mutex_unlock(&dev_priv->drm.struct_mutex);
> 
> -	intel_uc_fini_wq(dev_priv);
> +	intel_uc_fini_misc(dev_priv);
>  	i915_gem_cleanup_userptr(dev_priv);
> 
>  	i915_gem_drain_freed_objects(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c
> b/drivers/gpu/drm/i915/i915_gem.c index 7f0684c..34d0d51 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5193,7 +5193,7 @@ int i915_gem_init(struct drm_i915_private
> *dev_priv)
>  	if (ret)
>  		return ret;
> 
> -	ret = intel_uc_init_wq(dev_priv);
> +	ret = intel_uc_init_misc(dev_priv);
>  	if (ret)
>  		return ret;
> 
> @@ -5289,7 +5289,7 @@ int i915_gem_init(struct drm_i915_private
> *dev_priv)
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>  	mutex_unlock(&dev_priv->drm.struct_mutex);
> 
> -	intel_uc_fini_wq(dev_priv);
> +	intel_uc_fini_misc(dev_priv);
> 
>  	if (ret != -EIO)
>  		i915_gem_cleanup_userptr(dev_priv);
> diff --git a/drivers/gpu/drm/i915/intel_guc.c
> b/drivers/gpu/drm/i915/intel_guc.c
> index ea30e7c..912c9c0 100644
> --- a/drivers/gpu/drm/i915/intel_guc.c
> +++ b/drivers/gpu/drm/i915/intel_guc.c
> @@ -87,8 +87,10 @@ int intel_guc_init_wq(struct intel_guc *guc)
>  	 */
>  	guc->log.runtime.flush_wq =
> alloc_ordered_workqueue("i915-guc_log",
> 
> 	WQ_HIGHPRI | WQ_FREEZABLE);
> -	if (!guc->log.runtime.flush_wq)
> +	if (!guc->log.runtime.flush_wq) {
> +		DRM_ERROR("Couldn't allocate workqueue for
> GuC log\n");
>  		return -ENOMEM;
> +	}
> 
>  	/*
>  	 * Even though both sending GuC action, and adding a new
> workitem to @@ -109,6 +111,8 @@ int intel_guc_init_wq(struct intel_guc
> *guc)
> 
> 		  WQ_HIGHPRI);
>  		if (!guc->preempt_wq) {
>  			destroy_workqueue(guc-
> >log.runtime.flush_wq);
> +			DRM_ERROR("Couldn't allocate
> workqueue for GuC "
> +				  "preemption\n");
>  			return -ENOMEM;
>  		}
>  	}
> diff --git a/drivers/gpu/drm/i915/intel_guc_log.c
> b/drivers/gpu/drm/i915/intel_guc_log.c
> index 2ffc966..9e5600355 100644
> --- a/drivers/gpu/drm/i915/intel_guc_log.c
> +++ b/drivers/gpu/drm/i915/intel_guc_log.c
> @@ -359,16 +359,21 @@ static bool guc_log_has_runtime(struct intel_guc
> *guc)
>  	return guc->log.runtime.buf_addr != NULL;  }
> 
> +static bool guc_log_has_relay(struct intel_guc *guc) {
> +	return guc->log.runtime.relay_chan != NULL; }
> +
>  static int guc_log_runtime_create(struct intel_guc *guc)  {
>  	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>  	void *vaddr;
> -	struct rchan *guc_log_relay_chan;
> -	size_t n_subbufs, subbuf_size;
>  	int ret;
> 
>  	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> 
> +	GEM_BUG_ON(!guc_log_has_relay(guc));
> +
>  	GEM_BUG_ON(guc_log_has_runtime(guc));
> 
>  	ret = i915_gem_object_set_to_wc_domain(guc->log.vma-
> >obj, true); @@ -387,8 +392,38 @@ static int guc_log_runtime_create(struct
> intel_guc *guc)
> 
>  	guc->log.runtime.buf_addr = vaddr;
> 
> +	INIT_WORK(&guc->log.runtime.flush_work,
> capture_logs_work);
> +
> +	return 0;
> +}
> +
> +static void guc_log_runtime_destroy(struct intel_guc *guc) {
> +	/*
> +	 * It's possible that the runtime stuff was never allocated
> because
> +	 * GuC log was disabled at the boot time.
> +	 **/
> +	if (!guc_log_has_runtime(guc))
> +		return;
> +
> +	i915_gem_object_unpin_map(guc->log.vma->obj);
> +	guc->log.runtime.buf_addr = NULL;
> +}
> +
> +int intel_guc_log_relay_create(struct intel_guc *guc) {
> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +	struct rchan *guc_log_relay_chan;
> +	size_t n_subbufs, subbuf_size;
> +	int ret;
> +
> +	if (!i915_modparams.guc_log_level)
> +		return 0;
> +
> +	GEM_BUG_ON(guc_log_has_relay(guc));
> +
>  	 /* Keep the size of sub buffers same as shared log buffer */
> -	subbuf_size = guc->log.vma->obj->base.size;
> +	subbuf_size = GUC_LOG_SIZE;
> 
>  	/* Store up to 8 snapshots, which is large enough to buffer
> sufficient
>  	 * boot time logs and provides enough leeway to User, in
> terms of @@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct
> intel_guc *guc)
>  		DRM_ERROR("Couldn't create relay chan for
> GuC logging\n");
> 
>  		ret = -ENOMEM;
> -		goto err_vaddr;
> +		goto err;
>  	}
> 
>  	GEM_BUG_ON(guc_log_relay_chan->subbuf_size <
> subbuf_size);
>  	guc->log.runtime.relay_chan = guc_log_relay_chan;
> 
> -	INIT_WORK(&guc->log.runtime.flush_work,
> capture_logs_work);
>  	return 0;
> 
> -err_vaddr:
> -	i915_gem_object_unpin_map(guc->log.vma->obj);
> -	guc->log.runtime.buf_addr = NULL;
> +err:
> +	/* logging will be off */
> +	i915_modparams.guc_log_level = 0;
>  	return ret;
>  }
> 
> -static void guc_log_runtime_destroy(struct intel_guc *guc)
> +void intel_guc_log_relay_destroy(struct intel_guc *guc)
>  {
>  	/*
> -	 * It's possible that the runtime stuff was never allocated
> because
> +	 * It's possible that the relay was never allocated because
>  	 * GuC log was disabled at the boot time.
>  	 */
> -	if (!guc_log_has_runtime(guc))
> +	if (!guc_log_has_relay(guc))
>  		return;
> 
>  	relay_close(guc->log.runtime.relay_chan);
> -	i915_gem_object_unpin_map(guc->log.vma->obj);
> -	guc->log.runtime.buf_addr = NULL;
> +	guc->log.runtime.relay_chan = NULL;
>  }
> 
>  static int guc_log_late_setup(struct intel_guc *guc) @@ -441,17 +474,26 @@
> static int guc_log_late_setup(struct intel_guc *guc)
>  	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>  	int ret;
> 
> -	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> -
>  	if (!guc_log_has_runtime(guc)) {
>  		/*
>  		 * If log was disabled at boot time, then setup
> needed to handle
>  		 * log buffer flush interrupts would not have
> been done yet, so
>  		 * do that now.
>  		 */
> -		ret = guc_log_runtime_create(guc);
> +		GEM_BUG_ON(guc_log_has_relay(guc));
> +
> +		ret = intel_guc_log_relay_create(guc);
>  		if (ret)
>  			goto err;
> +
> +		mutex_lock(&dev_priv->drm.struct_mutex);
> +		intel_runtime_pm_get(dev_priv);
> +		ret = guc_log_runtime_create(guc);
> +		intel_runtime_pm_put(dev_priv);
> +		mutex_unlock(&dev_priv->drm.struct_mutex);
> +
> +		if (ret)
> +			goto err_relay;
>  	}
> 
>  	ret = guc_log_relay_file_create(guc);
> @@ -461,7 +503,11 @@ static int guc_log_late_setup(struct intel_guc *guc)
>  	return 0;
> 
>  err_runtime:
> +	mutex_lock(&dev_priv->drm.struct_mutex);
>  	guc_log_runtime_destroy(guc);
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
> +err_relay:
> +	intel_guc_log_relay_destroy(guc);
>  err:
>  	/* logging will remain off */
>  	i915_modparams.guc_log_level = 0;
> @@ -509,17 +555,10 @@ int intel_guc_log_create(struct intel_guc *guc)
>  	struct i915_vma *vma;
>  	unsigned long offset;
>  	u32 flags;
> -	u32 size;
>  	int ret;
> 
>  	GEM_BUG_ON(guc->log.vma);
> 
> -	/* The first page is to save log buffer state. Allocate one
> -	 * extra page for others in case for overlap */
> -	size = (1 + GUC_LOG_DPC_PAGES + 1 +
> -		GUC_LOG_ISR_PAGES + 1 +
> -		GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
> -
>  	/* We require SSE 4.1 for fast reads from the GuC log buffer
> and
>  	 * it should be present on the chipsets supporting GuC based
>  	 * submisssions.
> @@ -529,7 +568,7 @@ int intel_guc_log_create(struct intel_guc *guc)
>  		goto err;
>  	}
> 
> -	vma = intel_guc_allocate_vma(guc, size);
> +	vma = intel_guc_allocate_vma(guc, GUC_LOG_SIZE);
>  	if (IS_ERR(vma)) {
>  		ret = PTR_ERR(vma);
>  		goto err;
> @@ -584,7 +623,15 @@ int i915_guc_log_control(struct drm_i915_private
> *dev_priv, u64 control_val)
>  		return 0;
> 
>  	verbosity = enable_logging ? control_val - 1 : 0;
> +
> +	ret = mutex_lock_interruptible(&dev_priv-
> >drm.struct_mutex);
> +	if (ret)
> +		return ret;
> +	intel_runtime_pm_get(dev_priv);
>  	ret = guc_log_control(guc, enable_logging, verbosity);
> +	intel_runtime_pm_put(dev_priv);
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
> +
>  	if (ret < 0) {
>  		DRM_DEBUG_DRIVER("guc_logging_control
> action failed %d\n", ret);
>  		return ret;
> @@ -605,7 +652,11 @@ int i915_guc_log_control(struct drm_i915_private
> *dev_priv, u64 control_val)
>  		}
> 
>  		/* GuC logging is currently the only user of
> Guc2Host interrupts */
> +		mutex_lock(&dev_priv->drm.struct_mutex);
> +		intel_runtime_pm_get(dev_priv);
>  		gen9_enable_guc_interrupts(dev_priv);
> +		intel_runtime_pm_put(dev_priv);
> +		mutex_unlock(&dev_priv->drm.struct_mutex);
>  	} else {
>  		/*
>  		 * Once logging is disabled, GuC won't generate
> logs & send an @@ -613,7 +664,11 @@ int i915_guc_log_control(struct
> drm_i915_private *dev_priv, u64 control_val)
>  		 * which is yet to be captured. So request GuC to
> update the log
>  		 * buffer state and then collect the left over
> logs.
>  		 */
> +		mutex_lock(&dev_priv->drm.struct_mutex);
> +		intel_runtime_pm_get(dev_priv);
>  		guc_flush_logs(guc);
> +		intel_runtime_pm_put(dev_priv);
> +		mutex_unlock(&dev_priv->drm.struct_mutex);
> 
>  		/* As logging is disabled, update log level to
> reflect that */
>  		i915_modparams.guc_log_level = 0;
> @@ -627,19 +682,21 @@ void i915_guc_log_register(struct drm_i915_private
> *dev_priv)
>  	if (!USES_GUC_SUBMISSION(dev_priv) ||
> !i915_modparams.guc_log_level)
>  		return;
> 
> -	mutex_lock(&dev_priv->drm.struct_mutex);
>  	guc_log_late_setup(&dev_priv->guc);
> -	mutex_unlock(&dev_priv->drm.struct_mutex);
>  }
> 
>  void i915_guc_log_unregister(struct drm_i915_private *dev_priv)  {
> +	struct intel_guc *guc = &dev_priv->guc;
> +
>  	if (!USES_GUC_SUBMISSION(dev_priv))
>  		return;
> 
>  	mutex_lock(&dev_priv->drm.struct_mutex);
>  	/* GuC logging is currently the only user of Guc2Host
> interrupts */
>  	gen9_disable_guc_interrupts(dev_priv);
> -	guc_log_runtime_destroy(&dev_priv->guc);
> +	guc_log_runtime_destroy(guc);
>  	mutex_unlock(&dev_priv->drm.struct_mutex);
> +
> +	intel_guc_log_relay_destroy(guc);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_guc_log.h
> b/drivers/gpu/drm/i915/intel_guc_log.h
> index f512cf7..84915a8 100644
> --- a/drivers/gpu/drm/i915/intel_guc_log.h
> +++ b/drivers/gpu/drm/i915/intel_guc_log.h
> @@ -32,6 +32,13 @@
>  struct drm_i915_private;
>  struct intel_guc;
> 
> +/*
> + * The first page is to save log buffer state. Allocate one
> + * extra page for others in case for overlap  */
> +#define GUC_LOG_SIZE	((1 + GUC_LOG_DPC_PAGES + 1 +
> GUC_LOG_ISR_PAGES + \
> +			  1 + GUC_LOG_CRASH_PAGES +
> 1) << PAGE_SHIFT)
> +
>  struct intel_guc_log {
>  	u32 flags;
>  	struct i915_vma *vma;
> @@ -52,6 +59,8 @@ struct intel_guc_log {
> 
>  int intel_guc_log_create(struct intel_guc *guc);  void
> intel_guc_log_destroy(struct intel_guc *guc);
> +int intel_guc_log_relay_create(struct intel_guc *guc); void
> +intel_guc_log_relay_destroy(struct intel_guc *guc);
>  int i915_guc_log_control(struct drm_i915_private *dev_priv, u64
> control_val);  void i915_guc_log_register(struct drm_i915_private
> *dev_priv);  void i915_guc_log_unregister(struct drm_i915_private
> *dev_priv); diff --git a/drivers/gpu/drm/i915/intel_uc.c
> b/drivers/gpu/drm/i915/intel_uc.c index f78a17b..e3f3509 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -236,28 +236,44 @@ static void guc_disable_communication(struct
> intel_guc *guc)
>  	guc->send = intel_guc_send_nop;
>  }
> 
> -int intel_uc_init_wq(struct drm_i915_private *dev_priv)
> +int intel_uc_init_misc(struct drm_i915_private *dev_priv)
>  {
> +	struct intel_guc *guc = &dev_priv->guc;
>  	int ret;
> 
>  	if (!USES_GUC(dev_priv))
>  		return 0;
> 
> -	ret = intel_guc_init_wq(&dev_priv->guc);
> +	ret = intel_guc_init_wq(guc);
>  	if (ret) {
>  		DRM_ERROR("Couldn't allocate workqueues for
> GuC\n");
> -		return ret;
> +		goto err;
> +	}
> +
> +	ret = intel_guc_log_relay_create(guc);
> +	if (ret) {
> +		DRM_ERROR("Couldn't allocate relay for GuC
> log\n");
> +		goto err_relay;
>  	}
> 
>  	return 0;
> +
> +err_relay:
> +	intel_guc_fini_wq(guc);
> +err:
> +	return ret;
>  }
> 
> -void intel_uc_fini_wq(struct drm_i915_private *dev_priv)
> +void intel_uc_fini_misc(struct drm_i915_private *dev_priv)
>  {
> +	struct intel_guc *guc = &dev_priv->guc;
> +
>  	if (!USES_GUC(dev_priv))
>  		return;
> 
> -	intel_guc_fini_wq(&dev_priv->guc);
> +	intel_guc_fini_wq(guc);
> +
> +	intel_guc_log_relay_destroy(guc);
>  }
> 
>  int intel_uc_init(struct drm_i915_private *dev_priv) diff --git
> a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h index
> 8a72497..f2984e0 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -33,8 +33,8 @@
>  void intel_uc_init_mmio(struct drm_i915_private *dev_priv);  void
> intel_uc_init_fw(struct drm_i915_private *dev_priv);  void
> intel_uc_fini_fw(struct drm_i915_private *dev_priv); -int
> intel_uc_init_wq(struct drm_i915_private *dev_priv); -void
> intel_uc_fini_wq(struct drm_i915_private *dev_priv);
> +int intel_uc_init_misc(struct drm_i915_private *dev_priv); void
> +intel_uc_fini_misc(struct drm_i915_private *dev_priv);
>  int intel_uc_init_hw(struct drm_i915_private *dev_priv);  void
> intel_uc_fini_hw(struct drm_i915_private *dev_priv);  int intel_uc_init(struct
> drm_i915_private *dev_priv);
> --
> 1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
@ 2018-01-22  8:26 Sagar Arun Kamble
  2018-01-22  8:24 ` Lofstedt, Marta
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22  8:26 UTC (permalink / raw)
  To: intel-gfx

This patch fixes lockdep issue due to circular locking dependency of
struct_mutex, i_mutex_key, mmap_sem, relay_channels_mutex.
For GuC log relay channel we create debugfs file that requires i_mutex_key
lock and we are doing that under struct_mutex. So we introduced newer
dependency as:
    &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem
However, there is dependency from mmap_sem to struct_mutex. Hence we
separate the relay create/destroy operation from under struct_mutex.

======================================================
WARNING: possible circular locking dependency detected
4.15.0-rc6-CI-Patchwork_7614+ #1 Not tainted
------------------------------------------------------
debugfs_test/1388 is trying to acquire lock:
 (&dev->struct_mutex){+.+.}, at: [<00000000d5e1d915>] i915_mutex_lock_interruptible+0x47/0x130 [i915]

but task is already holding lock:
 (&mm->mmap_sem){++++}, at: [<0000000029a9c131>] __do_page_fault+0x106/0x560

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&mm->mmap_sem){++++}:
       _copy_to_user+0x1e/0x70
       filldir+0x8c/0xf0
       dcache_readdir+0xeb/0x160
       iterate_dir+0xdc/0x140
       SyS_getdents+0xa0/0x130
       entry_SYSCALL_64_fastpath+0x1c/0x89

-> #2 (&sb->s_type->i_mutex_key#3){++++}:
       start_creating+0x59/0x110
       __debugfs_create_file+0x2e/0xe0
       relay_create_buf_file+0x62/0x80
       relay_late_setup_files+0x84/0x250
       guc_log_late_setup+0x4f/0x110 [i915]
       i915_guc_log_register+0x32/0x40 [i915]
       i915_driver_load+0x7b6/0x1720 [i915]
       i915_pci_probe+0x2e/0x90 [i915]
       pci_device_probe+0x9c/0x120
       driver_probe_device+0x2a3/0x480
       __driver_attach+0xd9/0xe0
       bus_for_each_dev+0x57/0x90
       bus_add_driver+0x168/0x260
       driver_register+0x52/0xc0
       do_one_initcall+0x39/0x150
       do_init_module+0x56/0x1ef
       load_module+0x231c/0x2d70
       SyS_finit_module+0xa5/0xe0
       entry_SYSCALL_64_fastpath+0x1c/0x89

-> #1 (relay_channels_mutex){+.+.}:
       relay_open+0x12c/0x2b0
       intel_guc_log_runtime_create+0xab/0x230 [i915]
       intel_guc_init+0x81/0x120 [i915]
       intel_uc_init+0x29/0xa0 [i915]
       i915_gem_init+0x182/0x530 [i915]
       i915_driver_load+0xaa9/0x1720 [i915]
       i915_pci_probe+0x2e/0x90 [i915]
       pci_device_probe+0x9c/0x120
       driver_probe_device+0x2a3/0x480
       __driver_attach+0xd9/0xe0
       bus_for_each_dev+0x57/0x90
       bus_add_driver+0x168/0x260
       driver_register+0x52/0xc0
       do_one_initcall+0x39/0x150
       do_init_module+0x56/0x1ef
       load_module+0x231c/0x2d70
       SyS_finit_module+0xa5/0xe0
       entry_SYSCALL_64_fastpath+0x1c/0x89

-> #0 (&dev->struct_mutex){+.+.}:
       __mutex_lock+0x81/0x9b0
       i915_mutex_lock_interruptible+0x47/0x130 [i915]
       i915_gem_fault+0x201/0x790 [i915]
       __do_fault+0x15/0x70
       __handle_mm_fault+0x677/0xdc0
       handle_mm_fault+0x14f/0x2f0
       __do_page_fault+0x2d1/0x560
       page_fault+0x4c/0x60

other info that might help us debug this:

Chain exists of:
  &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&mm->mmap_sem);
                               lock(&sb->s_type->i_mutex_key#3);
                               lock(&mm->mmap_sem);
  lock(&dev->struct_mutex);

 *** DEADLOCK ***

1 lock held by debugfs_test/1388:
 #0:  (&mm->mmap_sem){++++}, at: [<0000000029a9c131>] __do_page_fault+0x106/0x560

stack backtrace:
CPU: 2 PID: 1388 Comm: debugfs_test Not tainted 4.15.0-rc6-CI-Patchwork_7614+ #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016
Call Trace:
 dump_stack+0x5f/0x86
 print_circular_bug.isra.18+0x1d0/0x2c0
 __lock_acquire+0x14ae/0x1b60
 ? lock_acquire+0xaf/0x200
 lock_acquire+0xaf/0x200
 ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
 __mutex_lock+0x81/0x9b0
 ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
 ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
 ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
 i915_mutex_lock_interruptible+0x47/0x130 [i915]
 ? __pm_runtime_resume+0x4f/0x80
 i915_gem_fault+0x201/0x790 [i915]
 __do_fault+0x15/0x70
 ? _raw_spin_unlock+0x29/0x40
 __handle_mm_fault+0x677/0xdc0
 handle_mm_fault+0x14f/0x2f0
 __do_page_fault+0x2d1/0x560
 ? page_fault+0x36/0x60
 page_fault+0x4c/0x60

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104693
Testcase: igt/debugfs_test/read_all_entries # with enable_guc=1 and guc_log_level=1
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c  |  12 +---
 drivers/gpu/drm/i915/i915_drv.c      |   2 +-
 drivers/gpu/drm/i915/i915_gem.c      |   4 +-
 drivers/gpu/drm/i915/intel_guc.c     |   6 +-
 drivers/gpu/drm/i915/intel_guc_log.c | 111 ++++++++++++++++++++++++++---------
 drivers/gpu/drm/i915/intel_guc_log.h |   9 +++
 drivers/gpu/drm/i915/intel_uc.c      |  26 ++++++--
 drivers/gpu/drm/i915/intel_uc.h      |   4 +-
 8 files changed, 125 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 80dc679..b45be0d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2467,7 +2467,6 @@ static int i915_guc_log_control_get(void *data, u64 *val)
 static int i915_guc_log_control_set(void *data, u64 val)
 {
 	struct drm_i915_private *dev_priv = data;
-	int ret;
 
 	if (!HAS_GUC(dev_priv))
 		return -ENODEV;
@@ -2475,16 +2474,7 @@ static int i915_guc_log_control_set(void *data, u64 val)
 	if (!dev_priv->guc.log.vma)
 		return -EINVAL;
 
-	ret = mutex_lock_interruptible(&dev_priv->drm.struct_mutex);
-	if (ret)
-		return ret;
-
-	intel_runtime_pm_get(dev_priv);
-	ret = i915_guc_log_control(dev_priv, val);
-	intel_runtime_pm_put(dev_priv);
-
-	mutex_unlock(&dev_priv->drm.struct_mutex);
-	return ret;
+	return i915_guc_log_control(dev_priv, val);
 }
 
 DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1fbe378..29048b9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -626,7 +626,7 @@ static void i915_gem_fini(struct drm_i915_private *dev_priv)
 	i915_gem_contexts_fini(dev_priv);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-	intel_uc_fini_wq(dev_priv);
+	intel_uc_fini_misc(dev_priv);
 	i915_gem_cleanup_userptr(dev_priv);
 
 	i915_gem_drain_freed_objects(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7f0684c..34d0d51 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5193,7 +5193,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	if (ret)
 		return ret;
 
-	ret = intel_uc_init_wq(dev_priv);
+	ret = intel_uc_init_misc(dev_priv);
 	if (ret)
 		return ret;
 
@@ -5289,7 +5289,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-	intel_uc_fini_wq(dev_priv);
+	intel_uc_fini_misc(dev_priv);
 
 	if (ret != -EIO)
 		i915_gem_cleanup_userptr(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index ea30e7c..912c9c0 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -87,8 +87,10 @@ int intel_guc_init_wq(struct intel_guc *guc)
 	 */
 	guc->log.runtime.flush_wq = alloc_ordered_workqueue("i915-guc_log",
 						WQ_HIGHPRI | WQ_FREEZABLE);
-	if (!guc->log.runtime.flush_wq)
+	if (!guc->log.runtime.flush_wq) {
+		DRM_ERROR("Couldn't allocate workqueue for GuC log\n");
 		return -ENOMEM;
+	}
 
 	/*
 	 * Even though both sending GuC action, and adding a new workitem to
@@ -109,6 +111,8 @@ int intel_guc_init_wq(struct intel_guc *guc)
 							  WQ_HIGHPRI);
 		if (!guc->preempt_wq) {
 			destroy_workqueue(guc->log.runtime.flush_wq);
+			DRM_ERROR("Couldn't allocate workqueue for GuC "
+				  "preemption\n");
 			return -ENOMEM;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c
index 2ffc966..9e5600355 100644
--- a/drivers/gpu/drm/i915/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/intel_guc_log.c
@@ -359,16 +359,21 @@ static bool guc_log_has_runtime(struct intel_guc *guc)
 	return guc->log.runtime.buf_addr != NULL;
 }
 
+static bool guc_log_has_relay(struct intel_guc *guc)
+{
+	return guc->log.runtime.relay_chan != NULL;
+}
+
 static int guc_log_runtime_create(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
 	void *vaddr;
-	struct rchan *guc_log_relay_chan;
-	size_t n_subbufs, subbuf_size;
 	int ret;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
+	GEM_BUG_ON(!guc_log_has_relay(guc));
+
 	GEM_BUG_ON(guc_log_has_runtime(guc));
 
 	ret = i915_gem_object_set_to_wc_domain(guc->log.vma->obj, true);
@@ -387,8 +392,38 @@ static int guc_log_runtime_create(struct intel_guc *guc)
 
 	guc->log.runtime.buf_addr = vaddr;
 
+	INIT_WORK(&guc->log.runtime.flush_work, capture_logs_work);
+
+	return 0;
+}
+
+static void guc_log_runtime_destroy(struct intel_guc *guc)
+{
+	/*
+	 * It's possible that the runtime stuff was never allocated because
+	 * GuC log was disabled at the boot time.
+	 **/
+	if (!guc_log_has_runtime(guc))
+		return;
+
+	i915_gem_object_unpin_map(guc->log.vma->obj);
+	guc->log.runtime.buf_addr = NULL;
+}
+
+int intel_guc_log_relay_create(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	struct rchan *guc_log_relay_chan;
+	size_t n_subbufs, subbuf_size;
+	int ret;
+
+	if (!i915_modparams.guc_log_level)
+		return 0;
+
+	GEM_BUG_ON(guc_log_has_relay(guc));
+
 	 /* Keep the size of sub buffers same as shared log buffer */
-	subbuf_size = guc->log.vma->obj->base.size;
+	subbuf_size = GUC_LOG_SIZE;
 
 	/* Store up to 8 snapshots, which is large enough to buffer sufficient
 	 * boot time logs and provides enough leeway to User, in terms of
@@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct intel_guc *guc)
 		DRM_ERROR("Couldn't create relay chan for GuC logging\n");
 
 		ret = -ENOMEM;
-		goto err_vaddr;
+		goto err;
 	}
 
 	GEM_BUG_ON(guc_log_relay_chan->subbuf_size < subbuf_size);
 	guc->log.runtime.relay_chan = guc_log_relay_chan;
 
-	INIT_WORK(&guc->log.runtime.flush_work, capture_logs_work);
 	return 0;
 
-err_vaddr:
-	i915_gem_object_unpin_map(guc->log.vma->obj);
-	guc->log.runtime.buf_addr = NULL;
+err:
+	/* logging will be off */
+	i915_modparams.guc_log_level = 0;
 	return ret;
 }
 
-static void guc_log_runtime_destroy(struct intel_guc *guc)
+void intel_guc_log_relay_destroy(struct intel_guc *guc)
 {
 	/*
-	 * It's possible that the runtime stuff was never allocated because
+	 * It's possible that the relay was never allocated because
 	 * GuC log was disabled at the boot time.
 	 */
-	if (!guc_log_has_runtime(guc))
+	if (!guc_log_has_relay(guc))
 		return;
 
 	relay_close(guc->log.runtime.relay_chan);
-	i915_gem_object_unpin_map(guc->log.vma->obj);
-	guc->log.runtime.buf_addr = NULL;
+	guc->log.runtime.relay_chan = NULL;
 }
 
 static int guc_log_late_setup(struct intel_guc *guc)
@@ -441,17 +474,26 @@ static int guc_log_late_setup(struct intel_guc *guc)
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
 	int ret;
 
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
-
 	if (!guc_log_has_runtime(guc)) {
 		/*
 		 * If log was disabled at boot time, then setup needed to handle
 		 * log buffer flush interrupts would not have been done yet, so
 		 * do that now.
 		 */
-		ret = guc_log_runtime_create(guc);
+		GEM_BUG_ON(guc_log_has_relay(guc));
+
+		ret = intel_guc_log_relay_create(guc);
 		if (ret)
 			goto err;
+
+		mutex_lock(&dev_priv->drm.struct_mutex);
+		intel_runtime_pm_get(dev_priv);
+		ret = guc_log_runtime_create(guc);
+		intel_runtime_pm_put(dev_priv);
+		mutex_unlock(&dev_priv->drm.struct_mutex);
+
+		if (ret)
+			goto err_relay;
 	}
 
 	ret = guc_log_relay_file_create(guc);
@@ -461,7 +503,11 @@ static int guc_log_late_setup(struct intel_guc *guc)
 	return 0;
 
 err_runtime:
+	mutex_lock(&dev_priv->drm.struct_mutex);
 	guc_log_runtime_destroy(guc);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+err_relay:
+	intel_guc_log_relay_destroy(guc);
 err:
 	/* logging will remain off */
 	i915_modparams.guc_log_level = 0;
@@ -509,17 +555,10 @@ int intel_guc_log_create(struct intel_guc *guc)
 	struct i915_vma *vma;
 	unsigned long offset;
 	u32 flags;
-	u32 size;
 	int ret;
 
 	GEM_BUG_ON(guc->log.vma);
 
-	/* The first page is to save log buffer state. Allocate one
-	 * extra page for others in case for overlap */
-	size = (1 + GUC_LOG_DPC_PAGES + 1 +
-		GUC_LOG_ISR_PAGES + 1 +
-		GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
-
 	/* We require SSE 4.1 for fast reads from the GuC log buffer and
 	 * it should be present on the chipsets supporting GuC based
 	 * submisssions.
@@ -529,7 +568,7 @@ int intel_guc_log_create(struct intel_guc *guc)
 		goto err;
 	}
 
-	vma = intel_guc_allocate_vma(guc, size);
+	vma = intel_guc_allocate_vma(guc, GUC_LOG_SIZE);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
 		goto err;
@@ -584,7 +623,15 @@ int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
 		return 0;
 
 	verbosity = enable_logging ? control_val - 1 : 0;
+
+	ret = mutex_lock_interruptible(&dev_priv->drm.struct_mutex);
+	if (ret)
+		return ret;
+	intel_runtime_pm_get(dev_priv);
 	ret = guc_log_control(guc, enable_logging, verbosity);
+	intel_runtime_pm_put(dev_priv);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+
 	if (ret < 0) {
 		DRM_DEBUG_DRIVER("guc_logging_control action failed %d\n", ret);
 		return ret;
@@ -605,7 +652,11 @@ int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
 		}
 
 		/* GuC logging is currently the only user of Guc2Host interrupts */
+		mutex_lock(&dev_priv->drm.struct_mutex);
+		intel_runtime_pm_get(dev_priv);
 		gen9_enable_guc_interrupts(dev_priv);
+		intel_runtime_pm_put(dev_priv);
+		mutex_unlock(&dev_priv->drm.struct_mutex);
 	} else {
 		/*
 		 * Once logging is disabled, GuC won't generate logs & send an
@@ -613,7 +664,11 @@ int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val)
 		 * which is yet to be captured. So request GuC to update the log
 		 * buffer state and then collect the left over logs.
 		 */
+		mutex_lock(&dev_priv->drm.struct_mutex);
+		intel_runtime_pm_get(dev_priv);
 		guc_flush_logs(guc);
+		intel_runtime_pm_put(dev_priv);
+		mutex_unlock(&dev_priv->drm.struct_mutex);
 
 		/* As logging is disabled, update log level to reflect that */
 		i915_modparams.guc_log_level = 0;
@@ -627,19 +682,21 @@ void i915_guc_log_register(struct drm_i915_private *dev_priv)
 	if (!USES_GUC_SUBMISSION(dev_priv) || !i915_modparams.guc_log_level)
 		return;
 
-	mutex_lock(&dev_priv->drm.struct_mutex);
 	guc_log_late_setup(&dev_priv->guc);
-	mutex_unlock(&dev_priv->drm.struct_mutex);
 }
 
 void i915_guc_log_unregister(struct drm_i915_private *dev_priv)
 {
+	struct intel_guc *guc = &dev_priv->guc;
+
 	if (!USES_GUC_SUBMISSION(dev_priv))
 		return;
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	/* GuC logging is currently the only user of Guc2Host interrupts */
 	gen9_disable_guc_interrupts(dev_priv);
-	guc_log_runtime_destroy(&dev_priv->guc);
+	guc_log_runtime_destroy(guc);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
+
+	intel_guc_log_relay_destroy(guc);
 }
diff --git a/drivers/gpu/drm/i915/intel_guc_log.h b/drivers/gpu/drm/i915/intel_guc_log.h
index f512cf7..84915a8 100644
--- a/drivers/gpu/drm/i915/intel_guc_log.h
+++ b/drivers/gpu/drm/i915/intel_guc_log.h
@@ -32,6 +32,13 @@
 struct drm_i915_private;
 struct intel_guc;
 
+/*
+ * The first page is to save log buffer state. Allocate one
+ * extra page for others in case for overlap
+ */
+#define GUC_LOG_SIZE	((1 + GUC_LOG_DPC_PAGES + 1 + GUC_LOG_ISR_PAGES + \
+			  1 + GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT)
+
 struct intel_guc_log {
 	u32 flags;
 	struct i915_vma *vma;
@@ -52,6 +59,8 @@ struct intel_guc_log {
 
 int intel_guc_log_create(struct intel_guc *guc);
 void intel_guc_log_destroy(struct intel_guc *guc);
+int intel_guc_log_relay_create(struct intel_guc *guc);
+void intel_guc_log_relay_destroy(struct intel_guc *guc);
 int i915_guc_log_control(struct drm_i915_private *dev_priv, u64 control_val);
 void i915_guc_log_register(struct drm_i915_private *dev_priv);
 void i915_guc_log_unregister(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index f78a17b..e3f3509 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -236,28 +236,44 @@ static void guc_disable_communication(struct intel_guc *guc)
 	guc->send = intel_guc_send_nop;
 }
 
-int intel_uc_init_wq(struct drm_i915_private *dev_priv)
+int intel_uc_init_misc(struct drm_i915_private *dev_priv)
 {
+	struct intel_guc *guc = &dev_priv->guc;
 	int ret;
 
 	if (!USES_GUC(dev_priv))
 		return 0;
 
-	ret = intel_guc_init_wq(&dev_priv->guc);
+	ret = intel_guc_init_wq(guc);
 	if (ret) {
 		DRM_ERROR("Couldn't allocate workqueues for GuC\n");
-		return ret;
+		goto err;
+	}
+
+	ret = intel_guc_log_relay_create(guc);
+	if (ret) {
+		DRM_ERROR("Couldn't allocate relay for GuC log\n");
+		goto err_relay;
 	}
 
 	return 0;
+
+err_relay:
+	intel_guc_fini_wq(guc);
+err:
+	return ret;
 }
 
-void intel_uc_fini_wq(struct drm_i915_private *dev_priv)
+void intel_uc_fini_misc(struct drm_i915_private *dev_priv)
 {
+	struct intel_guc *guc = &dev_priv->guc;
+
 	if (!USES_GUC(dev_priv))
 		return;
 
-	intel_guc_fini_wq(&dev_priv->guc);
+	intel_guc_fini_wq(guc);
+
+	intel_guc_log_relay_destroy(guc);
 }
 
 int intel_uc_init(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index 8a72497..f2984e0 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -33,8 +33,8 @@
 void intel_uc_init_mmio(struct drm_i915_private *dev_priv);
 void intel_uc_init_fw(struct drm_i915_private *dev_priv);
 void intel_uc_fini_fw(struct drm_i915_private *dev_priv);
-int intel_uc_init_wq(struct drm_i915_private *dev_priv);
-void intel_uc_fini_wq(struct drm_i915_private *dev_priv);
+int intel_uc_init_misc(struct drm_i915_private *dev_priv);
+void intel_uc_fini_misc(struct drm_i915_private *dev_priv);
 int intel_uc_init_hw(struct drm_i915_private *dev_priv);
 void intel_uc_fini_hw(struct drm_i915_private *dev_priv);
 int intel_uc_init(struct drm_i915_private *dev_priv);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts
  2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
  2018-01-22  8:24 ` Lofstedt, Marta
@ 2018-01-22  8:26 ` Sagar Arun Kamble
  2018-01-22 10:11   ` Chris Wilson
  2018-01-22  8:26 ` [PATCH 3/4] drm/i915/guc: Enable interrupts before resuming GuC during runtime resume Sagar Arun Kamble
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22  8:26 UTC (permalink / raw)
  To: intel-gfx

Disabling GuC interrupts involves access to GuC IRQ control registers
hence ensure device is RPM awake.

v2: Add comment about need to synchronize flush work and log runtime
    destroy

v3: Moved patch earlier in the series and removed comment about future
work. (Tvrtko)

v4-v5: Rebase.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_guc_log.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c
index 9e5600355..4039912 100644
--- a/drivers/gpu/drm/i915/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/intel_guc_log.c
@@ -694,7 +694,10 @@ void i915_guc_log_unregister(struct drm_i915_private *dev_priv)
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	/* GuC logging is currently the only user of Guc2Host interrupts */
+	intel_runtime_pm_get(dev_priv);
 	gen9_disable_guc_interrupts(dev_priv);
+	intel_runtime_pm_put(dev_priv);
+
 	guc_log_runtime_destroy(guc);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] drm/i915/guc: Enable interrupts before resuming GuC during runtime resume
  2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
  2018-01-22  8:24 ` Lofstedt, Marta
  2018-01-22  8:26 ` [PATCH 2/4] drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts Sagar Arun Kamble
@ 2018-01-22  8:26 ` Sagar Arun Kamble
  2018-01-22 10:09   ` Chris Wilson
  2018-01-22  8:26 ` [PATCH 4/4] [HAX] Revert "drm/i915/guc: Keep GuC log disabled by default" Sagar Arun Kamble
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22  8:26 UTC (permalink / raw)
  To: intel-gfx

GuC log streaming needs interrupts enabled prior to GuC resume but
runtime pm interrupt setup was happening post GuC resume. Fix it.
While at it, fix the unwinding of steps in the runtime suspend path.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104695
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 29048b9..1ec12ad 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2600,6 +2600,11 @@ static int intel_runtime_suspend(struct device *kdev)
 
 		intel_runtime_pm_enable_interrupts(dev_priv);
 
+		intel_guc_resume(dev_priv);
+
+		i915_gem_init_swizzling(dev_priv);
+		i915_gem_restore_fences(dev_priv);
+
 		enable_rpm_wakeref_asserts(dev_priv);
 
 		return ret;
@@ -2665,8 +2670,6 @@ static int intel_runtime_resume(struct device *kdev)
 	if (intel_uncore_unclaimed_mmio(dev_priv))
 		DRM_DEBUG_DRIVER("Unclaimed access during suspend, bios?\n");
 
-	intel_guc_resume(dev_priv);
-
 	if (IS_GEN9_LP(dev_priv)) {
 		bxt_disable_dc9(dev_priv);
 		bxt_display_core_init(dev_priv, true);
@@ -2681,6 +2684,10 @@ static int intel_runtime_resume(struct device *kdev)
 
 	intel_uncore_runtime_resume(dev_priv);
 
+	intel_runtime_pm_enable_interrupts(dev_priv);
+
+	intel_guc_resume(dev_priv);
+
 	/*
 	 * No point of rolling back things in case of an error, as the best
 	 * we can do is to hope that things will still work (and disable RPM).
@@ -2688,8 +2695,6 @@ static int intel_runtime_resume(struct device *kdev)
 	i915_gem_init_swizzling(dev_priv);
 	i915_gem_restore_fences(dev_priv);
 
-	intel_runtime_pm_enable_interrupts(dev_priv);
-
 	/*
 	 * On VLV/CHV display interrupts are part of the display
 	 * power well, so hpd is reinitialized from there. For
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] [HAX] Revert "drm/i915/guc: Keep GuC log disabled by default"
  2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
                   ` (2 preceding siblings ...)
  2018-01-22  8:26 ` [PATCH 3/4] drm/i915/guc: Enable interrupts before resuming GuC during runtime resume Sagar Arun Kamble
@ 2018-01-22  8:26 ` Sagar Arun Kamble
  2018-01-22  9:15 ` ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22  8:26 UTC (permalink / raw)
  To: intel-gfx

Comment DRM_ERROR_RATELIMIT outputs as log capturer won't be
running.

This reverts commit bd724318b682587ad2f989ab8e0f7b3d4486ced5.
---
 drivers/gpu/drm/i915/i915_params.h   | 2 +-
 drivers/gpu/drm/i915/intel_guc_log.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 430f5f9..c963603 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -48,7 +48,7 @@
 	param(int, enable_ips, 1) \
 	param(int, invert_brightness, 0) \
 	param(int, enable_guc, 0) \
-	param(int, guc_log_level, 0) \
+	param(int, guc_log_level, -1) \
 	param(char *, guc_firmware_path, NULL) \
 	param(char *, huc_firmware_path, NULL) \
 	param(int, mmio_debug, 0) \
diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c
index 4039912..d600fa4 100644
--- a/drivers/gpu/drm/i915/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/intel_guc_log.c
@@ -228,7 +228,7 @@ static bool guc_check_log_buf_overflow(struct intel_guc *guc,
 			/* buffer_full_cnt is a 4 bit counter */
 			guc->log.total_overflow_count[type] += 16;
 		}
-		DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
+		//DRM_ERROR_RATELIMITED("GuC log buffer overflow\n");
 	}
 
 	return overflow;
@@ -341,7 +341,7 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
 		/* Used rate limited to avoid deluge of messages, logs might be
 		 * getting consumed by User at a slow rate.
 		 */
-		DRM_ERROR_RATELIMITED("no sub-buffer to capture logs\n");
+		//DRM_ERROR_RATELIMITED("no sub-buffer to capture logs\n");
 		guc->log.capture_miss_count++;
 	}
 }
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22  8:24 ` Lofstedt, Marta
@ 2018-01-22  8:34   ` Sagar Arun Kamble
  0 siblings, 0 replies; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22  8:34 UTC (permalink / raw)
  To: Lofstedt, Marta, intel-gfx



On 1/22/2018 1:54 PM, Lofstedt, Marta wrote:
> I recommend that this is tested with a HACK to enable the GUC logs again, so that we can see if it really fixes the issue.
Yes. Patch 4 in the series enables the GuC log for testing.
>> -----Original Message-----
>> From: Kamble, Sagar A
>> Sent: Monday, January 22, 2018 10:26 AM
>> To: intel-gfx@lists.freedesktop.org
>> Cc: Kamble, Sagar A <sagar.a.kamble@intel.com>; Wajdeczko, Michal
>> <Michal.Wajdeczko@intel.com>; Ceraolo Spurio, Daniele
>> <daniele.ceraolospurio@intel.com>; Ursulin, Tvrtko
>> <tvrtko.ursulin@intel.com>; Chris Wilson <chris@chris-wilson.co.uk>; Joonas
>> Lahtinen <joonas.lahtinen@linux.intel.com>; Lofstedt, Marta
>> <marta.lofstedt@intel.com>; Winiarski, Michal <michal.winiarski@intel.com>
>> Subject: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel
>> handling under struct_mutex
>>
>> This patch fixes lockdep issue due to circular locking dependency of
>> struct_mutex, i_mutex_key, mmap_sem, relay_channels_mutex.
>> For GuC log relay channel we create debugfs file that requires i_mutex_key
>> lock and we are doing that under struct_mutex. So we introduced newer
>> dependency as:
>>      &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm-
>>> mmap_sem However, there is dependency from mmap_sem to
>> struct_mutex. Hence we separate the relay create/destroy operation from
>> under struct_mutex.
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected 4.15.0-rc6-CI-
>> Patchwork_7614+ #1 Not tainted
>> ------------------------------------------------------
>> debugfs_test/1388 is trying to acquire lock:
>>   (&dev->struct_mutex){+.+.}, at: [<00000000d5e1d915>]
>> i915_mutex_lock_interruptible+0x47/0x130 [i915]
>>
>> but task is already holding lock:
>>   (&mm->mmap_sem){++++}, at: [<0000000029a9c131>]
>> __do_page_fault+0x106/0x560
>>
>> which lock already depends on the new lock.
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #3 (&mm->mmap_sem){++++}:
>>         _copy_to_user+0x1e/0x70
>>         filldir+0x8c/0xf0
>>         dcache_readdir+0xeb/0x160
>>         iterate_dir+0xdc/0x140
>>         SyS_getdents+0xa0/0x130
>>         entry_SYSCALL_64_fastpath+0x1c/0x89
>>
>> -> #2 (&sb->s_type->i_mutex_key#3){++++}:
>>         start_creating+0x59/0x110
>>         __debugfs_create_file+0x2e/0xe0
>>         relay_create_buf_file+0x62/0x80
>>         relay_late_setup_files+0x84/0x250
>>         guc_log_late_setup+0x4f/0x110 [i915]
>>         i915_guc_log_register+0x32/0x40 [i915]
>>         i915_driver_load+0x7b6/0x1720 [i915]
>>         i915_pci_probe+0x2e/0x90 [i915]
>>         pci_device_probe+0x9c/0x120
>>         driver_probe_device+0x2a3/0x480
>>         __driver_attach+0xd9/0xe0
>>         bus_for_each_dev+0x57/0x90
>>         bus_add_driver+0x168/0x260
>>         driver_register+0x52/0xc0
>>         do_one_initcall+0x39/0x150
>>         do_init_module+0x56/0x1ef
>>         load_module+0x231c/0x2d70
>>         SyS_finit_module+0xa5/0xe0
>>         entry_SYSCALL_64_fastpath+0x1c/0x89
>>
>> -> #1 (relay_channels_mutex){+.+.}:
>>         relay_open+0x12c/0x2b0
>>         intel_guc_log_runtime_create+0xab/0x230 [i915]
>>         intel_guc_init+0x81/0x120 [i915]
>>         intel_uc_init+0x29/0xa0 [i915]
>>         i915_gem_init+0x182/0x530 [i915]
>>         i915_driver_load+0xaa9/0x1720 [i915]
>>         i915_pci_probe+0x2e/0x90 [i915]
>>         pci_device_probe+0x9c/0x120
>>         driver_probe_device+0x2a3/0x480
>>         __driver_attach+0xd9/0xe0
>>         bus_for_each_dev+0x57/0x90
>>         bus_add_driver+0x168/0x260
>>         driver_register+0x52/0xc0
>>         do_one_initcall+0x39/0x150
>>         do_init_module+0x56/0x1ef
>>         load_module+0x231c/0x2d70
>>         SyS_finit_module+0xa5/0xe0
>>         entry_SYSCALL_64_fastpath+0x1c/0x89
>>
>> -> #0 (&dev->struct_mutex){+.+.}:
>>         __mutex_lock+0x81/0x9b0
>>         i915_mutex_lock_interruptible+0x47/0x130 [i915]
>>         i915_gem_fault+0x201/0x790 [i915]
>>         __do_fault+0x15/0x70
>>         __handle_mm_fault+0x677/0xdc0
>>         handle_mm_fault+0x14f/0x2f0
>>         __do_page_fault+0x2d1/0x560
>>         page_fault+0x4c/0x60
>>
>> other info that might help us debug this:
>>
>> Chain exists of:
>>    &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm-
>>> mmap_sem
>>   Possible unsafe locking scenario:
>>
>>         CPU0                    CPU1
>>         ----                    ----
>>    lock(&mm->mmap_sem);
>>                                 lock(&sb->s_type->i_mutex_key#3);
>>                                 lock(&mm->mmap_sem);
>>    lock(&dev->struct_mutex);
>>
>>   *** DEADLOCK ***
>>
>> 1 lock held by debugfs_test/1388:
>>   #0:  (&mm->mmap_sem){++++}, at: [<0000000029a9c131>]
>> __do_page_fault+0x106/0x560
>>
>> stack backtrace:
>> CPU: 2 PID: 1388 Comm: debugfs_test Not tainted 4.15.0-rc6-CI-
>> Patchwork_7614+ #1 Hardware name: To Be Filled By O.E.M. To Be Filled By
>> O.E.M./J4205-ITX, BIOS P1.10 09/29/2016 Call Trace:
>>   dump_stack+0x5f/0x86
>>   print_circular_bug.isra.18+0x1d0/0x2c0
>>   __lock_acquire+0x14ae/0x1b60
>>   ? lock_acquire+0xaf/0x200
>>   lock_acquire+0xaf/0x200
>>   ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
>>   __mutex_lock+0x81/0x9b0
>>   ? i915_mutex_lock_interruptible+0x47/0x130 [i915]  ?
>> i915_mutex_lock_interruptible+0x47/0x130 [i915]  ?
>> i915_mutex_lock_interruptible+0x47/0x130 [i915]
>>   i915_mutex_lock_interruptible+0x47/0x130 [i915]  ?
>> __pm_runtime_resume+0x4f/0x80
>>   i915_gem_fault+0x201/0x790 [i915]
>>   __do_fault+0x15/0x70
>>   ? _raw_spin_unlock+0x29/0x40
>>   __handle_mm_fault+0x677/0xdc0
>>   handle_mm_fault+0x14f/0x2f0
>>   __do_page_fault+0x2d1/0x560
>>   ? page_fault+0x36/0x60
>>   page_fault+0x4c/0x60
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104693
>> Testcase: igt/debugfs_test/read_all_entries # with enable_guc=1 and
>> guc_log_level=1
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Marta Lofstedt <marta.lofstedt@intel.com>
>> Cc: Michal Winiarski <michal.winiarski@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c  |  12 +---
>>   drivers/gpu/drm/i915/i915_drv.c      |   2 +-
>>   drivers/gpu/drm/i915/i915_gem.c      |   4 +-
>>   drivers/gpu/drm/i915/intel_guc.c     |   6 +-
>>   drivers/gpu/drm/i915/intel_guc_log.c | 111
>> ++++++++++++++++++++++++++---------
>>   drivers/gpu/drm/i915/intel_guc_log.h |   9 +++
>>   drivers/gpu/drm/i915/intel_uc.c      |  26 ++++++--
>>   drivers/gpu/drm/i915/intel_uc.h      |   4 +-
>>   8 files changed, 125 insertions(+), 49 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 80dc679..b45be0d 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2467,7 +2467,6 @@ static int i915_guc_log_control_get(void *data, u64
>> *val)  static int i915_guc_log_control_set(void *data, u64 val)  {
>>   	struct drm_i915_private *dev_priv = data;
>> -	int ret;
>>
>>   	if (!HAS_GUC(dev_priv))
>>   		return -ENODEV;
>> @@ -2475,16 +2474,7 @@ static int i915_guc_log_control_set(void *data,
>> u64 val)
>>   	if (!dev_priv->guc.log.vma)
>>   		return -EINVAL;
>>
>> -	ret = mutex_lock_interruptible(&dev_priv-
>>> drm.struct_mutex);
>> -	if (ret)
>> -		return ret;
>> -
>> -	intel_runtime_pm_get(dev_priv);
>> -	ret = i915_guc_log_control(dev_priv, val);
>> -	intel_runtime_pm_put(dev_priv);
>> -
>> -	mutex_unlock(&dev_priv->drm.struct_mutex);
>> -	return ret;
>> +	return i915_guc_log_control(dev_priv, val);
>>   }
>>
>>   DEFINE_SIMPLE_ATTRIBUTE(i915_guc_log_control_fops,
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c index 1fbe378..29048b9 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -626,7 +626,7 @@ static void i915_gem_fini(struct drm_i915_private
>> *dev_priv)
>>   	i915_gem_contexts_fini(dev_priv);
>>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>>
>> -	intel_uc_fini_wq(dev_priv);
>> +	intel_uc_fini_misc(dev_priv);
>>   	i915_gem_cleanup_userptr(dev_priv);
>>
>>   	i915_gem_drain_freed_objects(dev_priv);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>> b/drivers/gpu/drm/i915/i915_gem.c index 7f0684c..34d0d51 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -5193,7 +5193,7 @@ int i915_gem_init(struct drm_i915_private
>> *dev_priv)
>>   	if (ret)
>>   		return ret;
>>
>> -	ret = intel_uc_init_wq(dev_priv);
>> +	ret = intel_uc_init_misc(dev_priv);
>>   	if (ret)
>>   		return ret;
>>
>> @@ -5289,7 +5289,7 @@ int i915_gem_init(struct drm_i915_private
>> *dev_priv)
>>   	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>>
>> -	intel_uc_fini_wq(dev_priv);
>> +	intel_uc_fini_misc(dev_priv);
>>
>>   	if (ret != -EIO)
>>   		i915_gem_cleanup_userptr(dev_priv);
>> diff --git a/drivers/gpu/drm/i915/intel_guc.c
>> b/drivers/gpu/drm/i915/intel_guc.c
>> index ea30e7c..912c9c0 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.c
>> +++ b/drivers/gpu/drm/i915/intel_guc.c
>> @@ -87,8 +87,10 @@ int intel_guc_init_wq(struct intel_guc *guc)
>>   	 */
>>   	guc->log.runtime.flush_wq =
>> alloc_ordered_workqueue("i915-guc_log",
>>
>> 	WQ_HIGHPRI | WQ_FREEZABLE);
>> -	if (!guc->log.runtime.flush_wq)
>> +	if (!guc->log.runtime.flush_wq) {
>> +		DRM_ERROR("Couldn't allocate workqueue for
>> GuC log\n");
>>   		return -ENOMEM;
>> +	}
>>
>>   	/*
>>   	 * Even though both sending GuC action, and adding a new
>> workitem to @@ -109,6 +111,8 @@ int intel_guc_init_wq(struct intel_guc
>> *guc)
>>
>> 		  WQ_HIGHPRI);
>>   		if (!guc->preempt_wq) {
>>   			destroy_workqueue(guc-
>>> log.runtime.flush_wq);
>> +			DRM_ERROR("Couldn't allocate
>> workqueue for GuC "
>> +				  "preemption\n");
>>   			return -ENOMEM;
>>   		}
>>   	}
>> diff --git a/drivers/gpu/drm/i915/intel_guc_log.c
>> b/drivers/gpu/drm/i915/intel_guc_log.c
>> index 2ffc966..9e5600355 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_log.c
>> +++ b/drivers/gpu/drm/i915/intel_guc_log.c
>> @@ -359,16 +359,21 @@ static bool guc_log_has_runtime(struct intel_guc
>> *guc)
>>   	return guc->log.runtime.buf_addr != NULL;  }
>>
>> +static bool guc_log_has_relay(struct intel_guc *guc) {
>> +	return guc->log.runtime.relay_chan != NULL; }
>> +
>>   static int guc_log_runtime_create(struct intel_guc *guc)  {
>>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>   	void *vaddr;
>> -	struct rchan *guc_log_relay_chan;
>> -	size_t n_subbufs, subbuf_size;
>>   	int ret;
>>
>>   	lockdep_assert_held(&dev_priv->drm.struct_mutex);
>>
>> +	GEM_BUG_ON(!guc_log_has_relay(guc));
>> +
>>   	GEM_BUG_ON(guc_log_has_runtime(guc));
>>
>>   	ret = i915_gem_object_set_to_wc_domain(guc->log.vma-
>>> obj, true); @@ -387,8 +392,38 @@ static int guc_log_runtime_create(struct
>> intel_guc *guc)
>>
>>   	guc->log.runtime.buf_addr = vaddr;
>>
>> +	INIT_WORK(&guc->log.runtime.flush_work,
>> capture_logs_work);
>> +
>> +	return 0;
>> +}
>> +
>> +static void guc_log_runtime_destroy(struct intel_guc *guc) {
>> +	/*
>> +	 * It's possible that the runtime stuff was never allocated
>> because
>> +	 * GuC log was disabled at the boot time.
>> +	 **/
>> +	if (!guc_log_has_runtime(guc))
>> +		return;
>> +
>> +	i915_gem_object_unpin_map(guc->log.vma->obj);
>> +	guc->log.runtime.buf_addr = NULL;
>> +}
>> +
>> +int intel_guc_log_relay_create(struct intel_guc *guc) {
>> +	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +	struct rchan *guc_log_relay_chan;
>> +	size_t n_subbufs, subbuf_size;
>> +	int ret;
>> +
>> +	if (!i915_modparams.guc_log_level)
>> +		return 0;
>> +
>> +	GEM_BUG_ON(guc_log_has_relay(guc));
>> +
>>   	 /* Keep the size of sub buffers same as shared log buffer */
>> -	subbuf_size = guc->log.vma->obj->base.size;
>> +	subbuf_size = GUC_LOG_SIZE;
>>
>>   	/* Store up to 8 snapshots, which is large enough to buffer
>> sufficient
>>   	 * boot time logs and provides enough leeway to User, in
>> terms of @@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct
>> intel_guc *guc)
>>   		DRM_ERROR("Couldn't create relay chan for
>> GuC logging\n");
>>
>>   		ret = -ENOMEM;
>> -		goto err_vaddr;
>> +		goto err;
>>   	}
>>
>>   	GEM_BUG_ON(guc_log_relay_chan->subbuf_size <
>> subbuf_size);
>>   	guc->log.runtime.relay_chan = guc_log_relay_chan;
>>
>> -	INIT_WORK(&guc->log.runtime.flush_work,
>> capture_logs_work);
>>   	return 0;
>>
>> -err_vaddr:
>> -	i915_gem_object_unpin_map(guc->log.vma->obj);
>> -	guc->log.runtime.buf_addr = NULL;
>> +err:
>> +	/* logging will be off */
>> +	i915_modparams.guc_log_level = 0;
>>   	return ret;
>>   }
>>
>> -static void guc_log_runtime_destroy(struct intel_guc *guc)
>> +void intel_guc_log_relay_destroy(struct intel_guc *guc)
>>   {
>>   	/*
>> -	 * It's possible that the runtime stuff was never allocated
>> because
>> +	 * It's possible that the relay was never allocated because
>>   	 * GuC log was disabled at the boot time.
>>   	 */
>> -	if (!guc_log_has_runtime(guc))
>> +	if (!guc_log_has_relay(guc))
>>   		return;
>>
>>   	relay_close(guc->log.runtime.relay_chan);
>> -	i915_gem_object_unpin_map(guc->log.vma->obj);
>> -	guc->log.runtime.buf_addr = NULL;
>> +	guc->log.runtime.relay_chan = NULL;
>>   }
>>
>>   static int guc_log_late_setup(struct intel_guc *guc) @@ -441,17 +474,26 @@
>> static int guc_log_late_setup(struct intel_guc *guc)
>>   	struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>   	int ret;
>>
>> -	lockdep_assert_held(&dev_priv->drm.struct_mutex);
>> -
>>   	if (!guc_log_has_runtime(guc)) {
>>   		/*
>>   		 * If log was disabled at boot time, then setup
>> needed to handle
>>   		 * log buffer flush interrupts would not have
>> been done yet, so
>>   		 * do that now.
>>   		 */
>> -		ret = guc_log_runtime_create(guc);
>> +		GEM_BUG_ON(guc_log_has_relay(guc));
>> +
>> +		ret = intel_guc_log_relay_create(guc);
>>   		if (ret)
>>   			goto err;
>> +
>> +		mutex_lock(&dev_priv->drm.struct_mutex);
>> +		intel_runtime_pm_get(dev_priv);
>> +		ret = guc_log_runtime_create(guc);
>> +		intel_runtime_pm_put(dev_priv);
>> +		mutex_unlock(&dev_priv->drm.struct_mutex);
>> +
>> +		if (ret)
>> +			goto err_relay;
>>   	}
>>
>>   	ret = guc_log_relay_file_create(guc);
>> @@ -461,7 +503,11 @@ static int guc_log_late_setup(struct intel_guc *guc)
>>   	return 0;
>>
>>   err_runtime:
>> +	mutex_lock(&dev_priv->drm.struct_mutex);
>>   	guc_log_runtime_destroy(guc);
>> +	mutex_unlock(&dev_priv->drm.struct_mutex);
>> +err_relay:
>> +	intel_guc_log_relay_destroy(guc);
>>   err:
>>   	/* logging will remain off */
>>   	i915_modparams.guc_log_level = 0;
>> @@ -509,17 +555,10 @@ int intel_guc_log_create(struct intel_guc *guc)
>>   	struct i915_vma *vma;
>>   	unsigned long offset;
>>   	u32 flags;
>> -	u32 size;
>>   	int ret;
>>
>>   	GEM_BUG_ON(guc->log.vma);
>>
>> -	/* The first page is to save log buffer state. Allocate one
>> -	 * extra page for others in case for overlap */
>> -	size = (1 + GUC_LOG_DPC_PAGES + 1 +
>> -		GUC_LOG_ISR_PAGES + 1 +
>> -		GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
>> -
>>   	/* We require SSE 4.1 for fast reads from the GuC log buffer
>> and
>>   	 * it should be present on the chipsets supporting GuC based
>>   	 * submisssions.
>> @@ -529,7 +568,7 @@ int intel_guc_log_create(struct intel_guc *guc)
>>   		goto err;
>>   	}
>>
>> -	vma = intel_guc_allocate_vma(guc, size);
>> +	vma = intel_guc_allocate_vma(guc, GUC_LOG_SIZE);
>>   	if (IS_ERR(vma)) {
>>   		ret = PTR_ERR(vma);
>>   		goto err;
>> @@ -584,7 +623,15 @@ int i915_guc_log_control(struct drm_i915_private
>> *dev_priv, u64 control_val)
>>   		return 0;
>>
>>   	verbosity = enable_logging ? control_val - 1 : 0;
>> +
>> +	ret = mutex_lock_interruptible(&dev_priv-
>>> drm.struct_mutex);
>> +	if (ret)
>> +		return ret;
>> +	intel_runtime_pm_get(dev_priv);
>>   	ret = guc_log_control(guc, enable_logging, verbosity);
>> +	intel_runtime_pm_put(dev_priv);
>> +	mutex_unlock(&dev_priv->drm.struct_mutex);
>> +
>>   	if (ret < 0) {
>>   		DRM_DEBUG_DRIVER("guc_logging_control
>> action failed %d\n", ret);
>>   		return ret;
>> @@ -605,7 +652,11 @@ int i915_guc_log_control(struct drm_i915_private
>> *dev_priv, u64 control_val)
>>   		}
>>
>>   		/* GuC logging is currently the only user of
>> Guc2Host interrupts */
>> +		mutex_lock(&dev_priv->drm.struct_mutex);
>> +		intel_runtime_pm_get(dev_priv);
>>   		gen9_enable_guc_interrupts(dev_priv);
>> +		intel_runtime_pm_put(dev_priv);
>> +		mutex_unlock(&dev_priv->drm.struct_mutex);
>>   	} else {
>>   		/*
>>   		 * Once logging is disabled, GuC won't generate
>> logs & send an @@ -613,7 +664,11 @@ int i915_guc_log_control(struct
>> drm_i915_private *dev_priv, u64 control_val)
>>   		 * which is yet to be captured. So request GuC to
>> update the log
>>   		 * buffer state and then collect the left over
>> logs.
>>   		 */
>> +		mutex_lock(&dev_priv->drm.struct_mutex);
>> +		intel_runtime_pm_get(dev_priv);
>>   		guc_flush_logs(guc);
>> +		intel_runtime_pm_put(dev_priv);
>> +		mutex_unlock(&dev_priv->drm.struct_mutex);
>>
>>   		/* As logging is disabled, update log level to
>> reflect that */
>>   		i915_modparams.guc_log_level = 0;
>> @@ -627,19 +682,21 @@ void i915_guc_log_register(struct drm_i915_private
>> *dev_priv)
>>   	if (!USES_GUC_SUBMISSION(dev_priv) ||
>> !i915_modparams.guc_log_level)
>>   		return;
>>
>> -	mutex_lock(&dev_priv->drm.struct_mutex);
>>   	guc_log_late_setup(&dev_priv->guc);
>> -	mutex_unlock(&dev_priv->drm.struct_mutex);
>>   }
>>
>>   void i915_guc_log_unregister(struct drm_i915_private *dev_priv)  {
>> +	struct intel_guc *guc = &dev_priv->guc;
>> +
>>   	if (!USES_GUC_SUBMISSION(dev_priv))
>>   		return;
>>
>>   	mutex_lock(&dev_priv->drm.struct_mutex);
>>   	/* GuC logging is currently the only user of Guc2Host
>> interrupts */
>>   	gen9_disable_guc_interrupts(dev_priv);
>> -	guc_log_runtime_destroy(&dev_priv->guc);
>> +	guc_log_runtime_destroy(guc);
>>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>> +
>> +	intel_guc_log_relay_destroy(guc);
>>   }
>> diff --git a/drivers/gpu/drm/i915/intel_guc_log.h
>> b/drivers/gpu/drm/i915/intel_guc_log.h
>> index f512cf7..84915a8 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_log.h
>> +++ b/drivers/gpu/drm/i915/intel_guc_log.h
>> @@ -32,6 +32,13 @@
>>   struct drm_i915_private;
>>   struct intel_guc;
>>
>> +/*
>> + * The first page is to save log buffer state. Allocate one
>> + * extra page for others in case for overlap  */
>> +#define GUC_LOG_SIZE	((1 + GUC_LOG_DPC_PAGES + 1 +
>> GUC_LOG_ISR_PAGES + \
>> +			  1 + GUC_LOG_CRASH_PAGES +
>> 1) << PAGE_SHIFT)
>> +
>>   struct intel_guc_log {
>>   	u32 flags;
>>   	struct i915_vma *vma;
>> @@ -52,6 +59,8 @@ struct intel_guc_log {
>>
>>   int intel_guc_log_create(struct intel_guc *guc);  void
>> intel_guc_log_destroy(struct intel_guc *guc);
>> +int intel_guc_log_relay_create(struct intel_guc *guc); void
>> +intel_guc_log_relay_destroy(struct intel_guc *guc);
>>   int i915_guc_log_control(struct drm_i915_private *dev_priv, u64
>> control_val);  void i915_guc_log_register(struct drm_i915_private
>> *dev_priv);  void i915_guc_log_unregister(struct drm_i915_private
>> *dev_priv); diff --git a/drivers/gpu/drm/i915/intel_uc.c
>> b/drivers/gpu/drm/i915/intel_uc.c index f78a17b..e3f3509 100644
>> --- a/drivers/gpu/drm/i915/intel_uc.c
>> +++ b/drivers/gpu/drm/i915/intel_uc.c
>> @@ -236,28 +236,44 @@ static void guc_disable_communication(struct
>> intel_guc *guc)
>>   	guc->send = intel_guc_send_nop;
>>   }
>>
>> -int intel_uc_init_wq(struct drm_i915_private *dev_priv)
>> +int intel_uc_init_misc(struct drm_i915_private *dev_priv)
>>   {
>> +	struct intel_guc *guc = &dev_priv->guc;
>>   	int ret;
>>
>>   	if (!USES_GUC(dev_priv))
>>   		return 0;
>>
>> -	ret = intel_guc_init_wq(&dev_priv->guc);
>> +	ret = intel_guc_init_wq(guc);
>>   	if (ret) {
>>   		DRM_ERROR("Couldn't allocate workqueues for
>> GuC\n");
>> -		return ret;
>> +		goto err;
>> +	}
>> +
>> +	ret = intel_guc_log_relay_create(guc);
>> +	if (ret) {
>> +		DRM_ERROR("Couldn't allocate relay for GuC
>> log\n");
>> +		goto err_relay;
>>   	}
>>
>>   	return 0;
>> +
>> +err_relay:
>> +	intel_guc_fini_wq(guc);
>> +err:
>> +	return ret;
>>   }
>>
>> -void intel_uc_fini_wq(struct drm_i915_private *dev_priv)
>> +void intel_uc_fini_misc(struct drm_i915_private *dev_priv)
>>   {
>> +	struct intel_guc *guc = &dev_priv->guc;
>> +
>>   	if (!USES_GUC(dev_priv))
>>   		return;
>>
>> -	intel_guc_fini_wq(&dev_priv->guc);
>> +	intel_guc_fini_wq(guc);
>> +
>> +	intel_guc_log_relay_destroy(guc);
>>   }
>>
>>   int intel_uc_init(struct drm_i915_private *dev_priv) diff --git
>> a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h index
>> 8a72497..f2984e0 100644
>> --- a/drivers/gpu/drm/i915/intel_uc.h
>> +++ b/drivers/gpu/drm/i915/intel_uc.h
>> @@ -33,8 +33,8 @@
>>   void intel_uc_init_mmio(struct drm_i915_private *dev_priv);  void
>> intel_uc_init_fw(struct drm_i915_private *dev_priv);  void
>> intel_uc_fini_fw(struct drm_i915_private *dev_priv); -int
>> intel_uc_init_wq(struct drm_i915_private *dev_priv); -void
>> intel_uc_fini_wq(struct drm_i915_private *dev_priv);
>> +int intel_uc_init_misc(struct drm_i915_private *dev_priv); void
>> +intel_uc_fini_misc(struct drm_i915_private *dev_priv);
>>   int intel_uc_init_hw(struct drm_i915_private *dev_priv);  void
>> intel_uc_fini_hw(struct drm_i915_private *dev_priv);  int intel_uc_init(struct
>> drm_i915_private *dev_priv);
>> --
>> 1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
                   ` (3 preceding siblings ...)
  2018-01-22  8:26 ` [PATCH 4/4] [HAX] Revert "drm/i915/guc: Keep GuC log disabled by default" Sagar Arun Kamble
@ 2018-01-22  9:15 ` Patchwork
  2018-01-22 10:16 ` [PATCH 1/4] " Chris Wilson
  2018-01-22 12:40 ` ✗ Fi.CI.IGT: failure for series starting with [1/4] " Patchwork
  6 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2018-01-22  9:15 UTC (permalink / raw)
  To: Sagar Arun Kamble; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
URL   : https://patchwork.freedesktop.org/series/36875/
State : success

== Summary ==

Series 36875v1 series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
https://patchwork.freedesktop.org/api/1.0/series/36875/revisions/1/mbox/

Test gem_ringfill:
        Subgroup basic-default:
                skip       -> PASS       (fi-bsw-n3050)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                incomplete -> PASS       (fi-snb-2520m) fdo#103713

fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:435s
fi-bdw-gvtdvm    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:437s
fi-blb-e6850     total:288  pass:223  dwarn:1   dfail:0   fail:0   skip:64  time:385s
fi-bsw-n3050     total:288  pass:242  dwarn:0   dfail:0   fail:0   skip:46  time:523s
fi-bwr-2160      total:288  pass:183  dwarn:0   dfail:0   fail:0   skip:105 time:295s
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:500s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:502s
fi-byt-j1900     total:288  pass:253  dwarn:0   dfail:0   fail:0   skip:35  time:496s
fi-byt-n2820     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:479s
fi-elk-e7500     total:224  pass:168  dwarn:10  dfail:0   fail:0   skip:45 
fi-gdg-551       total:288  pass:179  dwarn:0   dfail:0   fail:1   skip:108 time:303s
fi-glk-1         total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:525s
fi-hsw-4770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:398s
fi-hsw-4770r     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:407s
fi-ilk-650       total:288  pass:228  dwarn:0   dfail:0   fail:0   skip:60  time:426s
fi-ivb-3520m     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:473s
fi-ivb-3770      total:288  pass:255  dwarn:0   dfail:0   fail:0   skip:33  time:421s
fi-kbl-7500u     total:288  pass:263  dwarn:1   dfail:0   fail:0   skip:24  time:465s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:504s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:461s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:511s
fi-pnv-d510      total:288  pass:222  dwarn:1   dfail:0   fail:0   skip:65  time:631s
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:443s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:521s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:538s
fi-skl-6700k2    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:500s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:512s
fi-skl-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:446s
fi-snb-2520m     total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:549s
fi-snb-2600      total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:406s
Blacklisted hosts:
fi-cfl-s2        total:288  pass:256  dwarn:0   dfail:0   fail:3   skip:26  time:565s
fi-glk-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:483s
fi-skl-guc       total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:423s

a1f34f9415a41d5a4d219bcbbe5cf39a0b07cc22 drm-tip: 2018y-01m-20d-19h-27m-32s UTC integration manifest
d79027e481e0 Revert "drm/i915/guc: Keep GuC log disabled by default"
cce3e79ffe73 drm/i915/guc: Enable interrupts before resuming GuC during runtime resume
1ae00207a925 drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts
3ff2d9df1077 drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7735/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/i915/guc: Enable interrupts before resuming GuC during runtime resume
  2018-01-22  8:26 ` [PATCH 3/4] drm/i915/guc: Enable interrupts before resuming GuC during runtime resume Sagar Arun Kamble
@ 2018-01-22 10:09   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2018-01-22 10:09 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx; +Cc: Sagar

Quoting Sagar Arun Kamble (2018-01-22 08:26:03)
> GuC log streaming needs interrupts enabled prior to GuC resume but
> runtime pm interrupt setup was happening post GuC resume. Fix it.
> While at it, fix the unwinding of steps in the runtime suspend path.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104695
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Marta Lofstedt <marta.lofstedt@intel.com>

Ok, moving the interrupts around shouldn't be too much of an issue
during rpm suspend/resume as we hold the rpm wakeref for the operation.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts
  2018-01-22  8:26 ` [PATCH 2/4] drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts Sagar Arun Kamble
@ 2018-01-22 10:11   ` Chris Wilson
  2018-01-22 10:30     ` Sagar Arun Kamble
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2018-01-22 10:11 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx

Quoting Sagar Arun Kamble (2018-01-22 08:26:02)
> Disabling GuC interrupts involves access to GuC IRQ control registers
> hence ensure device is RPM awake.
> 
> v2: Add comment about need to synchronize flush work and log runtime
>     destroy
> 
> v3: Moved patch earlier in the series and removed comment about future
> work. (Tvrtko)
> 
> v4-v5: Rebase.
> 
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_guc_log.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c
> index 9e5600355..4039912 100644
> --- a/drivers/gpu/drm/i915/intel_guc_log.c
> +++ b/drivers/gpu/drm/i915/intel_guc_log.c
> @@ -694,7 +694,10 @@ void i915_guc_log_unregister(struct drm_i915_private *dev_priv)
>  
>         mutex_lock(&dev_priv->drm.struct_mutex);
>         /* GuC logging is currently the only user of Guc2Host interrupts */
> +       intel_runtime_pm_get(dev_priv);
>         gen9_disable_guc_interrupts(dev_priv);

Do we have/want assert_rpm_wakelock_held() in
gen9_disable_guc_interrupts() ?

Are we really locking the interrupts on/off with struct_mutex?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
                   ` (4 preceding siblings ...)
  2018-01-22  9:15 ` ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Patchwork
@ 2018-01-22 10:16 ` Chris Wilson
  2018-01-22 10:38   ` Sagar Arun Kamble
  2018-01-22 12:40 ` ✗ Fi.CI.IGT: failure for series starting with [1/4] " Patchwork
  6 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2018-01-22 10:16 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx

Quoting Sagar Arun Kamble (2018-01-22 08:26:01)
> +int intel_guc_log_relay_create(struct intel_guc *guc)
> +{
> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
> +       struct rchan *guc_log_relay_chan;
> +       size_t n_subbufs, subbuf_size;
> +       int ret;
> +
> +       if (!i915_modparams.guc_log_level)
> +               return 0;
> +
> +       GEM_BUG_ON(guc_log_has_relay(guc));
> +
>          /* Keep the size of sub buffers same as shared log buffer */
> -       subbuf_size = guc->log.vma->obj->base.size;
> +       subbuf_size = GUC_LOG_SIZE;
>  
>         /* Store up to 8 snapshots, which is large enough to buffer sufficient
>          * boot time logs and provides enough leeway to User, in terms of
> @@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct intel_guc *guc)
>                 DRM_ERROR("Couldn't create relay chan for GuC logging\n");
>  
>                 ret = -ENOMEM;
> -               goto err_vaddr;
> +               goto err;
>         }
>  
>         GEM_BUG_ON(guc_log_relay_chan->subbuf_size < subbuf_size);
>         guc->log.runtime.relay_chan = guc_log_relay_chan;
>  
> -       INIT_WORK(&guc->log.runtime.flush_work, capture_logs_work);
>         return 0;
>  
> -err_vaddr:
> -       i915_gem_object_unpin_map(guc->log.vma->obj);
> -       guc->log.runtime.buf_addr = NULL;
> +err:
> +       /* logging will be off */
> +       i915_modparams.guc_log_level = 0;
>         return ret;
>  }
>  
> -static void guc_log_runtime_destroy(struct intel_guc *guc)
> +void intel_guc_log_relay_destroy(struct intel_guc *guc)
>  {

Now exposed to multiple users, we need to document what the locking
requirements are here. Or add some local locking. Looks like at the
moment, _create is using struct_mutex, which as we are only handling the
relay may be overkill (but since this is one-off init, fair enough).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts
  2018-01-22 10:11   ` Chris Wilson
@ 2018-01-22 10:30     ` Sagar Arun Kamble
  0 siblings, 0 replies; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22 10:30 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 1/22/2018 3:41 PM, Chris Wilson wrote:
> Quoting Sagar Arun Kamble (2018-01-22 08:26:02)
>> Disabling GuC interrupts involves access to GuC IRQ control registers
>> hence ensure device is RPM awake.
>>
>> v2: Add comment about need to synchronize flush work and log runtime
>>      destroy
>>
>> v3: Moved patch earlier in the series and removed comment about future
>> work. (Tvrtko)
>>
>> v4-v5: Rebase.
>>
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_guc_log.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c
>> index 9e5600355..4039912 100644
>> --- a/drivers/gpu/drm/i915/intel_guc_log.c
>> +++ b/drivers/gpu/drm/i915/intel_guc_log.c
>> @@ -694,7 +694,10 @@ void i915_guc_log_unregister(struct drm_i915_private *dev_priv)
>>   
>>          mutex_lock(&dev_priv->drm.struct_mutex);
>>          /* GuC logging is currently the only user of Guc2Host interrupts */
>> +       intel_runtime_pm_get(dev_priv);
>>          gen9_disable_guc_interrupts(dev_priv);
> Do we have/want assert_rpm_wakelock_held() in
> gen9_disable_guc_interrupts() ?
We don't have. Will add it.
> Are we really locking the interrupts on/off with struct_mutex?
Yes.
> -Chris
Thanks,
Sagar
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22 10:16 ` [PATCH 1/4] " Chris Wilson
@ 2018-01-22 10:38   ` Sagar Arun Kamble
  2018-01-22 10:47     ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22 10:38 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 1/22/2018 3:46 PM, Chris Wilson wrote:
> Quoting Sagar Arun Kamble (2018-01-22 08:26:01)
>> +int intel_guc_log_relay_create(struct intel_guc *guc)
>> +{
>> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +       struct rchan *guc_log_relay_chan;
>> +       size_t n_subbufs, subbuf_size;
>> +       int ret;
>> +
>> +       if (!i915_modparams.guc_log_level)
>> +               return 0;
>> +
>> +       GEM_BUG_ON(guc_log_has_relay(guc));
>> +
>>           /* Keep the size of sub buffers same as shared log buffer */
>> -       subbuf_size = guc->log.vma->obj->base.size;
>> +       subbuf_size = GUC_LOG_SIZE;
>>   
>>          /* Store up to 8 snapshots, which is large enough to buffer sufficient
>>           * boot time logs and provides enough leeway to User, in terms of
>> @@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct intel_guc *guc)
>>                  DRM_ERROR("Couldn't create relay chan for GuC logging\n");
>>   
>>                  ret = -ENOMEM;
>> -               goto err_vaddr;
>> +               goto err;
>>          }
>>   
>>          GEM_BUG_ON(guc_log_relay_chan->subbuf_size < subbuf_size);
>>          guc->log.runtime.relay_chan = guc_log_relay_chan;
>>   
>> -       INIT_WORK(&guc->log.runtime.flush_work, capture_logs_work);
>>          return 0;
>>   
>> -err_vaddr:
>> -       i915_gem_object_unpin_map(guc->log.vma->obj);
>> -       guc->log.runtime.buf_addr = NULL;
>> +err:
>> +       /* logging will be off */
>> +       i915_modparams.guc_log_level = 0;
>>          return ret;
>>   }
>>   
>> -static void guc_log_runtime_destroy(struct intel_guc *guc)
>> +void intel_guc_log_relay_destroy(struct intel_guc *guc)
>>   {
> Now exposed to multiple users, we need to document what the locking
> requirements are here. Or add some local locking.
Do you mean locking between relay_create and relay_destroy?
>   Looks like at the
> moment, _create is using struct_mutex,
relay_create and relay_destroy are now to be done outside of struct_mutex.
I will add this documentation to the functions.
> which as we are only handling the
> relay may be overkill (but since this is one-off init, fair enough).
> -Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22 10:38   ` Sagar Arun Kamble
@ 2018-01-22 10:47     ` Chris Wilson
  2018-01-22 15:27       ` Sagar Arun Kamble
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2018-01-22 10:47 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx

Quoting Sagar Arun Kamble (2018-01-22 10:38:10)
> 
> 
> On 1/22/2018 3:46 PM, Chris Wilson wrote:
> > Quoting Sagar Arun Kamble (2018-01-22 08:26:01)
> >> +int intel_guc_log_relay_create(struct intel_guc *guc)
> >> +{
> >> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
> >> +       struct rchan *guc_log_relay_chan;
> >> +       size_t n_subbufs, subbuf_size;
> >> +       int ret;
> >> +
> >> +       if (!i915_modparams.guc_log_level)
> >> +               return 0;
> >> +
> >> +       GEM_BUG_ON(guc_log_has_relay(guc));
> >> +
> >>           /* Keep the size of sub buffers same as shared log buffer */
> >> -       subbuf_size = guc->log.vma->obj->base.size;
> >> +       subbuf_size = GUC_LOG_SIZE;
> >>   
> >>          /* Store up to 8 snapshots, which is large enough to buffer sufficient
> >>           * boot time logs and provides enough leeway to User, in terms of
> >> @@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct intel_guc *guc)
> >>                  DRM_ERROR("Couldn't create relay chan for GuC logging\n");
> >>   
> >>                  ret = -ENOMEM;
> >> -               goto err_vaddr;
> >> +               goto err;
> >>          }
> >>   
> >>          GEM_BUG_ON(guc_log_relay_chan->subbuf_size < subbuf_size);
> >>          guc->log.runtime.relay_chan = guc_log_relay_chan;
> >>   
> >> -       INIT_WORK(&guc->log.runtime.flush_work, capture_logs_work);
> >>          return 0;
> >>   
> >> -err_vaddr:
> >> -       i915_gem_object_unpin_map(guc->log.vma->obj);
> >> -       guc->log.runtime.buf_addr = NULL;
> >> +err:
> >> +       /* logging will be off */
> >> +       i915_modparams.guc_log_level = 0;
> >>          return ret;
> >>   }
> >>   
> >> -static void guc_log_runtime_destroy(struct intel_guc *guc)
> >> +void intel_guc_log_relay_destroy(struct intel_guc *guc)
> >>   {
> > Now exposed to multiple users, we need to document what the locking
> > requirements are here. Or add some local locking.
> Do you mean locking between relay_create and relay_destroy?

We need a lock around guc->log.runtime.relay_chan as the destroy path is
not ostensibly serialised between multiple potential callers. Ordinarily
I would have said that serialisation for create/destroy/access of
relay_chan was guaranteed by init/fini ordering, but that's no longer
clear (based on a 5min read of the patch).

The most important question is "can relay_destroy be called while some
user still has access to the relay_chan?"

> >   Looks like at the
> > moment, _create is using struct_mutex,
> relay_create and relay_destroy are now to be done outside of struct_mutex.
> I will add this documentation to the functions.

(lockdep_assert_held :)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
                   ` (5 preceding siblings ...)
  2018-01-22 10:16 ` [PATCH 1/4] " Chris Wilson
@ 2018-01-22 12:40 ` Patchwork
  6 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2018-01-22 12:40 UTC (permalink / raw)
  To: Sagar Arun Kamble; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
URL   : https://patchwork.freedesktop.org/series/36875/
State : failure

== Summary ==

Test drv_suspend:
        Subgroup forcewake:
                skip       -> PASS       (shard-hsw)
Test kms_frontbuffer_tracking:
        Subgroup fbc-1p-primscrn-pri-indfb-draw-render:
                fail       -> PASS       (shard-apl) fdo#101623 +1
Test kms_flip:
        Subgroup 2x-flip-vs-expired-vblank-interruptible:
                pass       -> FAIL       (shard-hsw)
        Subgroup rcs-wf_vblank-vs-dpms-interruptible:
                dmesg-warn -> PASS       (shard-hsw) fdo#102614
        Subgroup flip-vs-expired-vblank-interruptible:
                fail       -> PASS       (shard-apl) fdo#102887
        Subgroup modeset-vs-vblank-race-interruptible:
                pass       -> FAIL       (shard-apl) fdo#103060
        Subgroup vblank-vs-suspend-interruptible:
                incomplete -> PASS       (shard-hsw) fdo#100368

fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368

shard-apl        total:2702 pass:1669 dwarn:1   dfail:0   fail:25  skip:1006 time:14465s
shard-hsw        total:2780 pass:1721 dwarn:1   dfail:0   fail:15  skip:1042 time:15536s
shard-snb        total:2780 pass:1316 dwarn:1   dfail:0   fail:13  skip:1450 time:8109s
Blacklisted hosts:
shard-kbl        total:2776 pass:1834 dwarn:1   dfail:0   fail:24  skip:916 time:10705s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7735/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22 10:47     ` Chris Wilson
@ 2018-01-22 15:27       ` Sagar Arun Kamble
  2018-01-23 10:24         ` Sagar Arun Kamble
  0 siblings, 1 reply; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-22 15:27 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 1/22/2018 4:17 PM, Chris Wilson wrote:
> Quoting Sagar Arun Kamble (2018-01-22 10:38:10)
>>
>> On 1/22/2018 3:46 PM, Chris Wilson wrote:
>>> Quoting Sagar Arun Kamble (2018-01-22 08:26:01)
>>>> +int intel_guc_log_relay_create(struct intel_guc *guc)
>>>> +{
>>>> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>>> +       struct rchan *guc_log_relay_chan;
>>>> +       size_t n_subbufs, subbuf_size;
>>>> +       int ret;
>>>> +
>>>> +       if (!i915_modparams.guc_log_level)
>>>> +               return 0;
>>>> +
>>>> +       GEM_BUG_ON(guc_log_has_relay(guc));
>>>> +
>>>>            /* Keep the size of sub buffers same as shared log buffer */
>>>> -       subbuf_size = guc->log.vma->obj->base.size;
>>>> +       subbuf_size = GUC_LOG_SIZE;
>>>>    
>>>>           /* Store up to 8 snapshots, which is large enough to buffer sufficient
>>>>            * boot time logs and provides enough leeway to User, in terms of
>>>> @@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct intel_guc *guc)
>>>>                   DRM_ERROR("Couldn't create relay chan for GuC logging\n");
>>>>    
>>>>                   ret = -ENOMEM;
>>>> -               goto err_vaddr;
>>>> +               goto err;
>>>>           }
>>>>    
>>>>           GEM_BUG_ON(guc_log_relay_chan->subbuf_size < subbuf_size);
>>>>           guc->log.runtime.relay_chan = guc_log_relay_chan;
>>>>    
>>>> -       INIT_WORK(&guc->log.runtime.flush_work, capture_logs_work);
>>>>           return 0;
>>>>    
>>>> -err_vaddr:
>>>> -       i915_gem_object_unpin_map(guc->log.vma->obj);
>>>> -       guc->log.runtime.buf_addr = NULL;
>>>> +err:
>>>> +       /* logging will be off */
>>>> +       i915_modparams.guc_log_level = 0;
>>>>           return ret;
>>>>    }
>>>>    
>>>> -static void guc_log_runtime_destroy(struct intel_guc *guc)
>>>> +void intel_guc_log_relay_destroy(struct intel_guc *guc)
>>>>    {
>>> Now exposed to multiple users, we need to document what the locking
>>> requirements are here. Or add some local locking.
>> Do you mean locking between relay_create and relay_destroy?
> We need a lock around guc->log.runtime.relay_chan as the destroy path is
> not ostensibly serialised between multiple potential callers. Ordinarily
> I would have said that serialisation for create/destroy/access of
> relay_chan was guaranteed by init/fini ordering,
I was also initially thinking that init/fini ordering should take care 
of synchronization.
Checked further and I see that relay_open and relay_destroy are 
synchronized by relay_channels_mutex
and internally if needed through debugfs inode synchronization. So I 
feel we need not add new lock.
>   but that's no longer
> clear (based on a 5min read of the patch).
>
> The most important question is "can relay_destroy be called while some
> user still has access to the relay_chan?"
>
>>>    Looks like at the
>>> moment, _create is using struct_mutex,
>> relay_create and relay_destroy are now to be done outside of struct_mutex.
>> I will add this documentation to the functions.
> (lockdep_assert_held :)
> -Chris

-- 
Thanks,
Sagar

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex
  2018-01-22 15:27       ` Sagar Arun Kamble
@ 2018-01-23 10:24         ` Sagar Arun Kamble
  0 siblings, 0 replies; 16+ messages in thread
From: Sagar Arun Kamble @ 2018-01-23 10:24 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 1/22/2018 8:57 PM, Sagar Arun Kamble wrote:
>
>
> On 1/22/2018 4:17 PM, Chris Wilson wrote:
>> Quoting Sagar Arun Kamble (2018-01-22 10:38:10)
>>>
>>> On 1/22/2018 3:46 PM, Chris Wilson wrote:
>>>> Quoting Sagar Arun Kamble (2018-01-22 08:26:01)
>>>>> +int intel_guc_log_relay_create(struct intel_guc *guc)
>>>>> +{
>>>>> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>>>>> +       struct rchan *guc_log_relay_chan;
>>>>> +       size_t n_subbufs, subbuf_size;
>>>>> +       int ret;
>>>>> +
>>>>> +       if (!i915_modparams.guc_log_level)
>>>>> +               return 0;
>>>>> +
>>>>> +       GEM_BUG_ON(guc_log_has_relay(guc));
>>>>> +
>>>>>            /* Keep the size of sub buffers same as shared log 
>>>>> buffer */
>>>>> -       subbuf_size = guc->log.vma->obj->base.size;
>>>>> +       subbuf_size = GUC_LOG_SIZE;
>>>>>              /* Store up to 8 snapshots, which is large enough to 
>>>>> buffer sufficient
>>>>>            * boot time logs and provides enough leeway to User, in 
>>>>> terms of
>>>>> @@ -407,33 +442,31 @@ static int guc_log_runtime_create(struct 
>>>>> intel_guc *guc)
>>>>>                   DRM_ERROR("Couldn't create relay chan for GuC 
>>>>> logging\n");
>>>>>                      ret = -ENOMEM;
>>>>> -               goto err_vaddr;
>>>>> +               goto err;
>>>>>           }
>>>>>              GEM_BUG_ON(guc_log_relay_chan->subbuf_size < 
>>>>> subbuf_size);
>>>>>           guc->log.runtime.relay_chan = guc_log_relay_chan;
>>>>>    -       INIT_WORK(&guc->log.runtime.flush_work, 
>>>>> capture_logs_work);
>>>>>           return 0;
>>>>>    -err_vaddr:
>>>>> - i915_gem_object_unpin_map(guc->log.vma->obj);
>>>>> -       guc->log.runtime.buf_addr = NULL;
>>>>> +err:
>>>>> +       /* logging will be off */
>>>>> +       i915_modparams.guc_log_level = 0;
>>>>>           return ret;
>>>>>    }
>>>>>    -static void guc_log_runtime_destroy(struct intel_guc *guc)
>>>>> +void intel_guc_log_relay_destroy(struct intel_guc *guc)
>>>>>    {
>>>> Now exposed to multiple users, we need to document what the locking
>>>> requirements are here. Or add some local locking.
>>> Do you mean locking between relay_create and relay_destroy?
>> We need a lock around guc->log.runtime.relay_chan as the destroy path is
>> not ostensibly serialised between multiple potential callers. Ordinarily
>> I would have said that serialisation for create/destroy/access of
>> relay_chan was guaranteed by init/fini ordering,
> I was also initially thinking that init/fini ordering should take care 
> of synchronization.
> Checked further and I see that relay_open and relay_destroy are 
> synchronized by relay_channels_mutex
> and internally if needed through debugfs inode synchronization. So I 
> feel we need not add new lock.
Sorry. after some more thinking, I am now convinced that lock will be 
needed as guc_log_has_relay was accessing
it without any locking. Will share new rev. Thanks for the review.
>>   but that's no longer
>> clear (based on a 5min read of the patch).
>>
>> The most important question is "can relay_destroy be called while some
>> user still has access to the relay_chan?"
>>
>>>>    Looks like at the
>>>> moment, _create is using struct_mutex,
>>> relay_create and relay_destroy are now to be done outside of 
>>> struct_mutex.
>>> I will add this documentation to the functions.
>> (lockdep_assert_held :)
>> -Chris
>

-- 
Thanks,
Sagar

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-01-23 10:24 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-22  8:26 [PATCH 1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Sagar Arun Kamble
2018-01-22  8:24 ` Lofstedt, Marta
2018-01-22  8:34   ` Sagar Arun Kamble
2018-01-22  8:26 ` [PATCH 2/4] drm/i915/guc: Grab RPM wakelock while disabling GuC interrupts Sagar Arun Kamble
2018-01-22 10:11   ` Chris Wilson
2018-01-22 10:30     ` Sagar Arun Kamble
2018-01-22  8:26 ` [PATCH 3/4] drm/i915/guc: Enable interrupts before resuming GuC during runtime resume Sagar Arun Kamble
2018-01-22 10:09   ` Chris Wilson
2018-01-22  8:26 ` [PATCH 4/4] [HAX] Revert "drm/i915/guc: Keep GuC log disabled by default" Sagar Arun Kamble
2018-01-22  9:15 ` ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex Patchwork
2018-01-22 10:16 ` [PATCH 1/4] " Chris Wilson
2018-01-22 10:38   ` Sagar Arun Kamble
2018-01-22 10:47     ` Chris Wilson
2018-01-22 15:27       ` Sagar Arun Kamble
2018-01-23 10:24         ` Sagar Arun Kamble
2018-01-22 12:40 ` ✗ Fi.CI.IGT: failure for series starting with [1/4] " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.