All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andi Shyti <andi.shyti@linux.intel.com>
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	Andi Shyti <andi.shyti@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>,
	Chris Wilson <chris.p.wilson@linux.intel.com>,
	Andi Shyti <andi@etezian.org>
Subject: [PATCH v2] drm/i915/pcode: Give the punit time to settle before fatally failing
Date: Mon,  6 Feb 2023 19:32:36 +0100	[thread overview]
Message-ID: <20230206183236.109908-1-andi.shyti@linux.intel.com> (raw)

From: Aravind Iddamsetty <aravind.iddamsetty@intel.com>

During module load the punit might still be busy with its booting
routines. During this time we try to communicate with it but we
fail because we don't receive any feedback from it and we return
immediately with a -EINVAL fatal error.

At this point the driver load is "dramatically" aborted. The
following error message notifies us about it.

   i915 0000:4d:00.0: drm_WARN_ON_ONCE(timeout_base_ms > 3)

It would be enough to wait a little in order to give the punit
the chance to come up bright and shiny, ready to interact with
the driver.

Wait up 10 seconds for the punit to settle and complete any
outstanding transactions upon module load. If it still fails try
again with a longer timeout, 180s, 3 minutes. If it still fails
then return -EPROBE_DEFER, in order to give the punit a second
chance.

Even if these timers might look long, we should consider that the
punit, depending on the platforms, might need long times to
complete its routines. Besides we want to try anything possible
to move forward before deciding to abort the driver's load.

The issue has been reported in:

   https://gitlab.freedesktop.org/drm/intel/-/issues/7814

The changes in this patch are valid only and uniquely during
boot. The common transactions with the punit during the driver's
normal operation are not affected.

Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Co-developed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
Hi,

I'm proposing again the same patch as v1 with a hopefully more
descriptive commit log in order to minimize the
misunderstandings that we had during the v1 review.

Thanks,
Andi

Changelog:
==========
v1 -> v2:
 - write a more descriptive commit log.
 - add Chris SoB which was triggering a checkpatch error.

 drivers/gpu/drm/i915/intel_pcode.c | 35 ++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pcode.c b/drivers/gpu/drm/i915/intel_pcode.c
index a234d9b4ed143..3db2ba439bb57 100644
--- a/drivers/gpu/drm/i915/intel_pcode.c
+++ b/drivers/gpu/drm/i915/intel_pcode.c
@@ -204,15 +204,42 @@ int skl_pcode_request(struct intel_uncore *uncore, u32 mbox, u32 request,
 #undef COND
 }
 
+static int pcode_init_wait(struct intel_uncore *uncore, int timeout_ms)
+{
+	if (__intel_wait_for_register_fw(uncore,
+					 GEN6_PCODE_MAILBOX,
+					 GEN6_PCODE_READY, 0,
+					 500, timeout_ms,
+					 NULL))
+		return -EPROBE_DEFER;
+
+	return skl_pcode_request(uncore,
+				 DG1_PCODE_STATUS,
+				 DG1_UNCORE_GET_INIT_STATUS,
+				 DG1_UNCORE_INIT_STATUS_COMPLETE,
+				 DG1_UNCORE_INIT_STATUS_COMPLETE, timeout_ms);
+}
+
 int intel_pcode_init(struct intel_uncore *uncore)
 {
+	int err;
+
 	if (!IS_DGFX(uncore->i915))
 		return 0;
 
-	return skl_pcode_request(uncore, DG1_PCODE_STATUS,
-				 DG1_UNCORE_GET_INIT_STATUS,
-				 DG1_UNCORE_INIT_STATUS_COMPLETE,
-				 DG1_UNCORE_INIT_STATUS_COMPLETE, 180000);
+	/*
+	 * Wait 10 seconds so that the punit to settle and complete
+	 * any outstanding transactions upon module load
+	 */
+	err = pcode_init_wait(uncore, 10000);
+
+	if (err) {
+		drm_notice(&uncore->i915->drm,
+			   "Waiting for HW initialisation...\n");
+		err = pcode_init_wait(uncore, 180000);
+	}
+
+	return err;
 }
 
 int snb_pcode_read_p(struct intel_uncore *uncore, u32 mbcmd, u32 p1, u32 p2, u32 *val)
-- 
2.39.1


WARNING: multiple messages have this Message-ID (diff)
From: Andi Shyti <andi.shyti@linux.intel.com>
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Chris Wilson <chris.p.wilson@linux.intel.com>
Subject: [Intel-gfx] [PATCH v2] drm/i915/pcode: Give the punit time to settle before fatally failing
Date: Mon,  6 Feb 2023 19:32:36 +0100	[thread overview]
Message-ID: <20230206183236.109908-1-andi.shyti@linux.intel.com> (raw)

From: Aravind Iddamsetty <aravind.iddamsetty@intel.com>

During module load the punit might still be busy with its booting
routines. During this time we try to communicate with it but we
fail because we don't receive any feedback from it and we return
immediately with a -EINVAL fatal error.

At this point the driver load is "dramatically" aborted. The
following error message notifies us about it.

   i915 0000:4d:00.0: drm_WARN_ON_ONCE(timeout_base_ms > 3)

It would be enough to wait a little in order to give the punit
the chance to come up bright and shiny, ready to interact with
the driver.

Wait up 10 seconds for the punit to settle and complete any
outstanding transactions upon module load. If it still fails try
again with a longer timeout, 180s, 3 minutes. If it still fails
then return -EPROBE_DEFER, in order to give the punit a second
chance.

Even if these timers might look long, we should consider that the
punit, depending on the platforms, might need long times to
complete its routines. Besides we want to try anything possible
to move forward before deciding to abort the driver's load.

The issue has been reported in:

   https://gitlab.freedesktop.org/drm/intel/-/issues/7814

The changes in this patch are valid only and uniquely during
boot. The common transactions with the punit during the driver's
normal operation are not affected.

Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Co-developed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
Hi,

I'm proposing again the same patch as v1 with a hopefully more
descriptive commit log in order to minimize the
misunderstandings that we had during the v1 review.

Thanks,
Andi

Changelog:
==========
v1 -> v2:
 - write a more descriptive commit log.
 - add Chris SoB which was triggering a checkpatch error.

 drivers/gpu/drm/i915/intel_pcode.c | 35 ++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pcode.c b/drivers/gpu/drm/i915/intel_pcode.c
index a234d9b4ed143..3db2ba439bb57 100644
--- a/drivers/gpu/drm/i915/intel_pcode.c
+++ b/drivers/gpu/drm/i915/intel_pcode.c
@@ -204,15 +204,42 @@ int skl_pcode_request(struct intel_uncore *uncore, u32 mbox, u32 request,
 #undef COND
 }
 
+static int pcode_init_wait(struct intel_uncore *uncore, int timeout_ms)
+{
+	if (__intel_wait_for_register_fw(uncore,
+					 GEN6_PCODE_MAILBOX,
+					 GEN6_PCODE_READY, 0,
+					 500, timeout_ms,
+					 NULL))
+		return -EPROBE_DEFER;
+
+	return skl_pcode_request(uncore,
+				 DG1_PCODE_STATUS,
+				 DG1_UNCORE_GET_INIT_STATUS,
+				 DG1_UNCORE_INIT_STATUS_COMPLETE,
+				 DG1_UNCORE_INIT_STATUS_COMPLETE, timeout_ms);
+}
+
 int intel_pcode_init(struct intel_uncore *uncore)
 {
+	int err;
+
 	if (!IS_DGFX(uncore->i915))
 		return 0;
 
-	return skl_pcode_request(uncore, DG1_PCODE_STATUS,
-				 DG1_UNCORE_GET_INIT_STATUS,
-				 DG1_UNCORE_INIT_STATUS_COMPLETE,
-				 DG1_UNCORE_INIT_STATUS_COMPLETE, 180000);
+	/*
+	 * Wait 10 seconds so that the punit to settle and complete
+	 * any outstanding transactions upon module load
+	 */
+	err = pcode_init_wait(uncore, 10000);
+
+	if (err) {
+		drm_notice(&uncore->i915->drm,
+			   "Waiting for HW initialisation...\n");
+		err = pcode_init_wait(uncore, 180000);
+	}
+
+	return err;
 }
 
 int snb_pcode_read_p(struct intel_uncore *uncore, u32 mbcmd, u32 p1, u32 p2, u32 *val)
-- 
2.39.1


             reply	other threads:[~2023-02-06 18:33 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-06 18:32 Andi Shyti [this message]
2023-02-06 18:32 ` [Intel-gfx] [PATCH v2] drm/i915/pcode: Give the punit time to settle before fatally failing Andi Shyti
2023-02-06 20:39 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
2023-02-07  5:26 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2023-02-07  9:08 ` [Intel-gfx] [PATCH v2] " Andrzej Hajda
2023-02-07 10:40   ` Andi Shyti
2023-02-07 10:40     ` Andi Shyti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230206183236.109908-1-andi.shyti@linux.intel.com \
    --to=andi.shyti@linux.intel.com \
    --cc=andi@etezian.org \
    --cc=aravind.iddamsetty@intel.com \
    --cc=chris.p.wilson@linux.intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gwan-gyeong.mun@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=rodrigo.vivi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.