All of lore.kernel.org
 help / color / mirror / Atom feed
* [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
@ 2023-01-18 16:06 Jonathan Cavitt
  2023-01-18 19:14 ` [igt-dev] ✗ Fi.CI.BAT: failure for lib/igt_kmod: Allow some leeway in igt_kmod_unload_r (rev3) Patchwork
  2023-01-18 19:32 ` [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r Kamil Konieczny
  0 siblings, 2 replies; 11+ messages in thread
From: Jonathan Cavitt @ 2023-01-18 16:06 UTC (permalink / raw)
  To: igt-dev; +Cc: chris.p.wilson, jonathan.cavitt, sandeep.kumar.parupalli

kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
in the setup of some gem_lmem_swapping subtests.  Just because
EAGAIN is returned doesn't mean the module is lost and unable to
be unloaded.  Try again some number of times (currently 20) before
giving up.

Retries will occur for -EBUSY and -EAGAIN, as these imply the
system is waiting for the target module can simply be waited for.
All other errors exit immediately, but as the context for each
error informs their relative severity, no warnings will be issued.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
CC: Stuart Summers <stuart.summers@intel.com>
CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
CC: Chris Wilson <chris.p.wilson@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
---
 lib/igt_kmod.c | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
index 17090110c..4aa5320d0 100644
--- a/lib/igt_kmod.c
+++ b/lib/igt_kmod.c
@@ -253,8 +253,11 @@ out:
 
 static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 {
+#define MAX_TRIES	20
+#define SLEEP_DURATION	500000
 	struct kmod_list *holders, *pos;
-	int err = 0;
+	int err, tries;
+	const char *mod_name = kmod_module_get_name(kmod);
 
 	holders = kmod_module_get_holders(kmod);
 	kmod_list_foreach(pos, holders) {
@@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		return err;
 
 	if (igt_kmod_is_loading(kmod)) {
-		const char *mod_name = kmod_module_get_name(kmod);
 		igt_debug("%s still initializing\n", mod_name);
 		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
 		if (err < 0) {
@@ -279,7 +281,34 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		}
 	}
 
-	return kmod_module_remove_module(kmod, flags);
+	for (tries = 0; tries < MAX_TRIES; tries++) {
+		err = kmod_module_remove_module(kmod, flags);
+
+		/* Only loop in the following cases */
+		if (err != -EBUSY && err != -EAGAIN)
+			break;
+
+		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
+			  mod_name, err, tries + 1);
+
+		if (MAX_TRIES - tries - 1)
+			usleep(SLEEP_DURATION);
+	}
+
+	if (err == -ENOENT)
+		igt_debug("Module %s could not be found or does not exist. err: %d\n",
+			  mod_name, err);
+	else if (err == -ENOTSUP)
+		igt_debug("Module %s cannot be unloaded. err: %d\n",
+			  mod_name, err);
+	else if (err)
+		igt_debug("Module %s failed to unload with err: %d after ~%.1fms\n",
+			  mod_name, err, SLEEP_DURATION*tries/1000.);
+	else if (tries)
+		igt_debug("Module %s unload took ~%.1fms over %i attempts\n",
+			  mod_name, SLEEP_DURATION*tries/1000., tries + 1);
+
+	return err;
 }
 
 /**
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [igt-dev] ✗ Fi.CI.BAT: failure for lib/igt_kmod: Allow some leeway in igt_kmod_unload_r (rev3)
  2023-01-18 16:06 [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r Jonathan Cavitt
@ 2023-01-18 19:14 ` Patchwork
  2023-01-18 19:32 ` [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r Kamil Konieczny
  1 sibling, 0 replies; 11+ messages in thread
From: Patchwork @ 2023-01-18 19:14 UTC (permalink / raw)
  To: Jonathan Cavitt; +Cc: igt-dev

[-- Attachment #1: Type: text/plain, Size: 4302 bytes --]

== Series Details ==

Series: lib/igt_kmod: Allow some leeway in igt_kmod_unload_r (rev3)
URL   : https://patchwork.freedesktop.org/series/112566/
State : failure

== Summary ==

CI Bug Log - changes from IGT_7123 -> IGTPW_8367
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with IGTPW_8367 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_8367, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/index.html

Participating hosts (42 -> 42)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (1): fi-snb-2520m 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_8367:

### IGT changes ###

#### Possible regressions ####

  * igt@core_hotunplug@unbind-rebind:
    - fi-apl-guc:         [PASS][1] -> [DMESG-WARN][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7123/fi-apl-guc/igt@core_hotunplug@unbind-rebind.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/fi-apl-guc/igt@core_hotunplug@unbind-rebind.html

  
Known issues
------------

  Here are the changes found in IGTPW_8367 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-dg1-6:          NOTRUN -> [SKIP][3] ([i915#7828])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/bat-dg1-6/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_psr@primary_page_flip:
    - fi-pnv-d510:        NOTRUN -> [SKIP][4] ([fdo#109271]) +44 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/fi-pnv-d510/igt@kms_psr@primary_page_flip.html

  * igt@runner@aborted:
    - fi-apl-guc:         NOTRUN -> [FAIL][5] ([i915#4312])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/fi-apl-guc/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@gt_lrc:
    - {bat-adln-1}:       [INCOMPLETE][6] ([i915#4983] / [i915#7609]) -> [PASS][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7123/bat-adln-1/igt@i915_selftest@live@gt_lrc.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/bat-adln-1/igt@i915_selftest@live@gt_lrc.html

  * igt@i915_selftest@live@reset:
    - {bat-rpls-1}:       [DMESG-FAIL][8] ([i915#4983]) -> [PASS][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7123/bat-rpls-1/igt@i915_selftest@live@reset.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/bat-rpls-1/igt@i915_selftest@live@reset.html

  * igt@i915_selftest@live@workarounds:
    - bat-dg1-6:          [INCOMPLETE][10] ([i915#4983]) -> [PASS][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7123/bat-dg1-6/igt@i915_selftest@live@workarounds.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/bat-dg1-6/igt@i915_selftest@live@workarounds.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6997]: https://gitlab.freedesktop.org/drm/intel/issues/6997
  [i915#7609]: https://gitlab.freedesktop.org/drm/intel/issues/7609
  [i915#7625]: https://gitlab.freedesktop.org/drm/intel/issues/7625
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_7123 -> IGTPW_8367

  CI-20190529: 20190529
  CI_DRM_12601: 5fee98189f91987f2b91fa666541de66e6c4a5b9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_8367: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/index.html
  IGT_7123: 2b29e8ac07fbcfadc48b9d60e4d736a6e3b289ab @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_8367/index.html

[-- Attachment #2: Type: text/html, Size: 5026 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
  2023-01-18 16:06 [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r Jonathan Cavitt
  2023-01-18 19:14 ` [igt-dev] ✗ Fi.CI.BAT: failure for lib/igt_kmod: Allow some leeway in igt_kmod_unload_r (rev3) Patchwork
@ 2023-01-18 19:32 ` Kamil Konieczny
  2023-01-18 19:56   ` Cavitt, Jonathan
  1 sibling, 1 reply; 11+ messages in thread
From: Kamil Konieczny @ 2023-01-18 19:32 UTC (permalink / raw)
  To: igt-dev; +Cc: sandeep.kumar.parupalli, jonathan.cavitt, chris.p.wilson

Hi Jonathan,

On 2023-01-18 at 08:06:31 -0800, Jonathan Cavitt wrote:
> kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
> in the setup of some gem_lmem_swapping subtests.  Just because
> EAGAIN is returned doesn't mean the module is lost and unable to
> be unloaded.  Try again some number of times (currently 20) before
> giving up.
> 
> Retries will occur for -EBUSY and -EAGAIN, as these imply the
> system is waiting for the target module can simply be waited for.
> All other errors exit immediately, but as the context for each
> error informs their relative severity, no warnings will be issued.
> 
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> CC: Stuart Summers <stuart.summers@intel.com>
> CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
> CC: Chris Wilson <chris.p.wilson@linux.intel.com>
> Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> ---
>  lib/igt_kmod.c | 35 ++++++++++++++++++++++++++++++++---
>  1 file changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
> index 17090110c..4aa5320d0 100644
> --- a/lib/igt_kmod.c
> +++ b/lib/igt_kmod.c
> @@ -253,8 +253,11 @@ out:
>  
>  static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  {
> +#define MAX_TRIES	20
> +#define SLEEP_DURATION	500000
>  	struct kmod_list *holders, *pos;
> -	int err = 0;
> +	int err, tries;
> +	const char *mod_name = kmod_module_get_name(kmod);
>  
>  	holders = kmod_module_get_holders(kmod);
>  	kmod_list_foreach(pos, holders) {
> @@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		return err;
>  
>  	if (igt_kmod_is_loading(kmod)) {
> -		const char *mod_name = kmod_module_get_name(kmod);
>  		igt_debug("%s still initializing\n", mod_name);
>  		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
>  		if (err < 0) {
> @@ -279,7 +281,34 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		}
>  	}
>  
> -	return kmod_module_remove_module(kmod, flags);
> +	for (tries = 0; tries < MAX_TRIES; tries++) {
> +		err = kmod_module_remove_module(kmod, flags);
> +
> +		/* Only loop in the following cases */
> +		if (err != -EBUSY && err != -EAGAIN)
> +			break;
> +
> +		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
> +			  mod_name, err, tries + 1);
> +
> +		if (MAX_TRIES - tries - 1)
------------------ ^
This should be tries < MAX_TRIES - 1

> +			usleep(SLEEP_DURATION);
> +	}
> +
> +	if (err == -ENOENT)
> +		igt_debug("Module %s could not be found or does not exist. err: %d\n",
> +			  mod_name, err);
> +	else if (err == -ENOTSUP)
> +		igt_debug("Module %s cannot be unloaded. err: %d\n",
> +			  mod_name, err);
> +	else if (err)
> +		igt_debug("Module %s failed to unload with err: %d after ~%.1fms\n",
> +			  mod_name, err, SLEEP_DURATION*tries/1000.);
> +	else if (tries)
> +		igt_debug("Module %s unload took ~%.1fms over %i attempts\n",
> +			  mod_name, SLEEP_DURATION*tries/1000., tries + 1);

What about one final else with igt_debug about success at first try ?

Regards,
Kamil

> +
> +	return err;
>  }
>  
>  /**
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
  2023-01-18 19:32 ` [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r Kamil Konieczny
@ 2023-01-18 19:56   ` Cavitt, Jonathan
  0 siblings, 0 replies; 11+ messages in thread
From: Cavitt, Jonathan @ 2023-01-18 19:56 UTC (permalink / raw)
  To: Kamil Konieczny; +Cc: Parupalli, Sandeep Kumar, igt-dev, chris.p.wilson

-----Original Message-----
From: Kamil Konieczny <kamil.konieczny@linux.intel.com> 
Sent: Wednesday, January 18, 2023 11:32 AM
To: igt-dev@lists.freedesktop.org
Cc: chris.p.wilson@linux.intel.com; Cavitt, Jonathan <jonathan.cavitt@intel.com>; Parupalli, Sandeep Kumar <sandeep.kumar.parupalli@intel.com>
Subject: Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
> 
> Hi Jonathan,
> 
> On 2023-01-18 at 08:06:31 -0800, Jonathan Cavitt wrote:
> > kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
> > in the setup of some gem_lmem_swapping subtests.  Just because
> > EAGAIN is returned doesn't mean the module is lost and unable to
> > be unloaded.  Try again some number of times (currently 20) before
> > giving up.
> > 
> > Retries will occur for -EBUSY and -EAGAIN, as these imply the
> > system is waiting for the target module can simply be waited for.
> > All other errors exit immediately, but as the context for each
> > error informs their relative severity, no warnings will be issued.
> > 
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > CC: Stuart Summers <stuart.summers@intel.com>
> > CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
> > CC: Chris Wilson <chris.p.wilson@linux.intel.com>
> > Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> > ---
> >  lib/igt_kmod.c | 35 ++++++++++++++++++++++++++++++++---
> >  1 file changed, 32 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
> > index 17090110c..4aa5320d0 100644
> > --- a/lib/igt_kmod.c
> > +++ b/lib/igt_kmod.c
> > @@ -253,8 +253,11 @@ out:
> >  
> >  static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
> >  {
> > +#define MAX_TRIES	20
> > +#define SLEEP_DURATION	500000
> >  	struct kmod_list *holders, *pos;
> > -	int err = 0;
> > +	int err, tries;
> > +	const char *mod_name = kmod_module_get_name(kmod);
> >  
> >  	holders = kmod_module_get_holders(kmod);
> >  	kmod_list_foreach(pos, holders) {
> > @@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
> >  		return err;
> >  
> >  	if (igt_kmod_is_loading(kmod)) {
> > -		const char *mod_name = kmod_module_get_name(kmod);
> >  		igt_debug("%s still initializing\n", mod_name);
> >  		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
> >  		if (err < 0) {
> > @@ -279,7 +281,34 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
> >  		}
> >  	}
> >  
> > -	return kmod_module_remove_module(kmod, flags);
> > +	for (tries = 0; tries < MAX_TRIES; tries++) {
> > +		err = kmod_module_remove_module(kmod, flags);
> > +
> > +		/* Only loop in the following cases */
> > +		if (err != -EBUSY && err != -EAGAIN)
> > +			break;
> > +
> > +		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
> > +			  mod_name, err, tries + 1);
> > +
> > +		if (MAX_TRIES - tries - 1)
> ------------------ ^
> This should be tries < MAX_TRIES - 1

I'm using "tries != MAX_TRIES - 1", which should also work as long as the for-loop doesn't somehow overflow.
Here, the two are logically equivalent:

tries != MAX_TRIES - 1
tries - tries != MAX_TRIES - 1 - tries
0 != MAX_TRIES - tries - 1

... I'll clean it up for clarity...

> 
> > +			usleep(SLEEP_DURATION);
> > +	}
> > +
> > +	if (err == -ENOENT)
> > +		igt_debug("Module %s could not be found or does not exist. err: %d\n",
> > +			  mod_name, err);
> > +	else if (err == -ENOTSUP)
> > +		igt_debug("Module %s cannot be unloaded. err: %d\n",
> > +			  mod_name, err);
> > +	else if (err)
> > +		igt_debug("Module %s failed to unload with err: %d after ~%.1fms\n",
> > +			  mod_name, err, SLEEP_DURATION*tries/1000.);
> > +	else if (tries)
> > +		igt_debug("Module %s unload took ~%.1fms over %i attempts\n",
> > +			  mod_name, SLEEP_DURATION*tries/1000., tries + 1);
> 
> What about one final else with igt_debug about success at first try ?

Generally speaking, should we really be printing debug information when a
function behaves as intended?  I don't believe so, but if you want me to add
that debug message there I'll do it.
-Jonathan Cavitt

> 
> Regards,
> Kamil
> 
> > +
> > +	return err;
> >  }
> >  
> >  /**
> > -- 
> > 2.25.1
> > 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
@ 2023-01-18 20:03 Jonathan Cavitt
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cavitt @ 2023-01-18 20:03 UTC (permalink / raw)
  To: igt-dev; +Cc: chris.p.wilson, jonathan.cavitt, sandeep.kumar.parupalli

kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
in the setup of some gem_lmem_swapping subtests.  Just because
EAGAIN is returned doesn't mean the module is lost and unable to
be unloaded.  Try again some number of times (currently 20) before
giving up.

Retries will occur for -EBUSY and -EAGAIN, as these imply the
system is waiting for the target module can simply be waited for.
All other errors exit immediately, but as the context for each
error informs their relative severity, no warnings will be issued.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
CC: Stuart Summers <stuart.summers@intel.com>
CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
CC: Chris Wilson <chris.p.wilson@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
---
 lib/igt_kmod.c | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
index 17090110c..75f1a1035 100644
--- a/lib/igt_kmod.c
+++ b/lib/igt_kmod.c
@@ -253,8 +253,11 @@ out:
 
 static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 {
+#define MAX_TRIES	20
+#define SLEEP_DURATION	500000
 	struct kmod_list *holders, *pos;
-	int err = 0;
+	int err, tries;
+	const char *mod_name = kmod_module_get_name(kmod);
 
 	holders = kmod_module_get_holders(kmod);
 	kmod_list_foreach(pos, holders) {
@@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		return err;
 
 	if (igt_kmod_is_loading(kmod)) {
-		const char *mod_name = kmod_module_get_name(kmod);
 		igt_debug("%s still initializing\n", mod_name);
 		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
 		if (err < 0) {
@@ -279,7 +281,36 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		}
 	}
 
-	return kmod_module_remove_module(kmod, flags);
+	for (tries = 0; tries < MAX_TRIES; tries++) {
+		err = kmod_module_remove_module(kmod, flags);
+
+		/* Only loop in the following cases */
+		if (err != -EBUSY && err != -EAGAIN)
+			break;
+
+		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
+			  mod_name, err, tries + 1);
+
+		if (tries < MAX_TRIES - 1)
+			usleep(SLEEP_DURATION);
+	}
+
+	if (err == -ENOENT)
+		igt_debug("Module %s could not be found or does not exist. err: %d\n",
+			  mod_name, err);
+	else if (err == -ENOTSUP)
+		igt_debug("Module %s cannot be unloaded. err: %d\n",
+			  mod_name, err);
+	else if (err)
+		igt_debug("Module %s failed to unload with err: %d after ~%.1fms\n",
+			  mod_name, err, SLEEP_DURATION*tries/1000.);
+	else if (tries)
+		igt_debug("Module %s unload took ~%.1fms over %i attempts\n",
+			  mod_name, SLEEP_DURATION*tries/1000., tries + 1);
+	else
+		igt_debug("Module %s unloaded immediately\n", mod_name);
+
+	return err;
 }
 
 /**
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
  2023-01-17 18:03 Jonathan Cavitt
@ 2023-01-18 13:10 ` Kamil Konieczny
  0 siblings, 0 replies; 11+ messages in thread
From: Kamil Konieczny @ 2023-01-18 13:10 UTC (permalink / raw)
  To: igt-dev; +Cc: sandeep.kumar.parupalli, jonathan.cavitt, chris.p.wilson

Hi Jonathan,

few more nits, see below.

On 2023-01-17 at 10:03:34 -0800, Jonathan Cavitt wrote:
> kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
> in the setup of some gem_lmem_swapping subtests.  Just because
> EAGAIN is returned doesn't mean the module is lost and unable to
> be unloaded.  Try again some number of times (currently 10) before
> giving up.
> 
> Retries will occur for -EBUSY and -EAGAIN, as these imply the
> system is waiting for the target module can simply be waited for.
> All other errors exit immediately, but as the context for each
> error informs their relative severity, no warnings will be issued.
> 
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> CC: Stuart Summers <stuart.summers@intel.com>
> CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
> CC: Chris Wilson <chris.p.wilson@linux.intel.com>
> Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> ---
>  lib/igt_kmod.c | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
> index 17090110c..10c79b740 100644
> --- a/lib/igt_kmod.c
> +++ b/lib/igt_kmod.c
> @@ -253,8 +253,11 @@ out:
>  
>  static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  {
> +#define MAX_TRIES	20
> +#define SLEEP_DURATION	500000
>  	struct kmod_list *holders, *pos;
> -	int err = 0;
> +	int err, tries;
> +	const char *mod_name = kmod_module_get_name(kmod);
>  
>  	holders = kmod_module_get_holders(kmod);
>  	kmod_list_foreach(pos, holders) {
> @@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		return err;
>  
>  	if (igt_kmod_is_loading(kmod)) {
> -		const char *mod_name = kmod_module_get_name(kmod);
>  		igt_debug("%s still initializing\n", mod_name);
>  		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
>  		if (err < 0) {
> @@ -279,7 +281,31 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		}
>  	}
>  
> -	return kmod_module_remove_module(kmod, flags);
> +	for (tries = 0; tries < MAX_TRIES; tries++) {
> +		bool loop = false;

Remove this var, see below.

> +
> +		err = kmod_module_remove_module(kmod, flags);
> +
> +		/* Only loop in the following cases */
> +		loop = err == -EBUSY || err == -EAGAIN;
> +
> +		if (!loop)
> +			break;

Instead of using loop var just test and break,
		if (err != -EBUSY && err != -EAGAIN)
			break;

> +
> +		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
> +			  mod_name, err, tries + 1);
> +

Avoid sleep before break, so
		if (tries < MAX_TRIES - 1)

> +		usleep(SLEEP_DURATION);
> +	}
> +
> +	if (err && err != -ENOENT)
> +		igt_info("Module %s failed to unload with err: %d after ~%.1fms\n",
> +			 mod_name, err, SLEEP_DURATION*tries/1000.);
> +	else if (tries)
> +		igt_info("Module %s unload took ~%.1fms over %i attempts\n",
> +			 mod_name, SLEEP_DURATION*tries/1000., tries + 1);

imho it should also print reasonable info about -ENOENT and -ENOTSUP

Regards,
Kamil

> +
> +	return err;
>  }
>  
>  /**
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
@ 2023-01-17 18:03 Jonathan Cavitt
  2023-01-18 13:10 ` Kamil Konieczny
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Cavitt @ 2023-01-17 18:03 UTC (permalink / raw)
  To: igt-dev; +Cc: chris.p.wilson, jonathan.cavitt, sandeep.kumar.parupalli

kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
in the setup of some gem_lmem_swapping subtests.  Just because
EAGAIN is returned doesn't mean the module is lost and unable to
be unloaded.  Try again some number of times (currently 10) before
giving up.

Retries will occur for -EBUSY and -EAGAIN, as these imply the
system is waiting for the target module can simply be waited for.
All other errors exit immediately, but as the context for each
error informs their relative severity, no warnings will be issued.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
CC: Stuart Summers <stuart.summers@intel.com>
CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
CC: Chris Wilson <chris.p.wilson@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
---
 lib/igt_kmod.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
index 17090110c..10c79b740 100644
--- a/lib/igt_kmod.c
+++ b/lib/igt_kmod.c
@@ -253,8 +253,11 @@ out:
 
 static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 {
+#define MAX_TRIES	20
+#define SLEEP_DURATION	500000
 	struct kmod_list *holders, *pos;
-	int err = 0;
+	int err, tries;
+	const char *mod_name = kmod_module_get_name(kmod);
 
 	holders = kmod_module_get_holders(kmod);
 	kmod_list_foreach(pos, holders) {
@@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		return err;
 
 	if (igt_kmod_is_loading(kmod)) {
-		const char *mod_name = kmod_module_get_name(kmod);
 		igt_debug("%s still initializing\n", mod_name);
 		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
 		if (err < 0) {
@@ -279,7 +281,31 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		}
 	}
 
-	return kmod_module_remove_module(kmod, flags);
+	for (tries = 0; tries < MAX_TRIES; tries++) {
+		bool loop = false;
+
+		err = kmod_module_remove_module(kmod, flags);
+
+		/* Only loop in the following cases */
+		loop = err == -EBUSY || err == -EAGAIN;
+
+		if (!loop)
+			break;
+
+		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
+			  mod_name, err, tries + 1);
+
+		usleep(SLEEP_DURATION);
+	}
+
+	if (err && err != -ENOENT)
+		igt_info("Module %s failed to unload with err: %d after ~%.1fms\n",
+			 mod_name, err, SLEEP_DURATION*tries/1000.);
+	else if (tries)
+		igt_info("Module %s unload took ~%.1fms over %i attempts\n",
+			 mod_name, SLEEP_DURATION*tries/1000., tries + 1);
+
+	return err;
 }
 
 /**
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
  2023-01-17 16:06 ` Kamil Konieczny
@ 2023-01-17 17:56   ` Cavitt, Jonathan
  0 siblings, 0 replies; 11+ messages in thread
From: Cavitt, Jonathan @ 2023-01-17 17:56 UTC (permalink / raw)
  To: Kamil Konieczny; +Cc: Parupalli, Sandeep Kumar, igt-dev, Chris Wilson

-----Original Message-----
From: Kamil Konieczny <kamil.konieczny@linux.intel.com> 
Sent: Tuesday, January 17, 2023 8:07 AM
To: igt-dev@lists.freedesktop.org
Cc: Cavitt, Jonathan <jonathan.cavitt@intel.com>; Summers, Stuart <stuart.summers@intel.com>; Parupalli, Sandeep Kumar <sandeep.kumar.parupalli@intel.com>; Chris Wilson <chris.p.wilson@linux.intel.com>
Subject: Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
> 
> Hi Jonathan,
> 
> On 2023-01-09 at 11:25:36 -0800, Jonathan Cavitt wrote:
> > kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
> > in the setup of some gem_lmem_swapping subtests.  Just because
> > EAGAIN is returned doesn't mean the module is lost and unable to
> > be unloaded.  Try again some number of times (currently 10) before
> > giving up.
> > 
> > Retries will occur for -EBUSY and -EAGAIN, as these imply the
> > system is waiting for the target module can simply be waited for.
> > All other errors exit immediately, but as the context for each
> > error informs their relative severity, no warnings will be issued.
> > 
> > References: VLK-30287, VLK-39629
> - ^ --------- ^
> Please remove any refs not reachable from internet.
> 
> > 
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > CC: Stuart Summers <stuart.summers@intel.com>
> > CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
> > CC: Chris Wilson <chris.p.wilson@linux.intel.com>
> > Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> > ---
> >  lib/igt_kmod.c | 41 ++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 38 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
> > index 17090110c..c840f9a17 100644
> > --- a/lib/igt_kmod.c
> > +++ b/lib/igt_kmod.c
> > @@ -253,8 +253,11 @@ out:
> >  
> >  static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
> >  {
> > +#define MAX_TRIES	20
> > +#define SLEEP_DURATION	500000
> >  	struct kmod_list *holders, *pos;
> > -	int err = 0;
> > +	int err, tries;
> > +	const char *mod_name = kmod_module_get_name(kmod);
> >  
> >  	holders = kmod_module_get_holders(kmod);
> >  	kmod_list_foreach(pos, holders) {
> > @@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
> >  		return err;
> >  
> >  	if (igt_kmod_is_loading(kmod)) {
> > -		const char *mod_name = kmod_module_get_name(kmod);
> 
> Please describe this change, it looks unrelated.

The debug/info messages at the end of this function print the mod_name, so we need to initialize it at the
start of the test, rather than just when igt_kmod_is_loading.

> 
> >  		igt_debug("%s still initializing\n", mod_name);
> >  		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
> >  		if (err < 0) {
> > @@ -279,7 +281,40 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
> >  		}
> >  	}
> >  
> > -	return kmod_module_remove_module(kmod, flags);
> > +	for (tries = 0; tries < MAX_TRIES; tries++) {
> > +		bool loop = false;
> > +
> > +		err = kmod_module_remove_module(kmod, flags);
> > +
> > +		/* Only loop in the following cases */
> > +		switch (err) {
> > +		case -EBUSY:
> > +		case -EAGAIN:
> > +			/* Waiting for module to be available */
> > +			loop = true;
> > +			break;
> > +		default:
> > +			break;
> > +                }
> > +		if (!loop)
> > +			break;
> 
> May you turn switch into if() ? Now it looks convoluted.

This was designed to be easily extendable for additional cases where we want to wait for
kmod_module_remove_module to succeed, or if certain other error cases needed
additional management.  I'll change it over to this:

		loop = err == -EBUSY || err == -EAGAIN;

I don't foresee this condition needing to be extended, but if it does, this line might get
cumbersomely large.

> 
> > +
> > +		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
> > +			  mod_name, err, tries + 1);
> > +
> > +		usleep(SLEEP_DURATION);
> > +	}
> > +
> > +	if (err == -ENOENT)
> > +		igt_debug("Module %s could not be found.  Skipping.\n", mod_name);
> ------------------------------------------------------- ^^^^^^^^^^
> Please remove /  Skipping./, you print debugs and you will not
> decide what caller will do.
> 
> One more error would be if driver is compiled in, so it
> just can not be removed (will it be -ENOTSUP ?).

All other errors should be caught below.  We just don't inform on -ENOENT because
that's an expected condition for builtins we won't be fixing, so we issue an igt_debug
instead.  In other words, we don't inform the user of -ENOENT errors because the
desired end result for the user is still achieved (module X is not loaded).  By
comparison, it sounds like the -ENOTSUP case keeps the module loaded, so we should
probably inform the user the module did not unload.

If this is incorrect, and -ENOTSUP is in the same class of error as -ENOENT, I can add
that to the set of errors we don't directly inform on.

> 
> > +	else if (err)
> > +		igt_info("Module %s failed to unload with err: %d after ~%.1fms\n",
> --------------- ^
> Please make them all into igt_info or igt_debug.

Every error other than -ENOENT (for reasons explained above) is a genuine issue
that should be warned on, but issuing a full warning isn't desirable in most cases,
so we simply inform instead.  We cannot hide these issues behind igt_debug.

Additionally, below, the module taking more than one attempt to unload is a
genuine issue the user should be made privy to, so we shouldn't hide it
behind igt_debug either.

It was argued to me in the past that we could just remove the -ENOENT igt_debug
statement entirely.  I'll do that and see how that's received, and if it's received
positively, the reason for why igt_debug is used in the -ENOENT case and nowhere
else is a moot distinction.
-Jonathan Cavitt

> 
> > +			 mod_name, err, SLEEP_DURATION*tries/1000.);
> > +	else if (tries)
> > +		igt_info("Module %s unload took ~%.1fms over %i attempts\n",
> --------------- ^
> Same here, please be consistent.
> 
> Regards,
> Kamil
> 
> > +			 mod_name, SLEEP_DURATION*tries/1000., tries + 1);
> > +
> > +	return err;
> >  }
> >  
> >  /**
> > -- 
> > 2.25.1
> > 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
  2023-01-09 19:25 Jonathan Cavitt
  2023-01-09 19:33 ` Cavitt, Jonathan
@ 2023-01-17 16:06 ` Kamil Konieczny
  2023-01-17 17:56   ` Cavitt, Jonathan
  1 sibling, 1 reply; 11+ messages in thread
From: Kamil Konieczny @ 2023-01-17 16:06 UTC (permalink / raw)
  To: igt-dev; +Cc: Sandeep Kumar Parupalli, Jonathan Cavitt, Chris Wilson

Hi Jonathan,

On 2023-01-09 at 11:25:36 -0800, Jonathan Cavitt wrote:
> kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
> in the setup of some gem_lmem_swapping subtests.  Just because
> EAGAIN is returned doesn't mean the module is lost and unable to
> be unloaded.  Try again some number of times (currently 10) before
> giving up.
> 
> Retries will occur for -EBUSY and -EAGAIN, as these imply the
> system is waiting for the target module can simply be waited for.
> All other errors exit immediately, but as the context for each
> error informs their relative severity, no warnings will be issued.
> 
> References: VLK-30287, VLK-39629
- ^ --------- ^
Please remove any refs not reachable from internet.

> 
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> CC: Stuart Summers <stuart.summers@intel.com>
> CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
> CC: Chris Wilson <chris.p.wilson@linux.intel.com>
> Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> ---
>  lib/igt_kmod.c | 41 ++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
> index 17090110c..c840f9a17 100644
> --- a/lib/igt_kmod.c
> +++ b/lib/igt_kmod.c
> @@ -253,8 +253,11 @@ out:
>  
>  static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  {
> +#define MAX_TRIES	20
> +#define SLEEP_DURATION	500000
>  	struct kmod_list *holders, *pos;
> -	int err = 0;
> +	int err, tries;
> +	const char *mod_name = kmod_module_get_name(kmod);
>  
>  	holders = kmod_module_get_holders(kmod);
>  	kmod_list_foreach(pos, holders) {
> @@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		return err;
>  
>  	if (igt_kmod_is_loading(kmod)) {
> -		const char *mod_name = kmod_module_get_name(kmod);

Please describe this change, it looks unrelated.

>  		igt_debug("%s still initializing\n", mod_name);
>  		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
>  		if (err < 0) {
> @@ -279,7 +281,40 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		}
>  	}
>  
> -	return kmod_module_remove_module(kmod, flags);
> +	for (tries = 0; tries < MAX_TRIES; tries++) {
> +		bool loop = false;
> +
> +		err = kmod_module_remove_module(kmod, flags);
> +
> +		/* Only loop in the following cases */
> +		switch (err) {
> +		case -EBUSY:
> +		case -EAGAIN:
> +			/* Waiting for module to be available */
> +			loop = true;
> +			break;
> +		default:
> +			break;
> +                }
> +		if (!loop)
> +			break;

May you turn switch into if() ? Now it looks convoluted.

> +
> +		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
> +			  mod_name, err, tries + 1);
> +
> +		usleep(SLEEP_DURATION);
> +	}
> +
> +	if (err == -ENOENT)
> +		igt_debug("Module %s could not be found.  Skipping.\n", mod_name);
------------------------------------------------------- ^^^^^^^^^^
Please remove /  Skipping./, you print debugs and you will not
decide what caller will do.

One more error would be if driver is compiled in, so it
just can not be removed (will it be -ENOTSUP ?).

> +	else if (err)
> +		igt_info("Module %s failed to unload with err: %d after ~%.1fms\n",
--------------- ^
Please make them all into igt_info or igt_debug.

> +			 mod_name, err, SLEEP_DURATION*tries/1000.);
> +	else if (tries)
> +		igt_info("Module %s unload took ~%.1fms over %i attempts\n",
--------------- ^
Same here, please be consistent.

Regards,
Kamil

> +			 mod_name, SLEEP_DURATION*tries/1000., tries + 1);
> +
> +	return err;
>  }
>  
>  /**
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
  2023-01-09 19:25 Jonathan Cavitt
@ 2023-01-09 19:33 ` Cavitt, Jonathan
  2023-01-17 16:06 ` Kamil Konieczny
  1 sibling, 0 replies; 11+ messages in thread
From: Cavitt, Jonathan @ 2023-01-09 19:33 UTC (permalink / raw)
  To: Cavitt, Jonathan; +Cc: igt-dev

-----Original Message-----
From: Cavitt, Jonathan <jonathan.cavitt@intel.com> 
Sent: Monday, January 9, 2023 11:26 AM
To: igt-dev@lists.freedesktop.org
Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Cavitt, Jonathan <jonathan.cavitt@intel.com>
Subject: [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
> 
> kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
> in the setup of some gem_lmem_swapping subtests.  Just because
> EAGAIN is returned doesn't mean the module is lost and unable to
> be unloaded.  Try again some number of times (currently 10) before

Just noticed it's actually 20 times in the code.  I'll fix this before confirming the final
push if this change is accepted.
-Jonathan Cavitt

> giving up.
> 
> Retries will occur for -EBUSY and -EAGAIN, as these imply the
> system is waiting for the target module can simply be waited for.
> All other errors exit immediately, but as the context for each
> error informs their relative severity, no warnings will be issued.
> 
> References: VLK-30287, VLK-39629
> 
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> CC: Stuart Summers <stuart.summers@intel.com>
> CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
> CC: Chris Wilson <chris.p.wilson@linux.intel.com>
> Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> ---
>  lib/igt_kmod.c | 41 ++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
> index 17090110c..c840f9a17 100644
> --- a/lib/igt_kmod.c
> +++ b/lib/igt_kmod.c
> @@ -253,8 +253,11 @@ out:
>  
>  static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  {
> +#define MAX_TRIES	20
> +#define SLEEP_DURATION	500000
>  	struct kmod_list *holders, *pos;
> -	int err = 0;
> +	int err, tries;
> +	const char *mod_name = kmod_module_get_name(kmod);
>  
>  	holders = kmod_module_get_holders(kmod);
>  	kmod_list_foreach(pos, holders) {
> @@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		return err;
>  
>  	if (igt_kmod_is_loading(kmod)) {
> -		const char *mod_name = kmod_module_get_name(kmod);
>  		igt_debug("%s still initializing\n", mod_name);
>  		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
>  		if (err < 0) {
> @@ -279,7 +281,40 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
>  		}
>  	}
>  
> -	return kmod_module_remove_module(kmod, flags);
> +	for (tries = 0; tries < MAX_TRIES; tries++) {
> +		bool loop = false;
> +
> +		err = kmod_module_remove_module(kmod, flags);
> +
> +		/* Only loop in the following cases */
> +		switch (err) {
> +		case -EBUSY:
> +		case -EAGAIN:
> +			/* Waiting for module to be available */
> +			loop = true;
> +			break;
> +		default:
> +			break;
> +                }
> +		if (!loop)
> +			break;
> +
> +		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
> +			  mod_name, err, tries + 1);
> +
> +		usleep(SLEEP_DURATION);
> +	}
> +
> +	if (err == -ENOENT)
> +		igt_debug("Module %s could not be found.  Skipping.\n", mod_name);
> +	else if (err)
> +		igt_info("Module %s failed to unload with err: %d after ~%.1fms\n",
> +			 mod_name, err, SLEEP_DURATION*tries/1000.);
> +	else if (tries)
> +		igt_info("Module %s unload took ~%.1fms over %i attempts\n",
> +			 mod_name, SLEEP_DURATION*tries/1000., tries + 1);
> +
> +	return err;
>  }
>  
>  /**
> -- 
> 2.25.1
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r
@ 2023-01-09 19:25 Jonathan Cavitt
  2023-01-09 19:33 ` Cavitt, Jonathan
  2023-01-17 16:06 ` Kamil Konieczny
  0 siblings, 2 replies; 11+ messages in thread
From: Jonathan Cavitt @ 2023-01-09 19:25 UTC (permalink / raw)
  To: igt-dev; +Cc: jonathan.cavitt

kmod_module_remove_module occasionally returns EAGAIN for mei_gsc
in the setup of some gem_lmem_swapping subtests.  Just because
EAGAIN is returned doesn't mean the module is lost and unable to
be unloaded.  Try again some number of times (currently 10) before
giving up.

Retries will occur for -EBUSY and -EAGAIN, as these imply the
system is waiting for the target module can simply be waited for.
All other errors exit immediately, but as the context for each
error informs their relative severity, no warnings will be issued.

References: VLK-30287, VLK-39629

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
CC: Stuart Summers <stuart.summers@intel.com>
CC: Sandeep Kumar Parupalli <sandeep.kumar.parupalli@intel.com>
CC: Chris Wilson <chris.p.wilson@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@linux.intel.com>
---
 lib/igt_kmod.c | 41 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
index 17090110c..c840f9a17 100644
--- a/lib/igt_kmod.c
+++ b/lib/igt_kmod.c
@@ -253,8 +253,11 @@ out:
 
 static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 {
+#define MAX_TRIES	20
+#define SLEEP_DURATION	500000
 	struct kmod_list *holders, *pos;
-	int err = 0;
+	int err, tries;
+	const char *mod_name = kmod_module_get_name(kmod);
 
 	holders = kmod_module_get_holders(kmod);
 	kmod_list_foreach(pos, holders) {
@@ -269,7 +272,6 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		return err;
 
 	if (igt_kmod_is_loading(kmod)) {
-		const char *mod_name = kmod_module_get_name(kmod);
 		igt_debug("%s still initializing\n", mod_name);
 		err = igt_wait(!igt_kmod_is_loading(kmod), 10000, 100);
 		if (err < 0) {
@@ -279,7 +281,40 @@ static int igt_kmod_unload_r(struct kmod_module *kmod, unsigned int flags)
 		}
 	}
 
-	return kmod_module_remove_module(kmod, flags);
+	for (tries = 0; tries < MAX_TRIES; tries++) {
+		bool loop = false;
+
+		err = kmod_module_remove_module(kmod, flags);
+
+		/* Only loop in the following cases */
+		switch (err) {
+		case -EBUSY:
+		case -EAGAIN:
+			/* Waiting for module to be available */
+			loop = true;
+			break;
+		default:
+			break;
+                }
+		if (!loop)
+			break;
+
+		igt_debug("Module %s failed to unload with err: %d on attempt: %i\n",
+			  mod_name, err, tries + 1);
+
+		usleep(SLEEP_DURATION);
+	}
+
+	if (err == -ENOENT)
+		igt_debug("Module %s could not be found.  Skipping.\n", mod_name);
+	else if (err)
+		igt_info("Module %s failed to unload with err: %d after ~%.1fms\n",
+			 mod_name, err, SLEEP_DURATION*tries/1000.);
+	else if (tries)
+		igt_info("Module %s unload took ~%.1fms over %i attempts\n",
+			 mod_name, SLEEP_DURATION*tries/1000., tries + 1);
+
+	return err;
 }
 
 /**
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-01-18 20:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-18 16:06 [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r Jonathan Cavitt
2023-01-18 19:14 ` [igt-dev] ✗ Fi.CI.BAT: failure for lib/igt_kmod: Allow some leeway in igt_kmod_unload_r (rev3) Patchwork
2023-01-18 19:32 ` [igt-dev] [PATCH i-g-t] lib/igt_kmod: Allow some leeway in igt_kmod_unload_r Kamil Konieczny
2023-01-18 19:56   ` Cavitt, Jonathan
  -- strict thread matches above, loose matches on Subject: below --
2023-01-18 20:03 Jonathan Cavitt
2023-01-17 18:03 Jonathan Cavitt
2023-01-18 13:10 ` Kamil Konieczny
2023-01-09 19:25 Jonathan Cavitt
2023-01-09 19:33 ` Cavitt, Jonathan
2023-01-17 16:06 ` Kamil Konieczny
2023-01-17 17:56   ` Cavitt, Jonathan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.