All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Ekstrand <jason@jlekstrand.net>
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: Jason Ekstrand <jason@jlekstrand.net>
Subject: [PATCH 2/2] drm/i915: Tear down properly on early i915_init exit
Date: Fri, 16 Jul 2021 17:47:58 -0500	[thread overview]
Message-ID: <20210716224758.2162003-2-jason@jlekstrand.net> (raw)
In-Reply-To: <20210716224758.2162003-1-jason@jlekstrand.net>

In i915_exit(), we check i915_pci_driver.driver.owner to detect if
i915_init exited early and don't tear anything down.  However, we didn't
have proper tear-down paths for early exits in i915_init().

Most of the time, you would never notice this as driver init failures
are extremely rare and generally the sign of a bigger bug.  However,
when the mock self-tests are run, they run as part of i915_init() and
exit early once they complete.  They run after i915_globals_init() and
before we set up anything else.  The IGT test then unloads the module,
invoking i915_exit() which, thanks to our i915_pci_driver.driver.owner
check, doesn't actually tear anything down.  Importantly, this means
i915_globals_exit() never gets called even though i915_globals_init()
was and we leak the globals.

The most annoying part is that you don't actually notice the failure as
part of the self-tests since leaking a bit of memory, while bad, doesn't
result in anything observable from userspace.  Instead, the next time we
load the driver (usually for next IGT test), i915_globals_init() gets
invoked again, we go to allocate a bunch of new memory slabs, those
implicitly create debugfs entries, and debugfs warns that we're trying
to create directories and files that already exist.  Since this all
happens as part of the next driver load, it shows up in the dmesg-warn
of whatever IGT test ran after the mock selftests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 32eb6bcfdda9 ("drm/i915: Make request allocation caches global")
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_globals.c |  4 ++--
 drivers/gpu/drm/i915/i915_pci.c     | 23 +++++++++++++++++------
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_globals.c b/drivers/gpu/drm/i915/i915_globals.c
index 77f1911c463b8..87267e1d2ad92 100644
--- a/drivers/gpu/drm/i915/i915_globals.c
+++ b/drivers/gpu/drm/i915/i915_globals.c
@@ -138,7 +138,7 @@ void i915_globals_unpark(void)
 	atomic_inc(&active);
 }
 
-static void __exit __i915_globals_flush(void)
+static void __i915_globals_flush(void)
 {
 	atomic_inc(&active); /* skip shrinking */
 
@@ -148,7 +148,7 @@ static void __exit __i915_globals_flush(void)
 	atomic_dec(&active);
 }
 
-void __exit i915_globals_exit(void)
+void i915_globals_exit(void)
 {
 	GEM_BUG_ON(atomic_read(&active));
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 50ed93b03e582..783f547be0990 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1199,13 +1199,20 @@ static int __init i915_init(void)
 	bool use_kms = true;
 	int err;
 
+	/* We use this to detect early returns from i915_init() so we don't
+	 * tear anything down in i915_exit()
+	 */
+	i915_pci_driver.driver.owner = NULL;
+
 	err = i915_globals_init();
 	if (err)
 		return err;
 
 	err = i915_mock_selftests();
-	if (err)
-		return err > 0 ? 0 : err;
+	if (err) {
+		err = err > 0 ? 0 : err;
+		goto globals_exit;
+	}
 
 	/*
 	 * Enable KMS by default, unless explicitly overriden by
@@ -1228,13 +1235,17 @@ static int __init i915_init(void)
 	i915_pmu_init();
 
 	err = pci_register_driver(&i915_pci_driver);
-	if (err) {
-		i915_pmu_exit();
-		return err;
-	}
+	if (err)
+		goto pmu_exit;
 
 	i915_perf_sysctl_register();
 	return 0;
+
+pmu_exit:
+	i915_pmu_exit();
+globals_exit:
+	i915_globals_exit();
+	return err;
 }
 
 static void __exit i915_exit(void)
-- 
2.31.1


WARNING: multiple messages have this Message-ID (diff)
From: Jason Ekstrand <jason@jlekstrand.net>
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [Intel-gfx] [PATCH 2/2] drm/i915: Tear down properly on early i915_init exit
Date: Fri, 16 Jul 2021 17:47:58 -0500	[thread overview]
Message-ID: <20210716224758.2162003-2-jason@jlekstrand.net> (raw)
In-Reply-To: <20210716224758.2162003-1-jason@jlekstrand.net>

In i915_exit(), we check i915_pci_driver.driver.owner to detect if
i915_init exited early and don't tear anything down.  However, we didn't
have proper tear-down paths for early exits in i915_init().

Most of the time, you would never notice this as driver init failures
are extremely rare and generally the sign of a bigger bug.  However,
when the mock self-tests are run, they run as part of i915_init() and
exit early once they complete.  They run after i915_globals_init() and
before we set up anything else.  The IGT test then unloads the module,
invoking i915_exit() which, thanks to our i915_pci_driver.driver.owner
check, doesn't actually tear anything down.  Importantly, this means
i915_globals_exit() never gets called even though i915_globals_init()
was and we leak the globals.

The most annoying part is that you don't actually notice the failure as
part of the self-tests since leaking a bit of memory, while bad, doesn't
result in anything observable from userspace.  Instead, the next time we
load the driver (usually for next IGT test), i915_globals_init() gets
invoked again, we go to allocate a bunch of new memory slabs, those
implicitly create debugfs entries, and debugfs warns that we're trying
to create directories and files that already exist.  Since this all
happens as part of the next driver load, it shows up in the dmesg-warn
of whatever IGT test ran after the mock selftests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 32eb6bcfdda9 ("drm/i915: Make request allocation caches global")
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_globals.c |  4 ++--
 drivers/gpu/drm/i915/i915_pci.c     | 23 +++++++++++++++++------
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_globals.c b/drivers/gpu/drm/i915/i915_globals.c
index 77f1911c463b8..87267e1d2ad92 100644
--- a/drivers/gpu/drm/i915/i915_globals.c
+++ b/drivers/gpu/drm/i915/i915_globals.c
@@ -138,7 +138,7 @@ void i915_globals_unpark(void)
 	atomic_inc(&active);
 }
 
-static void __exit __i915_globals_flush(void)
+static void __i915_globals_flush(void)
 {
 	atomic_inc(&active); /* skip shrinking */
 
@@ -148,7 +148,7 @@ static void __exit __i915_globals_flush(void)
 	atomic_dec(&active);
 }
 
-void __exit i915_globals_exit(void)
+void i915_globals_exit(void)
 {
 	GEM_BUG_ON(atomic_read(&active));
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 50ed93b03e582..783f547be0990 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1199,13 +1199,20 @@ static int __init i915_init(void)
 	bool use_kms = true;
 	int err;
 
+	/* We use this to detect early returns from i915_init() so we don't
+	 * tear anything down in i915_exit()
+	 */
+	i915_pci_driver.driver.owner = NULL;
+
 	err = i915_globals_init();
 	if (err)
 		return err;
 
 	err = i915_mock_selftests();
-	if (err)
-		return err > 0 ? 0 : err;
+	if (err) {
+		err = err > 0 ? 0 : err;
+		goto globals_exit;
+	}
 
 	/*
 	 * Enable KMS by default, unless explicitly overriden by
@@ -1228,13 +1235,17 @@ static int __init i915_init(void)
 	i915_pmu_init();
 
 	err = pci_register_driver(&i915_pci_driver);
-	if (err) {
-		i915_pmu_exit();
-		return err;
-	}
+	if (err)
+		goto pmu_exit;
 
 	i915_perf_sysctl_register();
 	return 0;
+
+pmu_exit:
+	i915_pmu_exit();
+globals_exit:
+	i915_globals_exit();
+	return err;
 }
 
 static void __exit i915_exit(void)
-- 
2.31.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2021-07-16 22:48 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-16 22:47 [PATCH 1/2] drm/i915: Call i915_globals_exit() after i915_pmu_exit() Jason Ekstrand
2021-07-16 22:47 ` [Intel-gfx] " Jason Ekstrand
2021-07-16 22:47 ` Jason Ekstrand [this message]
2021-07-16 22:47   ` [Intel-gfx] [PATCH 2/2] drm/i915: Tear down properly on early i915_init exit Jason Ekstrand
2021-07-19  8:28   ` Daniel Vetter
2021-07-19  8:28     ` [Intel-gfx] " Daniel Vetter
2021-07-17  1:21 ` [Intel-gfx] ✗ Fi.CI.DOCS: warning for series starting with [1/2] drm/i915: Call i915_globals_exit() after i915_pmu_exit() Patchwork
2021-07-17  1:47 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-07-17 11:53 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2021-07-19  8:29 ` [PATCH 1/2] " Daniel Vetter
2021-07-19  8:29   ` [Intel-gfx] " Daniel Vetter
2021-07-19  8:55 ` Tvrtko Ursulin
2021-07-19  8:55   ` Tvrtko Ursulin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210716224758.2162003-2-jason@jlekstrand.net \
    --to=jason@jlekstrand.net \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.