All of lore.kernel.org
 help / color / mirror / Atom feed
* [REGRESSION] system hang on ILK/SNB/IVB
@ 2016-03-30 17:20 Gabriel Feceoru
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Gabriel Feceoru @ 2016-03-30 17:20 UTC (permalink / raw)
  To: lukas, daniel.vetter; +Cc: Tomi Sarvela, intel-gfx

This commit causes a hang while running kms suspend tests 
(kms_pipe_crc_basic@suspend-read-crc-pipe-*) on ILK/SNB/IVB, affecting CI.

Probably the same problem with the one in v2, but on older HW.


commit a7442b93cf32c1e1ddb721a26cd1f92302e2a222
Author: Lukas Wunner <lukas@wunner.de>
Date:   Wed Mar 9 12:52:53 2016 +0100

     drm/i915: Fix races on fbdev

     The ->lastclose callback invokes intel_fbdev_restore_mode() and has
     been witnessed to run before intel_fbdev_initial_config_async()
     has finished.

     We might likewise receive hotplug events before we've had a chance to
     fully set up the fbdev.

     Fix by waiting for the asynchronous thread to finish.

     v2:
     An async_synchronize_full() was also added to intel_fbdev_set_suspend()
     in v1 which turned out to be entirely gratuitous. It caused a deadlock
     on suspend (discovered by CI, thanks to Damien Lespiau and Tomi Sarvela
     for CI support) and was unnecessary since a device is never suspended
     until its ->probe callback (and all asynchronous tasks it scheduled)
     have finished. See dpm_prepare(), which calls wait_for_device_probe(),
     which calls async_synchronize_full().

     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
     Reported-by: Gustav Fägerlind <gustav.fagerlind@gmail.com>
     Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
     Cc: Chris Wilson <chris@chris-wilson.co.uk>
     Cc: stable@vger.kernel.org
     Signed-off-by: Lukas Wunner <lukas@wunner.de>
     Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
     Link: 
http://patchwork.freedesktop.org/patch/msgid/20160309115147.67B2B6E0D3@gabe.freedesktop.org


Regards,
Gabriel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-30 17:20 [REGRESSION] system hang on ILK/SNB/IVB Gabriel Feceoru
@ 2016-03-30 17:57 ` Chris Wilson
  2016-03-30 18:10   ` kbuild test robot
                     ` (4 more replies)
  2016-03-30 18:47 ` [REGRESSION] system hang on ILK/SNB/IVB Daniel Vetter
                   ` (2 subsequent siblings)
  3 siblings, 5 replies; 25+ messages in thread
From: Chris Wilson @ 2016-03-30 17:57 UTC (permalink / raw)
  To: intel-gfx, Gabriel Feceoru

If the initialisation fails, we may be left with a dangling pointer with
an incomplete fbdev structure. Here we want to disable internal calls
into fbdev. Similarly, the initialisation may be slow and we haven't yet
enabled the fbdev (e.g. quick suspend or last-close before the async init
completes).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_fbdev.c | 48 ++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 51a5e39e52f2..0bae91268a12 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -114,6 +114,24 @@ static struct fb_ops intelfb_ops = {
 	.fb_debug_leave = drm_fb_helper_debug_leave,
 };
 
+static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
+{
+	struct drm_i915_device *dev_priv = to_i915(dev);
+	struct fb_info *info;
+
+	if (dev_priv->fbdev == NULL)
+		return NULL;
+
+	info = ifbdev->helper.fbdev;
+	if (info->screen_base == NULL)
+		return NULL;
+
+	if (info->state != FBINFO_STATE_RUNNING)
+		return NULL;
+
+	return dev_priv->fbdev;
+}
+
 static int intelfb_alloc(struct drm_fb_helper *helper,
 			 struct drm_fb_helper_surface_size *sizes)
 {
@@ -764,6 +782,8 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 		return;
 
 	info = ifbdev->helper.fbdev;
+	if (info->screen_base == NULL)
+		return;
 
 	if (synchronous) {
 		/* Flush any pending work to turn the console on, and then
@@ -805,32 +825,24 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 
 void intel_fbdev_output_poll_changed(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
+
+	if (ifbdev == NULL)
+		return;
 
-	async_synchronize_full();
-	if (dev_priv->fbdev)
-		drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
+	drm_fb_helper_hotplug_event(&ifbdev->helper);
 }
 
 void intel_fbdev_restore_mode(struct drm_device *dev)
 {
-	int ret;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_fbdev *ifbdev = dev_priv->fbdev;
-	struct drm_fb_helper *fb_helper;
+	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
 
-	async_synchronize_full();
-	if (!ifbdev)
+	if (ifbdev == NULL)
 		return;
 
-	fb_helper = &ifbdev->helper;
-
-	ret = drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper);
-	if (ret) {
-		DRM_DEBUG("failed to restore crtc mode\n");
-	} else {
-		mutex_lock(&fb_helper->dev->struct_mutex);
+	if (drm_fb_helper_restore_fbdev_mode_unlocked(&ifbdev->helper) == 0) {
+		mutex_lock(&dev->struct_mutex);
 		intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
-		mutex_unlock(&fb_helper->dev->struct_mutex);
+		mutex_unlock(&dev->struct_mutex);
 	}
 }
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
@ 2016-03-30 18:10   ` kbuild test robot
  2016-03-30 18:10   ` kbuild test robot
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2016-03-30 18:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5701 bytes --]

Hi Chris,

[auto build test WARNING on drm-intel/for-linux-next]
[cannot apply to v4.6-rc1 next-20160330]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Protect-fbdev-across-slow-or-failed-initialisation/20160331-015912
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x015-03310059 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/intel_fbdev.c: In function 'intel_fbdev_get_if_active':
   drivers/gpu/drm/i915/intel_fbdev.c:119:37: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
     struct drm_i915_device *dev_priv = to_i915(dev);
                                        ^
   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from include/linux/async.h:15,
                    from drivers/gpu/drm/i915/intel_fbdev.c:27:
   drivers/gpu/drm/i915/intel_fbdev.c:122:14: error: dereferencing pointer to incomplete type 'struct drm_i915_device'
     if (dev_priv->fbdev == NULL)
                 ^
   include/linux/compiler.h:147:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^
>> drivers/gpu/drm/i915/intel_fbdev.c:122:2: note: in expansion of macro 'if'
     if (dev_priv->fbdev == NULL)
     ^
   drivers/gpu/drm/i915/intel_fbdev.c:125:9: error: 'ifbdev' undeclared (first use in this function)
     info = ifbdev->helper.fbdev;
            ^
   drivers/gpu/drm/i915/intel_fbdev.c:125:9: note: each undeclared identifier is reported only once for each function it appears in
   drivers/gpu/drm/i915/intel_fbdev.c:133:1: warning: control reaches end of non-void function [-Wreturn-type]
    }
    ^

vim +/if +122 drivers/gpu/drm/i915/intel_fbdev.c

    21	 * DEALINGS IN THE SOFTWARE.
    22	 *
    23	 * Authors:
    24	 *     David Airlie
    25	 */
    26	
  > 27	#include <linux/async.h>
    28	#include <linux/module.h>
    29	#include <linux/kernel.h>
    30	#include <linux/console.h>
    31	#include <linux/errno.h>
    32	#include <linux/string.h>
    33	#include <linux/mm.h>
    34	#include <linux/tty.h>
    35	#include <linux/sysrq.h>
    36	#include <linux/delay.h>
    37	#include <linux/fb.h>
    38	#include <linux/init.h>
    39	#include <linux/vga_switcheroo.h>
    40	
    41	#include <drm/drmP.h>
    42	#include <drm/drm_crtc.h>
    43	#include <drm/drm_fb_helper.h>
    44	#include "intel_drv.h"
    45	#include <drm/i915_drm.h>
    46	#include "i915_drv.h"
    47	
    48	static int intel_fbdev_set_par(struct fb_info *info)
    49	{
    50		struct drm_fb_helper *fb_helper = info->par;
    51		struct intel_fbdev *ifbdev =
    52			container_of(fb_helper, struct intel_fbdev, helper);
    53		int ret;
    54	
    55		ret = drm_fb_helper_set_par(info);
    56	
    57		if (ret == 0) {
    58			mutex_lock(&fb_helper->dev->struct_mutex);
    59			intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
    60			mutex_unlock(&fb_helper->dev->struct_mutex);
    61		}
    62	
    63		return ret;
    64	}
    65	
    66	static int intel_fbdev_blank(int blank, struct fb_info *info)
    67	{
    68		struct drm_fb_helper *fb_helper = info->par;
    69		struct intel_fbdev *ifbdev =
    70			container_of(fb_helper, struct intel_fbdev, helper);
    71		int ret;
    72	
    73		ret = drm_fb_helper_blank(blank, info);
    74	
    75		if (ret == 0) {
    76			mutex_lock(&fb_helper->dev->struct_mutex);
    77			intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
    78			mutex_unlock(&fb_helper->dev->struct_mutex);
    79		}
    80	
    81		return ret;
    82	}
    83	
    84	static int intel_fbdev_pan_display(struct fb_var_screeninfo *var,
    85					   struct fb_info *info)
    86	{
    87		struct drm_fb_helper *fb_helper = info->par;
    88		struct intel_fbdev *ifbdev =
    89			container_of(fb_helper, struct intel_fbdev, helper);
    90	
    91		int ret;
    92		ret = drm_fb_helper_pan_display(var, info);
    93	
    94		if (ret == 0) {
    95			mutex_lock(&fb_helper->dev->struct_mutex);
    96			intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
    97			mutex_unlock(&fb_helper->dev->struct_mutex);
    98		}
    99	
   100		return ret;
   101	}
   102	
   103	static struct fb_ops intelfb_ops = {
   104		.owner = THIS_MODULE,
   105		.fb_check_var = drm_fb_helper_check_var,
   106		.fb_set_par = intel_fbdev_set_par,
   107		.fb_fillrect = drm_fb_helper_cfb_fillrect,
   108		.fb_copyarea = drm_fb_helper_cfb_copyarea,
   109		.fb_imageblit = drm_fb_helper_cfb_imageblit,
   110		.fb_pan_display = intel_fbdev_pan_display,
   111		.fb_blank = intel_fbdev_blank,
   112		.fb_setcmap = drm_fb_helper_setcmap,
   113		.fb_debug_enter = drm_fb_helper_debug_enter,
   114		.fb_debug_leave = drm_fb_helper_debug_leave,
   115	};
   116	
   117	static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
   118	{
   119		struct drm_i915_device *dev_priv = to_i915(dev);
   120		struct fb_info *info;
   121	
 > 122		if (dev_priv->fbdev == NULL)
   123			return NULL;
   124	
   125		info = ifbdev->helper.fbdev;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 24904 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
  2016-03-30 18:10   ` kbuild test robot
@ 2016-03-30 18:10   ` kbuild test robot
  2016-03-30 18:26   ` kbuild test robot
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2016-03-30 18:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2429 bytes --]

Hi Chris,

[auto build test ERROR on drm-intel/for-linux-next]
[cannot apply to v4.6-rc1 next-20160330]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Protect-fbdev-across-slow-or-failed-initialisation/20160331-015912
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x012-03310059 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/intel_fbdev.c: In function 'intel_fbdev_get_if_active':
>> drivers/gpu/drm/i915/intel_fbdev.c:119:37: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
     struct drm_i915_device *dev_priv = to_i915(dev);
                                        ^
>> drivers/gpu/drm/i915/intel_fbdev.c:122:14: error: dereferencing pointer to incomplete type 'struct drm_i915_device'
     if (dev_priv->fbdev == NULL)
                 ^
>> drivers/gpu/drm/i915/intel_fbdev.c:125:9: error: 'ifbdev' undeclared (first use in this function)
     info = ifbdev->helper.fbdev;
            ^
   drivers/gpu/drm/i915/intel_fbdev.c:125:9: note: each undeclared identifier is reported only once for each function it appears in
>> drivers/gpu/drm/i915/intel_fbdev.c:133:1: warning: control reaches end of non-void function [-Wreturn-type]
    }
    ^

vim +122 drivers/gpu/drm/i915/intel_fbdev.c

   113		.fb_debug_enter = drm_fb_helper_debug_enter,
   114		.fb_debug_leave = drm_fb_helper_debug_leave,
   115	};
   116	
   117	static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
   118	{
 > 119		struct drm_i915_device *dev_priv = to_i915(dev);
   120		struct fb_info *info;
   121	
 > 122		if (dev_priv->fbdev == NULL)
   123			return NULL;
   124	
 > 125		info = ifbdev->helper.fbdev;
   126		if (info->screen_base == NULL)
   127			return NULL;
   128	
   129		if (info->state != FBINFO_STATE_RUNNING)
   130			return NULL;
   131	
   132		return dev_priv->fbdev;
 > 133	}
   134	
   135	static int intelfb_alloc(struct drm_fb_helper *helper,
   136				 struct drm_fb_helper_surface_size *sizes)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 23574 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
  2016-03-30 18:10   ` kbuild test robot
  2016-03-30 18:10   ` kbuild test robot
@ 2016-03-30 18:26   ` kbuild test robot
  2016-03-30 18:30   ` Chris Wilson
  2016-03-30 18:56   ` [PATCH v3] " Chris Wilson
  4 siblings, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2016-03-30 18:26 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2178 bytes --]

Hi Chris,

[auto build test ERROR on drm-intel/for-linux-next]
[cannot apply to v4.6-rc1 next-20160330]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Protect-fbdev-across-slow-or-failed-initialisation/20160331-015912
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-rhel (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/intel_fbdev.c: In function 'intel_fbdev_get_if_active':
   drivers/gpu/drm/i915/intel_fbdev.c:119:37: warning: initialization from incompatible pointer type
     struct drm_i915_device *dev_priv = to_i915(dev);
                                        ^
>> drivers/gpu/drm/i915/intel_fbdev.c:122:14: error: dereferencing pointer to incomplete type
     if (dev_priv->fbdev == NULL)
                 ^
   drivers/gpu/drm/i915/intel_fbdev.c:125:9: error: 'ifbdev' undeclared (first use in this function)
     info = ifbdev->helper.fbdev;
            ^
   drivers/gpu/drm/i915/intel_fbdev.c:125:9: note: each undeclared identifier is reported only once for each function it appears in
   drivers/gpu/drm/i915/intel_fbdev.c:132:17: error: dereferencing pointer to incomplete type
     return dev_priv->fbdev;
                    ^
   drivers/gpu/drm/i915/intel_fbdev.c:133:1: warning: control reaches end of non-void function [-Wreturn-type]
    }
    ^

vim +122 drivers/gpu/drm/i915/intel_fbdev.c

   113		.fb_debug_enter = drm_fb_helper_debug_enter,
   114		.fb_debug_leave = drm_fb_helper_debug_leave,
   115	};
   116	
   117	static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
   118	{
 > 119		struct drm_i915_device *dev_priv = to_i915(dev);
   120		struct fb_info *info;
   121	
 > 122		if (dev_priv->fbdev == NULL)
   123			return NULL;
   124	
   125		info = ifbdev->helper.fbdev;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 36120 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
                     ` (2 preceding siblings ...)
  2016-03-30 18:26   ` kbuild test robot
@ 2016-03-30 18:30   ` Chris Wilson
  2016-03-30 18:56   ` [PATCH v3] " Chris Wilson
  4 siblings, 0 replies; 25+ messages in thread
From: Chris Wilson @ 2016-03-30 18:30 UTC (permalink / raw)
  To: intel-gfx, Gabriel Feceoru

On Wed, Mar 30, 2016 at 06:57:14PM +0100, Chris Wilson wrote:
> If the initialisation fails, we may be left with a dangling pointer with
> an incomplete fbdev structure. Here we want to disable internal calls
> into fbdev. Similarly, the initialisation may be slow and we haven't yet
> enabled the fbdev (e.g. quick suspend or last-close before the async init
> completes).
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
> Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_fbdev.c | 48 ++++++++++++++++++++++++--------------
>  1 file changed, 30 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 51a5e39e52f2..0bae91268a12 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -114,6 +114,24 @@ static struct fb_ops intelfb_ops = {
>  	.fb_debug_leave = drm_fb_helper_debug_leave,
>  };
>  
> +static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
> +{
> +	struct drm_i915_device *dev_priv = to_i915(dev);

As kbuild reported, the danger of retyping a patch from email and
forgetting you compiled on a machine without fbdev.
s/drm_i915_device/drm_i915_private/
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [REGRESSION] system hang on ILK/SNB/IVB
  2016-03-30 17:20 [REGRESSION] system hang on ILK/SNB/IVB Gabriel Feceoru
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
@ 2016-03-30 18:47 ` Daniel Vetter
  2016-03-30 21:35 ` Lukas Wunner
  2016-03-31 14:42 ` ✗ Fi.CI.BAT: failure for drm/i915: Protect fbdev across slow or failed initialisation (rev2) Patchwork
  3 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2016-03-30 18:47 UTC (permalink / raw)
  To: Gabriel Feceoru; +Cc: Tomi Sarvela, daniel.vetter, intel-gfx

On Wed, Mar 30, 2016 at 08:20:26PM +0300, Gabriel Feceoru wrote:
> This commit causes a hang while running kms suspend tests
> (kms_pipe_crc_basic@suspend-read-crc-pipe-*) on ILK/SNB/IVB, affecting CI.
> 
> Probably the same problem with the one in v2, but on older HW.

I did check the patchwork/BAT-CI results and it looked clean. Is this a
new machine? Should I just revert for now until we have a proper fix?
-Daniel

> 
> 
> commit a7442b93cf32c1e1ddb721a26cd1f92302e2a222
> Author: Lukas Wunner <lukas@wunner.de>
> Date:   Wed Mar 9 12:52:53 2016 +0100
> 
>     drm/i915: Fix races on fbdev
> 
>     The ->lastclose callback invokes intel_fbdev_restore_mode() and has
>     been witnessed to run before intel_fbdev_initial_config_async()
>     has finished.
> 
>     We might likewise receive hotplug events before we've had a chance to
>     fully set up the fbdev.
> 
>     Fix by waiting for the asynchronous thread to finish.
> 
>     v2:
>     An async_synchronize_full() was also added to intel_fbdev_set_suspend()
>     in v1 which turned out to be entirely gratuitous. It caused a deadlock
>     on suspend (discovered by CI, thanks to Damien Lespiau and Tomi Sarvela
>     for CI support) and was unnecessary since a device is never suspended
>     until its ->probe callback (and all asynchronous tasks it scheduled)
>     have finished. See dpm_prepare(), which calls wait_for_device_probe(),
>     which calls async_synchronize_full().
> 
>     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
>     Reported-by: Gustav Fägerlind <gustav.fagerlind@gmail.com>
>     Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
>     Cc: Chris Wilson <chris@chris-wilson.co.uk>
>     Cc: stable@vger.kernel.org
>     Signed-off-by: Lukas Wunner <lukas@wunner.de>
>     Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>     Link: http://patchwork.freedesktop.org/patch/msgid/20160309115147.67B2B6E0D3@gabe.freedesktop.org
> 
> 
> Regards,
> Gabriel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
                     ` (3 preceding siblings ...)
  2016-03-30 18:30   ` Chris Wilson
@ 2016-03-30 18:56   ` Chris Wilson
  2016-03-31 12:00     ` Gabriel Feceoru
  4 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2016-03-30 18:56 UTC (permalink / raw)
  To: intel-gfx, Gabriel Feceoru

If the initialisation fails, we may be left with a dangling pointer with
an incomplete fbdev structure. Here we want to disable internal calls
into fbdev. Similarly, the initialisation may be slow and we haven't yet
enabled the fbdev (e.g. quick suspend or last-close before the async init
completes).

v3: To create a typo introduced when retyping

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_fbdev.c | 48 ++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 153ea7a3fcf6..5029f927fe0d 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -114,6 +114,24 @@ static struct fb_ops intelfb_ops = {
 	.fb_debug_leave = drm_fb_helper_debug_leave,
 };
 
+static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct fb_info *info;
+
+	if (dev_priv->fbdev == NULL)
+		return NULL;
+
+	info = dev_priv->fbdev->helper.fbdev;
+	if (info->screen_base == NULL)
+		return NULL;
+
+	if (info->state != FBINFO_STATE_RUNNING)
+		return NULL;
+
+	return dev_priv->fbdev;
+}
+
 static int intelfb_alloc(struct drm_fb_helper *helper,
 			 struct drm_fb_helper_surface_size *sizes)
 {
@@ -766,6 +784,8 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 		return;
 
 	info = ifbdev->helper.fbdev;
+	if (info->screen_base == NULL)
+		return;
 
 	if (synchronous) {
 		/* Flush any pending work to turn the console on, and then
@@ -807,32 +827,24 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 
 void intel_fbdev_output_poll_changed(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
+
+	if (ifbdev == NULL)
+		return;
 
-	async_synchronize_full();
-	if (dev_priv->fbdev)
-		drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
+	drm_fb_helper_hotplug_event(&ifbdev->helper);
 }
 
 void intel_fbdev_restore_mode(struct drm_device *dev)
 {
-	int ret;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_fbdev *ifbdev = dev_priv->fbdev;
-	struct drm_fb_helper *fb_helper;
+	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
 
-	async_synchronize_full();
-	if (!ifbdev)
+	if (ifbdev == NULL)
 		return;
 
-	fb_helper = &ifbdev->helper;
-
-	ret = drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper);
-	if (ret) {
-		DRM_DEBUG("failed to restore crtc mode\n");
-	} else {
-		mutex_lock(&fb_helper->dev->struct_mutex);
+	if (drm_fb_helper_restore_fbdev_mode_unlocked(&ifbdev->helper) == 0) {
+		mutex_lock(&dev->struct_mutex);
 		intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
-		mutex_unlock(&fb_helper->dev->struct_mutex);
+		mutex_unlock(&dev->struct_mutex);
 	}
 }
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [REGRESSION] system hang on ILK/SNB/IVB
  2016-03-30 17:20 [REGRESSION] system hang on ILK/SNB/IVB Gabriel Feceoru
  2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
  2016-03-30 18:47 ` [REGRESSION] system hang on ILK/SNB/IVB Daniel Vetter
@ 2016-03-30 21:35 ` Lukas Wunner
  2016-03-31  7:21   ` Tomi Sarvela
  2016-03-31  7:42   ` Gabriel Feceoru
  2016-03-31 14:42 ` ✗ Fi.CI.BAT: failure for drm/i915: Protect fbdev across slow or failed initialisation (rev2) Patchwork
  3 siblings, 2 replies; 25+ messages in thread
From: Lukas Wunner @ 2016-03-30 21:35 UTC (permalink / raw)
  To: Gabriel Feceoru; +Cc: Tomi Sarvela, daniel.vetter, intel-gfx

Hi Gabriel,

On Wed, Mar 30, 2016 at 08:20:26PM +0300, Gabriel Feceoru wrote:
> This commit causes a hang while running kms suspend tests
> (kms_pipe_crc_basic@suspend-read-crc-pipe-*) on ILK/SNB/IVB, affecting CI.

This happened with v1 but not with v2 of the patch.
Please check if somehow v1 ended up in your tree.

v2 passed CI fine, save for one warning not caused by the patch:
https://patchwork.freedesktop.org/series/4068/

For comparison, this was v1:
https://patchwork.freedesktop.org/patch/75840/

Best regards,

Lukas

> 
> Probably the same problem with the one in v2, but on older HW.
> 
> 
> commit a7442b93cf32c1e1ddb721a26cd1f92302e2a222
> Author: Lukas Wunner <lukas@wunner.de>
> Date:   Wed Mar 9 12:52:53 2016 +0100
> 
>     drm/i915: Fix races on fbdev
> 
>     The ->lastclose callback invokes intel_fbdev_restore_mode() and has
>     been witnessed to run before intel_fbdev_initial_config_async()
>     has finished.
> 
>     We might likewise receive hotplug events before we've had a chance to
>     fully set up the fbdev.
> 
>     Fix by waiting for the asynchronous thread to finish.
> 
>     v2:
>     An async_synchronize_full() was also added to intel_fbdev_set_suspend()
>     in v1 which turned out to be entirely gratuitous. It caused a deadlock
>     on suspend (discovered by CI, thanks to Damien Lespiau and Tomi Sarvela
>     for CI support) and was unnecessary since a device is never suspended
>     until its ->probe callback (and all asynchronous tasks it scheduled)
>     have finished. See dpm_prepare(), which calls wait_for_device_probe(),
>     which calls async_synchronize_full().
> 
>     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
>     Reported-by: Gustav Fägerlind <gustav.fagerlind@gmail.com>
>     Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
>     Cc: Chris Wilson <chris@chris-wilson.co.uk>
>     Cc: stable@vger.kernel.org
>     Signed-off-by: Lukas Wunner <lukas@wunner.de>
>     Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>     Link: http://patchwork.freedesktop.org/patch/msgid/20160309115147.67B2B6E0D3@gabe.freedesktop.org
> 
> 
> Regards,
> Gabriel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [REGRESSION] system hang on ILK/SNB/IVB
  2016-03-30 21:35 ` Lukas Wunner
@ 2016-03-31  7:21   ` Tomi Sarvela
  2016-03-31 20:35     ` Lukas Wunner
  2016-03-31  7:42   ` Gabriel Feceoru
  1 sibling, 1 reply; 25+ messages in thread
From: Tomi Sarvela @ 2016-03-31  7:21 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: intel-gfx, daniel.vetter

Hello Lukas,

The problem with the results in your link is that there is no HSW, ILK, IVB or 
SNB results. This might give the impression that everything is well.

Most damning is lack of HSW-gt2 and SNB-dellxps: those machines hang on to 
APC, and have run quite stably for every Patchwork run. The case isn't strong 
enough yet that series should fail if either of those won't run, but it might 
be so in future.

Tomi



On Wednesday 30 March 2016 23:35:08 Lukas Wunner wrote:
> Hi Gabriel,
> 
> On Wed, Mar 30, 2016 at 08:20:26PM +0300, Gabriel Feceoru wrote:
> > This commit causes a hang while running kms suspend tests
> > (kms_pipe_crc_basic@suspend-read-crc-pipe-*) on ILK/SNB/IVB, affecting CI.
> 
> This happened with v1 but not with v2 of the patch.
> Please check if somehow v1 ended up in your tree.
> 
> v2 passed CI fine, save for one warning not caused by the patch:
> https://patchwork.freedesktop.org/series/4068/
> 
> For comparison, this was v1:
> https://patchwork.freedesktop.org/patch/75840/
> 
> Best regards,
> 
> Lukas
> 
> > Probably the same problem with the one in v2, but on older HW.
> > 
> > 
> > commit a7442b93cf32c1e1ddb721a26cd1f92302e2a222
> > Author: Lukas Wunner <lukas@wunner.de>
> > Date:   Wed Mar 9 12:52:53 2016 +0100
> > 
> >     drm/i915: Fix races on fbdev
> >     
> >     The ->lastclose callback invokes intel_fbdev_restore_mode() and has
> >     been witnessed to run before intel_fbdev_initial_config_async()
> >     has finished.
> >     
> >     We might likewise receive hotplug events before we've had a chance to
> >     fully set up the fbdev.
> >     
> >     Fix by waiting for the asynchronous thread to finish.
> >     
> >     v2:
> >     An async_synchronize_full() was also added to
> >     intel_fbdev_set_suspend()
> >     in v1 which turned out to be entirely gratuitous. It caused a deadlock
> >     on suspend (discovered by CI, thanks to Damien Lespiau and Tomi
> >     Sarvela
> >     for CI support) and was unnecessary since a device is never suspended
> >     until its ->probe callback (and all asynchronous tasks it scheduled)
> >     have finished. See dpm_prepare(), which calls wait_for_device_probe(),
> >     which calls async_synchronize_full().
> >     
> >     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
> >     Reported-by: Gustav Fägerlind <gustav.fagerlind@gmail.com>
> >     Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
> >     Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >     Cc: stable@vger.kernel.org
> >     Signed-off-by: Lukas Wunner <lukas@wunner.de>
> >     Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >     Link:
> >     http://patchwork.freedesktop.org/patch/msgid/20160309115147.67B2B6E0D
> >     3@gabe.freedesktop.org> 
> > Regards,
> > Gabriel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [REGRESSION] system hang on ILK/SNB/IVB
  2016-03-30 21:35 ` Lukas Wunner
  2016-03-31  7:21   ` Tomi Sarvela
@ 2016-03-31  7:42   ` Gabriel Feceoru
  2016-03-31 15:23     ` Lukas Wunner
  1 sibling, 1 reply; 25+ messages in thread
From: Gabriel Feceoru @ 2016-03-31  7:42 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Tomi Sarvela, daniel.vetter, intel-gfx



On 31.03.2016 00:35, Lukas Wunner wrote:
> Hi Gabriel,
>
> On Wed, Mar 30, 2016 at 08:20:26PM +0300, Gabriel Feceoru wrote:
>> This commit causes a hang while running kms suspend tests
>> (kms_pipe_crc_basic@suspend-read-crc-pipe-*) on ILK/SNB/IVB, affecting CI.
>
> This happened with v1 but not with v2 of the patch.
> Please check if somehow v1 ended up in your tree.

It's v2.

Tomi already replied, meantime I also looked at the results.
The current regression is for ILK/SNB/IVB only (v1 seemed to affect more 
platforms).
Unfortunately these machines were not available when v2 was tested, so 
this couldn't be detected.

Regards,
Gabriel.

>
> v2 passed CI fine, save for one warning not caused by the patch:
> https://patchwork.freedesktop.org/series/4068/
>
> For comparison, this was v1:
> https://patchwork.freedesktop.org/patch/75840/
>
> Best regards,
>
> Lukas
>
>>
>> Probably the same problem with the one in v2, but on older HW.
>>
>>
>> commit a7442b93cf32c1e1ddb721a26cd1f92302e2a222
>> Author: Lukas Wunner <lukas@wunner.de>
>> Date:   Wed Mar 9 12:52:53 2016 +0100
>>
>>      drm/i915: Fix races on fbdev
>>
>>      The ->lastclose callback invokes intel_fbdev_restore_mode() and has
>>      been witnessed to run before intel_fbdev_initial_config_async()
>>      has finished.
>>
>>      We might likewise receive hotplug events before we've had a chance to
>>      fully set up the fbdev.
>>
>>      Fix by waiting for the asynchronous thread to finish.
>>
>>      v2:
>>      An async_synchronize_full() was also added to intel_fbdev_set_suspend()
>>      in v1 which turned out to be entirely gratuitous. It caused a deadlock
>>      on suspend (discovered by CI, thanks to Damien Lespiau and Tomi Sarvela
>>      for CI support) and was unnecessary since a device is never suspended
>>      until its ->probe callback (and all asynchronous tasks it scheduled)
>>      have finished. See dpm_prepare(), which calls wait_for_device_probe(),
>>      which calls async_synchronize_full().
>>
>>      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
>>      Reported-by: Gustav Fägerlind <gustav.fagerlind@gmail.com>
>>      Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
>>      Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>      Cc: stable@vger.kernel.org
>>      Signed-off-by: Lukas Wunner <lukas@wunner.de>
>>      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>      Link: http://patchwork.freedesktop.org/patch/msgid/20160309115147.67B2B6E0D3@gabe.freedesktop.org
>>
>>
>> Regards,
>> Gabriel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-30 18:56   ` [PATCH v3] " Chris Wilson
@ 2016-03-31 12:00     ` Gabriel Feceoru
  2016-03-31 13:57       ` [PATCH v4 1/2] " Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: Gabriel Feceoru @ 2016-03-31 12:00 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

This almost fixes the problem , but with this:

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
b/drivers/gpu/drm/i915/intel_fbdev.c
index 5029f92..a6d3c58 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -123,8 +123,9 @@ static struct intel_fbdev 
*intel_fbdev_get_if_active(struct drm_device *dev)
                 return NULL;

         info = dev_priv->fbdev->helper.fbdev;
-       if (info->screen_base == NULL)
+       if (info == NULL || info->screen_base == NULL)
                 return NULL;

         if (info->state != FBINFO_STATE_RUNNING)
                 return NULL;


I got initially a page fault during init, but with this fix, the suspend 
test passes.

Thank you, please consider this.

Regards,
Gabriel



On 30.03.2016 21:56, Chris Wilson wrote:
> If the initialisation fails, we may be left with a dangling pointer with
> an incomplete fbdev structure. Here we want to disable internal calls
> into fbdev. Similarly, the initialisation may be slow and we haven't yet
> enabled the fbdev (e.g. quick suspend or last-close before the async init
> completes).
>
> v3: To create a typo introduced when retyping
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
> Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_fbdev.c | 48 ++++++++++++++++++++++++--------------
>   1 file changed, 30 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 153ea7a3fcf6..5029f927fe0d 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -114,6 +114,24 @@ static struct fb_ops intelfb_ops = {
>   	.fb_debug_leave = drm_fb_helper_debug_leave,
>   };
>
> +static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct fb_info *info;
> +
> +	if (dev_priv->fbdev == NULL)
> +		return NULL;
> +
> +	info = dev_priv->fbdev->helper.fbdev;
> +	if (info->screen_base == NULL)
> +		return NULL;
> +
> +	if (info->state != FBINFO_STATE_RUNNING)
> +		return NULL;
> +
> +	return dev_priv->fbdev;
> +}
> +
>   static int intelfb_alloc(struct drm_fb_helper *helper,
>   			 struct drm_fb_helper_surface_size *sizes)
>   {
> @@ -766,6 +784,8 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
>   		return;
>
>   	info = ifbdev->helper.fbdev;
> +	if (info->screen_base == NULL)
> +		return;
>
>   	if (synchronous) {
>   		/* Flush any pending work to turn the console on, and then
> @@ -807,32 +827,24 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
>
>   void intel_fbdev_output_poll_changed(struct drm_device *dev)
>   {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
> +
> +	if (ifbdev == NULL)
> +		return;
>
> -	async_synchronize_full();
> -	if (dev_priv->fbdev)
> -		drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
> +	drm_fb_helper_hotplug_event(&ifbdev->helper);
>   }
>
>   void intel_fbdev_restore_mode(struct drm_device *dev)
>   {
> -	int ret;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_fbdev *ifbdev = dev_priv->fbdev;
> -	struct drm_fb_helper *fb_helper;
> +	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
>
> -	async_synchronize_full();
> -	if (!ifbdev)
> +	if (ifbdev == NULL)
>   		return;
>
> -	fb_helper = &ifbdev->helper;
> -
> -	ret = drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper);
> -	if (ret) {
> -		DRM_DEBUG("failed to restore crtc mode\n");
> -	} else {
> -		mutex_lock(&fb_helper->dev->struct_mutex);
> +	if (drm_fb_helper_restore_fbdev_mode_unlocked(&ifbdev->helper) == 0) {
> +		mutex_lock(&dev->struct_mutex);
>   		intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
> -		mutex_unlock(&fb_helper->dev->struct_mutex);
> +		mutex_unlock(&dev->struct_mutex);
>   	}
>   }
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-31 12:00     ` Gabriel Feceoru
@ 2016-03-31 13:57       ` Chris Wilson
  2016-03-31 13:57         ` [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev Chris Wilson
  2016-03-31 16:05         ` [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation Joonas Lahtinen
  0 siblings, 2 replies; 25+ messages in thread
From: Chris Wilson @ 2016-03-31 13:57 UTC (permalink / raw)
  To: intel-gfx, Gabriel Feceoru

If the initialisation fails, we may be left with a dangling pointer with
an incomplete fbdev structure. Here we want to disable internal calls
into fbdev. Similarly, the initialisation may be slow and we haven't yet
enabled the fbdev (e.g. quick suspend or last-close before the async init
completes).

v3: To create a typo introduced when retyping
v4: Prevent info==NULL dereference in early boot

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
Tested-by: Gabriel Feceoru <gabriel.feceoru@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_fbdev.c | 72 +++++++++++++++++++++++++-------------
 1 file changed, 48 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 153ea7a3fcf6..5d4be71bdf22 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -756,17 +756,47 @@ void intel_fbdev_fini(struct drm_device *dev)
 	dev_priv->fbdev = NULL;
 }
 
+static struct intel_fbdev *intel_fbdev_get(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct fb_info *info;
+
+	if (dev_priv->fbdev == NULL)
+		return NULL;
+
+	info = dev_priv->fbdev->helper.fbdev;
+	if (info == NULL)
+		return NULL;
+
+	if (info->screen_base == NULL)
+		return NULL;
+
+	return dev_priv->fbdev;
+}
+
+static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
+{
+	struct intel_fbdev *ifbdev;
+
+	ifbdev = intel_fbdev_get(dev);
+	if (ifbdev == NULL)
+		return NULL;
+
+	if (ifbdev->helper.fbdev->state != FBINFO_STATE_RUNNING)
+		return NULL;
+
+	return ifbdev;
+}
+
 void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_fbdev *ifbdev = dev_priv->fbdev;
-	struct fb_info *info;
+	struct intel_fbdev *ifbdev;
 
-	if (!ifbdev)
+	ifbdev = intel_fbdev_get(dev);
+	if (ifbdev == NULL)
 		return;
 
-	info = ifbdev->helper.fbdev;
-
 	if (synchronous) {
 		/* Flush any pending work to turn the console on, and then
 		 * wait to turn it off. It must be synchronous as we are
@@ -798,8 +828,10 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 	 * been restored from swap. If the object is stolen however, it will be
 	 * full of whatever garbage was left in there.
 	 */
-	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen)
+	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
+		struct fb_info *info = ifbdev->helper.fbdev;
 		memset_io(info->screen_base, 0, info->screen_size);
+	}
 
 	drm_fb_helper_set_suspend(&ifbdev->helper, state);
 	console_unlock();
@@ -807,32 +839,24 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 
 void intel_fbdev_output_poll_changed(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
+
+	if (ifbdev == NULL)
+		return;
 
-	async_synchronize_full();
-	if (dev_priv->fbdev)
-		drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
+	drm_fb_helper_hotplug_event(&ifbdev->helper);
 }
 
 void intel_fbdev_restore_mode(struct drm_device *dev)
 {
-	int ret;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_fbdev *ifbdev = dev_priv->fbdev;
-	struct drm_fb_helper *fb_helper;
+	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
 
-	async_synchronize_full();
-	if (!ifbdev)
+	if (ifbdev == NULL)
 		return;
 
-	fb_helper = &ifbdev->helper;
-
-	ret = drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper);
-	if (ret) {
-		DRM_DEBUG("failed to restore crtc mode\n");
-	} else {
-		mutex_lock(&fb_helper->dev->struct_mutex);
+	if (drm_fb_helper_restore_fbdev_mode_unlocked(&ifbdev->helper) == 0) {
+		mutex_lock(&dev->struct_mutex);
 		intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
-		mutex_unlock(&fb_helper->dev->struct_mutex);
+		mutex_unlock(&dev->struct_mutex);
 	}
 }
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev
  2016-03-31 13:57       ` [PATCH v4 1/2] " Chris Wilson
@ 2016-03-31 13:57         ` Chris Wilson
  2016-03-31 15:22           ` Joonas Lahtinen
  2016-03-31 16:05         ` [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation Joonas Lahtinen
  1 sibling, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2016-03-31 13:57 UTC (permalink / raw)
  To: intel-gfx, Gabriel Feceoru

Since the suspend_work is entirely internal to intel_fbdev.c, move it
from the top level drm_i915_private and into struct intel_fbdev. This
requires splitting the internal interface for the suspend worker from
the external interface used by the higher layers to initiate suspend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c    |   4 +-
 drivers/gpu/drm/i915/i915_drv.h    |   1 -
 drivers/gpu/drm/i915/intel_drv.h   |   5 +-
 drivers/gpu/drm/i915/intel_fbdev.c | 117 ++++++++++++++++++++-----------------
 4 files changed, 68 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 020a31c5e2bb..100d0d92b1e6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -634,7 +634,7 @@ static int i915_drm_suspend(struct drm_device *dev)
 	intel_uncore_forcewake_reset(dev, false);
 	intel_opregion_fini(dev);
 
-	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
+	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED);
 
 	dev_priv->suspend_count++;
 
@@ -784,7 +784,7 @@ static int i915_drm_resume(struct drm_device *dev)
 
 	intel_opregion_init(dev);
 
-	intel_fbdev_set_suspend(dev, FBINFO_STATE_RUNNING, false);
+	intel_fbdev_set_suspend(dev, FBINFO_STATE_RUNNING);
 
 	mutex_lock(&dev_priv->modeset_restore_lock);
 	dev_priv->modeset_restore = MODESET_DONE;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 820c91f551ba..973a602c5077 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1880,7 +1880,6 @@ struct drm_i915_private {
 #ifdef CONFIG_DRM_FBDEV_EMULATION
 	/* list of fbdev register on this device */
 	struct intel_fbdev *fbdev;
-	struct work_struct fbdev_suspend_work;
 #endif
 
 	struct drm_property *broadcast_rgb_property;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 6ac46d921cde..00b6c60c1cb8 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -158,6 +158,7 @@ struct intel_framebuffer {
 struct intel_fbdev {
 	struct drm_fb_helper helper;
 	struct intel_framebuffer *fb;
+	struct work_struct suspend_work;
 	int preferred_bpp;
 };
 
@@ -1335,7 +1336,7 @@ void intel_dvo_init(struct drm_device *dev);
 extern int intel_fbdev_init(struct drm_device *dev);
 extern void intel_fbdev_initial_config_async(struct drm_device *dev);
 extern void intel_fbdev_fini(struct drm_device *dev);
-extern void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous);
+extern void intel_fbdev_set_suspend(struct drm_device *dev, int state);
 extern void intel_fbdev_output_poll_changed(struct drm_device *dev);
 extern void intel_fbdev_restore_mode(struct drm_device *dev);
 #else
@@ -1352,7 +1353,7 @@ static inline void intel_fbdev_fini(struct drm_device *dev)
 {
 }
 
-static inline void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
+static inline void intel_fbdev_set_suspend(struct drm_device *dev, int state)
 {
 }
 
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 5d4be71bdf22..66bb79613660 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -534,8 +534,7 @@ static const struct drm_fb_helper_funcs intel_fb_helper_funcs = {
 	.fb_probe = intelfb_create,
 };
 
-static void intel_fbdev_destroy(struct drm_device *dev,
-				struct intel_fbdev *ifbdev)
+static void intel_fbdev_destroy(struct intel_fbdev *ifbdev)
 {
 	/* We rely on the object-free to release the VMA pinning for
 	 * the info->screen_base mmaping. Leaking the VMA is simpler than
@@ -681,13 +680,56 @@ out:
 	return false;
 }
 
+static void __intel_fbdev_set_suspend(struct intel_fbdev *ifbdev,
+				      int state, bool synchronous)
+{
+	if (synchronous) {
+		/* Flush any pending work to turn the console on, and then
+		 * wait to turn it off. It must be synchronous as we are
+		 * about to suspend or unload the driver.
+		 *
+		 * Note that from within the work-handler, we cannot flush
+		 * ourselves, so only flush outstanding work upon suspend!
+		 */
+		if (state != FBINFO_STATE_RUNNING)
+			flush_work(&ifbdev->suspend_work);
+		console_lock();
+	} else {
+		/*
+		 * The console lock can be pretty contented on resume due
+		 * to all the printk activity.  Try to keep it out of the hot
+		 * path of resume if possible.
+		 */
+		WARN_ON(state != FBINFO_STATE_RUNNING);
+		if (!console_trylock()) {
+			/* Don't block our own workqueue as this can
+			 * be run in parallel with other i915.ko tasks.
+			 */
+			schedule_work(&ifbdev->suspend_work);
+			return;
+		}
+	}
+
+	/* On resume from hibernation: If the object is shmemfs backed, it has
+	 * been restored from swap. If the object is stolen however, it will be
+	 * full of whatever garbage was left in there.
+	 */
+	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
+		struct fb_info *info = ifbdev->helper.fbdev;
+		memset_io(info->screen_base, 0, info->screen_size);
+	}
+
+	drm_fb_helper_set_suspend(&ifbdev->helper, state);
+	console_unlock();
+}
+
 static void intel_fbdev_suspend_worker(struct work_struct *work)
 {
-	intel_fbdev_set_suspend(container_of(work,
-					     struct drm_i915_private,
-					     fbdev_suspend_work)->dev,
-				FBINFO_STATE_RUNNING,
-				true);
+	__intel_fbdev_set_suspend(container_of(work,
+					       struct intel_fbdev,
+					       suspend_work),
+				  FBINFO_STATE_RUNNING,
+				  true);
 }
 
 int intel_fbdev_init(struct drm_device *dev)
@@ -716,9 +758,9 @@ int intel_fbdev_init(struct drm_device *dev)
 	}
 
 	ifbdev->helper.atomic = true;
+	INIT_WORK(&ifbdev->suspend_work, intel_fbdev_suspend_worker);
 
 	dev_priv->fbdev = ifbdev;
-	INIT_WORK(&dev_priv->fbdev_suspend_work, intel_fbdev_suspend_worker);
 
 	drm_fb_helper_single_add_all_connectors(&ifbdev->helper);
 
@@ -743,17 +785,21 @@ void intel_fbdev_initial_config_async(struct drm_device *dev)
 
 void intel_fbdev_fini(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	if (!dev_priv->fbdev)
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_fbdev *ifbdev;
+
+	ifbdev = dev_priv->fbdev;
+	if (ifbdev == NULL)
 		return;
 
-	flush_work(&dev_priv->fbdev_suspend_work);
+	dev_priv->fbdev = NULL;
 
 	if (!current_is_async())
 		async_synchronize_full();
-	intel_fbdev_destroy(dev, dev_priv->fbdev);
-	kfree(dev_priv->fbdev);
-	dev_priv->fbdev = NULL;
+	flush_work(&ifbdev->suspend_work);
+
+	intel_fbdev_destroy(ifbdev);
+	kfree(ifbdev);
 }
 
 static struct intel_fbdev *intel_fbdev_get(struct drm_device *dev)
@@ -788,53 +834,16 @@ static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
 	return ifbdev;
 }
 
-void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
+void intel_fbdev_set_suspend(struct drm_device *dev, int state)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_fbdev *ifbdev;
 
 	ifbdev = intel_fbdev_get(dev);
 	if (ifbdev == NULL)
 		return;
 
-	if (synchronous) {
-		/* Flush any pending work to turn the console on, and then
-		 * wait to turn it off. It must be synchronous as we are
-		 * about to suspend or unload the driver.
-		 *
-		 * Note that from within the work-handler, we cannot flush
-		 * ourselves, so only flush outstanding work upon suspend!
-		 */
-		if (state != FBINFO_STATE_RUNNING)
-			flush_work(&dev_priv->fbdev_suspend_work);
-		console_lock();
-	} else {
-		/*
-		 * The console lock can be pretty contented on resume due
-		 * to all the printk activity.  Try to keep it out of the hot
-		 * path of resume if possible.
-		 */
-		WARN_ON(state != FBINFO_STATE_RUNNING);
-		if (!console_trylock()) {
-			/* Don't block our own workqueue as this can
-			 * be run in parallel with other i915.ko tasks.
-			 */
-			schedule_work(&dev_priv->fbdev_suspend_work);
-			return;
-		}
-	}
-
-	/* On resume from hibernation: If the object is shmemfs backed, it has
-	 * been restored from swap. If the object is stolen however, it will be
-	 * full of whatever garbage was left in there.
-	 */
-	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
-		struct fb_info *info = ifbdev->helper.fbdev;
-		memset_io(info->screen_base, 0, info->screen_size);
-	}
-
-	drm_fb_helper_set_suspend(&ifbdev->helper, state);
-	console_unlock();
+	__intel_fbdev_set_suspend(ifbdev, state,
+				  state != FBINFO_STATE_RUNNING);
 }
 
 void intel_fbdev_output_poll_changed(struct drm_device *dev)
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Protect fbdev across slow or failed initialisation (rev2)
  2016-03-30 17:20 [REGRESSION] system hang on ILK/SNB/IVB Gabriel Feceoru
                   ` (2 preceding siblings ...)
  2016-03-30 21:35 ` Lukas Wunner
@ 2016-03-31 14:42 ` Patchwork
  3 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2016-03-31 14:42 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Protect fbdev across slow or failed initialisation (rev2)
URL   : https://patchwork.freedesktop.org/series/5065/
State : failure

== Summary ==

Series 5065v2 drm/i915: Protect fbdev across slow or failed initialisation
http://patchwork.freedesktop.org/api/1.0/series/5065/revisions/2/mbox/

Test drv_module_reload_basic:
                pass       -> INCOMPLETE (bsw-nuc-2)
                pass       -> INCOMPLETE (skl-nuci5)
                pass       -> INCOMPLETE (byt-nuc)
Test gem_exec_suspend:
        Subgroup basic-s3:
                pass       -> DMESG-WARN (bsw-nuc-2)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                dmesg-warn -> PASS       (bsw-nuc-2)
Test pm_rpm:
        Subgroup basic-rte:
                pass       -> DMESG-WARN (bsw-nuc-2)

bsw-nuc-2        total:154  pass:122  dwarn:2   dfail:0   fail:0   skip:29 
byt-nuc          total:57   pass:46   dwarn:0   dfail:0   fail:0   skip:10 
hsw-brixbox      total:196  pass:174  dwarn:0   dfail:0   fail:0   skip:22 
skl-nuci5        total:25   pass:22   dwarn:0   dfail:0   fail:0   skip:2  

Results at /archive/results/CI_IGT_test/Patchwork_1761/

03c0f854e93263563f559d2bc8e47fb51adae697 drm-intel-nightly: 2016y-03m-31d-10h-50m-15s UTC integration manifest
ab3a2f88b9f8b6a3109c325d42770c3c5fd714a4 drm/i915: Protect fbdev across slow or failed initialisation

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev
  2016-03-31 13:57         ` [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev Chris Wilson
@ 2016-03-31 15:22           ` Joonas Lahtinen
  2016-03-31 15:30             ` Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: Joonas Lahtinen @ 2016-03-31 15:22 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Gabriel Feceoru

On to, 2016-03-31 at 14:57 +0100, Chris Wilson wrote:
> Since the suspend_work is entirely internal to intel_fbdev.c, move it
> from the top level drm_i915_private and into struct intel_fbdev. This
> requires splitting the internal interface for the suspend worker from
> the external interface used by the higher layers to initiate suspend.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.c    |   4 +-
>  drivers/gpu/drm/i915/i915_drv.h    |   1 -
>  drivers/gpu/drm/i915/intel_drv.h   |   5 +-
>  drivers/gpu/drm/i915/intel_fbdev.c | 117 ++++++++++++++++++++-----------------
>  4 files changed, 68 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 020a31c5e2bb..100d0d92b1e6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -634,7 +634,7 @@ static int i915_drm_suspend(struct drm_device *dev)
>  	intel_uncore_forcewake_reset(dev, false);
>  	intel_opregion_fini(dev);
>  
> -	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
> +	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED);
>  
>  	dev_priv->suspend_count++;
>  
> @@ -784,7 +784,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  
>  	intel_opregion_init(dev);
>  
> -	intel_fbdev_set_suspend(dev, FBINFO_STATE_RUNNING, false);
> +	intel_fbdev_set_suspend(dev, FBINFO_STATE_RUNNING);
>  
>  	mutex_lock(&dev_priv->modeset_restore_lock);
>  	dev_priv->modeset_restore = MODESET_DONE;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 820c91f551ba..973a602c5077 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1880,7 +1880,6 @@ struct drm_i915_private {
>  #ifdef CONFIG_DRM_FBDEV_EMULATION
>  	/* list of fbdev register on this device */
>  	struct intel_fbdev *fbdev;
> -	struct work_struct fbdev_suspend_work;
>  #endif
>  
>  	struct drm_property *broadcast_rgb_property;
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 6ac46d921cde..00b6c60c1cb8 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -158,6 +158,7 @@ struct intel_framebuffer {
>  struct intel_fbdev {
>  	struct drm_fb_helper helper;
>  	struct intel_framebuffer *fb;
> +	struct work_struct suspend_work;
>  	int preferred_bpp;
>  };
>  
> @@ -1335,7 +1336,7 @@ void intel_dvo_init(struct drm_device *dev);
>  extern int intel_fbdev_init(struct drm_device *dev);
>  extern void intel_fbdev_initial_config_async(struct drm_device *dev);
>  extern void intel_fbdev_fini(struct drm_device *dev);
> -extern void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous);
> +extern void intel_fbdev_set_suspend(struct drm_device *dev, int state);
>  extern void intel_fbdev_output_poll_changed(struct drm_device *dev);
>  extern void intel_fbdev_restore_mode(struct drm_device *dev);
>  #else
> @@ -1352,7 +1353,7 @@ static inline void intel_fbdev_fini(struct drm_device *dev)
>  {
>  }
>  
> -static inline void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
> +static inline void intel_fbdev_set_suspend(struct drm_device *dev, int state)
>  {
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 5d4be71bdf22..66bb79613660 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -534,8 +534,7 @@ static const struct drm_fb_helper_funcs intel_fb_helper_funcs = {
>  	.fb_probe = intelfb_create,
>  };
>  
> -static void intel_fbdev_destroy(struct drm_device *dev,
> -				struct intel_fbdev *ifbdev)
> +static void intel_fbdev_destroy(struct intel_fbdev *ifbdev)
>  {
>  	/* We rely on the object-free to release the VMA pinning for
>  	 * the info->screen_base mmaping. Leaking the VMA is simpler than
> @@ -681,13 +680,56 @@ out:
>  	return false;
>  }
>  
> +static void __intel_fbdev_set_suspend(struct intel_fbdev *ifbdev,
> +				      int state, bool synchronous)
> +{
> +	if (synchronous) {
> +		/* Flush any pending work to turn the console on, and then
> +		 * wait to turn it off. It must be synchronous as we are
> +		 * about to suspend or unload the driver.
> +		 *
> +		 * Note that from within the work-handler, we cannot flush
> +		 * ourselves, so only flush outstanding work upon suspend!
> +		 */
> +		if (state != FBINFO_STATE_RUNNING)
> +			flush_work(&ifbdev->suspend_work);
> +		console_lock();
> +	} else {
> +		/*
> +		 * The console lock can be pretty contented on resume due
> +		 * to all the printk activity.  Try to keep it out of the hot
> +		 * path of resume if possible.
> +		 */
> +		WARN_ON(state != FBINFO_STATE_RUNNING);
> +		if (!console_trylock()) {
> +			/* Don't block our own workqueue as this can
> +			 * be run in parallel with other i915.ko tasks.
> +			 */
> +			schedule_work(&ifbdev->suspend_work);
> +			return;
> +		}
> +	}
> +
> +	/* On resume from hibernation: If the object is shmemfs backed, it has
> +	 * been restored from swap. If the object is stolen however, it will be
> +	 * full of whatever garbage was left in there.
> +	 */
> +	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
> +		struct fb_info *info = ifbdev->helper.fbdev;
> +		memset_io(info->screen_base, 0, info->screen_size);
> +	}
> +
> +	drm_fb_helper_set_suspend(&ifbdev->helper, state);
> +	console_unlock();
> +}
> +
>  static void intel_fbdev_suspend_worker(struct work_struct *work)
>  {
> -	intel_fbdev_set_suspend(container_of(work,
> -					     struct drm_i915_private,
> -					     fbdev_suspend_work)->dev,
> -				FBINFO_STATE_RUNNING,
> -				true);
> +	__intel_fbdev_set_suspend(container_of(work,
> +					       struct intel_fbdev,
> +					       suspend_work),

Have the container_of on separate line at least, maybe even with a
macro.

> +				  FBINFO_STATE_RUNNING,
> +				  true);
>  }
>  
>  int intel_fbdev_init(struct drm_device *dev)
> @@ -716,9 +758,9 @@ int intel_fbdev_init(struct drm_device *dev)
>  	}
>  
>  	ifbdev->helper.atomic = true;
> +	INIT_WORK(&ifbdev->suspend_work, intel_fbdev_suspend_worker);
>  
>  	dev_priv->fbdev = ifbdev;
> -	INIT_WORK(&dev_priv->fbdev_suspend_work, intel_fbdev_suspend_worker);
>  
>  	drm_fb_helper_single_add_all_connectors(&ifbdev->helper);
>  
> @@ -743,17 +785,21 @@ void intel_fbdev_initial_config_async(struct drm_device *dev)
>  
>  void intel_fbdev_fini(struct drm_device *dev)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	if (!dev_priv->fbdev)
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct intel_fbdev *ifbdev;
> +
> +	ifbdev = dev_priv->fbdev;

Straight assignment.

> +	if (ifbdev == NULL)
>  		return;
>  
> -	flush_work(&dev_priv->fbdev_suspend_work);
> +	dev_priv->fbdev = NULL;
>  
>  	if (!current_is_async())
>  		async_synchronize_full();
> -	intel_fbdev_destroy(dev, dev_priv->fbdev);
> -	kfree(dev_priv->fbdev);
> -	dev_priv->fbdev = NULL;
> +	flush_work(&ifbdev->suspend_work);
> +
> +	intel_fbdev_destroy(ifbdev);
> +	kfree(ifbdev);
>  }
>  
>  static struct intel_fbdev *intel_fbdev_get(struct drm_device *dev)
> @@ -788,53 +834,16 @@ static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
>  	return ifbdev;
>  }
>  
> -void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
> +void intel_fbdev_set_suspend(struct drm_device *dev, int state)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_fbdev *ifbdev;
>  
>  	ifbdev = intel_fbdev_get(dev);
>  	if (ifbdev == NULL)
>  		return;
>  
> -	if (synchronous) {
> -		/* Flush any pending work to turn the console on, and then
> -		 * wait to turn it off. It must be synchronous as we are
> -		 * about to suspend or unload the driver.
> -		 *
> -		 * Note that from within the work-handler, we cannot flush
> -		 * ourselves, so only flush outstanding work upon suspend!
> -		 */
> -		if (state != FBINFO_STATE_RUNNING)
> -			flush_work(&dev_priv->fbdev_suspend_work);
> -		console_lock();
> -	} else {
> -		/*
> -		 * The console lock can be pretty contented on resume due
> -		 * to all the printk activity.  Try to keep it out of the hot
> -		 * path of resume if possible.
> -		 */
> -		WARN_ON(state != FBINFO_STATE_RUNNING);
> -		if (!console_trylock()) {
> -			/* Don't block our own workqueue as this can
> -			 * be run in parallel with other i915.ko tasks.
> -			 */
> -			schedule_work(&dev_priv->fbdev_suspend_work);
> -			return;
> -		}
> -	}
> -
> -	/* On resume from hibernation: If the object is shmemfs backed, it has
> -	 * been restored from swap. If the object is stolen however, it will be
> -	 * full of whatever garbage was left in there.
> -	 */
> -	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
> -		struct fb_info *info = ifbdev->helper.fbdev;
> -		memset_io(info->screen_base, 0, info->screen_size);
> -	}
> -
> -	drm_fb_helper_set_suspend(&ifbdev->helper, state);
> -	console_unlock();
> +	__intel_fbdev_set_suspend(ifbdev, state,
> +				  state != FBINFO_STATE_RUNNING);

Could have changed this function name to make it easier to spot what
you changed or not. Code motion as separate patches.

With above addressed,

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

>  }
>  
>  void intel_fbdev_output_poll_changed(struct drm_device *dev)
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [REGRESSION] system hang on ILK/SNB/IVB
  2016-03-31  7:42   ` Gabriel Feceoru
@ 2016-03-31 15:23     ` Lukas Wunner
  0 siblings, 0 replies; 25+ messages in thread
From: Lukas Wunner @ 2016-03-31 15:23 UTC (permalink / raw)
  To: Gabriel Feceoru; +Cc: Tomi Sarvela, daniel.vetter, intel-gfx

Hi Gabriel,

On Thu, Mar 31, 2016 at 10:42:37AM +0300, Gabriel Feceoru wrote:
> On 31.03.2016 00:35, Lukas Wunner wrote:
> >On Wed, Mar 30, 2016 at 08:20:26PM +0300, Gabriel Feceoru wrote:
> >>This commit causes a hang while running kms suspend tests
> >>(kms_pipe_crc_basic@suspend-read-crc-pipe-*) on ILK/SNB/IVB, affecting CI.
> 
> Tomi already replied, meantime I also looked at the results.
> The current regression is for ILK/SNB/IVB only (v1 seemed to affect more
> platforms).
> Unfortunately these machines were not available when v2 was tested, so this
> couldn't be detected.

I dev on an IVB machine and cannot reproduce this. Suspend works fine.

All the patch does is call async_synchronize_full()
(1) when a hotplug event arrives or
(2) when the last DRM client closes the connection.
Either of these two things seems to be happening on your test machines
when running the suspend test.

The PM core suspends and resumes individual devices asynchronously and
calls async_synchronize_full() in a couple of places. If a device's PM
callbacks also call async_synchronize_full(), the machine deadlocks.

It is unnecessary that we call async_synchronize_full(), we only need
to synchronize up to a specific cookie (which represents initialization
of the fbdev). So I've just posted a patch to replace the calls to
async_synchronize_full() with async_synchronize_cookie(). This should
make things less fragile and hopefully also solve the hangs you're seeing.

Best regards,

Lukas

> >>
> >>Probably the same problem with the one in v2, but on older HW.
> >>
> >>
> >>commit a7442b93cf32c1e1ddb721a26cd1f92302e2a222
> >>Author: Lukas Wunner <lukas@wunner.de>
> >>Date:   Wed Mar 9 12:52:53 2016 +0100
> >>
> >>     drm/i915: Fix races on fbdev
> >>
> >>     The ->lastclose callback invokes intel_fbdev_restore_mode() and has
> >>     been witnessed to run before intel_fbdev_initial_config_async()
> >>     has finished.
> >>
> >>     We might likewise receive hotplug events before we've had a chance to
> >>     fully set up the fbdev.
> >>
> >>     Fix by waiting for the asynchronous thread to finish.
> >>
> >>     v2:
> >>     An async_synchronize_full() was also added to intel_fbdev_set_suspend()
> >>     in v1 which turned out to be entirely gratuitous. It caused a deadlock
> >>     on suspend (discovered by CI, thanks to Damien Lespiau and Tomi Sarvela
> >>     for CI support) and was unnecessary since a device is never suspended
> >>     until its ->probe callback (and all asynchronous tasks it scheduled)
> >>     have finished. See dpm_prepare(), which calls wait_for_device_probe(),
> >>     which calls async_synchronize_full().
> >>
> >>     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
> >>     Reported-by: Gustav Fägerlind <gustav.fagerlind@gmail.com>
> >>     Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
> >>     Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>     Cc: stable@vger.kernel.org
> >>     Signed-off-by: Lukas Wunner <lukas@wunner.de>
> >>     Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>     Link: http://patchwork.freedesktop.org/patch/msgid/20160309115147.67B2B6E0D3@gabe.freedesktop.org
> >>
> >>
> >>Regards,
> >>Gabriel
> >v2 passed CI fine, save for one warning not caused by the patch:
> >https://patchwork.freedesktop.org/series/4068/
> >
> >For comparison, this was v1:
> >https://patchwork.freedesktop.org/patch/75840/
> >
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev
  2016-03-31 15:22           ` Joonas Lahtinen
@ 2016-03-31 15:30             ` Chris Wilson
  2016-03-31 15:56               ` Joonas Lahtinen
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2016-03-31 15:30 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Thu, Mar 31, 2016 at 06:22:01PM +0300, Joonas Lahtinen wrote:
> On to, 2016-03-31 at 14:57 +0100, Chris Wilson wrote:
> >  static void intel_fbdev_suspend_worker(struct work_struct *work)
> >  {
> > -	intel_fbdev_set_suspend(container_of(work,
> > -					     struct drm_i915_private,
> > -					     fbdev_suspend_work)->dev,
> > -				FBINFO_STATE_RUNNING,
> > -				true);
> > +	__intel_fbdev_set_suspend(container_of(work,
> > +					       struct intel_fbdev,
> > +					       suspend_work),
> 
> Have the container_of on separate line at least, maybe even with a
> macro.

Sure.
 
> > +				  FBINFO_STATE_RUNNING,
> > +				  true);
> >  }
> >  
> >  int intel_fbdev_init(struct drm_device *dev)
> > @@ -716,9 +758,9 @@ int intel_fbdev_init(struct drm_device *dev)
> >  	}
> >  
> >  	ifbdev->helper.atomic = true;
> > +	INIT_WORK(&ifbdev->suspend_work, intel_fbdev_suspend_worker);
> >  
> >  	dev_priv->fbdev = ifbdev;
> > -	INIT_WORK(&dev_priv->fbdev_suspend_work, intel_fbdev_suspend_worker);
> >  
> >  	drm_fb_helper_single_add_all_connectors(&ifbdev->helper);
> >  
> > @@ -743,17 +785,21 @@ void intel_fbdev_initial_config_async(struct drm_device *dev)
> >  
> >  void intel_fbdev_fini(struct drm_device *dev)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	if (!dev_priv->fbdev)
> > +	struct drm_i915_private *dev_priv = to_i915(dev);
> > +	struct intel_fbdev *ifbdev;
> > +
> > +	ifbdev = dev_priv->fbdev;
> 
> Straight assignment.
> 
> > +	if (ifbdev == NULL)

I was trying to group the local assignment with the if, and I like to leave
whitespace after the locals. By placing these two lines together this
function looks like the majority of the other functions.

> >  		return;
> >  
> > -	flush_work(&dev_priv->fbdev_suspend_work);
> > +	dev_priv->fbdev = NULL;
> >  
> >  	if (!current_is_async())
> >  		async_synchronize_full();
> > -	intel_fbdev_destroy(dev, dev_priv->fbdev);
> > -	kfree(dev_priv->fbdev);
> > -	dev_priv->fbdev = NULL;
> > +	flush_work(&ifbdev->suspend_work);
> > +
> > +	intel_fbdev_destroy(ifbdev);
> > +	kfree(ifbdev);
> >  }
> >  
> >  static struct intel_fbdev *intel_fbdev_get(struct drm_device *dev)
> > @@ -788,53 +834,16 @@ static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
> >  	return ifbdev;
> >  }
> >  
> > -void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
> > +void intel_fbdev_set_suspend(struct drm_device *dev, int state)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct intel_fbdev *ifbdev;
> >  
> >  	ifbdev = intel_fbdev_get(dev);
> >  	if (ifbdev == NULL)
> >  		return;
> >  
> > -	if (synchronous) {
> > -		/* Flush any pending work to turn the console on, and then
> > -		 * wait to turn it off. It must be synchronous as we are
> > -		 * about to suspend or unload the driver.
> > -		 *
> > -		 * Note that from within the work-handler, we cannot flush
> > -		 * ourselves, so only flush outstanding work upon suspend!
> > -		 */
> > -		if (state != FBINFO_STATE_RUNNING)
> > -			flush_work(&dev_priv->fbdev_suspend_work);
> > -		console_lock();
> > -	} else {
> > -		/*
> > -		 * The console lock can be pretty contented on resume due
> > -		 * to all the printk activity.  Try to keep it out of the hot
> > -		 * path of resume if possible.
> > -		 */
> > -		WARN_ON(state != FBINFO_STATE_RUNNING);
> > -		if (!console_trylock()) {
> > -			/* Don't block our own workqueue as this can
> > -			 * be run in parallel with other i915.ko tasks.
> > -			 */
> > -			schedule_work(&dev_priv->fbdev_suspend_work);
> > -			return;
> > -		}
> > -	}
> > -
> > -	/* On resume from hibernation: If the object is shmemfs backed, it has
> > -	 * been restored from swap. If the object is stolen however, it will be
> > -	 * full of whatever garbage was left in there.
> > -	 */
> > -	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
> > -		struct fb_info *info = ifbdev->helper.fbdev;
> > -		memset_io(info->screen_base, 0, info->screen_size);
> > -	}
> > -
> > -	drm_fb_helper_set_suspend(&ifbdev->helper, state);
> > -	console_unlock();
> > +	__intel_fbdev_set_suspend(ifbdev, state,
> > +				  state != FBINFO_STATE_RUNNING);
> 
> Could have changed this function name to make it easier to spot what
> you changed or not. Code motion as separate patches.

This was a separate patch! The purpose of this patch was to move
code/data about :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev
  2016-03-31 15:30             ` Chris Wilson
@ 2016-03-31 15:56               ` Joonas Lahtinen
  0 siblings, 0 replies; 25+ messages in thread
From: Joonas Lahtinen @ 2016-03-31 15:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On to, 2016-03-31 at 16:30 +0100, Chris Wilson wrote:
> On Thu, Mar 31, 2016 at 06:22:01PM +0300, Joonas Lahtinen wrote:
> > 
> > On to, 2016-03-31 at 14:57 +0100, Chris Wilson wrote:
> > > 
> > >  static void intel_fbdev_suspend_worker(struct work_struct *work)
> > >  {
> > > -	intel_fbdev_set_suspend(container_of(work,
> > > -					     struct drm_i915_private,
> > > -					     fbdev_suspend_work)->dev,
> > > -				FBINFO_STATE_RUNNING,
> > > -				true);
> > > +	__intel_fbdev_set_suspend(container_of(work,
> > > +					       struct intel_fbdev,
> > > +					       suspend_work),
> > Have the container_of on separate line at least, maybe even with a
> > macro.
> Sure.
>  
> > 
> > > 
> > > +				  FBINFO_STATE_RUNNING,
> > > +				  true);
> > >  }
> > >  
> > >  int intel_fbdev_init(struct drm_device *dev)
> > > @@ -716,9 +758,9 @@ int intel_fbdev_init(struct drm_device *dev)
> > >  	}
> > >  
> > >  	ifbdev->helper.atomic = true;
> > > +	INIT_WORK(&ifbdev->suspend_work, intel_fbdev_suspend_worker);
> > >  
> > >  	dev_priv->fbdev = ifbdev;
> > > -	INIT_WORK(&dev_priv->fbdev_suspend_work, intel_fbdev_suspend_worker);
> > >  
> > >  	drm_fb_helper_single_add_all_connectors(&ifbdev->helper);
> > >  
> > > @@ -743,17 +785,21 @@ void intel_fbdev_initial_config_async(struct drm_device *dev)
> > >  
> > >  void intel_fbdev_fini(struct drm_device *dev)
> > >  {
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	if (!dev_priv->fbdev)
> > > +	struct drm_i915_private *dev_priv = to_i915(dev);
> > > +	struct intel_fbdev *ifbdev;
> > > +
> > > +	ifbdev = dev_priv->fbdev;
> > Straight assignment.
> > 
> > > 
> > > +	if (ifbdev == NULL)
> I was trying to group the local assignment with the if, and I like to leave
> whitespace after the locals. By placing these two lines together this
> function looks like the majority of the other functions.
> 
> > 
> > > 
> > >  		return;
> > >  
> > > -	flush_work(&dev_priv->fbdev_suspend_work);
> > > +	dev_priv->fbdev = NULL;
> > >  
> > >  	if (!current_is_async())
> > >  		async_synchronize_full();
> > > -	intel_fbdev_destroy(dev, dev_priv->fbdev);
> > > -	kfree(dev_priv->fbdev);
> > > -	dev_priv->fbdev = NULL;
> > > +	flush_work(&ifbdev->suspend_work);
> > > +
> > > +	intel_fbdev_destroy(ifbdev);
> > > +	kfree(ifbdev);
> > >  }
> > >  
> > >  static struct intel_fbdev *intel_fbdev_get(struct drm_device *dev)
> > > @@ -788,53 +834,16 @@ static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
> > >  	return ifbdev;
> > >  }
> > >  
> > > -void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
> > > +void intel_fbdev_set_suspend(struct drm_device *dev, int state)
> > >  {
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct intel_fbdev *ifbdev;
> > >  
> > >  	ifbdev = intel_fbdev_get(dev);
> > >  	if (ifbdev == NULL)
> > >  		return;
> > >  
> > > -	if (synchronous) {
> > > -		/* Flush any pending work to turn the console on, and then
> > > -		 * wait to turn it off. It must be synchronous as we are
> > > -		 * about to suspend or unload the driver.
> > > -		 *
> > > -		 * Note that from within the work-handler, we cannot flush
> > > -		 * ourselves, so only flush outstanding work upon suspend!
> > > -		 */
> > > -		if (state != FBINFO_STATE_RUNNING)
> > > -			flush_work(&dev_priv->fbdev_suspend_work);
> > > -		console_lock();
> > > -	} else {
> > > -		/*
> > > -		 * The console lock can be pretty contented on resume due
> > > -		 * to all the printk activity.  Try to keep it out of the hot
> > > -		 * path of resume if possible.
> > > -		 */
> > > -		WARN_ON(state != FBINFO_STATE_RUNNING);
> > > -		if (!console_trylock()) {
> > > -			/* Don't block our own workqueue as this can
> > > -			 * be run in parallel with other i915.ko tasks.
> > > -			 */
> > > -			schedule_work(&dev_priv->fbdev_suspend_work);
> > > -			return;
> > > -		}
> > > -	}
> > > -
> > > -	/* On resume from hibernation: If the object is shmemfs backed, it has
> > > -	 * been restored from swap. If the object is stolen however, it will be
> > > -	 * full of whatever garbage was left in there.
> > > -	 */
> > > -	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
> > > -		struct fb_info *info = ifbdev->helper.fbdev;
> > > -		memset_io(info->screen_base, 0, info->screen_size);
> > > -	}
> > > -
> > > -	drm_fb_helper_set_suspend(&ifbdev->helper, state);
> > > -	console_unlock();
> > > +	__intel_fbdev_set_suspend(ifbdev, state,
> > > +				  state != FBINFO_STATE_RUNNING);
> > Could have changed this function name to make it easier to spot what
> > you changed or not. Code motion as separate patches.
> This was a separate patch! The purpose of this patch was to move
> code/data about :)

It also changed order of some stuff, so I got suspicious ;) But looks
like the function stayed the same, so OK for me.

> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-31 13:57       ` [PATCH v4 1/2] " Chris Wilson
  2016-03-31 13:57         ` [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev Chris Wilson
@ 2016-03-31 16:05         ` Joonas Lahtinen
  2016-03-31 16:13           ` Chris Wilson
  1 sibling, 1 reply; 25+ messages in thread
From: Joonas Lahtinen @ 2016-03-31 16:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Gabriel Feceoru

On to, 2016-03-31 at 14:57 +0100, Chris Wilson wrote:
> If the initialisation fails, we may be left with a dangling pointer with
> an incomplete fbdev structure. Here we want to disable internal calls
> into fbdev. Similarly, the initialisation may be slow and we haven't yet
> enabled the fbdev (e.g. quick suspend or last-close before the async init
> completes).
> 
> v3: To create a typo introduced when retyping
> v4: Prevent info==NULL dereference in early boot
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
> Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com>
> Tested-by: Gabriel Feceoru <gabriel.feceoru@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_fbdev.c | 72 +++++++++++++++++++++++++-------------
>  1 file changed, 48 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 153ea7a3fcf6..5d4be71bdf22 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -756,17 +756,47 @@ void intel_fbdev_fini(struct drm_device *dev)
>  	dev_priv->fbdev = NULL;
>  }
>  
> +static struct intel_fbdev *intel_fbdev_get(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct fb_info *info;
> +
> +	if (dev_priv->fbdev == NULL)
> +		return NULL;
> +
> +	info = dev_priv->fbdev->helper.fbdev;
> +	if (info == NULL)
> +		return NULL;
> +
> +	if (info->screen_base == NULL)
> +		return NULL;
> +

This is rather verbose to my liking, I'd rather be dropping those '==
NULL' and convert to !. But either way to me.

> +	return dev_priv->fbdev;
> +}
> +
> +static struct intel_fbdev *intel_fbdev_get_if_active(struct drm_device *dev)
> +{
> +	struct intel_fbdev *ifbdev;
> +
> +	ifbdev = intel_fbdev_get(dev);
> +	if (ifbdev == NULL)
> +		return NULL;
> +
> +	if (ifbdev->helper.fbdev->state != FBINFO_STATE_RUNNING)
> +		return NULL;
> +
> +	return ifbdev;
> +}
> +
>  void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_fbdev *ifbdev = dev_priv->fbdev;
> -	struct fb_info *info;
> +	struct intel_fbdev *ifbdev;
>  
> -	if (!ifbdev)
> +	ifbdev = intel_fbdev_get(dev);
> +	if (ifbdev == NULL)
>  		return;
>  
> -	info = ifbdev->helper.fbdev;
> -
>  	if (synchronous) {
>  		/* Flush any pending work to turn the console on, and then
>  		 * wait to turn it off. It must be synchronous as we are
> @@ -798,8 +828,10 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
>  	 * been restored from swap. If the object is stolen however, it will be
>  	 * full of whatever garbage was left in there.
>  	 */
> -	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen)
> +	if (state == FBINFO_STATE_RUNNING && ifbdev->fb->obj->stolen) {
> +		struct fb_info *info = ifbdev->helper.fbdev;
>  		memset_io(info->screen_base, 0, info->screen_size);
> +	}
>  
>  	drm_fb_helper_set_suspend(&ifbdev->helper, state);
>  	console_unlock();
> @@ -807,32 +839,24 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
>  
>  void intel_fbdev_output_poll_changed(struct drm_device *dev)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
> +
> +	if (ifbdev == NULL)
> +		return;
>  
> -	async_synchronize_full();
> -	if (dev_priv->fbdev)
> -		drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
> +	drm_fb_helper_hotplug_event(&ifbdev->helper);
>  }
>  
>  void intel_fbdev_restore_mode(struct drm_device *dev)
>  {
> -	int ret;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	struct intel_fbdev *ifbdev = dev_priv->fbdev;
> -	struct drm_fb_helper *fb_helper;
> +	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
>  
> -	async_synchronize_full();

What's with the async_synchronize_full() begin removed completely?

> -	if (!ifbdev)
> +	if (ifbdev == NULL)

Argh.

>  		return;
>  
> -	fb_helper = &ifbdev->helper;
> -
> -	ret = drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper);
> -	if (ret) {
> -		DRM_DEBUG("failed to restore crtc mode\n");
> -	} else {
> -		mutex_lock(&fb_helper->dev->struct_mutex);
> +	if (drm_fb_helper_restore_fbdev_mode_unlocked(&ifbdev->helper) == 0) {
> +		mutex_lock(&dev->struct_mutex);
>  		intel_fb_obj_invalidate(ifbdev->fb->obj, ORIGIN_GTT);
> -		mutex_unlock(&fb_helper->dev->struct_mutex);
> +		mutex_unlock(&dev->struct_mutex);

Above addressed,

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

>  	}
>  }
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-31 16:05         ` [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation Joonas Lahtinen
@ 2016-03-31 16:13           ` Chris Wilson
  2016-03-31 16:28             ` Joonas Lahtinen
  2016-03-31 16:30             ` Lukas Wunner
  0 siblings, 2 replies; 25+ messages in thread
From: Chris Wilson @ 2016-03-31 16:13 UTC (permalink / raw)
  To: Joonas Lahtinen; +Cc: intel-gfx

On Thu, Mar 31, 2016 at 07:05:21PM +0300, Joonas Lahtinen wrote:
> On to, 2016-03-31 at 14:57 +0100, Chris Wilson wrote:
> >  void intel_fbdev_restore_mode(struct drm_device *dev)
> >  {
> > -	int ret;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct intel_fbdev *ifbdev = dev_priv->fbdev;
> > -	struct drm_fb_helper *fb_helper;
> > +	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
> >  
> > -	async_synchronize_full();
> 
> What's with the async_synchronize_full() begin removed completely?

Because it's not just wrong, but completely broken imo.

During suspend, we want to freeze the async task not flush. Then here
and during resume we skip the restoration of the unregistered fbdev, and
then when the task is woken it can complete the registration and present
the vanilla ifbdev.

During hibernation, we really just want to cancel the task and start
from scratch on resume.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-31 16:13           ` Chris Wilson
@ 2016-03-31 16:28             ` Joonas Lahtinen
  2016-03-31 16:30             ` Lukas Wunner
  1 sibling, 0 replies; 25+ messages in thread
From: Joonas Lahtinen @ 2016-03-31 16:28 UTC (permalink / raw)
  To: Chris Wilson, Lukas Wunner, Daniel Vetter; +Cc: intel-gfx

On to, 2016-03-31 at 17:13 +0100, Chris Wilson wrote:
> On Thu, Mar 31, 2016 at 07:05:21PM +0300, Joonas Lahtinen wrote:
> > 
> > On to, 2016-03-31 at 14:57 +0100, Chris Wilson wrote:
> > > 
> > >  void intel_fbdev_restore_mode(struct drm_device *dev)
> > >  {
> > > -	int ret;
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct intel_fbdev *ifbdev = dev_priv->fbdev;
> > > -	struct drm_fb_helper *fb_helper;
> > > +	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
> > >  
> > > -	async_synchronize_full();
> > What's with the async_synchronize_full() begin removed completely?
> Because it's not just wrong, but completely broken imo.
> 
> During suspend, we want to freeze the async task not flush. Then here
> and during resume we skip the restoration of the unregistered fbdev, and
> then when the task is woken it can complete the registration and present
> the vanilla ifbdev.
> 
> During hibernation, we really just want to cancel the task and start
> from scratch on resume.

Maybe Ack from Lukas with those? As this is effectively a revert of :

	commit a7442b93cf32c1e1ddb721a26cd1f92302e2a222
	Author: Lukas Wunner <lukas@wunner.de>
	Date:   Wed Mar 9 12:52:53 2016 +0100

	    drm/i915: Fix races on fbdev

Committed without R-b by Daniel, so maybe he has a comment on it too.

Regards, Joonas

> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation
  2016-03-31 16:13           ` Chris Wilson
  2016-03-31 16:28             ` Joonas Lahtinen
@ 2016-03-31 16:30             ` Lukas Wunner
  1 sibling, 0 replies; 25+ messages in thread
From: Lukas Wunner @ 2016-03-31 16:30 UTC (permalink / raw)
  To: Chris Wilson, Joonas Lahtinen, intel-gfx, Gabriel Feceoru

Hi Chris,

On Thu, Mar 31, 2016 at 05:13:55PM +0100, Chris Wilson wrote:
> On Thu, Mar 31, 2016 at 07:05:21PM +0300, Joonas Lahtinen wrote:
> > On to, 2016-03-31 at 14:57 +0100, Chris Wilson wrote:
> > >  void intel_fbdev_restore_mode(struct drm_device *dev)
> > >  {
> > > -	int ret;
> > > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > > -	struct intel_fbdev *ifbdev = dev_priv->fbdev;
> > > -	struct drm_fb_helper *fb_helper;
> > > +	struct intel_fbdev *ifbdev = intel_fbdev_get_if_active(dev);
> > >  
> > > -	async_synchronize_full();
> > 
> > What's with the async_synchronize_full() begin removed completely?
> 
> Because it's not just wrong, but completely broken imo.
> 
> During suspend, we want to freeze the async task not flush.
> Then here and during resume we skip the restoration of the unregistered
> fbdev, and then when the task is woken it can complete the registration
> and present the vanilla ifbdev.

No. The fbdev initialization is guaranteed to have finished before
suspend. So "we want to freeze the async task" is wrong thinking.
There is no async task to freeze.

Please read the commit message of a7442b93cf32 ("drm/i915: Fix races
on fbdev"):

"a device is never suspended until its ->probe callback (and all
asynchronous tasks it scheduled) have finished. See dpm_prepare(),
which calls wait_for_device_probe(), which calls async_synchronize_full()."

Best regards,

Lukas

> During hibernation, we really just want to cancel the task and start
> from scratch on resume.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [REGRESSION] system hang on ILK/SNB/IVB
  2016-03-31  7:21   ` Tomi Sarvela
@ 2016-03-31 20:35     ` Lukas Wunner
  2016-04-01  7:59       ` Tomi Sarvela
  0 siblings, 1 reply; 25+ messages in thread
From: Lukas Wunner @ 2016-03-31 20:35 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: intel-gfx, daniel.vetter

Hi Tomi,

On Thu, Mar 31, 2016 at 10:21:16AM +0300, Tomi Sarvela wrote:
> The problem with the results in your link is that there is no HSW, ILK, IVB
> or SNB results. This might give the impression that everything is well.
> 
> Most damning is lack of HSW-gt2 and SNB-dellxps: those machines hang on to
> APC, and have run quite stably for every Patchwork run. The case isn't strong
> enough yet that series should fail if either of those won't run, but it might
> be so in future.

So my patch seeking to fix the hangs has passed Romania CI with "success":
https://patchwork.freedesktop.org/series/5125/

However I don't see HSW-gt2 and SNB-dellxps in their hardware lineup.
And I would still like to know what the actual cause of the hangs is
since they do not occur on my IVB laptop.

If you get the chance maybe you can repeat the test and include a
"dump_stack();" at the beginning of intel_fbdev_output_poll_changed()
and intel_fbdev_restore_mode(). This should show in the logs which of
the two functions is called during suspend/resume and from where it's
called. My guess is that this particular hardware causes a hotplug
signal to be generated upon waking up. If the hangs do not stop with
this patch, booting with "no_console_suspend" should at least show
what's going on.

Thank you & best regards,

Lukas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [REGRESSION] system hang on ILK/SNB/IVB
  2016-03-31 20:35     ` Lukas Wunner
@ 2016-04-01  7:59       ` Tomi Sarvela
  0 siblings, 0 replies; 25+ messages in thread
From: Tomi Sarvela @ 2016-04-01  7:59 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: intel-gfx, daniel.vetter

Hi Lukas,

Ran this patch through the farm, and it seems that this patch might've helped 
HSW, maybe even BSW.

ILK, IVB and SNB are still hanging hard to the same igt-test,
kms_pipe_crc_basic@suspend-read-crc-pipe-?

I'll try to make a kernel with the changes you proposed, but as I'm not 
familiar with the driver innards, it might take a while.

Tomi


On Thursday 31 March 2016 22:35:17 Lukas Wunner wrote:
> Hi Tomi,
> 
> On Thu, Mar 31, 2016 at 10:21:16AM +0300, Tomi Sarvela wrote:
> > The problem with the results in your link is that there is no HSW, ILK,
> > IVB
> > or SNB results. This might give the impression that everything is well.
> > 
> > Most damning is lack of HSW-gt2 and SNB-dellxps: those machines hang on to
> > APC, and have run quite stably for every Patchwork run. The case isn't
> > strong enough yet that series should fail if either of those won't run,
> > but it might be so in future.
> 
> So my patch seeking to fix the hangs has passed Romania CI with "success":
> https://patchwork.freedesktop.org/series/5125/
> 
> However I don't see HSW-gt2 and SNB-dellxps in their hardware lineup.
> And I would still like to know what the actual cause of the hangs is
> since they do not occur on my IVB laptop.
> 
> If you get the chance maybe you can repeat the test and include a
> "dump_stack();" at the beginning of intel_fbdev_output_poll_changed()
> and intel_fbdev_restore_mode(). This should show in the logs which of
> the two functions is called during suspend/resume and from where it's
> called. My guess is that this particular hardware causes a hotplug
> signal to be generated upon waking up. If the hangs do not stop with
> this patch, booting with "no_console_suspend" should at least show
> what's going on.
> 
> Thank you & best regards,
> 
> Lukas

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2016-04-01  7:59 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-30 17:20 [REGRESSION] system hang on ILK/SNB/IVB Gabriel Feceoru
2016-03-30 17:57 ` [PATCH] drm/i915: Protect fbdev across slow or failed initialisation Chris Wilson
2016-03-30 18:10   ` kbuild test robot
2016-03-30 18:10   ` kbuild test robot
2016-03-30 18:26   ` kbuild test robot
2016-03-30 18:30   ` Chris Wilson
2016-03-30 18:56   ` [PATCH v3] " Chris Wilson
2016-03-31 12:00     ` Gabriel Feceoru
2016-03-31 13:57       ` [PATCH v4 1/2] " Chris Wilson
2016-03-31 13:57         ` [PATCH v4 2/2] drm/i915: Move fbdev_suspend_work to intel_fbdev Chris Wilson
2016-03-31 15:22           ` Joonas Lahtinen
2016-03-31 15:30             ` Chris Wilson
2016-03-31 15:56               ` Joonas Lahtinen
2016-03-31 16:05         ` [PATCH v4 1/2] drm/i915: Protect fbdev across slow or failed initialisation Joonas Lahtinen
2016-03-31 16:13           ` Chris Wilson
2016-03-31 16:28             ` Joonas Lahtinen
2016-03-31 16:30             ` Lukas Wunner
2016-03-30 18:47 ` [REGRESSION] system hang on ILK/SNB/IVB Daniel Vetter
2016-03-30 21:35 ` Lukas Wunner
2016-03-31  7:21   ` Tomi Sarvela
2016-03-31 20:35     ` Lukas Wunner
2016-04-01  7:59       ` Tomi Sarvela
2016-03-31  7:42   ` Gabriel Feceoru
2016-03-31 15:23     ` Lukas Wunner
2016-03-31 14:42 ` ✗ Fi.CI.BAT: failure for drm/i915: Protect fbdev across slow or failed initialisation (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.