All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
@ 2020-05-25 14:38 Tetsuo Handa
  2020-05-25 15:21 ` Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2020-05-25 14:38 UTC (permalink / raw)
  To: DRI; +Cc: Tetsuo Handa, Shayenne Moura, Rodrigo Siqueira, Daniel Vetter

Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
hrtimers") introduced ret_overrun variable. And that variable was an
unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).

Now, syzbot is hitting this WARN_ON() using a simple reproducer that
does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
and a debug printk() patch says that syzbot is getting

   output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
   output->period_ns=0
   ret_overrun=216994

. I can't understand what "verify the hrtimer_forward_now return" in
that commit wants to say. hrtimer_forward_now() must return, and the
return value of hrtimer_forward_now() is not a boolean. Why comparing
with 1 ? Anyway, this failure is not something that worth crashing the
system. Let's remove the ret_overrun variable and WARN_ON() test.

Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
---
 drivers/gpu/drm/vkms/vkms_crtc.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index ac85e17428f8..cc1811ce6092 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 						  vblank_hrtimer);
 	struct drm_crtc *crtc = &output->crtc;
 	struct vkms_crtc_state *state;
-	u64 ret_overrun;
 	bool ret;
 
-	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
-					  output->period_ns);
-	WARN_ON(ret_overrun != 1);
+	hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns);
 
 	spin_lock(&output->lock);
 	ret = drm_crtc_handle_vblank(crtc);
-- 
2.18.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-25 14:38 [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure Tetsuo Handa
@ 2020-05-25 15:21 ` Daniel Vetter
  2020-05-25 15:34   ` Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2020-05-25 15:21 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Shayenne Moura, Rodrigo Siqueira, DRI, Daniel Vetter

On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
> hrtimers") introduced ret_overrun variable. And that variable was an
> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
> 
> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
> and a debug printk() patch says that syzbot is getting
> 
>    output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
>    output->period_ns=0
>    ret_overrun=216994
> 
> . I can't understand what "verify the hrtimer_forward_now return" in
> that commit wants to say. hrtimer_forward_now() must return, and the
> return value of hrtimer_forward_now() is not a boolean. Why comparing
> with 1 ? Anyway, this failure is not something that worth crashing the
> system. Let's remove the ret_overrun variable and WARN_ON() test.

Uh we're not crashing the system, it's a warning backtrace.

And we've spent a few months hunting the races here, so just removing that
check isn't really a good idea. The correct thing to do is figure out why
we're hitting this. It could be that we're having a missing check
somewhere, or missing initialization, and that's what syzbot is hitting.
Removing this check here just papers over the bug.

If the vkms driver is loaded, and there's nothing else going on, then what
I expect to happen is that this virtual hw is entirely off. And in that
case, the vblank ioctl should be rejected outright. So there's definitely
something fishy going on to begin with.

If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
then the vblank wait shouldn't just blow up like this.
-Daniel

> 
> Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
> ---
>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index ac85e17428f8..cc1811ce6092 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>  						  vblank_hrtimer);
>  	struct drm_crtc *crtc = &output->crtc;
>  	struct vkms_crtc_state *state;
> -	u64 ret_overrun;
>  	bool ret;
>  
> -	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> -					  output->period_ns);
> -	WARN_ON(ret_overrun != 1);
> +	hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns);
>  
>  	spin_lock(&output->lock);
>  	ret = drm_crtc_handle_vblank(crtc);
> -- 
> 2.18.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-25 15:21 ` Daniel Vetter
@ 2020-05-25 15:34   ` Tetsuo Handa
  2020-05-25 16:57     ` Daniel Vetter
  2020-05-25 17:00     ` Daniel Vetter
  0 siblings, 2 replies; 9+ messages in thread
From: Tetsuo Handa @ 2020-05-25 15:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Shayenne Moura, syzbot+0871b14ca2e2fb64f6e3, Rodrigo Siqueira,
	DRI, Daniel Vetter

On 2020/05/26 0:21, Daniel Vetter wrote:
> On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
>> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
>> hrtimers") introduced ret_overrun variable. And that variable was an
>> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
>> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
>>
>> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
>> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
>> and a debug printk() patch says that syzbot is getting
>>
>>    output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
>>    output->period_ns=0
>>    ret_overrun=216994
>>
>> . I can't understand what "verify the hrtimer_forward_now return" in
>> that commit wants to say. hrtimer_forward_now() must return, and the
>> return value of hrtimer_forward_now() is not a boolean. Why comparing
>> with 1 ? Anyway, this failure is not something that worth crashing the
>> system. Let's remove the ret_overrun variable and WARN_ON() test.
> 
> Uh we're not crashing the system, it's a warning backtrace.

syzbot uses panic_on_warn=1, and this bug is currently the 8th top crasher.

> 
> And we've spent a few months hunting the races here, so just removing that
> check isn't really a good idea. The correct thing to do is figure out why
> we're hitting this. It could be that we're having a missing check
> somewhere, or missing initialization, and that's what syzbot is hitting.
> Removing this check here just papers over the bug.

Here is a reproducer which syzbot is using.

----------
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <drm/drm.h>

int main(int argc, char *argv[])
{
	union drm_wait_vblank arg;
	int fd = open("/dev/dri/card1", O_RDONLY);

	arg.request.type = 0;
	arg.request.sequence = 0xffff;
	arg.request.signal = 0x21;
	ioctl(fd, DRM_IOCTL_WAIT_VBLANK, &arg);
	return 0;
}
----------

Debug printk() patch shows that hrtimer_forward_now() can return larger than 1.
What is the reason you are expecting hrtimer_forward_now() to always return 1 ?

> 
> If the vkms driver is loaded, and there's nothing else going on, then what
> I expect to happen is that this virtual hw is entirely off. And in that
> case, the vblank ioctl should be rejected outright. So there's definitely
> something fishy going on to begin with.
> 
> If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
> then the vblank wait shouldn't just blow up like this.
> -Daniel
> 
>>
>> Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
>> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
>> ---
>>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
>> index ac85e17428f8..cc1811ce6092 100644
>> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
>> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
>> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>>  						  vblank_hrtimer);
>>  	struct drm_crtc *crtc = &output->crtc;
>>  	struct vkms_crtc_state *state;
>> -	u64 ret_overrun;
>>  	bool ret;
>>  
>> -	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
>> -					  output->period_ns);
>> -	WARN_ON(ret_overrun != 1);
>> +	hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns);
>>  
>>  	spin_lock(&output->lock);
>>  	ret = drm_crtc_handle_vblank(crtc);
>> -- 
>> 2.18.2
>>
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-25 15:34   ` Tetsuo Handa
@ 2020-05-25 16:57     ` Daniel Vetter
  2020-05-25 17:00     ` Daniel Vetter
  1 sibling, 0 replies; 9+ messages in thread
From: Daniel Vetter @ 2020-05-25 16:57 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Shayenne Moura, Rodrigo Siqueira, DRI, syzbot+0871b14ca2e2fb64f6e3

On Mon, May 25, 2020 at 5:34 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/05/26 0:21, Daniel Vetter wrote:
> > On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
> >> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
> >> hrtimers") introduced ret_overrun variable. And that variable was an
> >> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
> >> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
> >>
> >> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
> >> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
> >> and a debug printk() patch says that syzbot is getting
> >>
> >>    output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
> >>    output->period_ns=0
> >>    ret_overrun=216994
> >>
> >> . I can't understand what "verify the hrtimer_forward_now return" in
> >> that commit wants to say. hrtimer_forward_now() must return, and the
> >> return value of hrtimer_forward_now() is not a boolean. Why comparing
> >> with 1 ? Anyway, this failure is not something that worth crashing the
> >> system. Let's remove the ret_overrun variable and WARN_ON() test.
> >
> > Uh we're not crashing the system, it's a warning backtrace.
>
> syzbot uses panic_on_warn=1, and this bug is currently the 8th top crasher.
>
> >
> > And we've spent a few months hunting the races here, so just removing that
> > check isn't really a good idea. The correct thing to do is figure out why
> > we're hitting this. It could be that we're having a missing check
> > somewhere, or missing initialization, and that's what syzbot is hitting.
> > Removing this check here just papers over the bug.
>
> Here is a reproducer which syzbot is using.
>
> ----------
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sys/ioctl.h>
> #include <drm/drm.h>
>
> int main(int argc, char *argv[])
> {
>         union drm_wait_vblank arg;
>         int fd = open("/dev/dri/card1", O_RDONLY);
>
>         arg.request.type = 0;
>         arg.request.sequence = 0xffff;
>         arg.request.signal = 0x21;
>         ioctl(fd, DRM_IOCTL_WAIT_VBLANK, &arg);
>         return 0;
> }
> ----------
>
> Debug printk() patch shows that hrtimer_forward_now() can return larger than 1.
> What is the reason you are expecting hrtimer_forward_now() to always return 1 ?

The timer simulates display refresh running at 60Hz. If we miss one,
the simulation is not accurate against hardware. Missing a few
thousands either means there's a bug somewhere, or the machine has
missed a few seconds of worth of timers (which is equally bad).
-Daniel

>
> >
> > If the vkms driver is loaded, and there's nothing else going on, then what
> > I expect to happen is that this virtual hw is entirely off. And in that
> > case, the vblank ioctl should be rejected outright. So there's definitely
> > something fishy going on to begin with.
> >
> > If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
> > then the vblank wait shouldn't just blow up like this.
> > -Daniel
> >
> >>
> >> Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
> >> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> >> Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
> >> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
> >> ---
> >>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +----
> >>  1 file changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> index ac85e17428f8..cc1811ce6092 100644
> >> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> >> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >>                                                vblank_hrtimer);
> >>      struct drm_crtc *crtc = &output->crtc;
> >>      struct vkms_crtc_state *state;
> >> -    u64 ret_overrun;
> >>      bool ret;
> >>
> >> -    ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> >> -                                      output->period_ns);
> >> -    WARN_ON(ret_overrun != 1);
> >> +    hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns);
> >>
> >>      spin_lock(&output->lock);
> >>      ret = drm_crtc_handle_vblank(crtc);
> >> --
> >> 2.18.2
> >>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-25 15:34   ` Tetsuo Handa
  2020-05-25 16:57     ` Daniel Vetter
@ 2020-05-25 17:00     ` Daniel Vetter
  2020-05-26  4:18       ` Tetsuo Handa
  1 sibling, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2020-05-25 17:00 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Shayenne Moura, Rodrigo Siqueira, DRI, syzbot+0871b14ca2e2fb64f6e3

On Mon, May 25, 2020 at 5:34 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/05/26 0:21, Daniel Vetter wrote:
> > On Mon, May 25, 2020 at 11:38:49PM +0900, Tetsuo Handa wrote:
> >> Commit 3a0709928b172a41 ("drm/vkms: Add vblank events simulated by
> >> hrtimers") introduced ret_overrun variable. And that variable was an
> >> unused-but-set-variable until commit 09ef09b4ab95dc40 ("drm/vkms:
> >> WARN when hrtimer_forward_now fails") added WARN_ON(ret_overrun != 1).
> >>
> >> Now, syzbot is hitting this WARN_ON() using a simple reproducer that
> >> does open("/dev/dri/card1") followed by ioctl(DRM_IOCTL_WAIT_VBLANK),
> >> and a debug printk() patch says that syzbot is getting
> >>
> >>    output->vblank_hrtimer.base->get_time()=93531904774 (which is uptime)
> >>    output->period_ns=0
> >>    ret_overrun=216994
> >>
> >> . I can't understand what "verify the hrtimer_forward_now return" in
> >> that commit wants to say. hrtimer_forward_now() must return, and the
> >> return value of hrtimer_forward_now() is not a boolean. Why comparing
> >> with 1 ? Anyway, this failure is not something that worth crashing the
> >> system. Let's remove the ret_overrun variable and WARN_ON() test.
> >
> > Uh we're not crashing the system, it's a warning backtrace.
>
> syzbot uses panic_on_warn=1, and this bug is currently the 8th top crasher.
>
> >
> > And we've spent a few months hunting the races here, so just removing that
> > check isn't really a good idea. The correct thing to do is figure out why
> > we're hitting this. It could be that we're having a missing check
> > somewhere, or missing initialization, and that's what syzbot is hitting.
> > Removing this check here just papers over the bug.
>
> Here is a reproducer which syzbot is using.
>
> ----------
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sys/ioctl.h>
> #include <drm/drm.h>
>
> int main(int argc, char *argv[])
> {
>         union drm_wait_vblank arg;
>         int fd = open("/dev/dri/card1", O_RDONLY);
>
>         arg.request.type = 0;
>         arg.request.sequence = 0xffff;
>         arg.request.signal = 0x21;
>         ioctl(fd, DRM_IOCTL_WAIT_VBLANK, &arg);
>         return 0;
> }
> ----------

Forgot to add: I did run this quickly with vkms as secondary output.
Good fireworks show, but there's an entire army of additional splats
after the first one. The WARN_ON you're removing is only the
messenger, the real bug is probably one of the later backtraces. Or at
least points more clearly at the real bug.
-Daniel

>
> Debug printk() patch shows that hrtimer_forward_now() can return larger than 1.
> What is the reason you are expecting hrtimer_forward_now() to always return 1 ?
>
> >
> > If the vkms driver is loaded, and there's nothing else going on, then what
> > I expect to happen is that this virtual hw is entirely off. And in that
> > case, the vblank ioctl should be rejected outright. So there's definitely
> > something fishy going on to begin with.
> >
> > If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
> > then the vblank wait shouldn't just blow up like this.
> > -Daniel
> >
> >>
> >> Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
> >> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> >> Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
> >> Fixes: 09ef09b4ab95dc40 ("drm/vkms: WARN when hrtimer_forward_now fails")
> >> ---
> >>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 +----
> >>  1 file changed, 1 insertion(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> index ac85e17428f8..cc1811ce6092 100644
> >> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> >> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> >> @@ -13,12 +13,9 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >>                                                vblank_hrtimer);
> >>      struct drm_crtc *crtc = &output->crtc;
> >>      struct vkms_crtc_state *state;
> >> -    u64 ret_overrun;
> >>      bool ret;
> >>
> >> -    ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> >> -                                      output->period_ns);
> >> -    WARN_ON(ret_overrun != 1);
> >> +    hrtimer_forward_now(&output->vblank_hrtimer, output->period_ns);
> >>
> >>      spin_lock(&output->lock);
> >>      ret = drm_crtc_handle_vblank(crtc);
> >> --
> >> 2.18.2
> >>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-25 17:00     ` Daniel Vetter
@ 2020-05-26  4:18       ` Tetsuo Handa
  2020-05-26  4:39         ` Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2020-05-26  4:18 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Shayenne Moura, Rodrigo Siqueira, DRI, syzbot+0871b14ca2e2fb64f6e3

On 2020/05/26 2:00, Daniel Vetter wrote:
> Forgot to add: I did run this quickly with vkms as secondary output.
> Good fireworks show, but there's an entire army of additional splats
> after the first one. The WARN_ON you're removing is only the
> messenger, the real bug is probably one of the later backtraces. Or at
> least points more clearly at the real bug.
> -Daniel
> 
>>
>> Debug printk() patch shows that hrtimer_forward_now() can return larger than 1.
>> What is the reason you are expecting hrtimer_forward_now() to always return 1 ?
>>
>>>
>>> If the vkms driver is loaded, and there's nothing else going on, then what
>>> I expect to happen is that this virtual hw is entirely off. And in that
>>> case, the vblank ioctl should be rejected outright. So there's definitely
>>> something fishy going on to begin with.
>>>
>>> If otoh the virtual hw is somehow on (maybe fbcon gets loaded, no idea),
>>> then the vblank wait shouldn't just blow up like this.

I see. Another debug printk() patch ( https://syzkaller.appspot.com/text?tag=Patch&x=135f254a100000 ) found that

  [   86.613767][ T7837] [drm:drm_stub_open] 
  [   86.621592][ T7837] [drm:drm_open] pid = 7837, minor = 1
  [   86.651632][ T7837] [drm:drm_ioctl] pid=7837, dev=0xe201, auth=1, DRM_IOCTL_WAIT_VBLANK
  [   86.663391][ T7837] [drm:drm_calc_timestamping_constants] *ERROR* crtc 32: Can't calculate constants, dotclock = 0!
  [   86.679151][ T7837] [drm:drm_calc_timestamping_constants] crtc 32: hwmode: htotal 0, vtotal 0, vdisplay 0
  [   86.688865][ T7837] [drm:drm_calc_timestamping_constants] crtc 32: clock 0 kHz framedur 0 linedur 0
  [   86.688899][ T7837] Starting out->vblank_hrtimer : out->period_ns=0 vblank->framedur_ns=0
  [   86.689348][ T7837] [drm:drm_vblank_enable] enabling vblank on crtc 0, ret: 0
  [   86.706883][ T7837] [drm:drm_update_vblank_count] updating vblank count on crtc 0: current=0, diff=0, hw=0 hw_last=0
  [   86.706962][    C1] now=86714983667 interval=0 ret_overrun=17693375

hrtimer_start(&out->vblank_hrtimer, 0, HRTIMER_MODE_REL) from vkms_enable_vblank() is
causing immediate vkms_vblank_simulate() callback. The reason of vblank->framedur_ns == 0
is that drm_calc_timestamping_constants() fails to calculate framedur_ns (which is 0 by default)
due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 might be able
to solve this problem.

By the way, if you have an environment that can reproduce this problem, you can
test with debug printk() patch above applied.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-26  4:18       ` Tetsuo Handa
@ 2020-05-26  4:39         ` Tetsuo Handa
  2020-05-26 11:11           ` Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2020-05-26  4:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Shayenne Moura, Rodrigo Siqueira, DRI, syzbot+0871b14ca2e2fb64f6e3

On 2020/05/26 13:18, Tetsuo Handa wrote:
> due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 might be able
> to solve this problem.

Well, I came to think that vkms_enable_vblank() should return an error to the caller
when drm_calc_timestamping_constants() failed...

static int vkms_enable_vblank(struct drm_crtc *crtc)
{
	struct drm_device *dev = crtc->dev;
	unsigned int pipe = drm_crtc_index(crtc);
	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);

	drm_calc_timestamping_constants(crtc, &crtc->mode);

	hrtimer_init(&out->vblank_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	out->vblank_hrtimer.function = &vkms_vblank_simulate;
	out->period_ns = ktime_set(0, vblank->framedur_ns);
	hrtimer_start(&out->vblank_hrtimer, out->period_ns, HRTIMER_MODE_REL);

	return 0;
}
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-26  4:39         ` Tetsuo Handa
@ 2020-05-26 11:11           ` Daniel Vetter
  2020-05-27  8:27             ` Daniel Vetter
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2020-05-26 11:11 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Shayenne Moura, Rodrigo Siqueira, DRI, syzbot+0871b14ca2e2fb64f6e3

On Tue, May 26, 2020 at 6:39 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/05/26 13:18, Tetsuo Handa wrote:
> > due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 might be able
> > to solve this problem.
>
> Well, I came to think that vkms_enable_vblank() should return an error to the caller
> when drm_calc_timestamping_constants() failed...

If my memory is right we shouldn't even get there. crtc->mode being
all zeros sounds like the simulated output isn't on (no surprise,
syzbot doesn't enable it and fbcon is probably on card0), so something
higher up should reject this. I'll see whether I can figure out what
vkms isn't doing right (or whether there's a higher level bug in
drm_vblank.c code), yesterday my machine died and already evening
anyway.

Thanks for digging into this a bit more meanwhile.

Cheers, Daniel

>
> static int vkms_enable_vblank(struct drm_crtc *crtc)
> {
>         struct drm_device *dev = crtc->dev;
>         unsigned int pipe = drm_crtc_index(crtc);
>         struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
>         struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
>
>         drm_calc_timestamping_constants(crtc, &crtc->mode);
>
>         hrtimer_init(&out->vblank_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>         out->vblank_hrtimer.function = &vkms_vblank_simulate;
>         out->period_ns = ktime_set(0, vblank->framedur_ns);
>         hrtimer_start(&out->vblank_hrtimer, out->period_ns, HRTIMER_MODE_REL);
>
>         return 0;
> }



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure.
  2020-05-26 11:11           ` Daniel Vetter
@ 2020-05-27  8:27             ` Daniel Vetter
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Vetter @ 2020-05-27  8:27 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Shayenne Moura, Rodrigo Siqueira, DRI, syzbot+0871b14ca2e2fb64f6e3

On Tue, May 26, 2020 at 01:11:28PM +0200, Daniel Vetter wrote:
> On Tue, May 26, 2020 at 6:39 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > On 2020/05/26 13:18, Tetsuo Handa wrote:
> > > due to mode->crtc_clock <= 0. Thus, somehow initializing mode->crtc_clock > 0 might be able
> > > to solve this problem.
> >
> > Well, I came to think that vkms_enable_vblank() should return an error to the caller
> > when drm_calc_timestamping_constants() failed...
> 
> If my memory is right we shouldn't even get there. crtc->mode being
> all zeros sounds like the simulated output isn't on (no surprise,
> syzbot doesn't enable it and fbcon is probably on card0), so something
> higher up should reject this. I'll see whether I can figure out what
> vkms isn't doing right (or whether there's a higher level bug in
> drm_vblank.c code), yesterday my machine died and already evening
> anyway.
> 
> Thanks for digging into this a bit more meanwhile.

Ok good news, I think I have a fix. Bad news it's code used by like 50
drivers, so this will take some time to get reviewed and merged (and I
think almost all these other drivers are buggy too, which is why I decided
to fix the shared code).

Patch below, can you pls confirm this fixes it?

Thanks, Daniel

commit b882894ac0fce412b67db79a5c92f2b599ec5069
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue May 26 14:29:00 2020 +0200

    drm/atomic-helper: reset vblank on crtc reset
    
    Only when vblanks are supported ofc.
    
    Some drivers do this already, but most unfortunately missed it. This
    opens up bugs after driver load, before the crtc is enabled for the
    first time. syzbot spotted this when loading vkms as a secondary
    output. Given how many drivers are buggy it's best to solve this once
    and for all in shared helper code.
    
    Aside from moving the few existing calls to drm_crtc_vblank_reset into
    helpers (i915 doesn't use helpers, so keeps its own) I think the
    regression risk is minimal: atomic helpers already rely on drivers
    calling drm_crtc_vblank_on/off correctly in their hooks when they
    support vblanks. And driver that's failing to handle vblanks after
    this is missing those calls already, and vblanks could only work by
    accident when enabling a CRTC for the first time right after boot.
    
    FIXME: Fix up drivers and audit them all.
    
    Big thanks to Tetsuo for helping track down what's going wrong here.
    
    Link: https://syzkaller.appspot.com/bug?id=0ba17d70d062b2595e1f061231474800f076c7cb
    Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Reported-by: syzbot+0871b14ca2e2fb64f6e3@syzkaller.appspotmail.com
    Cc: stable@vger.kernel.org
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c
index 8fce6a115dfe..843964252239 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -32,6 +32,7 @@
 #include <drm/drm_device.h>
 #include <drm/drm_plane.h>
 #include <drm/drm_print.h>
+#include <drm/drm_vblank.h>
 #include <drm/drm_writeback.h>
 
 #include <linux/slab.h>
@@ -93,6 +94,9 @@ __drm_atomic_helper_crtc_reset(struct drm_crtc *crtc,
 	if (crtc_state)
 		__drm_atomic_helper_crtc_state_reset(crtc_state, crtc);
 
+	if (crtc->dev->num_crtcs)
+		drm_crtc_vblank_reset(crtc);
+
 	crtc->state = crtc_state;
 }
 EXPORT_SYMBOL(__drm_atomic_helper_crtc_reset);
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-05-27  8:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-25 14:38 [PATCH] drm/vkms: Don't warn hrtimer_forward_now failure Tetsuo Handa
2020-05-25 15:21 ` Daniel Vetter
2020-05-25 15:34   ` Tetsuo Handa
2020-05-25 16:57     ` Daniel Vetter
2020-05-25 17:00     ` Daniel Vetter
2020-05-26  4:18       ` Tetsuo Handa
2020-05-26  4:39         ` Tetsuo Handa
2020-05-26 11:11           ` Daniel Vetter
2020-05-27  8:27             ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.