* drm-next-misc merge breaks vmwgfx
@ 2017-04-05 17:45 Thomas Hellstrom
  2017-04-06 12:34 ` Daniel Vetter
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2017-04-05 17:45 UTC (permalink / raw)
  To: dri-devel, Daniel Vetter

Hi!

It appears like the drm-next commit 320d8c3d3 "Merge tag
'drm-misc-next-2017-03-31..." breaks vmwgfx.

A black screen is shown where plymouthd used to show the boot splash on
Fedora 25. Nothing relevant in the logs.

I eyed through the vmwgfx conflict fixes, but nothing immediately stands
out as incorrect.

I'll dig into this tomorrow. Does anybody know of any other drivers that
don't work well with current drm-next?

/Thomas





_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: drm-next-misc merge breaks vmwgfx
  2017-04-05 17:45 drm-next-misc merge breaks vmwgfx Thomas Hellstrom
@ 2017-04-06 12:34 ` Daniel Vetter
  2017-04-06 14:10   ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2017-04-06 12:34 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: dri-devel

Hi Thomas,

Bisected an offender already? Afaik there's no one else who reported
issues thus far, and for our own CI it seems all still fine.
-Daniel




-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 12:34 ` Daniel Vetter
@ 2017-04-06 14:10   ` Thomas Hellstrom
  2017-04-06 14:46     ` Daniel Vetter
  2017-04-06 14:47     ` Daniel Vetter
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Hellstrom @ 2017-04-06 14:10 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel

On 04/06/2017 02:34 PM, Daniel Vetter wrote:
> Hi Thomas,
>
> Bisected an offender already? Afaik there's no one else who reported
> issues thus far, and for our own CI it seems all still fine.
> -Daniel

Hi, Daniel,

Yes, I rebased drm-misc-next on top of vmwgfx-next and found the culprit
to be

38b6441e "drm/atomic-helper: Remove the backoff hack from set_config.."

Reverting first 1fa4da04 and then
38b6441e

fixes the problem.

Also, when testing the tip of drm-misc-next (with the non-atomic vmwgfx)
there appeared to be warnings about a non-NULL
dev->mode_config.acquire_ctx. I'll see if I can reproduce those, but
perhaps removing the line

dev->mode_config.acquire_ctx = &ctx

in drm_mode_setcrtc()

is part of the problem.

/Thomas







* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 14:10   ` Thomas Hellstrom
@ 2017-04-06 14:46     ` Daniel Vetter
  2017-04-06 18:01       ` Thomas Hellstrom
  2017-04-06 14:47     ` Daniel Vetter
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2017-04-06 14:46 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: dri-devel

On Thu, Apr 6, 2017 at 4:10 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 04/06/2017 02:34 PM, Daniel Vetter wrote:
>> Hi Thomas,
>>
>> Bisected an offender already? Afaik there's no one else who reported
>> issues thus far, and for our own CI it seems all still fine.
>> -Daniel
>
> Hi, Daniel,
>
> Yes, I rebased drm-misc-next on top of vmwgfx-next and found the culprit
> to be
>
> 38b6441e "drm/atomic-helper: Remove the backoff hack from set_config.."
>
> Reverting first 1fa4da04 and then
> 38b6441e
>
> fixes the problem.

Yeah, we seem to have a solid functional conflict between the vmwgfx
atomic conversion, and the changes in drm-misc-next. Preliminary
analysis, but I think what's going on is:
- With the above changes in -misc we punt the deadlock retry loop to
the callers of ->set_config.
- But since it would have been way too invasive, I only fixed up the
atomic callers (in most places we have special paths for atomic and
non-atomic due to slightly different semantics), which means for
legacy functions we in some cases pass a NULL ctx down to
->set_config. But since legacy paths only get called on legacy
drivers, no problem.
- Well except I've done that audit before vmwgfx became atomic, and
that audit is now wrong, and I've forgotten to properly re-audit when
the conflicts happened all around. But since I half-expect to hit a
mid-driver conversion with this I did sprinkle
WARN_ON(drm_drv_uses_atomic_modeset()) over all these paths.

So assuming this is correct, you should see a pile of WARN_ON
backtraces that you're hitting in the atomic-vmwgfx+drm-misc-next
combo. The proper fix would be to switch over to atomic primitives for
all these cases. On a quick look I see some in the vmwgfx fbdev
emulation code, might even be worth it to check whether we could reuse
the core helpers (which do this split handling already) in some cases.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
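
The caller-side retry loop described in the message above can be sketched
in user-space C roughly as follows. This is a minimal illustration, not
kernel code: every sketch_* name is hypothetical, the "locks" are plain
counters, and rejecting a NULL context with -EINVAL is an assumption made
for the sketch (the thread does not say what the real code does then).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's acquire context; not the real struct. */
struct sketch_acquire_ctx {
	int locks_held; /* locks currently tracked by this context */
	int backoffs;   /* times the caller dropped everything and retried */
};

/*
 * Hypothetical stand-in for a ->set_config implementation that no longer
 * retries internally. It takes one "lock" and reports -EDEADLK while
 * *deadlocks_left is positive, leaving the retry decision to the caller.
 */
int sketch_set_config(struct sketch_acquire_ctx *ctx, int *deadlocks_left)
{
	assert(ctx != NULL && deadlocks_left != NULL);
	ctx->locks_held++;
	if (*deadlocks_left > 0) {
		(*deadlocks_left)--;
		return -EDEADLK;
	}
	return 0;
}

/* Drop every lock tracked by the context so the operation can restart. */
void sketch_backoff(struct sketch_acquire_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->locks_held = 0;
	ctx->backoffs++;
}

/*
 * Caller-side retry loop. A caller without a context has nothing to back
 * off with, so this sketch refuses it with -EINVAL (an assumption).
 */
int sketch_set_config_with_retry(struct sketch_acquire_ctx *ctx, int deadlocks)
{
	int ret;

	if (ctx == NULL)
		return -EINVAL;

	for (;;) {
		ret = sketch_set_config(ctx, &deadlocks);
		if (ret != -EDEADLK)
			break;
		sketch_backoff(ctx);
	}

	ctx->locks_held = 0; /* release whatever the final attempt took */
	return ret;
}
```

Calling sketch_set_config_with_retry() with a zero-initialised context and
three simulated deadlocks backs off three times before succeeding. The
point of the sketch is only that the loop now lives in the caller, so any
caller that was not converted has no way to recover from -EDEADLK.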

* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 14:10   ` Thomas Hellstrom
  2017-04-06 14:46     ` Daniel Vetter
@ 2017-04-06 14:47     ` Daniel Vetter
  2017-04-06 14:56       ` Thomas Hellstrom
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2017-04-06 14:47 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: dri-devel

On Thu, Apr 6, 2017 at 4:10 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>
> Also, when testing the tip of drm-misc-next (with the non-atomic vmwgfx)
> there appeared to be warnings about a non-NULL
> dev->mode_config.acquire_ctx. I'll see if I can reproduce those, but
> perhaps removing the line
>
> dev->mode_config.acquire_ctx = &ctx
>
> in drm_mode_setcrtc()
>
> is part of the problem.

Hm, where do you hit that? And by tip of drm-misc-next, do you mean
the very latest state, which includes atomic vmwgfx, or is this with
the non-atomic vmwgfx? Please paste the backtraces (and for which tree
they are).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 14:47     ` Daniel Vetter
@ 2017-04-06 14:56       ` Thomas Hellstrom
  2017-04-06 15:01         ` Daniel Vetter
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2017-04-06 14:56 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel

On 04/06/2017 04:47 PM, Daniel Vetter wrote:
> On Thu, Apr 6, 2017 at 4:10 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> Also, when testing the tip of drm-misc-next (with the non-atomic vmwgfx)
>> there appeared to be warnings about a non-NULL
>> dev->mode_config.acquire_ctx. I'll see if I can reproduce those, but
>> perhaps removing the line
>>
>> dev->mode_config.acquire_ctx = &ctx
>>
>> in drm_mode_setcrtc()
>>
>> is part of the problem.
> Hm, where do you hit that? And by tip of drm-misc-next, do you mean
> the very latest state, which includes atomic vmwgfx, or is this with
> the non-atomic vmwgfx? Please paste the backtraces (and for which tree
> they are).
> -Daniel

Actually this must have been from a confused rebase-bisect state
somewhere. I can't reproduce this.

/Thomas



* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 14:56       ` Thomas Hellstrom
@ 2017-04-06 15:01         ` Daniel Vetter
  2017-04-06 15:07           ` Thomas Hellstrom
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Vetter @ 2017-04-06 15:01 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: dri-devel

On Thu, Apr 6, 2017 at 4:56 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 04/06/2017 04:47 PM, Daniel Vetter wrote:
>> On Thu, Apr 6, 2017 at 4:10 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>> Also, when testing the tip of drm-misc-next (with the non-atomic vmwgfx)
>>> there appeared to be warnings about a non-NULL
>>> dev->mode_config.acquire_ctx. I'll see if I can reproduce those, but
>>> perhaps removing the line
>>>
>>> dev->mode_config.acquire_ctx = &ctx
>>>
>>> in drm_mode_setcrtc()
>>>
>>> is part of the problem.
>> Hm, where do you hit that? And by tip of drm-misc-next, do you mean
>> the very latest state, which includes atomic vmwgfx, or is this with
>> the non-atomic vmwgfx? Please paste the backtraces (and for which tree
>> they are).
>> -Daniel
>
> Actually this must have been from a confused rebase-bisect state
> somewhere. I can't reproduce this.

Have you changed CONFIG_DEBUG_WW_MUTEX_SLOWPATH perhaps? Without this
you shouldn't be able to hit any retry path (so apart from the warnings,
the patch you bisected won't have an effect); with it you will hit the
deadlock retry paths pseudo-randomly. It might take a few trials to
hit it again, depending upon timing.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
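
The effect of such a debug option can be illustrated with a small
user-space C sketch. It is a simplification under stated assumptions, not
the kernel's implementation: the sketch_* names are hypothetical, the
injection schedule (a countdown whose interval grows after each injected
failure) is invented for the example, and it is deterministic rather than
pseudo-random.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical injection state; "enabled" plays the role of the config option. */
struct sketch_inject {
	int enabled;
	unsigned int countdown; /* successful attempts left before the next failure */
	unsigned int interval;  /* grows after every injected failure */
};

/* One uncontended lock attempt: it can only fail if injection is enabled. */
int sketch_try_lock(struct sketch_inject *inj)
{
	assert(inj != NULL);
	if (!inj->enabled)
		return 0;
	if (inj->countdown == 0) {
		inj->interval = inj->interval * 2 + 1;
		inj->countdown = inj->interval;
		return -EDEADLK; /* artificial deadlock: caller must back off */
	}
	inj->countdown--;
	return 0;
}

/* Count how often a caller would have to run its retry path. */
unsigned int sketch_count_retries(struct sketch_inject *inj, unsigned int attempts)
{
	unsigned int retries = 0;
	unsigned int i;

	for (i = 0; i < attempts; i++) {
		if (sketch_try_lock(inj) == -EDEADLK)
			retries++;
	}
	return retries;
}
```

With injection disabled, no attempt in this sketch ever reaches the retry
path; with it enabled, the retry path runs a handful of times at growing
intervals. That matches the description above: the option exists to
exercise backoff code that uncontended testing would otherwise never hit.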

* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 15:01         ` Daniel Vetter
@ 2017-04-06 15:07           ` Thomas Hellstrom
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Hellstrom @ 2017-04-06 15:07 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel

On 04/06/2017 05:01 PM, Daniel Vetter wrote:
> On Thu, Apr 6, 2017 at 4:56 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 04/06/2017 04:47 PM, Daniel Vetter wrote:
>>> On Thu, Apr 6, 2017 at 4:10 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>>> Also, when testing the tip of drm-misc-next (with the non-atomic vmwgfx)
>>>> there appeared to be warnings about a non-NULL
>>>> dev->mode_config.acquire_ctx. I'll see if I can reproduce those, but
>>>> perhaps removing the line
>>>>
>>>> dev->mode_config.acquire_ctx = &ctx
>>>>
>>>> in drm_mode_setcrtc()
>>>>
>>>> is part of the problem.
>>> Hm, where do you hit that? And by tip of drm-misc-next, do you mean
>>> the very latest state, which includes atomic vmwgfx, or is this with
>>> the non-atomic vmwgfx? Please paste the backtraces (and for which tree
>>> they are).
>>> -Daniel
>> Actually this must have been from a confused rebase-bisect state
>> somewhere. I can't reproduce this.
> Have you changed CONFIG_DEBUG_WW_MUTEX_SLOWPATH perhaps? Without this
> you shouldn't be able to hit any retry path (so apart from the warnings,
> the patch you bisected won't have an effect); with it you will hit the
> deadlock retry paths pseudo-randomly. It might take a few trials to
> hit it again, depending upon timing.
> -Daniel

No, it's not that. I've set it to off for all these tests, but lockdep
and hang-checks on.

/Thomas



* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 14:46     ` Daniel Vetter
@ 2017-04-06 18:01       ` Thomas Hellstrom
  2017-04-06 19:52         ` Daniel Vetter
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Hellstrom @ 2017-04-06 18:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel

On 04/06/2017 04:46 PM, Daniel Vetter wrote:
> On Thu, Apr 6, 2017 at 4:10 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 04/06/2017 02:34 PM, Daniel Vetter wrote:
>>> Hi Thomas,
>>>
>>> Bisected an offender already? Afaik there's no one else who reported
>>> issues thus far, and for our own CI it seems all still fine.
>>> -Daniel
>> Hi, Daniel,
>>
>> Yes, I rebased drm-misc-next on top of vmwgfx-next and found the culprit
>> to be
>>
>> 38b6441e "drm/atomic-helper: Remove the backoff hack from set_config.."
>>
>> Reverting first 1fa4da04 and then
>> 38b6441e
>>
>> fixes the problem.
> Yeah, we seem to have a solid functional conflict between the vmwgfx
> atomic conversion, and the changes in drm-misc-next. Preliminary
> analysis, but I think what's going on is:
> - With the above changes in -misc we punt the deadlock retry loop to
> the callers of ->set_config.
> - But since it would have been way too invasive, I only fixed up the
> atomic callers (in most places we have special paths for atomic and
> non-atomic due to slightly different semantics), which means for
> legacy functions we in some cases pass a NULL ctx down to
> ->set_config. But since legacy paths only get called on legacy
> drivers, no problem.
> - Well except I've done that audit before vmwgfx became atomic, and
> that audit is now wrong, and I've forgotten to properly re-audit when
> the conflicts happened all around. But since I half-expect to hit a
> mid-driver conversion with this I did sprinkle
> WARN_ON(drm_drv_uses_atomic_modeset()) over all these paths.
>
> So assuming this is correct, you should see a pile of WARN_ON
> backtraces that you're hitting in the atomic-vmwgfx+drm-misc-next
> combo. The proper fix would be to switch over to atomic primitives for
> all these cases. On a quick look I see some in the vmwgfx fbdev
> emulation code, might even be worth it to check whether we could reuse
> the core helpers (which do this split handling already) in some cases.
>
> Cheers, Daniel

So with the two reverts previously mentioned applied, I see the
following. Is this consistent with the above?

FWIW I did a pretty big vmwgfx fbdev rewrite some time ago, but at that
time we didn't have the callbacks necessary to use the helpers. Maybe
that has changed with the atomic implementation.

Considering that Sinclair just had a baby, I'm not 100% sure though,
that I have time to fix this up in the vmwgfx driver for this merge
window...

/Thomas


[    9.547101] WARNING: CPU: 3 PID: 359 at
drivers/gpu/drm/drm_modeset_lock.c:107 drm_modeset_lock_all+0xb8/0xc0 [drm]
[    9.547102] Modules linked in: snd_rawmidi snd_timer
ghash_clmulni_intel intel_rapl_perf ppdev snd_seq_device vmw_balloon snd
rfkill joydev soundcore nfit parport_pc parport acpi_cpufreq tpm_tis
tpm_tis_core tpm shpchp vmw_vmci i2c_piix4 nfsd auth_rpcgss nfs_acl
lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi
scsi_transport_spi mptscsih crc32c_intel e1000 mptbase ata_generic
serio_raw pata_acpi uas usb_storage
[    9.547122] CPU: 3 PID: 359 Comm: plymouthd Tainted: G        W      
4.11.0-rc4+ #2
[    9.547122] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
Desktop Reference Platform, BIOS 6.00 01/24/2017
[    9.547123] Call Trace:
[    9.547128]  dump_stack+0x63/0x86
[    9.547130]  __warn+0xcb/0xf0
[    9.547131]  warn_slowpath_null+0x1d/0x20
[    9.547137]  drm_modeset_lock_all+0xb8/0xc0 [drm]
[    9.547143]  vmw_framebuffer_dmabuf_dirty+0x4c/0x200 [vmwgfx]
[    9.547145]  ? __check_object_size+0x100/0x19d
[    9.547152]  drm_mode_dirtyfb_ioctl+0x178/0x1a0 [drm]
[    9.547158]  drm_ioctl+0x209/0x4c0 [drm]
[    9.547164]  ? drm_mode_getfb+0x100/0x100 [drm]
[    9.547165]  ? __do_fault+0x1e/0x110
[    9.547169]  vmw_generic_ioctl+0x193/0x2d0 [vmwgfx]
[    9.547175]  ? drm_getunique+0xa0/0xa0 [drm]
[    9.547179]  vmw_unlocked_ioctl+0x15/0x20 [vmwgfx]
[    9.547180]  do_vfs_ioctl+0xa3/0x5f0
[    9.547181]  SyS_ioctl+0x79/0x90
[    9.547182]  do_syscall_64+0x67/0x180
[    9.547184]  entry_SYSCALL64_slow_path+0x25/0x25
[    9.547185] RIP: 0033:0x7fd4c93b7787
[    9.547186] RSP: 002b:00007fff17d06b88 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[    9.547187] RAX: ffffffffffffffda RBX: 0000000000000c80 RCX:
00007fd4c93b7787
[    9.547187] RDX: 00007fff17d06bc0 RSI: 00000000c01864b1 RDI:
0000000000000009
[    9.547188] RBP: 00007fff17d06bc0 R08: 00007fd4c7554000 R09:
00007fd4ca1e9010
[    9.547188] R10: 0000558ffe14ca40 R11: 0000000000000246 R12:
00000000c01864b1
[    9.547188] R13: 0000000000000009 R14: 0000000000000000 R15:
0000000000000258
[    9.547190] ---[ end trace 46a3554c8816a28b ]---


[    4.824456] WARNING: CPU: 2 PID: 359 at drivers/gpu/drm/drm_crtc.c:499
drm_mode_set_config_internal+0x40/0x50 [drm]
[    4.824457] Modules linked in: vmwgfx drm_kms_helper ttm drm mptspi
scsi_transport_spi mptscsih crc32c_intel e1000(+) mptbase ata_generic
serio_raw pata_acpi uas usb_storage
[    4.824467] CPU: 2 PID: 359 Comm: plymouthd Tainted: G        W      
4.11.0-rc4+ #2
[    4.824468] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
Desktop Reference Platform, BIOS 6.00 01/24/2017
[    4.824468] Call Trace:
[    4.824474]  dump_stack+0x63/0x86
[    4.824476]  __warn+0xcb/0xf0
[    4.824477]  warn_slowpath_null+0x1d/0x20
[    4.824483]  drm_mode_set_config_internal+0x40/0x50 [drm]
[    4.824492]  vmw_fb_set_par+0x269/0x580 [vmwgfx]
[    4.824494]  ? selinux_capable+0x20/0x30
[    4.824498]  ? ttm_mem_global_reserve.constprop.6+0xd6/0x100 [ttm]
[    4.824503]  vmw_fb_on+0x24/0x60 [vmwgfx]
[    4.824506]  vmw_master_drop+0x81/0xc0 [vmwgfx]
[    4.824511]  drm_drop_master+0x21/0x50 [drm]
[    4.824516]  drm_dropmaster_ioctl+0x6c/0x70 [drm]
[    4.824521]  drm_ioctl+0x209/0x4c0 [drm]
[    4.824526]  ? drm_setmaster_ioctl+0xa0/0xa0 [drm]
[    4.824528]  ? do_filp_open+0xa5/0x100
[    4.824532]  vmw_generic_ioctl+0x193/0x2d0 [vmwgfx]
[    4.824537]  ? drm_getunique+0xa0/0xa0 [drm]
[    4.824541]  vmw_unlocked_ioctl+0x15/0x20 [vmwgfx]
[    4.824543]  do_vfs_ioctl+0xa3/0x5f0
[    4.824544]  SyS_ioctl+0x79/0x90
[    4.824545]  do_syscall_64+0x67/0x180
[    4.824547]  entry_SYSCALL64_slow_path+0x25/0x25
[    4.824548] RIP: 0033:0x7fd4c93b7787
[    4.824549] RSP: 002b:00007fff17d06d98 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[    4.824550] RAX: ffffffffffffffda RBX: 0000558ffe145260 RCX:
00007fd4c93b7787
[    4.824550] RDX: 0000000000000000 RSI: 000000000000641f RDI:
0000000000000009
[    4.824551] RBP: 0000000000000000 R08: 00007fd4c967ab98 R09:
0000000000000005
[    4.824551] R10: 0000558ffe145390 R11: 0000000000000246 R12:
000000000000641f
[    4.824552] R13: 0000000000000009 R14: 00007fd4c9da78e0 R15:
0000000000000000
[    4.824553] ---[ end trace 46a3554c8816a28a ]---


[   19.720064] WARNING: CPU: 0 PID: 1316 at
drivers/gpu/drm/drm_modeset_lock.c:107 drm_modeset_lock_all+0xb8/0xc0 [drm]
[   19.720065] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns
nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat
ip6table_security ip6table_raw ip6table_mangle ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 iptable_security
iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ebtable_filter ebtables
ip6table_filter ip6_tables vmw_vsock_vmci_transport vsock bnep
snd_seq_midi snd_seq_midi_event snd_ens1371 gameport snd_ac97_codec
crct10dif_pclmul ac97_bus btusb btrtl btbcm btintel snd_seq bluetooth
snd_pcm crc32_pclmul snd_rawmidi snd_timer ghash_clmulni_intel
intel_rapl_perf ppdev snd_seq_device
[   19.720091]  vmw_balloon snd rfkill joydev soundcore nfit parport_pc
parport acpi_cpufreq tpm_tis tpm_tis_core tpm shpchp vmw_vmci i2c_piix4
nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm
drm mptspi scsi_transport_spi mptscsih crc32c_intel e1000 mptbase
ata_generic serio_raw pata_acpi uas usb_storage
[   19.720106] CPU: 0 PID: 1316 Comm: Xorg Tainted: G        W      
4.11.0-rc4+ #2
[   19.720107] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
Desktop Reference Platform, BIOS 6.00 01/24/2017
[   19.720107] Call Trace:
[   19.720113]  dump_stack+0x63/0x86
[   19.720115]  __warn+0xcb/0xf0
[   19.720116]  warn_slowpath_null+0x1d/0x20
[   19.720123]  drm_modeset_lock_all+0xb8/0xc0 [drm]
[   19.720129]  drm_mode_gamma_set_ioctl+0x3a/0x180 [drm]
[   19.720134]  drm_ioctl+0x209/0x4c0 [drm]
[   19.720140]  ? drm_mode_crtc_set_gamma_size+0xa0/0xa0 [drm]
[   19.720151]  ? add_wait_queue+0x65/0x80
[   19.720158]  vmw_generic_ioctl+0x193/0x2d0 [vmwgfx]
[   19.720163]  ? drm_getunique+0xa0/0xa0 [drm]
[   19.720167]  vmw_unlocked_ioctl+0x15/0x20 [vmwgfx]
[   19.720169]  do_vfs_ioctl+0xa3/0x5f0
[   19.720170]  ? sk_prot_alloc+0x5/0x120
[   19.720171]  SyS_ioctl+0x79/0x90
[   19.720173]  entry_SYSCALL_64_fastpath+0x1a/0xa9
[   19.720174] RIP: 0033:0x7f9eb9f24787
[   19.720175] RSP: 002b:00007ffd90012b88 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[   19.720176] RAX: ffffffffffffffda RBX: 000000000222ffe0 RCX:
00007f9eb9f24787
[   19.720176] RDX: 00007ffd90012bc0 RSI: 00000000c02064a5 RDI:
000000000000000c
[   19.720176] RBP: 00007f9eba1e43c0 R08: 0000000002130fb0 R09:
00000000021311b0
[   19.720177] R10: 0000000000000088 R11: 0000000000000246 R12:
0000000000000000
[   19.720177] R13: 00007f9ebc6822a8 R14: 00007f9eb9f9b5e0 R15:
00007ffd9000eeb0
[   19.720179] ---[ end trace 46a3554c8816a293 ]---
[   31.611886] systemd-journald[600]: File
/var/log/journal/fbbc68aec3984fd6b148a9830a1096e0/user-2000.journal
corrupted or uncleanly shut down, renaming and replacing.
[   31.937861] ------------[ cut here ]------------






* Re: drm-next-misc merge breaks vmwgfx
  2017-04-06 18:01       ` Thomas Hellstrom
@ 2017-04-06 19:52         ` Daniel Vetter
  0 siblings, 0 replies; 10+ messages in thread
From: Daniel Vetter @ 2017-04-06 19:52 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: dri-devel

On Thu, Apr 6, 2017 at 8:01 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 04/06/2017 04:46 PM, Daniel Vetter wrote:
>> On Thu, Apr 6, 2017 at 4:10 PM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>> On 04/06/2017 02:34 PM, Daniel Vetter wrote:
>>>> Hi Thomas,
>>>>
>>>> Bisected an offender already? Afaik there's no one else who reported
>>>> issues thus far, and for our own CI it seems all still fine.
>>>> -Daniel
>>> Hi, Daniel,
>>>
>>> Yes, I rebased drm-misc-next on top of vmwgfx-next and found the culprit
>>> to be
>>>
>>> 38b6441e "drm/atomic-helper: Remove the backoff hack from set_config.."
>>>
>>> Reverting first 1fa4da04 and then
>>> 38b6441e
>>>
>>> fixes the problem.
>> Yeah, we seem to have a solid functional conflict between the vmwgfx
>> atomic conversion, and the changes in drm-misc-next. Preliminary
>> analysis, but I think what's going on is:
>> - With the above changes in -misc we punt the deadlock retry loop to
>> the callers of ->set_config.
>> - But since it would have been way too invasive, I only fixed up the
>> atomic callers (in most places we have special paths for atomic and
>> non-atomic due to slightly different semantics), which means for
>> legacy functions we in some cases pass a NULL ctx down to
>> ->set_config. But since legacy paths only get called on legacy
>> drivers, no problem.
>> - Well except I've done that audit before vmwgfx became atomic, and
>> that audit is now wrong, and I've forgotten to properly re-audit when
>> the conflicts happened all around. But since I half-expect to hit a
>> mid-driver conversion with this I did sprinkle
>> WARN_ON(drm_drv_uses_atomic_modeset()) over all these paths.
>>
>> So assuming this is correct, you should see a pile of WARN_ON
>> backtraces that you're hitting in the atomic-vmwgfx+drm-misc-next
>> combo. The proper fix would be to switch over to atomic primitives for
>> all these cases. On a quick look I see some in the vmwgfx fbdev
>> emulation code, might even be worth it to check whether we could reuse
>> the core helpers (which do this split handling already) in some cases.
>>
>> Cheers, Daniel
>
> So with the two reverts previously mentioned applied, I see the
> following. Is this consistent with the above?
>
> FWIW I did a pretty big vmwgfx fbdev rewrite some time ago, but at that
> time we didn't have the callbacks
> necessary to use the helpers. Maybe that has changed with the atomic
> implementation.
>
> Considering that Sinclair just had a baby, I'm not 100% sure though,
> that I have time to fix this up in the vmwgfx driver for this merge
> window...
>
> /Thomas
>
>
> [    9.547101] WARNING: CPU: 3 PID: 359 at
> drivers/gpu/drm/drm_modeset_lock.c:107 drm_modeset_lock_all+0xb8/0xc0 [drm]
> [    9.547102] Modules linked in: snd_rawmidi snd_timer
> ghash_clmulni_intel intel_rapl_perf ppdev snd_seq_device vmw_balloon snd
> rfkill joydev soundcore nfit parport_pc parport acpi_cpufreq tpm_tis
> tpm_tis_core tpm shpchp vmw_vmci i2c_piix4 nfsd auth_rpcgss nfs_acl
> lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi
> scsi_transport_spi mptscsih crc32c_intel e1000 mptbase ata_generic
> serio_raw pata_acpi uas usb_storage
> [    9.547122] CPU: 3 PID: 359 Comm: plymouthd Tainted: G        W
> 4.11.0-rc4+ #2
> [    9.547122] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> Desktop Reference Platform, BIOS 6.00 01/24/2017
> [    9.547123] Call Trace:
> [    9.547128]  dump_stack+0x63/0x86
> [    9.547130]  __warn+0xcb/0xf0
> [    9.547131]  warn_slowpath_null+0x1d/0x20
> [    9.547137]  drm_modeset_lock_all+0xb8/0xc0 [drm]
> [    9.547143]  vmw_framebuffer_dmabuf_dirty+0x4c/0x200 [vmwgfx]
> [    9.547145]  ? __check_object_size+0x100/0x19d
> [    9.547152]  drm_mode_dirtyfb_ioctl+0x178/0x1a0 [drm]
> [    9.547158]  drm_ioctl+0x209/0x4c0 [drm]
> [    9.547164]  ? drm_mode_getfb+0x100/0x100 [drm]
> [    9.547165]  ? __do_fault+0x1e/0x110
> [    9.547169]  vmw_generic_ioctl+0x193/0x2d0 [vmwgfx]
> [    9.547175]  ? drm_getunique+0xa0/0xa0 [drm]
> [    9.547179]  vmw_unlocked_ioctl+0x15/0x20 [vmwgfx]
> [    9.547180]  do_vfs_ioctl+0xa3/0x5f0
> [    9.547181]  SyS_ioctl+0x79/0x90
> [    9.547182]  do_syscall_64+0x67/0x180
> [    9.547184]  entry_SYSCALL64_slow_path+0x25/0x25
> [    9.547185] RIP: 0033:0x7fd4c93b7787
> [    9.547186] RSP: 002b:00007fff17d06b88 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [    9.547187] RAX: ffffffffffffffda RBX: 0000000000000c80 RCX:
> 00007fd4c93b7787
> [    9.547187] RDX: 00007fff17d06bc0 RSI: 00000000c01864b1 RDI:
> 0000000000000009
> [    9.547188] RBP: 00007fff17d06bc0 R08: 00007fd4c7554000 R09:
> 00007fd4ca1e9010
> [    9.547188] R10: 0000558ffe14ca40 R11: 0000000000000246 R12:
> 00000000c01864b1
> [    9.547188] R13: 0000000000000009 R14: 0000000000000000 R15:
> 0000000000000258
> [    9.547190] ---[ end trace 46a3554c8816a28b ]---

This is an artifact of the two reverts, I've forgotten to properly
clear config->acquire_ctx again in the intermediate states.

> [    4.824456] WARNING: CPU: 2 PID: 359 at drivers/gpu/drm/drm_crtc.c:499
> drm_mode_set_config_internal+0x40/0x50 [drm]
> [    4.824457] Modules linked in: vmwgfx drm_kms_helper ttm drm mptspi
> scsi_transport_spi mptscsih crc32c_intel e1000(+) mptbase ata_generic
> serio_raw pata_acpi uas usb_storage
> [    4.824467] CPU: 2 PID: 359 Comm: plymouthd Tainted: G        W
> 4.11.0-rc4+ #2
> [    4.824468] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> Desktop Reference Platform, BIOS 6.00 01/24/2017
> [    4.824468] Call Trace:
> [    4.824474]  dump_stack+0x63/0x86
> [    4.824476]  __warn+0xcb/0xf0
> [    4.824477]  warn_slowpath_null+0x1d/0x20
> [    4.824483]  drm_mode_set_config_internal+0x40/0x50 [drm]
> [    4.824492]  vmw_fb_set_par+0x269/0x580 [vmwgfx]
> [    4.824494]  ? selinux_capable+0x20/0x30
> [    4.824498]  ? ttm_mem_global_reserve.constprop.6+0xd6/0x100 [ttm]
> [    4.824503]  vmw_fb_on+0x24/0x60 [vmwgfx]
> [    4.824506]  vmw_master_drop+0x81/0xc0 [vmwgfx]
> [    4.824511]  drm_drop_master+0x21/0x50 [drm]
> [    4.824516]  drm_dropmaster_ioctl+0x6c/0x70 [drm]
> [    4.824521]  drm_ioctl+0x209/0x4c0 [drm]
> [    4.824526]  ? drm_setmaster_ioctl+0xa0/0xa0 [drm]
> [    4.824528]  ? do_filp_open+0xa5/0x100
> [    4.824532]  vmw_generic_ioctl+0x193/0x2d0 [vmwgfx]
> [    4.824537]  ? drm_getunique+0xa0/0xa0 [drm]
> [    4.824541]  vmw_unlocked_ioctl+0x15/0x20 [vmwgfx]
> [    4.824543]  do_vfs_ioctl+0xa3/0x5f0
> [    4.824544]  SyS_ioctl+0x79/0x90
> [    4.824545]  do_syscall_64+0x67/0x180
> [    4.824547]  entry_SYSCALL64_slow_path+0x25/0x25
> [    4.824548] RIP: 0033:0x7fd4c93b7787
> [    4.824549] RSP: 002b:00007fff17d06d98 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [    4.824550] RAX: ffffffffffffffda RBX: 0000558ffe145260 RCX:
> 00007fd4c93b7787
> [    4.824550] RDX: 0000000000000000 RSI: 000000000000641f RDI:
> 0000000000000009
> [    4.824551] RBP: 0000000000000000 R08: 00007fd4c967ab98 R09:
> 0000000000000005
> [    4.824551] R10: 0000558ffe145390 R11: 0000000000000246 R12:
> 000000000000641f
> [    4.824552] R13: 0000000000000009 R14: 00007fd4c9da78e0 R15:
> 0000000000000000
> [    4.824553] ---[ end trace 46a3554c8816a28a ]---

Yeah, this is the "don't do that" case that I expected.

> [   19.720064] WARNING: CPU: 0 PID: 1316 at
> drivers/gpu/drm/drm_modeset_lock.c:107 drm_modeset_lock_all+0xb8/0xc0 [drm]
> [   19.720065] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
> nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns
> nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
> xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat
> ip6table_security ip6table_raw ip6table_mangle ip6table_nat
> nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 iptable_security
> iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ebtable_filter ebtables
> ip6table_filter ip6_tables vmw_vsock_vmci_transport vsock bnep
> snd_seq_midi snd_seq_midi_event snd_ens1371 gameport snd_ac97_codec
> crct10dif_pclmul ac97_bus btusb btrtl btbcm btintel snd_seq bluetooth
> snd_pcm crc32_pclmul snd_rawmidi snd_timer ghash_clmulni_intel
> intel_rapl_perf ppdev snd_seq_device
> [   19.720091]  vmw_balloon snd rfkill joydev soundcore nfit parport_pc
> parport acpi_cpufreq tpm_tis tpm_tis_core tpm shpchp vmw_vmci i2c_piix4
> nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm
> drm mptspi scsi_transport_spi mptscsih crc32c_intel e1000 mptbase
> ata_generic serio_raw pata_acpi uas usb_storage
> [   19.720106] CPU: 0 PID: 1316 Comm: Xorg Tainted: G        W
> 4.11.0-rc4+ #2
> [   19.720107] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> Desktop Reference Platform, BIOS 6.00 01/24/2017
> [   19.720107] Call Trace:
> [   19.720113]  dump_stack+0x63/0x86
> [   19.720115]  __warn+0xcb/0xf0
> [   19.720116]  warn_slowpath_null+0x1d/0x20
> [   19.720123]  drm_modeset_lock_all+0xb8/0xc0 [drm]
> [   19.720129]  drm_mode_gamma_set_ioctl+0x3a/0x180 [drm]
> [   19.720134]  drm_ioctl+0x209/0x4c0 [drm]
> [   19.720140]  ? drm_mode_crtc_set_gamma_size+0xa0/0xa0 [drm]
> [   19.720151]  ? add_wait_queue+0x65/0x80
> [   19.720158]  vmw_generic_ioctl+0x193/0x2d0 [vmwgfx]
> [   19.720163]  ? drm_getunique+0xa0/0xa0 [drm]
> [   19.720167]  vmw_unlocked_ioctl+0x15/0x20 [vmwgfx]
> [   19.720169]  do_vfs_ioctl+0xa3/0x5f0
> [   19.720170]  ? sk_prot_alloc+0x5/0x120
> [   19.720171]  SyS_ioctl+0x79/0x90
> [   19.720173]  entry_SYSCALL_64_fastpath+0x1a/0xa9
> [   19.720174] RIP: 0033:0x7f9eb9f24787
> [   19.720175] RSP: 002b:00007ffd90012b88 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [   19.720176] RAX: ffffffffffffffda RBX: 000000000222ffe0 RCX:
> 00007f9eb9f24787
> [   19.720176] RDX: 00007ffd90012bc0 RSI: 00000000c02064a5 RDI:
> 000000000000000c
> [   19.720176] RBP: 00007f9eba1e43c0 R08: 0000000002130fb0 R09:
> 00000000021311b0
> [   19.720177] R10: 0000000000000088 R11: 0000000000000246 R12:
> 0000000000000000
> [   19.720177] R13: 00007f9ebc6822a8 R14: 00007f9eb9f9b5e0 R15:
> 00007ffd9000eeb0
> [   19.720179] ---[ end trace 46a3554c8816a293 ]---
> [   31.611886] systemd-journald[600]: File
> /var/log/journal/fbbc68aec3984fd6b148a9830a1096e0/user-2000.journal
> corrupted or uncleanly shut down, renaming and replacing.
> [   31.937861] ------------[ cut here ]------------

This is again the leaked acquire_ctx that isn't properly cleared due
to your reverts (well, my not-perfectly-bisectable patches).

I think it should be simple to type up a quick patch to make the
vmwgfx fbdev code work again, I'll submit that asap.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
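
The "leaked acquire_ctx" warnings discussed above can be modelled with a
short user-space C sketch. It assumes that the check firing in
drm_modeset_lock_all() is a test that the stored context pointer is still
NULL, which is how the thread describes it. All sketch_* names are
hypothetical and the warning is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of the mode config. */
struct sketch_mode_config {
	const void *acquire_ctx; /* context of the current holder, or NULL */
	unsigned int warnings;   /* how many times the sanity check fired */
};

/* Take "all locks": complain if a previous holder never cleared its context. */
void sketch_lock_all(struct sketch_mode_config *config, const void *ctx)
{
	assert(config != NULL && ctx != NULL);
	if (config->acquire_ctx != NULL)
		config->warnings++; /* stands in for a WARN_ON() backtrace */
	config->acquire_ctx = ctx;
}

/* Release "all locks" and clear the stored context again. */
void sketch_unlock_all(struct sketch_mode_config *config)
{
	assert(config != NULL);
	config->acquire_ctx = NULL;
}

/* A path that stores its context but forgets to clear it afterwards. */
void sketch_leaky_setcrtc(struct sketch_mode_config *config, const void *ctx)
{
	assert(config != NULL);
	config->acquire_ctx = ctx;
	/* ... modeset work would happen here ... */
	/* missing: config->acquire_ctx = NULL; */
}
```

Balanced sketch_lock_all()/sketch_unlock_all() pairs never bump the
counter. One call to sketch_leaky_setcrtc() followed by sketch_lock_all()
bumps it once, on a later and unrelated call. That is why the backtraces
quoted above point at ioctls such as dirtyfb and gamma_set rather than at
the path that leaked the pointer.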

