[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

* [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
@ 2013-09-12 15:06 Peter Zijlstra
  2013-09-12 15:11 ` Maarten Lankhorst
  2013-09-12 15:36 ` Daniel Vetter
  0 siblings, 2 replies; 58+ messages in thread
From: Peter Zijlstra @ 2013-09-12 15:06 UTC (permalink / raw)
  To: airlied
  Cc: Daniel Vetter, Maarten Lankhorst, Thomas Hellstrom, intel-gfx,
	dri-devel, linux-kernel, Ingo Molnar, Thomas Gleixner

Hi Dave,

So I'm poking around the preemption code and stumbled upon:

drivers/gpu/drm/i915/i915_gem.c:                set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:                        set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:                        set_need_resched();
drivers/gpu/drm/udl/udl_gem.c:          set_need_resched();

All these sites basically do:

  while (!trylock())
  	yield();

which is a horrible and broken locking pattern. 

Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
task that preempted the lock holder at FIFOn.

Secondly the implementation is worse than usual by abusing
VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
doesn't retry, but you're using it as a get out of fault path. And
you're using set_need_resched() which is not something a driver should
_ever_ touch.

Now I'm going to take away set_need_resched() -- and while you can
'reimplement' it using set_thread_flag() you're not going to do that
because it will be broken due to changes to the preempt code.

So please as to fix ASAP and don't allow anybody to trick you into
merging silly things like that again ;-)

^ permalink raw reply	[flat|nested] 58+ messages in thread