Hey Chris, Chris Wilson writes: > I have to ask, if this is all just to work around iowait triggering high > frequencies for GPU bound applications, does it all just boil down to > i915 incorrectly using iowait. Does this patch set perform better than > > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > index 9ca9c24b4421..7e7c95411bcd 100644 > --- a/drivers/gpu/drm/i915/i915_request.c > +++ b/drivers/gpu/drm/i915/i915_request.c > @@ -1267,7 +1267,7 @@ long i915_request_wait(struct i915_request *rq, > goto complete; > } > > - timeout = io_schedule_timeout(timeout); > + timeout = schedule_timeout(timeout); > } while (1); > > GEM_BUG_ON(!intel_wait_has_seqno(&wait)); > > Quite clearly the general framework could prove useful in a broader > range of situations, but does the above suffice? (And can be backported > to stable.) > -Chris Nope... This hunk is one of the first things I tried when I started looking into this. It didn't cut it, and it seemed to lead to some regressions in latency-bound test-cases that were relying on the upward bias provided by IOWAIT boosting in combination with the upstream P-state governor. The reason why it's not sufficient is that the bulk of the energy efficiency improvement from this series is obtained by dampening high-frequency oscillations of the CPU P-state, which occur currently for any periodically fluctuating workload (not only i915 rendering) regardless of whether IOWAIT boosting kicks in. i915 using IO waits does exacerbate the problem with the upstream governor by amplifying the oscillation, but it's not really the ultimate cause. In combination with v2 (you can take a peek at the half-baked patch here [1], planning to make a few more changes this week so it isn't quite ready for review yet) this hunk will actually cause more serious regressions because v2 is able to use the frequent IOWAIT stalls of the i915 driver in combination with an IO-underutilized system as a strong indication that the workload is latency-bound, which causes it to transition to latency-minimizing mode, which significantly improves performance of latency-bound rendering (most visible in a handful X11 test-cases and some GPU/CPU sync test-cases from SynMark2). [1] https://people.freedesktop.org/~currojerez/intel_pstate-lp/0001-cpufreq-intel_pstate-Implement-variably-low-pass-fil-v1.8.patch