On May 10, 2021 08:55:55 Martin Peres wrote: > On 10/05/2021 02:11, Jason Ekstrand wrote: >> On May 9, 2021 12:12:36 Martin Peres wrote: >> >>> Hi, >>> >>> On 06/05/2021 22:13, Matthew Brost wrote: >>>> Basic GuC submission support. This is the first bullet point in the >>>> upstreaming plan covered in the following RFC [1]. >>>> >>>> At a very high level the GuC is a piece of firmware which sits between >>>> the i915 and the GPU. It offloads some of the scheduling of contexts >>>> from the i915 and programs the GPU to submit contexts. The i915 >>>> communicates with the GuC and the GuC communicates with the GPU. >>> >>> May I ask what will GuC command submission do that execlist won't/can't >>> do? And what would be the impact on users? Even forgetting the troubled >>> history of GuC (instability, performance regression, poor level of user >>> support, 6+ years of trying to upstream it...), adding this much code >>> and doubling the amount of validation needed should come with a >>> rationale making it feel worth it... and I am not seeing here. Would you >>> mind providing the rationale behind this work? >>> >>>> >>>> GuC submission will be disabled by default on all current upstream >>>> platforms behind a module parameter - enable_guc. A value of 3 will >>>> enable submission and HuC loading via the GuC. GuC submission should >>>> work on all gen11+ platforms assuming the GuC firmware is present. >>> >>> What is the plan here when it comes to keeping support for execlist? I >>> am afraid that landing GuC support in Linux is the first step towards >>> killing the execlist, which would force users to use proprietary >>> firmwares that even most Intel engineers have little influence over. >>> Indeed, if "drm/i915/guc: Disable semaphores when using GuC scheduling" >>> which states "Disable semaphores when using GuC scheduling as semaphores >>> are broken in the current GuC firmware." is anything to go by, it means >>> that even Intel developers seem to prefer working around the GuC >>> firmware, rather than fixing it. >> >> Yes, landing GuC support may be the first step in removing execlist >> support. The inevitable reality is that GPU scheduling is coming and >> likely to be there only path in the not-too-distant future. (See also >> the ongoing thread with AMD about fences.) I'm not going to pass >> judgement on whether or not this is a good thing. I'm just reading the >> winds and, in my view, this is where things are headed for good or ill. >> >> In answer to the question above, the answer to "what do we gain from >> GuC?" may soon be, "you get to use your GPU." We're not there yet and, >> again, I'm not necessarily advocating for it, but that is likely where >> things are headed. > > This will be a sad day, especially since it seems fundamentally opposed > with any long-term support, on top of taking away user freedom to > fix/tweak their system when Intel won't. > >> A firmware-based submission model isn't a bad design IMO and, aside from >> the firmware freedom issues, I think there are actual advantages to the >> model. Immediately, it'll unlock a few features like parallel submission >> (more on that in a bit) and long-running compute because they're >> implemented in GuC and the work to implement them properly in the >> execlist scheduler is highly non-trivial. Longer term, it may (no >> guarantees) unlock some performance by getting the kernel out of the way. > > Oh, I definitely agree with firmware-based submission model not being a > bad design. I was even cheering for it in 2015. Experience with it made > me regret that deeply since :s > > But with the DRM scheduler being responsible for most things, I fail to > see what we could offload in the GuC except context switching (like > every other manufacturer). The problem is, the GuC does way more than > just switching registers in bulk, and if the number of revisions of the > GuC is anything to go by, it is way too complex for me to feel > comfortable with it. It's more than just bulk register writes. When it comes to load-balancing multiple GPU users, firmware can theoretically preempt and switch faster leading to more efficient time-slicing. All we really need the DRM scheduler for is handling implicit dma_fence dependencies between different applications. > >>> In the same vein, I have another concern related to the impact of GuC on >>> Linux's stable releases. Let's say that in 3 years, a new application >>> triggers a bug in command submission inside the firmware. Given that the >>> Linux community cannot patch the GuC firmware, how likely is it that >>> Intel would release a new GuC version? That would not be necessarily >>> such a big problem if newer versions of the GuC could easily be >>> backported to this potentially-decade-old Linux version, but given that >>> the GuC seems to have ABI-breaking changes on a monthly cadence (we are >>> at major version 60 *already*? :o), I would say that it is >>> highly-unlikely that it would not require potentially-extensive changes >>> to i915 to make it work, making the fix almost impossible to land in the >>> stable tree... Do you have a plan to mitigate this problem? >>> >>> Patches like "drm/i915/guc: Disable bonding extension with GuC >>> submission" also make me twitch, as this means the two command >>> submission paths will not be functionally equivalent, and enabling GuC >>> could thus introduce a user-visible regression (one app used to work, >>> then stopped working). Could you add in the commit's message a proof >>> that this would not end up being a user regression (in which case, why >>> have this codepath to begin with?). >> >> I'd like to address this one specifically as it's become something of a >> speciality of mine the past few weeks. The current bonded submission >> model is bad. It provides a plethora of ways for a client to back itself >> into a corner and doesn't actually provide the guarantees the media >> driver needs for its real-time high-resolution decode. It's bad enough >> we're seriously considering ripping it out, backwards compatibility or >> not. The good news is that very little that your average desktop user >> does depends on it: basically just real-time >4K video decode. >> >> The new parallel submit API is much better and should be the path >> forward. (We should have landed parallel submit the first time around.) >> It isn't full of corners and does let us provides actual parallel >> execution guarantees. It also gives the scheduler the information it >> needs to reliably provide those guarantees. > >> If we need to support the parallel submit API with the execlist >> back-end, that's totally possible. The choice to only implement the >> parallel submit API with GuC is a pragmatic one. We're trying to get >> upstream back on it's feet and get all the various up-and-coming bits of >> hardware enabled. Enabling the new API in the execlist back-end makes >> that pipeline longer. > > I feel your pain, and wish you all the best to get GEM less complex > and more manageable. > > So, if I understood correctly, the plan is just to regress 4K+ video > decoding for people who do not enable GuC scheduling, or did not also > update to a recent-enough media driver that would support this new > interface? If it is indeed only for over 4K videos, then whatever. If it > is 4K, it starts being a little bad, assuming graceful fallback to > CPU-based decoding. What's the test plan for this patch then? The patch > in its current form is definitely not making me confident. My understanding is that it's only >4k that's affected; we've got enough bandwidth on a single VCS for 4K. I'm not sure where the exact cut-off is (it may be a little higher than 4k) but real-time 4k should be fine and real-time 8k requires parallel submit. So we're really not cutting off many use-cases. Also, as I said above, the new API can be implemented with the execlist scheduler if needed. We've just pragmatically deprioritized it. --Jason > >>> Finally, could you explain why IGT tests need to be modified to work the >>> GuC [1], and how much of the code in this series is covered by >>> existing/upcoming tests? I would expect a very solid set of tests to >>> minimize the maintenance burden, and enable users to reproduce potential >>> issues found in this new codepath (too many users run with enable_guc=3, >>> as can be seen on Google[2]). >> >> The IGT changes, as I understand them, are entirely around switching to >> the new parallel submit API. There shouldn't be a major effect to most >> users. > > Right, this part I followed, but failed to connect it to the GuC... > because I couldn't see why it would be needed (execlist requiring a lot > more work). > > I sincerely wish for the GuC to stay away from upstream because of the > above concerns (which are yet to be addressed), but if Intel were to > push forward with the plan to drop execlist, I can foresee a world of > trouble for users... That is of course unless the GuC were to be open > sourced, with people outside of Intel able to sign their own builds or > run unsigned. Failing that, let's hope the last 6 years were just a bad > start, and the rapid climb in major version of the GuC will magically > stop! I hope execlists will remain at feature parity with the GuC when > possible... but deplore the increase in validation needs which will only > hurt users in the end. > > Thanks for your honest answer, > Martin > >> >> --Jason