* [RFC] Using DC in amdgpu for upcoming GPU
@ 2016-12-08  2:02 Harry Wentland
  2016-12-08  9:59 ` Daniel Vetter
  ` (3 more replies)
  0 siblings, 4 replies; 66+ messages in thread
From: Harry Wentland @ 2016-12-08 2:02 UTC (permalink / raw)
To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie
Cc: Grodzovsky, Andrey; Cyr, Aric; Bridgman, John; Lazare, Jordan;
    amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; Deucher, Alexander; Cheng, Tony

We propose to use the Display Core (DC) driver for display support on
AMD's upcoming GPU (referred to as uGPU in the rest of this doc). In order
to avoid a flag day, the plan is to support only uGPU initially and
transition older ASICs over gradually.

The DC component has received extensive testing within AMD for DCE8, 10,
and 11 GPUs and is being prepared for uGPU. Support should be better than
amdgpu's current display support.

* All of our QA effort is focused on DC
* All of our CQE effort is focused on DC
* All of our OEM preloads and custom engagements use DC
* DC behavior mirrors what we do for other OSes

The new asic uses a completely re-designed atom interface, so we cannot
easily leverage much of the existing atom-based code.

We introduced DC to the community earlier in 2016 and received a fair
amount of feedback. Some of the issues we've addressed so far:

* Self-contained ASIC-specific code. We did a bunch of work to pull
  common sequences into dc/dce and leave ASIC-specific code in
  separate folders.
* Started to expose AUX and I2C through generic kernel/drm
  functionality and are mostly using that. Some of that code is still
  needlessly convoluted; this cleanup is in progress.
* Integrated Dave and Jerome's work on removing abstraction in the bios
  parser.
* Retired the adapter service and asic capability code.
* Removed some abstraction in GPIO.

Since a lot of our code is shared with pre- and post-silicon validation
suites, changes need to be made gradually to prevent breakage from a major
flag day.
This, coupled with adding support for new asics and lots of new feature
introductions, means progress has not been as quick as we would have
liked. We have nonetheless made a lot of progress.

The remaining concerns brought up during the last review that we are
working on addressing:

* Continue to clean up and reduce the abstractions in DC where it
  makes sense.
* Remove duplicate code in I2C and AUX as we transition to using the
  DRM core interfaces. We can't fully transition until we've helped
  fill in the gaps in the drm core that we need for certain features.
* Make sure Atomic API support is correct. Some of the semantics of
  the Atomic API were not particularly clear when we started this;
  however, that is improving a lot as the core drm documentation
  improves. Getting this code upstream and into the hands of more
  atomic users will further help us identify and rectify any gaps we
  have.

Unfortunately we cannot expose code for uGPU yet. However, refactor /
cleanup work on DC is public. We're currently transitioning to a public
patch review. You can follow our progress on the amd-gfx mailing list. We
value community feedback on our work.

As an appendix I've included a brief overview of how the code currently
works to make understanding and reviewing the code easier.

Prior discussions on DC:

* https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
* https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html

Current version of DC:

* https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

Once Alex pulls in the latest patches:

* https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

Best Regards,
Harry


************************************************
*** Appendix: A Day in the Life of a Modeset ***
************************************************

Below is a high-level overview of a modeset with dc.
Some of this might be a little out-of-date since it's based on my XDC
presentation, but it should be more-or-less the same.

amdgpu_dm_atomic_commit()
{
  /* setup atomic state */
  drm_atomic_helper_prepare_planes(dev, state);
  drm_atomic_helper_swap_state(dev, state);
  drm_atomic_helper_update_legacy_modeset_state(dev, state);

  /* create or remove targets */

  /********************************************************************
   * *** Call into DC to commit targets with list of all known targets
   ********************************************************************/
  /* DC is optimized not to do anything if 'targets' didn't change. */
  dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
  {
    /******************************************************************
     * *** Build context (function also used for validation)
     ******************************************************************/
    result = core_dc->res_pool->funcs->validate_with_context(
        core_dc, set, target_count, context);

    /******************************************************************
     * *** Apply safe power state
     ******************************************************************/
    pplib_apply_safe_state(core_dc);

    /****************************************************************
     * *** Apply the context to HW (program HW)
     ****************************************************************/
    result = core_dc->hwss.apply_ctx_to_hw(core_dc, context)
    {
      /* reset pipes that need reprogramming */
      /* disable pipe power gating */
      /* set safe watermarks */

      /* for all pipes with an attached stream */
      /************************************************************
       * *** Programming all per-pipe contexts
       ************************************************************/
      status = apply_single_controller_ctx_to_hw(...)
      {
        pipe_ctx->tg->funcs->set_blank(...);
        pipe_ctx->clock_source->funcs->program_pix_clk(...);
        pipe_ctx->tg->funcs->program_timing(...);
        pipe_ctx->mi->funcs->allocate_mem_input(...);
        pipe_ctx->tg->funcs->enable_crtc(...);
        bios_parser_crtc_source_select(...);

        pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
        pipe_ctx->opp->funcs->opp_program_fmt(...);

        stream->sink->link->link_enc->funcs->setup(...);
        pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
        pipe_ctx->tg->funcs->set_blank_color(...);

        core_link_enable_stream(pipe_ctx);
        unblank_stream(pipe_ctx, ...);

        program_scaler(dc, pipe_ctx);
      }
      /* program audio for all pipes */
      /* update watermarks */
    }

    program_timing_sync(core_dc, context);

    /* for all targets */
    target_enable_memory_requests(...);

    /* Update ASIC power states */
    pplib_apply_display_requirements(...);

    /* update surface or page flip */
  }
}

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08  2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland
@ 2016-12-08  9:59 ` Daniel Vetter
  [not found]   ` <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  2016-12-11 20:28 ` Daniel Vetter
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 66+ messages in thread
From: Daniel Vetter @ 2016-12-08 9:59 UTC (permalink / raw)
To: Harry Wentland
Cc: Grodzovsky, Andrey; amd-gfx; dri-devel; Deucher, Alexander; Cheng, Tony

Hi Harry,

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> [snip]
> [snip]
>
>         dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>         {
>             /******************************************************************
>              * *** Build context (function also used for validation)
>              ******************************************************************/
>             result = core_dc->res_pool->funcs->validate_with_context(
>                 core_dc, set, target_count, context);

I can't dig into details of DC, so this is not a 100% assessment, but if
you call a function called "validate" in atomic_commit, you're very, very
likely breaking atomic. _All_ validation must happen in ->atomic_check;
if that's not the case, TEST_ONLY mode is broken.
And atomic userspace is relying on that working.

The only thing that you're allowed to return from ->atomic_commit is
out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
Kerneldoc explains this.

Now the reason I bring this up (and we've discussed it at length in
private) is that DC still suffers from a massive abstraction midlayer. A
lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
timers, irq, ...) has been cleaned up, but the midlayer is still there.
And I understand why you have it, and why it's there - without some OS
abstraction your grand plan of a unified driver across everything doesn't
work out so well.

But in a way the backend stuff isn't such a big deal. It's annoying since
it's lots of code, and bugfixes have to be duplicated and all that, but
it's fairly easy to fix case-by-case, and as long as AMD folks stick
around (which I fully expect) not a maintenance issue. It makes it harder
for others to contribute, but since it's mostly leaf code it's generally
easy to just improve the part you want to change (as an outsider). And if
you want to improve shared code the only downside is that you can't also
improve amd, but that's not so much a problem for non-amd folks ;-)

The problem, on the other hand, with the abstraction layer between the
drm core and the amd driver is that you can't ignore it if you want to
refactor shared code. And because it's an entire world of its own, it's
much harder to understand what the driver is doing (without reading it
all). Some examples of what I mean:

- All other drm drivers subclass drm objects (by embedding them) into the
  corresponding hw part that most closely matches the drm object's
  semantics. That means even when you have 0 clue about how a given piece
  of hw works, you have a reasonable chance of understanding code. If it's
  all your own stuff you always have to keep in mind the special amd
  naming conventions. That gets old real fast if you're trying to figure
  out what 20+ (or are we at 30 already?) drivers are doing.

- This is even more true for atomic. Atomic has a pretty complicated
  check/commit transactional model for updating display state. It's a
  standardized interface, it's extensible, and we want generic userspace
  to be able to run on any driver. Fairly often we realize that the
  semantics of existing or newly proposed properties and state aren't
  well-defined enough, and then we need to go and read all the drivers
  and figure out how to fix up the mess. DC has its entirely separate
  state structures which again don't subclass the atomic core structures
  (afaik at least). Again the same problems apply: you can't find things,
  and figuring out the exact semantics and spotting differences in
  behaviour is almost impossible.

- The trouble isn't just in reading code and understanding it correctly,
  it's also in finding it. If you have your own completely different
  world then just finding the right code is hard - cscope and grep fail
  to work.

- Another issue is that very often we unify semantics in drivers by
  adding some new helpers that at least do the right thing for most of
  the drivers. If you have your own world then the impedance mismatch
  will make sure that amd drivers have slightly different semantics, and
  I think that's not good for the ecosystem and kms - people want to run
  a lot more than just a boot splash with generic kms userspace, and
  stuff like xf86-video-$vendor is going out of favour heavily.

Note that all this isn't about amd walking away and leaving an
unmaintainable mess behind. Like I've said, I don't think that is a big
risk. The trouble is that having your own world makes it harder for
everyone else to understand the amd driver, and understanding all drivers
is very often step 1 in some big refactoring or feature addition effort.
Because starting to refactor without understanding the problem generally
doesn't work ;-) And you can't make this step 1 easier for others by
promising to always maintain DC and update it to all the core changes,
because that's only step 2.

In all the DC discussions we've had thus far I haven't seen anyone
address this issue. And this isn't just an issue in drm; it's pretty much
established across all linux subsystems with the "no midlayer or OS
abstraction layers in drivers" rule. There are some real solid reasons
why such a HAL is extremely unpopular with upstream. And I haven't yet
seen any good reason why amd needs to be different; thus far it looks
like a textbook case, and there have been lots of vendors in lots of
subsystems who tried to push their HAL.

Thanks, Daniel

> [snip]

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
[parent not found: <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  [not found] ` <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-08 14:33 ` Harry Wentland
  2016-12-08 15:34   ` Daniel Vetter
  2016-12-08 20:07   ` Dave Airlie
  1 sibling, 1 reply; 66+ messages in thread
From: Harry Wentland @ 2016-12-08 14:33 UTC (permalink / raw)
To: Daniel Vetter
Cc: Grodzovsky, Andrey; Dave Airlie; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW;
    dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; Deucher, Alexander; Cheng, Tony

Hi Daniel,

just a quick clarification in-line about "validation" inside atomic_commit.

On 2016-12-08 04:59 AM, Daniel Vetter wrote:
> Hi Harry,
>
> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>> [snip]
>>
>>         result = core_dc->res_pool->funcs->validate_with_context(
>>             core_dc, set, target_count, context);
>
> I can't dig into details of DC, so this is not a 100% assessment, but if
> you call a function called "validate" in atomic_commit, you're very, very
> likely breaking atomic. _All_ validation must happen in ->atomic_check;
> if that's not the case, TEST_ONLY mode is broken. And atomic userspace is
> relying on that working.

This function is not really named correctly. What it does is build a
context and validate at the same time. In commit we simply care that it
builds the context. Validate should never fail here, since this was
already validated in atomic_check.

We call the same function from atomic_check:

    amdgpu_dm_atomic_check ->
        dc_validate_resources ->
            core_dc->res_pool->funcs->validate_with_context

As for the rest, I hear you and appreciate your feedback. Let me get back
to you on that later.

Thanks,
Harry

> [snip]
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-08 14:33 ` Harry Wentland @ 2016-12-08 15:34 ` Daniel Vetter [not found] ` <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> 0 siblings, 1 reply; 66+ messages in thread From: Daniel Vetter @ 2016-12-08 15:34 UTC (permalink / raw) To: Harry Wentland Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx, dri-devel, Deucher, Alexander On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote: > Hi Daniel, > > just a quick clarification in-line about "validation" inside atomic_commit. > > On 2016-12-08 04:59 AM, Daniel Vetter wrote: > > Hi Harry, > > > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: > > > We propose to use the Display Core (DC) driver for display support on > > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to > > > avoid a flag day the plan is to only support uGPU initially and transition > > > to older ASICs gradually. > > > > > > The DC component has received extensive testing within AMD for DCE8, 10, and > > > 11 GPUs and is being prepared for uGPU. Support should be better than > > > amdgpu's current display support. > > > > > > * All of our QA effort is focused on DC > > > * All of our CQE effort is focused on DC > > > * All of our OEM preloads and custom engagements use DC > > > * DC behavior mirrors what we do for other OSes > > > > > > The new asic utilizes a completely re-designed atom interface, so we cannot > > > easily leverage much of the existing atom-based code. > > > > > > We've introduced DC to the community earlier in 2016 and received a fair > > > amount of feedback. Some of what we've addressed so far are: > > > > > > * Self-contain ASIC specific code. We did a bunch of work to pull > > > common sequences into dc/dce and leave ASIC specific code in > > > separate folders. > > > * Started to expose AUX and I2C through generic kernel/drm > > > functionality and are mostly using that. 
Some of that code is still > > > needlessly convoluted. This cleanup is in progress. > > > * Integrated Dave and Jerome’s work on removing abstraction in bios > > > parser. > > > * Retire adapter service and asic capability > > > * Remove some abstraction in GPIO > > > > > > Since a lot of our code is shared with pre- and post-silicon validation > > > suites changes need to be done gradually to prevent breakages due to a major > > > flag day. This, coupled with adding support for new asics and lots of new > > > feature introductions means progress has not been as quick as we would have > > > liked. We have made a lot of progress none the less. > > > > > > The remaining concerns that were brought up during the last review that we > > > are working on addressing: > > > > > > * Continue to cleanup and reduce the abstractions in DC where it > > > makes sense. > > > * Removing duplicate code in I2C and AUX as we transition to using the > > > DRM core interfaces. We can't fully transition until we've helped > > > fill in the gaps in the drm core that we need for certain features. > > > * Making sure Atomic API support is correct. Some of the semantics of > > > the Atomic API were not particularly clear when we started this, > > > however, that is improving a lot as the core drm documentation > > > improves. Getting this code upstream and in the hands of more > > > atomic users will further help us identify and rectify any gaps we > > > have. > > > > > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup > > > work on DC is public. We're currently transitioning to a public patch > > > review. You can follow our progress on the amd-gfx mailing list. We value > > > community feedback on our work. > > > > > > As an appendix I've included a brief overview of how the code currently > > > works to make understanding and reviewing the code easier. 
> > > > > > Prior discussions on DC: > > > > > > * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html > > > * > > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html > > > > > > Current version of DC: > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > Once Alex pulls in the latest patches: > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > Best Regards, > > > Harry > > > > > > > > > ************************************************ > > > *** Appendix: A Day in the Life of a Modeset *** > > > ************************************************ > > > > > > Below is a high-level overview of a modeset with dc. Some of this might be a > > > little out-of-date since it's based on my XDC presentation but it should be > > > more-or-less the same. > > > > > > amdgpu_dm_atomic_commit() > > > { > > > /* setup atomic state */ > > > drm_atomic_helper_prepare_planes(dev, state); > > > drm_atomic_helper_swap_state(dev, state); > > > drm_atomic_helper_update_legacy_modeset_state(dev, state); > > > > > > /* create or remove targets */ > > > > > > /******************************************************************** > > > * *** Call into DC to commit targets with list of all known targets > > > ********************************************************************/ > > > /* DC is optimized not to do anything if 'targets' didn't change. 
*/ > > > dc_commit_targets(dm->dc, commit_targets, commit_targets_count) > > > { > > > /****************************************************************** > > > * *** Build context (function also used for validation) > > > ******************************************************************/ > > > result = core_dc->res_pool->funcs->validate_with_context( > > > core_dc,set,target_count,context); > > > > I can't dig into details of DC, so this is not a 100% assessment, but if > > you call a function called "validate" in atomic_commit, you're very, very > > likely breaking atomic. _All_ validation must happen in ->atomic_check, > > if that's not the case TEST_ONLY mode is broken. And atomic userspace is > > relying on that working. > > > > This function is not really named correctly. What it does is it builds a > context and validates at the same time. In commit we simply care that it > builds the context. Validate should never fail here (since this was already > validated in atomic_check). > > We call the same function at atomic_check > > amdgpu_dm_atomic_check -> > dc_validate_resources -> > core_dc->res_pool->funcs->validate_with_context Ah right, iirc you told me this the last time around too ;-) I guess a great example for what I mean with rolling your own world: Existing atomic drivers put their derived/computed/validated check into their subclasses state structures, which means they don't need to be re-computed in atomic_check. It also makes sure that the validation code/state computation code between check and commit doesn't get out of sync. > As for the rest, I hear you and appreciate your feedback. Let me get back to > you on that later. Just an added note on that: I do think that there's some driver teams who've managed to pull a shared codebase between validation and upstream linux (iirc some of the intel wireless drivers work like that). 
But it requires careful aligning of everything, and with something fast-moving like drm it might become real painful and not really worth it. So not outright rejecting DC (and the code sharing you want to achieve with it) as an idea here. -Daniel > > Thanks, > Harry > > > > The only thing that you're allowed to return from ->atomic_commit is > > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues. > > Kerneldoc explains this. > > > > Now the reason I bring this up (and we've discussed it at length in > > private) is that DC still suffers from a massive abstraction midlayer. A > > lot of the back-end stuff (dp aux, i2c, abstractions for allocation, > > timers, irq, ...) have been cleaned up, but the midlayer is still there. > > And I understand why you have it, and why it's there - without some OS > > abstraction your grand plan of a unified driver across everything doesn't > > work out so well. > > > > But in a way the backend stuff isn't such a big deal. It's annoying since > > lots of code, and bugfixes have to be duplicated and all that, but it's > > fairly easy to fix case-by-case, and as long as AMD folks stick around > > (which I fully expect) not a maintenance issue. It makes it harder for > > others to contribute, but then since it's mostly the leaf it's generally > > easy to just improve the part you want to change (as an outsider). And if > > you want to improve shared code the only downside is that you can't also > > improve amd, but that's not so much a problem for non-amd folks ;-) > > > > The problem otoh with the abstraction layer between drm core and the amd > > driver is that you can't ignore it if you want to refactor shared code. And > > because it's an entire world of its own, it's much harder to understand > > what the driver is doing (without reading it all). 
Some examples of what I > > mean: > > > > - All other drm drivers subclass drm objects (by embedding them) into the > > corresponding hw part that most closely matches the drm object's > > semantics. That means even when you have 0 clue about how a given piece > > of hw works, you have a reasonable chance of understanding code. If it's > > all your own stuff you always have to keep in mind the special amd > > naming conventions. That gets old real fast if you're trying to figure out > > what 20+ (or are we at 30 already?) drivers are doing. > > > > - This is even more true for atomic. Atomic has a pretty complicated > > check/commit transactional model for updating display state. It's a > > standardized interface, and it's extensible, and we want generic > > userspace to be able to run on any driver. Fairly often we realize that > > semantics of existing or newly proposed properties and state isn't > > well-defined enough, and then we need to go&read all the drivers and > > figure out how to fix up the mess. DC has its entirely separate state > > structures which again don't subclass the atomic core structures (afaik > > at least). Again the same problems apply that you can't find things, and > > that figuring out the exact semantics and spotting differences in > > behaviour is almost impossible. > > > > - The trouble isn't just in reading code and understanding it correctly, > > it's also in finding it. If you have your own completely different world > > then just finding the right code is hard - cscope and grep fail to work. > > > > - Another issue is that very often we unify semantics in drivers by adding > > some new helpers that at least dtrt for most of the drivers. 
If you have > > your own world then the impedance mismatch will make sure that amd > > drivers will have slightly different semantics, and I think that's not > > good for the ecosystem and kms - people want to run a lot more than just > > a boot splash with generic kms userspace, stuff like xf86-video-$vendor > > is going out of favour heavily. > > > > Note that all this isn't about amd walking away and leaving an > > unmaintainable mess behind. Like I've said I don't think this is a big > > risk. The trouble is that having your own world makes it harder for > > everyone else to understand the amd driver, and understanding all drivers > > is very often step 1 in some big refactoring or feature addition effort. > > Because starting to refactor without understanding the problem generally > > doesn't work ;_) And you can't make this step 1 easier for others by > > promising to always maintain DC and update it to all the core changes, > > because that's only step 2. > > > > In all the DC discussions we've had thus far I haven't seen anyone address > > this issue. And this isn't just an issue in drm, it's pretty much > > established across all linux subsystems with the "no midlayer or OS > > abstraction layers in drivers" rule. There's some real solid reasons why > > such a HAL is extremely unpopular with upstream. And I haven't yet seen > > any good reason why amd needs to be different, thus far it looks like a > > textbook case, and there's been lots of vendors in lots of subsystems who > > tried to push their HAL. 
> > > > Thanks, Daniel > > > > > > > > /****************************************************************** > > > * *** Apply safe power state > > > ******************************************************************/ > > > pplib_apply_safe_state(core_dc); > > > > > > /**************************************************************** > > > * *** Apply the context to HW (program HW) > > > ****************************************************************/ > > > result = core_dc->hwss.apply_ctx_to_hw(core_dc,context) > > > { > > > /* reset pipes that need reprogramming */ > > > /* disable pipe power gating */ > > > /* set safe watermarks */ > > > > > > /* for all pipes with an attached stream */ > > > /************************************************************ > > > * *** Programming all per-pipe contexts > > > ************************************************************/ > > > status = apply_single_controller_ctx_to_hw(...) > > > { > > > pipe_ctx->tg->funcs->set_blank(...); > > > pipe_ctx->clock_source->funcs->program_pix_clk(...); > > > pipe_ctx->tg->funcs->program_timing(...); > > > pipe_ctx->mi->funcs->allocate_mem_input(...); > > > pipe_ctx->tg->funcs->enable_crtc(...); > > > bios_parser_crtc_source_select(...); > > > > > > pipe_ctx->opp->funcs->opp_set_dyn_expansion(...); > > > pipe_ctx->opp->funcs->opp_program_fmt(...); > > > > > > stream->sink->link->link_enc->funcs->setup(...); > > > pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...); > > > pipe_ctx->tg->funcs->set_blank_color(...); > > > > > > core_link_enable_stream(pipe_ctx); > > > unblank_stream(pipe_ctx, > > > > > > program_scaler(dc, pipe_ctx); > > > } > > > /* program audio for all pipes */ > > > /* update watermarks */ > > > } > > > > > > program_timing_sync(core_dc, context); > > > /* for all targets */ > > > target_enable_memory_requests(...); > > > > > > /* Update ASIC power states */ > > > pplib_apply_display_requirements(...); > > > > > > /* update surface or page flip */ > > > } > > > } > > 
> > > > > > > _______________________________________________ > > > dri-devel mailing list > > > dri-devel@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[parent not found: <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> @ 2016-12-08 15:41 ` Christian König 2016-12-08 15:46 ` Daniel Vetter 2016-12-08 20:24 ` Matthew Macy 2016-12-08 17:40 ` Alex Deucher 1 sibling, 2 replies; 66+ messages in thread From: Christian König @ 2016-12-08 15:41 UTC (permalink / raw) To: Daniel Vetter, Harry Wentland Cc: Deucher, Alexander, Grodzovsky, Andrey, Cheng, Tony, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Am 08.12.2016 um 16:34 schrieb Daniel Vetter: > On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote: >> Hi Daniel, >> >> just a quick clarification in-line about "validation" inside atomic_commit. >> >> On 2016-12-08 04:59 AM, Daniel Vetter wrote: >>> Hi Harry, >>> >>> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: >>>> We propose to use the Display Core (DC) driver for display support on >>>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to >>>> avoid a flag day the plan is to only support uGPU initially and transition >>>> to older ASICs gradually. >>>> >>>> The DC component has received extensive testing within AMD for DCE8, 10, and >>>> 11 GPUs and is being prepared for uGPU. Support should be better than >>>> amdgpu's current display support. >>>> >>>> * All of our QA effort is focused on DC >>>> * All of our CQE effort is focused on DC >>>> * All of our OEM preloads and custom engagements use DC >>>> * DC behavior mirrors what we do for other OSes >>>> >>>> The new asic utilizes a completely re-designed atom interface, so we cannot >>>> easily leverage much of the existing atom-based code. >>>> >>>> We've introduced DC to the community earlier in 2016 and received a fair >>>> amount of feedback. Some of what we've addressed so far are: >>>> >>>> * Self-contain ASIC specific code. 
We did a bunch of work to pull >>>> common sequences into dc/dce and leave ASIC specific code in >>>> separate folders. >>>> * Started to expose AUX and I2C through generic kernel/drm >>>> functionality and are mostly using that. Some of that code is still >>>> needlessly convoluted. This cleanup is in progress. >>>> * Integrated Dave and Jerome’s work on removing abstraction in bios >>>> parser. >>>> * Retire adapter service and asic capability >>>> * Remove some abstraction in GPIO >>>> >>>> Since a lot of our code is shared with pre- and post-silicon validation >>>> suites changes need to be done gradually to prevent breakages due to a major >>>> flag day. This, coupled with adding support for new asics and lots of new >>>> feature introductions means progress has not been as quick as we would have >>>> liked. We have made a lot of progress none the less. >>>> >>>> The remaining concerns that were brought up during the last review that we >>>> are working on addressing: >>>> >>>> * Continue to cleanup and reduce the abstractions in DC where it >>>> makes sense. >>>> * Removing duplicate code in I2C and AUX as we transition to using the >>>> DRM core interfaces. We can't fully transition until we've helped >>>> fill in the gaps in the drm core that we need for certain features. >>>> * Making sure Atomic API support is correct. Some of the semantics of >>>> the Atomic API were not particularly clear when we started this, >>>> however, that is improving a lot as the core drm documentation >>>> improves. Getting this code upstream and in the hands of more >>>> atomic users will further help us identify and rectify any gaps we >>>> have. >>>> >>>> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup >>>> work on DC is public. We're currently transitioning to a public patch >>>> review. You can follow our progress on the amd-gfx mailing list. We value >>>> community feedback on our work. 
>>>> >>>> As an appendix I've included a brief overview of how the code currently >>>> works to make understanding and reviewing the code easier. >>>> >>>> Prior discussions on DC: >>>> >>>> * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html >>>> * >>>> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html >>>> >>>> Current version of DC: >>>> >>>> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 >>>> >>>> Once Alex pulls in the latest patches: >>>> >>>> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 >>>> >>>> Best Regards, >>>> Harry >>>> >>>> >>>> ************************************************ >>>> *** Appendix: A Day in the Life of a Modeset *** >>>> ************************************************ >>>> >>>> Below is a high-level overview of a modeset with dc. Some of this might be a >>>> little out-of-date since it's based on my XDC presentation but it should be >>>> more-or-less the same. >>>> >>>> amdgpu_dm_atomic_commit() >>>> { >>>> /* setup atomic state */ >>>> drm_atomic_helper_prepare_planes(dev, state); >>>> drm_atomic_helper_swap_state(dev, state); >>>> drm_atomic_helper_update_legacy_modeset_state(dev, state); >>>> >>>> /* create or remove targets */ >>>> >>>> /******************************************************************** >>>> * *** Call into DC to commit targets with list of all known targets >>>> ********************************************************************/ >>>> /* DC is optimized not to do anything if 'targets' didn't change. 
*/ >>>> dc_commit_targets(dm->dc, commit_targets, commit_targets_count) >>>> { >>>> /****************************************************************** >>>> * *** Build context (function also used for validation) >>>> ******************************************************************/ >>>> result = core_dc->res_pool->funcs->validate_with_context( >>>> core_dc,set,target_count,context); >>> I can't dig into details of DC, so this is not a 100% assessment, but if >>> you call a function called "validate" in atomic_commit, you're very, very >>> likely breaking atomic. _All_ validation must happen in ->atomic_check, >>> if that's not the case TEST_ONLY mode is broken. And atomic userspace is >>> relying on that working. >>> >> This function is not really named correctly. What it does is it builds a >> context and validates at the same time. In commit we simply care that it >> builds the context. Validate should never fail here (since this was already >> validated in atomic_check). >> >> We call the same function at atomic_check >> >> amdgpu_dm_atomic_check -> >> dc_validate_resources -> >> core_dc->res_pool->funcs->validate_with_context > Ah right, iirc you told me this the last time around too ;-) I guess a > great example for what I mean with rolling your own world: Existing atomic > drivers put their derived/computed/validated check into their subclasses > state structures, which means they don't need to be re-computed in > atomic_check. It also makes sure that the validation code/state > computation code between check and commit doesn't get out of sync. > >> As for the rest, I hear you and appreciate your feedback. Let me get back to >> you on that later. > Just an added note on that: I do think that there's some driver teams > who've managed to pull a shared codebase between validation and upstream > linux (iirc some of the intel wireless drivers work like that). 
But it > requires careful aligning of everything, and with something fast-moving > like drm it might become real painful and not really worth it. So not > outright rejecting DC (and the code sharing you want to achieve with it) > as an idea here. I used to have examples of such things for other network drivers as well, but right now I can't find them offhand. Leave me a note if you need more info on existing things. A good idea might as well be to take a look at drivers shared between Linux and BSD as well, because both code bases are usually publicly available and you can see what changes during porting and what stays the same. Regards, Christian. > -Daniel > >> Thanks, >> Harry >> >> >>> The only thing that you're allowed to return from ->atomic_commit is >>> out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues. >>> Kerneldoc explains this. >>> >>> Now the reason I bring this up (and we've discussed it at length in >>> private) is that DC still suffers from a massive abstraction midlayer. A >>> lot of the back-end stuff (dp aux, i2c, abstractions for allocation, >>> timers, irq, ...) have been cleaned up, but the midlayer is still there. >>> And I understand why you have it, and why it's there - without some OS >>> abstraction your grand plan of a unified driver across everything doesn't >>> work out so well. >>> >>> But in a way the backend stuff isn't such a big deal. It's annoying since >>> lots of code, and bugfixes have to be duplicated and all that, but it's >>> fairly easy to fix case-by-case, and as long as AMD folks stick around >>> (which I fully expect) not a maintenance issue. It makes it harder for >>> others to contribute, but then since it's mostly the leaf it's generally >>> easy to just improve the part you want to change (as an outsider). 
And if >>> you want to improve shared code the only downside is that you can't also >>> improve amd, but that's not so much a problem for non-amd folks ;-) >>> >>> The problem otoh with the abstraction layer between drm core and the amd >>> driver is that you can't ignore it if you want to refactor shared code. And >>> because it's an entire world of its own, it's much harder to understand >>> what the driver is doing (without reading it all). Some examples of what I >>> mean: >>> >>> - All other drm drivers subclass drm objects (by embedding them) into the >>> corresponding hw part that most closely matches the drm object's >>> semantics. That means even when you have 0 clue about how a given piece >>> of hw works, you have a reasonable chance of understanding code. If it's >>> all your own stuff you always have to keep in mind the special amd >>> naming conventions. That gets old real fast if you're trying to figure out >>> what 20+ (or are we at 30 already?) drivers are doing. >>> >>> - This is even more true for atomic. Atomic has a pretty complicated >>> check/commit transactional model for updating display state. It's a >>> standardized interface, and it's extensible, and we want generic >>> userspace to be able to run on any driver. Fairly often we realize that >>> semantics of existing or newly proposed properties and state isn't >>> well-defined enough, and then we need to go&read all the drivers and >>> figure out how to fix up the mess. DC has its entirely separate state >>> structures which again don't subclass the atomic core structures (afaik >>> at least). Again the same problems apply that you can't find things, and >>> that figuring out the exact semantics and spotting differences in >>> behaviour is almost impossible. >>> >>> - The trouble isn't just in reading code and understanding it correctly, >>> it's also in finding it. If you have your own completely different world >>> then just finding the right code is hard - cscope and grep fail to work. 
>>> >>> - Another issue is that very often we unify semantics in drivers by adding >>> some new helpers that at least dtrt for most of the drivers. If you have >>> your own world then the impedance mismatch will make sure that amd >>> drivers will have slightly different semantics, and I think that's not >>> good for the ecosystem and kms - people want to run a lot more than just >>> a boot splash with generic kms userspace, stuff like xf86-video-$vendor >>> is going out of favour heavily. >>> >>> Note that all this isn't about amd walking away and leaving an >>> unmaintainable mess behind. Like I've said I don't think this is a big >>> risk. The trouble is that having your own world makes it harder for >>> everyone else to understand the amd driver, and understanding all drivers >>> is very often step 1 in some big refactoring or feature addition effort. >>> Because starting to refactor without understanding the problem generally >>> doesn't work ;_) And you can't make this step 1 easier for others by >>> promising to always maintain DC and update it to all the core changes, >>> because that's only step 2. >>> >>> In all the DC discussions we've had thus far I haven't seen anyone address >>> this issue. And this isn't just an issue in drm, it's pretty much >>> established across all linux subsystems with the "no midlayer or OS >>> abstraction layers in drivers" rule. There's some real solid reasons why >>> such a HAL is extremely unpopular with upstream. And I haven't yet seen >>> any good reason why amd needs to be different, thus far it looks like a >>> textbook case, and there's been lots of vendors in lots of subsystems who >>> tried to push their HAL. 
>>> >>> Thanks, Daniel >>> >>>> /****************************************************************** >>>> * *** Apply safe power state >>>> ******************************************************************/ >>>> pplib_apply_safe_state(core_dc); >>>> >>>> /**************************************************************** >>>> * *** Apply the context to HW (program HW) >>>> ****************************************************************/ >>>> result = core_dc->hwss.apply_ctx_to_hw(core_dc,context) >>>> { >>>> /* reset pipes that need reprogramming */ >>>> /* disable pipe power gating */ >>>> /* set safe watermarks */ >>>> >>>> /* for all pipes with an attached stream */ >>>> /************************************************************ >>>> * *** Programming all per-pipe contexts >>>> ************************************************************/ >>>> status = apply_single_controller_ctx_to_hw(...) >>>> { >>>> pipe_ctx->tg->funcs->set_blank(...); >>>> pipe_ctx->clock_source->funcs->program_pix_clk(...); >>>> pipe_ctx->tg->funcs->program_timing(...); >>>> pipe_ctx->mi->funcs->allocate_mem_input(...); >>>> pipe_ctx->tg->funcs->enable_crtc(...); >>>> bios_parser_crtc_source_select(...); >>>> >>>> pipe_ctx->opp->funcs->opp_set_dyn_expansion(...); >>>> pipe_ctx->opp->funcs->opp_program_fmt(...); >>>> >>>> stream->sink->link->link_enc->funcs->setup(...); >>>> pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...); >>>> pipe_ctx->tg->funcs->set_blank_color(...); >>>> >>>> core_link_enable_stream(pipe_ctx); >>>> unblank_stream(pipe_ctx, >>>> >>>> program_scaler(dc, pipe_ctx); >>>> } >>>> /* program audio for all pipes */ >>>> /* update watermarks */ >>>> } >>>> >>>> program_timing_sync(core_dc, context); >>>> /* for all targets */ >>>> target_enable_memory_requests(...); >>>> >>>> /* Update ASIC power states */ >>>> pplib_apply_display_requirements(...); >>>> >>>> /* update surface or page flip */ >>>> } >>>> } >>>> >>>> >>>> _______________________________________________ 
>>>> dri-devel mailing list >>>> dri-devel@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-08 15:41 ` Christian König @ 2016-12-08 15:46 ` Daniel Vetter 2016-12-08 20:24 ` Matthew Macy 1 sibling, 0 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-08 15:46 UTC (permalink / raw) To: Christian König Cc: Grodzovsky, Andrey, dri-devel, amd-gfx, Deucher, Alexander, Cheng, Tony On Thu, Dec 08, 2016 at 04:41:52PM +0100, Christian König wrote: > Am 08.12.2016 um 16:34 schrieb Daniel Vetter: > > On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote: > > > Hi Daniel, > > > > > > just a quick clarification in-line about "validation" inside atomic_commit. > > > > > > On 2016-12-08 04:59 AM, Daniel Vetter wrote: > > > > Hi Harry, > > > > > > > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: > > > > > We propose to use the Display Core (DC) driver for display support on > > > > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to > > > > > avoid a flag day the plan is to only support uGPU initially and transition > > > > > to older ASICs gradually. > > > > > > > > > > The DC component has received extensive testing within AMD for DCE8, 10, and > > > > > 11 GPUs and is being prepared for uGPU. Support should be better than > > > > > amdgpu's current display support. > > > > > > > > > > * All of our QA effort is focused on DC > > > > > * All of our CQE effort is focused on DC > > > > > * All of our OEM preloads and custom engagements use DC > > > > > * DC behavior mirrors what we do for other OSes > > > > > > > > > > The new asic utilizes a completely re-designed atom interface, so we cannot > > > > > easily leverage much of the existing atom-based code. > > > > > > > > > > We've introduced DC to the community earlier in 2016 and received a fair > > > > > amount of feedback. Some of what we've addressed so far are: > > > > > > > > > > * Self-contain ASIC specific code. 
We did a bunch of work to pull > > > > > common sequences into dc/dce and leave ASIC specific code in > > > > > separate folders. > > > > > * Started to expose AUX and I2C through generic kernel/drm > > > > > functionality and are mostly using that. Some of that code is still > > > > > needlessly convoluted. This cleanup is in progress. > > > > > * Integrated Dave and Jerome’s work on removing abstraction in bios > > > > > parser. > > > > > * Retire adapter service and asic capability > > > > > * Remove some abstraction in GPIO > > > > > > > > > > Since a lot of our code is shared with pre- and post-silicon validation > > > > > suites changes need to be done gradually to prevent breakages due to a major > > > > > flag day. This, coupled with adding support for new asics and lots of new > > > > > feature introductions means progress has not been as quick as we would have > > > > > liked. We have made a lot of progress none the less. > > > > > > > > > > The remaining concerns that were brought up during the last review that we > > > > > are working on addressing: > > > > > > > > > > * Continue to cleanup and reduce the abstractions in DC where it > > > > > makes sense. > > > > > * Removing duplicate code in I2C and AUX as we transition to using the > > > > > DRM core interfaces. We can't fully transition until we've helped > > > > > fill in the gaps in the drm core that we need for certain features. > > > > > * Making sure Atomic API support is correct. Some of the semantics of > > > > > the Atomic API were not particularly clear when we started this, > > > > > however, that is improving a lot as the core drm documentation > > > > > improves. Getting this code upstream and in the hands of more > > > > > atomic users will further help us identify and rectify any gaps we > > > > > have. > > > > > > > > > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup > > > > > work on DC is public. 
We're currently transitioning to a public patch > > > > > review. You can follow our progress on the amd-gfx mailing list. We value > > > > > community feedback on our work. > > > > > > > > > > As an appendix I've included a brief overview of the how the code currently > > > > > works to make understanding and reviewing the code easier. > > > > > > > > > > Prior discussions on DC: > > > > > > > > > > * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html > > > > > * > > > > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html > > > > > > > > > > Current version of DC: > > > > > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > > > > > Once Alex pulls in the latest patches: > > > > > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > > > > > Best Regards, > > > > > Harry > > > > > > > > > > > > > > > ************************************************ > > > > > *** Appendix: A Day in the Life of a Modeset *** > > > > > ************************************************ > > > > > > > > > > Below is a high-level overview of a modeset with dc. Some of this might be a > > > > > little out-of-date since it's based on my XDC presentation but it should be > > > > > more-or-less the same. 
> > > > > > > > > > amdgpu_dm_atomic_commit() > > > > > { > > > > > /* setup atomic state */ > > > > > drm_atomic_helper_prepare_planes(dev, state); > > > > > drm_atomic_helper_swap_state(dev, state); > > > > > drm_atomic_helper_update_legacy_modeset_state(dev, state); > > > > > > > > > > /* create or remove targets */ > > > > > > > > > > /******************************************************************** > > > > > * *** Call into DC to commit targets with list of all known targets > > > > > ********************************************************************/ > > > > > /* DC is optimized not to do anything if 'targets' didn't change. */ > > > > > dc_commit_targets(dm->dc, commit_targets, commit_targets_count) > > > > > { > > > > > /****************************************************************** > > > > > * *** Build context (function also used for validation) > > > > > ******************************************************************/ > > > > > result = core_dc->res_pool->funcs->validate_with_context( > > > > > core_dc,set,target_count,context); > > > > I can't dig into details of DC, so this is not a 100% assessment, but if > > > > you call a function called "validate" in atomic_commit, you're very, very > > > > likely breaking atomic. _All_ validation must happen in ->atomic_check, > > > > if that's not the case TEST_ONLY mode is broken. And atomic userspace is > > > > relying on that working. > > > > > > > This function is not really named correctly. What it does is it builds a > > > context and validates at the same time. In commit we simply care that it > > > builds the context. Validate should never fail here (since this was already > > > validated in atomic_check). 
> > > > > > We call the same function at atomic_check > > > amdgpu_dm_atomic_check -> > > > dc_validate_resources -> > > > core_dc->res_pool->funcs->validate_with_context > > Ah right, iirc you told me this the last time around too ;-) I guess a > > great example for what I mean with rolling your own world: Existing atomic > > drivers put their derived/computed/validated state into their subclassed > > state structures, which means they don't need to be re-computed in > > atomic_commit. It also makes sure that the validation code/state > > computation code between check and commit doesn't get out of sync. > > > > > As for the rest, I hear you and appreciate your feedback. Let me get back to > > > you on that later. > > Just an added note on that: I do think that there's some driver teams > > who've managed to pull a shared codebase between validation and upstream > > linux (iirc some of the intel wireless drivers work like that). But it > > requires careful aligning of everything, and with something fast-moving > > like drm it might become real painful and not really worth it. So not > > outright rejecting DC (and the code sharing you want to achieve with it) > > as an idea here. > > I used to have examples of such things for other network drivers as well, > but right now I can't find them offhand. Leave me a note if you need more > info on existing things. > > A good idea might as well be to take a look at drivers shared between Linux > and BSD as well, because both code bases are usually publicly available and you > can see what changes during porting and what stays the same. bsd and linux might not be a good example anymore, at least in the gfx space - upstream linux has so massively outpaced bsd kernels that they stopped porting and switched over to implement a shim in the bsd drm subsystem to fully emulate the linux interfaces.
I think on the networking and storage side things are a bit better aligned still, and not quite moving as fast, to make a more native approach on each OS feasible. -Daniel > > Regards, > Christian. > > > -Daniel > > > > > Thanks, > > > Harry > > > > > > > > > > The only thing that you're allowed to return from ->atomic_commit is > > > > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues. > > > > Kerneldoc explains this. > > > > > > > > Now the reason I bring this up (and we've discussed it at length in > > > > private) is that DC still suffers from a massive abstraction midlayer. A > > > > lot of the back-end stuff (dp aux, i2c, abstractions for allocation, > > > > timers, irq, ...) have been cleaned up, but the midlayer is still there. > > > > And I understand why you have it, and why it's there - without some OS > > > > abstraction your grand plan of a unified driver across everything doesn't > > > > work out so well. > > > > > > > > But in a way the backend stuff isn't such a big deal. It's annoying since > > > > lots of code, and bugfixes have to be duplicated and all that, but it's > > > > fairly easy to fix case-by-case, and as long as AMD folks stick around > > > > (which I fully expect) not a maintenance issue. It makes it harder for > > > > others to contribute, but then since it's mostly the leaf it's generally > > > > easy to just improve the part you want to change (as an outsider). And if > > > > you want to improve shared code the only downside is that you can't also > > > > improve amd, but that's not so much a problem for non-amd folks ;-) > > > > > > > > The problem otoh with the abstraction layer between drm core and the amd > > > > driver is that you can't ignore it if you want to refactor shared code. And > > > > because it's an entire world of its own, it's much harder to understand > > > > what the driver is doing (without reading it all).
Some examples of what I > > > > mean: > > > > > > > > - All other drm drivers subclass drm objects (by embedding them) into the > > > > corresponding hw part that most closely matches the drm object's > > > > semantics. That means even when you have 0 clue about how a given piece > > > > of hw works, you have a reasonable chance of understanding code. If it's > > > > all your own stuff you always have to keep in mind the special amd > > > > naming conventions. That gets old real fast if you're trying to figure out > > > > what 20+ (or are we at 30 already?) drivers are doing. > > > > > > > > - This is even more true for atomic. Atomic has a pretty complicated > > > > check/commit transactional model for updating display state. It's a > > > > standardized interface, and it's extensible, and we want generic > > > > userspace to be able to run on any driver. Fairly often we realize that > > > > semantics of existing or newly proposed properties and state isn't > > > > well-defined enough, and then we need to go&read all the drivers and > > > > figure out how to fix up the mess. DC has its entirely separate state > > > > structures which again don't subclass the atomic core structures (afaik > > > > at least). Again the same problems apply that you can't find things, and > > > > that figuring out the exact semantics and spotting differences in > > > > behaviour is almost impossible. > > > > > > > > - The trouble isn't just in reading code and understanding it correctly, > > > > it's also in finding it. If you have your own completely different world > > > > then just finding the right code is hard - cscope and grep fail to work.
If you have > > > > your own world then the impedance mismatch will make sure that amd > > > > drivers will have slightly different semantics, and I think that's not > > > > good for the ecosystem and kms - people want to run a lot more than just > > > > a boot splash with generic kms userspace, stuff like xf86-video-$vendor > > > > is going out of favour heavily. > > > > > > > > Note that all this isn't about amd walking away and leaving an > > > > unmaintainable mess behind. Like I've said I don't think this is a big > > > > risk. The trouble is that having your own world makes it harder for > > > > everyone else to understand the amd driver, and understanding all drivers > > > > is very often step 1 in some big refactoring or feature addition effort. > > > > Because starting to refactor without understanding the problem generally > > > > doesn't work ;-) And you can't make this step 1 easier for others by > > > > promising to always maintain DC and update it to all the core changes, > > > > because that's only step 2. > > > > > > > > In all the DC discussions we've had thus far I haven't seen anyone address > > > > this issue. And this isn't just an issue in drm, it's pretty much > > > > established across all linux subsystems with the "no midlayer or OS > > > > abstraction layers in drivers" rule. There are some real solid reasons why > > > > such a HAL is extremely unpopular with upstream. And I haven't yet seen > > > > any good reason why amd needs to be different, thus far it looks like a > > > > textbook case, and there's been lots of vendors in lots of subsystems who > > > > tried to push their HAL.
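The "subclass drm objects by embedding them" pattern Daniel describes can be sketched in miniature. This is a hedged illustration only: `struct drm_crtc` here is a stand-in, `container_of()` is a local copy of the kernel macro, and the `myhw_*` names are hypothetical, not real amdgpu or DC symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the kernel's container_of() idiom, for illustration. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct drm_crtc {            /* stand-in for the drm core object */
    int index;
};

/* A hypothetical driver subclasses the core object by embedding it,
 * so drm core code can pass around plain struct drm_crtc pointers
 * while the driver recovers its private data with container_of(). */
struct myhw_crtc {
    struct drm_crtc base;    /* embedded drm core object, first member */
    int pipe_id;             /* hw-specific state */
};

/* The core hands the driver a struct drm_crtc *; the driver upcasts. */
static int myhw_crtc_pipe(struct drm_crtc *crtc)
{
    struct myhw_crtc *mc = container_of(crtc, struct myhw_crtc, base);
    return mc->pipe_id;
}
```

Because every upstream driver uses this same idiom, a reader who knows `container_of()` can navigate any driver's CRTC code without learning a vendor-specific object model.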
> > > > > > > > Thanks, Daniel > > > > > > > > > /****************************************************************** > > > > > * *** Apply safe power state > > > > > ******************************************************************/ > > > > > pplib_apply_safe_state(core_dc); > > > > > > > > > > /**************************************************************** > > > > > * *** Apply the context to HW (program HW) > > > > > ****************************************************************/ > > > > > result = core_dc->hwss.apply_ctx_to_hw(core_dc,context) > > > > > { > > > > > /* reset pipes that need reprogramming */ > > > > > /* disable pipe power gating */ > > > > > /* set safe watermarks */ > > > > > > > > > > /* for all pipes with an attached stream */ > > > > > /************************************************************ > > > > > * *** Programming all per-pipe contexts > > > > > ************************************************************/ > > > > > status = apply_single_controller_ctx_to_hw(...) 
> > > > > { > > > > > pipe_ctx->tg->funcs->set_blank(...); > > > > > pipe_ctx->clock_source->funcs->program_pix_clk(...); > > > > > pipe_ctx->tg->funcs->program_timing(...); > > > > > pipe_ctx->mi->funcs->allocate_mem_input(...); > > > > > pipe_ctx->tg->funcs->enable_crtc(...); > > > > > bios_parser_crtc_source_select(...); > > > > > > > > > > pipe_ctx->opp->funcs->opp_set_dyn_expansion(...); > > > > > pipe_ctx->opp->funcs->opp_program_fmt(...); > > > > > > > > > > stream->sink->link->link_enc->funcs->setup(...); > > > > > pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...); > > > > > pipe_ctx->tg->funcs->set_blank_color(...); > > > > > > > > > > core_link_enable_stream(pipe_ctx); > > > > > unblank_stream(pipe_ctx, > > > > > > > > > > program_scaler(dc, pipe_ctx); > > > > > } > > > > > /* program audio for all pipes */ > > > > > /* update watermarks */ > > > > > } > > > > > > > > > > program_timing_sync(core_dc, context); > > > > > /* for all targets */ > > > > > target_enable_memory_requests(...); > > > > > > > > > > /* Update ASIC power states */ > > > > > pplib_apply_display_requirements(...); > > > > > > > > > > /* update surface or page flip */ > > > > > } > > > > > } > > > > > > > > > > > > > > > _______________________________________________ > > > > > dri-devel mailing list > > > > > dri-devel@lists.freedesktop.org > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
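Daniel's suggestion to Harry — compute and validate derived values once in atomic_check, store them in a subclassed state structure, and have commit merely program them — can be sketched as follows. This is a toy model with stand-in structs and hypothetical names and limits, not the real DRM or DC API.

```c
#include <assert.h>
#include <errno.h>

struct drm_crtc_state {       /* stand-in for the drm core state */
    int active;
};

/* Hypothetical driver state, subclassing the core state by embedding. */
struct myhw_crtc_state {
    struct drm_crtc_state base;
    int dispclk_khz;          /* derived in check, only consumed in commit */
};

/* atomic_check: the one place allowed to reject a configuration.
 * It validates and stores the derived clock in the subclassed state. */
static int myhw_atomic_check(struct myhw_crtc_state *s, int pixclk_khz)
{
    if (pixclk_khz > 600000)  /* hypothetical hw limit */
        return -EINVAL;
    s->dispclk_khz = pixclk_khz + pixclk_khz / 10; /* 10% margin */
    return 0;
}

/* atomic_commit: no re-validation, just program what check computed. */
static int myhw_atomic_commit(const struct myhw_crtc_state *s)
{
    return s->dispclk_khz;    /* stands in for register programming */
}
```

With this split there is no "validate" call left in the commit path at all, so check and commit cannot drift apart.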
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-08 15:41 ` Christian König 2016-12-08 15:46 ` Daniel Vetter @ 2016-12-08 20:24 ` Matthew Macy 1 sibling, 0 replies; 66+ messages in thread From: Matthew Macy @ 2016-12-08 20:24 UTC (permalink / raw) To: "Christian König"; +Cc: Deucher, Alexander, amd-gfx, dri-devel computation code between check and commit doesn't get out of sync. > > > >> As for the rest, I hear you and appreciate your feedback. Let me get back to > >> you on that later. > > Just an added note on that: I do think that there's some driver teams > > who've managed to pull a shared codebase between validation and upstream > > linux (iirc some of the intel wireless drivers work like that). But it > > requires careful aligning of everything, and with something fast-moving > > like drm it might become real painful and not really worth it. So not > > outright rejecting DC (and the code sharing you want to achieve with it) > > as an idea here. > > I used to have examples of such things for other network drivers as > well, but right now I can't find them offhand. Leave me a note if you > need more info on existing things. > > A good idea might as well be to take a look at drivers shared between > Linux and BSD as well, because both code bases are usually publicly > available and you can see what changes during porting and what stays the > same. Although their core drivers are tightly coupled with a given OS, the Chelsio 10GigE and Intel ethernet drivers in general have large amounts of platform-agnostic code coupled with a fairly minimal OS abstraction layer. I don't know how analogous to DAL/DC this is. However, I will say that the Chelsio driver was an order of magnitude easier to port to FreeBSD, and the end result much better than Solarflare's, which felt obliged to not have any separation of concerns.
-M _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161208153417.yrpbhmot5gfv37lo-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> 2016-12-08 15:41 ` Christian König @ 2016-12-08 17:40 ` Alex Deucher 1 sibling, 0 replies; 66+ messages in thread From: Alex Deucher @ 2016-12-08 17:40 UTC (permalink / raw) To: Daniel Vetter Cc: Grodzovsky, Andrey, Harry Wentland, Maling list - DRI developers, amd-gfx list, Deucher, Alexander, Cheng, Tony On Thu, Dec 8, 2016 at 10:34 AM, Daniel Vetter <daniel@ffwll.ch> wrote: > On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote: >> Hi Daniel, >> >> just a quick clarification in-line about "validation" inside atomic_commit. >> >> On 2016-12-08 04:59 AM, Daniel Vetter wrote: >> > Hi Harry, >> > >> > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: >> > > We propose to use the Display Core (DC) driver for display support on >> > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to >> > > avoid a flag day the plan is to only support uGPU initially and transition >> > > to older ASICs gradually. >> > > >> > > The DC component has received extensive testing within AMD for DCE8, 10, and >> > > 11 GPUs and is being prepared for uGPU. Support should be better than >> > > amdgpu's current display support. >> > > >> > > * All of our QA effort is focused on DC >> > > * All of our CQE effort is focused on DC >> > > * All of our OEM preloads and custom engagements use DC >> > > * DC behavior mirrors what we do for other OSes >> > > >> > > The new asic utilizes a completely re-designed atom interface, so we cannot >> > > easily leverage much of the existing atom-based code. >> > > >> > > We've introduced DC to the community earlier in 2016 and received a fair >> > > amount of feedback. Some of what we've addressed so far are: >> > > >> > > * Self-contain ASIC specific code. 
We did a bunch of work to pull >> > > common sequences into dc/dce and leave ASIC specific code in >> > > separate folders. >> > > * Started to expose AUX and I2C through generic kernel/drm >> > > functionality and are mostly using that. Some of that code is still >> > > needlessly convoluted. This cleanup is in progress. >> > > * Integrated Dave and Jerome’s work on removing abstraction in bios >> > > parser. >> > > * Retire adapter service and asic capability >> > > * Remove some abstraction in GPIO >> > > >> > > Since a lot of our code is shared with pre- and post-silicon validation >> > > suites changes need to be done gradually to prevent breakages due to a major >> > > flag day. This, coupled with adding support for new asics and lots of new >> > > feature introductions means progress has not been as quick as we would have >> > > liked. We have made a lot of progress none the less. >> > > >> > > The remaining concerns that were brought up during the last review that we >> > > are working on addressing: >> > > >> > > * Continue to cleanup and reduce the abstractions in DC where it >> > > makes sense. >> > > * Removing duplicate code in I2C and AUX as we transition to using the >> > > DRM core interfaces. We can't fully transition until we've helped >> > > fill in the gaps in the drm core that we need for certain features. >> > > * Making sure Atomic API support is correct. Some of the semantics of >> > > the Atomic API were not particularly clear when we started this, >> > > however, that is improving a lot as the core drm documentation >> > > improves. Getting this code upstream and in the hands of more >> > > atomic users will further help us identify and rectify any gaps we >> > > have. >> > > >> > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup >> > > work on DC is public. We're currently transitioning to a public patch >> > > review. You can follow our progress on the amd-gfx mailing list. 
We value >> > > community feedback on our work. >> > > >> > > As an appendix I've included a brief overview of the how the code currently >> > > works to make understanding and reviewing the code easier. >> > > >> > > Prior discussions on DC: >> > > >> > > * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html >> > > * >> > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html >> > > >> > > Current version of DC: >> > > >> > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 >> > > >> > > Once Alex pulls in the latest patches: >> > > >> > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 >> > > >> > > Best Regards, >> > > Harry >> > > >> > > >> > > ************************************************ >> > > *** Appendix: A Day in the Life of a Modeset *** >> > > ************************************************ >> > > >> > > Below is a high-level overview of a modeset with dc. Some of this might be a >> > > little out-of-date since it's based on my XDC presentation but it should be >> > > more-or-less the same. >> > > >> > > amdgpu_dm_atomic_commit() >> > > { >> > > /* setup atomic state */ >> > > drm_atomic_helper_prepare_planes(dev, state); >> > > drm_atomic_helper_swap_state(dev, state); >> > > drm_atomic_helper_update_legacy_modeset_state(dev, state); >> > > >> > > /* create or remove targets */ >> > > >> > > /******************************************************************** >> > > * *** Call into DC to commit targets with list of all known targets >> > > ********************************************************************/ >> > > /* DC is optimized not to do anything if 'targets' didn't change. 
*/ >> > > dc_commit_targets(dm->dc, commit_targets, commit_targets_count) >> > > { >> > > /****************************************************************** >> > > * *** Build context (function also used for validation) >> > > ******************************************************************/ >> > > result = core_dc->res_pool->funcs->validate_with_context( >> > > core_dc,set,target_count,context); >> > >> > I can't dig into details of DC, so this is not a 100% assessment, but if >> > you call a function called "validate" in atomic_commit, you're very, very >> > likely breaking atomic. _All_ validation must happen in ->atomic_check, >> > if that's not the case TEST_ONLY mode is broken. And atomic userspace is >> > relying on that working. >> > >> >> This function is not really named correctly. What it does is it builds a >> context and validates at the same time. In commit we simply care that it >> builds the context. Validate should never fail here (since this was already >> validated in atomic_check). >> >> We call the same function at atomic_check >> >> amdgpu_dm_atomic_check -> >> dc_validate_resources -> >> core_dc->res_pool->funcs->validate_with_context > > Ah right, iirc you told me this the last time around too ;-) I guess a > great example for what I mean with rolling your own world: Existing atomic > drivers put their derived/computed/validated check into their subclasses > state structures, which means they don't need to be re-computed in > atomic_check. It also makes sure that the validation code/state > computation code between check and commit doesn't get out of sync. > >> As for the rest, I hear you and appreciate your feedback. Let me get back to >> you on that later. > > Just an added note on that: I do think that there's some driver teams > who've managed to pull a shared codebase between validation and upstream > linux (iirc some of the intel wireless drivers work like that). 
But it > requires careful aligning of everything, and with something fast-moving > like drm it might become real painful and not really worth it. So not > outright rejecting DC (and the code sharing you want to achieve with it) > as an idea here. I think we have to make it work. We don't have the resources to have separate validation and Linux core teams. It's not just the coding. Much of our validation and compliance testing on Linux leverages this as well. From our perspective, I think the pain is probably worth it at this point. Display is starting to eclipse other blocks as far as complexity. Not even just the complexity of lighting up complex topologies. The really tough stuff is that display is basically a real-time service and the hw is designed with very little margin for error with respect to timing and bandwidth. That's where much of the value comes from sharing resources with validation teams. For us, that makes the potential pain of dealing with fast moving drm worth it. This is not to say that we won't adopt more use of drm infrastructure, we are working on it within the bounds of our resource constraints. Alex _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
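The TEST_ONLY contract quoted throughout this thread — all validation in the check phase, so userspace can probe a configuration without touching hardware — can be modeled in a few lines. The flag name below mimics the real `DRM_MODE_ATOMIC_TEST_ONLY`, but everything else is a self-contained toy, not the kernel's actual ioctl implementation.

```c
#include <assert.h>

/* Toy model of the atomic ioctl's TEST_ONLY contract: validation always
 * runs, but with the flag set the hardware is never touched. */
#define MY_ATOMIC_TEST_ONLY 0x1   /* models DRM_MODE_ATOMIC_TEST_ONLY */

struct my_state { int pixclk_khz; };
struct my_hw    { int programmed_khz; };

static int my_check(const struct my_state *st)
{
    return st->pixclk_khz <= 600000 ? 0 : -1;  /* hypothetical limit */
}

static int my_atomic_ioctl(struct my_hw *hw, const struct my_state *st,
                           unsigned int flags)
{
    int ret = my_check(st);            /* all validation happens here */
    if (ret)
        return ret;
    if (flags & MY_ATOMIC_TEST_ONLY)
        return 0;                      /* report success, touch nothing */
    hw->programmed_khz = st->pixclk_khz;  /* commit path cannot fail */
    return 0;
}
```

A driver that defers any validation to its commit path breaks this contract, because a state that passed the TEST_ONLY probe could still fail when actually committed.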
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161208095952.hnbfs4b3nac7faap-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> 2016-12-08 14:33 ` Harry Wentland @ 2016-12-08 20:07 ` Dave Airlie [not found] ` <CAPM=9tw=OLirgVU1RVxfPZ1PV64qtjOPTJ2q540=9VJhF4o2RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 66+ messages in thread From: Dave Airlie @ 2016-12-08 20:07 UTC (permalink / raw) To: Daniel Vetter Cc: Grodzovsky, Andrey, Harry Wentland, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony > I can't dig into details of DC, so this is not a 100% assessment, but if > you call a function called "validate" in atomic_commit, you're very, very > likely breaking atomic. _All_ validation must happen in ->atomic_check, > if that's not the case TEST_ONLY mode is broken. And atomic userspace is > relying on that working. > > The only thing that you're allowed to return from ->atomic_commit is > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues. > Kerneldoc expklains this. > > Now the reason I bring this up (and we've discussed it at length in > private) is that DC still suffers from a massive abstraction midlayer. A > lot of the back-end stuff (dp aux, i2c, abstractions for allocation, > timers, irq, ...) have been cleaned up, but the midlayer is still there. > And I understand why you have it, and why it's there - without some OS > abstraction your grand plan of a unified driver across everything doesn't > work out so well. > > But in a way the backend stuff isn't such a big deal. It's annoying since > lots of code, and bugfixes have to be duplicated and all that, but it's > fairly easy to fix case-by-case, and as long as AMD folks stick around > (which I fully expect) not a maintainance issue. It makes it harder for > others to contribute, but then since it's mostly the leaf it's generally > easy to just improve the part you want to change (as an outsider). 
And if > you want to improve shared code the only downside is that you can't also > improve amd, but that's not so much a problem for non-amd folks ;-) > > The problem otoh with the abstraction layer between drm core and the amd > driver is that you can't ignore if you want to refactor shared code. And > because it's an entire world of its own, it's much harder to understand > what the driver is doing (without reading it all). Some examples of what I > mean: > > - All other drm drivers subclass drm objects (by embedding them) into the > corresponding hw part that most closely matches the drm object's > semantics. That means even when you have 0 clue about how a given piece > of hw works, you have a reasonable chance of understanding code. If it's > all your own stuff you always have to keep in minde the special amd > naming conventions. That gets old real fast if you trying to figure out > what 20+ (or are we at 30 already?) drivers are doing. > > - This is even more true for atomic. Atomic has a pretty complicated > check/commmit transactional model for updating display state. It's a > standardized interface, and it's extensible, and we want generic > userspace to be able to run on any driver. Fairly often we realize that > semantics of existing or newly proposed properties and state isn't > well-defined enough, and then we need to go&read all the drivers and > figure out how to fix up the mess. DC has it's entirely separate state > structures which again don't subclass the atomic core structures (afaik > at least). Again the same problems apply that you can't find things, and > that figuring out the exact semantics and spotting differences in > behaviour is almost impossible. > > - The trouble isn't just in reading code and understanding it correctly, > it's also in finding it. If you have your own completely different world > then just finding the right code is hard - cscope and grep fail to work. 
> > - Another issue is that very often we unify semantics in drivers by adding > some new helpers that at least dtrt for most of the drivers. If you have > your own world then the impedance mismatch will make sure that amd > drivers will have slightly different semantics, and I think that's not > good for the ecosystem and kms - people want to run a lot more than just > a boot splash with generic kms userspace, stuff like xf86-video-$vendor > is going out of favour heavily. > > Note that all this isn't about amd walking away and leaving an > unmaintainable mess behind. Like I've said I don't think this is a big > risk. The trouble is that having your own world makes it harder for > everyone else to understand the amd driver, and understanding all drivers > is very often step 1 in some big refactoring or feature addition effort. > Because starting to refactor without understanding the problem generally > doesn't work ;-) And you can't make this step 1 easier for others by > promising to always maintain DC and update it to all the core changes, > because that's only step 2. > > In all the DC discussions we've had thus far I haven't seen anyone address > this issue. And this isn't just an issue in drm, it's pretty much > established across all linux subsystems with the "no midlayer or OS > abstraction layers in drivers" rule. There are some real solid reasons why > such a HAL is extremely unpopular with upstream. And I haven't yet seen > any good reason why amd needs to be different, thus far it looks like a > textbook case, and there's been lots of vendors in lots of subsystems who > tried to push their HAL. Daniel has said this all very nicely; I'm going to try and be a bit more direct, because apparently I've possibly been too subtle up until now. No HALs. We don't do HALs in the kernel. We might do midlayers sometimes; we try not to do midlayers. In the DRM we don't do either unless the maintainers are asleep.
They might be worth the effort for AMD, however for the Linux kernel they don't provide a benefit and make maintaining the code a lot harder. I've maintained this code base for over 10 years now and I'd like to think I've only merged something for semi-political reasons once (initial exynos was still more Linuxy than DC), and that thing took a lot of time to cleanup, I really don't feel like saying yes again. Given the choice between maintaining Linus' trust that I won't merge 100,000 lines of abstracted HAL code and merging 100,000 lines of abstracted HAL code I'll give you one guess where my loyalties lie. The reason the toplevel maintainer (me) doesn't work for Intel or AMD or any vendors, is that I can say NO when your maintainers can't or won't say it. I've only got one true power as a maintainer, and that is to say No. The other option is I personally sit down and rewrite all the code in an acceptable manner, and merge that instead. But I've discovered I probably don't scale to that level, so again it leaves me with just the one actual power. AMD can't threaten not to support new GPUs in upstream kernels without merging this, that is totally something you can do, and here's the thing Linux will survive, we'll piss off a bunch of people, but the Linux kernel will just keep on rolling forward, maybe at some point someone will get pissed about lacking upstream support for your HW and go write support and submit it, maybe they won't. The kernel is bigger than any of us and has standards about what is acceptable. Read up on the whole mac80211 problems we had years ago, where every wireless vendor wrote their own 80211 layer inside their driver, there was a lot of time spent creating a central 80211 before any of those drivers were suitable for merge, well we've spent our time creating a central modesetting infrastructure, bypassing it is taking a driver in totally the wrong direction. 
I've also wondered if the DC code is ready for being part of the kernel anyway. What happens if I merge this and some external contributor rewrites 50% of it and removes a bunch of stuff that the kernel doesn't need? By any kernel standards I'll merge that sort of change over your heads if Alex doesn't. It might mean you have to rewrite a chunk of your internal validation code, or some other interactions, but those won't be reasons to block the changes from my POV.

I'd like some serious introspection on your team's part on how you got into this situation, and how, even if I was feeling like merging this (which I'm not), you'd actually deal with being part of the Linux kernel and not hiding in a nicely framed orgchart silo behind a HAL.

I honestly don't think the code is Linux worthy code, and I also really dislike having to spend my Friday morning being negative about it, but hey at least I can have a shower now.

No.

Dave.

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: [RFC] Using DC in amdgpu for upcoming GPU
From: Dave Airlie @ 2016-12-08 23:29 UTC
To: Daniel Vetter
Cc: Grodzovsky, Andrey, Harry Wentland, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony

>
> No.
>

I'd also like to apologise for the formatting; gmail is great for typing, crap for editing. So I've thought about it a bit more, and Daniel mentioned something useful I thought I should add.

Merging this code involves not just maintaining a trust relationship with Linus, but also a trust relationship with the Linux graphics community and other drm contributors. There have been countless requests from various companies and contributors to merge unsavoury things over the years, and we've denied them. They've all had the same reasons for why they couldn't do what we want and why we were wrong, but lots of people have shown up who do get what we are about, and have joined the community and contributed drivers that conform to the standards. Turning around now and saying, well, AMD ignored our directions so we'll give them a free pass, even though we've denied everyone else the same thing over time, would undermine that trust.

If I'd given in and merged every vendor-coded driver as-is we'd never have progressed to having atomic modesetting; there would have been too many vendor HALs and abstractions that would have blocked forward progression. Merging one HAL or abstraction is going to cause pain, but setting a precedent to merge more would be just downright stupid maintainership.

Here's the thing: we want AMD to join the graphics community, not hang out inside the company in silos.
We need to enable FreeSync on Linux? Go ask the community how best to do it; don't shove it inside the driver hidden in a special ioctl. Got some new HDMI features that are secret? Talk to other people in the same position and work out a plan for moving forward. At the moment there is no engaging with the Linux stack because you aren't really using it. As long as you hide behind the abstraction there won't be much engagement, and neither side benefits, so why should we merge the code if nobody benefits?

The platform problem/Windows mindset is scary and makes a lot of decisions for you. Open source doesn't have those restrictions, and I don't accept drivers that try and push those development model problems into our codebase.

Dave.
* RE: [RFC] Using DC in amdgpu for upcoming GPU
From: Cheng, Tony @ 2016-12-09 17:26 UTC
To: Dave Airlie, Daniel Vetter
Cc: Deucher, Alexander, Grodzovsky, Andrey, Wentland, Harry, amd-gfx mailing list, dri-devel

> Merging this code as well as maintaining a trust relationship with
> Linus, also maintains a trust relationship with the Linux graphics
> community and other drm contributors. There have been countless
> requests from various companies and contributors to merge unsavoury
> things over the years and we've denied them. They've all had the same
> reasons behind why they couldn't do what we want and why we were
> wrong, but lots of people have shown up who do get what we are about and
> have joined the community and contributed drivers that conform to the standards.
> Turning around now and saying well AMD ignored our directions, so
> we'll give them a free pass even though we've denied you all the same
> thing over time.

I'd like to say that I acknowledge the good and hard work the maintainers are doing. Neither you nor the community is wrong to say no. I understand where the no comes from. If somebody wants to throw 100k lines into DAL I would say no as well.

> If I'd given in and merged every vendor coded driver as-is we'd never
> have progressed to having atomic modesetting, there would have been
> too many vendor HALs and abstractions that would have blocked forward
> progression. Merging one HAL or abstraction is going to cause pain,
> but setting a precedent to merge more would be just downright stupid
> maintainership.
>
> Here's the thing, we want AMD to join the graphics community not hang
> out inside the company in silos.
> We need to enable FreeSync on Linux, go ask the community how best to
> do it, don't shove it inside the driver hidden in a special ioctl. Got
> some new HDMI features that are secret, talk to other people in the
> same position and work out a plan for moving forward. At the moment
> there is no engaging with the Linux stack because you aren't really
> using it, as long as you hide behind the abstraction there won't be
> much engagement, and neither side benefits, so why should we merge the
> code if nobody benefits?
>
> The platform problem/Windows mindset is scary and makes a lot of
> decisions for you, open source doesn't have those restrictions, and I
> don't accept drivers that try and push those development model
> problems into our codebase.

I would like to share how the platform problem/Windows mindset looks from our side. We are dealing with ever more complex hardware with the push to reduce power while driving more pixels through. It is the power reduction that is causing us driver developers most of the pain. Display is a high-bandwidth, real-time memory fetch subsystem which is always on, even when the system is idle. When the system is idle, pretty much all of the power consumption comes from display.

Can we use the existing DRM infrastructure? Definitely yes, if we talk about modes up to 300Mpix/s and leaving a lot of voltage and clock margin on the table. How hard is it to set up a timing while bypassing most of the pixel processing pipeline to light up a display? How about adding all the power optimizations, such as burst reads to fill the display cache and keep DRAM in self-refresh as much as possible? How about powering off some of the cache or pixel processing pipeline if we are not using them? We need to manage and maximize valuable resources like cache (cache == silicon area == $$) and clock (== power) and optimize memory request patterns at different memory clock speeds, while DPM is running, in real time, on the system.
This is why there is so much code to program registers, track our state, and manage resources, and it's getting more complex: HW would prefer SW to program the same value into 5 different registers in different sub-blocks to save a few cross-tile wires on silicon, and to do complex calculations to find the magical optimal settings (the hated bandwidth_calcs.c). There are a lot of registers that need to be programmed to the correct values in the right situations if we enable all these power/performance optimizations.

It's really not a problem of a Windows mindset; rather, it is a question of which platform is used for bring-up when silicon is in the lab with HW designer support. Today, no surprise, we do that almost exclusively on Windows. The display team is working hard to change that, to have Linux in the mix while we have the attention of the HW designers. We have a recent effort to try to enable all power features on Stoney (current-gen low-power APU) to match idle power on Windows after Stoney shipped. The Linux driver guys have been working hard on it for 4+ months and are still having a hard time getting over the hurdle without support from the HW designers, because the designers are tied up with the next-generation silicon currently in the lab and the rest of them have already moved on to the next-next generation.

To me, I would rather have everything built on top of DC, including HW diagnostic test suites. Even if I have to build DC on top of DRM mode setting I would prefer that over trying to do another bring-up without HW support. After all, as a driver developer, refactoring and changing code is more fun than digging through documents/email, experimenting with different combinations of register settings, and countless reboots to try to get past some random hang.

FYI, just dce_mem_input.c programs over 50 distinct register fields, and DC for the current generation ASIC doesn't yet support all features and power optimizations. This doesn't even include the more complex programming model in future generations, with HW IP getting more modular.
We are already making progress on bring-up with shared DC code for the next-gen ASIC in the lab. DC HW programming / resource management / power optimization will be fully validated on all platforms including Linux, and that will benefit the Linux driver running on AMD HW, especially in battery life.

Just in case you are wondering, the Polaris Windows driver isn't using DC and was on a "Windows architecture" code base. We understand that from the community's point of view you are not getting much feature / power benefit yet, because the CI/VI/CZ/Polaris driver with DC is only used on Linux and we don't have the manpower to make it fully optimized yet. Next gen will be performance and power optimized at launch. I acknowledge that we don't have the full feature set on Linux yet, and we still need to work with the community to amend DRM to enable FreeSync, HDR, next-gen resolutions and the other display features just made available in Crimson ReLive. However it's not realistic to engage with the community early on in these efforts: up to 1 month prior to release we were still experimenting with different solutions to make the features better, and we wouldn't have known half a year ago what we would end up building. And of course marketing wouldn't let us leak these features before the Crimson launch.

I would like to work with the community, and I think we have shown that we welcome, appreciate and take feedback seriously. There is plenty of work done in DC addressing some of the easier-to-fix problems while we have the next-gen ASIC in the lab as top priority. We are already down to 66k lines of code from 93k through refactoring and removing numerous abstractions. We can't just tear apart the "mid layer" or "HAL" overnight. Plenty of work needs to be done to understand if/how we can fit the resource optimization complexity into the existing DRM framework. If you look at the DC structures closely, we created them to plug into DRM structures (i.e.
dc_surface == FB/plane, dc_stream ~= CRTC, dc_link+dc_sink = encoder + connector), but we need a resource layer to decide how to realize the given "state" with our HW. The problem is not getting simpler: on top of multi-plane combining and shared encoder and clock resources, compression is starting to get into the display domain. By the way, the existing DRM structures do fit nicely for HW from 4 generations ago, and in the current Windows driver we do have the concepts of CRTCs, encoders, and connectors. However, over the years complexity has grown and resource management is becoming a problem, which led us to the design of putting in a resource management layer.

We might not be supporting the full range of what atomic can do, and our semantics may be different at this stage of development, but saying dc_validate breaks atomic only tells me you haven't taken a close look at our DC code. For us, all validation runs the same topology/resource algorithm in check and commit. It's not optimal yet, as today we end up running this algorithm twice on a commit, but we do intend to fix that over time. I welcome any concrete suggestions on using the existing framework to solve the resource/topology management issue. It's not too late to change DC now, but in a couple of years, after more OSes and ASICs are built on top of DC, it will be very difficult to change.

> Now the reason I bring this up (and we've discussed it at length in
> private) is that DC still suffers from a massive abstraction midlayer.
> A lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
> timers, irq, ...) have been cleaned up, but the midlayer is still there.
> And I understand why you have it, and why it's there - without some OS
> abstraction your grand plan of a unified driver across everything
> doesn't work out so well.
>
> But in a way the backend stuff isn't such a big deal.
> It's annoying
> since lots of code, and bugfixes have to be duplicated and all that,
> but it's fairly easy to fix case-by-case, and as long as AMD folks
> stick around (which I fully expect) not a maintenance issue. It makes
> it harder for others to contribute, but then since it's mostly the
> leaf it's generally easy to just improve the part you want to change
> (as an outsider). And if you want to improve shared code the only
> downside is that you can't also improve amd, but that's not so much a
> problem for non-amd folks ;-)

Unfortunately duplicating bug fixes is not trivial, and if the code bases diverge some of the fixes will be different. Surprisingly, if you track where we spend our time, < 20% is writing code. Probably 50% is trying to figure out which register needs a different value programmed in which situation. The other 30% is trying to make sure the change doesn't break other stuff in different scenarios. If the power and performance optimizations remain off in Linux then I would agree with your assessment.

> I've only got one true power as a maintainer, and that is to say No.

We AMD driver developers only have two true powers over the community: access to internal documentation and access to the HW designers. Not pulling Linux into the mix while silicon is still in the lab means we lose half of our power (HW designer support).

> I've also wondered if the DC code is ready for being part of the kernel
> anyways, what happens if I merge this, and some external
> contributor rewrites 50% of it and removes a bunch of stuff that the
> kernel doesn't need.
> I'd like some serious introspection on your team's part on
> how you got into this situation and how even if I was feeling like
> merging this (which I'm not) how you'd actually deal with being part
> of the Linux kernel and not hiding in nicely framed orgchart silo
> behind a HAL.

We have come a long way compared to how Windows-centric we used to be, and I am sure there is plenty of work remaining for us to be ready to be part of the kernel. If the community has a clever and clean solution that doesn't break our ASICs, we'll take it internally with open arms. We merged Dave and Jerome's clean-up on removing abstractions, and we had lots of patches following Dave and Jerome's lead in different areas.

Again, this is not about the orgchart. It's about what's validated when samples are in the lab.

God, I miss the days when everything was plugged into the wall and dual-link DVI was cutting edge. At least then most of our problems could be solved by diffing register dumps between the good and bad cases.

Tony
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-09 17:26 ` Cheng, Tony @ 2016-12-09 19:59 ` Daniel Vetter [not found] ` <CAKMK7uGDUBHZKNEZTdOi2_66vKZmCsc+ViM0UyTdRPfnYa-Zww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 66+ messages in thread From: Daniel Vetter @ 2016-12-09 19:59 UTC (permalink / raw) To: Cheng, Tony Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander I guess things went a bit sideways by me and Dave only talking about the midlayer, so let me first state that the DC stuff has massively improved through replacing all the backend services that reimplemented Linux helper libraries with their native equivalent. That's some serious work, and it shows that AMD is committed to doing the right thing. I absolutely didn't want to belittle all that effort by only raising what I see is the one holdover left. On Fri, Dec 9, 2016 at 6:26 PM, Cheng, Tony <Tony.Cheng@amd.com> wrote: >> Merging this code as well as maintaining a trust relationship with >> Linus, also maintains a trust relationship with the Linux graphics >> community and other drm contributors. There have been countless >> requests from various companies and contributors to merge unsavoury >> things over the years and we've denied them. They've all had the same >> reasons behind why they couldn't do what we want and why we were >> wrong, but lots of people have shown up who do get what we are at and >> have joined the community and contributed drivers that conform to the standards. >> Turning around now and saying well AMD ignored our directions, so >> we'll give them a free pass even though we've denied you all the same >> thing over time. > > I'd like to say that I acknowledge the good and hard work maintainers are doing. You nor the community is wrong to say no. I understand where the no comes from. If somebody wants to throw 100k lines into DAL I would say no as well. 
> >> If I'd given in and merged every vendor coded driver as-is we'd never >> have progressed to having atomic modesetting, there would have been >> too many vendor HALs and abstractions that would have blocked forward >> progression. Merging one HAL or abstraction is going to cause pain, >> but setting a precedent to merge more would be just downright stupid >> maintainership. > >> Here's the thing, we want AMD to join the graphics community not hang >> out inside the company in silos. We need to enable FreeSync on Linux, >> go ask the community how would be best to do it, don't shove it inside >> the driver hidden in a special ioctl. Got some new HDMI features that >> are secret, talk to other ppl in the same position and work out a plan >> for moving forward. At the moment there is no engaging with the Linux >> stack because you aren't really using it, as long as you hide behind >> the abstraction there won't be much engagement, and neither side >> benefits, so why should we merge the code if nobody benefits? > > >> The platform problem/Windows mindset is scary and makes a lot of >> decisions for you, open source doesn't have those restrictions, and I >> don't accept drivers that try and push those development model >> problems into our codebase. > > I would like to share how platform problem/Windows mindset look from our side. We are dealing with ever more complex hardware with the push to reduce power while driving more pixels through. It is the power reduction that is causing us driver developers most of the pain. Display is a high bandwidth real time memory fetch sub system which is always on, even when the system is idle. When the system is idle, pretty much all of power consumption comes from display. Can we use existing DRM infrastructure? Definitely yes, if we talk about modes up to 300Mpix/s and leaving a lot of voltage and clock margin on the table. 
How hard is it to set up a timing while bypass most of the pixel processing pipeline to light up a display? How about adding all the power optimization such as burst read to fill display cache and keep DRAM in self-refresh as much as possible? How about powering off some of the cache or pixel processing pipeline if we are not using them? We need to manage and maximize valuable resources like cache (cache == silicon area == $$) and clock (== power) and optimize memory request patterns at different memory clock speeds, while DPM is going, in real time on the system. This is why there is so much code to program registers, track our states, and manages resources, and it's getting more complex as HW would prefer SW program the same value into 5 different registers in different sub blocks to save a few cross tile wires on silicon and do complex calculations to find the magical optimal settings (the hated bandwidth_cals.c). There are a lot of registers need to be programmed to correct values in the right situation if we enable all these power/performance optimizations. > > It's really not a problem of windows mindset, rather is what is the bring up platform when silicon is in the lab with HW designer support. Today no surprise we do that almost exclusively on windows. Display team is working hard to change that to have linux in the mix while we have the attention from HW designers. We have a recent effort to try to enable all power features on Stoney (current gen low power APU) to match idle power on windows after Stoney shipped. Linux driver guys working hard on it for 4+ month and still having hard time getting over the hurdle without support from HW designers because designers are tied up with the next generation silicon currently in the lab and the rest of them already moved onto next next generation. To me I would rather have everything built on top of DC, including HW diagnostic test suites. 
Even if I have to build DC on top of DRM mode setting I would prefer that over trying to do another bring up without HW support. After all as driver developer refactoring and changing code is more fun than digging through documents/email and experimenting with different combination of settings in register and countless of reboots to try get pass some random hang. > > FYI, just dce_mem_input.c programs over 50 distinct register fields, and DC for current generation ASIC doesn't yet support all features and power optimizations. This doesn't even include more complex programming model in future generation with HW IP getting more modular. We are already making progress with bring up with shared DC code for next gen ASIC in the lab. DC HW programming / resource management / power optimization will be fully validated on all platforms including Linux and that will benefit the Linux driver running on AMD HW, especially in battery life. > > Just in case you are wondering Polaris windows driver isn't using DC and was on a "windows architecture" code base. We understand that from community point of view you are not getting much feature / power benefit yet because CI/VI/CZ/Polaris Linux driver with DC is only used in Linux and we don’t have the man power to make it fully optimized yet. Next gen will be performance and power optimized at launch. I acknowledge that we don't have full feature on Linux yet and we still need to work with community to amend DRM to enable FreeSync, HDR, next gen resolution and other display feature just made available in Crimson ReLive. However it's not realistic to engage with community early on in these efforts, as up to 1 month prior to release we were still experimenting with different solutions to make the feature better and we wouldn't have known what we end up building half year ago. And of course marketing wouldn't let us leak these features before Crimson launch. This is something you need to fix, or it'll stay completely painful forever. 
It's hard work and takes years, but here at Intel we pulled it off. We can upstream everything from a _very_ early stage (can't tell you how early). And we have full marketing approval for that. If you watch the i915 commit stream you can see how our code is chasing updates from the hw engineers debugging things. > I would like to work with the community and I think we have shown that we welcome, appreciate and take feedback seriously. There is plenty of work done in DC addressing some of the easier to fix problems while we have next gen ASIC in the lab as top priority. We are already down to 66k lines of code from 93k through refactoring and remove numerous abstractions. We can't just tear apart the "mid layer" or "HAL" over night. Plenty of work need to be done to understand if/how we can fit resource optimization complexity into existing DRM framework. If you look at DC structure closely, we created them to plug into DRM structures (ie. dc_surface == FB/plane, dc_stream ~= CRTC, dc_link+dc_sink = encoder + connector), but we need a resource layer to decide how to realize the given "state" with our HW. The problem is not getting simpler as on top of multi-plane combine, shared encoders and clock resources, compression is starting to get into display domain. By the way, existing DRM structure do fit nicely for HW of 4 generations ago, and with current windows driver we do have concept of crtc, encoders, connector. However over the years complexity has grown and resource management is becoming a problem, which led us to design of putting in a resource management layer. We might not be supporting full range of what atomic can do and our semantics may be different at this stage of development, but saying dc_validate breaks atomic only tells me you haven't take a close look at our DC code. For us all validation runs same topology/resource algorithm in check and commit. 
> It's not optimal yet as we will end up doing this algorithm twice today
> on a commit but we do intend to fix it over time. I welcome any concrete
> suggestions on using existing framework to solve the resource/topology
> management issue. It's not too late to change DC now but after couple
> year after more OS and ASICs are built on top of DC it will be very
> difficult to change.

I guess I assumed too much that "midlayer" is a known thing. No one's asking AMD to throw all the platform DC code away. No one's asking you to rewrite the bandwidth calculations, clock tuning and all the code that requires tons of work at power-on to get right. Asking for that would be beyond silly. The disagreement is purely about how all that code interfaces with the DRM subsystem, and how exactly it implements the userspace ABI.

As you've noticed, the DRM objects don't really fit well for today's hardware any more, but because they're Linux userspace ABI we can't ever change them (well, at least not easily) and will be stuck for another few years or maybe even decades with them. Which means _every_ driver has to deal with an impedance mismatch.

The question now is how you deal with that impedance mismatch. The industry practice has been to insert an abstraction layer to isolate your own code as much as possible. The linux best practice is essentially an inversion of control, where you write a bit of linux-specific glue which drives the bits and pieces that are part of your backend much more directly. And the reason why the abstraction layer isn't popular in linux is that it makes cross-vendor collaboration much more painful, and unnecessarily so. Me misreading your atomic code is a pretty good example - of course you understand it and can see that I missed things, it's your codebase. But as someone who reads drm drivers all day long, stumbling over a driver where things work completely differently means I'm just lost, and it's much harder to understand things.
And upstream does optimize for cross-vendor collaboration. But none of that means you need to throw away your entire backend. It only means that the interface should try to be understandable to people who don't look at dal/dc all day long. So if you say above that e.g. dc_surface ~ drm_plane, then the expectation is that dc_surface embeds (subclassing in OO speak) drm_plane. Yes there will be some mismatches, and there are code patterns and support in atomic to handle them, but it makes it much easier for outsiders to understand vendor code. And this also doesn't mean that your backend code needs to deal with drm_planes all the time; it will still deal with dc_surface. The only thing that kinda changes is that if you want to keep cross-vendor support you might need some abstraction so that your shared code doesn't heavily depend upon the drm_plane layout.

Similar for resource optimization and state handling in atomic: there's a very clear pattern that all DRM drivers follow, which is massively extensible (you're not the only ones supporting shared resources, and not the only vendor where the simple plane->crtc->encoder pipeline has about 10 different components and IP blocks in reality). And by following that pattern (and again you can store whatever you want in your own private dc_surface_state) it becomes really easy for others to quickly check a few things in your driver, and I wouldn't have made the mistake of not realizing that you do validate the state in atomic_check.

And personally I don't believe that designing things this way round will result in more unshared code between different platforms. The Intel i915 driver is being reused (not on Windows, but on a bunch of other, more fringe OSes), and the people doing that don't seem to terribly struggle with it.

>> Now the reason I bring this up (and we've discussed it at length in
>> private) is that DC still suffers from a massive abstraction midlayer.
>> A lot of the back-end stuff (dp aux, i2c, abstractions for allocation, >> timers, irq, ...) have been cleaned up, but the midlayer is still there. >> And I understand why you have it, and why it's there - without some OS >> abstraction your grand plan of a unified driver across everything >> doesn't work out so well. >> >> But in a way the backend stuff isn't such a big deal. It's annoying >> since lots of code, and bugfixes have to be duplicated and all that, >> but it's fairly easy to fix case-by-case, and as long as AMD folks >> stick around (which I fully expect) not a maintainance issue. It makes >> it harder for others to contribute, but then since it's mostly the >> leaf it's generally easy to just improve the part you want to change >> (as an outsider). And if you want to improve shared code the only >> downside is that you can't also improve amd, but that's not so much a >> problem for non-amd folks ;-) > > Unfortunately duplicating bug fixes is not trivial and if code base diverge some of the fixes will be different. Surprisingly if you track where we spend our time, < 20% is writing code. Probably 50% is trying to figure out which register need a different value programmed in those situations. The other 30% is trying to make sure the change doesn’t break other stuff in different scenarios. If power and performance optimizations remains off in Linux then I would agree with your assessment. > >> I've only got one true power as a maintainer, and that is to say No. > > We AMD driver developer only got 2 true power over community, and that is having access to internal documentation and HW designers. Not pulling Linux into the mix while silicon is still in the lab means we lose half of our power (HW designer support). > >> I've also wondered if the DC code is ready for being part of the kernel >> anyways, what happens if I merge this, and some external >> contributor rewrites 50% of it and removes a bunch of stuff that the >> kernel doesn't need. 
By any kernel standards I'll merge that sort of >> change over your heads if Alex doesn't, it might mean you have to >> rewrite a chunk of your internal validation code, or some other >> interactions, but those won't be reasons to block the changes from >> my POV. I'd like some serious introspection on your team's part on >> how you got into this situation and how even if I was feeling like >> merging this (which I'm not) how you'd actually deal with being part >> of the Linux kernel and not hiding in nicely framed orgchart silo >> behind a HAL. > > We have come a long way compared to how we used to be windows centric, and I am sure there is plenty of work remaining for us to be ready to be part of the kernel. If the community has a clever and clean solution that doesn’t break our ASICs we’ll take it internally with open arms. We merged Dave and Jerome’s clean up on removing abstractions and we had lots of patches following Dave and Jerome’s lead in different areas. > > Again this is not about orgchart. It’s about what’s validated when samples are in the lab. > > God I miss the days when everything was plugged into the wall and dual link DVI was cutting edge. At least most of our problems can be solved by diffing register dumps between the good and bad case. Yeah, nothing different to what we suffer/experience here at Intel ;-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <CAKMK7uGDUBHZKNEZTdOi2_66vKZmCsc+ViM0UyTdRPfnYa-Zww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-12-09 20:34 ` Dave Airlie 2016-12-09 20:38 ` Daniel Vetter ` (2 more replies) 0 siblings, 3 replies; 66+ messages in thread From: Dave Airlie @ 2016-12-09 20:34 UTC (permalink / raw) To: Daniel Vetter Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel, Deucher, Alexander, Wentland, Harry On 10 December 2016 at 05:59, Daniel Vetter <daniel@ffwll.ch> wrote: > I guess things went a bit sideways by me and Dave only talking about > the midlayer, so let me first state that the DC stuff has massively > improved through replacing all the backend services that reimplemented > Linux helper libraries with their native equivalent. That's some > serious work, and it shows that AMD is committed to doing the right > thing. > > I absolutely didn't want to belittle all that effort by only raising > what I see is the one holdover left. I see myself and Daniel have kinda fallen into good-cop, bad-cop mode. I agree with everything Daniel had said in here, and come next week I might try and write something more constructive up, but believe me Daniel is totally right! It's Saturday morning, I've got a weekend to deal with and I'm going to try and avoid thinking too much about this. I actually love bandwidth_calcs.c I'd like to merge it even before DAL, yes it's ugly code, and it's horrible but it's a single piece of hw team magic, and we can hide that. It's the sw abstraction magic that is my issue. Dave. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-09 20:34 ` Dave Airlie @ 2016-12-09 20:38 ` Daniel Vetter 2016-12-10 0:29 ` Matthew Macy 2016-12-11 12:34 ` Daniel Vetter 2 siblings, 0 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-09 20:38 UTC (permalink / raw) To: Dave Airlie Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel, Deucher, Alexander On Fri, Dec 9, 2016 at 9:34 PM, Dave Airlie <airlied@gmail.com> wrote: > I actually love bandwidth_calcs.c I'd like to merge it even before DAL, yes > it's ugly code, and it's horrible but it's a single piece of hw team magic, and > we can hide that. It's the sw abstraction magic that is my issue. If anyone wants an example, look at the original vlv pll computation code. A lot smaller but about 8 levels of indent, one function with no structure, local variables i, j, k, l, m, o ... with no explanation, but it was the Word of God (aka hw engineers) and that's why we merged it. Later on we had to rewrite it because in the conversion from the excel formula to C hw engineers forgot that u32 truncates differently than the floating point excel uses ;-) -Daniel
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-09 20:34 ` Dave Airlie 2016-12-09 20:38 ` Daniel Vetter @ 2016-12-10 0:29 ` Matthew Macy 2016-12-11 12:34 ` Daniel Vetter 2 siblings, 0 replies; 66+ messages in thread From: Matthew Macy @ 2016-12-10 0:29 UTC (permalink / raw) To: Dave Airlie; +Cc: Deucher, Alexander, dri-devel, amd-gfx mailing list ---- On Fri, 09 Dec 2016 12:34:17 -0800 Dave Airlie <airlied@gmail.com> wrote ---- > On 10 December 2016 at 05:59, Daniel Vetter <daniel@ffwll.ch> wrote: > > I guess things went a bit sideways by me and Dave only talking about > > the midlayer, so let me first state that the DC stuff has massively > > improved through replacing all the backend services that reimplemented > > Linux helper libraries with their native equivalent. That's some > > serious work, and it shows that AMD is committed to doing the right > > thing. > > > > I absolutely didn't want to belittle all that effort by only raising > > what I see is the one holdover left. > > I see myself and Daniel have kinda fallen into good-cop, bad-cop mode. > > I agree with everything Daniel had said in here, and come next week I might > try and write something more constructive up, but believe me Daniel is totally > right! It's Saturday morning, I've got a weekend to deal with and I'm going to > try and avoid thinking too much about this. > > I actually love bandwidth_calcs.c I'd like to merge it even before DAL, yes > it's ugly code, and it's horrible but it's a single piece of hw team magic, and > we can hide that. It's the sw abstraction magic that is my issue. > > Dave. David - I recognize that the maintainer role you play is critical to the success of Linux. You need to honor your responsibilities as well as maintain your rapport with Linus. In FreeBSD committers are largely siloed and no one is designated to facilitate the import of outside contributions. 
As a consequence, vendor driver developers are given commit bits and not infrequently commit near unmaintainable garbage. Academic committers commit half-baked code to meet a publishing deadline - which they subsequently abandon. And much work by non-committers never makes it in. Frequently, the self-appointed gatekeepers in the community will block work with little visible discussion or negotiation about how to meet their demands (ENOTIME being the typical response). When I talk to people outside the community about the ways in which FreeBSD most notably fell short of Linux, the one point that resonates the most is the lack of a clear path or transparency in upstreaming contributions. As maintainer your responsibility is first and foremost to the long term health of Linux - not being popular with contributors. That said, as a prospective AMD shareholder I have a few observations to make about what they should do. First of all, by any measures AMD graphics profit margins are razor thin. Even when their products have been clearly superior to Nvidia's, consumers have, as a group, held off and paid more for Nvidia's. See the following youtube video if you're curious as to just how poorly they have fared in mindshare: http://bit.ly/1J7020P I have no doubt that they lack the resources to support Linux at the same level as Windows without large amounts of code sharing. I was under the impression that their ROC compute stack would be near ready for mainline this summer. It's now clear that, at best, it won't happen any sooner than next summer. As a downstream consumer of Alex's code on Linux and FreeBSD I *hope* that AMD will do whatever it takes to put their codebase on par with Windows. There are only two makers of high end GPUs and only one of them is opaque and closed source.
However, as a prospective shareholder who is under the impression that almost none of their income comes from Linux users - if they need to have a fully native Linux driver I think they have 3 real choices: a) Dumb down the driver. It just needs to push pixels. Admit that Nvidia has won the mindshare for anything like high end graphics on Linux. Just be good enough to run X and basic mesa demos. b) Go back to a closed source driver. Although the DRM layer churns rapidly, the underlying KPIs that it uses change very slowly. To the limited extent they need to, it's not that hard to decouple from the underlying kernel. Nvidia's seen little to no blowback for only providing binary support for a narrow set of Linux kernels. In fact - the reason I run Linux is because of the Linux only binary CUDA stack. Blobs can have some nice lock-in benefits for Linux. c) a+b - write a "good enough" driver for open source and keep a closed driver for selected large consumers. AMD's responsibility is first and foremost to its shareholders. If doing right by Linux is in conflict with that the choice is clear. It is those of us dependent on it being open source that lose the most. I think the net consequence of this will be to reinforce the dominant position of Nvidia and the marginal relevance of open source graphics outside of embedded (Intel's support for Linux is great, but it really is not in the same league as Nvidia or AMD). -M
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-09 20:34 ` Dave Airlie 2016-12-09 20:38 ` Daniel Vetter 2016-12-10 0:29 ` Matthew Macy @ 2016-12-11 12:34 ` Daniel Vetter 2 siblings, 0 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-11 12:34 UTC (permalink / raw) To: Dave Airlie Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel, Deucher, Alexander On Fri, Dec 9, 2016 at 9:34 PM, Dave Airlie <airlied@gmail.com> wrote: > On 10 December 2016 at 05:59, Daniel Vetter <daniel@ffwll.ch> wrote: >> I guess things went a bit sideways by me and Dave only talking about >> the midlayer, so let me first state that the DC stuff has massively >> improved through replacing all the backend services that reimplemented >> Linux helper libraries with their native equivalent. That's some >> serious work, and it shows that AMD is committed to doing the right >> thing. >> >> I absolutely didn't want to belittle all that effort by only raising >> what I see is the one holdover left. > > I see myself and Daniel have kinda fallen into good-cop, bad-cop mode. > > I agree with everything Daniel had said in here, and come next week I might > try and write something more constructive up, but believe me Daniel is totally > right! It's Saturday morning, I've got a weekend to deal with and I'm going to > try and avoid thinking too much about this. Yeah I'm pondering what a reasonable action plan for dc from an atomic pov is too. One issue we have is that right now the atomic docs are a bit lacking for large-scale/design issues. But I'm working on this (hopefully happens soonish, we need it for intel projects too), both pulling the original atomic design stuff from my blog into docs and beating it into shape. And also how to handle state and atomic_check/commit for when you want a state model that goes massively beyond what's there with just drm_plane/crtc/connector_state (like e.g. i915 has).
But instead of me typing this up in this thread here and then getting lost again (hopefully amdgpu/dc is not the last full-featured driver we'll get ...) I think it's better if I type this up for the drm docs and ask Harry/Tony&co for review feedback. -Daniel
* RE: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-08 23:29 ` Dave Airlie [not found] ` <CAPM=9tzqaSR3dUBV9RUmo-kQZ8VmNP=rdgiHwOBii=7A2X0Dew-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-12-09 17:56 ` Cheng, Tony 1 sibling, 0 replies; 66+ messages in thread From: Cheng, Tony @ 2016-12-09 17:56 UTC (permalink / raw) To: Dave Airlie, Daniel Vetter Cc: Deucher, Alexander, Grodzovsky, Andrey, Wentland, Harry, amd-gfx mailing list, dri-devel > Merging this code as well as maintaining a trust relationship with > Linus, also maintains a trust relationship with the Linux graphics > community and other drm contributors. There have been countless > requests from various companies and contributors to merge unsavoury > things over the years and we've denied them. They've all had the same > reasons behind why they couldn't do what we want and why we were > wrong, but lots of people have shown up who do get what we are at and > have joined the community and contributed drivers that conform to the standards. > Turning around now and saying well AMD ignored our directions, so > we'll give them a free pass even though we've denied you all the same > thing over time. I'd like to say that I acknowledge the good and hard work maintainers are doing. Neither you nor the community is wrong to say no. I understand where the no comes from. If somebody wants to throw 100k lines into DAL I would say no as well. > If I'd given in and merged every vendor coded driver as-is we'd never > have progressed to having atomic modesetting, there would have been > too many vendor HALs and abstractions that would have blocked forward > progression. Merging one HAL or abstraction is going to cause pain, > but setting a precedent to merge more would be just downright stupid > maintainership. > Here's the thing, we want AMD to join the graphics community not hang > out inside the company in silos.
We need to enable FreeSync on Linux, > go ask the community how would be best to do it, don't shove it inside > the driver hidden in a special ioctl. Got some new HDMI features that > are secret, talk to other ppl in the same position and work out a plan > for moving forward. At the moment there is no engaging with the Linux > stack because you aren't really using it, as long as you hide behind > the abstraction there won't be much engagement, and neither side > benefits, so why should we merge the code if nobody benefits? > The platform problem/Windows mindset is scary and makes a lot of > decisions for you, open source doesn't have those restrictions, and I > don't accept drivers that try and push those development model > problems into our codebase. I would like to share how the platform problem/Windows mindset looks from our side. We are dealing with ever more complex hardware with the push to reduce power while driving more pixels through. It is the power reduction that is causing us driver developers most of the pain. Display is a high bandwidth real time memory fetch sub system which is always on, even when the system is idle. When the system is idle, pretty much all of the power consumption comes from display. Can we use existing DRM infrastructure? Definitely yes, if we talk about modes up to 300Mpix/s and leaving a lot of voltage and clock margin on the table. How hard is it to set up a timing while bypassing most of the pixel processing pipeline to light up a display? How about adding all the power optimizations such as burst read to fill display cache and keep DRAM in self-refresh as much as possible? How about powering off some of the cache or pixel processing pipeline if we are not using them? We need to manage and maximize valuable resources like cache (cache == silicon area == $$) and clock (== power) and optimize memory request patterns at different memory clock speeds, while DPM is going, in real time on the system.
This is why there is so much code to program registers, track our states, and manage resources, and it's getting more complex as HW would prefer SW program the same value into 5 different registers in different sub blocks to save a few cross tile wires on silicon and do complex calculations to find the magical optimal settings (the hated bandwidth_calcs.c). There are a lot of registers that need to be programmed to correct values in the right situation if we enable all these power/performance optimizations. It's really not a problem of windows mindset, rather it is about what the bring-up platform is when silicon is in the lab with HW designer support. Today, no surprise, we do that almost exclusively on windows. The display team is working hard to change that to have linux in the mix while we have the attention from HW designers. We have a recent effort to try to enable all power features on Stoney (current gen low power APU) to match idle power on windows after Stoney shipped. Linux driver guys have been working hard on it for 4+ months and are still having a hard time getting over the hurdle without support from HW designers, because designers are tied up with the next generation silicon currently in the lab and the rest of them have already moved on to the next-next generation. To me I would rather have everything built on top of DC, including HW diagnostic test suites. Even if I have to build DC on top of DRM mode setting I would prefer that over trying to do another bring up without HW support. After all, as a driver developer, refactoring and changing code is more fun than digging through documents/email and experimenting with different combinations of settings in registers and countless reboots to try to get past some random hang. FYI, just dce_mem_input.c programs over 50 distinct register fields, and DC for the current generation ASIC doesn't yet support all features and power optimizations. This doesn't even include the more complex programming model in future generations with HW IP getting more modular.
We are already making progress with bring-up using shared DC code for the next gen ASIC in the lab. DC HW programming / resource management / power optimization will be fully validated on all platforms including Linux and that will benefit the Linux driver running on AMD HW, especially in battery life. Just in case you are wondering, the Polaris windows driver isn't using DC and was on a "windows architecture" code base. We understand that from the community's point of view you are not getting much feature / power benefit yet, because the CI/VI/CZ/Polaris Linux driver with DC is only used in Linux and we don’t have the manpower to make it fully optimized yet. Next gen will be performance and power optimized at launch. I acknowledge that we don't have full features on Linux yet and we still need to work with the community to amend DRM to enable FreeSync, HDR, next gen resolution and other display features just made available in Crimson ReLive. However it's not realistic to engage with the community early on in these efforts, as up to 1 month prior to release we were still experimenting with different solutions to make the feature better and we wouldn't have known what we would end up building half a year ago. And of course marketing wouldn't let us leak these features before the Crimson launch. I would like to work with the community and I think we have shown that we welcome, appreciate and take feedback seriously. There is plenty of work done in DC addressing some of the easier-to-fix problems while we have the next gen ASIC in the lab as top priority. We are already down to 66k lines of code from 93k through refactoring and removing numerous abstractions. We can't just tear apart the "mid layer" or "HAL" overnight. Plenty of work needs to be done to understand if/how we can fit the resource optimization complexity into the existing DRM framework. If you look at DC structure closely, we created them to plug into DRM structures (ie.
dc_surface == FB/plane, dc_stream ~= CRTC, dc_link+dc_sink = encoder + connector), but we need a resource layer to decide how to realize the given "state" with our HW. The problem is not getting simpler, as on top of multi-plane combine and shared encoders and clock resources, compression is starting to get into the display domain. By the way, existing DRM structures do fit nicely for HW of 4 generations ago, and with the current windows driver we do have the concepts of crtc, encoders, connector. However over the years complexity has grown and resource management is becoming a problem, which led us to the design of putting in a resource management layer. We might not be supporting the full range of what atomic can do and our semantics may be different at this stage of development, but saying dc_validate breaks atomic only tells me you haven't taken a close look at our DC code. For us all validation runs the same topology/resource algorithm in check and commit. It's not optimal yet, as we will end up doing this algorithm twice today on a commit, but we do intend to fix it over time. I welcome any concrete suggestions on using the existing framework to solve the resource/topology management issue. It's not too late to change DC now, but after a couple of years, once more OSes and ASICs are built on top of DC, it will be very difficult to change. > Now the reason I bring this up (and we've discussed it at length in > private) is that DC still suffers from a massive abstraction midlayer. > A lot of the back-end stuff (dp aux, i2c, abstractions for allocation, > timers, irq, ...) have been cleaned up, but the midlayer is still there. > And I understand why you have it, and why it's there - without some OS > abstraction your grand plan of a unified driver across everything > doesn't work out so well. > > But in a way the backend stuff isn't such a big deal.
It's annoying > since lots of code, and bugfixes have to be duplicated and all that, > but it's fairly easy to fix case-by-case, and as long as AMD folks > stick around (which I fully expect) not a maintenance issue. It makes > it harder for others to contribute, but then since it's mostly the > leaf it's generally easy to just improve the part you want to change > (as an outsider). And if you want to improve shared code the only > downside is that you can't also improve amd, but that's not so much a > problem for non-amd folks ;-) Unfortunately duplicating bug fixes is not trivial, and if the code bases diverge some of the fixes will be different. Surprisingly, if you track where we spend our time, < 20% is writing code. Probably 50% is trying to figure out which registers need a different value programmed in those situations. The other 30% is trying to make sure the change doesn’t break other stuff in different scenarios. If power and performance optimizations remain off in Linux then I would agree with your assessment. > I've only got one true power as a maintainer, and that is to say No. We AMD driver developers only have 2 true powers over the community, and that is having access to internal documentation and HW designers. Not pulling Linux into the mix while silicon is still in the lab means we lose half of our power (HW designer support). > I've also wondered if the DC code is ready for being part of the > kernel anyways, what happens if I merge this, and some external > contributor rewrites 50% of it and removes a bunch of stuff that the > kernel doesn't need. By any kernel standards I'll merge that sort of > change over your heads if Alex doesn't, it might mean you have to > rewrite a chunk of your internal validation code, or some other > interactions, but those won't be reasons to block the changes from my > POV.
I'd like some serious introspection on your team's part on how > you got into this situation and how even if I was feeling like merging > this (which I'm not) how you'd actually deal with being part of the > Linux kernel and not hiding in nicely framed orgchart silo behind a > HAL. We have come a long way compared to how we used to be windows centric, and I am sure there is plenty of work remaining for us to be ready to be part of the kernel. If the community has a clever and clean solution that doesn’t break our ASICs we’ll take it internally with open arms. We merged Dave and Jerome’s clean up on removing abstractions and we had lots of patches following Dave and Jerome’s lead in different areas. Again this is not about orgchart. It’s about what’s validated when samples are in the lab. God I miss the days when everything was plugged into the wall and dual link DVI was cutting edge. At least most of our problems can be solved by diffing register dumps between the good and bad case. Tony
* RE: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <CAPM=9tw=OLirgVU1RVxfPZ1PV64qtjOPTJ2q540=9VJhF4o2RQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2016-12-08 23:29 ` Dave Airlie @ 2016-12-09 17:32 ` Deucher, Alexander [not found] ` <MWHPR12MB169473F270C372CE90D3A254F7870-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org> 2016-12-09 20:31 ` Daniel Vetter 1 sibling, 2 replies; 66+ messages in thread From: Deucher, Alexander @ 2016-12-09 17:32 UTC (permalink / raw) To: 'Dave Airlie', Daniel Vetter Cc: Grodzovsky, Andrey, Cheng, Tony, Wentland, Harry, amd-gfx mailing list, dri-devel > -----Original Message----- > From: Dave Airlie [mailto:airlied@gmail.com] > Sent: Thursday, December 08, 2016 3:07 PM > To: Daniel Vetter > Cc: Wentland, Harry; dri-devel; Grodzovsky, Andrey; amd-gfx mailing list; > Deucher, Alexander; Cheng, Tony > Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU > > > I can't dig into details of DC, so this is not a 100% assessment, but if > > you call a function called "validate" in atomic_commit, you're very, very > > likely breaking atomic. _All_ validation must happen in ->atomic_check, > > if that's not the case TEST_ONLY mode is broken. And atomic userspace is > > relying on that working. > > > > The only thing that you're allowed to return from ->atomic_commit is > > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues. > > Kerneldoc explains this. > > > > Now the reason I bring this up (and we've discussed it at length in > > private) is that DC still suffers from a massive abstraction midlayer. A > > lot of the back-end stuff (dp aux, i2c, abstractions for allocation, > > timers, irq, ...) have been cleaned up, but the midlayer is still there. > > And I understand why you have it, and why it's there - without some OS > > abstraction your grand plan of a unified driver across everything doesn't > > work out so well. > > > > But in a way the backend stuff isn't such a big deal.
It's annoying since > > lots of code, and bugfixes have to be duplicated and all that, but it's > > fairly easy to fix case-by-case, and as long as AMD folks stick around > > (which I fully expect) not a maintenance issue. It makes it harder for > > others to contribute, but then since it's mostly the leaf it's generally > > easy to just improve the part you want to change (as an outsider). And if > > you want to improve shared code the only downside is that you can't also > > improve amd, but that's not so much a problem for non-amd folks ;-) > > > > The problem otoh with the abstraction layer between drm core and the > amd > > driver is that you can't ignore if you want to refactor shared code. And > > because it's an entire world of its own, it's much harder to understand > > what the driver is doing (without reading it all). Some examples of what I > > mean: > > > > - All other drm drivers subclass drm objects (by embedding them) into the > > corresponding hw part that most closely matches the drm object's > > semantics. That means even when you have 0 clue about how a given > piece > > of hw works, you have a reasonable chance of understanding code. If it's > > all your own stuff you always have to keep in mind the special amd > > naming conventions. That gets old real fast if you're trying to figure out > > what 20+ (or are we at 30 already?) drivers are doing. > > > > - This is even more true for atomic. Atomic has a pretty complicated > > check/commit transactional model for updating display state. It's a > > standardized interface, and it's extensible, and we want generic > > userspace to be able to run on any driver. Fairly often we realize that > > semantics of existing or newly proposed properties and state isn't > > well-defined enough, and then we need to go&read all the drivers and > > figure out how to fix up the mess. DC has its entirely separate state > > structures which again don't subclass the atomic core structures (afaik > > at least).
Again the same problems apply that you can't find things, and > > that figuring out the exact semantics and spotting differences in > > behaviour is almost impossible. > > > > - The trouble isn't just in reading code and understanding it correctly, > > it's also in finding it. If you have your own completely different world > > then just finding the right code is hard - cscope and grep fail to work. > > > > - Another issue is that very often we unify semantics in drivers by adding > > some new helpers that at least dtrt for most of the drivers. If you have > > your own world then the impedance mismatch will make sure that amd > > drivers will have slightly different semantics, and I think that's not > > good for the ecosystem and kms - people want to run a lot more than just > > a boot splash with generic kms userspace, stuff like xf86-video-$vendor > > is going out of favour heavily. > > > > Note that all this isn't about amd walking away and leaving an > > unmaintainable mess behind. Like I've said I don't think this is a big > > risk. The trouble is that having your own world makes it harder for > > everyone else to understand the amd driver, and understanding all drivers > > is very often step 1 in some big refactoring or feature addition effort. > > Because starting to refactor without understanding the problem generally > > doesn't work ;-) And you can't make this step 1 easier for others by > > promising to always maintain DC and update it to all the core changes, > > because that's only step 2. > > > > In all the DC discussions we've had thus far I haven't seen anyone address > > this issue. And this isn't just an issue in drm, it's pretty much > > established across all linux subsystems with the "no midlayer or OS > > abstraction layers in drivers" rule. There's some real solid reasons why > > such a HAL is extremely unpopular with upstream.
And I haven't yet seen > > any good reason why amd needs to be different, thus far it looks like a > > textbook case, and there's been lots of vendors in lots of subsystems who > > tried to push their HAL. > > Daniel has said this all very nicely, I'm going to try and be a bit more direct, > because apparently I've possibly been too subtle up until now. > > No HALs. We don't do HALs in the kernel. We might do midlayers sometimes > we try not to do midlayers. In the DRM we don't do either unless the > maintainers > are asleep. They might be worth the effort for AMD, however for the Linux > kernel > they don't provide a benefit and make maintaining the code a lot harder. I've > maintained this code base for over 10 years now and I'd like to think > I've only merged > something for semi-political reasons once (initial exynos was still > more Linuxy than DC), > and that thing took a lot of time to cleanup, I really don't feel like > saying yes again. > > Given the choice between maintaining Linus' trust that I won't merge > 100,000 lines > of abstracted HAL code and merging 100,000 lines of abstracted HAL code > I'll give you one guess where my loyalties lie. The reason the > toplevel maintainer (me) > doesn't work for Intel or AMD or any vendors, is that I can say NO > when your maintainers > can't or won't say it. > > I've only got one true power as a maintainer, and that is to say No. > The other option > is I personally sit down and rewrite all the code in an acceptable > manner, and merge that > instead. But I've discovered I probably don't scale to that level, so > again it leaves me > with just the one actual power. 
> AMD can threaten not to support new GPUs in upstream kernels without
> merging this - that is totally something you can do - and here's the
> thing: Linux will survive. We'll piss off a bunch of people, but the
> Linux kernel will just keep on rolling forward. Maybe at some point
> someone will get pissed about lacking upstream support for your HW
> and go write support and submit it, maybe they won't. The kernel is
> bigger than any of us and has standards about what is acceptable.
> Read up on the whole mac80211 problems we had years ago, where every
> wireless vendor wrote their own 80211 layer inside their driver.
> There was a lot of time spent creating a central 80211 before any of
> those drivers were suitable for merge; well, we've spent our time
> creating a central modesetting infrastructure, and bypassing it is
> taking a driver in totally the wrong direction.
>
> I've also wondered if the DC code is ready for being part of the
> kernel anyway. What happens if I merge this, and some external
> contributor rewrites 50% of it and removes a bunch of stuff that the
> kernel doesn't need? By any kernel standards I'll merge that sort of
> change over your heads if Alex doesn't. It might mean you have to
> rewrite a chunk of your internal validation code, or some other
> interactions, but those won't be reasons to block the changes from my
> POV. I'd like some serious introspection on your team's part on how
> you got into this situation, and how, even if I was feeling like
> merging this (which I'm not), you'd actually deal with being part of
> the Linux kernel and not hiding in a nicely framed orgchart silo
> behind a HAL. I honestly don't think the code is Linux-worthy code,
> and I also really dislike having to spend my Friday morning being
> negative about it, but hey, at least I can have a shower now.
>
> No.

Hi Dave,

I think this is part of the reason a lot of people get fed up with
working upstream in Linux.
I can respect your technical points, and if you kept it to that, I'd
be fine with it and we could have a technical discussion starting
there. But attacking us or our corporate culture is not cool. I think
perhaps you have been in the RH silo for too long. Our corporate
culture is not like RH's. Like it or not, we have historically been a
Windows-centric company. We have a few small Linux teams that have
been engaged with the community for a long time, but the rest of the
company has not. We are working to improve that, but we can only do so
many things at one time. GPU cycles are fast. There's only so much
time in the day; we'd like to make our code perfect, but we also want
to get it out to customers while the hw is still relevant. We are
finally at a point where our AMD Linux drivers are almost feature
complete compared to Windows, we have support upstream well before hw
launch, and we get shit on for trying to do the right thing. It
doesn't exactly make us want to continue contributing.

That's the problem with Linux. Unless you are a part-time hacker who
is part of the "in" crowd and can spend all of his days tinkering with
making the code perfect, a vendor with massive resources who can just
throw more people at it, or a throw-it-over-the-wall-and-forget-it
vendor (hey, my code can just live in staging), there's no room for
you.

You love to tell the exynos story about how crappy the code was and
then, after it was cleaned up, how glorious it was. Except the vendor
didn't do that; another vendor paid another vendor to do it. We don't
happen to have the resources to pay someone else to do that for us.
Moreover, doing so would negate all of the advantages of bringing up
the code along with the hw team in the lab when the asics come back
from the fab. Additionally, the original argument against the exynos
code was that it was just thrown over the wall and largely ignored by
the vendor once it was upstream.
We've been consistently involved in upstream (heck, I've been at AMD
almost 10 years now maintaining our drivers).

You talk about trust. I think there's something to cutting a trusted
partner some slack as they work to further improve their support vs.
taking a hard line because you got burned once by a
throw-it-over-the-wall vendor who was not engaged. Even if you want to
take a hard line, let's discuss it on technical merits, not
mud-slinging.

I realize you care about code quality and style, but do you care about
stable functionality? Would you really merge a bunch of huge cleanups
that would potentially break tons of stuff in subtle ways because
coding style is that important? I'm done with that myself. I've merged
too many half-baked cleanups and new features in the past and ended up
spending way more time fixing them than I would have otherwise, for
relatively little gain. The hw is just too complicated these days. At
some point people want support for the hw they have and they want it
to work. If code trumps all, then why do we have staging?

I understand forward progress on APIs, but frankly from my
perspective, atomic has been a disaster for the stability of both
atomic and pre-atomic code. Every kernel cycle manages to break
several drivers. What happened to figuring out how to do it right in a
couple of drivers and then moving that to the core? We seem to have
lost that in favor of starting in the core first. I feel like we
constantly refactor the core to deal with this or that quirk or
requirement of someone's hardware and then deal with tons of fallout.

Is all we care about android? I constantly hear the argument: if we
don't do all of this, android will do their own thing and then that
will be the end. Right now we are all suffering and android is barely
even using this yet. If Linux will carry on without AMD contributing,
maybe Linux will carry on ok without bending over backwards for
android.
Are you basically telling us that you'd rather we water down our
driver and limit the features, capabilities, and stability we can
support so that others can refactor our code constantly for hazy goals
to support some supposed glorious future that never seems to come?
What about right now? Maybe we could try and support some features
right now. Maybe we'll finally see Linux on the desktop.

Alex

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  @ 2016-12-09 20:30 ` Dave Airlie
  0 siblings, 1 reply; 66+ messages in thread

From: Dave Airlie @ 2016-12-09 20:30 UTC (permalink / raw)
To: Deucher, Alexander
Cc: Grodzovsky, Andrey, Wentland, Harry, amd-gfx mailing list,
    dri-devel, Daniel Vetter, Cheng, Tony

> I think this is part of the reason a lot of people get fed up with
> working upstream in Linux. I can respect your technical points, and
> if you kept it to that, I'd be fine with it and we could have a
> technical discussion starting there. But attacking us or our
> corporate culture is not cool. I think perhaps you have been in the
> RH silo for too long. Our corporate culture is not like RH's. Like it
> or not, we have historically been a Windows-centric company. We have
> a few small Linux teams that have been engaged with the community for
> a long time, but the rest of the company has not. We are working to
> improve that, but we can only do so many things at one time. GPU
> cycles are fast. There's only so much time in the day; we'd like to
> make our code perfect, but we also want to get it out to customers
> while the hw is still relevant. We are finally at a point where our
> AMD Linux drivers are almost feature complete compared to Windows, we
> have support upstream well before hw launch, and we get shit on for
> trying to do the right thing. It doesn't exactly make us want to
> continue contributing. That's the problem with Linux. Unless you are
> a part-time hacker who is part of the "in" crowd and can spend all of
> his days tinkering with making the code perfect, a vendor with
> massive resources who can just throw more people at it, or a
> throw-it-over-the-wall-and-forget-it vendor (hey, my code can just
> live in staging), there's no room for you.
I don't think that's fair. AMD as a company has a number of
experienced Linux kernel developers who are well aware of the upstream
kernel development process and views. I should not be put in a
position where I have to say no; that is frankly the position you are
in as a maintainer: you work for AMD but you answer to the kernel
development process out here. AMD is travelling a well-travelled road
here; with Intel/Daniel there have been lots of times I've had to deal
with the same problems, and eventually Intel learned that what Daniel
says matters and people are a lot happier. I brought up the AMD
culture because one of two things has happened here: a) you've lost
sight of what upstream kernel code looks like, or b) people in AMD
aren't listening to you. If it's the latter case then it is a direct
result of the AMD culture, and so far I'm not willing to believe it's
the former (except maybe CGS - still on the wall whether that was a
good idea or a floodgate warning).

From what I understood, this DAL code was a rewrite from scratch, with
upstreamability as a possible goal; it isn't directly taken from
Windows or fglrx. This goal was not achieved, so why do I have to live
with the result? AMD could have done better; they have so many people
experienced in how this thing should go down.

> You love to tell the exynos story about how crappy the code was and
> then, after it was cleaned up, how glorious it was. Except the vendor
> didn't do that; another vendor paid another vendor to do it. We don't
> happen to have the resources to pay someone else to do that for us.
> Moreover, doing so would negate all of the advantages of bringing up
> the code along with the hw team in the lab when the asics come back
> from the fab. Additionally, the original argument against the exynos
> code was that it was just thrown over the wall and largely ignored by
> the vendor once it was upstream. We've been consistently involved in
> upstream (heck, I've been at AMD almost 10 years now maintaining our
> drivers).
> You talk about trust. I think there's something to cutting a trusted
> partner some slack as they work to further improve their support vs.
> taking a hard line because you got burned once by a
> throw-it-over-the-wall vendor who was not engaged. Even if you want
> to take a hard line, let's discuss it on technical merits, not
> mud-slinging.

Here's the thing: what happens if a vendor pays another vendor to
clean up DAL after I merge it? How do you handle that? Being part of
the upstream kernel isn't about hiding in the corner; if you want to
gain the benefits of upstream development you need to participate in
upstream development. If you do what AMD seems to be only in a
position to do, and have upstream development as an afterthought, then
you are of course going to run into lots of problems.

> I realize you care about code quality and style, but do you care
> about stable functionality? Would you really merge a bunch of huge
> cleanups that would potentially break tons of stuff in subtle ways
> because coding style is that important? I'm done with that myself.
> I've merged too many half-baked cleanups and new features in the past
> and ended up spending way more time fixing them than I would have
> otherwise, for relatively little gain. The hw is just too complicated
> these days. At some point people want support for the hw they have
> and they want it to work. If code trumps all, then why do we have
> staging?

Code doesn't trump all; I'd have merged DAL if it did. Maintainability
trumps all. The kernel will be around for a long time more, and I'd
like it to still be something we can make changes to as expectations
change.

> I understand forward progress on APIs, but frankly from my
> perspective, atomic has been a disaster for the stability of both
> atomic and pre-atomic code. Every kernel cycle manages to break
> several drivers. What happened to figuring out how to do it right in
> a couple of drivers and then moving that to the core?
> We seem to have lost that in favor of starting in the core first. I
> feel like we constantly refactor the core to deal with this or that
> quirk or requirement of someone's hardware and then deal with tons of
> fallout. Is all we care about android? I constantly hear the
> argument: if we don't do all of this, android will do their own thing
> and then that will be the end. Right now we are all suffering and
> android is barely even using this yet. If Linux will carry on without
> AMD contributing, maybe Linux will carry on ok without bending over
> backwards for android. Are you basically telling us that you'd rather
> we water down our driver and limit the features, capabilities, and
> stability we can support so that others can refactor our code
> constantly for hazy goals to support some supposed glorious future
> that never seems to come? What about right now? Maybe we could try
> and support some features right now. Maybe we'll finally see Linux on
> the desktop.

All of this comes from the development model you have ended up at. Do
you have upstream CI? Upstream keeps breaking things - how do you find
out? I've seen spstarr bisect a bunch of AMD regressions in the past 6
months (not due to atomic). Where are the QA/CI teams validating that?
Why aren't they bisecting the upstream kernel, instead of people in
the community on irc? AMD has been operating in
throw-it-over-the-wall-at-upstream mode for a while. I've tried to
help motivate changing that, and slowly we get there with things like
the external mailing list, and I realise these things take time. But
if upstream isn't something that people at AMD really care about
enough to continuously validate and to get involved in defining new
APIs like atomic, you are in no position to come back when upstream
refuses to merge 60-90k lines of vendor-produced code with lots of
bits of functionality that shouldn't be in there.
I'm unloading a lot of stuff here, and really I understand it's not
your fault, but I've stated I've only got one power left when people
let code like DAL/DC get to me. I'm not going to tell you how to
rewrite it, because you already know - you've always known. Now we
just need the right people to listen to you.

Dave.
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  @ 2016-12-11 0:36 ` Alex Deucher
  0 siblings, 0 replies; 66+ messages in thread

From: Alex Deucher @ 2016-12-11 0:36 UTC (permalink / raw)
To: Dave Airlie
Cc: Grodzovsky, Andrey, Cheng, Tony, dri-devel, amd-gfx mailing list,
    Daniel Vetter, Deucher, Alexander, Wentland, Harry

On Fri, Dec 9, 2016 at 3:30 PM, Dave Airlie <airlied@gmail.com> wrote:
>> I think this is part of the reason a lot of people get fed up with
>> working upstream in Linux. I can respect your technical points, and
>> if you kept it to that, I'd be fine with it and we could have a
>> technical discussion starting there. But attacking us or our
>> corporate culture is not cool. I think perhaps you have been in the
>> RH silo for too long. Our corporate culture is not like RH's. Like
>> it or not, we have historically been a Windows-centric company. We
>> have a few small Linux teams that have been engaged with the
>> community for a long time, but the rest of the company has not. We
>> are working to improve that, but we can only do so many things at
>> one time. GPU cycles are fast. There's only so much time in the day;
>> we'd like to make our code perfect, but we also want to get it out
>> to customers while the hw is still relevant. We are finally at a
>> point where our AMD Linux drivers are almost feature complete
>> compared to Windows, we have support upstream well before hw launch,
>> and we get shit on for trying to do the right thing. It doesn't
>> exactly make us want to continue contributing. That's the problem
>> with Linux. Unless you are a part-time hacker who is part of the
>> "in" crowd and can spend all of his days tinkering with making the
>> code perfect, a vendor with massive resources who can just throw
>> more people at it, or a throw-it-over-the-wall-and-forget-it vendor
>> (hey, my code can just live in staging), there's no room for you.
>
> I don't think that's fair. AMD as a company has a number of
> experienced Linux kernel developers who are well aware of the
> upstream kernel development process and views. I should not be put in
> a position where I have to say no; that is frankly the position you
> are in as a maintainer: you work for AMD but you answer to the kernel
> development process out here. AMD is travelling a well-travelled road
> here; with Intel/Daniel there have been lots of times I've had to
> deal with the same problems, and eventually Intel learned that what
> Daniel says matters and people are a lot happier. I brought up the
> AMD culture because one of two things has happened here: a) you've
> lost sight of what upstream kernel code looks like, or b) people in
> AMD aren't listening to you. If it's the latter case then it is a
> direct result of the AMD culture, and so far I'm not willing to
> believe it's the former (except maybe CGS - still on the wall whether
> that was a good idea or a floodgate warning).
>
> From what I understood, this DAL code was a rewrite from scratch,
> with upstreamability as a possible goal; it isn't directly taken from
> Windows or fglrx. This goal was not achieved, so why do I have to
> live with the result? AMD could have done better; they have so many
> people experienced in how this thing should go down.

I think I over-reacted a bit with this email. What I really wanted to
say was that this was an RFC, basically saying this is how far we've
come, this is what we still need to do, and here's what we'd like to
do. This was not a request to merge now or an ultimatum. I understand
the requirements of upstream; I just didn't expect such a visceral
response to that original email, and it put me on the defensive. I
take our driver quality seriously, and the idea of having arbitrary
large patches applied to "clean up" our code without our say or
validation didn't sit well with me.
Alex
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-09 17:32 ` Deucher, Alexander
  @ 2016-12-09 20:31 ` Daniel Vetter
  1 sibling, 0 replies; 66+ messages in thread

From: Daniel Vetter @ 2016-12-09 20:31 UTC (permalink / raw)
To: Deucher, Alexander
Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx mailing list, dri-devel

Hi Alex,

I'll leave the other bits out, just replying to the atomic/android
comments.

On Fri, Dec 9, 2016 at 6:32 PM, Deucher, Alexander
<Alexander.Deucher@amd.com> wrote:
> I understand forward progress on APIs, but frankly from my
> perspective, atomic has been a disaster for the stability of both
> atomic and pre-atomic code. Every kernel cycle manages to break
> several drivers. What happened to figuring out how to do it right in
> a couple of drivers and then moving that to the core? We seem to have
> lost that in favor of starting in the core first. I feel like we
> constantly refactor the core to deal with this or that quirk or
> requirement of someone's hardware and then deal with tons of fallout.
> Is all we care about android? I constantly hear the argument: if we
> don't do all of this, android will do their own thing and then that
> will be the end. Right now we are all suffering and android is barely
> even using this yet. If Linux will carry on without AMD contributing,
> maybe Linux will carry on ok without bending over backwards for
> android. Are you basically telling us that you'd rather we water down
> our driver and limit the features, capabilities, and stability we can
> support so that others can refactor our code constantly for hazy
> goals to support some supposed glorious future that never seems to
> come? What about right now? Maybe we could try and support some
> features right now. Maybe we'll finally see Linux on the desktop.

Before atomic landed we had 3 proof-of-concept drivers.
Before I added the nonblocking helpers we had about 5-10 drivers doing
it all wrong in different ways (and yes the rework highlighted that in
a few cases rather brutally). We now have about 20 atomic drivers (and
counting), and pretty much all the refactoring, helper extractions and
reworks _are_ motivated by a bunch of drivers hand-rolling a given
pattern. So I think we're doing things roughly right, it's just a bit
hard.

And no, Android isn't everything we care about. We want atomic also
for CrOS (which is pretty much the only linux desktop thing shipping
in quantities), and we want it for the traditional linux desktop
(weston/wayland/mutter). And we want it for embedded/entertainment
systems. Atomic is pretty much the answer to "KMS is outdated and
doesn't match modern hw anymore". E.g. on i915 we want atomic (and
related work) to be able to support render compression.

And of course I'd like to invite everyone who wants something else
with DRM to also bring that in. E.g. over the past few months we've
merged the simple kms helpers for super-dumb displays, to be able to
be better at the fbdev game than fbdev itself. Not something I care
about personally, but it's still great because more users and
usecases. And the same applies of course to AMD.

But what I'm seeing (and you're not the only one complaining, Michel
has raised this on irc a few times too) is that you're not in the
driver seat, and AMD folks don't really have any say in where DRM
overall heads towards. As an outsider looking in, I think that's
because AMD is largely absorbed with itself, doesn't have people who
can just do random things because they see the long-term benefits, and
is occupied absorbing new teams that don't yet design and develop with
an upstream-first approach. Personally I'm not really happy about
that, because I'd like more of AMD's perspective in infrastructure
work.
But I don't think that's because upstream and maintainers reject your
stuff - I'm trying as hard as possible to drag you folks in all the
time, and tons of people get stuff merged with even smaller teams than
you have. I think it's simply because core work seems not to be a top
priority (yet). I can't fix that.

Your other criticism is that all these changes break shit, and I agree
there's been a bit much of that. But otoh if we can't change e.g. fb
refcounting anymore because it would break drivers, or try to
deprecate old interfaces to get rid of the plenty of root holes in
there, then upstream is dead, and why should we bother having a
standardized, cross-vendor modeset interface? And I'm trying to fix
this mess: by emphasising CI, building up a cross-vendor validation
suite in igt, and inviting folks to participate in drm-misc to make
sure core stuff is working for everyone and moves in the right
direction. And again lots of people pick up on that offer, and we have
multiple people and vendors now e.g. looking into igt and starting to
contribute. But again AMD is left out, and I don't think that can be
blamed on the community.

-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-08  2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland
  2016-12-08  9:59 ` Daniel Vetter
  @ 2016-12-11 20:28 ` Daniel Vetter
  2016-12-12  7:22 ` Daniel Vetter
  3 siblings, 1 reply; 66+ messages in thread

From: Daniel Vetter @ 2016-12-11 20:28 UTC (permalink / raw)
To: Harry Wentland
Cc: Grodzovsky, Andrey, amd-gfx, dri-devel, Deucher, Alexander,
    Cheng, Tony

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> We propose to use the Display Core (DC) driver for display support on
> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In
> order to avoid a flag day the plan is to only support uGPU initially
> and transition to older ASICs gradually.

Bridgman brought it up a few times that this here was the question -
it's kinda missing a question mark, hard to figure this out ;-). I'd
say for upstream it doesn't really matter, but imo having both atomic
and non-atomic paths in one driver is one world of hurt, and I
strongly recommend against it, at least if feasible. All drivers that
switched switched in one go; the only exceptions were i915 (it took
much longer than we ever feared, causing lots of pain) and nouveau
(which only converted nv50+, but pre/post-nv50 have always been two
almost completely separate worlds anyway).

> The DC component has received extensive testing within AMD for DCE8,
> 10, and 11 GPUs and is being prepared for uGPU. Support should be
> better than amdgpu's current display support.
>
> * All of our QA effort is focused on DC
> * All of our CQE effort is focused on DC
> * All of our OEM preloads and custom engagements use DC
> * DC behavior mirrors what we do for other OSes
>
> The new asic utilizes a completely re-designed atom interface, so we
> cannot easily leverage much of the existing atom-based code.
>
> We've introduced DC to the community earlier in 2016 and received a
> fair amount of feedback. Some of what we've addressed so far are:
>
> * Self-contain ASIC specific code. We did a bunch of work to pull
>   common sequences into dc/dce and leave ASIC specific code in
>   separate folders.
> * Started to expose AUX and I2C through generic kernel/drm
>   functionality and are mostly using that. Some of that code is still
>   needlessly convoluted. This cleanup is in progress.
> * Integrated Dave and Jerome's work on removing abstraction in bios
>   parser.
> * Retire adapter service and asic capability
> * Remove some abstraction in GPIO
>
> Since a lot of our code is shared with pre- and post-silicon
> validation suites changes need to be done gradually to prevent
> breakages due to a major flag day. This, coupled with adding support
> for new asics and lots of new feature introductions means progress
> has not been as quick as we would have liked. We have made a lot of
> progress none the less.
>
> The remaining concerns that were brought up during the last review
> that we are working on addressing:
>
> * Continue to cleanup and reduce the abstractions in DC where it
>   makes sense.
> * Removing duplicate code in I2C and AUX as we transition to using
>   the DRM core interfaces. We can't fully transition until we've
>   helped fill in the gaps in the drm core that we need for certain
>   features.
> * Making sure Atomic API support is correct. Some of the semantics of
>   the Atomic API were not particularly clear when we started this,
>   however, that is improving a lot as the core drm documentation
>   improves.
>   Getting this code upstream and in the hands of more atomic users
>   will further help us identify and rectify any gaps we have.

Ok, so I guess Dave is typing some more general comments about
demidlayering; let me type some guidelines about atomic. Hopefully
this all materializes itself a bit better into improved upstream docs,
but meh.

Step 0: Prep

So atomic is transactional, but it's not validate + rollback or
commit; it's duplicate state, validate, and then either throw away or
commit. There are a few big reasons for this: a) partial atomic
updates - if you duplicate, it's much easier to check that you have
all the right locks; b) kfree() is much easier to check for
correctness than rollback code; and c) atomic_check functions are much
easier to audit for invalid changes to persistent state.

Trouble is that this seems a bit unusual compared to all other
approaches, and ime (from the drawn-out i915 conversion) you really
don't want to mix things up. Ofc for private state you can roll back
(e.g. vc4 does that for the drm_mm allocator thing for scanout slots
or whatever it is), but it's trivially easy to accidentally check the
wrong state or mix them up or something else bad.

Long story short, I think step 0 for DC is to split state from
objects, i.e. for each dc_surface/foo/bar you need a
dc_surface/foo/bar_state. And all the back-end functions need to take
both the object and the state explicitly. This is a bit of a pain to
do, but should be pretty much just mechanical. And imo not all of it
needs to happen before DC lands in upstream, but see above - imo that
half-converted state is positively horrible. This should also not harm
cross-OS reuse at all; you can still store things together on OSes
where that makes sense.

Guidelines for amdgpu atomic structures

drm atomic stores everything in state structs on
plane/connector/crtc. This includes any property extensions or
anything else really; the entire userspace ABI is built on top of
this.
Non-trivial drivers are supposed to subclass these to store their own
stuff, so e.g.

struct amdgpu_plane_state {
	struct drm_plane_state base;

	/* amdgpu glue state and stuff that's linux-specific, e.g.
	 * property values and similar things. Note that there's a
	 * strong push towards standardizing properties and storing
	 * them in the drm_*_state structs. */

	struct dc_surface_state surface_state;
	/* other dc states that fit to a plane */
};

Yes, not everything will fit 1:1 in one of these, but to get started I
strongly recommend making them fit (maybe with reduced feature sets to
start out). Stuff that is shared between e.g. planes, but always on
the same crtc, can be put into amdgpu_crtc_state, e.g. if you have
scalers that are assignable to a plane. Of course atomic also supports
truly global resources; for that you need to subclass
drm_atomic_state. Currently msm and i915 do that, and it's probably
best to read those structures as examples until I've typed the docs.
But I expect that especially for planes a few dc_*_state structs will
stay in amdgpu_*_state.

Guidelines for atomic_check

Please use the helpers as much as makes sense, and put at least the
basic steps that map from drm_*_state into the respective dc_*_state
functional block into the helper callbacks for that object. I think
basic validation of individual bits (as much as possible, e.g. if you
just don't support scaling or rotation with certain pixel formats)
should happen in there too. That way when we e.g. want to check how
drivers currently validate a given set of properties to be able to
more strictly define the semantics, that code is easy to find. Also I
expect that this won't result in code duplication with other OSes: you
need code to map from drm to dc anyway, so you might as well
check&reject the stuff that dc can't even represent right there.

The other reason is that the helpers are good guidelines for some of
the semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives
the right answer after atomic_check.
If it doesn't, then your driver doesn't follow atomic. If you completely roll your own this becomes much harder to assure.

Of course extend it all however you want, e.g. by adding all the global optimization and resource assignment stuff after initial per-object checking has been done using the helper infrastructure.

Guidelines for atomic_commit

Use the new nonblocking helpers. Everyone who didn't use them got it wrong. Also, your atomic_commit should pretty much match the helper one, except for a custom swap_state to handle all your globally shared special dc_*_state objects. Everything hw-specific should be in atomic_commit_tail.

Wrt the hw commit itself, for the modeset step just roll your own. That's the entire point of atomic, and atm both i915 and nouveau exploit this fully. Besides a bit of glue there shouldn't be much need for linux-specific code here - what you need is something to fish out the right dc_*_state objects and hand them to your main sequencer functions. What you should make sure though is that you only ever do a modeset when that was signalled, i.e. please use drm_crtc_needs_modeset to control that part. Feel free to wrap it up in a dc_*_needs_modeset for better abstraction if that's needed.

I do strongly suggest however that you implement the plane commit using the helpers. There's really only a few ways to implement this in the hw, and it should work everywhere.

Misc guidelines

Use the suspend/resume helpers. If your atomic can't do that, it's not terribly good. Also, if DC can't make those fit, it's probably still too much of a midlayer and its own world rather than a helper library.

Use all the legacy helpers; again, your atomic should be able to pull it off. One exception is async plane flips (both primary and cursors); that's atm still unsolved. Probably best to keep the old code around for just that case (but redirect to the compat helpers for everything else), see e.g. how vc4 implements cursors.

Most important of all

Ask questions on #dri-devel.
amdgpu atomic is the only nontrivial atomic driver for which I don't remember a single discussion about some detail, at least not with any of the DAL folks. Michel & Alex asked some questions sometimes, but that indirection is bonghits and defeats the point of upstream: direct cross-vendor collaboration to get shit done. Please make it happen.

Oh and I pretty much assume Harry & Tony are volunteered to review atomic docs ;-)

Cheers, Daniel

> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> work on DC is public. We're currently transitioning to a public patch
> review. You can follow our progress on the amd-gfx mailing list. We value
> community feedback on our work.
>
> As an appendix I've included a brief overview of how the code currently
> works to make understanding and reviewing the code easier.
>
> Prior discussions on DC:
>
> * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
> * https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>
> Current version of DC:
>
> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>
> Once Alex pulls in the latest patches:
>
> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>
> Best Regards,
> Harry
>
> ************************************************
> *** Appendix: A Day in the Life of a Modeset ***
> ************************************************
>
> Below is a high-level overview of a modeset with dc. Some of this might be a
> little out-of-date since it's based on my XDC presentation but it should be
> more-or-less the same.
>
> amdgpu_dm_atomic_commit()
> {
>   /* setup atomic state */
>   drm_atomic_helper_prepare_planes(dev, state);
>   drm_atomic_helper_swap_state(dev, state);
>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
>
>   /* create or remove targets */
>
>   /********************************************************************
>    * *** Call into DC to commit targets with list of all known targets
>    ********************************************************************/
>   /* DC is optimized not to do anything if 'targets' didn't change. */
>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>   {
>     /******************************************************************
>      * *** Build context (function also used for validation)
>      ******************************************************************/
>     result = core_dc->res_pool->funcs->validate_with_context(
>                core_dc, set, target_count, context);
>
>     /******************************************************************
>      * *** Apply safe power state
>      ******************************************************************/
>     pplib_apply_safe_state(core_dc);
>
>     /****************************************************************
>      * *** Apply the context to HW (program HW)
>      ****************************************************************/
>     result = core_dc->hwss.apply_ctx_to_hw(core_dc, context)
>     {
>       /* reset pipes that need reprogramming */
>       /* disable pipe power gating */
>       /* set safe watermarks */
>
>       /* for all pipes with an attached stream */
>       /************************************************************
>        * *** Programming all per-pipe contexts
>        ************************************************************/
>       status = apply_single_controller_ctx_to_hw(...)
>       {
>         pipe_ctx->tg->funcs->set_blank(...);
>         pipe_ctx->clock_source->funcs->program_pix_clk(...);
>         pipe_ctx->tg->funcs->program_timing(...);
>         pipe_ctx->mi->funcs->allocate_mem_input(...);
>         pipe_ctx->tg->funcs->enable_crtc(...);
>         bios_parser_crtc_source_select(...);
>
>         pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>         pipe_ctx->opp->funcs->opp_program_fmt(...);
>
>         stream->sink->link->link_enc->funcs->setup(...);
>         pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>         pipe_ctx->tg->funcs->set_blank_color(...);
>
>         core_link_enable_stream(pipe_ctx);
>         unblank_stream(pipe_ctx,
>
>         program_scaler(dc, pipe_ctx);
>       }
>       /* program audio for all pipes */
>       /* update watermarks */
>     }
>
>     program_timing_sync(core_dc, context);
>     /* for all targets */
>     target_enable_memory_requests(...);
>
>     /* Update ASIC power states */
>     pplib_apply_display_requirements(...);
>
>     /* update surface or page flip */
>   }
> }
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13 2:33 ` Harry Wentland
  0 siblings, 1 reply; 66+ messages in thread

From: Harry Wentland @ 2016-12-13 2:33 UTC (permalink / raw)
To: Daniel Vetter
Cc: Grodzovsky, Andrey, Dave Airlie, amd-gfx, dri-devel, Deucher, Alexander, Cheng, Tony

On 2016-12-11 03:28 PM, Daniel Vetter wrote:
> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>> We propose to use the Display Core (DC) driver for display support on
>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>> avoid a flag day the plan is to only support uGPU initially and transition
>> to older ASICs gradually.
>
> Bridgeman brought it up a few times that this here was the question - it's
> kinda missing a question mark, hard to figure this out ;-). I'd say for

My bad for the missing question mark (imprecise phrasing). On the other hand, letting this blow over a bit helped get us on the map a bit more and allows us to argue the challenges (and benefits) of open source. :)

> upstream it doesn't really matter, but imo having both atomic and
> non-atomic paths in one driver is one world of hurt and I strongly
> recommend against it, at least if feasible. All drivers that switched
> switched in one go, the only exception was i915 (it took much longer than
> we ever feared, causing lots of pain) and nouveau (which only converted
> nv50+, but pre/post-nv50 have always been two almost completely separate
> worlds anyway).

You mention the two probably most complex DRM drivers didn't switch in a single go... I imagine amdgpu/DC falls into the same category.
I think one of the problems is making a sudden change with a fully validated driver without breaking existing use cases and customers. We really should've started DC development in public and probably would do that if we had to start anew. >> The DC component has received extensive testing within AMD for DCE8, 10, and >> 11 GPUs and is being prepared for uGPU. Support should be better than >> amdgpu's current display support. >> >> * All of our QA effort is focused on DC >> * All of our CQE effort is focused on DC >> * All of our OEM preloads and custom engagements use DC >> * DC behavior mirrors what we do for other OSes >> >> The new asic utilizes a completely re-designed atom interface, so we cannot >> easily leverage much of the existing atom-based code. >> >> We've introduced DC to the community earlier in 2016 and received a fair >> amount of feedback. Some of what we've addressed so far are: >> >> * Self-contain ASIC specific code. We did a bunch of work to pull >> common sequences into dc/dce and leave ASIC specific code in >> separate folders. >> * Started to expose AUX and I2C through generic kernel/drm >> functionality and are mostly using that. Some of that code is still >> needlessly convoluted. This cleanup is in progress. >> * Integrated Dave and Jerome’s work on removing abstraction in bios >> parser. >> * Retire adapter service and asic capability >> * Remove some abstraction in GPIO >> >> Since a lot of our code is shared with pre- and post-silicon validation >> suites changes need to be done gradually to prevent breakages due to a major >> flag day. This, coupled with adding support for new asics and lots of new >> feature introductions means progress has not been as quick as we would have >> liked. We have made a lot of progress none the less. >> >> The remaining concerns that were brought up during the last review that we >> are working on addressing: >> >> * Continue to cleanup and reduce the abstractions in DC where it >> makes sense. 
>> * Removing duplicate code in I2C and AUX as we transition to using the >> DRM core interfaces. We can't fully transition until we've helped >> fill in the gaps in the drm core that we need for certain features. >> * Making sure Atomic API support is correct. Some of the semantics of >> the Atomic API were not particularly clear when we started this, >> however, that is improving a lot as the core drm documentation >> improves. Getting this code upstream and in the hands of more >> atomic users will further help us identify and rectify any gaps we >> have. > > Ok so I guess Dave is typing some more general comments about > demidlayering, let me type some guidelines about atomic. Hopefully this > all materializes itself a bit better into improved upstream docs, but meh. > Excellent writeup. Let us know when/if you want our review for upstream docs. We'll have to really take some time to go over our atomic implementation. A couple small comments below with regard to DC. > Step 0: Prep > > So atomic is transactional, but it's not validate + rollback or commit, > but duplicate state, validate and then either throw away or commit. > There's a few big reasons for this: a) partial atomic updates - if you > duplicate it's much easier to check that you have all the right locks b) > kfree() is much easier to check for correctness than a rollback code and > c) atomic_check functions are much easier to audit for invalid changes to > persistent state. > There isn't really any rollback. I believe even in our other drivers we've abandoned the rollback approach years ago because it doesn't really work on modern HW. Any rollback cases you might find in DC should really only be for catastrophic errors (read: something went horribly wrong... read: congratulations, you just found a bug). > Trouble is that this seems a bit unusual compared to all other approaches, > and ime (from the drawn-out i915 conversion) you really don't want to mix > things up. 
Ofc for private state you can roll back (e.g. vc4 does that for > the drm_mm allocator thing for scanout slots or whatever it is), but it's > trivial easy to accidentally check the wrong state or mix them up or > something else bad. > > Long story short, I think step 0 for DC is to split state from objects, > i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state. And > all the back-end functions need to take both the object and the state > explicitly. > > This is a bit a pain to do, but should be pretty much just mechanical. And > imo not all of it needs to happen before DC lands in upstream, but see > above imo that half-converted state is postively horrible. This should > also not harm cross-os reuse at all, you can still store things together > on os where that makes sense. > > Guidelines for amdgpu atomic structures > > drm atomic stores everything in state structs on plane/connector/crtc. > This includes any property extensions or anything else really, the entire > userspace abi is built on top of this. Non-trivial drivers are supposed to > subclass these to store their own stuff, so e.g. > > amdgpu_plane_state { > struct drm_plane_state base; > > /* amdgpu glue state and stuff that's linux-specific, e.g. > * property values and similar things. Note that there's strong > * push towards standardizing properties and stroing them in the > * drm_*_state structs. */ > > struct dc_surface_state surface_state; > > /* other dc states that fit to a plane */ > }; > > Yes not everything will fit 1:1 in one of these, but to get started I > strongly recommend to make them fit (maybe with reduced feature sets to > start out). Stuff that is shared between e.g. planes, but always on the > same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers that > are assignable to a plane. > > Of course atomic also supports truly global resources, for that you need > to subclass drm_atomic_state. 
Currently msm and i915 do that, and probably > best to read those structures as examples until I've typed the docs. But I > expect that especially for planes a few dc_*_state structs will stay in > amdgpu_*_state. > > Guidelines for atomic_check > > Please use the helpers as much as makes sense, and put at least the basic > steps that from drm_*_state into the respective dc_*_state functional > block into the helper callbacks for that object. I think basic validation > of individal bits (as much as possible, e.g. if you just don't support > e.g. scaling or rotation with certain pixel formats) should happen in > there too. That way when we e.g. want to check how drivers corrently > validate a given set of properties to be able to more strictly define the > semantics, that code is easy to find. > > Also I expect that this won't result in code duplication with other OS, > you need code to map from drm to dc anyway, might as well check&reject the > stuff that dc can't even represent right there. > > The other reason is that the helpers are good guidelines for some of the > semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the right > answer after atomic_check. If it doesn't, then you're driver doesn't > follow atomic. If you completely roll your own this becomes much harder to > assure. > Interesting point. Not sure if we've checked that. Is there some sort of automated test for this that we can use to check? > Of course extend it all however you want, e.g. by adding all the global > optimization and resource assignment stuff after initial per-object > checking has been done using the helper infrastructure. > > Guidelines for atomic_commit > > Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also, I believe we're not using those and didn't start with those which might explain (along with lack of discussion on dri-devel) why atomic currently looks the way it does in DC. 
This is definitely one of the bigger issues we'd want to clean up and where you wouldn't find much pushback, other than us trying to find time to do it. > your atomic_commit should pretty much match the helper one, except for a > custom swap_state to handle all your globally shared specia dc_*_state > objects. Everything hw specific should be in atomic_commit_tail. > > Wrt the hw commit itself, for the modeset step just roll your own. That's > the entire point of atomic, and atm both i915 and nouveau exploit this > fully. Besides a bit of glue there shouldn't be much need for > linux-specific code here - what you need is something to fish the right > dc_*_state objects and give it your main sequencer functions. What you > should make sure though is that only ever do a modeset when that was > signalled, i.e. please use drm_crtc_needs_modeset to control that part. > Feel free to wrap up in a dc_*_needs_modeset for better abstraction if > that's needed. > > I do strongly suggest however that you implement the plane commit using > the helpers. There's really only a few ways to implement this in the hw, > and it should work everywhere. > > Misc guidelines > > Use the suspend/resume helpers. If your atomic can't do that, it's not > terribly good. Also, if DC can't make those fit, it's probably still too > much midlayer and its own world than helper library. > Do they handle swapping DP displays while the system is asleep? If not we'll probably need to add that. The other case where we have some special handling has to do with headless (sleep or resume, don't remember). > Use all the legacy helpers, again your atomic should be able to pull it > off. One exception is async plane flips (both primary and cursors), that's > atm still unsolved. Probably best to keep the old code around for just > that case (but redirect to the compat helpers for everything), see e.g. > how vc4 implements cursors. > Good old flip. There probably isn't much shareable code between OSes here. 
It seems like every OS rolls its own thing regarding flips. We still seem to be revisiting flips regularly, especially with FreeSync (adaptive sync) in the mix now. Good to know that this is still a bit of an open topic.

> Most imporant of all
>
> Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic
> driver for which I don't remember a single discussion about some detail,
> at least not with any of the DAL folks. Michel&Alex asked some questions
> sometimes, but that indirection is bonghits and the defeats the point of
> upstream: Direct cross-vendor collaboration to get shit done. Please make
> it happen.
>

Please keep asking us to get on dri-devel with questions. I need to get into the habit again of leaving the IRC channel open. I think most of us are still a bit scared of it, or don't know how to deal with some of the information overload (IRC and mailing list). It's part of my job to change that, all the while I'm learning this myself. :)

Thanks for all your effort trying to get people involved.

> Oh and I pretty much assume Harry&Tony are volunteered to review atomic
> docs ;-)
>

Sure.

Cheers,
Harry

> Cheers, Daniel
>
>> Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
>> work on DC is public. We're currently transitioning to a public patch
>> review. You can follow our progress on the amd-gfx mailing list. We value
>> community feedback on our work.
>>
>> As an appendix I've included a brief overview of the how the code currently
>> works to make understanding and reviewing the code easier.
>>
>> Prior discussions on DC:
>>
>> * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>> * https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>>
>> Current version of DC:
>>
>> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>
>> Once Alex pulls in the latest patches:
>>
>> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>>
>> Best Regards,
>> Harry
>>
>> ************************************************
>> *** Appendix: A Day in the Life of a Modeset ***
>> ************************************************
>>
>> Below is a high-level overview of a modeset with dc. Some of this might be a
>> little out-of-date since it's based on my XDC presentation but it should be
>> more-or-less the same.
>>
>> amdgpu_dm_atomic_commit()
>> {
>>   /* setup atomic state */
>>   drm_atomic_helper_prepare_planes(dev, state);
>>   drm_atomic_helper_swap_state(dev, state);
>>   drm_atomic_helper_update_legacy_modeset_state(dev, state);
>>
>>   /* create or remove targets */
>>
>>   /********************************************************************
>>    * *** Call into DC to commit targets with list of all known targets
>>    ********************************************************************/
>>   /* DC is optimized not to do anything if 'targets' didn't change.
>>    */
>>   dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>>   {
>>     /******************************************************************
>>      * *** Build context (function also used for validation)
>>      ******************************************************************/
>>     result = core_dc->res_pool->funcs->validate_with_context(
>>                core_dc, set, target_count, context);
>>
>>     /******************************************************************
>>      * *** Apply safe power state
>>      ******************************************************************/
>>     pplib_apply_safe_state(core_dc);
>>
>>     /****************************************************************
>>      * *** Apply the context to HW (program HW)
>>      ****************************************************************/
>>     result = core_dc->hwss.apply_ctx_to_hw(core_dc, context)
>>     {
>>       /* reset pipes that need reprogramming */
>>       /* disable pipe power gating */
>>       /* set safe watermarks */
>>
>>       /* for all pipes with an attached stream */
>>       /************************************************************
>>        * *** Programming all per-pipe contexts
>>        ************************************************************/
>>       status = apply_single_controller_ctx_to_hw(...)
>>       {
>>         pipe_ctx->tg->funcs->set_blank(...);
>>         pipe_ctx->clock_source->funcs->program_pix_clk(...);
>>         pipe_ctx->tg->funcs->program_timing(...);
>>         pipe_ctx->mi->funcs->allocate_mem_input(...);
>>         pipe_ctx->tg->funcs->enable_crtc(...);
>>         bios_parser_crtc_source_select(...);
>>
>>         pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
>>         pipe_ctx->opp->funcs->opp_program_fmt(...);
>>
>>         stream->sink->link->link_enc->funcs->setup(...);
>>         pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
>>         pipe_ctx->tg->funcs->set_blank_color(...);
>>
>>         core_link_enable_stream(pipe_ctx);
>>         unblank_stream(pipe_ctx,
>>
>>         program_scaler(dc, pipe_ctx);
>>       }
>>       /* program audio for all pipes */
>>       /* update watermarks */
>>     }
>>
>>     program_timing_sync(core_dc, context);
>>     /* for all targets */
>>     target_enable_memory_requests(...);
>>
>>     /* Update ASIC power states */
>>     pplib_apply_display_requirements(...);
>>
>>     /* update surface or page flip */
>>   }
>> }
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-13 4:10 ` Cheng, Tony
  2016-12-13 7:50 ` Daniel Vetter
  2016-12-13 7:31 ` Daniel Vetter
  2016-12-13 10:09 ` Ernst Sjöstrand
  2 siblings, 2 replies; 66+ messages in thread

From: Cheng, Tony @ 2016-12-13 4:10 UTC (permalink / raw)
To: Harry Wentland, Daniel Vetter
Cc: Deucher, Alexander, Grodzovsky, Andrey, Dave Airlie, amd-gfx, dri-devel

Thanks for the write-up for the guide. We can definitely re-do atomic according to the guidelines provided, as I am not satisfied with how our code looks today. To me it seems more like we need to shuffle stuff around and rename a few things than rewrite much of anything.

I hope to get an answer on the reply to Dave's question as to whether there is anything else. If we can keep most of the stuff under /dc as the "back end" helper and do most of the change under /amdgpu_dm, then it isn't that difficult, as we don't need to deal with the fallout on other platforms. Again, it's not just Windows. We are fully aware that it's hard to find the common abstraction between all the different OSes, so we try our best to have DC behave more like a helper than an abstraction layer anyway.

In our design, states and policies are the domain of Display Managers (DM), and because of linux we also say anything DRM can do is also the domain of the DM. We don't put anything in DC that we wouldn't feel comfortable with if HW decided to hide it in FW.

On 12/12/2016 9:33 PM, Harry Wentland wrote:
> On 2016-12-11 03:28 PM, Daniel Vetter wrote:
>> On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>>> We propose to use the Display Core (DC) driver for display support on
>>> AMD's upcoming GPU (referred to by uGPU in the rest of the doc).
>>> In order to avoid a flag day the plan is to only support uGPU initially
>>> and transition to older ASICs gradually.
>>
>> Bridgeman brought it up a few times that this here was the question - it's
>> kinda missing a question mark, hard to figure this out ;-). I'd say for
>
> My bad for the missing question mark (imprecise phrasing). On the
> other hand letting this blow over a bit helped get us on the map a bit
> more and allows us to argue the challenges (and benefits) of open
> source. :)
>
>> upstream it doesn't really matter, but imo having both atomic and
>> non-atomic paths in one driver is one world of hurt and I strongly
>> recommend against it, at least if feasible. All drivers that switched
>> switched in one go, the only exception was i915 (it took much longer than
>> we ever feared, causing lots of pain) and nouveau (which only converted
>> nv50+, but pre/post-nv50 have always been two almost completely separate
>> worlds anyway).
>>

Trust me, we would like to upstream everything. It's just that we didn't invest enough in DC code in the previous generation, so the quality might not be there.

> You mention the two probably most complex DRM drivers didn't switch in
> a single go... I imagine amdgpu/DC falls into the same category.
>
> I think one of the problems is making a sudden change with a fully
> validated driver without breaking existing use cases and customers. We
> really should've started DC development in public and probably would
> do that if we had to start anew.
>
>>> The DC component has received extensive testing within AMD for DCE8,
>>> 10, and 11 GPUs and is being prepared for uGPU. Support should be
>>> better than amdgpu's current display support.
>>>
>>> * All of our QA effort is focused on DC
>>> * All of our CQE effort is focused on DC
>>> * All of our OEM preloads and custom engagements use DC
>>> * DC behavior mirrors what we do for other OSes
>>>
>>> The new asic utilizes a completely re-designed atom interface, so we
>>> cannot easily leverage much of the existing atom-based code.
>>>
>>> We've introduced DC to the community earlier in 2016 and received a
>>> fair amount of feedback. Some of what we've addressed so far are:
>>>
>>> * Self-contain ASIC specific code. We did a bunch of work to pull
>>>   common sequences into dc/dce and leave ASIC specific code in
>>>   separate folders.
>>> * Started to expose AUX and I2C through generic kernel/drm
>>>   functionality and are mostly using that. Some of that code is still
>>>   needlessly convoluted. This cleanup is in progress.
>>> * Integrated Dave and Jerome’s work on removing abstraction in bios
>>>   parser.
>>> * Retire adapter service and asic capability
>>> * Remove some abstraction in GPIO
>>>
>>> Since a lot of our code is shared with pre- and post-silicon validation
>>> suites changes need to be done gradually to prevent breakages due to
>>> a major flag day. This, coupled with adding support for new asics and
>>> lots of new feature introductions means progress has not been as quick
>>> as we would have liked. We have made a lot of progress none the less.
>>>
>>> The remaining concerns that were brought up during the last review
>>> that we are working on addressing:
>>>
>>> * Continue to cleanup and reduce the abstractions in DC where it
>>>   makes sense.
>>> * Removing duplicate code in I2C and AUX as we transition to using the
>>>   DRM core interfaces. We can't fully transition until we've helped
>>>   fill in the gaps in the drm core that we need for certain features.
>>> * Making sure Atomic API support is correct.
>>>   Some of the semantics of the Atomic API were not particularly clear
>>>   when we started this, however, that is improving a lot as the core
>>>   drm documentation improves. Getting this code upstream and in the
>>>   hands of more atomic users will further help us identify and rectify
>>>   any gaps we have.
>>
>> Ok so I guess Dave is typing some more general comments about
>> demidlayering, let me type some guidelines about atomic. Hopefully this
>> all materializes itself a bit better into improved upstream docs, but
>> meh.
>>
>
> Excellent writeup. Let us know when/if you want our review for
> upstream docs.
>
> We'll have to really take some time to go over our atomic
> implementation. A couple small comments below with regard to DC.
>
>> Step 0: Prep
>>
>> So atomic is transactional, but it's not validate + rollback or commit,
>> but duplicate state, validate and then either throw away or commit.
>> There's a few big reasons for this: a) partial atomic updates - if you
>> duplicate it's much easier to check that you have all the right locks b)
>> kfree() is much easier to check for correctness than a rollback code and
>> c) atomic_check functions are much easier to audit for invalid changes
>> to persistent state.
>>
>
> There isn't really any rollback. I believe even in our other drivers
> we've abandoned the rollback approach years ago because it doesn't
> really work on modern HW. Any rollback cases you might find in DC
> should really only be for catastrophic errors (read: something went
> horribly wrong... read: congratulations, you just found a bug).
>

There is no rollback. We moved to "atomic" for Windows Vista in the previous DAL 8 years ago. Windows only cares about VidPnSource (frame buffer) and VidPnTarget (display output) and leaves the rest up to the driver, but we had to behave atomically, as Windows absolutely "checks" every possible config with the famous EnumVidPnCofuncModality DDI.
>> Trouble is that this seems a bit unusual compared to all other
>> approaches, and ime (from the drawn-out i915 conversion) you really
>> don't want to mix things up. Ofc for private state you can roll back
>> (e.g. vc4 does that for the drm_mm allocator thing for scanout slots or
>> whatever it is), but it's trivial easy to accidentally check the wrong
>> state or mix them up or something else bad.
>>
>> Long story short, I think step 0 for DC is to split state from objects,
>> i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state.
>> And all the back-end functions need to take both the object and the
>> state explicitly.
>>
>> This is a bit a pain to do, but should be pretty much just mechanical.
>> And imo not all of it needs to happen before DC lands in upstream, but
>> see above imo that half-converted state is postively horrible. This
>> should also not harm cross-os reuse at all, you can still store things
>> together on os where that makes sense.
>>
>> Guidelines for amdgpu atomic structures
>>
>> drm atomic stores everything in state structs on plane/connector/crtc.
>> This includes any property extensions or anything else really, the
>> entire userspace abi is built on top of this. Non-trivial drivers are
>> supposed to subclass these to store their own stuff, so e.g.
>>
>> amdgpu_plane_state {
>>     struct drm_plane_state base;
>>
>>     /* amdgpu glue state and stuff that's linux-specific, e.g.
>>      * property values and similar things. Note that there's strong
>>      * push towards standardizing properties and stroing them in the
>>      * drm_*_state structs. */
>>
>>     struct dc_surface_state surface_state;
>>
>>     /* other dc states that fit to a plane */
>> };
>>

Is there any requirement on where the header and code that deal with dc_surface_state have to live? Can we keep them under /dc while amdgpu_plane_state exists under /amdgpu_dm?
>> Yes not everything will fit 1:1 in one of these, but to get started I >> strongly recommend to make them fit (maybe with reduced feature sets to >> start out). Stuff that is shared between e.g. planes, but always on the >> same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers >> that >> are assignable to a plane. >> >> Of course atomic also supports truly global resources, for that you need >> to subclass drm_atomic_state. Currently msm and i915 do that, and >> probably >> best to read those structures as examples until I've typed the docs. >> But I >> expect that especially for planes a few dc_*_state structs will stay in >> amdgpu_*_state. >> We need to treat most resources that don't map well as global. One example is the pixel PLL. We have 6 display pipes but only 2 or 3 PLLs in CI/VI; as a result we are limited in the number of HDMI or DVI outputs we can drive at the same time. The pixel PLL can also be used to drive DP, so there is another layer of HW specifics, and we can't really contain it in the crtc or encoder by itself. Doing this resource allocation requires knowledge of the whole system: which pixel PLLs are already used, and what we can still support with the remaining PLLs. Another ask: let's say we are driving 2 displays. We would always want instance 0 and instance 1 of the scaler, timing generator, etc. to get used. We want to avoid the possibility that, due to a different user-mode commit sequence, we end up driving the 2 displays with the 0th and 2nd instances of the HW. Not only is that configuration not really validated in the lab, we would also be less effective at power gating, since instances 0 and 1 are on the same tile: instead of having 2/3 of the processing-pipeline silicon power gated we could only gate 1/3. And if we power gate the wrong instance, 1 of the 2 displays won't light up.
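[Editor's note] The deterministic instance assignment described here (two displays should always land on instances 0 and 1) can be sketched as a lowest-free-index allocator. The pool type and function names are hypothetical, not DC's actual allocator:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of deterministic shared-resource assignment: with e.g. 3
 * pixel PLLs serving 6 pipes, always hand out the lowest-numbered
 * free instance so the same config lands on the same silicon
 * regardless of commit order. */
#define NUM_PLLS 3

struct pll_pool {
    bool in_use[NUM_PLLS];
};

/* Returns the lowest free PLL index, or -1 if all are taken -- the
 * -1 case is where validation must reject the config. */
static int pll_acquire(struct pll_pool *pool)
{
    for (int i = 0; i < NUM_PLLS; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

static void pll_release(struct pll_pool *pool, int i)
{
    if (i >= 0 && i < NUM_PLLS)
        pool->in_use[i] = false;
}
```

Because acquisition is biased toward the lowest free instance, any commit sequence that ends with two active displays uses instances 0 and 1, preserving the lab-validated, power-gating-friendly configuration.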
Having HW resources used the same way on all platforms, under any sequence or circumstance, is important for us, as power optimization/measurement for a given platform + display config is mostly done on only 1 OS by the HW team. >> Guidelines for atomic_check >> >> Please use the helpers as much as makes sense, and put at least the >> basic >> steps that from drm_*_state into the respective dc_*_state functional >> block into the helper callbacks for that object. I think basic >> validation >> of individal bits (as much as possible, e.g. if you just don't support >> e.g. scaling or rotation with certain pixel formats) should happen in >> there too. That way when we e.g. want to check how drivers corrently >> validate a given set of properties to be able to more strictly define >> the >> semantics, that code is easy to find. >> >> Also I expect that this won't result in code duplication with other OS, >> you need code to map from drm to dc anyway, might as well >> check&reject the >> stuff that dc can't even represent right there. >> >> The other reason is that the helpers are good guidelines for some of the >> semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the >> right >> answer after atomic_check. If it doesn't, then you're driver doesn't >> follow atomic. If you completely roll your own this becomes much >> harder to >> assure. >> It doesn't today, and we have an equivalent check in DC in our hw_seq. We will look into how to make it work. Our "atomic" operates on always knowing the current state (core_dc.current_ctx) and finding the delta to the desired future state computed in our dc_validate. One thing we were struggling with: it seems DRM builds up incremental state, i.e. if something isn't mentioned in atomic_commit then you don't touch it. We operate in a mode where if something isn't mentioned in dc_commit_target we disable those outputs.
This method allows us to always know the current and future state, as the future state is built up by the caller (amdgpu), and we are able to transition into the future state on a vsync boundary if required. It seems to me that drm_*_state requires us to compartmentalize states. It won't be as trivial to fill the input for bandwidth_calc, as that beast needs everything, since everything ends up going through the same memory controller. Our validate_context is specifically designed to make it easy to generate the input parameters for bandwidth_calc. Per-pipe validation like pixel format and scaling is not a problem. > > Interesting point. Not sure if we've checked that. Is there some sort > of automated test for this that we can use to check? > >> Of course extend it all however you want, e.g. by adding all the global >> optimization and resource assignment stuff after initial per-object >> checking has been done using the helper infrastructure. >> >> Guidelines for atomic_commit >> >> Use the new nonblcoking helpers. Everyone who didn't got it wrong. Also, > > I believe we're not using those and didn't start with those which > might explain (along with lack of discussion on dri-devel) why atomic > currently looks the way it does in DC. This is definitely one of the > bigger issues we'd want to clean up and where you wouldn't find much > pushback, other than us trying to find time to do it. > >> your atomic_commit should pretty much match the helper one, except for a >> custom swap_state to handle all your globally shared specia dc_*_state >> objects. Everything hw specific should be in atomic_commit_tail. >> >> Wrt the hw commit itself, for the modeset step just roll your own. >> That's >> the entire point of atomic, and atm both i915 and nouveau exploit this >> fully. Besides a bit of glue there shouldn't be much need for >> linux-specific code here - what you need is something to fish the right >> dc_*_state objects and give it your main sequencer functions.
What you >> should make sure though is that only ever do a modeset when that was >> signalled, i.e. please use drm_crtc_needs_modeset to control that part. >> Feel free to wrap up in a dc_*_needs_modeset for better abstraction if >> that's needed. >> Using state properly will solve our double resource assignment/validation problem during commit. Thanks for the guidance on how to do this. Now the question is: can we have a helper function to house the main sequence and put it in /dc? >> I do strongly suggest however that you implement the plane commit using >> the helpers. There's really only a few ways to implement this in the hw, >> and it should work everywhere. >> Maybe from a SW perspective; I'll look at the Intel code to understand this. In terms of HW I would have to say I disagree with that. Even in our HW, the multi-plane blend stuff has gone through 1 minor revision and 1 major change. Also, the same HW is built to handle stereo 3D, multi-plane blending, pipe splitting and more. The pipeline/blending stuff tends to change in HW because the HW constantly needs to be redesigned to meet the timing requirements of ever-increasing pixel rates to keep us competitive. When the HW can't meet timing, they employ the split trick and have 2 copies of the same HW to be able to push through that many pixels. If we were Intel and on the latest process node then we probably wouldn't have this problem. I bet our 2018 HW will change again, especially as things are moving toward the 64bpp FP16 pixel format by default for HDR. >> Misc guidelines >> >> Use the suspend/resume helpers. If your atomic can't do that, it's not >> terribly good. Also, if DC can't make those fit, it's probably still too >> much midlayer and its own world than helper library. >> > > Do they handle swapping DP displays while the system is asleep? If not > we'll probably need to add that. The other case where we have some > special handling has to do with headless (sleep or resume, don't > remember).
> >> Use all the legacy helpers, again your atomic should be able to pull it >> off. One exception is async plane flips (both primary and cursors), >> that's >> atm still unsolved. Probably best to keep the old code around for just >> that case (but redirect to the compat helpers for everything), see e.g. >> how vc4 implements cursors. >> > > Good old flip. There probably isn't much shareable code between OSes > here. It seems like every OS rolls there own thing, regarding flips. > We still seem to be revisiting flips regularly, especially with > FreeSync (adaptive sync) in the mix now. Good to know that this is > still a bit of an open topic. > >> Most imporant of all >> >> Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic >> driver for which I don't remember a single discussion about some detail, >> at least not with any of the DAL folks. Michel&Alex asked some questions >> sometimes, but that indirection is bonghits and the defeats the point of >> upstream: Direct cross-vendor collaboration to get shit done. Please >> make >> it happen. >> > > Please keep asking us to get on dri-devel with questions. I need to > get into the habit again of leaving the IRC channel open. I think most > of us are still a bit scared of it or don't know how to deal with some > of the information overload (IRC and mailing list). It's some of my > job to change that all the while I'm learning this myself. :) > > Thanks for all your effort trying to get people involved. > >> Oh and I pretty much assume Harry&Tony are volunteered to review atomic >> docs ;-) >> > > Sure. > > Cheers, > Harry > >> Cheers, Daniel >> >> >> >>> >>> Unfortunately we cannot expose code for uGPU yet. However refactor / >>> cleanup >>> work on DC is public. We're currently transitioning to a public patch >>> review. You can follow our progress on the amd-gfx mailing list. We >>> value >>> community feedback on our work. 
>>> >>> As an appendix I've included a brief overview of the how the code >>> currently >>> works to make understanding and reviewing the code easier. >>> >>> Prior discussions on DC: >>> >>> * >>> https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html >>> * >>> https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html >>> >>> >>> Current version of DC: >>> >>> * >>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 >>> >>> Once Alex pulls in the latest patches: >>> >>> * >>> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 >>> >>> Best Regards, >>> Harry >>> >>> >>> ************************************************ >>> *** Appendix: A Day in the Life of a Modeset *** >>> ************************************************ >>> >>> Below is a high-level overview of a modeset with dc. Some of this >>> might be a >>> little out-of-date since it's based on my XDC presentation but it >>> should be >>> more-or-less the same. >>> >>> amdgpu_dm_atomic_commit() >>> { >>> /* setup atomic state */ >>> drm_atomic_helper_prepare_planes(dev, state); >>> drm_atomic_helper_swap_state(dev, state); >>> drm_atomic_helper_update_legacy_modeset_state(dev, state); >>> >>> /* create or remove targets */ >>> >>> /******************************************************************** >>> * *** Call into DC to commit targets with list of all known targets >>> ********************************************************************/ >>> /* DC is optimized not to do anything if 'targets' didn't change. 
*/ >>> dc_commit_targets(dm->dc, commit_targets, commit_targets_count) >>> { >>> /****************************************************************** >>> * *** Build context (function also used for validation) >>> ******************************************************************/ >>> result = core_dc->res_pool->funcs->validate_with_context( >>> core_dc,set,target_count,context); >>> >>> /****************************************************************** >>> * *** Apply safe power state >>> ******************************************************************/ >>> pplib_apply_safe_state(core_dc); >>> >>> /**************************************************************** >>> * *** Apply the context to HW (program HW) >>> ****************************************************************/ >>> result = core_dc->hwss.apply_ctx_to_hw(core_dc,context) >>> { >>> /* reset pipes that need reprogramming */ >>> /* disable pipe power gating */ >>> /* set safe watermarks */ >>> >>> /* for all pipes with an attached stream */ >>> /************************************************************ >>> * *** Programming all per-pipe contexts >>> ************************************************************/ >>> status = apply_single_controller_ctx_to_hw(...) 
>>> { >>> pipe_ctx->tg->funcs->set_blank(...); >>> pipe_ctx->clock_source->funcs->program_pix_clk(...); >>> pipe_ctx->tg->funcs->program_timing(...); >>> pipe_ctx->mi->funcs->allocate_mem_input(...); >>> pipe_ctx->tg->funcs->enable_crtc(...); >>> bios_parser_crtc_source_select(...); >>> >>> pipe_ctx->opp->funcs->opp_set_dyn_expansion(...); >>> pipe_ctx->opp->funcs->opp_program_fmt(...); >>> >>> stream->sink->link->link_enc->funcs->setup(...); >>> pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...); >>> pipe_ctx->tg->funcs->set_blank_color(...); >>> >>> core_link_enable_stream(pipe_ctx); >>> unblank_stream(pipe_ctx, >>> >>> program_scaler(dc, pipe_ctx); >>> } >>> /* program audio for all pipes */ >>> /* update watermarks */ >>> } >>> >>> program_timing_sync(core_dc, context); >>> /* for all targets */ >>> target_enable_memory_requests(...); >>> >>> /* Update ASIC power states */ >>> pplib_apply_display_requirements(...); >>> >>> /* update surface or page flip */ >>> } >>> } >>> >>> >>> _______________________________________________ >>> dri-devel mailing list >>> dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org >>> https://lists.freedesktop.org/mailman/listinfo/dri-devel >> _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-13 4:10 ` Cheng, Tony @ 2016-12-13 7:50 ` Daniel Vetter [not found] ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org> 1 sibling, 0 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-13 7:50 UTC (permalink / raw) To: Cheng, Tony; +Cc: Grodzovsky, Andrey, amd-gfx, dri-devel, Deucher, Alexander On Mon, Dec 12, 2016 at 11:10:30PM -0500, Cheng, Tony wrote: > Thanks for the write up for the guide. We can definitely re-do atomic > according to guideline provided as I am not satified with how our code look > today. To me it seems more like we need to shuffle stuff around and rename > a few things than rewrite much of anything. > > I hope to get an answer on the reply to Dave's question regarding to if > there is anything else. If we can keep most of the stuff under /dc as the > "back end" helper and do most of the change under /amdgpu_dm then it isn't > that difficult as we don't need to go deal with the fall out on other > platforms. Again it's not just windows. We are fully aware that it's hard > to find the common abstraction between all different OS so we try our best > to have DC behave more like a helper than abstraction layer anyways. In our > design states and policies are domain of Display Managers (DM) and because > of linux we also say anything DRM can do that's also domain of DM. We don't > put anything in DC that we don't feel comfortable if HW decide to hide it in > FW. > > > On 12/12/2016 9:33 PM, Harry Wentland wrote: > > On 2016-12-11 03:28 PM, Daniel Vetter wrote: > > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: > > > > We propose to use the Display Core (DC) driver for display support on > > > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). > > > > In order to > > > > avoid a flag day the plan is to only support uGPU initially and > > > > transition > > > > to older ASICs gradually. 
> > > > > > Bridgeman brought it up a few times that this here was the question > > > - it's > > > kinda missing a question mark, hard to figure this out ;-). I'd say for > > > > My bad for the missing question mark (imprecise phrasing). On the other > > hand letting this blow over a bit helped get us on the map a bit more > > and allows us to argue the challenges (and benefits) of open source. :) > > > > > upstream it doesn't really matter, but imo having both atomic and > > > non-atomic paths in one driver is one world of hurt and I strongly > > > recommend against it, at least if feasible. All drivers that switched > > > switched in one go, the only exception was i915 (it took much longer > > > than > > > we ever feared, causing lots of pain) and nouveau (which only converted > > > nv50+, but pre/post-nv50 have always been two almost completely separate > > > worlds anyway). > > > > > > trust me we would like to upstream everything. Just we didn't invest enough > in DC code in the previous generation so the quality might not be there. > > > You mention the two probably most complex DRM drivers didn't switch in a > > single go... I imagine amdgpu/DC falls into the same category. > > > > I think one of the problems is making a sudden change with a fully > > validated driver without breaking existing use cases and customers. We > > really should've started DC development in public and probably would do > > that if we had to start anew. > > > > > > The DC component has received extensive testing within AMD for > > > > DCE8, 10, and > > > > 11 GPUs and is being prepared for uGPU. Support should be better than > > > > amdgpu's current display support. 
> > > > > > > > * All of our QA effort is focused on DC > > > > * All of our CQE effort is focused on DC > > > > * All of our OEM preloads and custom engagements use DC > > > > * DC behavior mirrors what we do for other OSes > > > > > > > > The new asic utilizes a completely re-designed atom interface, > > > > so we cannot > > > > easily leverage much of the existing atom-based code. > > > > > > > > We've introduced DC to the community earlier in 2016 and > > > > received a fair > > > > amount of feedback. Some of what we've addressed so far are: > > > > > > > > * Self-contain ASIC specific code. We did a bunch of work to pull > > > > common sequences into dc/dce and leave ASIC specific code in > > > > separate folders. > > > > * Started to expose AUX and I2C through generic kernel/drm > > > > functionality and are mostly using that. Some of that code is still > > > > needlessly convoluted. This cleanup is in progress. > > > > * Integrated Dave and Jerome’s work on removing abstraction in bios > > > > parser. > > > > * Retire adapter service and asic capability > > > > * Remove some abstraction in GPIO > > > > > > > > Since a lot of our code is shared with pre- and post-silicon validation > > > > suites changes need to be done gradually to prevent breakages > > > > due to a major > > > > flag day. This, coupled with adding support for new asics and > > > > lots of new > > > > feature introductions means progress has not been as quick as we > > > > would have > > > > liked. We have made a lot of progress none the less. > > > > > > > > The remaining concerns that were brought up during the last > > > > review that we > > > > are working on addressing: > > > > > > > > * Continue to cleanup and reduce the abstractions in DC where it > > > > makes sense. > > > > * Removing duplicate code in I2C and AUX as we transition to using the > > > > DRM core interfaces. 
We can't fully transition until we've helped > > > > fill in the gaps in the drm core that we need for certain features. > > > > * Making sure Atomic API support is correct. Some of the semantics of > > > > the Atomic API were not particularly clear when we started this, > > > > however, that is improving a lot as the core drm documentation > > > > improves. Getting this code upstream and in the hands of more > > > > atomic users will further help us identify and rectify any gaps we > > > > have. > > > > > > Ok so I guess Dave is typing some more general comments about > > > demidlayering, let me type some guidelines about atomic. Hopefully this > > > all materializes itself a bit better into improved upstream docs, > > > but meh. > > > > > > > Excellent writeup. Let us know when/if you want our review for upstream > > docs. > > > > We'll have to really take some time to go over our atomic > > implementation. A couple small comments below with regard to DC. > > > > > Step 0: Prep > > > > > > So atomic is transactional, but it's not validate + rollback or commit, > > > but duplicate state, validate and then either throw away or commit. > > > There's a few big reasons for this: a) partial atomic updates - if you > > > duplicate it's much easier to check that you have all the right locks b) > > > kfree() is much easier to check for correctness than a rollback code and > > > c) atomic_check functions are much easier to audit for invalid > > > changes to > > > persistent state. > > > > > > > There isn't really any rollback. I believe even in our other drivers > > we've abandoned the rollback approach years ago because it doesn't > > really work on modern HW. Any rollback cases you might find in DC should > > really only be for catastrophic errors (read: something went horribly > > wrong... read: congratulations, you just found a bug). > > > There is no rollback. We moved to "atomic" for Windows Vista in the > previous DAL 8 years ago. 
Windows only care about VidPnSource (frame > buffer) and VidPnTarget (display output) and leave the rest up to driver but > we had to behave atomic as Window obsolutely "check" every possible config > with the famous EnumConfunctionalModality DDI. > > > > Trouble is that this seems a bit unusual compared to all other > > > approaches, > > > and ime (from the drawn-out i915 conversion) you really don't want > > > to mix > > > things up. Ofc for private state you can roll back (e.g. vc4 does > > > that for > > > the drm_mm allocator thing for scanout slots or whatever it is), but > > > it's > > > trivial easy to accidentally check the wrong state or mix them up or > > > something else bad. > > > > > > Long story short, I think step 0 for DC is to split state from objects, > > > i.e. for each dc_surface/foo/bar you need a > > > dc_surface/foo/bar_state. And > > > all the back-end functions need to take both the object and the state > > > explicitly. > > > > > > This is a bit a pain to do, but should be pretty much just > > > mechanical. And > > > imo not all of it needs to happen before DC lands in upstream, but see > > > above imo that half-converted state is postively horrible. This should > > > also not harm cross-os reuse at all, you can still store things together > > > on os where that makes sense. > > > > > > Guidelines for amdgpu atomic structures > > > > > > drm atomic stores everything in state structs on plane/connector/crtc. > > > This includes any property extensions or anything else really, the > > > entire > > > userspace abi is built on top of this. Non-trivial drivers are > > > supposed to > > > subclass these to store their own stuff, so e.g. > > > > > > amdgpu_plane_state { > > > struct drm_plane_state base; > > > > > > /* amdgpu glue state and stuff that's linux-specific, e.g. > > > * property values and similar things. Note that there's strong > > > * push towards standardizing properties and stroing them in the > > > * drm_*_state structs. 
*/ > > > > > > struct dc_surface_state surface_state; > > > > > > /* other dc states that fit to a plane */ > > > }; > > > > Is there any requirement where the header and code that deal with > dc_surface_state has to be? Can we keep it under /dc while > amdgpu_plane_state exist under /amdgpu_dm? None. And my proposal here with having dc_*_state structures for the dc block, and fairly separate amdgpu_*_state blocks to bind it into drm is exactly to facilitate this split, so dc_* stuff would still entirely live in dc/ (and hopefully be shared with everyone else), while amdgpu would be the linux glue. > > > Yes not everything will fit 1:1 in one of these, but to get started I > > > strongly recommend to make them fit (maybe with reduced feature sets to > > > start out). Stuff that is shared between e.g. planes, but always on the > > > same crtc can be put into amdgpu_crtc_state, e.g. if you have > > > scalers that > > > are assignable to a plane. > > > > > > Of course atomic also supports truly global resources, for that you need > > > to subclass drm_atomic_state. Currently msm and i915 do that, and > > > probably > > > best to read those structures as examples until I've typed the docs. > > > But I > > > expect that especially for planes a few dc_*_state structs will stay in > > > amdgpu_*_state. > > > > We need to treat most of resource that don't map well as global. One example > is pixel pll. We have 6 display pipes but only 2 or 3 plls in CI/VI, as a > result we are limited in number of HDMI or DVI we can drive at the same > time. Also the pixel pll can be used to drive DP as well, so there is > another layer of HW specific but we can't really contain it in crtc or > encoder by itself. Doing this resource allocation require knowlege of the > whole system, and knowning which pixel pll is already used, and what can we > support with remaining pll. Same on i915.
Other stuff we currently treat as global are the overall clocks&bandwidth/latency needs, plus all the fetch fifo settings and latencies (because they depend in complicated ways on everything else). But e.g. the fetch latency and bw needed for each plane is computed in the plane check code. Scalers otoh are per-crtc on intel. > Another ask is lets say we are driving 2 displays, we would always want > instance 0 and instance 1 of scaler, timing generator etc getting used. We > want to avoid possiblity of due to different user mode commit sequence we > end up with driving the 2 display with 0 and 2nd instance of HW. Not only > this configuration isn't really validated in the lab, we will be less > effective in power gating as instance 0 and 1 are one the same tile. > instead of having 2/3 of processing pipeline silicon power gated we can only > power gate 1/3. And if we power gate wrong the you will have 1 of the 2 > display not lighting up. Just implement some bias in which shared resources you prefer for which crtc. Also note that with atomic you can always add more drm objects to the commit. So if you've put yourself into a corner (for power optimization reasons) but then userspace wants to light up more displays, you can reassign the resources for the already enabled outputs (e.g. when not all clocks are the same and only some support really high clocks). We do that on intel when we need to change the display core clock, since that means recomputing everything (and you can't change the display core clock while a display is on). > Having HW resource used the same way on all platform under any sequence / > circumstance is important for us, as power optimization/measure is done for > given platform + display config mostly on only 1 OS by the HW team. Yeah, should all be possible with atomic. And I think with some work it should be possible to keep that selection logic for shared resources in the shared code, even with this redesign.
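[Editor's note] Daniel's suggestion to pull the already-enabled outputs into the commit and recompute everything can be sketched as a deterministic full re-assignment. All names and the 3-clock limit below are invented for illustration:

```c
#include <assert.h>

#define MAX_PIPES 6

/* Hypothetical sketch: when a new output would not fit the current
 * shared-clock assignment, pull all enabled CRTCs into the commit and
 * recompute from scratch in a fixed order. Deterministic: the same
 * set of active pipes always lands on the same clock instances. */
static int reassign_clocks(const int active_pipes[], int n_active,
                           int clock_for_pipe[MAX_PIPES])
{
    for (int p = 0; p < MAX_PIPES; p++)
        clock_for_pipe[p] = -1;        /* -1: pipe has no clock */
    for (int i = 0; i < n_active; i++) {
        if (i >= 3)                    /* e.g. only 3 PLL-like clocks */
            return -1;                 /* config doesn't fit: fail check */
        clock_for_pipe[active_pipes[i]] = i;
    }
    return 0;
}
```

Since the result depends only on the set of active pipes, not on the order userspace enabled them, this keeps the resource layout identical to the lab-validated one.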
> > > Guidelines for atomic_check > > > > > > Please use the helpers as much as makes sense, and put at least the > > > basic > > > steps that from drm_*_state into the respective dc_*_state functional > > > block into the helper callbacks for that object. I think basic > > > validation > > > of individal bits (as much as possible, e.g. if you just don't support > > > e.g. scaling or rotation with certain pixel formats) should happen in > > > there too. That way when we e.g. want to check how drivers corrently > > > validate a given set of properties to be able to more strictly > > > define the > > > semantics, that code is easy to find. > > > > > > Also I expect that this won't result in code duplication with other OS, > > > you need code to map from drm to dc anyway, might as well > > > check&reject the > > > stuff that dc can't even represent right there. > > > > > > The other reason is that the helpers are good guidelines for some of the > > > semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the > > > right > > > answer after atomic_check. If it doesn't, then you're driver doesn't > > > follow atomic. If you completely roll your own this becomes much > > > harder to > > > assure. > > > > it doesn't today and we have equilvant check in dc in our hw_seq. We will > look into how to make it work. Our "atomic" operate on always knowing the > current state (core_dc.current_ctx) and finding out the delta between the > desired future state computed in our dc_validate. One thing we were > stuggling with it seems DRM is building up incremental state, ie. if > something isn't mentioned in atomic_commit then you don't touch it. We > operate in a mode where if something isn't mentioned in dc_commit_target we > disable those output. this method allow us to always know current and > future state, as future state is built up by caller (amdgpu), and we are > able to transition into the future state on vsync boundary if required. 
It > seems to me that drm_*_state require us to compartimentize states. It won't > be as trivial to fill the input for bandwidth_calc as that beast need > everything as everything end up goes through the same memory controller. > our validate_context is specifically design to make it easy to generate > input parameter for bandwidth_calc. Per pipe validate like pixel format, > scaling is not a problem. Hm, that needs to be fixed. Atomic also gives you both old and new state, but only for state objects which are changed. You can just go around and add everything (see example above), but you should _only_ do that when necessary. One design goal of atomic is that when you do a modeset on a 2nd display then page-flip should continue to work (assuming the hw can do it) on the 1st display without any stalls. We have fine-grained locking for this, but if you always need all the state then you defeat that point. Of course this is a bit tricky if you have lots of complicated shared state. The way we solve this is by pushing copies of relevant data from planes/crtc down to the shared resources. This way you end up with a read-only copy, and as long as those derived values don't change the independently running pageflip loop won't need to stall for your modeset. And the modeset code can still look at the data, without grabbing the full update lock. This is probably going to be a bit of a rework, so for starters it would make sense to only aim to have parallel flips (without any modesets). That still means you need to be careful with grabbing global states. > > Interesting point. Not sure if we've checked that. Is there some sort of > > automated test for this that we can use to check? > > > > > Of course extend it all however you want, e.g. by adding all the global > > > optimization and resource assignment stuff after initial per-object > > > checking has been done using the helper infrastructure. > > > > > > Guidelines for atomic_commit > > > > > > Use the new nonblcoking helpers.
Everyone who didn't got it wrong. Also, > > > > I believe we're not using those and didn't start with those which might > > explain (along with lack of discussion on dri-devel) why atomic > > currently looks the way it does in DC. This is definitely one of the > > bigger issues we'd want to clean up and where you wouldn't find much > > pushback, other than us trying to find time to do it. > > > > > your atomic_commit should pretty much match the helper one, except for a > > > custom swap_state to handle all your globally shared specia dc_*_state > > > objects. Everything hw specific should be in atomic_commit_tail. > > > > > > Wrt the hw commit itself, for the modeset step just roll your own. > > > That's > > > the entire point of atomic, and atm both i915 and nouveau exploit this > > > fully. Besides a bit of glue there shouldn't be much need for > > > linux-specific code here - what you need is something to fish the right > > > dc_*_state objects and give it your main sequencer functions. What you > > > should make sure though is that only ever do a modeset when that was > > > signalled, i.e. please use drm_crtc_needs_modeset to control that part. > > > Feel free to wrap up in a dc_*_needs_modeset for better abstraction if > > > that's needed. > > > > Using state properly will solve our double resource assignment/validation > problem during commit. Thanks for the guidance on how to do this. > > now the question is can we have a helper function to house the main sequence > and put it in /dc? Yeah, that's my proposal. You probably need some helper functions and iterator macros on the overall dc_state (or amdgpu_state, whatever you call it) so that your helper function can walk all the state objects correctly. And only those which are part of the state (see above for why this is important), but sharing that overall commit logic should be possible. > > > I do strongly suggest however that you implement the plane commit using > > > the helpers. 
There's really only a few ways to implement this in the hw, > > > and it should work everywhere. > > > > Maybe from a SW perspective. I'll look at the Intel code to understand this. > In terms of HW I would have to say I disagree with that. The multi-plane > blend stuff even in our HW has gone through 1 minor revision and 1 major > change. Also the same HW is built to handle stereo 3D, multi-plane > blending, pipe splitting and more. The pipeline / blending stuff tends to > change in HW because HW needs to be constantly redesigned to meet the timing > requirements of ever-increasing pixel rates to keep us competitive. When HW > can't meet timing they employ the split trick and have 2 copies of the same HW > to be able to push through that many pixels. If we were Intel and > on the latest process node then we probably wouldn't have this problem. I bet our > 2018 HW will change again, especially as things are moving toward 64bpp FP16 > pixel formats by default for HDR. None of this matters for atomic multi-plane commit. There's about 3 ways to do that: - GO bit that you set to signal to the hw the new state that it should commit on the next vblank. - vblank inhibit bit (works like GO inverted, but doesn't auto-clear). - vblank evasion in software. I haven't seen anything else in 20+ atomic drivers. None of this has anything to do with the features your planes and blending engine support. And the above is somewhat interesting to know because it matters for how you send out the completion event for the atomic commit correctly, which again is part of the uabi contract. Hence I think it makes sense to consider these strongly, even though they're fairly deeply nested - they are again a part of the glue that binds DC into the linux world. Cheers, Daniel > > > Misc guidelines > > > > > > Use the suspend/resume helpers. If your atomic can't do that, it's not > > > terribly good.
Also, if DC can't make those fit, it's probably still too > > > much of a midlayer and its own world rather than a helper library. > > > > > > > Do they handle swapping DP displays while the system is asleep? If not > > we'll probably need to add that. The other case where we have some > > special handling has to do with headless (sleep or resume, don't > > remember). > > > > > Use all the legacy helpers, again your atomic should be able to pull it > > > off. One exception is async plane flips (both primary and cursors), > > > that's > > > atm still unsolved. Probably best to keep the old code around for just > > > that case (but redirect to the compat helpers for everything), see e.g. > > > how vc4 implements cursors. > > > > > > > Good old flip. There probably isn't much shareable code between OSes > > here. It seems like every OS rolls their own thing regarding flips. We > > still seem to be revisiting flips regularly, especially with FreeSync > > (adaptive sync) in the mix now. Good to know that this is still a bit of > > an open topic. > > > > > Most important of all > > > > > > Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic > > > driver for which I don't remember a single discussion about some detail, > > > at least not with any of the DAL folks. Michel&Alex asked some questions > > > sometimes, but that indirection is bonghits and defeats the point of > > > upstream: Direct cross-vendor collaboration to get shit done. Please > > > make > > > it happen. > > > > > > > Please keep asking us to get on dri-devel with questions. I need to get > > into the habit again of leaving the IRC channel open. I think most of us > > are still a bit scared of it or don't know how to deal with some of the > > information overload (IRC and mailing list). It's part of my job to > > change that, all the while I'm learning this myself. :) > > > > Thanks for all your effort trying to get people involved.
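Daniel's three options above for committing multi-plane state can be hard to picture in the abstract. Below is a toy, single-file C model of the first one, a GO bit latched at vblank; the names and latch behaviour are illustrative only, not any real AMD or DRM register interface:

```c
#include <stdbool.h>

/* Toy model (not real hardware) of the "GO bit" scheme: software writes a
 * shadow copy of the plane state, then sets GO; the hardware latches the
 * shadow into the active state at the next vblank and auto-clears GO. */

struct plane_regs {
    int fb_addr;
};

struct pipe_hw {
    struct plane_regs active;   /* what the hw is scanning out */
    struct plane_regs pending;  /* shadow copy written by software */
    bool go;                    /* commit pending regs at next vblank */
};

/* software side: stage the new state and arm the commit */
static void stage_flip(struct pipe_hw *hw, int fb_addr)
{
    hw->pending.fb_addr = fb_addr;
    hw->go = true;
}

/* hardware side: at each vblank, latch pending state if GO is armed */
static void vblank_irq(struct pipe_hw *hw)
{
    if (hw->go) {
        hw->active = hw->pending;
        hw->go = false;         /* GO auto-clears once latched */
    }
}
```

The point Daniel makes about the completion event follows directly: in this scheme, the atomic commit is only "done" once vblank_irq() has latched the state, so that is where the event would be sent.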
> > > > > Oh and I pretty much assume Harry&Tony are volunteered to review atomic > > > docs ;-) > > > > > > > Sure. > > > > Cheers, > > Harry > > > > > Cheers, Daniel > > > > > > > > > > > > > > > > > Unfortunately we cannot expose code for uGPU yet. However > > > > refactor / cleanup > > > > work on DC is public. We're currently transitioning to a public patch > > > > review. You can follow our progress on the amd-gfx mailing list. > > > > We value > > > > community feedback on our work. > > > > > > > > As an appendix I've included a brief overview of how the > > > > code currently > > > > works to make understanding and reviewing the code easier. > > > > > > > > Prior discussions on DC: > > > > > > > > * > > > > https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html > > > > * > > > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html > > > > > > > > > > > > Current version of DC: > > > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > > > Once Alex pulls in the latest patches: > > > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > > > Best Regards, > > > > Harry > > > > > > > > > > > > ************************************************ > > > > *** Appendix: A Day in the Life of a Modeset *** > > > > ************************************************ > > > > > > > > Below is a high-level overview of a modeset with dc. Some of > > > > this might be a > > > > little out-of-date since it's based on my XDC presentation but > > > > it should be > > > > more-or-less the same.
> > > > > > > > amdgpu_dm_atomic_commit() > > > > { > > > > /* setup atomic state */ > > > > drm_atomic_helper_prepare_planes(dev, state); > > > > drm_atomic_helper_swap_state(dev, state); > > > > drm_atomic_helper_update_legacy_modeset_state(dev, state); > > > > > > > > /* create or remove targets */ > > > > > > > > /******************************************************************** > > > > * *** Call into DC to commit targets with list of all known targets > > > > ********************************************************************/ > > > > /* DC is optimized not to do anything if 'targets' didn't change. */ > > > > dc_commit_targets(dm->dc, commit_targets, commit_targets_count) > > > > { > > > > /****************************************************************** > > > > * *** Build context (function also used for validation) > > > > ******************************************************************/ > > > > result = core_dc->res_pool->funcs->validate_with_context( > > > > core_dc,set,target_count,context); > > > > > > > > /****************************************************************** > > > > * *** Apply safe power state > > > > ******************************************************************/ > > > > pplib_apply_safe_state(core_dc); > > > > > > > > /**************************************************************** > > > > * *** Apply the context to HW (program HW) > > > > ****************************************************************/ > > > > result = core_dc->hwss.apply_ctx_to_hw(core_dc,context) > > > > { > > > > /* reset pipes that need reprogramming */ > > > > /* disable pipe power gating */ > > > > /* set safe watermarks */ > > > > > > > > /* for all pipes with an attached stream */ > > > > /************************************************************ > > > > * *** Programming all per-pipe contexts > > > > ************************************************************/ > > > > status = apply_single_controller_ctx_to_hw(...) 
> > > > { > > > > pipe_ctx->tg->funcs->set_blank(...); > > > > pipe_ctx->clock_source->funcs->program_pix_clk(...); > > > > pipe_ctx->tg->funcs->program_timing(...); > > > > pipe_ctx->mi->funcs->allocate_mem_input(...); > > > > pipe_ctx->tg->funcs->enable_crtc(...); > > > > bios_parser_crtc_source_select(...); > > > > > > > > pipe_ctx->opp->funcs->opp_set_dyn_expansion(...); > > > > pipe_ctx->opp->funcs->opp_program_fmt(...); > > > > > > > > stream->sink->link->link_enc->funcs->setup(...); > > > > pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...); > > > > pipe_ctx->tg->funcs->set_blank_color(...); > > > > > > > > core_link_enable_stream(pipe_ctx); > > > > unblank_stream(pipe_ctx, > > > > > > > > program_scaler(dc, pipe_ctx); > > > > } > > > > /* program audio for all pipes */ > > > > /* update watermarks */ > > > > } > > > > > > > > program_timing_sync(core_dc, context); > > > > /* for all targets */ > > > > target_enable_memory_requests(...); > > > > > > > > /* Update ASIC power states */ > > > > pplib_apply_display_requirements(...); > > > > > > > > /* update surface or page flip */ > > > > } > > > > } > > > > > > > > > > > > _______________________________________________ > > > > dri-devel mailing list > > > > dri-devel@lists.freedesktop.org > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
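The amdgpu_dm_atomic_commit() walkthrough quoted above runs everything synchronously. The nonblocking-helper shape Daniel recommends splits it in two: swap the software state immediately, then do all hardware-specific programming in deferred tail work. A minimal single-threaded sketch of that split, with toy types standing in for the actual DRM helpers and a manual flush standing in for the worker:

```c
#include <stdbool.h>

/* Illustrative sketch (not the DRM helpers themselves) of a nonblocking
 * commit: the ioctl path swaps software state synchronously and queues
 * tail work, so the caller returns before the hardware is touched. */

struct dev_state {
    int sw_mode;                /* swapped synchronously at commit time */
    int hw_mode;                /* programmed later by the tail work */
    bool tail_pending;
};

static void nonblocking_commit(struct dev_state *dev, int new_mode)
{
    dev->sw_mode = new_mode;    /* "swap_state": sw view updated now */
    dev->tail_pending = true;   /* hw programming deferred to a worker */
}

/* in a real driver this body runs from a workqueue (commit_tail) */
static void flush_commit_tail(struct dev_state *dev)
{
    if (dev->tail_pending) {
        dev->hw_mode = dev->sw_mode;  /* all hw-specific work lives here */
        dev->tail_pending = false;
    }
}
```

Everything between dc_commit_targets() and the register writes in the quoted sequence would land in the tail-work half of this split.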
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org> @ 2016-12-13 7:30 ` Dave Airlie 2016-12-13 9:14 ` Cheng, Tony 2016-12-13 14:59 ` Rob Clark 1 sibling, 1 reply; 66+ messages in thread From: Dave Airlie @ 2016-12-13 7:30 UTC (permalink / raw) To: Cheng, Tony Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Daniel Vetter, Deucher, Alexander, Harry Wentland (hit send too early) > We would love to upstream DC for all supported asics! We made enough changes > to make Sea Island work but it's really not validated to the extent we > validate Polaris on Linux and nowhere close to what we do for 2017 ASICs. > With DC the display hardware programming, resource optimization, power > management and interaction with the rest of the system will be fully validated > across multiple OSes. Therefore we have high confidence that the quality is > going to be better than what we have upstreamed today. > > I don't have a baseline to say if DC is of good enough quality for older > generations compared to upstream. For example we don't have a HW-generated > bandwidth_calc for DCE 8/10 (Sea/Volcanic Island family) but our code is > structured in a way that we assume bandwidth_calc is there. None of us feels > like untangling the formulas in the Windows driver at this point to create our > own version of bandwidth_calc. It sort of works with HW default values but > some modes / configs are likely to underflow. If the community is okay with > uncertain quality, sure we would love to upstream everything to reduce our > maintenance overhead. You do get audio with DC on DCE8 though. If we get any of this upstream, we should get all of the hw supported with it. If it regresses we just need someone to debug why. > Maybe let me share what we are doing and see if we can come up with > something to make DC work for both upstream and our internal need.
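bandwidth_calc needing the whole system state is the same global-state problem Daniel raised earlier in the thread, and the pattern he sketched was to publish derived, read-only per-pipe values into the shared resource so page flips never stall on the global modeset lock. A hypothetical illustration of that pattern; the formula and all names are made up, nothing here is from the real bandwidth_calc:

```c
#include <stdbool.h>

#define NUM_PIPES 2

/* Toy shared memory-controller state: the modeset path, under the global
 * lock, runs the full bandwidth calculation and publishes a derived
 * per-pipe budget. The flip path only reads that snapshot, so it never
 * needs the global lock. */
struct shared_mc {
    /* read-only snapshot; rewritten only while the modeset lock is held */
    int bw_budget[NUM_PIPES];
};

/* modeset path: recompute everything, then publish derived values */
static void modeset_publish(struct shared_mc *mc,
                            const int pixel_clock_khz[NUM_PIPES])
{
    for (int i = 0; i < NUM_PIPES; i++)
        mc->bw_budget[i] = pixel_clock_khz[i] * 4;  /* toy formula */
}

/* flip path: lock-free check against the published snapshot */
static bool flip_fits(const struct shared_mc *mc, int pipe, int bw_needed)
{
    return bw_needed <= mc->bw_budget[pipe];
}
```

As long as the derived budgets don't change, a flip on one pipe never has to serialize against a modeset on another.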
We are > sharing code not just on Linux and we will do our best to make our code > upstream friendly. Last year we focussed on having enough code to prove > that our DAL rewrite works and get more people contributing to it. We rushed > a bit and as a result we had a few legacy components we ported from the Windows driver > and we know it's bloat that needs to go. > > We designed DC so HW can contribute bandwidth_calc magic and pseudocode to > program the HW blocks. The HW blocks on the bottom of DC.JPG model our > HW blocks and the programming sequences are provided by HW engineers. If a > piece of HW needs a bit toggled 7 times during power up I'd rather have the HW > engineer put that in their pseudocode rather than me trying to find that > sequence in some document. After all, they did simulate the HW with the > toggle sequence. I guess these are the back-end code Daniel talked about. Can > we agree that DRM core is not interested in how things are done in that > layer and we can upstream these as is? > > The next is dce_hwseq.c to program the HW blocks in the correct sequence. Some > HW blocks can be programmed in any sequence, but some require a strict > sequence to be followed. For example Display CLK and PHY CLK need to be up > before we enable the timing generator. I would like these sequences to remain in > DC as it's really not DRM's business to know how to program the HW. In a > way you can consider hwseq as a helper to commit state to HW. > > Above hwseq is the dce*_resource.c. Its job is to come up with the HW > state required to realize a given config. For example we would use the exact > same HW resources with the same optimization settings to drive any same given > config. If 4 x 4k@60 is supported with resource setting A on the HW diagnostic > suite during bring-up but setting B on Linux then we have a problem. It knows > which HW block works with which block and their capabilities and limitations.
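The hwseq role described above, for example display clock and PHY clock up before the timing generator, actually fits the leaf-helper model Dave asks for: the ordering constraint lives inside the helper, control flow stays with the caller, and nothing calls back upward. A sketch with made-up names, not DC's actual functions:

```c
#include <stdbool.h>

/* Hypothetical hwseq-as-leaf-helper: enforce that clocks are up before
 * the timing generator is enabled, without ever calling back into the
 * layer above. */

struct pipe_state {
    bool dispclk_on;
    bool phyclk_on;
    bool tg_enabled;
};

static void enable_dispclk(struct pipe_state *p) { p->dispclk_on = true; }
static void enable_phyclk(struct pipe_state *p)  { p->phyclk_on = true; }

/* refuses to enable the timing generator out of order */
static bool enable_timing_generator(struct pipe_state *p)
{
    if (!p->dispclk_on || !p->phyclk_on)
        return false;           /* sequencing violation */
    p->tg_enabled = true;
    return true;
}

/* the "hwseq" helper: one linear, caller-driven enable sequence */
static bool hwseq_enable_pipe(struct pipe_state *p)
{
    enable_dispclk(p);
    enable_phyclk(p);
    return enable_timing_generator(p);
}
```

The caller (the drm-side commit code) decides when to run the sequence; the helper only knows how.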
> I hope you are not asking this stuff to move up to core because in reality > we should probably hide this in some FW; just because HW exposes the registers to configure > them differently doesn't mean all combinations of HW usage are validated. > To me resource is more of a helper to put together a functional pipeline and > does not make any decision that any OS might be interested in. > > These yellow boxes in DC.JPG are really specific to each generation of HW > and change frequently. These are things that HW has considered hiding in > FW before. Can we agree that this code (under /dc/dce*) can stay? I think most of these things are fine to be part of the solution we end up at, but I can't say for certain they won't require interface changes. I think the most useful code is probably the stuff in the dce subdirectories. > > Is this about demonstrating how basic functionality works and adding more > features with a series of patches to make review easier? If so I don't think > we are staffed to do this kind of rewrite. For example it makes no sense to > hook up bandwidth_calc to calculate HW magic if we don't have mem_input > to program the memory settings. We need a portion of hw_seq to ensure these > blocks are programmed in the correct sequence. We will need to feed > bandwidth_calc its required inputs, which is basically the whole system > state tracked in validate_context today, which means we basically need the big > bulk of resource.c. This effort might have benefits for reviewing the code, > but we will end up with pretty much similar if not the same as what we > already have. This is something people always say. I'm betting you won't end up there at all; it's not just review, it's an incremental development model, so that when things go wrong we can pinpoint why and where a lot easier. Just merging this all in one fell swoop is going to just mean a lot of pain in the end.
I understand you aren't resourced for this sort of development on this codebase, but it's going to be an impasse to try and merge this all at once even if it was clean code. > Or is the objection that we have the white boxes in DC.JPG instead of using > DRM objects? We can probably work out something to have the white boxes > derive from DRM objects and extend atomic state with our validate_context > where dce*_resource.c stores the constructed pipelines. I think Daniel explained quite well how things should look in terms of subclassing. > > 5) Why is a midlayer bad? > I'm not going to go into specifics on the DC midlayer, but we abhor > midlayers for a fair few reasons. The main reason I find causes the > most issues is locking. When you have breaks in code flow between > multiple layers, but have layers calling back into previous layers, > it becomes near impossible to track who owns the locking and what the > current locking state is. > > Consider > drma -> dca -> dcb -> drmb > drmc -> dcc -> dcb -> drmb > > We have two code paths that go back into drmb; now maybe drma has a > lock taken, but drmc doesn't, and we've no indication when we hit drmb > of what the context prior to entering the DC layer is. This causes all > kinds of problems. The main requirement is the driver maintains the > execution flow as much as possible. The only callback behaviour should > be from irq or workqueue type situations where you've handed > execution flow to the hardware to do something and it is getting back > to you. The pattern we use to get out of this sort of hole is helper > libraries; we structure code as much as possible as leaf nodes that > don't call back into the parents if we can avoid it (we don't always > succeed). > > Okay. By the way DC does behave like a helper for the most part. There is no > locking in DC.
We work enough with different OSes to know they all have > different synchronization primitives and interrupt handling, and having DC lock > anything is just shooting ourselves in the foot. We do have functions with > lock in their name in DC but those are HW register locks to ensure > that the HW registers update atomically, i.e. have 50 register writes latch in > HW at the next vsync to ensure the HW state changes on a vsync boundary. > > So the above might become > drma-> dca_helper > -> dcb_helper > -> drmb. > > In this case the code flow is controlled by drma; dca/dcb might be > modifying data or setting hw state but when we get to drmb it's easy > to see what data it needs and what locking. > > DAL/DC goes against this in so many ways, and when I look at the code > I'm never sure where to even start pulling the thread to unravel it. > > I don't know where we go against it. In the case where we do call back into DRM for > the MST case we have > > amdgpu_dm_atomic_commit (implements atomic_commit) > dc_commit_targets (commit helper) > dce110_apply_ctx_to_hw (hw_seq) > core_link_enable_stream (part of MST enable sequence) > allocate_mst_payload (helper for above func in same file) > dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM) > drm_dp_mst_allocate_vcpi (DRM) > > As you see even in this case we are only 6 levels deep before we call back into > DRM, and 2 of those functions are in the same file as helper funcs of the bigger > sequence. > > Can you clarify the distinction between what you would call a midlayer vs a > helper. We consulted Alex a lot and we know about this inversion of control > pattern and we are trying our best to do it. Is it the way functions are > named and the file/folder structure? Would it help if we flattened > amdgpu_dm_atomic_commit and dc_commit_targets? Even if we do I would > imagine we want some helpers in commit rather than a giant 1000 line function.
Is > there any concern that we put dc_commit_targets under the /dc folder as we want > other platforms to run the exact same helper? Or is this about the state: > dc_commit_targets is too big? Or that the state is stored in validate_context > rather than drm_atomic_state? Well one area I hit today while looking is tracing the path for a dpcd read or write. An internal one in the dc layer goes core_link_dpcd_read (core_link) dm_helpers_dp_read_dpcd(context, dc_link) search connector list for the appropriate connector drm_dp_dpcd_read Note the connector list searching; this is a case where you have called back into the toplevel driver without the info necessary because core_link and dc_link are too far abstracted from the drm connector. (get_connector_for_link is a bad idea) Then we get back around through the aux stuff and end up at: dc_read_dpcd which passes connector->dc_link->link_index down this looks up the dc_link again in core_dc->links[index] dal_ddc_service_read_dpcd_data(link->ddc) which calls into the i2caux path. This is not helper functions or anything close, this is layering hell. > I don't think it makes sense for DRM to get into how we decide to use our HW > blocks. For example any refactor done in core should not result in us using > a different pipeline to drive the same config. We would like to have control > over how our HW pipeline is constructed. I don't think DRM wants to get involved at that level, but it would be good if we could collapse the mountains of functions and layers so that you can clearly see how a modeset happens all the way down to the hw in a linear fashion. > > How do you plan on dealing with people rewriting or removing code > upstream that is redundant in the kernel, but required for internal > stuff? > > > Honestly I don't know what these are.
Like you and Jerome removed the func ptr > abstraction (I know it was bad; that was one of the components we ported from > Windows) and we need to keep it as function pointers so we can still run our > code on FPGA before we see first silicon? I don't think us nak'ing the > function ptr removal will be a problem for the community. The rest is valued > and we took it with open arms. > > Or is this more like we have code duplication after DRM added some > functionality we can use? I would imagine it's more about moving what we got > working in our code to DRM core if we are upstreamed, and we have no problem > accommodating that as the code moved out to DRM core can be included on > other platforms. We don't have any private ioctls today and we don't plan to > have any outside of using DRM object properties. I've just sent some patches to remove a bunch of dpcd defines, that is just one small example. > I really don't know what those new linux things can be that could cause us > problems. If anything the new things will probably come from us if we are > upstreamed. But until then there will be competing development upstream, and you might want to merge things. > > DP MST: AMD was the first source certified and we worked closely with the > first branch certified. I was a part of that team and we had a very solid > implementation. If we were upstreamed I don't see why you would want to > reinvent the wheel and not try to massage what we have into shape for DRM > core for other drivers to reuse. Definitely, I hate writing MST code, and it would have been good if someone else had gotten to it first. So I think after looking more at it, my major issue is with DC, the core stuff, not the hw touching stuff, but the layering stuff; dc and core infrastructure in a lot of places calls into the DM layer and back into itself. It's a bit of a tangle to pull any one thread of it and try to unravel it.
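One way to flatten the dpcd path Dave traced is to bind the aux channel to the link once, at creation time, so a dpcd read never has to search the connector list to find its way back to drm_dp_dpcd_read(). A toy model of that binding; the types are hypothetical stand-ins, not DC's structs:

```c
#include <stddef.h>

/* Toy illustration: instead of the link layer searching the connector
 * list on every dpcd access, the link keeps a direct pointer to its aux
 * channel, set once when the link is created. A read becomes one direct
 * call with no reverse lookup through the toplevel driver. */

struct aux_channel {
    int reads;                  /* stand-in for a real aux transfer */
};

struct dp_link {
    struct aux_channel *aux;    /* bound at creation, no list search */
};

static void link_init(struct dp_link *l, struct aux_channel *aux)
{
    l->aux = aux;
}

static int link_dpcd_read(struct dp_link *l, unsigned int offset)
{
    (void)offset;
    if (l->aux == NULL)
        return -1;
    l->aux->reads++;            /* would be drm_dp_dpcd_read() in drm */
    return 0;
}
```

The lookup happens exactly once, at init; every later read is a leaf call downward, never a callback into the layer above.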
There also seems to be a fair number of headers of questionable value; I've found the same set of defines (or pretty close ones) in a few headers, conversion functions between different layer definitions, etc. There are redundant header files, unused structs, structs of questionable value or structs that should be merged. Stuff is hidden between dc and core structs, but it isn't always obvious why stuff is in dc_link vs core_link. Ideally we'd lose some of that layering. Also things like loggers and fixed-function calculators, and vector code, probably need to be bumped up a layer or two or made sure to be completely generic, and put outside the DC code; if code is in the amd/display dir it should be display code. I'm going to be happily ignoring most of this until early next year at this point (I might jump in/out a few times) but I think Daniel and Alex have a pretty good handle on where this code should be going to get upstream, and I think we should all be listening to them as much as possible. Dave. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-13 7:30 ` Dave Airlie @ 2016-12-13 9:14 ` Cheng, Tony 0 siblings, 0 replies; 66+ messages in thread From: Cheng, Tony @ 2016-12-13 9:14 UTC (permalink / raw) To: Dave Airlie Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander On 12/13/2016 2:30 AM, Dave Airlie wrote: > (hit send too early) >> We would love to upstream DC for all supported asic! We made enough change >> to make Sea Island work but it's really not validate to the extend we >> validate Polaris on linux and no where close to what we do for 2017 ASICs. >> With DC the display hardware programming, resource optimization, power >> management and interaction with rest of system will be fully validated >> across multiple OSs. Therefore we have high confidence that the quality is >> going to better than what we have upstreammed today. >> >> I don't have a baseline to say if DC is in good enough quality for older >> generation compare to upstream. For example we don't have HW generate >> bandwidth_calc for DCE 8/10 (Sea/Vocanic island family) but our code is >> structured in a way that we assume bandwidth_calc is there. None of us feel >> like go untangle the formulas in windows driver at this point to create our >> own version of bandwidth_calc. It sort of work with HW default values but >> some mode / config is likely to underflows. If community is okay with >> uncertain quality, sure we would love to upstream everything to reduce our >> maintaince overhead. You do get audio with DC on DCE8 though. > If we get any of this upstream, we should get all of the hw supported with it. > > If it regresses we just need someone to debug why. great will do. > >> Maybe let me share what we are doing and see if we can come up with >> something to make DC work for both upstream and our internal need. We are >> sharing code not just on Linux and we will do our best to make our code >> upstream friendly. 
Last year we focussed on having enough code to prove >> that our DAL rewrite works and get more people contributing to it. We rush >> a bit as a result we had a few legacy component we port from Windows driver >> and we know it's bloat that needed to go. >> >> We designed DC so HW can contribute bandwidth_calc magic and psuedo code to >> program the HW blocks. The HW blocks on the bottom of DC.JPG in models our >> HW blocks and the programming sequence are provided by HW engineers. If a >> piece of HW need a bit toggled 7 times during power up I rather have HW >> engineer put that in their psedo code rather than me trying to find that >> sequence in some document. Afterall they did simulate the HW with the >> toggle sequence. I guess these are back-end code Daniel talked about. Can >> we agree that DRM core is not interested in how things are done in that >> layer and we can upstream these as it? >> >> The next is dce_hwseq.c to program the HW blocks in correct sequence. Some >> HW block can be programmed in any sequence, but some requires strict >> sequence to be followed. For example Display CLK and PHY CLK need to be up >> before we enable timing generator. I would like these sequence to remain in >> DC as it's really not DRM's business to know how to program the HW. In a >> way you can consider hwseq as a helper to commit state to HW. >> >> Above hwseq is the dce*_resource.c. It's job is to come up with the HW >> state required to realize given config. For example we would use the exact >> same HW resources with same optimization setting to drive any same given >> config. If 4 x 4k@60 is supported with resource setting A on HW diagnositc >> suite during bring up setting B on Linux then we have a problem. It know >> which HW block work with which block and their capability and limitations. 
>> I hope you are not asking this stuff to move up to core because in reality >> we should probably hide this in some FW, as HW expose the register to config >> them differently that doesn't mean all combination of HW usage is validated. >> To me resource is more of a helper to put together functional pipeline and >> does not make any decision that any OS might be interested in. >> >> These yellow boxes in DC.JPG are really specific to each generation of HW >> and changes frequently. These are things that HW has consider hiding it in >> FW before. Can we agree on those code (under /dc/dce*) can stay? > I think most of these things are fine to be part of the solution we end up at, > but I can't say for certain they won't require interface changes. I think the > most useful code is probably the stuff in the dce subdirectories. okay as long as we can agree on this piece stay I am sure we can make it work. > >> Is this about demonstration how basic functionality work and add more >> features with series of patches to make review eaiser? If so I don't think >> we are staff to do this kind of rewrite. For example it make no sense to >> hooking up bandwidth_calc to calculate HW magic if we don't have mem_input >> to program the memory settings. We need portion of hw_seq to ensure these >> blocks are programming in correct sequence. We will need to feed >> bandwidth_calc it's required inputs, which is basically the whole system >> state tracked in validate_context today, which means we basically need big >> bulk of resource.c. This effort might have benefit in reviewing the code, >> but we will end up with pretty much similar if not the same as what we >> already have. > This is something people always say, I'm betting you won't end up there at all, > it's not just review, it's incremental development model, so that when things > go wrong we can pinpoint why and where a lot easier. Just merging this all in > one fell swoop is going to just mean a lot of pain in the end. 
I understand you > aren't resourced for this sort of development on this codebase, but it's going > to be an impasse to try and merge this all at once even if was clean code. How is it going to work then? Can we merge the hardware programming code (the hw objects under /dc/dce) without anyone calling it? > >> Or is the objection that we have the white boxes in DC.JPG instead of using >> DRM objects? We can probably workout something to have the white boxes >> derive from DRM objects and extend atomic state with our validate_context >> where dce*_resource.c stores the constructed pipelines. > I think Daniel explained quite well how things should look in terms of > subclassing. Okay, we will look into how to do it. This definitely won't happen overnight as we need to get clear on what to do first and look at how other drivers do it. As per Harry's RFC (last still-to-do item), we planned to work on atomic anyway; this expanded the scope of that a bit. > >> 5) Why is a midlayer bad? >> I'm not going to go into specifics on the DC midlayer, but we abhor >> midlayers for a fair few reasons. The main reason I find causes the >> most issues is locking. When you have breaks in code flow between >> multiple layers, but having layers calling back into previous layers >> it becomes near impossible to track who owns the locking and what the >> current locking state is. >> >> Consider >> drma -> dca -> dcb -> drmb >> drmc -> dcc -> dcb -> drmb >> >> We have two codes paths that go back into drmb, now maybe drma has a >> lock taken, but drmc doesn't, but we've no indication when we hit drmb >> of what the context pre entering the DC layer is. This causes all >> kinds of problems. The main requirement is the driver maintains the >> execution flow as much as possible. The only callback behaviour should >> be from an irq or workqueue type situations where you've handed >> execution flow to the hardware to do something and it is getting back >> to you.
The pattern we use to get our of this sort of hole is helper >> libraries, we structure code as much as possible as leaf nodes that >> don't call back into the parents if we can avoid it (we don't always >> succeed). >> >> Okay. by the way DC does behave like a helper for most part. There is no >> locking in DC. We work enough with different OS to know they all have >> different synchronization primatives and interrupt handling and have DC lock >> anything is just shooting ourself in the foot. We do have function with >> lock in their function name in DC but those are HW register lock to ensure >> that the HW register update atomically. ie have 50 register write latch in >> HW at next vsync to ensure HW state change on vsync boundary. >> >> So the above might becomes >> drma-> dca_helper >> -> dcb_helper >> -> drmb. >> >> In this case the code flow is controlled by drma, dca/dcb might be >> modifying data or setting hw state but when we get to drmb it's easy >> to see what data is needs and what locking. >> >> DAL/DC goes against this in so many ways, and when I look at the code >> I'm never sure where to even start pulling the thread to unravel it. >> >> I don't know where we go against it. In the case we do callback to DRM for >> MST case we have >> >> amdgpu_dm_atomic_commit (implement atomic_commit) >> dc_commit_targets (commit helper) >> dce110_apply_ctx_to_hw (hw_seq) >> core_link_enable_stream (part of MST enable sequence) >> allocate_mst_payload (helper for above func in same file) >> dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM) >> drm_dp_mst_allocate_vcpi (DRM) >> >> As you see even in this case we are only 6 level deep before we callback to >> DRM, and 2 of those functions are in same file as helper func of the bigger >> sequence. >> >> Can you clarify the distinction between what you would call a mid layer vs >> helper. 
We consulted Alex a lot, and we know about this inversion-of-control >> pattern and we are trying our best to follow it. Is it the way functions are >> named and the file/folder structure? Would it help if we flattened >> amdgpu_dm_atomic_commit and dc_commit_targets? Even if we do, I would >> imagine we want some helpers in commit rather than a giant 1000-line function. Is >> there any concern with putting dc_commit_targets under the /dc folder, as we want >> other platforms to run the exact same helper? Or is this about >> dc_commit_targets being too big, or the state being stored in validate_context >> rather than drm_atomic_state? > Well one area I hit today while looking, is tracing the path for a dpcd > read or write. > > An internal one in the dc layer goes > > core_link_dpcd_read (core_link) > dm_helpers_dp_read_dpcd(context, dc_link) > search connector list for the appropriate connector > drm_dp_dpcd_read > > Note the connector list searching, this is a case of where you have called > back into the toplevel driver without the info necessary because core_link > and dc_link are too far abstracted from the drm connector. > (get_connector_for_link is a bad idea) > > Then we get back around through the aux stuff and end up at: > dc_read_dpcd which passes connector->dc_link->link_index down > this looks up the dc_link again in core_dc->links[index] > dal_ddc_service_read_dpcd_data(link->ddc) > which calls into the i2caux path. > > This is not helper functions or anything close, this is layering hell. As per Harry's RFC (2nd still-to-do item), we are working on switching fully to the I2C / AUX code provided by DRM; we just haven't gotten there yet. I would think this part will look similar to how the MST part looks once we have cleaned it up. By the way, anything with dal_* in the name is stuff we ported to get us up and running quickly. dal_* has no business in dc and will be removed. >> I don't think it makes sense for DRM to get into how we decide to use our HW >> blocks.
For example, any refactor done in core should not result in us using a >> different pipeline to drive the same config. We would like to have control >> over how our HW pipeline is constructed. > I don't think the DRM wants to get involved at that level, but it would be good > if we could collapse the mountains of functions and layers so that you can > clearly see how a modeset happens all the way down to the hw in a linear > fashion. There are really not that many layers. If you look at the MST example, we hit registers 5 levels down: amdgpu_dm_atomic_commit dc_commit_targets dce110_apply_ctx_to_hw core_link_enable_stream allocate_mst_payload (same as the drm callback example above) dce110_stream_encoder_set_mst_bandwidth REG_SET > >> How do you plan on dealing with people rewriting or removing code >> upstream that is redundant in the kernel, but required for internal >> stuff? >> >> >> Honestly I don't know what these are. Like when you and Jerome removed the func ptr >> abstraction (I know it was bad; that was one of the components we ported from >> windows), but we need to keep it as function pointers so we can still run our >> code on FPGA before we see first silicon? I don't think it will be a problem for >> the community if we nak the function ptr removal. The rest is valued, >> and we took it with open arms. >> >> Or is this more like we have code duplication after DRM adds some >> functionality we can use? I would imagine it's more a matter of moving what we got >> working in our code to the DRM core once we are upstreamed, and we have no problem >> accommodating that, as code moved out to the DRM core can be included on >> other platforms. We don't have any private ioctls today and we don't plan to >> have any outside of using DRM object properties. > I've just sent some patches to remove a bunch of dpcd defines, that is just > one small example. All of them are great for us to merge except patch 1/8, "dc: remove dc hub". As you might have guessed, that function is for an ASIC currently in the lab.
Maybe we should have sanitized it with an #ifdef and not made it visible upstream in the first place. > >> I really don't know what those new linux things can be that could cause us >> problems. If anything, the new things will probably come from us if we are >> upstreamed. > But until then there will be competing development upstream, and you might > want to merge things. Maybe. My gut feeling is that we will either have those new things in demo-able shape before competing development starts, like FreeSync, or it's something everybody cares about and that needs SW ecosystem support, like HDR. In the HDR case we are more than happy to participate up front. > >> DP MST: AMD was the first source certified and we worked closely with the >> first branch certified. I was a part of that team and we had a very solid >> implementation. If we were upstreamed, I don't see why you would want to >> reinvent the wheel and not try to massage what we have into shape for the DRM >> core for other drivers to reuse. > Definitely, I hate writing MST code, and it would have been good if someone else > had gotten to it first. > > So I think after looking more at it, my major issue is with DC, the > core stuff, Let me clarify: you mean the stuff under /dc/core? I think there is a path for us to have those subclass DRM objects. > not the hw > touching stuff, but the layering stuff, dc and core infrastructure in > a lot of places > calls into the DM layer and back into itself. It's a bit of a > tangle to pull any > one thread of it and try to unravel it. I think we only have dpcd/i2c left, which we said we will fix. > > There also seems to be a fair lot of headers of questionable value, I've found > the same set of defines (or pretty close ones) in a few headers, Redundant headers will go; we just need to spend time going through them. Most of them are leftovers from the dal port. > conversion functions > between different layer definitions etc.
There are redundant header > files, unused structs, structs > of questionable value or structs that should be merged. > > Stuff is hidden between dc and core structs, but it isn't always obvious why > stuff is in dc_link vs core_link. Ideally we'd lose some of that layering. Okay, we will probably end up subclassing DRM objects anyway. > > Also things like loggers and fixed function calculators, and vector > code probably need to be bumped > up a layer or two or made sure to be completely generic, and put > outside the DC code, if code is > in amd/display dir it should be display code. The stuff under dc/basic consists of quick ports to get us going, and you probably already noticed we don't use some of it. By the way, is there any problem with using float to do our bandwidth_calc? We can save/restore the FPU context on x86. bandwidth_calc will run a lot faster, and we can make it somewhat readable and get rid of fixpt31_32.c. > > I'm going to be happily ignoring most of this until early next year at > this point (I might jump in/out a few times) > but I think Daniel and Alex have a pretty good handle on where this > code should be going to get upstream, I think we should > all be listening to them as much as possible. > > Dave. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <3219f6f2-080e-0c77-45bb-cf59aa5b5858-5C7GfCeVMHo@public.gmane.org> 2016-12-13 7:30 ` Dave Airlie @ 2016-12-13 14:59 ` Rob Clark 1 sibling, 0 replies; 66+ messages in thread From: Rob Clark @ 2016-12-13 14:59 UTC (permalink / raw) To: Cheng, Tony Cc: Grodzovsky, Andrey, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher, Alexander, Harry Wentland On Mon, Dec 12, 2016 at 11:10 PM, Cheng, Tony <tony.cheng@amd.com> wrote: > We need to treat most of the resources that don't map well as global. One example > is the pixel pll. We have 6 display pipes but only 2 or 3 plls in CI/VI; as a > result we are limited in the number of HDMI or DVI displays we can drive at the same > time. Also the pixel pll can be used to drive DP as well, so there is > another layer of HW specifics, but we can't really contain it in a crtc or > encoder by itself. Doing this resource allocation requires knowledge of the > whole system: knowing which pixel plls are already used, and what we can > support with the remaining plls. > > Another ask is, let's say we are driving 2 displays: we would always want > instance 0 and instance 1 of the scaler, timing generator, etc. to be used. We > want to avoid the possibility that, due to a different user mode commit sequence, we > end up driving the 2 displays with the 0th and 2nd instances of the HW. Not only > is this configuration not really validated in the lab, we will also be less > effective at power gating, as instances 0 and 1 are on the same tile. > Instead of having 2/3 of the processing pipeline silicon power gated we can only > power gate 1/3. And if we power gate wrong, you will have 1 of the 2 > displays not lighting up. Note that as of 4.10, drm/msm/mdp5 is dynamically assigning hwpipes to planes tracked as part of the driver's global atomic state. (And for future hw we will need to dynamically assign layermixers to crtc's).
I'm also using global state for allocating SMP (basically fifo) blocks. And drm/i915 is also using global atomic state for shared resources. Dynamic assignment of hw resources to kms objects is not a problem, and the locking model in atomic allows for this. (I introduced one new global modeset_lock to protect the global state, so only multiple parallel updates which both touch shared state will serialize) BR, -R _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <b64d0072-4909-c680-2f09-adae9f856642-5C7GfCeVMHo@public.gmane.org> 2016-12-13 4:10 ` Cheng, Tony @ 2016-12-13 7:31 ` Daniel Vetter 2016-12-13 10:09 ` Ernst Sjöstrand 2 siblings, 0 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-13 7:31 UTC (permalink / raw) To: Harry Wentland Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher, Alexander, Dave Airlie On Mon, Dec 12, 2016 at 09:33:52PM -0500, Harry Wentland wrote: > On 2016-12-11 03:28 PM, Daniel Vetter wrote: > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: > > > We propose to use the Display Core (DC) driver for display support on > > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to > > > avoid a flag day the plan is to only support uGPU initially and transition > > > to older ASICs gradually. > > > > Bridgeman brought it up a few times that this here was the question - it's > > kinda missing a question mark, hard to figure this out ;-). I'd say for > > My bad for the missing question mark (imprecise phrasing). On the other hand > letting this blow over a bit helped get us on the map a bit more and allows > us to argue the challenges (and benefits) of open source. :) > > > upstream it doesn't really matter, but imo having both atomic and > > non-atomic paths in one driver is one world of hurt and I strongly > > recommend against it, at least if feasible. All drivers that switched > > switched in one go, the only exception was i915 (it took much longer than > > we ever feared, causing lots of pain) and nouveau (which only converted > > nv50+, but pre/post-nv50 have always been two almost completely separate > > worlds anyway). > > > > You mention the two probably most complex DRM drivers didn't switch in a > single go... I imagine amdgpu/DC falls into the same category. 
> > I think one of the problems is making a sudden change with a fully validated > driver without breaking existing use cases and customers. We really > should've started DC development in public and probably would do that if we > had to start anew. > > > > The DC component has received extensive testing within AMD for DCE8, 10, and > > > 11 GPUs and is being prepared for uGPU. Support should be better than > > > amdgpu's current display support. > > > > > > * All of our QA effort is focused on DC > > > * All of our CQE effort is focused on DC > > > * All of our OEM preloads and custom engagements use DC > > > * DC behavior mirrors what we do for other OSes > > > > > > The new asic utilizes a completely re-designed atom interface, so we cannot > > > easily leverage much of the existing atom-based code. > > > > > > We've introduced DC to the community earlier in 2016 and received a fair > > > amount of feedback. Some of what we've addressed so far are: > > > > > > * Self-contain ASIC specific code. We did a bunch of work to pull > > > common sequences into dc/dce and leave ASIC specific code in > > > separate folders. > > > * Started to expose AUX and I2C through generic kernel/drm > > > functionality and are mostly using that. Some of that code is still > > > needlessly convoluted. This cleanup is in progress. > > > * Integrated Dave and Jerome’s work on removing abstraction in bios > > > parser. > > > * Retire adapter service and asic capability > > > * Remove some abstraction in GPIO > > > > > > Since a lot of our code is shared with pre- and post-silicon validation > > > suites changes need to be done gradually to prevent breakages due to a major > > > flag day. This, coupled with adding support for new asics and lots of new > > > feature introductions means progress has not been as quick as we would have > > > liked. We have made a lot of progress none the less. 
> > > > > > The remaining concerns that were brought up during the last review that we > > > are working on addressing: > > > > > > * Continue to cleanup and reduce the abstractions in DC where it > > > makes sense. > > > * Removing duplicate code in I2C and AUX as we transition to using the > > > DRM core interfaces. We can't fully transition until we've helped > > > fill in the gaps in the drm core that we need for certain features. > > > * Making sure Atomic API support is correct. Some of the semantics of > > > the Atomic API were not particularly clear when we started this, > > > however, that is improving a lot as the core drm documentation > > > improves. Getting this code upstream and in the hands of more > > > atomic users will further help us identify and rectify any gaps we > > > have. > > > > Ok so I guess Dave is typing some more general comments about > > demidlayering, let me type some guidelines about atomic. Hopefully this > > all materializes itself a bit better into improved upstream docs, but meh. > > > > Excellent writeup. Let us know when/if you want our review for upstream > docs. > > We'll have to really take some time to go over our atomic implementation. A > couple small comments below with regard to DC. > > > Step 0: Prep > > > > So atomic is transactional, but it's not validate + rollback or commit, > > but duplicate state, validate and then either throw away or commit. > > There's a few big reasons for this: a) partial atomic updates - if you > > duplicate it's much easier to check that you have all the right locks b) > > kfree() is much easier to check for correctness than a rollback code and > > c) atomic_check functions are much easier to audit for invalid changes to > > persistent state. > > > > There isn't really any rollback. I believe even in our other drivers we've > abandoned the rollback approach years ago because it doesn't really work on > modern HW. 
Any rollback cases you might find in DC should really only be for > catastrophic errors (read: something went horribly wrong... read: > congratulations, you just found a bug). I meant rollback in software. Rollback in hw isn't a good idea, and atomi's point is to avoid these. > > Trouble is that this seems a bit unusual compared to all other approaches, > > and ime (from the drawn-out i915 conversion) you really don't want to mix > > things up. Ofc for private state you can roll back (e.g. vc4 does that for > > the drm_mm allocator thing for scanout slots or whatever it is), but it's > > trivial easy to accidentally check the wrong state or mix them up or > > something else bad. > > > > Long story short, I think step 0 for DC is to split state from objects, > > i.e. for each dc_surface/foo/bar you need a dc_surface/foo/bar_state. And > > all the back-end functions need to take both the object and the state > > explicitly. > > > > This is a bit a pain to do, but should be pretty much just mechanical. And > > imo not all of it needs to happen before DC lands in upstream, but see > > above imo that half-converted state is postively horrible. This should > > also not harm cross-os reuse at all, you can still store things together > > on os where that makes sense. > > > > Guidelines for amdgpu atomic structures > > > > drm atomic stores everything in state structs on plane/connector/crtc. > > This includes any property extensions or anything else really, the entire > > userspace abi is built on top of this. Non-trivial drivers are supposed to > > subclass these to store their own stuff, so e.g. > > > > amdgpu_plane_state { > > struct drm_plane_state base; > > > > /* amdgpu glue state and stuff that's linux-specific, e.g. > > * property values and similar things. Note that there's strong > > * push towards standardizing properties and stroing them in the > > * drm_*_state structs. 
*/ > > > > struct dc_surface_state surface_state; > > > > /* other dc states that fit to a plane */ > > }; > > > > Yes not everything will fit 1:1 in one of these, but to get started I > > strongly recommend to make them fit (maybe with reduced feature sets to > > start out). Stuff that is shared between e.g. planes, but always on the > > same crtc can be put into amdgpu_crtc_state, e.g. if you have scalers that > > are assignable to a plane. > > > > Of course atomic also supports truly global resources, for that you need > > to subclass drm_atomic_state. Currently msm and i915 do that, and probably > > best to read those structures as examples until I've typed the docs. But I > > expect that especially for planes a few dc_*_state structs will stay in > > amdgpu_*_state. > > > > Guidelines for atomic_check > > > > Please use the helpers as much as makes sense, and put at least the basic > > steps that map from drm_*_state into the respective dc_*_state functional > > block into the helper callbacks for that object. I think basic validation > > of individual bits (as much as possible, e.g. if you just don't support > > e.g. scaling or rotation with certain pixel formats) should happen in > > there too. That way when we e.g. want to check how drivers currently > > validate a given set of properties to be able to more strictly define the > > semantics, that code is easy to find. > > > > Also I expect that this won't result in code duplication with other OS, > > you need code to map from drm to dc anyway, might as well check&reject the > > stuff that dc can't even represent right there. > > > > The other reason is that the helpers are good guidelines for some of the > > semantics, e.g. it's mandatory that drm_crtc_needs_modeset gives the right > > answer after atomic_check. If it doesn't, then your driver doesn't > > follow atomic. If you completely roll your own this becomes much harder to > > assure. > > > > Interesting point. Not sure if we've checked that.
Is there some sort of > automated test for this that we can use to check? We're typing them up in igt - generic testcase is pretty simple: Semi-randomly change stuff, ask with TEST_ONLY whether it would modeset, then commit and watch the vblank counter: If there's a gap, there was a modeset. If it doesn't match the TEST_ONLY answer, complain. > > Of course extend it all however you want, e.g. by adding all the global > > optimization and resource assignment stuff after initial per-object > > checking has been done using the helper infrastructure. > > > > Guidelines for atomic_commit > > > > Use the new nonblocking helpers. Everyone who didn't got it wrong. Also, > > I believe we're not using those and didn't start with those which might > explain (along with lack of discussion on dri-devel) why atomic currently > looks the way it does in DC. This is definitely one of the bigger issues > we'd want to clean up and where you wouldn't find much pushback, other than > us trying to find time to do it. Yeah, back when DC was developed the recommendation was still "roll your own". > > your atomic_commit should pretty much match the helper one, except for a > > custom swap_state to handle all your globally shared special dc_*_state > > objects. Everything hw specific should be in atomic_commit_tail. > > > > Wrt the hw commit itself, for the modeset step just roll your own. That's > > the entire point of atomic, and atm both i915 and nouveau exploit this > > fully. Besides a bit of glue there shouldn't be much need for > > linux-specific code here - what you need is something to fish the right > > dc_*_state objects and give it your main sequencer functions. What you > > should make sure though is that you only ever do a modeset when that was > > signalled, i.e. please use drm_crtc_needs_modeset to control that part. > > Feel free to wrap up in a dc_*_needs_modeset for better abstraction if > > that's needed.
> > > > I do strongly suggest however that you implement the plane commit using > > the helpers. There's really only a few ways to implement this in the hw, > > and it should work everywhere. > > > > Misc guidelines > > > > Use the suspend/resume helpers. If your atomic can't do that, it's not > > terribly good. Also, if DC can't make those fit, it's probably still too > > much midlayer and its own world than helper library. > > > > Do they handle swapping DP displays while the system is asleep? If not we'll > probably need to add that. The other case where we have some special > handling has to do with headless (sleep or resume, don't remember). Atm we do a dumb restore, and since the link will fail to train, then just light it up with a default mode. You kinda have to do that, because disabling a pipe behind userspace's back is not a nice thing to do. Then we also send out the usual uevent (or should, there's some broken versions out there) so that userspace sees the reconfiguration and can adjust the desired config. Same with MST, although that was only fixed recently: Before we had the connector refcounting we just force-unplugged everything and caused some surprises with userspace. -intel learned to cope, but with the proliferation of kms native compositors I don't think assuming that your userspace will always cope if the kernel yanks screens randomly is a good one. This approach also will tie into the new link_status flag, to indicate that something is wrong with an output. > > Use all the legacy helpers, again your atomic should be able to pull it > > off. One exception is async plane flips (both primary and cursors), that's > > atm still unsolved. Probably best to keep the old code around for just > > that case (but redirect to the compat helpers for everything), see e.g. > > how vc4 implements cursors. > > > > Good old flip. There probably isn't much shareable code between OSes here. > It seems like every OS rolls there own thing, regarding flips. 
We still seem > to be revisiting flips regularly, especially with FreeSync (adaptive sync) > in the mix now. Good to know that this is still a bit of an open topic. Yeah, so freesync and atomic is also not solved. But since that's just a variable vblank, but the flip itself is still synced to it, it shouldn't be a problem to wire this up for atomic. > > > Most imporant of all > > > > Ask questions on #dri-devel. amdgpu atomic is the only nontrivial atomic > > driver for which I don't remember a single discussion about some detail, > > at least not with any of the DAL folks. Michel&Alex asked some questions > > sometimes, but that indirection is bonghits and the defeats the point of > > upstream: Direct cross-vendor collaboration to get shit done. Please make > > it happen. > > > > Please keep asking us to get on dri-devel with questions. I need to get into > the habit again of leaving the IRC channel open. I think most of us are > still a bit scared of it or don't know how to deal with some of the > information overload (IRC and mailing list). It's some of my job to change > that all the while I'm learning this myself. :) Also just discuss design issues that interact with the core/helpers there, even amongst yourself. Since sometimes not understand it is a problem with a bug in the core ;-) > Thanks for all your effort trying to get people involved. > > > Oh and I pretty much assume Harry&Tony are volunteered to review atomic > > docs ;-) > > > > Sure. Thanks, Daniel > > Cheers, > Harry > > > Cheers, Daniel > > > > > > > > > > > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup > > > work on DC is public. We're currently transitioning to a public patch > > > review. You can follow our progress on the amd-gfx mailing list. We value > > > community feedback on our work. > > > > > > As an appendix I've included a brief overview of the how the code currently > > > works to make understanding and reviewing the code easier. 
> > > > > > Prior discussions on DC: > > > > > > * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html > > > * > > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html > > > > > > Current version of DC: > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > Once Alex pulls in the latest patches: > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > Best Regards, > > > Harry > > > > > > > > > ************************************************ > > > *** Appendix: A Day in the Life of a Modeset *** > > > ************************************************ > > > > > > Below is a high-level overview of a modeset with dc. Some of this might be a > > > little out-of-date since it's based on my XDC presentation but it should be > > > more-or-less the same. > > > > > > amdgpu_dm_atomic_commit() > > > { > > > /* setup atomic state */ > > > drm_atomic_helper_prepare_planes(dev, state); > > > drm_atomic_helper_swap_state(dev, state); > > > drm_atomic_helper_update_legacy_modeset_state(dev, state); > > > > > > /* create or remove targets */ > > > > > > /******************************************************************** > > > * *** Call into DC to commit targets with list of all known targets > > > ********************************************************************/ > > > /* DC is optimized not to do anything if 'targets' didn't change. 
*/ > > > dc_commit_targets(dm->dc, commit_targets, commit_targets_count) > > > { > > > /****************************************************************** > > > * *** Build context (function also used for validation) > > > ******************************************************************/ > > > result = core_dc->res_pool->funcs->validate_with_context( > > > core_dc,set,target_count,context); > > > > > > /****************************************************************** > > > * *** Apply safe power state > > > ******************************************************************/ > > > pplib_apply_safe_state(core_dc); > > > > > > /**************************************************************** > > > * *** Apply the context to HW (program HW) > > > ****************************************************************/ > > > result = core_dc->hwss.apply_ctx_to_hw(core_dc,context) > > > { > > > /* reset pipes that need reprogramming */ > > > /* disable pipe power gating */ > > > /* set safe watermarks */ > > > > > > /* for all pipes with an attached stream */ > > > /************************************************************ > > > * *** Programming all per-pipe contexts > > > ************************************************************/ > > > status = apply_single_controller_ctx_to_hw(...) 
> > > { > > > pipe_ctx->tg->funcs->set_blank(...); > > > pipe_ctx->clock_source->funcs->program_pix_clk(...); > > > pipe_ctx->tg->funcs->program_timing(...); > > > pipe_ctx->mi->funcs->allocate_mem_input(...); > > > pipe_ctx->tg->funcs->enable_crtc(...); > > > bios_parser_crtc_source_select(...); > > > > > > pipe_ctx->opp->funcs->opp_set_dyn_expansion(...); > > > pipe_ctx->opp->funcs->opp_program_fmt(...); > > > > > > stream->sink->link->link_enc->funcs->setup(...); > > > pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...); > > > pipe_ctx->tg->funcs->set_blank_color(...); > > > > > > core_link_enable_stream(pipe_ctx); > > > unblank_stream(pipe_ctx, > > > > > > program_scaler(dc, pipe_ctx); > > > } > > > /* program audio for all pipes */ > > > /* update watermarks */ > > > } > > > > > > program_timing_sync(core_dc, context); > > > /* for all targets */ > > > target_enable_memory_requests(...); > > > > > > /* Update ASIC power states */ > > > pplib_apply_display_requirements(...); > > > > > > /* update surface or page flip */ > > > } > > > } > > > > > > > > > _______________________________________________ > > > dri-devel mailing list > > > dri-devel@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <b64d0072-4909-c680-2f09-adae9f856642-5C7GfCeVMHo@public.gmane.org> 2016-12-13 4:10 ` Cheng, Tony 2016-12-13 7:31 ` Daniel Vetter @ 2016-12-13 10:09 ` Ernst Sjöstrand 2 siblings, 0 replies; 66+ messages in thread From: Ernst Sjöstrand @ 2016-12-13 10:09 UTC (permalink / raw) To: Harry Wentland Cc: Grodzovsky, Andrey, Cheng, Tony, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx mailing list, Daniel Vetter, Deucher, Alexander, Dave Airlie [-- Attachment #1.1: Type: text/plain, Size: 563 bytes --] 2016-12-13 3:33 GMT+01:00 Harry Wentland <harry.wentland-5C7GfCeVMHo@public.gmane.org>: Please keep asking us to get on dri-devel with questions. I need to get > into the habit again of leaving the IRC channel open. I think most of us > are still a bit scared of it or don't know how to deal with some of the > information overload (IRC and mailing list). It's some of my job to change > that all the while I'm learning this myself. :) > https://www.irccloud.com/ is pretty nice if you're not the keep-irssi-running-in-screen-on-a-server type. Regards //Ernst [-- Attachment #1.2: Type: text/html, Size: 1023 bytes --] [-- Attachment #2: Type: text/plain, Size: 154 bytes --] _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <55d5e664-25f7-70e0-f2f5-9c9daf3efdf6-5C7GfCeVMHo@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <55d5e664-25f7-70e0-f2f5-9c9daf3efdf6-5C7GfCeVMHo@public.gmane.org> @ 2016-12-12 2:57 ` Dave Airlie 2016-12-12 7:09 ` Daniel Vetter ` (2 more replies) 0 siblings, 3 replies; 66+ messages in thread From: Dave Airlie @ 2016-12-12 2:57 UTC (permalink / raw) To: Harry Wentland Cc: Grodzovsky, Andrey, Cyr, Aric, Bridgman, John, Lazare, Jordan, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote: > We propose to use the Display Core (DC) driver for display support on > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to > avoid a flag day the plan is to only support uGPU initially and transition > to older ASICs gradually. [FAQ: from past few days] 1) Hey you replied to Daniel, you never addressed the points of the RFC! I've read it being said that I hadn't addressed the RFC, and you know I've realised I actually had, because the RFC is great but it presupposes the codebase as designed can get upstream eventually, and I don't think it can. The code is too littered with midlayering and other problems, that actually addressing the individual points of the RFC would be missing the main point I'm trying to make. This code needs rewriting, not cleaning, not polishing, it needs to be split into its constituent parts, and reintegrated in a form more Linux process friendly. I feel that if I reply to the individual points Harry has raised in this RFC, that it means the code would then be suitable for merging, which it still won't, and I don't want people wasting another 6 months. If DC was ready for the next-gen GPU it would be ready for the current GPU, it's not the specific ASIC code that is the problem, it's the huge midlayer sitting in the middle. 2) We really need to share all of this code between OSes, why does Linux not want it? 
Sharing code is a laudable goal and I appreciate the resourcing constraints that led us to the point at which we find ourselves, but the way forward involves finding resources to upstream this code: dedicated people (even one person) who can spend time on a day-by-day basis talking to people in the open and working upstream, improving other pieces of the drm as they go, reading and reviewing atomic patches, and incrementally building the DC experience on top of the Linux kernel infrastructure. The corresponding changes in the DC codebase would then happen internally, to match how the kernel code ends up looking. Lots of this code overlaps with stuff the drm already does; lots of it is stuff the drm should be doing, so patches to the drm should be sent instead.

3) Then how do we upstream it?

Resources need to start concentrating on splitting this thing up and using portions of it in the upstream kernel. We don't land fully formed code in the kernel if we can avoid it, because you can't review the ideas and structure as easily as when someone builds up code in chunks and actually develops in the Linux kernel. This has always produced better, more maintainable code. Maybe the result will end up improving the AMD codebase as well.

4) Why can't we put this in staging?

People have also mentioned staging. Daniel has called it a dead end; I'd have considered staging for this codebase, and I still might. However, staging has rules, and the main one is that code in staging needs a TODO list and agreed criteria for exiting staging. I don't think we'd be able to get an agreement on what the TODO list should contain, or on how we'd ever get all the things on it done. If this code ended up in staging, it would most likely require someone dedicated to recreating it in the mainline driver in an incremental fashion, and I don't see that resource being available.

5) Why is a midlayer bad?
I'm not going to go into specifics on the DC midlayer, but we abhor midlayers for a fair few reasons. The one I find causes the most issues is locking. When you have breaks in code flow between multiple layers, with layers calling back into previous layers, it becomes near impossible to track who owns the locking and what the current locking state is.

Consider:

    drma -> dca -> dcb -> drmb
    drmc -> dcc -> dcb -> drmb

We have two code paths that go back into drmb. Maybe drma has a lock taken but drmc doesn't, and we have no indication, when we hit drmb, of what the context was before entering the DC layer. This causes all kinds of problems.

The main requirement is that the driver maintains the execution flow as much as possible. The only callback behaviour should be from irq or workqueue type situations, where you've handed execution flow to the hardware to do something and it is getting back to you.

The pattern we use to get out of this sort of hole is helper libraries: we structure code as much as possible as leaf nodes that don't call back into their parents if we can avoid it (we don't always succeed). So the above might become:

    drma -> dca_helper -> dcb_helper -> drmb

In this case the code flow is controlled by drma; dca/dcb might be modifying data or setting hw state, but when we get to drmb it's easy to see what data it needs and what locking is required.

DAL/DC goes against this in so many ways, and when I look at the code I'm never sure where to even start pulling the thread to unravel it.

Some questions I have for AMD engineers that I'd also want to see addressed before any consideration of merging would happen:

How do you plan on dealing with people rewriting or removing code upstream that is redundant in the kernel, but required for internal stuff?

How are you going to deal with new Linux things that overlap incompatibly with your internally developed stuff?
If the code is upstream, will it be tested in the kernel by some QA group, or will there be some CI infrastructure used to maintain it and to watch for Linux code that breaks assumptions in the DC code?

Can you show me you understand that upstream code is no longer 100% in your control, and that things can happen to it that you might not expect and need to deal with?

Dave.

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12 2:57 ` Dave Airlie
@ 2016-12-12 7:09 ` Daniel Vetter
  2016-12-13 2:52 ` Cheng, Tony
  2 siblings, 0 replies; 66+ messages in thread
From: Daniel Vetter @ 2016-12-12 7:09 UTC (permalink / raw)
To: Dave Airlie
Cc: Grodzovsky, Andrey, dri-devel, amd-gfx mailing list, Deucher, Alexander, Cheng, Tony

On Mon, Dec 12, 2016 at 12:57:40PM +1000, Dave Airlie wrote:
> 4) Why can't we put this in staging?
> People have also mentioned staging, Daniel has called it a dead end,
> I'd have considered staging for this code base, and I still might.
> However staging has rules, and the main one is code in staging needs a
> TODO list, and agreed criteria for exiting staging, I don't think we'd
> be able to get an agreement on what the TODO list should contain and
> how we'd ever get all things on it done. If this code ended up in
> staging, it would most likely require someone dedicated to recreating
> it in the mainline driver in an incremental fashion, and I don't see
> that resource being available.

So it's not just that I think the staging experience for drivers isn't good (e.g. imx, gma500); there's also the trouble that it's a separate tree, and the coordination becomes a pain. That got very ugly around all the sync_file stuff imo, and next time we ever do that we should just put it into drm first and clean up second.

We could do staging like with nouveau, but that's imo not really any different from just merging if we only slap a Kconfig dependency on the entire pile, so I just don't see the benefit. I think staging is good for checkpatch cleanup, but we already agreed that we're ok with ugly code if it's the stuff debugged by hw engineers. And for anything else, like real refactoring of big pieces of code, I just don't see how staging makes sense.
Maybe if it's a completely new subsystem, but the point here is that we want DC to integrate tighter with drm and be able to share code.

-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [RFC] Using DC in amdgpu for upcoming GPU
@ 2016-12-12 3:21 ` Bridgman, John
  2016-12-12 3:23 ` Bridgman, John
  2016-12-13 1:49 ` Harry Wentland
  1 sibling, 1 reply; 66+ messages in thread
From: Bridgman, John @ 2016-12-12 3:21 UTC (permalink / raw)
To: Dave Airlie, Wentland, Harry
Cc: Grodzovsky, Andrey, Cyr, Aric, Lazare, Jordan, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony

Thanks Dave. Apologies in advance for top posting, but I'm stuck on a mail client that makes a big mess when I try...

> If DC was ready for the next-gen GPU it would be ready for the current
> GPU, it's not the specific ASIC code that is the problem, it's the
> huge midlayer sitting in the middle.

We realize that (a) we are getting into the high-risk-of-breakage part of the rework, and (b) no matter how much we change the code structure there's a good chance that a month after it goes upstream one of us is going to find that more structural changes are required.

I was kinda thinking that if we are doing high-risk activities (risk of subtle breakage rather than obvious regression, and/or risk of making structural changes that turn out to be a bad idea even though we all thought they were correct last week) there's an argument for doing it in code which only supports cards that people can't buy yet.
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12 3:21 ` Bridgman, John
@ 2016-12-12 3:23 ` Bridgman, John
  0 siblings, 1 reply; 66+ messages in thread
From: Bridgman, John @ 2016-12-12 3:23 UTC (permalink / raw)
To: Dave Airlie, Wentland, Harry
Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony

A couple of typo fixes to my previous mail: in the top-posting apology, "when I try..." should read "when I try anything else...", and "code which only supports cards" should read "code which is only used for cards".
* Re: [RFC] Using DC in amdgpu for upcoming GPU
@ 2016-12-12 3:43 ` Bridgman, John
  2016-12-12 4:05 ` Dave Airlie
  0 siblings, 1 reply; 66+ messages in thread
From: Bridgman, John @ 2016-12-12 3:43 UTC (permalink / raw)
To: Dave Airlie, Wentland, Harry
Cc: Grodzovsky, Andrey, Cyr, Aric, Lazare, Jordan, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony

v3 with typo fixes and additional comments/questions...

Thanks Dave. Apologies in advance for top posting, but I'm stuck on a mail client that makes a big mess when I try anything else...

> This code needs rewriting, not cleaning, not polishing, it needs to be
> split into its constituent parts, and reintegrated in a form more
> Linux process friendly.

Can we say "restructuring", just for consistency with Daniel's message (the HW-dependent bits don't need to be rewritten, but the way they are used/called needs to change)?

> I feel that if I reply to the individual points Harry has raised in
> this RFC, that it means the code would then be suitable for merging,
> which it still won't, and I don't want people wasting another 6
> months.

That's fair. There was an implicit "when it's suitable" assumption in the RFC, but we'll make that explicit in the future.

> If DC was ready for the next-gen GPU it would be ready for the current
> GPU, it's not the specific ASIC code that is the problem, it's the
> huge midlayer sitting in the middle.
We realize that (a) we are getting into the high-risk-of-breakage part of the rework, and (b) no matter how much we change the code structure there's a good chance that a month after it goes upstream one of us is going to find that more structural changes are required. I was kinda thinking that if we are doing high-risk activities (risk of subtle breakage rather than obvious regression, and/or risk of making structural changes that turn out to be a bad idea even though we all thought they were correct last week) there's an argument for doing it in code which is only used for cards that people can't buy yet.
* Re: [RFC] Using DC in amdgpu for upcoming GPU
  2016-12-12 3:43 ` Bridgman, John
@ 2016-12-12 4:05 ` Dave Airlie
  0 siblings, 0 replies; 66+ messages in thread
From: Dave Airlie @ 2016-12-12 4:05 UTC (permalink / raw)
To: Bridgman, John
Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony

>> This code needs rewriting, not cleaning, not polishing, it needs to be
>> split into its constituent parts, and reintegrated in a form more
>> Linux process friendly.
>
> Can we say "restructuring" just for consistency with Daniel's message (the
> HW-dependent bits don't need to be rewritten but the way they are
> used/called needs to change)?

Yes, I think a lot of the code could be reused with little change; it's just all the pieces tying it together that need restructuring.

Dave.
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <CAPM=9tx+j9-3fZNY=peLjdsVqyLS6i3V-sV3XrnYsK2YuhWRBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2016-12-12 3:21 ` Bridgman, John @ 2016-12-13 1:49 ` Harry Wentland [not found] ` <634f5374-027a-6ec9-41a5-64351c4f7eac-5C7GfCeVMHo@public.gmane.org> 1 sibling, 1 reply; 66+ messages in thread From: Harry Wentland @ 2016-12-13 1:49 UTC (permalink / raw) To: Dave Airlie Cc: Grodzovsky, Andrey, Cyr, Aric, Bridgman, John, Lazare, Jordan, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony Hi Dave, Apologies for waking you up with the RFC on a Friday morning. I'll try to time big stuff better next time. A couple of thoughts below after having some discussions internally. I think Tony might add to some of them or provide his own. On 2016-12-11 09:57 PM, Dave Airlie wrote: > On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote: >> We propose to use the Display Core (DC) driver for display support on >> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to >> avoid a flag day the plan is to only support uGPU initially and transition >> to older ASICs gradually. > > [FAQ: from past few days] > > 1) Hey you replied to Daniel, you never addressed the points of the RFC! > I've read it being said that I hadn't addressed the RFC, and you know > I've realised I actually had, because the RFC is great but it > presupposes the codebase as designed can get upstream eventually, and > I don't think it can. The code is too littered with midlayering and > other problems, that actually addressing the individual points of the > RFC would be missing the main point I'm trying to make. > > This code needs rewriting, not cleaning, not polishing, it needs to be > split into its constituent parts, and reintegrated in a form more > Linux process friendly. 
> > I feel that if I reply to the individual points Harry has raised in > this RFC, that it means the code would then be suitable for merging, > which it still won't, and I don't want people wasting another 6 > months. > > If DC was ready for the next-gen GPU it would be ready for the current > GPU, it's not the specific ASIC code that is the problem, it's the > huge midlayer sitting in the middle. > > 2) We really need to share all of this code between OSes, why does > Linux not want it? > > Sharing code is a laudable goal and I appreciate the resourcing > constraints that led us to the point at which we find ourselves, but > the way forward involves finding resources to upstream this code, > dedicated people (even one person) who can spend time on a day by day > basis talking to people in the open and working upstream, improving > other pieces of the drm as they go, reading atomic patches and > reviewing them, and can incrementally build the DC experience on top > of the Linux kernel infrastructure. Then having the corresponding > changes in the DC codebase happen internally to correspond to how the > kernel code ends up looking. Lots of this code overlaps with stuff the > drm already does, lots of is stuff the drm should be doing, so patches > to the drm should be sent instead. > Personally I'm with you on this and hope to get us there. I'm learning... we're learning. I agree that changes on atomic, removing abstractions, etc. should happen on dri-devel. When it comes to brand-new technologies (MST, Freesync), though, we're often the first which means that we're spending a considerable amount of time to get things right, working with HW teams, receiver vendors and other partners internal and external to AMD. By the time we do get it right it's time to hit the market. This gives us fairly little leeway to work with the community on patches that won't land in distros for another half a year. 
We're definitely hoping to improve some of this but it's not easy and in some cases impossible ahead of time (though definitely possible after initial release). > 3) Then how do we upstream it? > Resource(s) need(s) to start concentrating on splitting this thing up > and using portions of it in the upstream kernel. We don't land fully > formed code in the kernel if we can avoid it. Because you can't review > the ideas and structure as easily as when someone builds up code in > chunks and actually develops in the Linux kernel. This has always > produced better, more maintainable code. Maybe the result will end up > improving the AMD codebase as well. > > 4) Why can't we put this in staging? > People have also mentioned staging, Daniel has called it a dead end, > I'd have considered staging for this code base, and I still might. > However staging has rules, and the main one is code in staging needs a > TODO list, and agreed criteria for exiting staging. I don't think we'd > be able to get an agreement on what the TODO list should contain and > how we'd ever get all things on it done. If this code ended up in > staging, it would most likely require someone dedicated to recreating > it in the mainline driver in an incremental fashion, and I don't see > that resource being available. > I don't think we really want staging. If it helps us get into DRM, sure, but if it's more of a pain, as suggested, then probably no. > 5) Why is a midlayer bad? > I'm not going to go into specifics on the DC midlayer, but we abhor > midlayers for a fair few reasons. The main reason I find causes the > most issues is locking. When you have breaks in code flow between > multiple layers, with layers calling back into previous layers, > it becomes near impossible to track who owns the locking and what the > current locking state is. > There's a conscious design decision to have absolutely no locking in DC. This is one of the reasons.
Locking is really OS dependent behavior which has no place in DC. > Consider > drma -> dca -> dcb -> drmb > drmc -> dcc -> dcb -> drmb > > We have two code paths that go back into drmb; now maybe drma has a > lock taken, but drmc doesn't, and we've no indication when we hit drmb > of what the context before entering the DC layer is. This causes all > kinds of problems. The main requirement is that the driver maintains the > execution flow as much as possible. The only callback behaviour should > be from irq or workqueue type situations where you've handed > execution flow to the hardware to do something and it is getting back > to you. The pattern we use to get out of this sort of hole is helper > libraries, we structure code as much as possible as leaf nodes that > don't call back into the parents if we can avoid it (we don't always > succeed). > Is that the reason for using ww_mutex in atomic? > So the above might become > drma-> dca_helper > -> dcb_helper > -> drmb. > > In this case the code flow is controlled by drma, dca/dcb might be > modifying data or setting hw state but when we get to drmb it's easy > to see what data it needs and what locking. > This actually looks pretty close to drm_atomic_commit -> amdgpu_dm_atomic_commit -> dc_commit_targets -> dce110_apply_ctx_to_hw -> apply_single_controller_ctx_to_hw -> core_link_enable_stream -> allocate_mst_payload -> dm_helpers_dp_mst_write_payload_allocation_table -> drm_dp_update_payload_part1, though the latter is a bit more complex. > DAL/DC goes against this in so many ways, and when I look at the code > I'm never sure where to even start pulling the thread to unravel it. > There's a lot of code there but that doesn't mean it's needlessly complex. We're definitely open for suggestions on how to simplify this, ideally without breaking existing functionality. > Some questions I have for AMD engineers that I'd also want to see > addressed before any consideration of merging would happen!
> > How do you plan on dealing with people rewriting or removing code > upstream that is redundant in the kernel, but required for internal > stuff? There's already a bunch of stuff in our internal trees that never makes it into open-source trees, for various reasons. We guard those with an #ifdef and strip them when preparing code for open source. It shouldn't be a big deal to deal with code removed upstream in similar ways. Rewritten code would have to be looked at on a case-by-case basis. DC code is fully validated in many different configurations and is used for ASIC bringup when we can sit next to HW guys to work out complex issues. Modifying the code in a way that can't be shared would mean that all this validation is lost. Some of the bugs we're talking about are non-trivial and will show up only if HW is programmed in a certain way (e.g. Linux code leaves out some power-saving feature, causing HW to hang in weird scenarios). > How are you going to deal with new Linux things that overlap > incompatibly with your internally developed stuff? Do you have examples? If we're talking about stuff like MST, atomic, FreeSync, HDR... we're generally the first to the game and would love to be working with the community to push those out. > If the code is upstream will it be tested in the kernel by some QA > group, or will there be some CI infrastructure used to maintain and to > watch for Linux code that breaks assumptions in the DC code? I think Alex is working on getting our internal tree onto a rolling tip of drm-next (or nearly there). Once we have this we'll switch our existing builds and manual (not yet automated) testing onto it. We're currently building daily and with each DC commit, and doing testing of a basic feature matrix at least every second day. > Can you show me you understand that upstream code is no longer 100% in > your control and things can happen to it that you might not expect and > you need to deal with it? > I think this is the big question.
I would love to let other AMDers chime in on this. Harry > Dave. >
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <634f5374-027a-6ec9-41a5-64351c4f7eac-5C7GfCeVMHo@public.gmane.org> @ 2016-12-13 12:22 ` Daniel Stone 2016-12-13 12:59 ` Daniel Vetter [not found] ` <CAPj87rNrwsfAR75138WDQPbti_BmS_D-NxESZ075obcjO3T04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 2 replies; 66+ messages in thread From: Daniel Stone @ 2016-12-13 12:22 UTC (permalink / raw) To: Harry Wentland Cc: Grodzovsky, Andrey, Dave Airlie, dri-devel, amd-gfx mailing list, Deucher, Alexander, Cheng, Tony Hi Harry, I've been loathe to jump in here, not least because both cop roles seem to be taken, but ... On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote: > On 2016-12-11 09:57 PM, Dave Airlie wrote: >> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote: >> Sharing code is a laudable goal and I appreciate the resourcing >> constraints that led us to the point at which we find ourselves, but >> the way forward involves finding resources to upstream this code, >> dedicated people (even one person) who can spend time on a day by day >> basis talking to people in the open and working upstream, improving >> other pieces of the drm as they go, reading atomic patches and >> reviewing them, and can incrementally build the DC experience on top >> of the Linux kernel infrastructure. Then having the corresponding >> changes in the DC codebase happen internally to correspond to how the >> kernel code ends up looking. Lots of this code overlaps with stuff the >> drm already does, lots of is stuff the drm should be doing, so patches >> to the drm should be sent instead. > > Personally I'm with you on this and hope to get us there. I'm learning... > we're learning. I agree that changes on atomic, removing abstractions, etc. > should happen on dri-devel. 
> > When it comes to brand-new technologies (MST, Freesync), though, we're often > the first which means that we're spending a considerable amount of time to > get things right, working with HW teams, receiver vendors and other partners > internal and external to AMD. By the time we do get it right it's time to > hit the market. This gives us fairly little leeway to work with the > community on patches that won't land in distros for another half a year. > We're definitely hoping to improve some of this but it's not easy and in > some case impossible ahead of time (though definitely possibly after initial > release). Speaking with my Wayland hat on, I think these need to be very carefully considered. Both MST and FreeSync have _significant_ UABI implications, which may not be immediately obvious when working with a single implementation. Having them working and validated with a vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu + Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world we now have Weston, Mutter and KWin all directly driving KMS, plus whatever Mir/Unity ends up doing (presumably the same), and that's just on the desktop. Beyond the desktop, there's also CrOS/Freon and Android/HWC. For better or worse, outside of Xorg and HWC, we no longer have a vendor-provided userspace component driving KMS. It was also easy to get away with loose semantics before with X11 imposing little to no structure on rendering, but we now have the twin requirements of an atomic and timing-precise ABI - see Mario Kleiner's unending quest for accuracy - and also a vendor-independent ABI. So a good part of the (not insignificant) pain incurred in the atomic transition for drivers, was in fact making those drivers conform to the expectations of the KMS UABI contract, which just happened to not have been tripped over previously. 
Speaking with my Collabora hat on now: we did do a substantial amount of demidlayering on the Exynos driver, including an atomic conversion, on Google's behalf. The original Exynos driver happened to work with the Tizen stack, but ChromeOS exposed a huge amount of subtle behaviour differences between that and other drivers when using Freon. We'd also hit the same issues when attempting to use Weston on Exynos in embedded devices for OEMs we worked with, so took on the project to remove the midlayer and have as much as possible driven from generic code. How the hardware is programmed is of course ultimately up to you, and in this regard AMD will be very different from Intel is very different from Nouveau is very different from Rockchip. But especially for new features like FreeSync, I think we need to be very conscious of walking the line between getting those features in early, and setting unworkable UABI in stone. It would be unfortunate if later on down the line, you had to choose between breaking older xf86-video-amdgpu userspace which depended on specific behaviours of the amdgpu kernel driver, or breaking the expectations of generic userspace such as Weston/Mutter/etc. One good way to make sure you don't get into that position, is to have core KMS code driving as much of the machinery as possible, with a very clear separation of concerns between actual hardware programming, versus things which may be visible to userspace. This I think is DanielV's point expressed at much greater length. ;) I should be clear though that this isn't unique to AMD, nor a problem of your creation. For example, I'm currently looking at a flip-timing issue in Rockchip - a fairly small, recent, atomic-native, and generally exemplary driver - which I'm pretty sure is going to be resolved by deleting more driver code and using more of the helpers! 
Probably one of the reasons why KMS has been lagging behind in capability for a while (as Alex noted), is that even the basic ABI was utterly incoherent between drivers. The magnitude of the sea change that's taken place in KMS lately isn't always obvious to the outside world: the actual atomic modesetting API is just the cherry on top, rather than the most drastic change, which is the coherent driver-independent core machinery. Cheers, Daniel
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-13 12:22 ` Daniel Stone @ 2016-12-13 12:59 ` Daniel Vetter [not found] ` <20161213125953.zczaojxp37yg6a6f-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> [not found] ` <CAPj87rNrwsfAR75138WDQPbti_BmS_D-NxESZ075obcjO3T04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 66+ messages in thread From: Daniel Vetter @ 2016-12-13 12:59 UTC (permalink / raw) To: Daniel Stone Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander, Cheng, Tony On Tue, Dec 13, 2016 at 12:22:59PM +0000, Daniel Stone wrote: > Hi Harry, > I've been loathe to jump in here, not least because both cop roles > seem to be taken, but ... > > On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote: > > On 2016-12-11 09:57 PM, Dave Airlie wrote: > >> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote: > >> Sharing code is a laudable goal and I appreciate the resourcing > >> constraints that led us to the point at which we find ourselves, but > >> the way forward involves finding resources to upstream this code, > >> dedicated people (even one person) who can spend time on a day by day > >> basis talking to people in the open and working upstream, improving > >> other pieces of the drm as they go, reading atomic patches and > >> reviewing them, and can incrementally build the DC experience on top > >> of the Linux kernel infrastructure. Then having the corresponding > >> changes in the DC codebase happen internally to correspond to how the > >> kernel code ends up looking. Lots of this code overlaps with stuff the > >> drm already does, lots of is stuff the drm should be doing, so patches > >> to the drm should be sent instead. > > > > Personally I'm with you on this and hope to get us there. I'm learning... > > we're learning. I agree that changes on atomic, removing abstractions, etc. > > should happen on dri-devel. 
> > > > When it comes to brand-new technologies (MST, Freesync), though, we're often > > the first which means that we're spending a considerable amount of time to > > get things right, working with HW teams, receiver vendors and other partners > > internal and external to AMD. By the time we do get it right it's time to > > hit the market. This gives us fairly little leeway to work with the > > community on patches that won't land in distros for another half a year. > > We're definitely hoping to improve some of this but it's not easy and in > > some case impossible ahead of time (though definitely possibly after initial > > release). > > Speaking with my Wayland hat on, I think these need to be very > carefully considered. Both MST and FreeSync have _significant_ UABI > implications, which may not be immediately obvious when working with a > single implementation. Having them working and validated with a > vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu + > Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world > we now have Weston, Mutter and KWin all directly driving KMS, plus > whatever Mir/Unity ends up doing (presumably the same), and that's > just on the desktop. Beyond the desktop, there's also CrOS/Freon and > Android/HWC. For better or worse, outside of Xorg and HWC, we no > longer have a vendor-provided userspace component driving KMS. > > It was also easy to get away with loose semantics before with X11 > imposing little to no structure on rendering, but we now have the twin > requirements of an atomic and timing-precise ABI - see Mario Kleiner's > unending quest for accuracy - and also a vendor-independent ABI. So a > good part of the (not insignificant) pain incurred in the atomic > transition for drivers, was in fact making those drivers conform to > the expectations of the KMS UABI contract, which just happened to not > have been tripped over previously. 
> > Speaking with my Collabora hat on now: we did do a substantial amount > of demidlayering on the Exynos driver, including an atomic conversion, > on Google's behalf. The original Exynos driver happened to work with > the Tizen stack, but ChromeOS exposed a huge amount of subtle > behaviour differences between that and other drivers when using Freon. > We'd also hit the same issues when attempting to use Weston on Exynos > in embedded devices for OEMs we worked with, so took on the project to > remove the midlayer and have as much as possible driven from generic > code. > > How the hardware is programmed is of course ultimately up to you, and > in this regard AMD will be very different from Intel is very different > from Nouveau is very different from Rockchip. But especially for new > features like FreeSync, I think we need to be very conscious of > walking the line between getting those features in early, and setting > unworkable UABI in stone. It would be unfortunate if later on down the > line, you had to choose between breaking older xf86-video-amdgpu > userspace which depended on specific behaviours of the amdgpu kernel > driver, or breaking the expectations of generic userspace such as > Weston/Mutter/etc. > > One good way to make sure you don't get into that position, is to have > core KMS code driving as much of the machinery as possible, with a > very clear separation of concerns between actual hardware programming, > versus things which may be visible to userspace. This I think is > DanielV's point expressed at much greater length. ;) > > I should be clear though that this isn't unique to AMD, nor a problem > of your creation. For example, I'm currently looking at a flip-timing > issue in Rockchip - a fairly small, recent, atomic-native, and > generally exemplary driver - which I'm pretty sure is going to be > resolved by deleting more driver code and using more of the helpers! 
> Probably one of the reasons why KMS has been lagging behind in > capability for a while (as Alex noted), is that even the basic ABI was > utterly incoherent between drivers. The magnitude of the sea change > that's taken place in KMS lately isn't always obvious to the outside > world: the actual atomic modesetting API is just the cherry on top, > rather than the most drastic change, which is the coherent > driver-independent core machinery. +1 on everything Daniel said here. And I'm a bit worried that AMD is not realizing what's going on here, given that Michel called the plan that most everything will switch over to a generic kms userspace a "pipe dream". It's happening, and in a few years I expect the only amd-specific userspace left and still shipping will be amdgpu-pro for enterprise/workstation customers. In the end AMD missing that seems just another case of designing something pretty inhouse and entirely failing to synchronize with the community and what's going on outside of AMD. And for freesync specifically I agree with Daniel that enabling this only in -amdgpu gives us a very high chance of ending up with something that doesn't work elsewhere. Or is at least badly underspecified, and then tears and bloodshed ensue when someone else enables things. At intel we've already stopped enabling kms features only in -intel, and instead using weston, -modesetting or drm_hwcomposer as userspace demonstration vehicles for new stuff. And I'll be pushing everyone else in that direction, too. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161213125953.zczaojxp37yg6a6f-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> @ 2016-12-14 1:50 ` Michel Dänzer [not found] ` <afa3fdb6-1bb4-976e-d14f-b04ab8243819-otUistvHUpPR7s880joybQ@public.gmane.org> 0 siblings, 1 reply; 66+ messages in thread From: Michel Dänzer @ 2016-12-14 1:50 UTC (permalink / raw) To: Daniel Vetter, Daniel Stone Cc: Grodzovsky, Andrey, Harry Wentland, dri-devel, amd-gfx mailing list, Deucher, Alexander, Cheng, Tony On 13/12/16 09:59 PM, Daniel Vetter wrote: > On Tue, Dec 13, 2016 at 12:22:59PM +0000, Daniel Stone wrote: >> Hi Harry, >> I've been loathe to jump in here, not least because both cop roles >> seem to be taken, but ... >> >> On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote: >>> On 2016-12-11 09:57 PM, Dave Airlie wrote: >>>> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote: >>>> Sharing code is a laudable goal and I appreciate the resourcing >>>> constraints that led us to the point at which we find ourselves, but >>>> the way forward involves finding resources to upstream this code, >>>> dedicated people (even one person) who can spend time on a day by day >>>> basis talking to people in the open and working upstream, improving >>>> other pieces of the drm as they go, reading atomic patches and >>>> reviewing them, and can incrementally build the DC experience on top >>>> of the Linux kernel infrastructure. Then having the corresponding >>>> changes in the DC codebase happen internally to correspond to how the >>>> kernel code ends up looking. Lots of this code overlaps with stuff the >>>> drm already does, lots of is stuff the drm should be doing, so patches >>>> to the drm should be sent instead. >>> >>> Personally I'm with you on this and hope to get us there. I'm learning... >>> we're learning. I agree that changes on atomic, removing abstractions, etc. >>> should happen on dri-devel. 
>>> >>> When it comes to brand-new technologies (MST, Freesync), though, we're often >>> the first which means that we're spending a considerable amount of time to >>> get things right, working with HW teams, receiver vendors and other partners >>> internal and external to AMD. By the time we do get it right it's time to >>> hit the market. This gives us fairly little leeway to work with the >>> community on patches that won't land in distros for another half a year. >>> We're definitely hoping to improve some of this but it's not easy and in >>> some case impossible ahead of time (though definitely possibly after initial >>> release). >> >> Speaking with my Wayland hat on, I think these need to be very >> carefully considered. Both MST and FreeSync have _significant_ UABI >> implications, which may not be immediately obvious when working with a >> single implementation. Having them working and validated with a >> vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu + >> Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world >> we now have Weston, Mutter and KWin all directly driving KMS, plus >> whatever Mir/Unity ends up doing (presumably the same), and that's >> just on the desktop. Beyond the desktop, there's also CrOS/Freon and >> Android/HWC. For better or worse, outside of Xorg and HWC, we no >> longer have a vendor-provided userspace component driving KMS. >> >> It was also easy to get away with loose semantics before with X11 >> imposing little to no structure on rendering, but we now have the twin >> requirements of an atomic and timing-precise ABI - see Mario Kleiner's >> unending quest for accuracy - and also a vendor-independent ABI. So a >> good part of the (not insignificant) pain incurred in the atomic >> transition for drivers, was in fact making those drivers conform to >> the expectations of the KMS UABI contract, which just happened to not >> have been tripped over previously. 
>> >> Speaking with my Collabora hat on now: we did do a substantial amount >> of demidlayering on the Exynos driver, including an atomic conversion, >> on Google's behalf. The original Exynos driver happened to work with >> the Tizen stack, but ChromeOS exposed a huge amount of subtle >> behaviour differences between that and other drivers when using Freon. >> We'd also hit the same issues when attempting to use Weston on Exynos >> in embedded devices for OEMs we worked with, so took on the project to >> remove the midlayer and have as much as possible driven from generic >> code. >> >> How the hardware is programmed is of course ultimately up to you, and >> in this regard AMD will be very different from Intel is very different >> from Nouveau is very different from Rockchip. But especially for new >> features like FreeSync, I think we need to be very conscious of >> walking the line between getting those features in early, and setting >> unworkable UABI in stone. It would be unfortunate if later on down the >> line, you had to choose between breaking older xf86-video-amdgpu >> userspace which depended on specific behaviours of the amdgpu kernel >> driver, or breaking the expectations of generic userspace such as >> Weston/Mutter/etc. >> >> One good way to make sure you don't get into that position, is to have >> core KMS code driving as much of the machinery as possible, with a >> very clear separation of concerns between actual hardware programming, >> versus things which may be visible to userspace. This I think is >> DanielV's point expressed at much greater length. ;) >> >> I should be clear though that this isn't unique to AMD, nor a problem >> of your creation. For example, I'm currently looking at a flip-timing >> issue in Rockchip - a fairly small, recent, atomic-native, and >> generally exemplary driver - which I'm pretty sure is going to be >> resolved by deleting more driver code and using more of the helpers! 
>> Probably one of the reasons why KMS has been lagging behind in >> capability for a while (as Alex noted), is that even the basic ABI was >> utterly incoherent between drivers. The magnitude of the sea change >> that's taken place in KMS lately isn't always obvious to the outside >> world: the actual atomic modesetting API is just the cherry on top, >> rather than the most drastic change, which is the coherent >> driver-independent core machinery. > > +1 on everything Daniel said here. And I'm a bit worried that AMD is not > realizing what's going on here, given that Michel called the plan that > most everything will switch over to a generic kms userspace a "pipe > dream". It's happening, and in a few years I expect the only amd-specific > userspace left and still shipping will be amdgpu-pro for > enterprise/workstation customers. The pipe dream is replacing our Xorg drivers with -modesetting. I fully agree with you Daniels when it comes to non-Xorg userspace. > In the end AMD missing that seems just another case of designing something > pretty inhouse and entirely missing to synchronize with the community and > what's going on outside of AMD. > > And for freesync specifically I agree with Daniel that enabling this only > in -amdgpu gives us a very high chance of ending up with something that > doesn't work elsewhere. Or is at least badly underspecified, and then > tears and blodshed ensues when someone else enables things. Right, I think I clearly stated before both internally and externally that the current amdgpu-pro FreeSync support isn't suitable for upstream (not even for xf86-video-amdgpu). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <afa3fdb6-1bb4-976e-d14f-b04ab8243819-otUistvHUpPR7s880joybQ@public.gmane.org> @ 2016-12-14 15:46 ` Harry Wentland 0 siblings, 0 replies; 66+ messages in thread From: Harry Wentland @ 2016-12-14 15:46 UTC (permalink / raw) To: Michel Dänzer, Daniel Vetter, Daniel Stone Cc: Deucher, Alexander, Grodzovsky, Andrey, Cheng, Tony, dri-devel, amd-gfx mailing list On 2016-12-13 08:50 PM, Michel Dänzer wrote: > On 13/12/16 09:59 PM, Daniel Vetter wrote: >> On Tue, Dec 13, 2016 at 12:22:59PM +0000, Daniel Stone wrote: >>> Hi Harry, >>> I've been loathe to jump in here, not least because both cop roles >>> seem to be taken, but ... >>> >>> On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote: >>>> On 2016-12-11 09:57 PM, Dave Airlie wrote: >>>>> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote: >>>>> Sharing code is a laudable goal and I appreciate the resourcing >>>>> constraints that led us to the point at which we find ourselves, but >>>>> the way forward involves finding resources to upstream this code, >>>>> dedicated people (even one person) who can spend time on a day by day >>>>> basis talking to people in the open and working upstream, improving >>>>> other pieces of the drm as they go, reading atomic patches and >>>>> reviewing them, and can incrementally build the DC experience on top >>>>> of the Linux kernel infrastructure. Then having the corresponding >>>>> changes in the DC codebase happen internally to correspond to how the >>>>> kernel code ends up looking. Lots of this code overlaps with stuff the >>>>> drm already does, lots of is stuff the drm should be doing, so patches >>>>> to the drm should be sent instead. >>>> >>>> Personally I'm with you on this and hope to get us there. I'm learning... >>>> we're learning. I agree that changes on atomic, removing abstractions, etc. >>>> should happen on dri-devel. 
>>>> >>>> When it comes to brand-new technologies (MST, Freesync), though, we're often >>>> the first which means that we're spending a considerable amount of time to >>>> get things right, working with HW teams, receiver vendors and other partners >>>> internal and external to AMD. By the time we do get it right it's time to >>>> hit the market. This gives us fairly little leeway to work with the >>>> community on patches that won't land in distros for another half a year. >>>> We're definitely hoping to improve some of this but it's not easy and in >>>> some case impossible ahead of time (though definitely possibly after initial >>>> release). >>> >>> Speaking with my Wayland hat on, I think these need to be very >>> carefully considered. Both MST and FreeSync have _significant_ UABI >>> implications, which may not be immediately obvious when working with a >>> single implementation. Having them working and validated with a >>> vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu + >>> Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world >>> we now have Weston, Mutter and KWin all directly driving KMS, plus >>> whatever Mir/Unity ends up doing (presumably the same), and that's >>> just on the desktop. Beyond the desktop, there's also CrOS/Freon and >>> Android/HWC. For better or worse, outside of Xorg and HWC, we no >>> longer have a vendor-provided userspace component driving KMS. >>> >>> It was also easy to get away with loose semantics before with X11 >>> imposing little to no structure on rendering, but we now have the twin >>> requirements of an atomic and timing-precise ABI - see Mario Kleiner's >>> unending quest for accuracy - and also a vendor-independent ABI. So a >>> good part of the (not insignificant) pain incurred in the atomic >>> transition for drivers, was in fact making those drivers conform to >>> the expectations of the KMS UABI contract, which just happened to not >>> have been tripped over previously. 
>>> >>> Speaking with my Collabora hat on now: we did do a substantial amount >>> of demidlayering on the Exynos driver, including an atomic conversion, >>> on Google's behalf. The original Exynos driver happened to work with >>> the Tizen stack, but ChromeOS exposed a huge amount of subtle >>> behaviour differences between that and other drivers when using Freon. >>> We'd also hit the same issues when attempting to use Weston on Exynos >>> in embedded devices for OEMs we worked with, so took on the project to >>> remove the midlayer and have as much as possible driven from generic >>> code. >>> >>> How the hardware is programmed is of course ultimately up to you, and >>> in this regard AMD will be very different from Intel is very different >>> from Nouveau is very different from Rockchip. But especially for new >>> features like FreeSync, I think we need to be very conscious of >>> walking the line between getting those features in early, and setting >>> unworkable UABI in stone. It would be unfortunate if later on down the >>> line, you had to choose between breaking older xf86-video-amdgpu >>> userspace which depended on specific behaviours of the amdgpu kernel >>> driver, or breaking the expectations of generic userspace such as >>> Weston/Mutter/etc. >>> >>> One good way to make sure you don't get into that position, is to have >>> core KMS code driving as much of the machinery as possible, with a >>> very clear separation of concerns between actual hardware programming, >>> versus things which may be visible to userspace. This I think is >>> DanielV's point expressed at much greater length. ;) >>> >>> I should be clear though that this isn't unique to AMD, nor a problem >>> of your creation. For example, I'm currently looking at a flip-timing >>> issue in Rockchip - a fairly small, recent, atomic-native, and >>> generally exemplary driver - which I'm pretty sure is going to be >>> resolved by deleting more driver code and using more of the helpers! 
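Daniel's point about "core KMS code driving as much of the machinery as possible, with a very clear separation of concerns" is, concretely, the subclassing pattern KMS drivers use: the driver embeds the core object and recovers its private wrapper with container_of(), so the core owns control flow and the driver only adds hardware state. A minimal user-space sketch of that pattern (the struct names mimic amdgpu's but everything here is invented for illustration, not actual kernel code):

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's core object -- this is a
 * user-space sketch of the pattern, not actual DRM code. */
struct drm_crtc {
    int base_id;
};

/* container_of as defined in the kernel (simplified). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The driver subclasses the core object by embedding it, so the
 * core drives the control flow and the driver only carries the
 * hardware-programming state. */
struct amdgpu_crtc {
    struct drm_crtc base;   /* core-visible part: core calls into this */
    int hw_pipe;            /* driver-private hardware state */
};

/* Core hands the driver a drm_crtc; the driver recovers its own
 * wrapper with no midlayer bookkeeping in between. */
static int crtc_hw_pipe(struct drm_crtc *crtc)
{
    struct amdgpu_crtc *acrtc = container_of(crtc, struct amdgpu_crtc, base);
    return acrtc->hw_pipe;
}
```

The point of the pattern is that userspace-visible behaviour lives entirely in the core object, while anything behind `container_of` is free to change per hardware generation.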
>>> Probably one of the reasons why KMS has been lagging behind in >>> capability for a while (as Alex noted), is that even the basic ABI was >>> utterly incoherent between drivers. The magnitude of the sea change >>> that's taken place in KMS lately isn't always obvious to the outside >>> world: the actual atomic modesetting API is just the cherry on top, >>> rather than the most drastic change, which is the coherent >>> driver-independent core machinery. >> >> +1 on everything Daniel said here. And I'm a bit worried that AMD is not >> realizing what's going on here, given that Michel called the plan that >> most everything will switch over to a generic kms userspace a "pipe >> dream". It's happening, and in a few years I expect the only amd-specific >> userspace left and still shipping will be amdgpu-pro for >> enterprise/workstation customers. > > The pipe dream is replacing our Xorg drivers with -modesetting. I fully > agree with you Daniels when it comes to non-Xorg userspace. > > >> In the end AMD missing that seems just another case of designing something >> pretty inhouse and entirely missing to synchronize with the community and >> what's going on outside of AMD. >> >> And for freesync specifically I agree with Daniel that enabling this only >> in -amdgpu gives us a very high chance of ending up with something that >> doesn't work elsewhere. Or is at least badly underspecified, and then >> tears and blodshed ensues when someone else enables things. > > Right, I think I clearly stated before both internally and externally > that the current amdgpu-pro FreeSync support isn't suitable for upstream > (not even for xf86-video-amdgpu). > > Thanks, DanielS, DanielV, and Michel for the insight. Michel is actually one of the strongest voices at AMD against any ABI stuff that's not well thought-out and might get us in trouble down the road. 
Harry _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
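One reason the UABI concerns in this message carry weight: generic compositors discover KMS properties by name at runtime rather than hardcoding driver-specific IDs, so any behaviour a driver exposes effectively becomes contract for every KMS client. A self-contained sketch of that lookup pattern (plain structs standing in for the property arrays real clients get from libdrm; nothing here is actual libdrm code):

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the (name, value) pairs a KMS object exposes.
 * Real clients obtain these via drmModeObjectGetProperties();
 * this struct is invented for illustration. */
struct kms_prop {
    const char *name;
    unsigned long value;
};

/* Generic userspace never hardcodes property IDs: it scans the
 * object's property list for a well-known name, and degrades
 * gracefully when a driver doesn't expose it. */
static const struct kms_prop *find_prop(const struct kms_prop *props,
                                        size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(props[i].name, name) == 0)
            return &props[i];
    return NULL;                /* driver doesn't expose it */
}
```

Because every compositor does some version of this, a property semantic that only one vendor stack exercises can quietly become unworkable UABI once other clients start probing it.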
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <CAPj87rNrwsfAR75138WDQPbti_BmS_D-NxESZ075obcjO3T04g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-12-14 16:35 ` Alex Deucher 0 siblings, 0 replies; 66+ messages in thread From: Alex Deucher @ 2016-12-14 16:35 UTC (permalink / raw) To: Daniel Stone Cc: Grodzovsky, Andrey, Harry Wentland, amd-gfx mailing list, Cheng, Tony, dri-devel, Deucher, Alexander, Dave Airlie On Tue, Dec 13, 2016 at 7:22 AM, Daniel Stone <daniel@fooishbar.org> wrote: > Hi Harry, > I've been loathe to jump in here, not least because both cop roles > seem to be taken, but ... > > On 13 December 2016 at 01:49, Harry Wentland <harry.wentland@amd.com> wrote: >> On 2016-12-11 09:57 PM, Dave Airlie wrote: >>> On 8 December 2016 at 12:02, Harry Wentland <harry.wentland@amd.com> wrote: >>> Sharing code is a laudable goal and I appreciate the resourcing >>> constraints that led us to the point at which we find ourselves, but >>> the way forward involves finding resources to upstream this code, >>> dedicated people (even one person) who can spend time on a day by day >>> basis talking to people in the open and working upstream, improving >>> other pieces of the drm as they go, reading atomic patches and >>> reviewing them, and can incrementally build the DC experience on top >>> of the Linux kernel infrastructure. Then having the corresponding >>> changes in the DC codebase happen internally to correspond to how the >>> kernel code ends up looking. Lots of this code overlaps with stuff the >>> drm already does, lots of is stuff the drm should be doing, so patches >>> to the drm should be sent instead. >> >> Personally I'm with you on this and hope to get us there. I'm learning... >> we're learning. I agree that changes on atomic, removing abstractions, etc. >> should happen on dri-devel. 
>> >> When it comes to brand-new technologies (MST, Freesync), though, we're often >> the first which means that we're spending a considerable amount of time to >> get things right, working with HW teams, receiver vendors and other partners >> internal and external to AMD. By the time we do get it right it's time to >> hit the market. This gives us fairly little leeway to work with the >> community on patches that won't land in distros for another half a year. >> We're definitely hoping to improve some of this but it's not easy and in >> some case impossible ahead of time (though definitely possibly after initial >> release). > > Speaking with my Wayland hat on, I think these need to be very > carefully considered. Both MST and FreeSync have _significant_ UABI > implications, which may not be immediately obvious when working with a > single implementation. Having them working and validated with a > vertical stack of amdgpu-DC/KMS + xf86-video-amdgpu + > Mesa-amdgpu/AMDGPU-Pro is one thing, but looking outside the X11 world > we now have Weston, Mutter and KWin all directly driving KMS, plus > whatever Mir/Unity ends up doing (presumably the same), and that's > just on the desktop. Beyond the desktop, there's also CrOS/Freon and > Android/HWC. For better or worse, outside of Xorg and HWC, we no > longer have a vendor-provided userspace component driving KMS. > > It was also easy to get away with loose semantics before with X11 > imposing little to no structure on rendering, but we now have the twin > requirements of an atomic and timing-precise ABI - see Mario Kleiner's > unending quest for accuracy - and also a vendor-independent ABI. So a > good part of the (not insignificant) pain incurred in the atomic > transition for drivers, was in fact making those drivers conform to > the expectations of the KMS UABI contract, which just happened to not > have been tripped over previously. 
> > Speaking with my Collabora hat on now: we did do a substantial amount > of demidlayering on the Exynos driver, including an atomic conversion, > on Google's behalf. The original Exynos driver happened to work with > the Tizen stack, but ChromeOS exposed a huge amount of subtle > behaviour differences between that and other drivers when using Freon. > We'd also hit the same issues when attempting to use Weston on Exynos > in embedded devices for OEMs we worked with, so took on the project to > remove the midlayer and have as much as possible driven from generic > code. > > How the hardware is programmed is of course ultimately up to you, and > in this regard AMD will be very different from Intel is very different > from Nouveau is very different from Rockchip. But especially for new > features like FreeSync, I think we need to be very conscious of > walking the line between getting those features in early, and setting > unworkable UABI in stone. It would be unfortunate if later on down the > line, you had to choose between breaking older xf86-video-amdgpu > userspace which depended on specific behaviours of the amdgpu kernel > driver, or breaking the expectations of generic userspace such as > Weston/Mutter/etc.

For clarity, as Michel said, the freesync stuff we have in the pro driver is not intended for upstream in either the kernel or the userspace. It's a short-term solution for short-term deliverables. That said, I think it's also useful to have something developers in the community can test and play with to get a better understanding of what use cases make sense when designing and validating the upstream solution.

Alex
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-12 2:57 ` Dave Airlie 2016-12-12 7:09 ` Daniel Vetter [not found] ` <CAPM=9tx+j9-3fZNY=peLjdsVqyLS6i3V-sV3XrnYsK2YuhWRBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-12-13 2:52 ` Cheng, Tony [not found] ` <5a1f2762-f1e0-05f1-3c16-173cb1f46571-5C7GfCeVMHo@public.gmane.org> 2016-12-13 9:40 ` Lukas Wunner 2 siblings, 2 replies; 66+ messages in thread From: Cheng, Tony @ 2016-12-13 2:52 UTC (permalink / raw) To: Dave Airlie, Harry Wentland Cc: Grodzovsky, Andrey, amd-gfx mailing list, dri-devel, Deucher, Alexander [-- Attachment #1.1: Type: text/plain, Size: 17244 bytes --] On 12/11/2016 9:57 PM, Dave Airlie wrote: > On 8 December 2016 at 12:02, Harry Wentland<harry.wentland@amd.com> wrote: >> We propose to use the Display Core (DC) driver for display support on >> AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to >> avoid a flag day the plan is to only support uGPU initially and transition >> to older ASICs gradually. > [FAQ: from past few days] > > 1) Hey you replied to Daniel, you never addressed the points of the RFC! > I've read it being said that I hadn't addressed the RFC, and you know > I've realised I actually had, because the RFC is great but it > presupposes the codebase as designed can get upstream eventually, and > I don't think it can. The code is too littered with midlayering and > other problems, that actually addressing the individual points of the > RFC would be missing the main point I'm trying to make. > > This code needs rewriting, not cleaning, not polishing, it needs to be > split into its constituent parts, and reintegrated in a form more > Linux process friendly. > > I feel that if I reply to the individual points Harry has raised in > this RFC, that it means the code would then be suitable for merging, > which it still won't, and I don't want people wasting another 6 > months. 
> > If DC was ready for the next-gen GPU it would be ready for the current > GPU, it's not the specific ASIC code that is the problem, it's the > huge midlayer sitting in the middle.

We would love to upstream DC for all supported ASICs! We made enough changes to make Sea Islands work, but it's really not validated to the extent we validate Polaris on Linux, and nowhere close to what we do for 2017 ASICs. With DC the display hardware programming, resource optimization, power management and interaction with the rest of the system will be fully validated across multiple OSes. Therefore we have high confidence that the quality is going to be better than what we have upstreamed today.

I don't have a baseline to say whether DC is of good enough quality for older generations compared to upstream. For example, we don't have HW-generated bandwidth_calc for DCE 8/10 (the Sea/Volcanic Islands families), but our code is structured in a way that assumes bandwidth_calc is there. None of us feels like untangling the formulas in the Windows driver at this point to create our own version of bandwidth_calc. It sort of works with HW default values, but some modes/configs are likely to underflow. If the community is okay with uncertain quality, sure, we would love to upstream everything to reduce our maintenance overhead. You do get audio with DC on DCE8, though.

> 2) We really need to share all of this code between OSes, why does > Linux not want it? > > Sharing code is a laudable goal and I appreciate the resourcing > constraints that led us to the point at which we find ourselves, but > the way forward involves finding resources to upstream this code, > dedicated people (even one person) who can spend time on a day by day > basis talking to people in the open and working upstream, improving > other pieces of the drm as they go, reading atomic patches and > reviewing them, and can incrementally build the DC experience on top > of the Linux kernel infrastructure.
Then having the corresponding > changes in the DC codebase happen internally to correspond to how the > kernel code ends up looking. Lots of this code overlaps with stuff the > drm already does, lots of is stuff the drm should be doing, so patches > to the drm should be sent instead.

Maybe let me share what we are doing and see if we can come up with something that makes DC work for both upstream and our internal needs. We are sharing code not just on Linux, and we will do our best to make our code upstream friendly. Last year we focused on having enough code to prove that our DAL rewrite works and on getting more people contributing to it. We rushed a bit; as a result we had a few legacy components we ported from the Windows driver, and we know that's bloat that needs to go.

We designed DC so HW can contribute bandwidth_calc magic and pseudo code to program the HW blocks. The HW blocks at the bottom of DC.JPG model our HW blocks, and the programming sequences are provided by HW engineers. If a piece of HW needs a bit toggled 7 times during power up, I'd rather have the HW engineer put that in their pseudo code than try to find that sequence in some document myself. After all, they did simulate the HW with the toggle sequence. I guess this is the back-end code Daniel talked about. Can we agree that DRM core is not interested in how things are done in that layer, and that we can upstream these as is?

The next layer is dce_hwseq.c, which programs the HW blocks in the correct sequence. Some HW blocks can be programmed in any sequence, but some require a strict sequence to be followed. For example, the display clock and PHY clock need to be up before we enable the timing generator. I would like these sequences to remain in DC, as it's really not DRM's business to know how to program the HW. In a way you can consider hwseq a helper to commit state to HW.

Above hwseq is dce*_resource.c. Its job is to come up with the HW state required to realize a given config.
For example, we would use the exact same HW resources with the same optimization settings to drive any same given config. If 4 x 4k@60 is supported with resource setting A on the HW diagnostic suite during bring-up but setting B on Linux, then we have a problem. Resource knows which HW blocks work with which blocks, and their capabilities and limitations. I hope you are not asking for this stuff to move up to the core, because in reality we should probably hide it in some FW; just because the HW exposes registers that can be configured differently doesn't mean every combination of HW usage is validated. To me, resource is more of a helper to put together a functional pipeline, and it does not make any decision that any OS might be interested in.

The yellow boxes in DC.JPG are really specific to each generation of HW and change frequently. These are things that HW has considered hiding in FW before. Can we agree that this code (under /dc/dce*) can stay?

DAL3.JPG shows how we put this all together. The core part is designed to behave like a helper, except we try to limit the entry points and opted for the caller to build the desired state it wants DC to commit to. It didn't make sense for us to expose hundreds of functions (our Windows DAL interface did) and require the caller to invoke these helpers in the correct sequence. The caller builds the absolute state it wants to get to, and DC will make it happen with the HW available.

> 3) Then how do we upstream it? > Resource(s) need(s) to start concentrating at splitting this thing up > and using portions of it in the upstream kernel. We don't land fully > formed code in the kernel if we can avoid it. Because you can't review > the ideas and structure as easy as when someone builds up code in > chunks and actually develops in the Linux kernel. This has always > produced better more maintainable code. Maybe the result will end up > improving the AMD codebase as well.

Is this about demonstrating how the basic functionality works and then adding more features with a series of patches to make review easier?
If so, I don't think we are staffed to do this kind of rewrite. For example, it makes no sense to hook up bandwidth_calc to calculate HW magic if we don't have mem_input to program the memory settings. We need a portion of hw_seq to ensure these blocks are programmed in the correct sequence. We will need to feed bandwidth_calc its required inputs, which is basically the whole system state tracked in validate_context today, which means we basically need the big bulk of resource.c. This effort might have benefits for reviewing the code, but we will end up with pretty much similar, if not the same, code as what we already have.

Or is the objection that we have the white boxes in DC.JPG instead of using DRM objects? We can probably work out something to have the white boxes derive from DRM objects and extend the atomic state with our validate_context, where dce*_resource.c stores the constructed pipelines.

> 4) Why can't we put this in staging? > People have also mentioned staging, Daniel has called it a dead end, > I'd have considered staging for this code base, and I still might. > However staging has rules, and the main one is code in staging needs a > TODO list, and agreed criteria for exiting staging, I don't think we'd > be able to get an agreement on what the TODO list should contain and > how we'd ever get all things on it done. If this code ended up in > staging, it would most likely require someone dedicated to recreating > it in the mainline driver in an incremental fashion, and I don't see > that resource being available. > > 5) Why is a midlayer bad? > I'm not going to go into specifics on the DC midlayer, but we abhor > midlayers for a fair few reasons. The main reason I find causes the > most issues is locking. When you have breaks in code flow between > multiple layers, but having layers calling back into previous layers > it becomes near impossible to track who owns the locking and what the > current locking state is.
> > Consider
>   drma -> dca -> dcb -> drmb
>   drmc -> dcc -> dcb -> drmb
>
> We have two code paths that go back into drmb, now maybe drma has a > lock taken, but drmc doesn't, but we've no indication when we hit drmb > of what the context pre entering the DC layer is. This causes all > kinds of problems. The main requirement is the driver maintains the > execution flow as much as possible. The only callback behaviour should > be from an irq or workqueue type situations where you've handed > execution flow to the hardware to do something and it is getting back > to you. The pattern we use to get out of this sort of hole is helper > libraries, we structure code as much as possible as leaf nodes that > don't call back into the parents if we can avoid it (we don't always > succeed).

Okay. By the way, DC does behave like a helper for the most part. There is no locking in DC. We work enough with different OSes to know they all have different synchronization primitives and interrupt handling, and having DC lock anything would just be shooting ourselves in the foot. We do have functions with "lock" in their names in DC, but those are HW register locks to ensure that the HW registers update atomically, i.e. have 50 register writes latch in HW at the next vsync to ensure the HW state changes on the vsync boundary.

> So the above might become
> drma -> dca_helper
>      -> dcb_helper
>      -> drmb.
>
> In this case the code flow is controlled by drma, dca/dcb might be > modifying data or setting hw state but when we get to drmb it's easy > to see what data it needs and what locking. > > DAL/DC goes against this in so many ways, and when I look at the code > I'm never sure where to even start pulling the thread to unravel it.

I don't know where we go against it.
In the case where we do call back into DRM, for MST we have:

amdgpu_dm_atomic_commit (implements atomic_commit)
  dc_commit_targets (commit helper)
    dce110_apply_ctx_to_hw (hw_seq)
      core_link_enable_stream (part of the MST enable sequence)
        allocate_mst_payload (helper for the above func, in the same file)
          dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM)
            drm_dp_mst_allocate_vcpi (DRM)

As you can see, even in this case we are only 6 levels deep before we call back into DRM, and 2 of those functions are in the same file, as helpers for the bigger sequence.

Can you clarify the distinction between what you would call a midlayer vs. a helper? We consulted Alex a lot, and we know about this inversion-of-control pattern and are trying our best to follow it. Is it the way functions are named and the file/folder structure? Would it help if we flattened amdgpu_dm_atomic_commit and dc_commit_targets? Even if we did, I would imagine we would want some helper in commit rather than a giant 1000-line function. Is there any concern that we put dc_commit_targets under the /dc folder because we want other platforms to run the exact same helper? Or is this about dc_commit_targets being too big? Or that the state is stored in validate_context rather than drm_atomic_state?

I don't think it makes sense for DRM to get into how we decide to use our HW blocks. For example, any refactoring done in core should not result in us using a different pipeline to drive the same config. We would like to have control over how our HW pipeline is constructed.

> Some questions I have for AMD engineers that also I'd want to see > addressed before any consideration of merging would happen! > > How do you plan on dealing with people rewriting or removing code > upstream that is redundant in the kernel, but required for internal > stuff?

Honestly, I don't know what these are.
Like when you and Jerome removed the func ptr abstraction (I know it was bad; that was one of the components we ported from Windows), and we needed to keep it as function pointers so we can still run our code on FPGA before we see first silicon? I don't think NAKing the function ptr removal would be a problem for the community. The rest is valued and we took it with open arms. Or is this more like having code duplication after DRM adds some functionality we can use? I would imagine it's more a matter of moving what we got working in our code into DRM core once we are upstreamed, and we have no problem accommodating that, as the code moved out to DRM core can be used by other platforms. We don't have any private ioctls today, and we don't plan to have any outside of using DRM object properties.

> How are you going to deal with new Linux things that overlap > incompatibly with your internally developed stuff?

I really don't know what new Linux things could cause us problems. If anything, the new things will probably come from us if we are upstreamed. Atomic: we had that on Windows 8 years ago for Windows Vista; yes, the semantics/abstraction are different, but the concept is the same. We could have easily settled on DRM's semantics, or DRM could easily take some form of our pattern. DP MST: AMD was the first source certified, and we worked closely with the first branch certified. I was a part of that team and we had a very solid implementation. If we had been upstreamed, I don't see why you would want to reinvent the wheel rather than massage what we have into shape for DRM core for other drivers to reuse. drm_plane: Windows multi-plane overlay and Android HW composer? We had that working 2 years ago. If you are upstreamed and you are first, you usually have a say in how it should go down, don't you? The new things coming are FreeSync HDR, 8k@60 with DP DSC, etc. I would imagine we would beat all other vendors to the first open source solution if we leverage the effort from our extended display team.
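Dave's locking argument in point 5 above can be made concrete: when the leaf helpers never call back up the stack, the top-level driver function alone defines what locks are held by the time core code runs. A toy C sketch of the helper shape (names taken from his drma/dca example; nothing here is real DRM code, and a boolean stands in for an actual lock):

```c
#include <assert.h>
#include <stdbool.h>

static bool lock_held;

/* Leaf helpers: they transform state but never call back up into
 * the caller, so the caller alone decides the locking context. */
static int dca_helper(int cfg) { return cfg * 2; }
static int dcb_helper(int cfg) { return cfg + 1; }

/* drmb is core code that requires the lock.  In the helper model it
 * is entered from exactly one place, with a known lock state. */
static int drmb(int cfg)
{
    assert(lock_held);          /* locking context is unambiguous */
    return cfg;
}

/* drma owns the execution flow end to end: take the lock, run the
 * leaf helpers, then enter core code.  No layer calls back into it. */
static int drma(int cfg)
{
    lock_held = true;           /* stand-in for mutex_lock(...) */
    int state = dca_helper(cfg);
    state = dcb_helper(state);
    int ret = drmb(state);
    lock_held = false;          /* stand-in for mutex_unlock(...) */
    return ret;
}
```

In the midlayer shape, by contrast, dca or dcb would call drmb themselves from several entry points, and drmb could no longer assert anything about the lock.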
> If the code is upstream will it be tested in the kernel by some QA > group, or will there be some CI infrastructure used to maintain and to > watch for Linux code that breaks assumptions in the DC code?

We have a tester that runs a set of display tests every other day on Linux. We don't run on the drm-next tree yet, and Alex is working out a plan to allow us to use drm-next as our development branch. Upstream is not likely to be tested by QA, though.

DC does not assume anything. DC requires the full state to be given in dc_commit_targets / dc_commit_surfaces_to_target; we do whatever is specified in the data structure. dc_commit_surfaces_to_target can be considered a helper function to change planes without visual side effects on the vsync boundary. dc_commit_targets can be considered a helper function to light up a display with a black screen. DRM core has full control over whether you light up to a black screen as soon as a monitor is plugged in, or only after someone does a modeset. The hotplug interrupt goes to amdgpu_dm, which takes the required locks on the DRM objects before calling into DC to detect.

> Can you show me you understand that upstream code is no longer 100% in > your control and things can happen to it that you might not expect and > you need to deal with it?

I think so, other than that we haven't been spamming the mailing list. We are already dealing with not controlling 100% of our code, to some extent. We don't control bandwidth_calc. Trust me, we are not keeping up with the updates that HW is making to it for next-gen hw. Every time we pull there is a new term they've added, and we have to find a way to feed that input. We have to clean up the Linux style for them every time we pull. Our HW diagnostic suite has a different set of requirements, and they frequently contribute to our code. We took your and Jerome's patch. If it's validated, we want that code. At the end of the day, I think the architecture question is really about what's HW and what's DRM core.
Like I said, all the yellow boxes have been proposed to run on firmware, but we decided to keep them in the driver as it's easier to debug on x86 than on a uC. I can tell you that our HW guys weren't happy when I decided to open source bandwidth_calc, but we did it anyway. I feel like, because we are opening up the complexity and inner workings of our HW, we are somehow getting penalized for being open.

> > Dave.

Tony

[-- Attachment #1.2: Type: text/html, Size: 20118 bytes --] [-- Attachment #2: DC.JPG --] [-- Type: image/jpeg, Size: 147686 bytes --] [-- Attachment #3: DAL3.JPG --] [-- Type: image/jpeg, Size: 117556 bytes --] [-- Attachment #4: Type: text/plain, Size: 160 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
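The "caller builds the absolute state, DC makes it happen" model Tony describes for dc_commit_targets can be sketched in miniature: a single entry point that diffs the desired state against the current hardware state and programs only what changed, with no sequencing contract for the caller. All types, fields and the function name below are invented for illustration; this is not DC's actual interface:

```c
#include <string.h>

#define MAX_TARGETS 6

struct dc_target_state {
    int width, height, refresh_hz;
    int enabled;
};

/* The whole desired configuration in one struct: no incremental
 * setters, so the caller cannot get an ordering wrong. */
struct dc_state {
    struct dc_target_state targets[MAX_TARGETS];
};

static struct dc_state hw;      /* stand-in for programmed hardware */

/* One entry point: compare desired state with current hw state and
 * reprogram only the targets that differ.  Returns how many targets
 * were (re)programmed, which makes the diffing visible to callers. */
static int dc_commit_state(const struct dc_state *desired)
{
    int programmed = 0;
    for (int i = 0; i < MAX_TARGETS; i++) {
        if (memcmp(&hw.targets[i], &desired->targets[i],
                   sizeof(hw.targets[i])) != 0) {
            hw.targets[i] = desired->targets[i];
            programmed++;
        }
    }
    return programmed;
}
```

Committing the same state twice is a no-op, which is the property that lets the caller treat the interface as "declare what you want" rather than "issue the right sequence of calls".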
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <5a1f2762-f1e0-05f1-3c16-173cb1f46571-5C7GfCeVMHo@public.gmane.org> @ 2016-12-13 7:09 ` Dave Airlie 0 siblings, 0 replies; 66+ messages in thread From: Dave Airlie @ 2016-12-13 7:09 UTC (permalink / raw) To: Cheng, Tony Cc: Grodzovsky, Andrey, Cyr, Aric, Bridgman, John, Lazare, Jordan, amd-gfx mailing list, dri-devel, Deucher, Alexander, Harry Wentland > We would love to upstream DC for all supported asic! We made enough change > to make Sea Island work but it's really not validate to the extend we > validate Polaris on linux and no where close to what we do for 2017 ASICs. > With DC the display hardware programming, resource optimization, power > management and interaction with rest of system will be fully validated > across multiple OSs. Therefore we have high confidence that the quality is > going to better than what we have upstreammed today. > > I don't have a baseline to say if DC is in good enough quality for older > generation compare to upstream. For example we don't have HW generate > bandwidth_calc for DCE 8/10 (Sea/Vocanic island family) but our code is > structured in a way that we assume bandwidth_calc is there. None of us feel > like go untangle the formulas in windows driver at this point to create our > own version of bandwidth_calc. It sort of work with HW default values but > some mode / config is likely to underflows. If community is okay with > uncertain quality, sure we would love to upstream everything to reduce our > maintaince overhead. You do get audio with DC on DCE8 though. If we get any of this upstream, we should get all of the hw supported with it. If it regresses we just need someone to debug why. > Maybe let me share what we are doing and see if we can come up with > something to make DC work for both upstream and our internal need. We are > sharing code not just on Linux and we will do our best to make our code > upstream friendly. 
Last year we focussed on having enough code to prove > that our DAL rewrite works and get more people contributing to it. We rush > a bit as a result we had a few legacy component we port from Windows driver > and we know it's bloat that needed to go. > > We designed DC so HW can contribute bandwidth_calc magic and psuedo code to > program the HW blocks. The HW blocks on the bottom of DC.JPG in models our > HW blocks and the programming sequence are provided by HW engineers. If a > piece of HW need a bit toggled 7 times during power up I rather have HW > engineer put that in their psedo code rather than me trying to find that > sequence in some document. Afterall they did simulate the HW with the > toggle sequence. I guess these are back-end code Daniel talked about. Can > we agree that DRM core is not interested in how things are done in that > layer and we can upstream these as it? > > The next is dce_hwseq.c to program the HW blocks in correct sequence. Some > HW block can be programmed in any sequence, but some requires strict > sequence to be followed. For example Display CLK and PHY CLK need to be up > before we enable timing generator. I would like these sequence to remain in > DC as it's really not DRM's business to know how to program the HW. In a > way you can consider hwseq as a helper to commit state to HW. > > Above hwseq is the dce*_resource.c. It's job is to come up with the HW > state required to realize given config. For example we would use the exact > same HW resources with same optimization setting to drive any same given > config. If 4 x 4k@60 is supported with resource setting A on HW diagnositc > suite during bring up setting B on Linux then we have a problem. It know > which HW block work with which block and their capability and limitations. 
> I hope you are not asking for this stuff to move up into the core, because in reality > we should probably hide it in some FW; just because HW exposes the registers to configure > blocks differently doesn't mean every combination of HW usage is validated. > To me, resource is more of a helper to put together a functional pipeline, and it > does not make any decision that any OS might be interested in. > > These yellow boxes in DC.JPG are really specific to each generation of HW > and change frequently. These are things that HW has considered hiding in > FW before. Can we agree that this code (under /dc/dce*) can stay? I think most of these things are fine to be part of the solution we end up at, but I can't say for certain they won't require interface changes. I think the most useful code is probably the stuff in the dce subdirectories. > > Is this about demonstrating how basic functionality works and adding more > features with a series of patches to make review easier? If so, I don't think > we are staffed to do this kind of rewrite. For example, it makes no sense to > hook up bandwidth_calc to calculate HW magic if we don't have mem_input > to program the memory settings. We need a portion of hw_seq to ensure these > blocks are programmed in the correct sequence. We will need to feed > bandwidth_calc its required inputs, which is basically the whole system > state tracked in validate_context today, which means we basically need the big > bulk of resource.c. This effort might have benefits for reviewing the code, > but we will end up with pretty much similar, if not the same as, what we > already have. This is something people always say. I'm betting you won't end up there at all; it's not just review, it's an incremental development model, so that when things go wrong we can pinpoint why and where a lot easier. Just merging this all in one fell swoop is going to just mean a lot of pain in the end.
I understand you aren't resourced for this sort of development on this codebase, but it's going to be an impasse to try and merge this all at once even if it was clean code. > Or is the objection that we have the white boxes in DC.JPG instead of using > DRM objects? We can probably work out something to have the white boxes > derive from DRM objects and extend atomic state with our validate_context, > where dce*_resource.c stores the constructed pipelines. I think Daniel explained quite well how things should look in terms of subclassing. > > 5) Why is a midlayer bad? > I'm not going to go into specifics on the DC midlayer, but we abhor > midlayers for a fair few reasons. The main reason I find causes the > most issues is locking. When you have breaks in code flow between > multiple layers, but have layers calling back into previous layers, > it becomes near impossible to track who owns the locking and what the > current locking state is. > > Consider > drma -> dca -> dcb -> drmb > drmc -> dcc -> dcb -> drmb > > We have two code paths that go back into drmb; now maybe drma has a > lock taken, but drmc doesn't, and we've no indication when we hit drmb > of what the context was before entering the DC layer. This causes all > kinds of problems. The main requirement is that the driver maintains the > execution flow as much as possible. The only callback behaviour should > be from irq or workqueue type situations, where you've handed > execution flow to the hardware to do something and it is getting back > to you. The pattern we use to get out of this sort of hole is helper > libraries; we structure code as much as possible as leaf nodes that > don't call back into the parents if we can avoid it (we don't always > succeed). > > Okay. By the way, DC does behave like a helper for the most part. There is no > locking in DC.
We work enough with different OSes to know they all have > different synchronization primitives and interrupt handling, and having DC lock > anything is just shooting ourselves in the foot. We do have functions with > "lock" in their name in DC, but those are HW register locks to ensure > that the HW registers update atomically, i.e. have 50 register writes latch in > HW at the next vsync to ensure the HW state changes on a vsync boundary. > > So the above might become > drma-> dca_helper > -> dcb_helper > -> drmb. > > In this case the code flow is controlled by drma; dca/dcb might be > modifying data or setting hw state, but when we get to drmb it's easy > to see what data it needs and what locking. > > DAL/DC goes against this in so many ways, and when I look at the code > I'm never sure where to even start pulling the thread to unravel it. > > I don't know where we go against it. In the case where we do call back to DRM for > the MST case, we have > > amdgpu_dm_atomic_commit (implements atomic_commit) > dc_commit_targets (commit helper) > dce110_apply_ctx_to_hw (hw_seq) > core_link_enable_stream (part of MST enable sequence) > allocate_mst_payload (helper for above func in same file) > dm_helpers_dp_mst_write_payload_allocation_table (glue code to call DRM) > drm_dp_mst_allocate_vcpi (DRM) > > As you see, even in this case we are only 6 levels deep before we call back into > DRM, and 2 of those functions are in the same file as helper funcs of the bigger > sequence. > > Can you clarify the distinction between what you would call a midlayer vs a > helper? We consulted Alex a lot, we know about this inversion-of-control > pattern, and we are trying our best to follow it. Is it the way functions are > named and the file/folder structure? Would it help if we flattened > amdgpu_dm_atomic_commit and dc_commit_targets? Even if we did, I would > imagine we'd want some helpers in commit rather than a giant 1000-line function.
Is > there any concern about putting dc_commit_targets under the /dc folder, since we want > other platforms to run the exact same helper? Or is this about > dc_commit_targets being too big? Or the state being stored in validate_context > rather than drm_atomic_state? Well one area I hit today while looking is tracing the path for a dpcd read or write. An internal one in the dc layer goes core_link_dpcd_read > > I don't think it makes sense for DRM to get into how we decide to use our HW > blocks. For example, any refactoring done in core should not result in us using a > different pipeline to drive the same config. We would like to have control > over how our HW pipeline is constructed. > > Some questions I have for AMD engineers that also I'd want to see > addressed before any consideration of merging would happen! > > How do you plan on dealing with people rewriting or removing code > upstream that is redundant in the kernel, but required for internal > stuff? > > > Honestly I don't know what these would be. Like when you and Jerome removed the func ptr > abstraction (I know it was bad; that was one of the components we ported from > Windows), but we need to keep it as a function pointer so we can still run our > code on FPGA before we see first silicon. I don't think us nak'ing the > function ptr removal will be a problem for the community. The rest is valued, > and we took it with open arms. > > Or is this more like we have code duplication after DRM adds some > functionality we can use? I would imagine it's more a matter of moving what we got > working in our code into DRM core once we are upstreamed, and we have no problem > accommodating that, as the code moved out to DRM core can be included on > other platforms. We don't have any private ioctls today and we don't plan to > have any outside of using DRM object properties. > > > How are you going to deal with new Linux things that overlap > incompatibly with your internally developed stuff?
> > I really don't know what new Linux things could cause us > problems. If anything, the new things will probably come from us if we are > upstreamed. > > atomic: we had that on Windows 8 years ago for Windows Vista; yes, the > semantics/abstraction are different, but the concept is the same. We could have > easily settled on DRM semantics, or DRM could easily take some form of our > pattern. > > DP MST: AMD was the first source certified, and we worked closely with the > first branch certified. I was a part of that team and we had a very solid > implementation. If we were upstreamed, I don't see why you would want to > reinvent the wheel and not try to massage what we have into shape for DRM > core for other drivers to reuse. > > drm_plane: Windows multi-plane overlay and Android HW composer? We had that > working 2 years ago. If you are upstreamed and you are first, you usually > have a say in how it should go down, don't you? > > The new things coming are FreeSync HDR, 8k@60 with DP DSC, etc. I would > imagine we would beat all other vendors to the first open-source solution if > we leverage the effort from our extended display team. > > If the code is upstream, will it be tested in the kernel by some QA > group, or will there be some CI infrastructure used to maintain and to > watch for Linux code that breaks assumptions in the DC code? > > We have a tester that runs a set of display tests every other day on Linux. We > don't run on the drm-next tree yet, and Alex is working out a plan to allow us to > use drm-next as our development branch. Upstream is not likely to be tested > by QA though. > > DC does not assume anything. DC requires the full state given in > dc_commit_targets / dc_commit_surfaces_to_target; we do whatever is > specified in the data structure. dc_commit_surfaces_to_target can be > considered a helper function to change planes without visual side effects > on a vsync boundary.
dc_commit_targets can be considered a helper function > to light up a display with a black screen. DRM core has full control over whether you > want to light up to a black screen as soon as a monitor is plugged in, or you > want to light up after someone does a mode set. The hotplug interrupt goes to > amdgpu_dm, and it will take the required locks on the DRM objects before calling > DC to detect. > > Can you show me you understand that upstream code is no longer 100% in > your control and things can happen to it that you might not expect and > you need to deal with it? > > I think so, other than we haven't been spamming the mailing list. We are > already dealing with not controlling 100% of our code to some extent. We > don't control bandwidth_calc. Trust me, we are not keeping up with the > updates that HW is doing to it for next-gen hw. Every time we pull, there > is a new term they added and we have to find a way to feed that input. We > have had to clean up Linux style for them every time we pull. Our HW diagnostic > suite has a different set of requirements, and they frequently contribute to > our code. We took your and Jerome's patches. If it's validated, we want that > code. > > At the end of the day I think the architecture is really about what's HW and > what's DRM core. Like I said, all the yellow boxes have been proposed to > run in firmware, but we decided to keep them in the driver as it's easier to > debug on x86 than on a uC. I can tell you that our HW guys were not happy when I > decided to open source bandwidth_calc, but we did it anyway. I feel like > because we are opening up the complexity and inner workings of our HW, we are > somehow getting penalized for being open. > > Dave. > > Tony _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
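[Editor's note] The MST call chain Tony quotes above can be sketched as a toy C program. This is an editorial illustration, not code from DC: the function names mirror the ones in the email (with a `_stub` suffix), but the bodies are hypothetical stand-ins that only track call depth, to make visible the point being argued — control flow stays in the driver and DC re-enters DRM core at exactly one well-defined leaf.

```c
/*
 * Toy model of the call chain quoted in the email above.
 * All bodies are stand-ins; no real hardware or DRM code is involved.
 */
#include <assert.h>

static int depth;              /* frames below (and including) atomic commit */
static int drm_callback_depth; /* depth of DC/DM code when DRM is re-entered */

/* Leaf in DRM core (drm_dp_mst_allocate_vcpi in the email). */
static void drm_dp_mst_allocate_vcpi_stub(void)
{
	drm_callback_depth = depth; /* record how deep the stack was here */
}

/* Glue code whose only job is to call back into DRM (dm_helpers_*). */
static void dm_helpers_write_payload_table_stub(void)
{
	depth++;
	drm_dp_mst_allocate_vcpi_stub();
	depth--;
}

static void allocate_mst_payload_stub(void)
{
	depth++;
	dm_helpers_write_payload_table_stub();
	depth--;
}

static void core_link_enable_stream_stub(void)
{
	depth++;
	allocate_mst_payload_stub();
	depth--;
}

static void dce110_apply_ctx_to_hw_stub(void)
{
	depth++;
	core_link_enable_stream_stub();
	depth--;
}

static void dc_commit_targets_stub(void)
{
	depth++;
	dce110_apply_ctx_to_hw_stub();
	depth--;
}

/* Entry point owned by the DRM driver: it keeps control of the flow. */
void amdgpu_dm_atomic_commit_stub(void)
{
	depth = 1; /* frame 1: the atomic_commit implementation itself */
	dc_commit_targets_stub();
}
```

Calling `amdgpu_dm_atomic_commit_stub()` leaves `drm_callback_depth == 6`, matching the "only 6 levels deep before we call back into DRM" claim. The contrast with the midlayer problem Dave describes is that the re-entry into DRM happens at a single leaf helper, not from arbitrary points in the stack with unknown locking context.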
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-13 2:52 ` Cheng, Tony [not found] ` <5a1f2762-f1e0-05f1-3c16-173cb1f46571-5C7GfCeVMHo@public.gmane.org> @ 2016-12-13 9:40 ` Lukas Wunner [not found] ` <20161213094035.GA10916-JFq808J9C/izQB+pC5nmwQ@public.gmane.org> 1 sibling, 1 reply; 66+ messages in thread From: Lukas Wunner @ 2016-12-13 9:40 UTC (permalink / raw) To: Cheng, Tony Cc: Grodzovsky, Andrey, dri-devel, amd-gfx mailing list, Deucher, Alexander On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote: > With DC the display hardware programming, resource optimization, power > management and interaction with rest of system will be fully validated > across multiple OSs. Do I understand DAL3.jpg correctly that the macOS driver builds on top of DAL Core? I'm asking because the graphics drivers shipping with macOS as well as on Apple's EFI Firmware Volume are closed source. If the Linux community contributes to DC, I guess those contributions can generally be assumed to be GPLv2 licensed. Yet a future version of the macOS driver would incorporate those contributions in the same binary as their closed source OS-specific portion. I don't quite see how that would be legal but maybe I'm missing something. Presumably the situation with the Windows driver is the same. I guess you could maintain a separate branch sans community contributions which would serve as a basis for closed source drivers, but not sure if that is feasible given your resource constraints. Thanks, Lukas _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161213094035.GA10916-JFq808J9C/izQB+pC5nmwQ@public.gmane.org> @ 2016-12-13 15:03 ` Cheng, Tony 2016-12-13 15:09 ` Deucher, Alexander ` (2 more replies) 2016-12-13 16:14 ` Bridgman, John 1 sibling, 3 replies; 66+ messages in thread From: Cheng, Tony @ 2016-12-13 15:03 UTC (permalink / raw) To: Lukas Wunner, John Cc: Grodzovsky, Andrey, Dave Airlie, dri-devel, amd-gfx mailing list, Deucher, Alexander, Harry Wentland The only DM that's open source is amdgpu_dm; the rest will remain closed source. I remember we had a discussion around legal issues with our grand plan of unifying everything, and I remember maybe it was John who assured us that it's okay. John, can you chime in on how it would work with the GPLv2 license? On 12/13/2016 4:40 AM, Lukas Wunner wrote: > On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote: >> With DC the display hardware programming, resource optimization, power >> management and interaction with rest of system will be fully validated >> across multiple OSs. > Do I understand DAL3.jpg correctly that the macOS driver builds on top > of DAL Core? I'm asking because the graphics drivers shipping with > macOS as well as on Apple's EFI Firmware Volume are closed source. macOS currently ships with its own driver. I can't really comment on what macOS does without getting into trouble. > If the Linux community contributes to DC, I guess those contributions > can generally be assumed to be GPLv2 licensed. Yet a future version > of the macOS driver would incorporate those contributions in the same > binary as their closed source OS-specific portion. I am struggling with what these community contributions to DC would be. Us AMD developers have access to HW docs and designers, and we are still spending 50% of our time figuring out why our HW doesn't work right. I can't imagine the community doing much of this heavy lifting.
> > I don't quite see how that would be legal but maybe I'm missing > something. > > Presumably the situation with the Windows driver is the same. > > I guess you could maintain a separate branch sans community contributions > which would serve as a basis for closed source drivers, but not sure if > that is feasible given your resource constraints. Dave sent us a series of patches to show how it would look if someone were to change DC. These changes are mostly removing code that DRM already has and deleting/cleaning up stuff. I guess we could nak all changes and "rewrite" our own version of the cleanup patches the community wants to see? > Thanks, > > Lukas
* RE: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-13 15:03 ` Cheng, Tony @ 2016-12-13 15:09 ` Deucher, Alexander 2016-12-13 15:57 ` Lukas Wunner 2016-12-14 9:57 ` Jani Nikula 2 siblings, 0 replies; 66+ messages in thread From: Deucher, Alexander @ 2016-12-13 15:09 UTC (permalink / raw) To: Cheng, Tony, Lukas Wunner, Bridgman, John Cc: dri-devel, amd-gfx mailing list, Grodzovsky, Andrey Our driver code and most of the drm is MIT/X11 licensed. Lots of other non-GPL OSes (e.g., the BSDs) already import Linux drm drivers and core code. Alex From: Cheng, Tony Sent: Tuesday, December 13, 2016 10:04 AM To: Lukas Wunner; Bridgman, John Cc: Dave Airlie; Wentland, Harry; Grodzovsky, Andrey; amd-gfx mailing list; dri-devel; Deucher, Alexander Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU The only DM that's open source is amdgpu_dm; the rest will remain closed source. I remember we had a discussion around legal issues with our grand plan of unifying everything, and I remember maybe it was John who assured us that it's okay. John, can you chime in on how it would work with the GPLv2 license? On 12/13/2016 4:40 AM, Lukas Wunner wrote: On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote: With DC the display hardware programming, resource optimization, power management and interaction with rest of system will be fully validated across multiple OSs. Do I understand DAL3.jpg correctly that the macOS driver builds on top of DAL Core? I'm asking because the graphics drivers shipping with macOS as well as on Apple's EFI Firmware Volume are closed source. macOS currently ships with its own driver. I can't really comment on what macOS does without getting into trouble. If the Linux community contributes to DC, I guess those contributions can generally be assumed to be GPLv2 licensed.
Yet a future version of the macOS driver would incorporate those contributions in the same binary as their closed source OS-specific portion. I am struggling with what these community contributions to DC would be. Us AMD developers have access to HW docs and designers, and we are still spending 50% of our time figuring out why our HW doesn't work right. I can't imagine the community doing much of this heavy lifting. I don't quite see how that would be legal but maybe I'm missing something. Presumably the situation with the Windows driver is the same. I guess you could maintain a separate branch sans community contributions which would serve as a basis for closed source drivers, but not sure if that is feasible given your resource constraints. Dave sent us a series of patches to show how it would look if someone were to change DC. These changes are mostly removing code that DRM already has and deleting/cleaning up stuff. I guess we could nak all changes and "rewrite" our own version of the cleanup patches the community wants to see? Thanks, Lukas
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-13 15:03 ` Cheng, Tony 2016-12-13 15:09 ` Deucher, Alexander @ 2016-12-13 15:57 ` Lukas Wunner 2016-12-14 9:57 ` Jani Nikula 2 siblings, 0 replies; 66+ messages in thread From: Lukas Wunner @ 2016-12-13 15:57 UTC (permalink / raw) To: Cheng, Tony Cc: Grodzovsky, Andrey, dri-devel, amd-gfx mailing list, Deucher, Alexander On Tue, Dec 13, 2016 at 10:03:58AM -0500, Cheng, Tony wrote: > On 12/13/2016 4:40 AM, Lukas Wunner wrote: > > If the Linux community contributes to DC, I guess those contributions > > can generally be assumed to be GPLv2 licensed. Yet a future version > > of the macOS driver would incorporate those contributions in the same > > binary as their closed source OS-specific portion. > > I am struggling with what these community contributions to DC would be. > > Us AMD developers have access to HW docs and designers, and we are still > spending 50% of our time figuring out why our HW doesn't work right. > I can't imagine the community doing much of this heavy lifting. True, but past experience with radeon/amdgpu is that the community has use cases that AMD developers don't specifically cater to, e.g. due to lack of the required hardware or resource constraints. E.g. Mario Kleiner has contributed lots of patches for proper vsync handling which are needed for his neuroscience software. I've contributed DDC switching support for MacBook Pros to radeon. Your driver becomes more useful, you get more customers, everyone wins. > > Do I understand DAL3.jpg correctly that the macOS driver builds on top > > of DAL Core? I'm asking because the graphics drivers shipping with > > macOS as well as on Apple's EFI Firmware Volume are closed source. > > macOS currently ships with its own driver. I can't really comment on what > macOS does without getting into trouble.
The Intel Israel folks working on Thunderbolt are similarly between the rock that is the community's expectation of openness and the hard place that is Apple's secrecy. So I sympathize with your situation, kudos for trying to do the right thing. > I guess we could nak all changes and "rewrite" our > own version of the cleanup patches the community wants to see? I don't think that would be workable, honestly. One way out of this conundrum might be to use a permissive license such as BSD for DC. Then whenever you merge a community patch, in addition to informing the contributor thereof, send them a boilerplate one-liner that community contributions are assumed to be under the same license, and if the contributor disagrees they should send a short notice to have their contribution removed. But IANAL. Best regards, Lukas
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-13 15:03 ` Cheng, Tony 2016-12-13 15:09 ` Deucher, Alexander 2016-12-13 15:57 ` Lukas Wunner @ 2016-12-14 9:57 ` Jani Nikula 2016-12-14 17:23 ` Cheng, Tony 2 siblings, 1 reply; 66+ messages in thread From: Jani Nikula @ 2016-12-14 9:57 UTC (permalink / raw) To: Cheng, Tony, Lukas Wunner, John Cc: Deucher, Alexander, Grodzovsky, Andrey, amd-gfx mailing list, dri-devel On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote: > I am struggling with what these community contributions to DC would be. > > Us AMD developers have access to HW docs and designers, and we are still > spending 50% of our time figuring out why our HW doesn't work right. I > can't imagine the community doing much of this heavy lifting. I can sympathize with that view, and certainly most of the heavy lifting would come from you, same as with us and i915. However, when you put together your hardware, an open source driver, and smart people, they *will* scratch their itches, whether they're bugs you're not fixing or features you're missing. Please don't underestimate and patronize them, it's going to rub people the wrong way. > Dave sent us a series of patches to show how it would look if someone > were to change DC. These changes are mostly removing code that DRM > already has and deleting/cleaning up stuff. I guess we could nak all changes > and "rewrite" our own version of the cleanup patches the community wants to see? Please have a look at, say, $ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v amd\.com Do you really want to actively discourage all of them from contributing? I think this would be detrimental to not only your driver, but the whole drm community. It feels like you'd like to have your code upstream, but still retain ownership as if it was in your internal repo. You can't have your cake and eat it too. BR, Jani.
-- Jani Nikula, Intel Open Source Technology Center
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-14 9:57 ` Jani Nikula @ 2016-12-14 17:23 ` Cheng, Tony [not found] ` <d68102d4-b99c-cc60-4eb2-9c6295af130f-5C7GfCeVMHo@public.gmane.org> 0 siblings, 1 reply; 66+ messages in thread From: Cheng, Tony @ 2016-12-14 17:23 UTC (permalink / raw) To: Jani Nikula, Lukas Wunner, John Cc: Deucher, Alexander, Grodzovsky, Andrey, amd-gfx mailing list, dri-devel On 12/14/2016 4:57 AM, Jani Nikula wrote: > On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote: >> I am struggling with what these community contributions to DC would be. >> >> Us AMD developers have access to HW docs and designers, and we are still >> spending 50% of our time figuring out why our HW doesn't work right. I >> can't imagine the community doing much of this heavy lifting. > I can sympathize with that view, and certainly most of the heavy lifting > would come from you, same as with us and i915. However, when you put > together your hardware, an open source driver, and smart people, they > *will* scratch their itches, whether they're bugs you're not fixing or > features you're missing. Please don't underestimate and patronize them, > it's going to rub people the wrong way. I apologize if my statement offended anyone in the community. I'll say more about bugs below. >> Dave sent us a series of patches to show how it would look if someone >> were to change DC. These changes are mostly removing code that DRM >> already has and deleting/cleaning up stuff. I guess we could nak all changes >> and "rewrite" our own version of the cleanup patches the community wants to see? > Please have a look at, say, > > $ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v amd\.com > > Do you really want to actively discourage all of them from contributing? > I think this would be detrimental to not only your driver, but the whole > drm community. It feels like you'd like to have your code upstream, but > still retain ownership as if it was in your internal repo.
You can't > have your cake and eat it too. That's the non-"dal" path. It's just Alex plus a handful of guys trying to figure out what register writes are needed based on the Windows driver. You know who has been contributing to that code path from AMD, and we know it's a relatively small group of people. Alex and team do a great job at being good citizens in the Linux world and providing support. But in terms of HW programming and fully exploiting our HW, that's pretty much the best they can do with the resource constraints. Of course the quality is not as good as we would like, thus we need all the help we can get from the community. We just don't have the manpower to make it great. We are proposing to get on a path where we can fully leverage the coding and validation resources from the rest of the AMD display teams (SW, HW, tuning, validation, QA, etc). Our goal is to provide a driver to the Linux community that's feature rich and high quality. My goal is that the community finds 0 bugs in our code, because we should've seen and fixed those bugs in our validation pass before we release the GPUs. We do have a good-sized team around validation; it's just that today that validation covers 0% of the upstream source code. Alex and I are trying to find a path to get these goodies into the upstream driver without 2x the size of our teams. We know 2x our team size is not an option. I just want to say I understand where the community is coming from. Like I said in my first response to Dave, I would've said no if someone wanted to throw 100k lines of code into a project (DAL) I have to maintain without knowing what's there and the benefit we are getting. We have already made a lot of changes and design choices in our code base to play well with the community, and are absorbing the effort to restructure code on other platforms as a result of these modifications. We are going to continue making more modifications to make our code Linux worthy based on the good feedback we have gotten so far.
DAL3/DC is a new project we started a little over a year ago, and it is still at an early enough stage to make changes. Like how the community is pushing back on our code now, after 1 or 2 future generations of GPUs are built on top of DC, the AMD teams on the rest of the platforms will start pushing back on changes in DC. We need to find the balance of what's HW and what's core, and how to draw the line so the community doesn't make much modification in what we (both AMD and community) deem "hardware backend code". We need to have the Linux coding style and design principles baked into DC code so that when our internal teams contribute to DC, the code is written in a form the Linux community can accept. All of this needs to happen soon, or we miss this critical inflection point and it's going to be another 6-8 years before we get another crack at a re-architecture project to try getting the rest of the extended AMD display teams behind our upstream driver. > > BR, > Jani. > >
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <d68102d4-b99c-cc60-4eb2-9c6295af130f-5C7GfCeVMHo@public.gmane.org> @ 2016-12-14 18:01 ` Alex Deucher [not found] ` <CADnq5_Nha9502S=DOJDNepNv9CBV88=0R6N+tpBuO+U+s1eUQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 66+ messages in thread From: Alex Deucher @ 2016-12-14 18:01 UTC (permalink / raw) To: Cheng, Tony Cc: Grodzovsky, Andrey, John, dri-devel, amd-gfx mailing list, Lukas Wunner, Jani Nikula, Deucher, Alexander On Wed, Dec 14, 2016 at 12:23 PM, Cheng, Tony <tony.cheng@amd.com> wrote: > > > On 12/14/2016 4:57 AM, Jani Nikula wrote: >> >> On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote: >>> >>> I am struggling with that these comminty contributions to DC would be. >>> >>> Us AMD developer has access to HW docs and designer and we are still >>> spending 50% of our time figuring out why our HW doesn't work right. I >>> can't image community doing much of this heavy lifting. >> >> I can sympathize with that view, and certainly most of the heavy lifting >> would come from you, same as with us and i915. However, when you put >> together your hardware, an open source driver, and smart people, they >> *will* scratch their itches, whether they're bugs you're not fixing or >> features you're missing. Please don't underestimate and patronize them, >> it's going to rub people the wrong way. > > I aplogize if my statement offended any one in the community. I'll say more > about bugs below. >>> >>> Dave sent us series of patch to show how it would look like if someone >>> were to change DC. These changes are more removing code that DRM >>> already has and deleting/clean up stuff. I guess we can nak all changes >>> and "rewrite" our own version of clean up patch community want to see? >> >> Please have a look at, say, >> >> $ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v >> amd\.com >> >> Do you really want to actively discourage all of them from contributing? 
>> I think this would be detrimental to not only your driver, but the whole >> drm community. It feels like you'd like to have your code upstream, but >> still retain ownership as if it was in your internal repo. You can't >> have your cake and eat it too. > > That's none "dal" path. It's just Alex plus a handful of guys trying to > figure out what register writes is needed base on windows driver. You knwo > who has been contributing to that code path from AMD and we know it's a > relatively small group of people. Alex and team does great job at being > good citizen on linux world and provide support. But in terms of HW > programming and fully expolit our HW that's pretty much the best they can do > with the resource constraint. Of course the quality is not as good as we > would like thus we needed all the help we can get from community. We just > don't have the man power to make it great. > > We are proposing to get on a path where we can fully leverage the coding and > validation resources from rest of AMD Display teams (SW, HW, tuning, > validation, QA etc). Our goal is to provide a driver to linux community > that's feature rich and high quality. My goal is community finds 0 bug in > our code because we should've seen and fixed those bug in our validation > pass before we release the GPUs. We do have a good size team around > validation, just today that validation covers 0% of upstream source code. > Alex and I are trying to find a path to get these goodies on the upstream > driver without 2x size of our teams. We know 2x our team size is not an > option. > > I just want to say I understand were community is coming from. Like I said > in my first respond to Dave that I would've say no if someone want to throw > 100k lines of code into project (DAL) I have to maintain without knowning > what's there and the benefit we are getting. 
We already made a lot of > changes and design choices in our code base to play well with the community and > absorbed the effort to restructure code on other platforms as a result of > these modifications. We are going to continue making more modifications to > make our code linux-worthy based on the good feedback we have gotten so far. > > DAL3/DC is a new project we started a little over a year ago and it is still at an early > enough stage to make changes. Just as the community is pushing back on our > code now, after 1 or 2 future generations of GPUs built on top of DC, the AMD > teams on the rest of the platforms will start pushing back on changes in DC. We > need to find the balance of what's HW and what's core and how to draw the > line so the community doesn't make much modification to what we (both AMD and > community) deem "hardware backend code". We need to have the linux coding > style and design principles baked into the DC code so when our internal teams > contribute to DC the code is written in a form the linux community can accept. > All of this needs to happen soon or we miss this critical inflection point > and it's going to be another 6-8 years before we get another crack at a > re-architecture project to try getting the rest of the extended AMD display teams > behind our upstream driver. I think the point is that there are changes that make sense and changes that don't. If they make sense, we'll definitely take them. Removing dead code or duplicate defines makes sense. Rearranging a programming sequence so all registers that start with CRTC_ get programmed at the same time just because it looks logical does not make sense. Alex _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
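Jani's shortlog one-liner above is easy to try on any checkout. The minimal sketch below demonstrates the same pattern on a throwaway repository; the author names, e-mail domains, and commit messages are made up for illustration:

```shell
set -e
# Build a scratch repo with one commit from an AMD address and one from an
# external contributor, then run the same shortlog|grep pattern Jani
# suggested to list non-AMD contributors with their commit counts.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.name="AMD Dev" -c user.email="dev@amd.com" \
    commit -q --allow-empty -m "amdgpu: internal change"
git -C "$repo" -c user.name="Community Dev" -c user.email="dev@example.org" \
    commit -q --allow-empty -m "amdgpu: community cleanup"
# -s: commit counts only, -n: sort by count, -e: show e-mail addresses.
external=$(git -C "$repo" shortlog -sne HEAD | grep -v 'amd\.com')
echo "$external"
rm -rf "$repo"
```

On a real tree one would add `--since @{1year} -- drivers/gpu/drm/amd` as in the original command; the scratch repo just makes the filtering visible.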
[parent not found: <CADnq5_Nha9502S=DOJDNepNv9CBV88=0R6N+tpBuO+U+s1eUQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* RE: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <CADnq5_Nha9502S=DOJDNepNv9CBV88=0R6N+tpBuO+U+s1eUQA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-12-14 18:16 ` Cheng, Tony 0 siblings, 0 replies; 66+ messages in thread From: Cheng, Tony @ 2016-12-14 18:16 UTC (permalink / raw) To: Alex Deucher Cc: Grodzovsky, Andrey, Bridgman, John, dri-devel, amd-gfx mailing list, Lukas Wunner, Jani Nikula, Deucher, Alexander Thanks Alex, my reply was a little off topic :) -----Original Message----- From: Alex Deucher [mailto:alexdeucher@gmail.com] Sent: Wednesday, December 14, 2016 1:02 PM To: Cheng, Tony <Tony.Cheng@amd.com> Cc: Jani Nikula <jani.nikula@linux.intel.com>; Lukas Wunner <lukas@wunner.de>; Bridgman, John <John.Bridgman@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; amd-gfx mailing list <amd-gfx@lists.freedesktop.org>; dri-devel <dri-devel@lists.freedesktop.org> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU On Wed, Dec 14, 2016 at 12:23 PM, Cheng, Tony <tony.cheng@amd.com> wrote: > > > On 12/14/2016 4:57 AM, Jani Nikula wrote: >> >> On Tue, 13 Dec 2016, "Cheng, Tony" <tony.cheng@amd.com> wrote: >>> >>> I am struggling with what these community contributions to DC would be. >>> >>> We AMD developers have access to HW docs and designers and we are still >>> spending 50% of our time figuring out why our HW doesn't work right. >>> I can't imagine the community doing much of this heavy lifting. >> >> I can sympathize with that view, and certainly most of the heavy >> lifting would come from you, same as with us and i915. However, when >> you put together your hardware, an open source driver, and smart >> people, they >> *will* scratch their itches, whether they're bugs you're not fixing >> or features you're missing. Please don't underestimate and patronize >> them, it's going to rub people the wrong way. > > I apologize if my statement offended anyone in the community. 
I'll > say more about bugs below. >>> >>> Dave sent us a series of patches to show what it would look like if >>> someone were to change DC. These changes are mostly removing code >>> that DRM already has and deleting/cleaning up stuff. I guess we can >>> nak all changes and "rewrite" our own version of the cleanup patches the community wants to see? >> >> Please have a look at, say, >> >> $ git shortlog -sne --since @{1year} -- drivers/gpu/drm/amd | grep -v >> amd\.com >> >> Do you really want to actively discourage all of them from contributing? >> I think this would be detrimental to not only your driver, but the >> whole drm community. It feels like you'd like to have your code >> upstream, but still retain ownership as if it was in your internal >> repo. You can't have your cake and eat it too. > That's the non-"dal" path. It's just Alex plus a handful of guys trying > to figure out what register writes are needed based on the windows driver. > You know who has been contributing to that code path from AMD and we > know it's a relatively small group of people. Alex and team do a > great job at being good citizens in the linux world and providing support. > But in terms of HW programming and fully exploiting our HW that's pretty > much the best they can do with the resource constraints. Of course the > quality is not as good as we would like, thus we needed all the help we > can get from the community. We just don't have the manpower to make it great. > > We are proposing to get on a path where we can fully leverage the > coding and validation resources from the rest of the AMD Display teams (SW, > HW, tuning, validation, QA etc). Our goal is to provide a driver to the > linux community that's feature rich and high quality. My goal is that the > community finds 0 bugs in our code because we should've seen and fixed > those bugs in our validation pass before we release the GPUs. We do > have a good-sized team around validation, but today that validation covers 0% of upstream source code. 
> Alex and I are trying to find a path to get these goodies into the > upstream driver without 2x the size of our teams. We know 2x our team > size is not an option. > > I just want to say I understand where the community is coming from. Like I > said in my first response to Dave, I would've said no if someone > wanted to throw 100k lines of code into a project (DAL) I have to maintain > without knowing what's there and the benefit we are getting. We > already made a lot of changes and design choices in our code base to > play well with the community and absorbed the effort to restructure code > on other platforms as a result of these modifications. We are going to > continue making more modifications to make our code linux-worthy based on the good feedback we have gotten so far. > > DAL3/DC is a new project we started a little over a year ago and it is still > at an early enough stage to make changes. Just as the community is pushing > back on our code now, after 1 or 2 future generations of GPUs built on top > of DC, the AMD teams on the rest of the platforms will start pushing back on > changes in DC. We need to find the balance of what's HW and what's > core and how to draw the line so the community doesn't make much > modification to what we (both AMD and > community) deem "hardware backend code". We need to have the linux > coding style and design principles baked into the DC code so when our > internal teams contribute to DC the code is written in a form the linux community can accept. > All of this needs to happen soon or we miss this critical inflection > point and it's going to be another 6-8 years before we get another > crack at a re-architecture project to try getting the rest of the extended > AMD display teams behind our upstream driver. I think the point is that there are changes that make sense and changes that don't. If they make sense, we'll definitely take them. Removing dead code or duplicate defines makes sense. 
Rearranging a programming sequence so all registers that start with CRTC_ get programmed at the same time just because it looks logical does not make sense. Alex _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161213094035.GA10916-JFq808J9C/izQB+pC5nmwQ@public.gmane.org> 2016-12-13 15:03 ` Cheng, Tony @ 2016-12-13 16:14 ` Bridgman, John 1 sibling, 0 replies; 66+ messages in thread From: Bridgman, John @ 2016-12-13 16:14 UTC (permalink / raw) To: Lukas Wunner, Cheng, Tony Cc: Deucher, Alexander, Grodzovsky, Andrey, amd-gfx mailing list, dri-devel [-- Attachment #1.1: Type: text/plain, Size: 2679 bytes --] >>If the Linux community contributes to DC, I guess those contributions can generally be assumed to be GPLv2 licensed. Yet a future version of the macOS driver would incorporate those contributions in the same binary as their closed source OS-specific portion. My understanding of the "general rule" was that contributions are normally assumed to be made under the "local license", ie GPLv2 for kernel changes in general, but the appropriate lower-level license when made to a specific subsystem with a more permissive license (eg the X11 license aka MIT aka "GPL plus additional rights" license we use for almost all of the graphics subsystem). If DC is not X11 licensed today it should be (but I'm pretty sure it already is). We need to keep the graphics subsystem permissively licensed in general to allow uptake by other free OS projects such as *BSD, not just closed source. Either way, driver-level maintainers are going to have to make sure that contributions have clear licensing. 
Thanks, John ________________________________ From: dri-devel <dri-devel-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org> on behalf of Lukas Wunner <lukas-JFq808J9C/izQB+pC5nmwQ@public.gmane.org> Sent: December 13, 2016 4:40 AM To: Cheng, Tony Cc: Grodzovsky, Andrey; dri-devel; amd-gfx mailing list; Deucher, Alexander Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU On Mon, Dec 12, 2016 at 09:52:08PM -0500, Cheng, Tony wrote: > With DC the display hardware programming, resource optimization, power > management and interaction with rest of system will be fully validated > across multiple OSs. Do I understand DAL3.jpg correctly that the macOS driver builds on top of DAL Core? I'm asking because the graphics drivers shipping with macOS as well as on Apple's EFI Firmware Volume are closed source. If the Linux community contributes to DC, I guess those contributions can generally be assumed to be GPLv2 licensed. Yet a future version of the macOS driver would incorporate those contributions in the same binary as their closed source OS-specific portion. I don't quite see how that would be legal but maybe I'm missing something. Presumably the situation with the Windows driver is the same. I guess you could maintain a separate branch sans community contributions which would serve as a basis for closed source drivers, but not sure if that is feasible given your resource constraints. Thanks, Lukas _______________________________________________ dri-devel mailing list dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org https://lists.freedesktop.org/mailman/listinfo/dri-devel [-- Attachment #1.2: Type: text/html, Size: 3828 bytes --] [-- Attachment #2: Type: text/plain, Size: 154 bytes --] _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-08 2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland ` (2 preceding siblings ...) [not found] ` <55d5e664-25f7-70e0-f2f5-9c9daf3efdf6-5C7GfCeVMHo@public.gmane.org> @ 2016-12-12 7:22 ` Daniel Vetter [not found] ` <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> 3 siblings, 1 reply; 66+ messages in thread From: Daniel Vetter @ 2016-12-12 7:22 UTC (permalink / raw) To: Harry Wentland Cc: Grodzovsky, Andrey, amd-gfx, dri-devel, Deucher, Alexander, Cheng, Tony On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: > Current version of DC: > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > Once Alex pulls in the latest patches: > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 One more: That 4.7 here is going to be unbelievable amounts of pain for you. Yes it's a totally sensible idea to just freeze your baseline kernel because then linux looks a lot more like Windows where the driver abi is frozen. But it makes following upstream entirely impossible, because rebasing is always a pain and hence postponed. Which means you can't just use the latest stuff in upstream drm, which means collaboration with others and sharing bugfixes in core is a lot more pain, which then means you do more than necessary in your own code and results in HALs like DAL, perpetuating the entire mess. So I think you don't just need to demidlayer DAL/DC, you also need to demidlayer your development process. In our experience here at Intel that needs continuous integration testing (in drm-tip), because even 1 month of not resyncing with drm-next is sometimes way too long. See e.g. the controlD regression we just had. And DAL is stuck on a 1 year old kernel, so pretty much only of historical significance and otherwise dead code. 
And then for any stuff which isn't upstream yet (like your internal enabling, or DAL here, or our own internal enabling) you need continuous rebasing & re-validation. When we started doing this years ago it was still manual, but we still rebased like every few days to keep the pain down and adjust continuously to upstream evolution. But then going to a continuous rebase bot that sends you mail when something goes wrong was again a massive improvement. I guess in the end Conway's law that your software architecture necessarily reflects how you organize your teams applies again. Fix your process and it'll become glaringly obvious to everyone involved that DC-the-design as-is is entirely unworkable and how it needs to be fixed. From my own experience over the past few years: Doing that is a fun journey ;-) Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
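The continuous-rebase bot Daniel describes can be sketched in a few lines of shell. This is a toy illustration on a scratch repository, not Intel's actual bot; the branch names, the conflicting file, and the "mail the owner" step are all made up:

```shell
set -e
# Toy version of a rebase bot: keep a not-yet-upstream branch rebased on
# the moving upstream branch, and flag it when the rebase stops applying.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Rebase Bot"
git config user.email "bot@example.org"
trunk=$(git symbolic-ref --short HEAD)   # default branch name ("master"/"main")
git commit -q --allow-empty -m "upstream: baseline"
git checkout -q -b internal-enabling
echo "internal hack" > hw_init.c
git add hw_init.c
git commit -q -m "internal: add hw_init.c"
# Meanwhile upstream moves on and touches the same file differently.
git checkout -q "$trunk"
echo "upstream rework" > hw_init.c
git add hw_init.c
git commit -q -m "upstream: rework hw_init.c"
# The bot's periodic step: attempt the rebase, report instead of leaving a mess.
git checkout -q internal-enabling
if git rebase "$trunk" >/dev/null 2>&1; then
    status=clean
else
    status=conflict   # a real bot would send mail to the branch owner here
    git rebase --abort >/dev/null 2>&1
fi
echo "rebase onto $trunk: $status"
```

Run daily (or per upstream push), the "conflict" branch of the `if` is exactly the point where the mail goes out, so drift is caught within hours instead of at the next flag-day rebase.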
[parent not found: <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> @ 2016-12-12 7:54 ` Bridgman, John [not found] ` <BN6PR12MB13484DA35697DBD0CA815CFFE8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org> 2016-12-13 2:05 ` Harry Wentland 1 sibling, 1 reply; 66+ messages in thread From: Bridgman, John @ 2016-12-12 7:54 UTC (permalink / raw) To: Daniel Vetter, Wentland, Harry Cc: Deucher, Alexander, Grodzovsky, Andrey, Cheng, Tony, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW [-- Attachment #1.1: Type: text/plain, Size: 3796 bytes --] Yep, good point. We have tended to stay a bit behind bleeding edge because our primary tasks so far have been: 1. Support enterprise distros (with old kernels) via the hybrid driver (AMDGPU-PRO), where the closer to upstream we get the more of a gap we have to paper over with KCL code 2. Push architecturally simple code (new GPU support) upstream, where being closer to upstream makes the up-streaming task simpler but not by that much So 4.7 isn't as bad a compromise as it might seem. That said, in the case of DAL/DC it's a different story as you say... architecturally complex code needing to be woven into a fast-moving subsystem of the kernel. So for DAL/DC anything other than upstream is going to be a big pain. OK, need to think that through. Thanks ! 
________________________________ From: dri-devel <dri-devel-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org> on behalf of Daniel Vetter <daniel-/w4YWyX8dFk@public.gmane.org> Sent: December 12, 2016 2:22 AM To: Wentland, Harry Cc: Grodzovsky, Andrey; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; dri-devel-PD4FTy7X32mptlylMvRsHA@public.gmane.orgdesktop.org; Deucher, Alexander; Cheng, Tony Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: > Current version of DC: > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > Once Alex pulls in the latest patches: > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 One more: That 4.7 here is going to be unbelievable amounts of pain for you. Yes it's a totally sensible idea to just freeze your baseline kernel because then linux looks a lot more like Windows where the driver abi is frozen. But it makes following upstream entirely impossible, because rebasing is always a pain and hence postponed. Which means you can't just use the latest stuff in upstream drm, which means collaboration with others and sharing bugfixes in core is a lot more pain, which then means you do more than necessary in your own code and results in HALs like DAL, perpetuating the entire mess. So I think you don't just need to demidlayer DAL/DC, you also need to demidlayer your development process. In our experience here at Intel that needs continuous integration testing (in drm-tip), because even 1 month of not resyncing with drm-next is sometimes way too long. See e.g. the controlD regression we just had. And DAL is stuck on a 1 year old kernel, so pretty much only of historical significance and otherwise dead code. 
And then for any stuff which isn't upstream yet (like your internal enabling, or DAL here, or our own internal enabling) you need continuous rebasing & re-validation. When we started doing this years ago it was still manual, but we still rebased like every few days to keep the pain down and adjust continuously to upstream evolution. But then going to a continuous rebase bot that sends you mail when something goes wrong was again a massive improvement. I guess in the end Conway's law that your software architecture necessarily reflects how you organize your teams applies again. Fix your process and it'll become glaringly obvious to everyone involved that DC-the-design as-is is entirely unworkable and how it needs to be fixed. From my own experience over the past few years: Doing that is a fun journey ;-) Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org https://lists.freedesktop.org/mailman/listinfo/dri-devel [-- Attachment #1.2: Type: text/html, Size: 5225 bytes --] [-- Attachment #2: Type: text/plain, Size: 154 bytes --] _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <BN6PR12MB13484DA35697DBD0CA815CFFE8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <BN6PR12MB13484DA35697DBD0CA815CFFE8980-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org> @ 2016-12-12 9:27 ` Daniel Vetter [not found] ` <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> 2016-12-12 15:28 ` Deucher, Alexander 0 siblings, 2 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-12 9:27 UTC (permalink / raw) To: Bridgman, John Cc: Grodzovsky, Andrey, Cheng, Tony, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher, Alexander, Wentland, Harry On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote: > Yep, good point. We have tended to stay a bit behind bleeding edge because our primary tasks so far have been: > > > 1. Support enterprise distros (with old kernels) via the hybrid driver > (AMDGPU-PRO), where the closer to upstream we get the more of a gap we > have to paper over with KCL code Hm, I thought reasonable enterprise distros roll their drm core forward to the very latest upstream fairly often, so it shouldn't be too bad? Fixing this completely requires that you upstream your pre-production hw support early enough that by the time it ships, the backport is already in a released enterprise distro upgrade. But then adding bugfixes on top should be doable. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> @ 2016-12-12 9:29 ` Daniel Vetter 0 siblings, 0 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-12 9:29 UTC (permalink / raw) To: Bridgman, John Cc: Grodzovsky, Andrey, Cheng, Tony, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher, Alexander, Wentland, Harry On Mon, Dec 12, 2016 at 10:27:27AM +0100, Daniel Vetter wrote: > On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote: > > Yep, good point. We have tended to stay a bit behind bleeding edge because our primary tasks so far have been: > > > > > > 1. Support enterprise distros (with old kernels) via the hybrid driver > > (AMDGPU-PRO), where the closer to upstream we get the more of a gap we > > have to paper over with KCL code > > Hm, I thought resonable enterprise distros roll their drm core forward to > the very latest upstream fairly often, so it shouldn't be too bad? Fixing > this completely requires that you upstream your pre-production hw support > early enough that by the time it ships its the backport is already in a > realeased enterprise distro upgrade. But then adding bugfixes on top > should be doable. Or just put an entire statically linked copy of the corresponding drm core into your dkms. A bit horrible, but iirc it's been done before. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* RE: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-12 9:27 ` Daniel Vetter [not found] ` <20161212092727.6jgsgzlrdsha6zsl-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> @ 2016-12-12 15:28 ` Deucher, Alexander [not found] ` <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org> 1 sibling, 1 reply; 66+ messages in thread From: Deucher, Alexander @ 2016-12-12 15:28 UTC (permalink / raw) To: 'Daniel Vetter', Bridgman, John Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx, dri-devel > -----Original Message----- > From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf > Of Daniel Vetter > Sent: Monday, December 12, 2016 4:27 AM > To: Bridgman, John > Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org; amd- > gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander; Wentland, > Harry > Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU > > On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote: > > Yep, good point. We have tended to stay a bit behind bleeding edge > because our primary tasks so far have been: > > > > > > 1. Support enterprise distros (with old kernels) via the hybrid driver > > (AMDGPU-PRO), where the closer to upstream we get the more of a gap > we > > have to paper over with KCL code > > Hm, I thought reasonable enterprise distros roll their drm core forward to > the very latest upstream fairly often, so it shouldn't be too bad? Fixing > this completely requires that you upstream your pre-production hw support > early enough that by the time it ships, the backport is already in a > released enterprise distro upgrade. But then adding bugfixes on top > should be doable. The issue is we need DAL/DC for enterprise distros and OEM preloads and, for workstation customers, we need some additional patches that aren't upstream yet because we don't have an open source user for them yet. This gets much easier once we get OCL and VK open sourced. 
As for new asic support, unfortunately, new asics do not often align well with enterprise distros, at least for dGPUs (APUs are usually easier since the cycles are longer; dGPU cycles are very fast). The other problem with dGPUs is that we often can't release support for new hw or features too much earlier than launch due to the very competitive dGPU environment in gaming and workstation. Alex _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org> @ 2016-12-12 16:06 ` Luke A. Guest 2016-12-12 16:17 ` Luke A. Guest 1 sibling, 0 replies; 66+ messages in thread From: Luke A. Guest @ 2016-12-12 16:06 UTC (permalink / raw) To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW On 12/12/16 15:28, Deucher, Alexander wrote: >> -----Original Message----- >> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf >> Of Daniel Vetter >> Sent: Monday, December 12, 2016 4:27 AM >> To: Bridgman, John >> Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org; amd- >> gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander; Wentland, >> Harry >> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU >> >> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote: >>> Yep, good point. We have tended to stay a bit behind bleeding edge >> because our primary tasks so far have been: >>> >>> 1. Support enterprise distros (with old kernels) via the hybrid driver >>> (AMDGPU-PRO), where the closer to upstream we get the more of a gap >> we >>> have to paper over with KCL code >> Hm, I thought resonable enterprise distros roll their drm core forward to >> the very latest upstream fairly often, so it shouldn't be too bad? Fixing >> this completely requires that you upstream your pre-production hw support >> early enough that by the time it ships its the backport is already in a >> realeased enterprise distro upgrade. But then adding bugfixes on top >> should be doable. > The issue is we need DAL/DC for enterprise distros and OEM preloads and, for workstation customers, we need some additional patches that aren't upstream yet because they we don’t have an open source user for them yet. This gets much easier once we get OCL and VK open sourced. 
As for new asic support, unfortunately, they do not often align well with enterprise distros at least for dGPUs (APUs are usually easier since the cycles are longer, dGPUs cycles are very fast). The other problem with dGPUs is that we often can't release support for new hw or feature too much earlier than launch due to the very competitive dGPU environment in gaming and workstation. > What Daniel said is something I've said to you before, especially regarding libdrm. You keep mentioning these patches you need, but tbh, there's no reason why these patches cannot be in patchwork so people can use them. I've asked for this for months and the response was, "shouldn't be a problem, but I won't get to it this week," months later, still not there. Please just get your stuff public so the people who aren't on enterprise and ancient OSes can upgrade their systems. This would enable me to test amdgpu-pro and latest Mesa/LLVM alongside each other for Gentoo without having to replace a source built libdrm with your ancient one. Luke. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <MWHPR12MB1694EE6082AE9315EF5E6C68F7980-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org> 2016-12-12 16:06 ` Luke A. Guest @ 2016-12-12 16:17 ` Luke A. Guest [not found] ` <584ECD8B.8000509-z/KZkw/0wg5BDgjK7y7TUQ@public.gmane.org> 1 sibling, 1 reply; 66+ messages in thread From: Luke A. Guest @ 2016-12-12 16:17 UTC (permalink / raw) To: Deucher, Alexander, 'Daniel Vetter', Bridgman, John Cc: Grodzovsky, Andrey, Cheng, Tony, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW On 12/12/16 15:28, Deucher, Alexander wrote: >> -----Original Message----- >> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf >> Of Daniel Vetter >> Sent: Monday, December 12, 2016 4:27 AM >> To: Bridgman, John >> Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org; amd- >> gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander; Wentland, >> Harry >> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU >> >> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote: >>> Yep, good point. We have tended to stay a bit behind bleeding edge >> because our primary tasks so far have been: >>> >>> 1. Support enterprise distros (with old kernels) via the hybrid driver >>> (AMDGPU-PRO), where the closer to upstream we get the more of a gap >> we >>> have to paper over with KCL code >> Hm, I thought resonable enterprise distros roll their drm core forward to >> the very latest upstream fairly often, so it shouldn't be too bad? Fixing >> this completely requires that you upstream your pre-production hw support >> early enough that by the time it ships its the backport is already in a >> realeased enterprise distro upgrade. But then adding bugfixes on top >> should be doable. 
> The issue is we need DAL/DC for enterprise distros and OEM preloads and, for workstation customers, we need some additional patches that aren't upstream yet because they we don’t have an open source user for them yet. This gets much easier once we get OCL and VK open sourced. As for new asic support, unfortunately, they do not often align well with enterprise distros at least for dGPUs (APUs are usually easier since the cycles are longer, dGPUs cycles are very fast). The other problem with dGPUs is that we often can't release support for new hw or feature too much earlier than launch due to the very competitive dGPU environment in gaming and workstation. > > Apologies for spamming, but I didn't send this to all. What Daniel said is something I've said to you before, especially regarding libdrm. You keep mentioning these patches you need, but tbh, there's no reason why these patches cannot be in patchwork so people can use them. I've asked for this for months and the response was, "shouldn't be a problem, but I won't get to it this week," months later, still not there. Please just get your stuff public so the people who aren't on enterprise and ancient OSes can upgrade their systems. This would enable me to test amdgpu-pro and latest Mesa/LLVM alongside each other for Gentoo without having to replace a source built libdrm with your ancient one. Luke. _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <584ECD8B.8000509-z/KZkw/0wg5BDgjK7y7TUQ@public.gmane.org>]
* RE: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <584ECD8B.8000509-z/KZkw/0wg5BDgjK7y7TUQ@public.gmane.org> @ 2016-12-12 16:44 ` Deucher, Alexander 0 siblings, 0 replies; 66+ messages in thread
From: Deucher, Alexander @ 2016-12-12 16:44 UTC (permalink / raw)
To: 'Luke A. Guest', 'Daniel Vetter', Bridgman, John
Cc: Grodzovsky, Andrey, Cheng, Tony, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

> -----Original Message-----
> From: Luke A. Guest [mailto:laguest@archeia.com]
> Sent: Monday, December 12, 2016 11:17 AM
> To: Deucher, Alexander; 'Daniel Vetter'; Bridgman, John
> Cc: Grodzovsky, Andrey; Cheng, Tony; amd-gfx@lists.freedesktop.org;
> dri-devel@lists.freedesktop.org
> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
>
> On 12/12/16 15:28, Deucher, Alexander wrote:
> >> -----Original Message-----
> >> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
> >> Of Daniel Vetter
> >> Sent: Monday, December 12, 2016 4:27 AM
> >> To: Bridgman, John
> >> Cc: Grodzovsky, Andrey; Cheng, Tony; dri-devel@lists.freedesktop.org;
> >> amd-gfx@lists.freedesktop.org; Daniel Vetter; Deucher, Alexander;
> >> Wentland, Harry
> >> Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU
> >>
> >> On Mon, Dec 12, 2016 at 07:54:54AM +0000, Bridgman, John wrote:
> >>> Yep, good point. We have tended to stay a bit behind bleeding edge
> >> because our primary tasks so far have been:
> >>>
> >>> 1. Support enterprise distros (with old kernels) via the hybrid driver
> >>> (AMDGPU-PRO), where the closer to upstream we get the more of a gap
> >> we
> >>> have to paper over with KCL code
> >> Hm, I thought reasonable enterprise distros roll their drm core forward to
> >> the very latest upstream fairly often, so it shouldn't be too bad?
> >> Fixing this completely requires that you upstream your pre-production
> >> hw support early enough that by the time it ships the backport is
> >> already in a released enterprise distro upgrade. But then adding
> >> bugfixes on top should be doable.
> > The issue is we need DAL/DC for enterprise distros and OEM preloads and,
> > for workstation customers, we need some additional patches that aren't
> > upstream yet because we don't have an open source user for them yet.
> > This gets much easier once we get OCL and VK open sourced. As for new asic
> > support, unfortunately, they do not often align well with enterprise
> > distros, at least for dGPUs (APUs are usually easier since the cycles are
> > longer; dGPU cycles are very fast). The other problem with dGPUs is that
> > we often can't release support for new hw or features too much earlier
> > than launch due to the very competitive dGPU environment in gaming and
> > workstation.
> >
> > Apologies for spamming, but I didn't send this to all.
>
> What Daniel said is something I've said to you before, especially
> regarding libdrm. You keep mentioning these patches you need, but tbh,
> there's no reason why these patches cannot be in patchwork so people can
> use them. I've asked for this for months and the response was,
> "shouldn't be a problem, but I won't get to it this week," and months
> later it's still not there.

The kernel side is public. The dkms packages have the full source tree. As I said before, we plan to make this all public, but just haven't had the time (as this thread shows, we've got a lot of other higher priority things on our plate). Even when we do, it doesn't change the fact that the patches can't go upstream at the moment, so it doesn't fix the situation Daniel was talking about anyway. Distros generally don't take code that is not upstream yet. While we only validate the dkms packages on the enterprise distros, they should be adaptable to other kernels.
Alex
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <20161212072243.ah6sy3q57z4gimka-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org> 2016-12-12 7:54 ` Bridgman, John @ 2016-12-13 2:05 ` Harry Wentland [not found] ` <2032d12b-f675-eb25-33bf-3aa0fcd20cb3-5C7GfCeVMHo@public.gmane.org> 1 sibling, 1 reply; 66+ messages in thread From: Harry Wentland @ 2016-12-13 2:05 UTC (permalink / raw) To: Daniel Vetter Cc: Grodzovsky, Andrey, Dave Airlie, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander, Cheng, Tony On 2016-12-12 02:22 AM, Daniel Vetter wrote: > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: >> Current version of DC: >> >> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 >> >> Once Alex pulls in the latest patches: >> >> * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > One more: That 4.7 here is going to be unbelievable amounts of pain for > you. Yes it's a totally sensible idea to just freeze your baseline kernel > because then linux looks a lot more like Windows where the driver abi is > frozen. But it makes following upstream entirely impossible, because > rebasing is always a pain and hence postponed. Which means you can't just > use the latest stuff in upstream drm, which means collaboration with > others and sharing bugfixes in core is a lot more pain, which then means > you do more than necessary in your own code and results in HALs like DAL, > perpetuating the entire mess. > > So I think you don't just need to demidlayer DAL/DC, you also need to > demidlayer your development process. In our experience here at Intel that > needs continuous integration testing (in drm-tip), because even 1 month of > not resyncing with drm-next is sometimes way too long. See e.g. the > controlD regression we just had. 
> And DAL is stuck on a 1 year old kernel,
> so pretty much only of historical significance and otherwise dead code.
>
> And then for any stuff which isn't upstream yet (like your internal
> enabling, or DAL here, or our own internal enabling) you need continuous
> rebasing & re-validation. When we started doing this years ago it was still
> done manually, but we still rebased like every few days to keep the pain
> down and adjust continuously to upstream evolution. But then going to a
> continuous rebase bot that sends you mail when something goes wrong was
> again a massive improvement.
>

I think we've seen that pain already but haven't quite realized how much of it is due to a mismatch in kernel trees. We're trying to move onto a tree following drm-next much more closely. I'd love to help automate some of that (time permitting). Would the drm-misc scripts be of any use with that? I only had a very cursory glance at those.

> I guess in the end Conway's law that your software architecture
> necessarily reflects how you organize your teams applies again. Fix your
> process and it'll become glaringly obvious to everyone involved that
> DC-the-design as-is is entirely unworkable and how it needs to be fixed.
>
> From my own experience over the past few years: Doing that is a fun
> journey ;-)
>

Absolutely. We're only at the start of this but have learned a lot from the community (maybe others in the DC team disagree with me somewhat).

Not sure if I fully agree that this means that DC-the-design-as-is will become apparent as unworkable... There are definitely pieces to be cleaned up here and lessons learned from the DRM community, but on the other hand we feel there are some good reasons behind our approach that we'd like to share with the community (some of which I'm learning myself).
Harry

> Cheers, Daniel
[parent not found: <2032d12b-f675-eb25-33bf-3aa0fcd20cb3-5C7GfCeVMHo@public.gmane.org>]
* Re: [RFC] Using DC in amdgpu for upcoming GPU [not found] ` <2032d12b-f675-eb25-33bf-3aa0fcd20cb3-5C7GfCeVMHo@public.gmane.org> @ 2016-12-13 8:33 ` Daniel Vetter 0 siblings, 0 replies; 66+ messages in thread From: Daniel Vetter @ 2016-12-13 8:33 UTC (permalink / raw) To: Harry Wentland Cc: Grodzovsky, Andrey, Cheng, Tony, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Vetter, Deucher, Alexander, Dave Airlie On Mon, Dec 12, 2016 at 09:05:15PM -0500, Harry Wentland wrote: > > On 2016-12-12 02:22 AM, Daniel Vetter wrote: > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote: > > > Current version of DC: > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > > > Once Alex pulls in the latest patches: > > > > > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7 > > > > One more: That 4.7 here is going to be unbelievable amounts of pain for > > you. Yes it's a totally sensible idea to just freeze your baseline kernel > > because then linux looks a lot more like Windows where the driver abi is > > frozen. But it makes following upstream entirely impossible, because > > rebasing is always a pain and hence postponed. Which means you can't just > > use the latest stuff in upstream drm, which means collaboration with > > others and sharing bugfixes in core is a lot more pain, which then means > > you do more than necessary in your own code and results in HALs like DAL, > > perpetuating the entire mess. > > > > So I think you don't just need to demidlayer DAL/DC, you also need to > > demidlayer your development process. In our experience here at Intel that > > needs continuous integration testing (in drm-tip), because even 1 month of > > not resyncing with drm-next is sometimes way too long. See e.g. the > > controlD regression we just had. 
> > And DAL is stuck on a 1 year old kernel,
> > so pretty much only of historical significance and otherwise dead code.
> >
> > And then for any stuff which isn't upstream yet (like your internal
> > enabling, or DAL here, or our own internal enabling) you need continuous
> > rebasing & re-validation. When we started doing this years ago it was
> > still done manually, but we still rebased like every few days to keep the
> > pain down and adjust continuously to upstream evolution. But then going
> > to a continuous rebase bot that sends you mail when something goes wrong
> > was again a massive improvement.
> >

> I think we've seen that pain already but haven't quite realized how much of
> it is due to a mismatch in kernel trees. We're trying to move onto a tree
> following drm-next much more closely. I'd love to help automate some of
> that (time permitting). Would the drm-misc scripts be of any use with
> that? I only had a very cursory glance at those.

I've offered to Alex that we could include the amd trees (only stuff ready for pull requests) into drm-tip for continuous integration testing at least. That would mean Alex needs to use dim when updating those branches, and your CI needs to test drm-tip (and do that every time it changes, i.e. really continuously).

For continuous rebasing there's no ready-made public tool, but I highly recommend you use one of the patch pile tools. At Intel we have a glue of quilt + tracking quilt state with git, implemented in the qf script in maintainer-tools. That one has a lot more sharp edges than dim, but it gets the job done. And the combination of git tracking + raw patch files for sedding is very powerful for rebasing.

Long term I'm hopeful that git series will become the new shiny, since Josh Triplett really understands the use-cases of having long-term rebasing trees which are collaboratively maintained.
It's a lot nicer than qf, but can't yet do everything we need (and likely what you'll need to be able to rebase DC without going crazy).

> > I guess in the end Conway's law that your software architecture
> > necessarily reflects how you organize your teams applies again. Fix your
> > process and it'll become glaringly obvious to everyone involved that
> > DC-the-design as-is is entirely unworkable and how it needs to be fixed.
> >
> > From my own experience over the past few years: Doing that is a fun
> > journey ;-)
> >

> Absolutely. We're only at the start of this but have learned a lot from the
> community (maybe others in the DC team disagree with me somewhat).
>
> Not sure if I fully agree that this means that DC-the-design-as-is will
> become apparent as unworkable... There are definitely pieces to be cleaned
> here and lessons learned from the DRM community but on the other hand we
> feel there are some good reasons behind our approach that we'd like to
> share with the community (some of which I'm learning myself).

Tony asking what the difference between a midlayer and a helper library is is imo a good indicator that there's still learning to do in the team ;-)

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
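[Editorial note: the continuous-rebase flow Daniel describes above — keep out-of-tree work as a pile of patches that a bot re-applies onto a moving upstream, and treat any conflict as the early-warning mail — can be sketched with plain git in a throwaway repository. This is an illustration only: the branch names ("drm-next", "patch-pile") and file names are made up, and plain `git rebase` stands in for the quilt/qf/dim tooling mentioned in the mail rather than reproducing it.]

```shell
# Sketch of a continuous-rebase workflow in a scratch repository.
# "drm-next" stands in for the moving upstream branch; "patch-pile"
# holds the out-of-tree work that must keep rebasing onto it.
set -eu
export GIT_AUTHOR_NAME=bot GIT_AUTHOR_EMAIL=bot@example.com
export GIT_COMMITTER_NAME=bot GIT_COMMITTER_EMAIL=bot@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git checkout -q -b drm-next
git commit -q --allow-empty -m "drm-next: baseline"

# Out-of-tree enabling work lives as a pile of patches on top.
git checkout -q -b patch-pile
echo "display core" > dc.c
git add dc.c
git commit -q -m "dc: out-of-tree patch"

# Upstream moves on underneath the pile...
git checkout -q drm-next
echo "core refactor" > drm_core.c
git add drm_core.c
git commit -q -m "drm: core change"

# ...and a bot re-applies the pile every day; a failure here is the
# mail that tells you upstream evolved out from under your patches.
git checkout -q patch-pile
git rebase -q drm-next
git log --oneline --reverse   # baseline, core change, then the rebased dc patch
```

With quilt/qf the pile is raw patch files rather than commits, which (per the mail above) makes mass edits with sed across the whole series easier, but the failure mode is the same: the daily rebase step is what surfaces drift against upstream early instead of at the next big catch-up.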
* Re: [RFC] Using DC in amdgpu for upcoming GPU @ 2016-12-09 16:32 Jan Ziak 2016-12-13 7:31 ` Michel Dänzer 0 siblings, 1 reply; 66+ messages in thread
From: Jan Ziak @ 2016-12-09 16:32 UTC (permalink / raw)
To: dri-devel

Hello Dave,

Let's cool down the discussion a bit and try to work out a solution.

To summarize the facts, your decision implies that the probability of merging DAL/DC into the mainline Linux kernel in the next year (2017) has become extremely low.

In essence, the strategy you are implicitly proposing is to move away from a software architecture which looks like this:

APPLICATION
USERSPACE DRIVERS (OPENGL, XSERVER)
----
HAL/DC IN AMDGPU.KO (FREESYNC, etc)
LINUX KERNEL SERVICES
HARDWARE

towards a software architecture looking like this:

APPLICATION
USERSPACE DRIVERS (OPENGL, XSERVER)
USERSPACE HAL/DC IMPLEMENTATION (FREESYNC, etc)
----
AMDGPU.KO
LINUX KERNEL SERVICES
HARDWARE

For the future of Linux, the latter basically means that the Linux kernel won't be initializing display resolution (modesetting) when the machine is booting. The initial modesetting will be performed by a user-space executable launched by openrc/systemd/etc as soon as possible. Launching the userspace modesetting executable will be among the first actions of openrc/systemd/etc.

Note that during the '90s Linux-based systems _already_ had the xserver responsible for modesetting. Linux gradually moved away from that '90s software architecture towards an in-kernel modesetting architecture. A citation from https://en.wikipedia.org/wiki/X.Org_Server is in order here:

"In ancient times, the mode-setting was done by some x-server graphics device drivers specific to some video controller/graphics card. To this mode-setting functionality, additional support for 2D acceleration was added when such became available with various GPUs.
The mode-setting functionality was moved into the DRM and is being exposed through an DRM mode-setting interface, the new approach being called "kernel mode-setting" (KMS)."

The underlying simple hard fact behind the transition from '90s user-modesetting to kernel-modesetting is that most Linux users prefer to see a single display mode initialization which persists from machine boot to machine shutdown.

In the near future, the combination of the following four factors:

1. General availability of 144Hz displays
2. Up/down-scaling of fullscreen OpenGL/Vulkan apps (virtual display resolution)
3. Per-frame monitor refresh rate adjustment (freesync, g-sync)
4. Competition of innovations

will render non-native monitor resolutions and non-native physical framerates completely obsolete. (3) is a transient phenomenon which will later be superseded by further developments in the field of (1) towards the emergence of virtual refresh rates.

Jan

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [RFC] Using DC in amdgpu for upcoming GPU 2016-12-09 16:32 Jan Ziak @ 2016-12-13 7:31 ` Michel Dänzer 0 siblings, 0 replies; 66+ messages in thread
From: Michel Dänzer @ 2016-12-13 7:31 UTC (permalink / raw)
To: Jan Ziak; +Cc: dri-devel

On 10/12/16 01:32 AM, Jan Ziak wrote:
> Hello Dave
>
> Let's cool down the discussion a bit and try to work out a solution.
>
> To summarize the facts, your decision implies that the probability of
> merging DAL/DC into the mainline Linux kernel the next year (2017) has
> become extremely low.
>
> In essence, the strategy you are implicitly proposing is to move away
> from a software architecture which looks like this:
>
> APPLICATION
> USERSPACE DRIVERS (OPENGL, XSERVER)
> ----
> HAL/DC IN AMDGPU.KO (FREESYNC, etc)
> LINUX KERNEL SERVICES
> HARDWARE
>
> towards a software architecture looking like this:
>
> APPLICATION
> USERSPACE DRIVERS (OPENGL, XSERVER)
> USERSPACE HAL/DC IMPLEMENTATION (FREESYNC, etc)
> ----
> AMDGPU.KO
> LINUX KERNEL SERVICES
> HARDWARE

You misunderstood what Dave wrote. The whole discussion is mostly about the DC related code in the amdgpu driver and its interaction with core DRM/kernel code, i.e. mostly about code under drivers/gpu/drm/. It doesn't affect anything outside of that, certainly not how things are divided up between kernel and userspace.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
* [RFC] Using DC in amdgpu for upcoming GPU @ 2016-12-15 15:48 Kevin Brace 0 siblings, 0 replies; 66+ messages in thread
From: Kevin Brace @ 2016-12-15 15:48 UTC (permalink / raw)
To: dri-devel

Hi,

I have been reading the ongoing discussion about what to do about AMD DC (Display Core) with great interest since I have started to put more time into developing OpenChrome DRM for VIA Technologies Chrome IGP. I particularly enjoyed reading what Tony Cheng wrote about what is going on inside AMD Radeon GPUs. As a graphics stack developer, I suppose I am still somewhat above a beginner level, and Chrome IGP might be considered garbage graphics to some (I do not really care what people say or think about it.), but since my background is really digital hardware design (self-taught) rather than graphics device driver development, I would like to add my 2 cents (U.S.D.) to the discussion. I also consider myself an amateur semiconductor industry historian, and in particular, I have been a close watcher of Intel's business / hiring practices for many years. For some, what I am writing may not make sense or may even offend some (my guess is the people who work at Intel), but I will not pull any punches, and if you do not like what I write, let me know. (That does not mean I will necessarily take back my comment even if it offended you. I typically stand behind what I say, unless it is obvious that I am wrong.) While my understanding of DRM is still quite primitive, my simplistic understanding of why AMD is pushing DC is due to the following factors.
1) AMD is understaffed due to the precarious financial condition it is in right now (i.e., < $1 billion CoH and losing 7,000 employees since Year 2008 or so)
2) The complexity of the next generation ASIC is only getting worse due to the continuing process scaling = more transistors one has to use (i.e., TSMC 28 nm to GF 14 nm to probably Samsung / TSMC 10 nm or GF 7 nm)
3) Based on 1 and 2, unless design productivity can be improved, AMD will be late to market, and this can be the possible end of AMD as a corporation
4) Hence, in order to meet TtM and improve engineer productivity, AMD needs to reuse the existing pre-silicon / post-silicon bring-up test code and share the code with the Windows side of the device driver developers
5) In addition, power is already the biggest design challenge, and very precise power management is crucial to the performance of the chip (i.e., it's not all about the laptop anymore, and desktop "monster" graphics cards also need power management for performance reasons, in order to manage heat generation)
6) AMD Radeon is really running an RTOS (Real Time Operating System) inside the GPU card, and they want to put the code to handle initialization / power management closer to the GPU rather than on the slower-response x86 (or any other general purpose) microprocessor

Since I will probably need to obtain "favors" down the road when I try to get OpenChrome DRM mainlined, I probably should not go into what I think of how Intel works on their graphics device driver stack (I do not mean to make this personal, but Intel is the "other" open source camp in the OSS x86 graphics world, so I find it fair game to discuss the approach Intel takes from a semiconductor industry perspective.
I am probably going to overly generalize what is going on, so if you want to correct me, let me know.), but based on my understanding of how Intel works, Intel probably has more staffing resources than AMD when it comes to graphics device driver stack development. (and on the x86 microprocessor development side) Based on my understanding of where Intel stands financially, I feel like Intel is standing on very thin ice due to the following factors, and I will predict that they will eventually adopt an AMD DC-like design concept. (i.e., use of a HAL) Here is my logic.

1) PC (desktop and laptop) x86 processors are not selling very well, and my understanding is that since the Year 2012 peak, x86 processor shipments are down 30% as of Year 2016 (I will say around $200 ASP)
2) Intel's margins are being propped up by the unnaturally high data center marketshare (99% for x86 data center microprocessors) and very high data center x86 processor ASP (Average Selling Price) of $600 (up from $500 a few years ago due to AMD screwing up the Bulldozer microarchitecture. More on this later.)
3) Intel did a significant layoff in April 2016 where they targeted older (read "expensive"), experienced engineers
4) Like Cisco Systems (notorious for their annual summer time 5,000 layoff), Intel then turns around and goes on a hiring spree, hiring from many graduate programs of U.S. second and third tier universities, bringing down the overall experience level of the engineering departments
5) While AMD is financially in a desperate shape, it will likely have one last chance in the Zen microarchitecture to get back into the game (Zen will be the last chance for AMD, IMO.)
6) Since AMD is now fabless due to divestiture of the fabs in Year 2009 (GLOBALFOUNDRIES), it no longer has the financial burden of having to pay for the fab, whereas Intel "had to" delay 10 nm process deployment to 2H'17 due to weak demand for 14 nm process products and low utilization of the 14 nm process (Low utilization delays the amortization of the 14 nm process. Intel historically amortized a given process technology in 2 years. 14 nm is starting to look like 2.5 to 3 years due to yield issues they encountered in 2014.)
7) Inevitably, the magic of market competition will drag down Intel's ASP (both PC and data center) since the Zen microarchitecture is a rather straightforward x86 microarchitectural implementation (i.e., not too far apart from Skylake); hence, their low-60% gross margin will be under pressure from AMD starting in Year 2017.
8) Intel overpaid for Altera (a struggling FPGA vendor where the CEO probably felt like he had to sell the corporation in order to cover up the Stratix 10 FPGA development screw-up of missing the tape-out target date by 1.5 years) by $8 billion, and the next generation process technology is getting ever more expensive (10 nm, 7 nm, 5 nm, etc.)
9) In order to "please" Wall Street, Intel management will possibly do further destructive layoffs every year, and if I were to guess, will likely lay off another 25,000 to 30,000 people over the next 3 to 4 years
10) Intel has already lost experienced engineers in the past layoffs, replacing them with far less experienced engineers hired relatively recently from mostly second and third tier U.S.
universities
11) Now, with a 25,000 to 30,000 layoff, the management will force the software engineering side to reorganize, and Intel will be "forced" to come up with ways to reuse their graphics stack code (i.e., sharing more code between Windows and Linux)
12) Hence, maybe a few years from now, Intel people will have to do something similar to AMD DC, in order to improve their design productivity since they no longer can throw people at the problem (Their tendency to overhire new college graduates since they are cheaper allowed them to throw people at the problem relatively cheaply until recently. High x86 ASP allowed them to do this as well, and they got too used to this for too long. They will not be able to do this in the future. In the meantime, their organizational experience level is coming down due to hiring too many NCGs and laying off too many experienced people at the same time.)

I am sure there are people who are not happy reading this, but this is my harsh, honest assessment of what Intel is going through right now, and what will happen in the future. I am sure I will be effectively blacklisted from working at Intel for writing what I just wrote (That's okay since I am not interested in working at Intel.), but I came to this conclusion based on what various people who used to work at Intel told me and from observing Intel's hiring practices for a number of years. In particular, one person who worked on the Intel 740 project (i.e., the long forgotten discrete AGP graphics chip from 1998) on the technical side has told me that Intel is really terrible at IP (Intellectual Property) core reuse, and Intel frequently redesigns too many portions of their ASICs all the time. Based on that, I am not too surprised to hear that Intel does Windows and Linux graphics device driver stack development separately. (That's what I read.) In other words, Intel is bloated from a staffing point of view.
(I do not necessarily like people to lose jobs, but compared to AMD and NVIDIA, Intel is really bloated. The same person who worked on the Intel 740 project told me that Intel employee productivity is much lower than that of competitors like AMD and NVIDIA on a per-employee basis, and they have not been able to fix this for years.) Despite the constant layoffs, Intel's employee count has not really gone down for the past few years (it is staying around 100,000 for the past 4 years), but eventually Intel will have to get rid of people in absolute numbers. Intel also heavily relies on its "shadow" workforce of interns (from local universities, especially the foreign master's degree students desperate to pay off part of their high out-of-state tuition) and contractors / consultants, so their "real" employee count is probably closer to 115,000 or 120,000. I get Intel-related contractor / consultant position "unsolicited" e-mails from recruiters possibly located 12 time zones away from where I reside (please do not call me a racist for pointing this out since I find this so weird as a U.S. citizen) almost every weekday (M-F), and I am always surprised at the type of work Intel wants contractors to work on. Many of the positions they want people to work on are highly specialized (I saw a graphics device driver contract position recently.), and it has been like this for several years already. I no longer bother with Intel anymore based on this since they appear not to want to commit to proper employment of highly technical people. Going back to the graphics world, my take is that Intel will have to get used to doing the same with far fewer people, and they will need to change their corporate culture of throwing people at the problem very soon since their x86 ASP will be crashing down fairly soon, and AMD will likely never repeat the Bulldozer microarchitecture screw-up again.
(Intel got lucky when the former IBM PowerPC architects AMD hired around Year 2005 screwed up the Bulldozer. A Speed Demon design is a disaster in a power-constrained post-90 nm process node. They tried to compensate for Bulldozer's low IPC with high clock frequency. Intel learned a painful lesson about power with the NetBurst microarchitecture between Year 2003 to 2005. Also, then-AMD management seems to have really believed in the many-core concept too seriously. AMD had to live with the messed-up Bulldozer for 10+ years with disastrous financial results.) I do understand that what I am writing isn't terribly technical in nature (it is more like the corporate strategy stuff business / marketing side people worry about), but I feel like what AMD is doing is quite logical. (i.e., using a higher abstraction level for initialization / power management, and code reuse) Sorry for the off-topic assessment of Intel (i.e., hiring practice stuff, x86 stuff), and based on the subsequent messages, it appears that DC can be rearchitected to satisfy Linux kernel developers, but overall, I feel like there is a lack of appreciation for the concept of design reuse in this case, even though in the ASIC / FPGA design world this is very normal. (It has been like this since the mid-'90s when ASIC engineers had to start doing this regularly.) AMD-side people appear to have been trying to apply this concept to the device driver side as well. Considering AMD's meager staffing resources (currently approximately 9,000; less than 1/10 of Intel, although Intel owns many fabs and product lines, so the actual developer staffing disadvantage is probably more like a 1:3 to 1:5 ratio), I am not too surprised to read that it is trying to improve productivity where it can, and combining some portions of Windows and Linux code makes sense. I would imagine that NVIDIA is doing something like this already. (but closed source) Again, I will almost bet that Intel will adopt an AMD DC-like concept in the next few years.
Let me know if I was right in a few years.

Regards,

Kevin Brace
The OpenChrome Project maintainer / developer
end of thread, other threads: [~2016-12-15 15:48 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-08  2:02 [RFC] Using DC in amdgpu for upcoming GPU Harry Wentland
2016-12-08  9:59 ` Daniel Vetter
2016-12-08 14:33 ` Harry Wentland
2016-12-08 15:34 ` Daniel Vetter
2016-12-08 15:41 ` Christian König
2016-12-08 15:46 ` Daniel Vetter
2016-12-08 20:24 ` Matthew Macy
2016-12-08 17:40 ` Alex Deucher
2016-12-08 20:07 ` Dave Airlie
2016-12-08 23:29 ` Dave Airlie
2016-12-09 17:26 ` Cheng, Tony
2016-12-09 19:59 ` Daniel Vetter
2016-12-09 20:34 ` Dave Airlie
2016-12-09 20:38 ` Daniel Vetter
2016-12-10  0:29 ` Matthew Macy
2016-12-11 12:34 ` Daniel Vetter
2016-12-09 17:56 ` Cheng, Tony
2016-12-09 17:32 ` Deucher, Alexander
2016-12-09 20:30 ` Dave Airlie
2016-12-11  0:36 ` Alex Deucher
2016-12-09 20:31 ` Daniel Vetter
2016-12-11 20:28 ` Daniel Vetter
2016-12-13  2:33 ` Harry Wentland
2016-12-13  4:10 ` Cheng, Tony
2016-12-13  7:50 ` Daniel Vetter
2016-12-13  7:30 ` Dave Airlie
2016-12-13  9:14 ` Cheng, Tony
2016-12-13 14:59 ` Rob Clark
2016-12-13  7:31 ` Daniel Vetter
2016-12-13 10:09 ` Ernst Sjöstrand
2016-12-12  2:57 ` Dave Airlie
2016-12-12  7:09 ` Daniel Vetter
2016-12-12  3:21 ` Bridgman, John
2016-12-12  3:23 ` Bridgman, John
2016-12-12  3:43 ` Bridgman, John
2016-12-12  4:05 ` Dave Airlie
2016-12-13  1:49 ` Harry Wentland
2016-12-13 12:22 ` Daniel Stone
2016-12-13 12:59 ` Daniel Vetter
2016-12-14  1:50 ` Michel Dänzer
2016-12-14 15:46 ` Harry Wentland
2016-12-14 16:35 ` Alex Deucher
2016-12-13  2:52 ` Cheng, Tony
2016-12-13  7:09 ` Dave Airlie
2016-12-13  9:40 ` Lukas Wunner
2016-12-13 15:03 ` Cheng, Tony
2016-12-13 15:09 ` Deucher, Alexander
2016-12-13 15:57 ` Lukas Wunner
2016-12-14  9:57 ` Jani Nikula
2016-12-14 17:23 ` Cheng, Tony
2016-12-14 18:01 ` Alex Deucher
2016-12-14 18:16 ` Cheng, Tony
2016-12-13 16:14 ` Bridgman, John
2016-12-12  7:22 ` Daniel Vetter
2016-12-12  7:54 ` Bridgman, John
2016-12-12  9:27 ` Daniel Vetter
2016-12-12  9:29 ` Daniel Vetter
2016-12-12 15:28 ` Deucher, Alexander
2016-12-12 16:06 ` Luke A. Guest
2016-12-12 16:17 ` Luke A. Guest
2016-12-12 16:44 ` Deucher, Alexander
2016-12-13  2:05 ` Harry Wentland
2016-12-13  8:33 ` Daniel Vetter
2016-12-09 16:32 Jan Ziak
2016-12-13  7:31 ` Michel Dänzer
2016-12-15 15:48 Kevin Brace